CN107831824B - Clock signal transmission method and device, multiplexing chip and electronic equipment - Google Patents

Clock signal transmission method and device, multiplexing chip and electronic equipment

Publication number
CN107831824B
Authority
CN
China
Prior art keywords
clock signal
unit
computing unit
core
core computing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
CN201710958797.XA
Other languages
Chinese (zh)
Other versions
CN107831824A (en
Inventor
李继峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Suneng Technology Co ltd
Original Assignee
Bitmain Technologies Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Family has litigation
First worldwide family litigation filed litigation Critical https://patents.darts-ip.com/?family=61648107&utm_source=***_patent&utm_medium=platform_link&utm_campaign=public_patent_search&patent=CN107831824(B) "Global patent litigation dataset” by Darts-ip is licensed under a Creative Commons Attribution 4.0 International License.
Application filed by Bitmain Technologies Inc
Priority to CN201710958797.XA
Publication of CN107831824A
Application granted
Publication of CN107831824B
Legal status: Ceased
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00 Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/04 Generating or distributing clock signals or signals derived directly therefrom
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/76 Architectures of general purpose stored program computers

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Design And Manufacture Of Integrated Circuits (AREA)
  • Synchronisation In Digital Transmission Systems (AREA)

Abstract

The embodiment of the invention discloses a clock signal transmission method, a clock signal transmission device, a multiplexing chip and electronic equipment. The method comprises the following steps: acquiring the forward transmission direction of data based on the transmission of the data from the first core computing unit to the last core computing unit in the multiplexing chip, wherein the multiplexing chip comprises a plurality of core computing units and the output data of a previous core computing unit is used as the input data of the next core computing unit; and inputting a clock signal from the last core computing unit of the multiplexing chip and transmitting it in the reverse direction to the first core computing unit, the reverse direction being opposite to the forward direction. According to the embodiment of the invention, the clock signal is transmitted in the direction opposite to the data flow, so that the timing check between adjacent operation cores is naturally satisfied, no additional cache unit needs to be added, and a large amount of chip area and power consumption is saved.

Description

Clock signal transmission method and device, multiplexing chip and electronic equipment
Technical Field
The present invention relates to data processing technologies, and in particular, to a clock signal transmission method, an apparatus, a multiplexing chip, and an electronic device.
Background
At present, many chips in the prior art contain a large number of multiplexing cores, especially display chips, artificial intelligence chips and digital currency mining chips. In a multiplexing core chip, the data flow between the operation cores is unidirectional. The clock tree is a mesh structure built by balancing a plurality of buffer cells; it has a source point, generally the clock input port, and is built up level by level from buffer cells.
In the process of implementing the invention, the inventor finds that the prior art has at least the following problems: most of the multiplexing core chips are large, and the corresponding clock tree becomes very long. Engineers must design a very long clock tree in the top design of the chip to meet the requirement of synchronous timing check between adjacent computational cores.
Disclosure of Invention
The embodiment of the invention aims to solve the technical problem that: a clock signal transmission method, a clock signal transmission device, a multiplexing chip and an electronic device are provided.
The clock signal transmission method provided by the embodiment of the invention comprises the following steps:
acquiring the forward transmission direction of data based on the transmission of the data from the first core computing unit to the last core computing unit in the multiplexing chip; the multiplexing chip comprises a plurality of core computing units, wherein output data of a previous core computing unit is used as input data of a next core computing unit;
inputting a clock signal from the last core computing unit of the multiplexing chip, and reversely transmitting the clock signal to the first core computing unit; the reverse direction transfer is opposite to the forward direction transfer.
In another embodiment of the foregoing method according to the present invention, the inversely transferring the clock signal from the last core computing unit to the first core computing unit of the multiplexing chip includes:
taking the last core computing unit as a current core computing unit, inputting a generated current clock signal into the current core computing unit and a group of cache units by a clock generator, wherein the group of cache units comprises at least one cache unit;
performing iteration, namely taking the clock signal processed by the cache units as the current clock signal, and taking the previous core computing unit as the current core computing unit; and inputting the current clock signal into the current core computing unit and a group of cache units, until the current core computing unit is the first core computing unit.
In another embodiment of the foregoing method according to the present invention, the method further includes: all the cache units form a clock tree.
In another embodiment based on the foregoing method of the present invention, a group of cache units in the clock tree corresponding to the current core computing unit is a first group of cache units, and a group of cache units corresponding to the previous core computing unit is a second group of cache units; the number of the cache units included in the second group of cache units exceeds the number of the cache units included in the first group of cache units by a preset number.
In another embodiment of the above method according to the invention, the clock tree is a trapezoidal clock tree with increasing size.
In another embodiment based on the foregoing method of the present invention, the cache unit repairs the received clock signal, and the clock signal is repaired to reach a preset standard and then transmitted to the current core computing unit and the group of cache units.
In another embodiment of the method according to the present invention, the core computing unit includes a plurality of basic computing units connected in series, and each of the basic computing units performs the same operation on the input data.
According to another aspect of the embodiments of the present invention, there is provided a clock signal transmission apparatus, including:
the direction obtaining unit is used for obtaining the forward transmission direction of the data based on the transmission of the data from the first core computing unit to the last core computing unit in the multiplexing chip; the multiplexing chip comprises a plurality of core computing units, wherein output data of a previous core computing unit is used as input data of a next core computing unit;
the clock transmission unit is used for inputting a clock signal from the last core calculation unit of the multiplexing chip and reversely transmitting the clock signal to the first core calculation unit; the reverse direction transfer is opposite to the forward direction transfer.
In another embodiment of the above apparatus according to the present invention, the clock transmission unit includes:
the signal transmission module is used for taking the last core calculation unit as a current core calculation unit, the clock generator inputs the generated current clock signal into the current core calculation unit and a group of cache units, and the group of cache units comprise at least one cache unit;
the iteration module is used for performing iteration, in which the clock signal processed by the cache units is used as the current clock signal and the previous core computing unit is used as the current core computing unit; and for inputting the current clock signal into the current core computing unit and a group of cache units, until the current core computing unit is the first core computing unit.
In another embodiment of the above apparatus according to the present invention, the clock transmission unit further includes: and the tree construction module is used for constructing all the cache units into a clock tree.
In another embodiment of the above apparatus according to the present invention, a group of cache units corresponding to the current core computing unit in the clock tree is a first group of cache units, and a group of cache units corresponding to the previous core computing unit is a second group of cache units; the number of the cache units included in the second group of cache units exceeds the number of the cache units included in the first group of cache units by a preset number.
In another embodiment of the above apparatus according to the present invention, the clock tree is a trapezoidal clock tree including a plurality of groups of buffer units.
In another embodiment of the above apparatus according to the present invention, the buffer unit is configured to repair the received clock signal, and transmit the repaired clock signal to the current core computing unit and the group of buffer units after the repaired clock signal meets a preset criterion.
In another embodiment of the above apparatus according to the present invention, the core computing unit includes a plurality of basic computing units connected in series, and each of the basic computing units performs the same operation on the input data.
According to another aspect of the embodiments of the present invention, there is provided a multiplexing chip, including:
a plurality of core computing units for receiving data and transmitting the data sequentially; the sequential transmission is from a first core computing unit to a last core computing unit; wherein the output data of the previous core computing unit is used as the input data of the next core computing unit;
and the clock tree is used for transmitting the clock signal in the direction opposite to the data flow direction in the core computing unit.
In another embodiment of the multiplexing chip according to the present invention, the clock tree includes a plurality of groups of buffer units, each group of buffer units includes at least one buffer unit, and the buffer units are configured to repair the received clock signal.
In another embodiment of the multiplexing chip according to the invention, the number of the buffer units included in the next group of buffer units in the clock tree exceeds the number of the buffer units included in the previous group of buffer units by a preset number.
According to another aspect of the embodiments of the present invention, there is provided an electronic device including the clock signal transfer apparatus as described above or the multiplexing chip as described above.
According to another aspect of the embodiments of the present invention, there is provided an electronic device including: a memory for storing executable instructions;
and a processor for communicating with the memory to execute the executable instructions to perform the operations of the clock signal transmission method of the multiplexing chip as described above.
Based on the clock signal transmission method, the clock signal transmission device, the multiplexing chip and the electronic device provided by the embodiments of the present invention, a forward transmission direction of data is obtained based on the transmission of the data from the first core computing unit to the last core computing unit in the multiplexing chip; the clock signal is input from the last core computing unit of the multiplexing chip and reversely transmitted to the first core computing unit, and the clock signal is transmitted in the direction opposite to the data flow direction, so that the time sequence check of adjacent operation cores naturally meets the requirement, an additional cache unit is not required to be added, and a large amount of chip area and power consumption are saved.
The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description, serve to explain the principles of the invention.
The invention will be more clearly understood from the following detailed description, taken with reference to the accompanying drawings, in which:
FIG. 1 is a flowchart illustrating a clock signal transmission method according to an embodiment of the present invention.
FIG. 2 is a schematic diagram of timing check between core compute units.
FIG. 3 is a schematic structural diagram of a clock signal transmission apparatus according to an embodiment of the present invention.
Fig. 4 is a schematic diagram of a data flow of a specific example of the multiplexing chip of the present invention.
Detailed Description
Various exemplary embodiments of the present invention will now be described in detail with reference to the accompanying drawings. It should be noted that: the relative arrangement of the components and steps, the numerical expressions and numerical values set forth in these embodiments do not limit the scope of the present invention unless specifically stated otherwise.
Meanwhile, it should be understood that the sizes of the respective portions shown in the drawings are not drawn in an actual proportional relationship for the convenience of description.
The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the invention, its application, or uses.
Techniques, methods, and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail but are intended to be part of the specification where appropriate.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, further discussion thereof is not required in subsequent figures.
Embodiments of the invention are operational with numerous other general purpose or special purpose computing system environments or configurations. Examples of well known computing systems, environments, and/or configurations that may be suitable for use with the computer system/server include, but are not limited to: personal computer systems, server computer systems, thin clients, thick clients, hand-held or laptop devices, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputer systems, mainframe computer systems, distributed cloud computing environments that include any of the above systems, and the like.
The computer system/server may be described in the general context of computer system-executable instructions, such as program modules, being executed by a computer system. Generally, program modules may include routines, programs, objects, components, logic, data structures, etc. that perform particular tasks or implement particular abstract data types. The computer system/server may be practiced in distributed cloud computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed cloud computing environment, program modules may be located in both local and remote computer system storage media including memory storage devices.
In the prior art, the data flow direction of the traditional clock tree and the clock tree growth direction are the same (from the first core computing unit to the last core computing unit), the clock tree growing to the next core computing unit is larger and longer than the clock tree of the previous core computing unit, and a large number of buffers have to be inserted to avoid serious timing violations.
FIG. 1 is a flowchart illustrating a clock signal transmission method according to an embodiment of the present invention. As shown in fig. 1, the method of this embodiment includes:
step 101, obtaining the forward transmission direction of data based on the transmission of data from the first core computing unit to the last core computing unit in the multiplexing chip.
The multiplexing chip comprises a plurality of core computing units, wherein output data of a previous core computing unit is used as input data of a next core computing unit;
step 102, inputting a clock signal from the last core computing unit of the multiplexing chip, and transmitting the clock signal in the reverse direction to the first core computing unit; the reverse direction is opposite to the forward direction.
Based on the clock signal transmission method provided by the above embodiment of the present invention, a forward transmission direction of data is obtained based on the transmission of data from a first core computing unit to a last core computing unit in a multiplexing chip; the clock signal is input from the last core computing unit of the multiplexing chip and reversely transmitted to the first core computing unit, and the clock signal is transmitted in the direction opposite to the data flow direction, so that the time sequence check of adjacent operation cores naturally meets the requirement, an additional cache unit is not required to be added, and a large amount of chip area and power consumption are saved.
All clock tree structures are in the operation core, and space and resources of a top-level design are not occupied.
Both the branching portion of the clock tree and the on-chip skew cost are deterministic. Timing problems can therefore be solved in advance inside the operation core, and the top level does not need to do extra work.
The difference in clock tree length between neighbouring operation cores is small, which makes the timing requirements easier to meet. The difference in clock tree length equals the height of one step of the ladder structure.
The clock tree structure of the inverse data flow enables the time sequence checking of the adjacent operation core to be naturally satisfied, and a large amount of chip area and power consumption can be saved.
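The two directions described above can be illustrated with a minimal Python sketch. The function names and the fixed per-hop delay `stage_delay` are assumptions made for illustration; they do not come from the patent text.

```python
from typing import List


def data_order(num_cores: int) -> List[int]:
    """Forward direction: data enters core 0 and leaves the last core."""
    return list(range(num_cores))


def clock_order(num_cores: int) -> List[int]:
    """Reverse direction: the clock enters at the last core and ends at core 0."""
    return list(reversed(data_order(num_cores)))


def clock_arrival_times(num_cores: int, stage_delay: float) -> List[float]:
    """Clock arrival time at each core, indexed by core number.

    Assumption (hypothetical): one fixed delay per hop between adjacent cores.
    """
    arrival = [0.0] * num_cores
    for hop, core in enumerate(clock_order(num_cores)):
        arrival[core] = hop * stage_delay
    return arrival
```

Under this simplified model the clock reaches the last core first and the first core last, which is the opposite of the order in which data visits the cores.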
In another embodiment of the clock signal transmission method according to the present invention, based on the above embodiments, operation 102 includes:
taking the last core computing unit as a current core computing unit, inputting a generated current clock signal into the current core computing unit and a group of cache units by a clock generator, wherein the group of cache units comprises at least one cache unit;
performing iteration, namely taking the clock signal processed by the cache units as the current clock signal, and taking the previous core computing unit as the current core computing unit; and inputting the current clock signal into the current core computing unit and a group of cache units, until the current core computing unit is the first core computing unit.
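The iteration can be sketched in Python as follows. This is only an illustration under stated assumptions: the `Core` class, the string representation of the clock signal and the `buffer_group` helper are hypothetical stand-ins, and the loop moves from the last core toward the first one, one core per iteration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Core:
    """Hypothetical model of a core computing unit that records its clock input."""
    index: int
    clock_in: List[str] = field(default_factory=list)


def buffer_group(signal: str, group_id: int) -> str:
    """Stand-in for a group of buffer units: tag the signal it has passed through."""
    return f"{signal}>buf{group_id}"


def distribute_clock(cores: List[Core], clock: str) -> List[int]:
    """Feed the clock to the last core first, then walk back to core 0.

    Returns the order in which cores received the clock.
    """
    if not cores:
        raise ValueError("at least one core is required")
    current = len(cores) - 1          # start with the last core
    signal = clock                    # clock as produced by the generator
    visited: List[int] = []
    while True:
        cores[current].clock_in.append(signal)  # clock into the current core
        visited.append(current)
        if current == 0:              # stop once the first core is reached
            break
        signal = buffer_group(signal, current)  # clock into the buffer group
        current -= 1                  # the adjacent upstream core becomes current
    return visited
```

For three cores, `distribute_clock` visits cores 2, 1, 0 in that order, and each earlier core sees a clock that has passed through one more buffer group.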
The clock signal is generated by a clock generator on the mainboard and is passed in sequence from the last core computing unit toward the first. A clock tree for transmitting the clock signal is built from the cache units along this path. Compared with the forward-transmitted clock tree of the prior art, the reverse clock tree grows in the direction opposite to the data flow, so serious timing violations do not occur and no buffer needs to be inserted to repair them. FIG. 2 is a schematic diagram of timing check between core compute units. As shown in fig. 2, there is a very complicated timing check between the core computing units. Specifically, a signal needs to propagate from one timing unit to another, and the timing check requires that arrival time 1 + arrival time 2 of the signal be greater than arrival time 3 for the chip to operate normally. If the clock tree and the data stream both flow in the forward direction, the clock tree and the data stream reach timing unit 21 early and timing unit 22 late, so a buffer has to be inserted at timing unit 22 to avoid a serious timing violation and make arrival time 1 + arrival time 2 exceed arrival time 3. When the clock tree flows in the direction opposite to the data stream, the clock tree reaches timing unit 22 early and timing unit 21 late, and the arrival time 1 + the ...; the inverse-data-flow trapezoidal clock tree saves resources, reduces cost and meets the requirement of the timing check.
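The timing condition can be made concrete with a small Python sketch. FIG. 2 is not reproduced here, so the interpretation below is an assumption: arrival time 1 is taken as the clock arrival at the launching timing unit 21, arrival time 2 as the data delay from unit 21 to unit 22, and arrival time 3 as the clock arrival at the capturing timing unit 22. All function names and delay values are hypothetical.

```python
def timing_check_ok(t1: float, t2: float, t3: float) -> bool:
    """The check stated in the text: arrival time 1 + arrival time 2 > arrival time 3."""
    return t1 + t2 > t3


def padding_needed(t1: float, t2: float, t3: float) -> float:
    """Shortfall between t3 and t1 + t2; extra data-path delay must exceed this value."""
    return max(0.0, t3 - (t1 + t2))


def adjacent_pair_times(clock_hop_delay: float, data_delay: float,
                        clock_reversed: bool) -> tuple:
    """Return (t1, t2, t3) for two adjacent timing units.

    Forward clock: unit 21 is clocked first, unit 22 one hop later.
    Reverse clock: unit 22 is clocked first, unit 21 one hop later.
    """
    if clock_reversed:
        return (clock_hop_delay, data_delay, 0.0)
    return (0.0, data_delay, clock_hop_delay)
```

With a clock hop delay of 1.0 and a data delay of 0.25, the forward arrangement fails the check and is short by 0.75 time units, while the reversed arrangement passes with no padding.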
In a specific example of the above embodiments of the clock signal transfer method of the present invention, all the buffer units form a clock tree.
The clock tree is a mesh structure built by balancing a plurality of buffer units (buffer cells). It has a source point, generally the clock input port, and is built up stage by stage from buffer units; the number of stages depends on the design settings and the cells used. The purpose is to make the clock skew (generally the greatest concern), the insertion delay and the transition at the endpoints meet the design requirements.
In a specific example of the above embodiments of the clock signal transmission method of the present invention, a group of cache units corresponding to a current core computing unit in a clock tree is a first group of cache units, and a group of cache units corresponding to a previous core computing unit is a second group of cache units; the number of the buffer units included in the second group of buffer units exceeds the number of the buffer units included in the first group of buffer units by a preset number.
In this embodiment, the clock tree is transferred in the reverse direction. Each time the clock passes one core computing unit, a preset number of buffer units is added to the clock tree, enlarging it, so that the timing check can be satisfied without inserting additional buffers to repair violations.
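The growth of the buffer groups can be sketched as a simple arithmetic progression. The function name and the parameters `first_group_size` and `step` are hypothetical; the patent only states that each group exceeds the one before it in clock order by a preset number.

```python
from typing import List


def buffer_group_sizes(num_cores: int, first_group_size: int, step: int) -> List[int]:
    """Buffer units per core, indexed by core number.

    The group at the last core (where the clock enters) has `first_group_size`
    units; each core further upstream in the data flow gets `step` more.
    """
    sizes = [0] * num_cores
    for hop in range(num_cores):
        core = num_cores - 1 - hop   # walk from the last core back to core 0
        sizes[core] = first_group_size + hop * step
    return sizes
```

For four cores, a first group of one buffer unit and a step of two, the sizes are 7, 5, 3, 1 from the first core to the last, which gives the gradually widening staircase shape of the tree.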
In a specific example of the above embodiments of the clock signal transfer method of the present invention, the clock tree is a trapezoidal clock tree that gradually increases.
In the embodiment, one trapezoid in the trapezoid clock tree represents one core, so that resources are saved, the chip cost is reduced, and meanwhile, the set-up time check and the hold time check in the time sequence check are met.
In another embodiment of the clock signal transmission method according to the present invention, based on the above embodiments, the buffer unit repairs the received clock signal, and transmits the repaired clock signal to the current core computing unit and the group of buffer units after the repaired clock signal reaches the preset standard.
In this embodiment, the buffer unit performs no logic function of its own; it is only responsible for repairing the signal and passing it on with high quality, without changing its function. Specifically, a distorted signal may be repaired by a filter and the repaired signal transmitted to the previous core computing unit; for example, a distorted square wave may be restored to a clean square wave before being passed on.
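As a rough illustration of restoring a distorted square wave, the sketch below uses simple midpoint thresholding on sampled values. This is a stand-in chosen for brevity, not the filter the patent refers to, and the function name and the `low`/`high` levels are assumptions.

```python
from typing import List


def repair_square_wave(samples: List[float], low: float = 0.0,
                       high: float = 1.0) -> List[float]:
    """Snap each sample to the nearer logic level (midpoint threshold)."""
    threshold = (low + high) / 2
    return [high if s >= threshold else low for s in samples]
```

For example, the distorted samples `[0.1, 0.9, 0.7, 0.2]` are restored to `[0.0, 1.0, 1.0, 0.0]`.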
In another embodiment of the clock signal transmission method according to the present invention, based on the above embodiments, the core computing unit includes a plurality of basic computing units connected in series, and each basic computing unit performs the same operation on the input data.
In this embodiment, since many chips have complex functions, a large number of basic computing units need to be integrated in the chips, and in order to improve the processing efficiency of the chips, this embodiment proposes that a core computing unit is formed by a plurality of basic computing units, and adjacent core computing units are tightly attached to each other.
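The structure of a core computing unit can be sketched as repeated application of one basic operation, with cores chained so that each core's output feeds the next. The helper names `make_core` and `run_chip` are hypothetical, and an integer stands in for the data.

```python
from typing import Callable, List

Op = Callable[[int], int]


def make_core(basic_op: Op, num_units: int) -> Op:
    """Build a core from `num_units` identical basic units connected in series."""
    def core(x: int) -> int:
        for _ in range(num_units):
            x = basic_op(x)   # every basic unit performs the same operation
        return x
    return core


def run_chip(cores: List[Op], data: int) -> int:
    """Pass data forward: the output of each core is the input of the next."""
    for core in cores:
        data = core(data)
    return data
```

With a basic operation that adds one, a core of three basic units maps 0 to 3, and two such cores in sequence map 0 to 6.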
Those of ordinary skill in the art will understand that: all or part of the steps for implementing the method embodiments may be implemented by hardware related to program instructions, and the program may be stored in a computer readable storage medium, and when executed, the program performs the steps including the method embodiments; and the aforementioned storage medium includes: various media that can store program codes, such as ROM, RAM, magnetic or optical disks.
FIG. 3 is a schematic structural diagram of a clock signal transmission apparatus according to an embodiment of the present invention. The apparatus of this embodiment may be used to implement the method embodiments of the present invention described above. As shown in fig. 3, the apparatus of this embodiment includes:
a direction obtaining unit 31, configured to obtain a forward transfer direction of the data based on transfer of the data from the first core computing unit to the last core computing unit in the multiplexing chip.
The multiplexing chip comprises a plurality of core computing units, wherein output data of a previous core computing unit is used as input data of a next core computing unit;
the clock transmission unit 32 is used for inputting a clock signal from the last core calculation unit of the multiplexing chip and reversely transmitting the clock signal to the first core calculation unit; reverse direction transfers are opposite to forward direction transfers.
Based on the clock signal transmission apparatus provided by the above embodiment of the present invention, data flows from the first core computing unit to the last core computing unit in the multiplexing chip; the clock signal is transmitted from the last core computing unit of the multiplexing chip to the first core computing unit, that is, in the direction opposite to the data flow, so that the timing check between adjacent computing cores is naturally satisfied, no additional cache unit needs to be added, and a large amount of chip area and power consumption is saved.
In another embodiment of the clock signal transmission apparatus of the present invention, based on the above embodiments, the clock transmission unit 32 includes:
the signal transmission module is used for taking the last core calculation unit as a current core calculation unit, the clock generator inputs the generated current clock signal into the current core calculation unit and a group of cache units, and the group of cache units comprise at least one cache unit;
the iteration module is used for performing iteration, in which the clock signal processed by the cache units is used as the current clock signal and the previous core computing unit is used as the current core computing unit; and for inputting the current clock signal into the current core computing unit and a group of cache units, until the current core computing unit is the first core computing unit.
The clock signal is generated by a clock generator on the mainboard and is passed in sequence from the last core computing unit toward the first. A clock tree for transmitting the clock signal is built from the cache units along this path. Compared with the forward-transmitted clock tree of the prior art, the reverse clock tree grows in the direction opposite to the data flow, so serious timing violations do not occur and no buffer needs to be inserted to repair them; the inverse-data-flow trapezoidal clock tree saves resources, reduces cost and meets the requirement of the timing check.
In a specific example of the above embodiments of the clock signal transmission apparatus of the present invention, the clock transmission unit 32 further includes: and the tree construction module is used for constructing all the cache units into a clock tree.
In a specific example of each of the above embodiments of the clock signal transmitting apparatus of the present invention, a group of cache units corresponding to a current core computing unit in a clock tree is a first group of cache units, and a group of cache units corresponding to a previous core computing unit is a second group of cache units; the number of the buffer units included in the second group of buffer units exceeds the number of the buffer units included in the first group of buffer units by a preset number.
In a specific example of the above embodiments of the clock signal transmission apparatus of the present invention, the clock tree is a trapezoidal clock tree including a plurality of groups of buffer units.
In another embodiment of the clock signal transmission apparatus of the present invention, based on the above embodiments, the buffer unit is configured to repair the received clock signal, and transmit the repaired clock signal to the current core computing unit and the group of buffer units after the repaired clock signal reaches the preset standard.
In this embodiment, the buffer unit performs no logic function of its own; it is only responsible for repairing the signal and passing it on with high quality, without changing its function. Specifically, a distorted signal may be repaired by a filter and the repaired signal transmitted to the previous core computing unit; for example, a distorted square wave may be restored to a clean square wave before being passed on.
In another embodiment of the clock signal transmission apparatus of the present invention, based on the above embodiments, the core computing unit includes a plurality of basic computing units connected in series, and each basic computing unit performs the same operation on the input data.
In this embodiment, because many chips implement complex functions, a large number of basic computing units must be integrated on the chip. To improve the processing efficiency of the chip, this embodiment forms each core computing unit from a plurality of basic computing units and places adjacent core computing units closely against each other.
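As an illustration of this structure, the following Python sketch models a core computing unit as several basic computing units connected in series, each applying the same operation, and a chip as a chain of such core computing units. The names `make_core_unit`, `run_chip` and `increment`, as well as the unit counts, are assumptions made for the example and do not come from the patent.

```python
def make_core_unit(basic_op, count):
    """Build a core computing unit from `count` basic units in series.

    Every basic unit applies the same operation `basic_op` to its input.
    """
    def core_unit(data):
        for _ in range(count):
            data = basic_op(data)
        return data
    return core_unit


def run_chip(core_units, data):
    """Pass data from the first core unit to the last one.

    The output of each core unit is the input of the next core unit.
    """
    for unit in core_units:
        data = unit(data)
    return data


def increment(x):
    """Hypothetical basic operation used only for this example."""
    return x + 1


# Three core units, each made of four identical basic units.
core_units = [make_core_unit(increment, 4) for _ in range(3)]
print(run_chip(core_units, 0))  # 12
```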
In another aspect of the embodiments of the present invention, an embodiment of a multiplexing chip is provided, including:
a plurality of core computing units for receiving data and transmitting the data sequentially, the sequential transmission running from the first core computing unit to the last core computing unit, wherein the output data of the previous core computing unit is used as the input data of the next core computing unit;
and a clock tree for transmitting the clock signal in the direction opposite to the data flow direction through the core computing units.
Fig. 4 is a schematic diagram of the data flow in a specific example of the multiplexing chip of the present invention. As shown in Fig. 4, the data enters the plurality of core computing units from the left and flows forward, while the clock signal enters the plurality of core computing units from the right and travels backward.
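The opposing directions shown in Fig. 4 can be sketched in Python as follows. This is only an illustrative model, not the patented implementation: the function name `clock_arrival_times`, the zero-based unit indices and the fixed delay per buffer group (`buffer_delay`) are assumptions introduced for the example.

```python
def clock_arrival_times(num_units, buffer_delay=1.0):
    """Return {unit_index: clock arrival time} for a chain of core units.

    Data flows from unit 0 (first) to unit num_units - 1 (last). The clock
    is fed into the last unit and handed backward, passing through one
    group of buffer units between neighbouring core units. Each group is
    assumed to add a fixed delay of `buffer_delay` (hypothetical value).
    """
    if num_units < 1:
        raise ValueError("at least one core computing unit is required")
    arrival = {}
    current = num_units - 1   # start with the last core unit
    elapsed = 0.0
    while True:
        arrival[current] = elapsed
        if current == 0:      # stop once the first core unit is reached
            break
        elapsed += buffer_delay
        current -= 1          # the previous unit becomes the current unit
    return arrival


times = clock_arrival_times(4)
data_order = list(range(4))                  # order in which data visits units
clock_order = sorted(times, key=times.get)   # order in which the clock arrives
print(data_order)   # [0, 1, 2, 3]
print(clock_order)  # [3, 2, 1, 0]
```

In this model the unit that receives data last is the one that receives the clock first, which is the reverse relationship the figure describes.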
In a specific example of the foregoing embodiments of the multiplexing chip of the present invention, the clock tree includes a plurality of groups of buffer units, each group of buffer units includes at least one buffer unit, and the buffer units are configured to repair the received clock signal.
In a specific example of the foregoing embodiments of the multiplexing chip of the present invention, the number of the buffer units included in the next group of buffer units in the clock tree exceeds the number of the buffer units included in the previous group of buffer units by a preset number.
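The growth of the buffer groups by a preset number can be sketched as a simple arithmetic progression. The Python function below is a hedged illustration: `buffer_group_sizes`, `first_size` and `step` are hypothetical names, and the concrete numbers are not taken from the patent.

```python
def buffer_group_sizes(num_groups, first_size=1, step=1):
    """Return the number of buffer units in each group of the clock tree.

    Groups are listed in the order in which the clock signal passes through
    them. Each group holds `step` more buffer units than the group before
    it, which yields a trapezoid-shaped tree. `first_size` must be at least
    1 because every group contains at least one buffer unit.
    """
    if first_size < 1:
        raise ValueError("each group needs at least one buffer unit")
    if step < 0:
        raise ValueError("the preset increment must not be negative")
    return [first_size + i * step for i in range(num_groups)]


print(buffer_group_sizes(4, first_size=1, step=2))  # [1, 3, 5, 7]
```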
In another aspect of the embodiments of the present invention, an embodiment of an electronic device is provided, which includes any one of the above embodiments of the clock signal transfer apparatus of the present invention or any one of the above embodiments of the multiplexing chip of the present invention.
In another aspect of an embodiment of the present invention, another embodiment of an electronic device is provided, including: a memory for storing executable instructions;
and a processor in communication with the memory for executing the executable instructions to perform the operations of any of the above embodiments of the clock signal delivery method of the present invention.
In the present specification, the embodiments are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and for parts that are the same or similar the embodiments may be referred to one another. Because the system embodiment basically corresponds to the method embodiment, its description is relatively brief, and for the relevant points reference may be made to the corresponding parts of the description of the method embodiment.
The method and apparatus of the present invention may be implemented in a number of ways. For example, the methods and apparatus of the present invention may be implemented in software, hardware, firmware, or any combination of software, hardware, and firmware. The above-described order for the steps of the method is for illustrative purposes only, and the steps of the method of the present invention are not limited to the order specifically described above unless specifically indicated otherwise. Furthermore, in some embodiments, the present invention may also be embodied as a program recorded in a recording medium, the program including machine-readable instructions for implementing a method according to the present invention. Thus, the present invention also covers a recording medium storing a program for executing the method according to the present invention. The description of the present invention has been presented for purposes of illustration and description, and is not intended to be exhaustive or limited to the invention in the form disclosed. Many modifications and variations will be apparent to practitioners skilled in this art. The embodiment was chosen and described in order to best explain the principles of the invention and the practical application, and to enable others of ordinary skill in the art to understand the invention for various embodiments with various modifications as are suited to the particular use contemplated.

Claims (12)

1. A clock signal transfer method, comprising:
acquiring the forward transmission direction of data based on the transmission of the data from the first core computing unit to the last core computing unit in the multiplexing chip; the multiplexing chip comprises a plurality of core computing units, wherein output data of a previous core computing unit is used as input data of a next core computing unit;
inputting a clock signal from the last core computing unit of the multiplexing chip, and reversely and sequentially transmitting the clock signal to the first core computing unit; the reverse direction of transfer is opposite to the forward direction of transfer,
wherein the inputting of the clock signal from the last core computing unit of the multiplexing chip and the reverse sequential transmission of the clock signal to the first core computing unit comprise:
taking the last core computing unit as a current core computing unit, and inputting, by a clock generator, a generated current clock signal into the current core computing unit and a group of cache units, wherein the group of cache units comprises at least one cache unit;
performing iteration, namely taking the clock signal processed by the cache unit as the current clock signal and taking the previous core computing unit as the current core computing unit, and inputting the current clock signal into the current core computing unit and a group of cache units, until the current core computing unit is the first core computing unit,
wherein the cache unit repairs the received clock signal and, after the clock signal has been repaired to reach a preset standard, transmits the clock signal to the current core computing unit and the group of cache units.
2. The method of claim 1, further comprising: forming all the cache units into a clock tree.
3. The method of claim 2, wherein the set of cache units in the clock tree corresponding to the current core compute unit is a first set of cache units, and the set of cache units corresponding to the previous core compute unit is a second set of cache units; the number of the cache units included in the second group of cache units exceeds the number of the cache units included in the first group of cache units by a preset number.
4. The method of claim 3, wherein the clock tree is a gradually increasing trapezoidal clock tree.
5. The method of any of claims 1-4, wherein the core computing unit comprises a plurality of basic computing units connected in series, each of the basic computing units performing the same operation on the input data.
6. A clock signal transfer apparatus, comprising:
the direction obtaining unit is used for obtaining the forward transmission direction of the data based on the transmission of the data from the first core computing unit to the last core computing unit in the multiplexing chip; the multiplexing chip comprises a plurality of core computing units, wherein output data of a previous core computing unit is used as input data of a next core computing unit;
the clock transmission unit is used for inputting a clock signal from the last core computing unit of the multiplexing chip and reversely and sequentially transmitting the clock signal to the first core computing unit; the reverse direction of transfer is opposite to the forward direction of transfer,
wherein the clock transmission unit includes:
the signal transmission module is used for taking the last core computing unit as a current core computing unit, wherein a clock generator inputs a generated current clock signal into the current core computing unit and a group of cache units, and the group of cache units comprises at least one cache unit;
the iteration module is used for performing iteration, namely taking the clock signal processed by the cache unit as the current clock signal and taking the previous core computing unit as the current core computing unit, and inputting the current clock signal into the current core computing unit and a group of cache units, until the current core computing unit is the first core computing unit,
wherein the cache unit is used for repairing the received clock signal and, after the clock signal has been repaired to reach a preset standard, transmitting the clock signal to the current core computing unit and the group of cache units.
7. The apparatus of claim 6, wherein the clock transmission unit further comprises a tree construction module used for constructing all the cache units into a clock tree.
8. The apparatus of claim 7, wherein a group of cache units in the clock tree corresponding to a current core compute unit is a first group of cache units, and a group of cache units corresponding to a previous core compute unit is a second group of cache units; the number of the cache units included in the second group of cache units exceeds the number of the cache units included in the first group of cache units by a preset number.
9. The apparatus of claim 8, wherein the clock tree is a trapezoidal clock tree comprising a plurality of groups of cache units.
10. The apparatus of any of claims 6-9, wherein the core computing unit comprises a plurality of basic computing units connected in series, each of the basic computing units performing the same operation on input data.
11. An electronic device, characterized in that it comprises the clock signal transfer apparatus according to any one of claims 6 to 10.
12. An electronic device, comprising: a memory for storing executable instructions;
and a processor in communication with the memory to execute the executable instructions to perform the operations of the clock signal transfer method of any of claims 1 to 5.
CN201710958797.XA 2017-10-16 2017-10-16 Clock signal transmission method and device, multiplexing chip and electronic equipment Ceased CN107831824B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710958797.XA CN107831824B (en) 2017-10-16 2017-10-16 Clock signal transmission method and device, multiplexing chip and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710958797.XA CN107831824B (en) 2017-10-16 2017-10-16 Clock signal transmission method and device, multiplexing chip and electronic equipment

Publications (2)

Publication Number Publication Date
CN107831824A CN107831824A (en) 2018-03-23
CN107831824B true CN107831824B (en) 2021-04-06

Family

ID=61648107

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710958797.XA Ceased CN107831824B (en) 2017-10-16 2017-10-16 Clock signal transmission method and device, multiplexing chip and electronic equipment

Country Status (1)

Country Link
CN (1) CN107831824B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114528246A (en) * 2020-11-23 2022-05-24 深圳比特微电子科技有限公司 Operation core, calculation chip and encrypted currency mining machine

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103324774A (en) * 2012-12-29 2013-09-25 东南大学 Processor performance optimization method based on clock planning deviation algorithm

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6049883A (en) * 1998-04-01 2000-04-11 Tjandrasuwita; Ignatius B. Data path clock skew management in a dynamic power management environment
US7856543B2 (en) * 2001-02-14 2010-12-21 Rambus Inc. Data processing architectures for packet handling wherein batches of data packets of unpredictable size are distributed across processing elements arranged in a SIMD array operable to process different respective packet protocols at once while executing a single common instruction stream
EP1423774A1 (en) * 2001-09-06 2004-06-02 Siemens Aktiengesellschaft Option for independently adjusting the timing of the forward and reverse direction of bi-directional digital signals
DE60201508T2 (en) * 2002-05-02 2005-02-03 Alcatel Method for phase control of a data signal, reverse clock circuit and interface device
DE10236328A1 (en) * 2002-08-08 2004-02-19 Koninklijke Philips Electronics N.V. Shift register circuit with improved electromagnetic compatibility has logic elements connected sequentially, connected in pairs using data line, serially connected together in pairs by clock signal
US9053281B2 (en) * 2013-03-21 2015-06-09 Synopsys, Inc. Dual-structure clock tree synthesis (CTS)

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
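Design of a clock and data recovery circuit in an optical receiver chip; Liu Xiaoqiang; China Excellent Master's Theses Full-text Database, Information Science and Technology; 2017-06-30 (No. 06, 2017); I136-1159 *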

Also Published As

Publication number Publication date
CN107831824A (en) 2018-03-23

Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right
  Effective date of registration: 20190418
  Address after: 100192 2nd Floor, Building 25, No. 1 Hospital, Baosheng South Road, Haidian District, Beijing
  Applicant after: BITMAIN TECHNOLOGIES Inc.
  Address before: 100029 Beijing Aubei Industrial Base Project 6 Building 2 Floor
  Applicant before: SUANFENG TECHNOLOGY (BEIJING) CO.,LTD.
GR01 Patent grant
TR01 Transfer of patent right
  Effective date of registration: 20210819
  Address after: 100192 Building No. 25, No. 1 Hospital, Baosheng South Road, Haidian District, Beijing, No. 301
  Patentee after: SUANFENG TECHNOLOGY (BEIJING) Co.,Ltd.
  Address before: 100192 2nd Floor, Building 25, No. 1 Hospital, Baosheng South Road, Haidian District, Beijing
  Patentee before: BITMAIN TECHNOLOGIES Inc.
TR01 Transfer of patent right
  Effective date of registration: 20220224
  Address after: 100176 901, floor 9, building 8, courtyard 8, KEGU 1st Street, Beijing Economic and Technological Development Zone, Daxing District, Beijing (Yizhuang group, high-end industrial area of Beijing Pilot Free Trade Zone)
  Patentee after: Beijing suneng Technology Co.,Ltd.
  Address before: 100192 Building No. 25, No. 1 Hospital, Baosheng South Road, Haidian District, Beijing, No. 301
  Patentee before: SUANFENG TECHNOLOGY (BEIJING) CO.,LTD.
IW01 Full invalidation of patent right
  Decision date of declaring invalidation: 20220914
  Decision number of declaring invalidation: 57970
  Granted publication date: 20210406