WO2021022441A1 - Data transmission method, device, electronic device, and readable storage medium - Google Patents

Data transmission method, device, electronic device, and readable storage medium

Info

Publication number
WO2021022441A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
transmission
network
address
transmitted
Prior art date
Application number
PCT/CN2019/099262
Other languages
English (en)
French (fr)
Inventor
何雷骏
董镇江
屠嘉晋
李震桁
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to CN201980098672.1A (CN114144793A)
Priority to PCT/CN2019/099262 (WO2021022441A1)
Publication of WO2021022441A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology

Definitions

  • the embodiments of the present application relate to computer technology, and in particular, to a data transmission method, device, electronic device, and readable storage medium.
  • data may have the characteristics of sparseness.
  • take a neural network involving data calculation and processing as an example.
  • a neural network generally has a sparse ratio in its feature maps and parameters. Among them, the feature map may have a sparse ratio of 20% to 80%, and the parameter may have a sparse ratio of 50% to 90%. The higher the sparsity ratio, the more zero-value data in the data, and these zero-value data do not contribute to the final calculation result. Therefore, the transmission and calculation of these zero-valued data are invalid operations.
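As a rough illustration of the sparsity ratio, the following sketch counts the zero-valued entries in a data sequence. It uses the eight-value parameter segment that the description gives for Figure 1; the function name `sparsity_ratio` is a hypothetical helper, not something defined in the application.

```python
def sparsity_ratio(values):
    """Return the fraction of entries that are exactly zero."""
    if not values:
        raise ValueError("values must be non-empty")
    zeros = sum(1 for v in values if v == 0)
    return zeros / len(values)


# The 8-element parameter segment described for Figure 1.
weights = [0, 0, 1, 0, 0, 0, 0, -1]
ratio = sparsity_ratio(weights)
print(f"sparsity ratio: {ratio:.0%}")  # 6 of the 8 entries are zero
```

With this segment, six of the eight transfers would carry zero-valued data that cannot affect the result.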
  • data can be stored in a storage medium. When data calculation processing is required, the data needs to be transmitted from the storage medium to the calculation module of the processor for calculation processing.
  • the zero-value data needs to be transferred from the storage medium to the calculation module, and the calculation module needs to perform calculation processing on the zero-value data, which causes a large amount of transmission overhead and computational overhead.
  • the embodiments of the present application provide a data transmission method, device, electronic device, and readable storage medium, which are used to reduce the data transmission overhead and calculation overhead of the electronic device.
  • an embodiment of the present application provides a data transmission method.
  • in the method, at least one piece of data to be transmitted is first obtained from a storage unit.
  • the storage unit is provided with N source addresses, and the data to be transmitted is stored dispersedly in the N source addresses.
  • furthermore, based on the first preset relationship between the source address and the target address, the first transmission sub-network is used to transmit the data to be transmitted stored in the first source address to the N/2th source address to the corresponding target address.
  • the first preset relationship includes: when the source address is K, the corresponding target address is one of 0 to K starting from 0.
  • the above-mentioned first transmission sub-network includes multiple layers, and each layer includes at least one switching node. There is no switching node from the 2^(Y-1)+1 position to the 2^Y position of layer Y. Moreover, when there is at least one switching node from the first position to the 2^Y position in layer Y, each of the at least one switching node does not include an uplink connection line.
  • a transmission network for transmitting data between the source address and the target address is proposed.
  • in the first transmission sub-network of the transmission network, there is no switching node from the 2^(Y-1)+1 position to the 2^Y position of layer Y, and when there is at least one switching node from the first position to the 2^Y position in layer Y, each of the at least one switching node does not include an uplink connection line. As a result, collisions do not occur when data is transmitted through the transmission network.
  • the transmission network has significantly reduced the number of switching nodes and the complexity of the transmission network. Therefore, the transmission network has the advantages of fast transmission speed and less transmission resource occupation.
  • the above method further includes:
  • based on the second preset relationship between the source address and the target address, the second transmission sub-network is used to transmit the data to be transmitted stored in the (N/2+1)th source address to the Nth source address to the corresponding target address.
  • the second preset relationship includes: when the source address is L, the corresponding target address is one of M-1 to M-1-[L%(N/2)] starting from M-1, where M is the number of target addresses and M is less than N.
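The two preset relationships can be sketched as a small lookup that lists, for a given source address, the target addresses the text permits. This is only an illustration: the function name `allowed_targets` is hypothetical, addresses are taken as numbered from 0 (as the description states), N = 8 and M = 4 follow the example values in the text, and clipping to the range 0 to M-1 is an added assumption for parameter choices the text does not discuss.

```python
def allowed_targets(source, n, m):
    """Target addresses permitted for a 0-based source address.

    First half (source < n/2): targets 0..source (first preset relationship).
    Second half: targets m-1 down to m-1-(source % (n/2)) (second relationship).
    Clipping to 0..m-1 is an assumption for cases where the formula
    would step outside the m target addresses.
    """
    if n % 2 != 0 or not 0 <= source < n or not 0 < m < n:
        raise ValueError("expected even n, 0 <= source < n, 0 < m < n")
    half = n // 2
    if source < half:
        return list(range(0, min(source, m - 1) + 1))
    lowest = max(m - 1 - (source % half), 0)
    return list(range(m - 1, lowest - 1, -1))


for src in range(8):
    print(src, allowed_targets(src, n=8, m=4))
```

For N = 8 and M = 4, source address 2 may go to targets 0, 1 or 2, while source address 5 may go to targets 3 or 2.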
  • the second transmission sub-network includes multiple layers, and each layer includes at least one switching node. There is no switching node from the 2^(S-1)+1 position to the 2^S position of layer S, and when there is at least one switching node from the first position to the 2^S position in layer S, each of the at least one switching node does not include an uplink connection line.
  • a transmission network for transmitting data between the source address and the target address is proposed.
  • in the second transmission sub-network of the transmission network, there is no switching node from the 2^(S-1)+1 position to the 2^S position of layer S, and when there is at least one switching node from the first position to the 2^S position in layer S, each of the at least one switching node does not include an uplink connection line.
  • the transmission network has significantly reduced the number of switching nodes and the complexity of the transmission network. Therefore, the transmission network has the advantages of fast transmission speed and less transmission resource occupation.
  • the number of layers of the first transmission sub-network is log2(N)+1, and/or the number of layers of the second transmission sub-network is log2(N)+1.
  • when the first transmission sub-network is used to transmit the data to be transmitted stored in the first source address to the N/2th source address to the corresponding target address, the target address corresponding to that data may be obtained first.
  • the target address is represented by a binary value. Then, starting from the LSB of the target address, the transmission path of the data to be transmitted in the first transmission sub-network is determined according to the value of each bit in the target address, and the data to be transmitted is transmitted to the target address through that transmission path.
  • when the second transmission sub-network is used to transmit the data to be transmitted stored in the (N/2+1)th source address to the Nth source address to the corresponding target address, the target address corresponding to that data may be obtained first.
  • the target address is represented by a binary value. Then, starting from the LSB of the target address, the transmission path of the data to be transmitted in the second transmission sub-network is determined according to the value of each bit in the target address, and the data to be transmitted is transmitted to the target address through that transmission path.
  • the transmission network is used to route the data to be transmitted to the target address according to the LSB, which can further increase the data transmission speed.
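The idea of reading the target address bit by bit from the LSB can be sketched as follows. The text does not state how each bit value maps to a link in the pruned sub-networks, so this sketch assumes a generic butterfly in which stage i overwrites bit i of the current position with bit i of the target address. The function names `lsb_first_bits` and `route` are hypothetical.

```python
def lsb_first_bits(target, num_bits):
    """Bits of the target address, least significant bit first."""
    return [(target >> i) & 1 for i in range(num_bits)]


def route(source, target, num_bits):
    """Positions visited when stage i copies bit i of the target address.

    Assumption: this models a generic butterfly, not the pruned
    sub-networks described in the text.
    """
    position = source
    path = [position]
    for i, bit in enumerate(lsb_first_bits(target, num_bits)):
        # Clear bit i of the current position, then set it to the target's bit.
        position = (position & ~(1 << i)) | (bit << i)
        path.append(position)
    return path


print(lsb_first_bits(2, 3))  # [0, 1, 0]
print(route(5, 2, 3))        # [5, 4, 6, 2]
```

Because each stage inspects a single bit, the path is decided locally at every layer without a global lookup, which is one plausible reading of why LSB-based routing is described as fast.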
  • the aforementioned target address is an address in a calculation module, and the calculation module includes at least M addresses.
  • before the first transmission sub-network is used to transmit the data to be transmitted stored in the first source address to the N/2th source address to the corresponding target address, it may first be judged whether the number of data to be transmitted is greater than M. If the number of data to be transmitted is greater than M, the at least one data to be transmitted is divided into multiple groups of sub-data, and each group of sub-data is transmitted under one transmission clock.
  • the data to be transmitted is divided into multiple groups of sub-data, and each group of sub-data is transmitted under a different clock, thereby avoiding conflicts in data transmission and calculation and ensuring the correctness of data transmission and calculation.
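A minimal sketch of the grouping step, assuming the groups are formed as consecutive chunks of at most M items (the text does not say how the groups are chosen). The function name `split_into_clock_groups` and the sample values are hypothetical.

```python
def split_into_clock_groups(items, m):
    """Split items into consecutive groups of at most m, one group per clock."""
    if m <= 0:
        raise ValueError("m must be positive")
    return [items[i:i + m] for i in range(0, len(items), m)]


# Six hypothetical valid values with M = 4 target addresses.
pending = [7, -2, 5, 1, 9, 3]
for clock, group in enumerate(split_into_clock_groups(pending, 4)):
    print(f"clock {clock}: {group}")
```

When the number of items does not exceed M, the function returns a single group, so the one-clock case needs no special handling.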
  • N is 8 and M is 4.
  • an embodiment of the present application provides a data transmission device, which includes a storage unit, a target module, a transmission network, and a control module.
  • N source addresses are set in the storage unit, and multiple target addresses are set in the target module.
  • the transmission network is respectively connected with the storage unit and the target module.
  • the transmission network includes a first transmission sub-network, the first transmission sub-network includes a plurality of layers, and each layer includes at least one switching node. There is no switching node from the 2^(Y-1)+1 position to the 2^Y position of layer Y, and when there is at least one switching node from the first position to the 2^Y position in layer Y, each of the at least one switching node does not include an uplink connection line.
  • the control module is used to obtain at least one piece of data to be transmitted from the storage unit, where the data to be transmitted is stored in the above N source addresses, and, based on the first preset relationship between the source address and the target address, to use the first transmission sub-network to transmit the data to be transmitted stored in the first source address to the N/2th source address to the corresponding target address, where the first preset relationship includes: when the source address is K, the corresponding target address is one of 0 to K starting from 0.
  • the transmission network further includes a second transmission sub-network.
  • the second transmission sub-network includes multiple layers, and each layer includes at least one switching node. There is no switching node from the 2^(S-1)+1 position to the 2^S position of layer S, and when there is at least one switching node from the first position to the 2^S position in layer S, each of the at least one switching node does not include an uplink connection line.
  • the control module is also configured to use the second transmission sub-network, based on the second preset relationship between the source address and the target address, to transmit the data to be transmitted stored in the (N/2+1)th source address to the Nth source address to the corresponding target address, where the second preset relationship includes: when the source address is L, the corresponding target address is one of M-1 to M-1-[L%(N/2)] starting from M-1, M is the number of target addresses, and M is less than N.
  • the number of layers of the first transmission sub-network is log2(N)+1, and/or the number of layers of the second transmission sub-network is log2(N)+1.
  • control module is specifically used for:
  • control module is specifically used for:
  • obtaining the target address corresponding to the data to be transmitted stored in the (N/2+1)th source address to the Nth source address, where the target address is represented by a binary value; and, starting from the LSB of the target address, determining the transmission path of the data to be transmitted in the second transmission sub-network according to the value of each bit in the target address, and transmitting the data to be transmitted to the target address through that transmission path.
  • the target module is a calculation module, and the calculation module includes at least M addresses.
  • control module is also used to:
  • At least one data to be transmitted is divided into multiple groups of sub-data, and each group of sub-data is transmitted under one transmission clock.
  • N is 8 and M is 4.
  • an embodiment of the present application provides an electronic device, including: a memory and a processor.
  • the processor is configured to be coupled with the memory, read and execute instructions in the memory, so as to implement the method steps described in the first aspect.
  • an embodiment of the present application provides a computer program product, characterized in that the computer program product includes computer program code, and when the computer program code is executed by a computer, the computer is caused to execute the method described in the first aspect.
  • an embodiment of the present application provides a computer-readable storage medium, wherein the computer storage medium stores computer instructions, and when the computer instructions are executed by a computer, the computer executes the instructions of the method described in the first aspect.
  • an embodiment of the present application provides a chip, which is connected to a memory, and is used to read and execute a software program stored in the memory to implement the method provided in the first aspect.
  • Figure 1 is an example diagram of the process of a convolution operation on a section of parameters (weights) and a section of feature map in a certain neural network;
  • FIG. 2 is a schematic flowchart of a data transmission method provided by an embodiment of the application.
  • Figure 3 is a schematic diagram of the structure of a traditional butterfly network
  • Figure 4 is a schematic diagram of the structure of the reverse butterfly network
  • Figure 5(a) shows the evolution of the first half of the sub-network
  • Figure 5(b) is the structure diagram of the transmission network after the evolution
  • Figure 6(a) shows the evolution of the second half of the sub-network
  • Figure 6(b) is the network structure diagram after the evolution
  • Figure 7 is a schematic diagram of a network structure obtained by simultaneously using the two-part sub-network evolution method shown in the previous section;
  • FIG. 8 is a schematic flowchart of a data transmission method provided by an embodiment of the application.
  • FIG. 9 is a module structure diagram of a data transmission device provided by an embodiment of the application.
  • FIG. 10 is a schematic structural diagram of an electronic device provided by an embodiment of the application.
  • Figure 1 is an example diagram of the process of performing a convolution operation on a section of parameters (weights) and a section of feature map in a certain neural network.
  • the neural network includes a section of parameters and a section of feature map.
  • This segment of parameter is composed of multiple data, and some of the multiple data are 0.
  • This piece of feature map is composed of multiple data, and some of the multiple data are zero.
  • data is first stored in a storage medium before the calculation, and further, when performing arithmetic processing, it needs to be sent to a calculation module for calculation. After the calculation module performs calculation processing, it may also need to send the calculation result to the next calculation module, and so on.
  • the address where the data before the operation is stored is called the source address
  • the address of the data in the calculation module during the operation processing is called the target address. Data needs to be transferred from the source address to the destination address and processed.
  • the source address may refer to an address in a storage medium, such as an address in a storage medium such as static RAM (SRAM) and dynamic RAM (DRAM), etc.
  • the source address may also refer to the address in the calculation module.
  • the target address can refer to the address in the calculation module.
  • data refers to data that can be used for calculations, such as half-precision floating-point numbers, full-precision floating-point numbers, integers, etc.
  • data can be expressed in decimal or binary.
  • the embodiments of this application do not specifically limit the representation of data. Taking the parameters shown in Figure 1 as an example, a section of parameters is composed of the 8 data 0, 0, 1, 0, 0, 0, 0, -1, and each data is an integer expressed in decimal.
  • a section of parameter and a section of characteristic diagram can be respectively referred to as a data sequence.
  • the data sequence is uniformly transmitted from the source address to the target address. Specifically, one piece of data in the data sequence is stored in one source address, one source address corresponds to one target address, and the data in each source address is respectively transmitted to the corresponding target address.
  • the valid data in the data sequence is first filtered out before the data sequence is transmitted from the source address to the target address.
  • the valid data can refer to the data in the data sequence that contributes to the operation result.
  • the valid data is transmitted through the transmission network between the source module where the source address is located and the target module where the target address is located.
  • the source module includes multiple addresses
  • the target module also includes multiple addresses. The data saved in each address of the source module is transmitted to each address of the target module through the transmission network between the source module and the target module.
  • for example, the source module is SRAM and the target module is a certain calculation module A. The data stored in the 8 addresses in SRAM can be transmitted to the 4 addresses of calculation module A through the transmission network between SRAM and calculation module A.
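One way to picture the move from 8 source addresses to 4 target addresses is the following sketch. The assignment policy is an assumption, not something the text prescribes: valid items from the first half of the source addresses are packed upward from target 0, and valid items from the second half are packed downward from target M-1, which keeps every item inside the range allowed by the two preset relationships. The function name `assign_targets` is hypothetical.

```python
def assign_targets(valid_sources, n, m):
    """Map each valid 0-based source address to a target address.

    Assumed policy: first-half sources fill targets 0, 1, 2, ... and
    second-half sources fill targets m-1, m-2, ...
    """
    if len(valid_sources) > m:
        raise ValueError("more valid items than targets; split into groups first")
    half = n // 2
    mapping = {}
    front, back = 0, m - 1
    for src in sorted(valid_sources):
        if src < half:
            mapping[src] = front
            front += 1
        else:
            mapping[src] = back
            back -= 1
    return mapping


# Sources 2 and 7 hold the valid parameter values in the Figure 1 example.
print(assign_targets([2, 7], n=8, m=4))  # {2: 0, 7: 3}
```

Since at most M items are assigned, the upward and downward runs cannot meet, so no two sources share a target address.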
  • the transmission overhead and calculation overhead of sparse data can be greatly reduced.
  • the following embodiments of the present application aim to provide a data transmission method based on a transmission network with fast transmission speed and less transmission resource occupancy, so that when data with sparseness is transmitted based on the network, transmission overhead and calculation overhead can be greatly reduced.
  • the method can be applied to any electronic device including a storage medium and a computing module.
  • the electronic device may be a communication device such as a terminal device and a network device, or the electronic device may also be a server or the like.
  • the terminal device may also be called a terminal, user equipment (UE), mobile station (MS), mobile terminal (MT), and so on.
  • the terminal device can be a mobile phone, a tablet computer (pad), a computer with a wireless transceiver function, virtual reality (VR) terminal equipment, augmented reality (AR) terminal equipment, a wireless terminal in industrial control, a wireless terminal in self-driving, a wireless terminal in remote medical surgery, a wireless terminal in a smart grid, a wireless terminal in transportation safety, a wireless terminal in a smart city, a wireless terminal in a smart home, etc.
  • the network device may be a base station; for example, it may be a base transceiver station (BTS) in the global system for mobile communication (GSM) or code division multiple access (CDMA), a base station (NodeB) in wideband code division multiple access (WCDMA), an evolved base station (eNB or e-NodeB, evolutional Node B) in LTE, or a gNB in NR, etc.
  • the base station can also be a wireless controller in a cloud radio access network (CRAN) scenario, or can be a relay station, an access point, in-vehicle equipment, wearable equipment, network equipment in a 5G network, or network equipment in a future evolved PLMN network, etc.
  • FIG. 2 is a schematic flowchart of a data transmission method provided by an embodiment of the application. As shown in FIG. 2, the method may include:
  • the above at least one data to be transmitted may be data in a data sequence.
  • the data sequence may be any data sequence in the electronic device that needs to be transmitted to the calculation module for calculation processing; for example, it may be the section of parameters or the section of feature map illustrated in Figure 1 above.
  • the calculation module may also be referred to as a calculation unit.
  • the above-mentioned storage unit may be SRAM, DRAM, etc.
  • the storage unit includes multiple source addresses, and each source address may store one piece of data to be transmitted.
  • the aforementioned data to be transmitted may be valid data in a data sequence.
  • the electronic device may pre-mark valid data in the data sequence.
  • the electronic device may determine the valid data in the data sequence and in another data sequence according to the other data sequence that is operated on with the data sequence and the operation performed on the two sequences, and mark the valid data. Take as an example the case where the data sequence is the section of parameters containing 0 of the neural network in Fig. 1 and the other data sequence is the section of feature map of the neural network in Fig. 1.
  • the electronic device first reads the section of parameters and the section of feature map that needs to be calculated with it, and learns that the section of parameters and the section of feature map need to be multiplied. The electronic device then marks as valid data the data for which the result of multiplying the section of parameters by the section of feature map is not 0.
  • specifically, the data corresponding to subscripts 2 and 7 in Fig. 1 are marked as valid data; that is, in the section of parameters of the neural network illustrated in Fig. 1, the valid data are 1 and -1, and in the section of feature map illustrated in Fig. 1, the valid data are 3 and 5. After the valid data is marked, the valid data is stored dispersedly in the N source addresses of the storage unit.
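The marking step can be sketched as an element-wise check for a non-zero product. The parameter values and the two valid feature-map values (3 at subscript 2 and 5 at subscript 7) come from the text; the remaining feature-map entries below are hypothetical placeholders, since the text does not list them. The function name `mark_valid` is also hypothetical.

```python
def mark_valid(weights, features):
    """Return the subscripts whose element-wise product is non-zero."""
    if len(weights) != len(features):
        raise ValueError("sequences must have the same length")
    return [i for i, (w, f) in enumerate(zip(weights, features)) if w * f != 0]


weights = [0, 0, 1, 0, 0, 0, 0, -1]
# Hypothetical feature map: only entries 2 and 7 are taken from the text.
features = [0, 4, 3, 0, 2, 0, 0, 5]

valid = mark_valid(weights, features)
print(valid)                          # [2, 7]
print([weights[i] for i in valid])    # [1, -1]
print([features[i] for i in valid])   # [3, 5]
```

A non-zero feature value paired with a zero parameter (subscripts 1 and 4 here) is not marked, because it cannot contribute to the multiplication result.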
  • the foregoing N source addresses may be all addresses in the module storing the data to be transmitted, or may be part of the addresses in that module, which is not specifically limited in the embodiments of this application.
  • the above N is an even number.
  • a segment of parameters includes 8 data, respectively 0, 0, 1, 0, 0, 0, 0, -1, and the 8 data are respectively stored in 8 source addresses of the storage unit.
  • N is 8.
  • the aforementioned at least one data to be transmitted is respectively transmitted to a target address in the calculation module.
  • the calculation module where the target address is located may include at least M addresses.
  • the data in the N source addresses are transferred to the M addresses of the calculation module.
  • M is less than N, that is, the number of data calculated in a single calculation by the calculation module is less than the number of data stored in the module where the source address is located, so as to align the valid data.
  • the first source address among the N source addresses and the target address corresponding to the first source address satisfy the first preset relationship.
  • the first preset relationship includes: when the source address is K, the target address is one of 0 to K starting from 0, where K is a number greater than or equal to zero.
  • the above-mentioned first source address is any source address from the first source address to the N/2th source address.
  • the mapping relationship between the source address and the destination address can be expressed in an entry management manner.
  • when the mapping relationship between the source address and the target address is a fully connected relationship, the data stored in a source address may be transmitted to any target address. In the embodiment of the present application, by contrast, for a source address K from the first source address to the N/2th source address, the corresponding target address is no longer an arbitrary target address, but one of 0 to K.
  • Such a design can simplify the complexity of the transmission network under the premise of ensuring the normal transmission of data.
  • the source address and the target address are numbered starting from 0. Therefore, one of 0 to K represents the first target address to the K+1th target address.
  • the target address is the address in the calculation module, and the calculation module includes M addresses. The M addresses are numbered starting from 0. Therefore, target address 0 represents the first target address in the calculation module, and target address M-1 represents the M-th target address in the calculation module.
  • the data stored in these addresses is transmitted to the addresses of the calculation module starting from address 0, and the target address to which the data in source address K is transferred is one of 0 to K.
  • that is, the corresponding target addresses are arranged in forward order.
  • data is transmitted between the module where the source address is located and the calculation module through a specific transmission network.
  • the above-mentioned transmission network includes a first transmission sub-network, and the first transmission sub-network is used to transmit the data to be transmitted stored in the first source address to the N/2-th source address.
  • the transmission sub-network includes multiple layers, and each layer includes at least one switching node. There is no switching node from the 2^(Y-1)+1 position to the 2^Y position of layer Y, and when there is at least one switching node from the first position to the 2^Y position in layer Y, each of the at least one switching node does not include an uplink connection line.
  • the symbol "^" represents a power operation; for example, 2^Y represents 2 to the power of Y. This is not explained separately below.
  • the switching node may be a logic device implemented by circuit logic.
  • the switching node may be a 2-2 multiplexer (MUX) or the like.
  • the first transmission sub-network is used to transmit the data to be transmitted stored in the first source address to the N/2th source address, that is, the first transmission sub-network is used to transmit the data in the first half of the source address Data to be transmitted.
  • the number of layers of the transmission network can be flexibly set.
  • the number of layers of the transmission network may be determined according to the number of source addresses. When the number of source addresses is the aforementioned N, the number of layers of the transmission network can be log2(N)+1, rounded up.
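The layer count can be checked with a few lines of integer arithmetic. The function name `num_layers` is hypothetical; the formula is the one stated in the text, log2(N)+1 rounded up.

```python
def num_layers(n):
    """Number of layers for n source addresses: ceil(log2(n)) + 1."""
    if n < 2:
        raise ValueError("n must be at least 2")
    # For n >= 1, (n - 1).bit_length() equals ceil(log2(n)) without float error.
    return (n - 1).bit_length() + 1


print(num_layers(8))  # 4 layers, numbered 0 to 3
print(num_layers(6))  # log2(6) is about 2.58, rounded up to 3, plus 1
```

For N = 8 this gives four layers, which matches the layer numbers 0 to 3 used when the switching nodes are numbered later in the description.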
  • the transmission network of the embodiment of the present application can be evolved on the basis of the traditional transmission network.
  • the following uses a traditional butterfly network as an example to illustrate the characteristics of the transmission network in the embodiment of the present application.
  • FIG 3 is a schematic diagram of the structure of a traditional butterfly network.
  • the butterfly network is responsible for transmitting data from 8 source addresses to 4 target addresses. Data in different source addresses may need to be transmitted over the same transmission line, which may cause collisions. For example, the data stored in source address 0 and the data stored in source address 4 may need to use the transmission line between node 1 and node 2 at the same time, thereby causing a collision.
  • Figure 4 is a schematic diagram of the reverse butterfly network structure.
  • the network includes two transmission sub-networks. One transmission sub-network (the first half of the transmission sub-network) is responsible for transmitting the data in the first half of the source addresses to the target address, and the other transmission sub-network (the second half of the transmission sub-network) is responsible for transmitting the data in the second half of the source addresses to the target address.
  • the first half of the source address and the second half of the source address respectively refer to: assuming that the network includes N source addresses, the first half of the source addresses refer to source addresses 0 to N/2-1, and the second half of source addresses refer to N/2 to N- 1.
  • Both transmission sub-networks include multiple layers, and each layer includes multiple switching nodes. Each node in the first layer of each transmission sub-network is connected to each source address, and each node in the last layer of each transmission sub-network is connected to each target address. It is worth noting that in the network structure shown in FIG. 4, for the switching node A, the switching node B, the switching node C, and the switching node D connected to the target address, they belong to two transmission sub-networks at the same time.
  • the first half of the transmission sub-network can be evolved to obtain the transmission network described in step S202.
  • Figure 5(a) is the evolution process of the first half of the sub-network
  • Figure 5(b) is the structure diagram of the transmission network after the evolution.
  • the transmission network includes a first transmission sub-network and a second transmission sub-network.
  • the first transmission subnet is responsible for transmitting the data in the first half of the source address to the target address
  • the second transmission subnet is responsible for transmitting the data in the second half of the source address to the target address.
  • Both sub-networks include multiple layers, and each layer includes multiple switching nodes.
  • Each node in the first layer of each transmission sub-network is connected to each source address, and each node in the last layer of each sub-network is connected to each target address. It is worth noting that in the network structure shown in Figure 5(b), for the switching node connected to the target address, it belongs to two sub-networks at the same time. At the same time, based on the aforementioned first preset relationship, the first half of the sub-networks in the reverse butterfly network shown in FIG. 4 can be evolved.
  • the evolution is as follows, where Y is greater than or equal to 0 and less than or equal to the number of layers of the transmission network minus 1.
  • when the number of layers of the first transmission sub-network is log2(N)+1 rounded up, the value of Y is greater than or equal to 0 and less than or equal to log2(N) rounded up.
  • the number of layers of the first transmission sub-network and the number of layers of the second transmission sub-network are respectively the same as the number of layers of the transmission network.
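  • The layer count stated above can be checked with a short sketch. The function names `layer_count` and `layer_indices` are hypothetical and are not part of the application; the sketch only assumes that the number of layers is log2(N) rounded up, plus 1, and that layers are numbered from 0.

```python
def layer_count(n_sources: int) -> int:
    """Number of layers: log2(N) rounded up, plus 1 (hypothetical helper)."""
    if n_sources < 1:
        raise ValueError("n_sources must be at least 1")
    # (n - 1).bit_length() equals ceil(log2(n)) for every integer n >= 1.
    return (n_sources - 1).bit_length() + 1


def layer_indices(n_sources: int) -> list:
    """Valid values of the layer index Y, numbered from 0."""
    return list(range(layer_count(n_sources)))


if __name__ == "__main__":
    print(layer_count(8))
    print(layer_indices(8))
```

  • With N = 8 this gives 4 layers, numbered 0 to 3, which agrees with the 4-layer sub-networks described for Figure 5(b) and Figure 6(b).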
  • the switching nodes in each layer of the first transmission sub-network may be numbered as follows:
  • Switching node 0 represents the first switching node, and switching node 2^Y-1 represents the 2^Y-th switching node.
  • The switching node connected to the smallest source address has the smallest number, and so on.
  • For example, the switching node in layer 0 connected to source address 0 is switching node 0, the switching node in layer 0 connected to source address 1 is switching node 1, and so on.
  • the numbers of each switching node are respectively consistent with the numbers of the switching nodes in the first layer with the same positions as the switching nodes.
  • For example, there are 4 switching nodes in layer 1. The lowest switching node is in the same position as switching node 0 in layer 0, that is, both are the lowest switching node in their layer, so the lowest switching node in layer 1 is switching node 0.
  • Likewise, the next-lowest switching node in layer 1 is at the same position as switching node 1 in layer 0, so the next-lowest switching node in layer 1 is switching node 1.
  • the number of each switching node in the other layers except for layer 0 and the last layer of the first transmission sub-network can be obtained.
  • the switching node connected to the smallest target address has the smallest number, and so on.
  • the switching node in layer 3 connected to the target address 0 is switching node 0
  • the switching node in layer 3 connected to the target address 1 is switching node 1. And so on.
  • sequence numbers of the source address and the target address are also numbered starting from 0.
  • source address 0 represents the first source address, and so on.
  • upward transmission refers to the switching node with a smaller number transmitting data to a switching node with a larger number.
  • For example, in Figure 5(a), when switching node 0 of layer 1 transmits data to switching node 2 of layer 2, the transmission is an upward transmission.
  • the uplink connection line refers to the connection between the switching node with the smaller number in the lower-numbered layer and the switching node with the larger number in the higher-numbered layer.
  • For example, layer 1 is the lower-numbered layer and layer 2 is the higher-numbered layer.
  • Of switching node 0 in layer 1 and switching node 2 in layer 2, switching node 0 in layer 1 has the smaller number and switching node 2 in layer 2 has the larger number.
  • Therefore, the connection from switching node 0 in layer 1 to switching node 2 in layer 2 is an uplink connection line.
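  • The definition of an uplink connection line can be written as a small predicate. This is only an illustrative sketch; the function name `is_uplink` and its argument order are assumptions, not part of the application.

```python
def is_uplink(lower_layer: int, lower_node: int,
              higher_layer: int, higher_node: int) -> bool:
    """True if the link runs from a smaller-numbered node in a lower-numbered
    layer to a larger-numbered node in a higher-numbered layer."""
    return lower_layer < higher_layer and lower_node < higher_node


if __name__ == "__main__":
    # Switching node 0 in layer 1 to switching node 2 in layer 2.
    print(is_uplink(1, 0, 2, 2))
    # Switching node 1 in layer 0 to switching node 1 in layer 2.
    print(is_uplink(0, 1, 2, 1))
```

  • The first call corresponds to the example above and is an uplink connection line; the second keeps the same node number across layers and is not.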
  • the layer Y is a layer other than the first layer and the last layer in the first transmission sub-network.
  • the switching nodes deleted in this step include 2*2 switching nodes and 2*1 switching nodes.
  • a 2*2 switching node refers to a node including 2 input connections and 2 output connections
  • a 2*1 switching node refers to a node including 2 input connections and 1 output connection.
  • Switching node 1 in layer 1 is only used to connect switching node 1 of layer 0 and switching node 1 of layer 2. Therefore, after switching node 1 of layer 1 is deleted, switching node 1 of layer 0 is directly connected to switching node 1 of layer 2, and the normal transmission of the data in the source addresses is not affected.
  • Switching node 2 and switching node 3 of layer 2 are also deleted. Switching node 3 of layer 1 is directly connected to switching node 3 of layer 3, and switching node 2 of layer 1 is directly connected to switching node 2 of layer 3.
  • this step can be performed independently of the above (1) and (2), or, if the above (1) and (2) are performed, the result of this step can be satisfied.
  • the Y layer of the resulting transmission network meets the following conditions:
  • each of the at least one switching node does not include an uplink connection line.
  • the transmission network in Figure 5(b) is used to transmit data in 8 source addresses to 4 destination addresses
  • the first transmission subnet in the transmission network is used to transmit data in source addresses 0 to 3.
  • the first transmission sub-network includes 4 layers, which are layer 0, layer 1, layer 2, and layer 3.
  • Layer 0 includes 4 switching nodes, which are switching node 0, switching node 1, switching node 2, and switching node 3 .
  • Layer 1 includes 3 switching nodes, namely node 0, node 2 and node 3.
  • Layer 2 includes two switching nodes, namely node 0 and node 1.
  • Layer 3 includes 4 switching nodes. For the connection mode of each switching node in each layer, refer to Figure 5(b), which will not be described here.
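  • The per-layer node lists above follow from the rule that positions 2^(Y-1)+1 to 2^Y of an intermediate layer Y hold no switching node. The sketch below reproduces them under two assumptions: positions are counted from 1 while node numbers start at 0, and the first and last layers keep all of their nodes. The names `vacant_positions` and `remaining_nodes` are hypothetical.

```python
def vacant_positions(layer: int) -> range:
    """1-based positions 2^(Y-1)+1 .. 2^Y that hold no switching node in
    an intermediate layer Y (Y >= 1)."""
    if layer < 1:
        raise ValueError("the rule applies to layers after layer 0")
    return range(2 ** (layer - 1) + 1, 2 ** layer + 1)


def remaining_nodes(layer: int, nodes_per_layer: int, num_layers: int) -> list:
    """0-based numbers of the switching nodes left in a layer after evolution."""
    if layer == 0 or layer == num_layers - 1:
        # Assumption: the first and last layers are not pruned.
        return list(range(nodes_per_layer))
    vacant = {position - 1 for position in vacant_positions(layer)}
    return [node for node in range(nodes_per_layer) if node not in vacant]


if __name__ == "__main__":
    for y in range(4):
        print(y, remaining_nodes(y, nodes_per_layer=4, num_layers=4))
```

  • For the 4-layer sub-network with 4 nodes per layer, this yields nodes 0, 2 and 3 in layer 1 and nodes 0 and 1 in layer 2, matching the description of Figure 5(b).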
  • When the source address and the destination address meet the above-mentioned first preset relationship, no collision occurs when the data in the first half of the source addresses is transmitted through the first transmission sub-network shown in FIG. 5(b).
  • In addition, compared with a traditional collision-free transmission network such as a crossbar network, the first transmission sub-network shown in Figure 5(b) has significantly fewer switching nodes, which significantly reduces the complexity of the transmission network.
  • A transmission network for transmitting data between the source address and the target address is proposed.
  • In the first transmission sub-network of the transmission network, there is no switching node from the (2^(Y-1)+1)-th position to the 2^Y-th position of layer Y, and when there is at least one switching node from the first position to the 2^Y-th position of layer Y, each of the at least one switching node does not include an uplink connection line. When data is transmitted through the transmission network, collisions do not occur.
  • In addition, the transmission network has significantly fewer switching nodes and significantly lower complexity. Therefore, the transmission network has the advantages of fast transmission speed and low transmission resource occupation.
  • the second source address and the target address corresponding to the second source address satisfy a second preset relationship
  • The second preset relationship includes: when the second source address is L, the corresponding destination address is one of M-1 to M-1-[L%(N/2)], counting down from M-1, where the second source address is any one of the (N/2+1)-th source address to the N-th source address.
  • L is a number greater than zero.
  • The data stored in these addresses is transmitted to the addresses starting from M-1 in the module for calculation, and the destination address to which the data in source address L is transferred is one of M-1 to M-1-[L%(N/2)].
  • That is, the corresponding destination addresses are arranged in reverse order.
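  • As a hedged illustration of the second preset relationship, the sketch below lists the destination addresses allowed for a source address L in the second half. The function name `second_relation_targets` is hypothetical, source addresses are assumed to be numbered from 0, and clamping the lower bound at address 0 is an assumption that the text does not state.

```python
def second_relation_targets(source: int, n_sources: int, m_targets: int) -> list:
    """Destination addresses M-1 down to M-1-[L % (N/2)] for source address L."""
    half = n_sources // 2
    if not half <= source < n_sources:
        raise ValueError("source must lie in the second half of the source addresses")
    # Assumption: the range never extends below destination address 0.
    lowest = max(m_targets - 1 - (source % half), 0)
    return list(range(m_targets - 1, lowest - 1, -1))


if __name__ == "__main__":
    for source in range(4, 8):
        print(source, second_relation_targets(source, n_sources=8, m_targets=4))
```

  • With N = 8 and M = 4, source address 4 can only reach destination address 3, while source address 7 can reach any of destination addresses 3, 2, 1 and 0.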
  • The above-mentioned transmission network further includes a second transmission sub-network, which is used to transmit the data to be transmitted stored in the (N/2+1)-th source address to the N-th source address.
  • The second transmission sub-network includes multiple layers, and each layer includes at least one switching node. There is no switching node from the (2^(S-1)+1)-th position to the 2^S-th position of layer S, and when there is at least one switching node from the first position to the 2^S-th position of layer S, each of the at least one switching node does not include an uplink connection line.
  • the aforementioned second transmission sub-network can be used to transmit the data to be transmitted stored in the N/2+1th source address to the Nth source address to the corresponding destination address.
  • the foregoing evolution process may be used to evolve the second transmission sub-network in the transmission network.
  • Figure 6(a) is the evolution process of the second half of the sub-network
  • Figure 6(b) is the network structure diagram after the evolution.
  • The following evolution process is performed on the second half of the sub-network of the reverse butterfly network shown in FIG. 4. It is worth noting that in Figures 6(a) and 6(b), the second transmission sub-network is first connected according to the above-mentioned reverse arrangement. Specifically, node 0 of layer 2 is connected to node 3 of layer 3, node 1 of layer 2 is connected to node 2 of layer 3, and so on.
  • The evolution is as follows, where S is greater than or equal to 0 and less than or equal to the number of layers of the second transmission sub-network minus 1.
  • When the number of layers of the second transmission sub-network is the result of log2(N)+1 rounded up, S is greater than or equal to 0 and less than or equal to the result of log2(N) rounded up.
  • In layer S, the uplink connection lines of switching node 0 to switching node 2^S-1 are omitted.
  • the switching nodes in each layer of the second transmission sub-network may be numbered as follows:
  • Switching node 0 represents the first switching node, and switching node 2^Y-1 represents the 2^Y-th switching node.
  • The switching node connected to the smallest source address has the smallest number, and so on.
  • For example, the switching node in layer 0 connected to source address 0 is switching node 0, the switching node in layer 0 connected to source address 1 is switching node 1, and so on.
  • the numbers of each switching node are respectively consistent with the numbers of the switching nodes in the first layer with the same positions as the switching nodes.
  • For example, the lowest switching node in layer 1 is in the same position as switching node 0 in layer 0, that is, both are the lowest switching node in their layer, so the lowest switching node in layer 1 is switching node 0.
  • Likewise, the next-lowest switching node in layer 1 is at the same position as switching node 1 in layer 0, so the next-lowest switching node in layer 1 is switching node 1.
  • the number of each switching node in the other layers except for layer 0 and the last layer of the second transmission sub-network can be obtained.
  • the switching node connected to the smallest target address has the smallest number, and so on.
  • For example, the switching node in layer 3 connected to target address 0 is switching node 0, the switching node in layer 3 connected to target address 1 is switching node 1, and so on.
  • the layer S is a layer other than the first layer and the last layer in the second transmission sub-network.
  • the switching nodes deleted in this step include 2*2 switching nodes and 2*1 switching nodes.
  • this step can be performed independently of the above (1) and (2), or, if the above (1) and (2) are performed, the result of this step can be satisfied.
  • the S layer of the resulting transmission network satisfies the following conditions:
  • each of the at least one switching node does not include an uplink connection line.
  • the transmission network in Figure 6(b) is used to transmit data in 8 source addresses to 4 destination addresses
  • the second transmission sub-network in the transmission network is used to transmit data in source addresses 4 to 7
  • the second transmission sub-network includes 4 layers, namely layer 0, layer 1, layer 2, and layer 3, and layer 0 includes 4 switching nodes, namely switching node 0, switching node 1, switching node 2, and switching node 3 .
  • Layer 1 includes 3 switching nodes, namely node 0, node 2 and node 3.
  • Layer 2 includes two switching nodes, namely node 0 and node 1.
  • Layer 3 includes 4 switching nodes.
  • the connection mode of each switching node in each layer can be referred to Figure 6(b), which will not be described here.
  • When the source address and the target address satisfy the above-mentioned second preset relationship, no collision occurs when the data in the second half of the source addresses is transmitted through the second transmission sub-network shown in FIG. 6(b).
  • In addition, compared with a traditional collision-free transmission network such as a crossbar network, the second transmission sub-network shown in Figure 6(b) has significantly fewer switching nodes, which significantly reduces the complexity of the transmission network.
  • A transmission network for transmitting data between the source address and the target address is proposed.
  • In the second transmission sub-network of the transmission network, there is no switching node from the (2^(S-1)+1)-th position to the 2^S-th position of layer S, and when there is at least one switching node from the first position to the 2^S-th position of layer S, each of the at least one switching node does not include an uplink connection line.
  • In addition, the transmission network has significantly fewer switching nodes and significantly lower complexity. Therefore, the transmission network has the advantages of fast transmission speed and low transmission resource occupation.
  • The transmission network can use the structure shown in Figure 5(b) above, in which only the first transmission sub-network uses the evolved network structure, or the structure shown in Figure 6(b) above, in which only the second transmission sub-network uses the evolved network structure.
  • The transmission network can also use the structure shown in Figure 7 below.
  • Figure 7 is a schematic diagram of the network structure obtained by applying both of the sub-network evolution methods described above at the same time.
  • The structure of the first transmission sub-network is the same as that of the first transmission sub-network in Fig. 5(b), and the structure of the second transmission sub-network is the same as that of the second transmission sub-network in Fig. 6(b), which will not be repeated here.
  • Table 1 is an example of comparing the above-mentioned Figure 5(b) and the above-mentioned Figure 7 with the conventional transmission network.
  • Compared with the traditional crossbar network, the networks in Figure 5(b) and Figure 7 above use far fewer 2*2 switching nodes and connecting lines. Compared with the traditional butterfly network, they avoid collisions.
  • The following describes the specific process of data transmission in step S202 based on the above-mentioned transmission network.
  • FIG. 8 is a schematic flowchart of a data transmission method provided by an embodiment of the application. As shown in FIG. 8, the process of using the above-mentioned first transmission sub-network to transmit data to be transmitted to a target address includes:
  • the target address of the data to be transmitted may be obtained according to a preset correspondence between the number of the data to be transmitted and the target address.
  • For example, the target address of the first data to be transmitted is address 0, and the target address of the second data to be transmitted is address 1.
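  • The correspondence between the ordinal number of the data to be transmitted and its target address can be sketched as follows for the first half of the source addresses. The function name `assign_first_half_targets` is hypothetical, and treating every non-zero value as data to be transmitted is an assumption based on the background description of 0-value data.

```python
def assign_first_half_targets(first_half: list) -> list:
    """Return (source_address, value, target_address) for each non-zero value.

    The first data to be transmitted goes to target address 0, the second
    to target address 1, and so on.
    """
    assignments = []
    for source, value in enumerate(first_half):
        if value != 0:
            # The next free target address equals the count assigned so far.
            assignments.append((source, value, len(assignments)))
    return assignments


if __name__ == "__main__":
    for source, value, target in assign_first_half_targets([0, 7, 0, 9]):
        print(f"source {source} (value {value}) -> target {target}")
```

  • Because the n-th non-zero value can never sit at a source address smaller than n-1, every assignment produced this way keeps the target address less than or equal to the source address, which is what the first preset relationship requires.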
  • This section of the feature map includes two valid data items, and these two valid data items are the data to be transmitted.
  • This section of the feature map is stored in the eight source addresses shown in FIG. 7, where the data 5 is stored in the source address 0, and so on, stored in sequence. From the foregoing description, it can be seen that the valid data in this section of the characteristic diagram are 5 and 3, and the data 3 is stored in the source address 5. Therefore, the data 3 can be transmitted using the second transmission sub-network. At the same time, according to the above second preset relationship, the data 3 can be transmitted to the target address 2.
  • the binary value of destination address 1 is 001.
  • Data 3 is routed on the second transmission sub-network. Specifically, if the LSB of 001 is 1, data 3 is routed from switching node 1 of layer 0 to switching node 1 of layer 2, directly from switching node 1 of layer 2 to switching node 2 of layer 3, and then transmitted to target address 2.
  • the above-mentioned transmission network is used to route the data to be transmitted to the target address according to the LSB, which can further increase the data transmission speed.
  • The destination address corresponding to the data to be transmitted stored in the (N/2+1)-th source address to the N-th source address can be obtained first.
  • The target address is represented by a binary value. Further, starting from the LSB of the target address, the transmission path of the data to be transmitted in the second transmission sub-network is determined according to the value of each bit in the target address, and the data to be transmitted is transmitted to the target address through that transmission path.
  • The specific execution process is the same as the processing process of the first transmission sub-network in FIG. 8, and will not be repeated here.
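  • The order in which the bits of a target address are examined, starting from the LSB, can be sketched as below. The helper name `lsb_first_bits` is hypothetical. How each bit value selects an output of a switching node depends on the figures and is not modeled here, so the sketch shows only the bit order and not a full routing algorithm.

```python
def lsb_first_bits(target: int, num_bits: int) -> list:
    """Bits of the target address, least significant bit first."""
    if target < 0 or target >= 2 ** num_bits:
        raise ValueError("target does not fit in num_bits bits")
    return [(target >> position) & 1 for position in range(num_bits)]


if __name__ == "__main__":
    # Target address 1 written with three bits is 001.
    print(lsb_first_bits(1, 3))
    print(lsb_first_bits(2, 3))
```

  • For the three-bit value 001 the first bit examined is 1, followed by two 0 bits.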
  • the number M of target addresses is less than the number N of source addresses.
  • M may be 4 and N may be 8.
  • The data to be transmitted can be divided into multiple groups of sub-data, and under one transmission clock, the aforementioned transmission network is used to transmit one group of sub-data to the corresponding destination addresses.
  • The data to be transmitted can be divided according to the source address.
  • For example, if the number of source addresses is 8 and the number of target addresses is 4, the data in source address 0 to source address 3 is taken as the first group of sub-data, and the data in source address 4 to source address 7 is taken as the second group of sub-data.
  • The data to be transmitted in the first group of sub-data is transmitted to the target addresses through the transmission network under one clock for calculation, and the data to be transmitted in the second group of sub-data is transmitted to the target addresses through the transmission network under another clock for calculation.
  • Because the data to be transmitted is divided into multiple groups of sub-data and each group of sub-data is transmitted under a different clock, conflicts in data transmission and calculation are avoided, which ensures the correctness of data transmission and calculation.
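  • The grouping by source address can be sketched as a simple split into consecutive blocks, one block per transmission clock. The function name `split_into_groups` is hypothetical, and using the number of target addresses M as the group size is an assumption drawn from the 8-source, 4-target example above.

```python
def split_into_groups(values_by_source: list, group_size: int) -> list:
    """Split data indexed by source address into consecutive groups."""
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    return [values_by_source[start:start + group_size]
            for start in range(0, len(values_by_source), group_size)]


if __name__ == "__main__":
    data = [5, 0, 0, 0, 0, 3, 0, 0]  # values held in source addresses 0 to 7
    for clock, group in enumerate(split_into_groups(data, group_size=4)):
        print(f"clock {clock}: {group}")
```

  • Each inner list is the sub-data sent in one clock, so the two groups never compete for the same target addresses in the same clock.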
  • FIG. 9 is a module structure diagram of a data transmission device provided by an embodiment of the application.
  • the device may be the electronic device described in the foregoing embodiment, or may be a device in the electronic device that can implement the functions in the method provided by the embodiment of the application
  • the device may be a device or a chip system in an electronic device.
  • the device includes:
  • N source addresses are set in the storage unit 901, and multiple target addresses are set in the target module 902.
  • the transmission network 903 is connected to the storage unit 901 and the target module 902 respectively.
  • The transmission network 903 includes a first transmission sub-network. The first transmission sub-network includes a plurality of layers, and each layer includes at least one switching node. There is no switching node from the (2^(Y-1)+1)-th position to the 2^Y-th position of layer Y, and when there is at least one switching node from the first position to the 2^Y-th position of layer Y, each of the at least one switching node does not include an uplink connection line.
  • the control module 904 may be connected to the storage unit 901, the target module 902, and the transmission network 903 respectively.
  • The control module 904 is configured to obtain at least one piece of data to be transmitted from the storage unit 901, where the data to be transmitted is stored in the aforementioned N source addresses, and, based on the first preset relationship between the source address and the target address, use the first transmission sub-network to transmit the data to be transmitted stored in the first source address to the (N/2)-th source address to the corresponding destination address, where the first preset relationship includes: when the source address is K, the corresponding target address is one of 0 to K, starting from 0.
  • the transmission network 903 further includes a second transmission sub-network.
  • The above-mentioned second transmission sub-network includes multiple layers, and each layer includes at least one switching node. There is no switching node from the (2^(S-1)+1)-th position to the 2^S-th position of layer S, and when there is at least one switching node from the first position to the 2^S-th position of layer S, each of the at least one switching node does not include an uplink connection line.
  • The control module 904 is further configured to use the second transmission sub-network, based on the second preset relationship between the source address and the target address, to transmit the data to be transmitted stored in the (N/2+1)-th source address to the N-th source address to the corresponding target address, where the second preset relationship includes: when the source address is L, the corresponding target address is one of M-1 to M-1-[L%(N/2)], M is the number of target addresses, and M is less than N.
  • The number of layers of the first transmission sub-network is log2(N)+1, and/or the number of layers of the second transmission sub-network is log2(N)+1.
  • control module 904 is specifically configured to:
  • obtain the destination address corresponding to the data to be transmitted stored in the first source address to the (N/2)-th source address, where the destination address is represented by a binary value; and, starting from the LSB of the destination address, determine the transmission path of the data to be transmitted in the first transmission sub-network according to the value of each bit in the destination address, and transmit the data to be transmitted to the target address through that transmission path in the first transmission sub-network.
  • control module 904 is specifically configured to:
  • obtain the target address corresponding to the data to be transmitted stored in the (N/2+1)-th source address to the N-th source address, where the target address is represented by a binary value; and, starting from the LSB of the target address, determine the transmission path of the data to be transmitted in the second transmission sub-network according to the value of each bit in the target address, and transmit the data to be transmitted to the target address through that transmission path in the second transmission sub-network.
  • the target module 902 may be a calculation module, and the calculation module includes at least M addresses.
  • control module 904 is further configured to divide the at least one data to be transmitted into multiple groups of sub-data, and each group of sub-data is transmitted under one transmission clock.
  • the data transmission device provided in the embodiment of the present application can execute the method steps in the above method embodiment, and its implementation principles and technical effects are similar, and will not be repeated here.
  • the division of the various modules of the above device is only a division of logical functions, and may be fully or partially integrated into a physical entity in actual implementation, or may be physically separated.
  • these modules can all be implemented in the form of software called by processing elements; they can also be implemented in the form of hardware; some modules can be implemented in the form of calling software by processing elements, and some of the modules can be implemented in the form of hardware.
  • the determining module may be a separately established processing element, or it may be integrated into a certain chip of the above-mentioned device for implementation.
  • each step of the above method or each of the above modules can be completed by hardware integrated logic circuits in the processor element or instructions in the form of software.
  • The above modules may be one or more integrated circuits configured to implement the above methods, such as one or more application specific integrated circuits (ASIC), one or more digital signal processors (DSP), or one or more field programmable gate arrays (FPGA).
  • the processing element may be a general-purpose processor, such as a central processing unit (CPU) or other processors that can call program codes.
  • these modules can be integrated together and implemented in the form of a system-on-a-chip (SOC).
  • The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof.
  • When software is used, the embodiments may be implemented in whole or in part in the form of a computer program product.
  • The computer program product includes one or more computer instructions.
  • When the computer instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present application are generated in whole or in part.
  • the above-mentioned computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable devices.
  • the above-mentioned computer instructions can be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another computer-readable storage medium.
  • The above-mentioned computer instructions can be transmitted from a website, computer, server, or data center to another website, computer, server, or data center through a cable (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wirelessly (such as infrared, wireless, or microwave).
  • the foregoing computer-readable storage medium may be any available medium that can be accessed by a computer or a data storage device such as a server or data center integrated with one or more available media.
  • the above-mentioned usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, and a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid state disk (SSD)).
  • FIG. 10 is a schematic structural diagram of an electronic device provided by an embodiment of the application.
  • The electronic device 1000 may include a processor 101 (for example, a CPU), a memory 102, and a transceiver 103. The transceiver 103 is coupled to the processor 101, and the processor 101 controls the sending and receiving actions of the transceiver 103.
  • Various instructions may be stored in the memory 102 to complete various processing functions and implement method steps executed by the electronic device in the embodiments of the present application.
  • the electronic device involved in the embodiment of the present application may further include: a power supply 104, a system bus 105, and a communication port 106.
  • the transceiver 103 may be integrated in the transceiver of the electronic device, or may be an independent transceiver antenna on the electronic device.
  • the system bus 105 is used to implement communication connections between components.
  • the aforementioned communication port 106 is used to implement connection and communication between the electronic device and other peripherals.
  • the above-mentioned processor 101 is configured to be coupled with the memory 102 to read and execute instructions in the memory 102 to implement the method steps performed by the electronic device in the above-mentioned method embodiment. Its implementation principle and technical effect are similar, so it will not be repeated here.
  • the system bus mentioned in FIG. 10 may be a peripheral component interconnect standard (PCI) bus or an extended industry standard architecture (EISA) bus, etc.
  • the system bus can be divided into address bus, data bus, control bus, etc. For ease of representation, only one thick line is used in the figure, but it does not mean that there is only one bus or one type of bus.
  • the communication interface is used to realize the communication between the database access device and other devices (such as client, read-write library and read-only library).
  • the memory may include random access memory (RAM), and may also include non-volatile memory (non-volatile memory), such as at least one disk memory.
  • The above-mentioned processor can be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it can also be a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • an embodiment of the present application further provides a computer-readable storage medium, where instructions are stored in the storage medium, which when run on a computer, cause the computer to execute the processing procedure of the electronic device in the foregoing embodiment.
  • an embodiment of the present application further provides a chip for executing instructions, and the chip is used to execute the processing procedure of the electronic device in the foregoing embodiment.
  • The embodiment of the present application also provides a program product. The program product includes a computer program stored in a storage medium. At least one processor can read the computer program from the storage medium, and the at least one processor executes the processing procedure of the electronic device in the foregoing embodiment.
  • “At least one” refers to one or more, and “multiple” refers to two or more.
  • “And/or” describes an association between associated objects and indicates that three relationships are possible; for example, “A and/or B” can mean: A exists alone, both A and B exist, or B exists alone, where A and B can each be singular or plural.
  • The character “/” generally indicates an “or” relationship between the objects before and after it; in a formula, the character “/” indicates a “division” relationship between the objects before and after it.
  • “At least one of the following items” or a similar expression refers to any combination of those items, including any combination of a single item or of plural items.
  • For example, “at least one of a, b, or c” can mean: a, b, c, a-b, a-c, b-c, or a-b-c, where each of a, b, and c can be single or multiple.
  • The magnitude of the sequence numbers of the foregoing processes does not imply an order of execution.
  • The execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of this application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

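The embodiments of this application provide a data transmission method, apparatus, electronic device, and readable storage medium. In the method, at least one piece of data to be transmitted is obtained from a storage unit in which N source addresses are provided, the data to be transmitted being stored in a distributed manner across the N source addresses. Based on a first preset relationship between source addresses and target addresses, a first transmission sub-network is used to transmit the data to be transmitted that is stored in the 1st through the (N/2)-th source addresses to the corresponding target addresses. The first preset relationship includes: when the source address is K, the corresponding target address is one of addresses 0 to K, numbered from 0. The first transmission sub-network includes multiple layers; no switching node exists at the (2^(Y-1)+1)-th through the 2^Y-th positions of layer Y, and, when at least one switching node exists at the 1st through the 2^Y-th positions of layer Y, none of those switching nodes includes an uplink connection line. The method can greatly reduce transmission overhead and computation overhead and greatly improve the processing efficiency of sparse data.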

Description

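Data Transmission Method, Apparatus, Electronic Device, and Readable Storage Medium
Technical Field
The embodiments of this application relate to computer technology, and in particular to a data transmission method, apparatus, electronic device, and readable storage medium.
Background
In some fields that involve data computation, the data may be sparse. Take a neural network as an example: sparsity is common in both its feature maps and its parameters. Feature maps may have a sparsity ratio of 20% to 80%, and parameters may have a sparsity ratio of 50% to 90%. The higher the sparsity ratio, the more zero-valued data there is. Zero-valued data contributes nothing to the final result, so transmitting and computing it is wasted work. In a processor that performs data computation, data may be kept in a storage medium and, when computation is required, must be transmitted from the storage medium to the processor's computing module. If zero-valued data is handled in the same way as non-zero data, it has to be transmitted from the storage medium to the computing module and then processed there, which incurs considerable transmission overhead and computation overhead.
Therefore, how to transmit and compute sparse data so as to avoid wasted operations on zero-valued data, and thereby reduce transmission overhead and computation overhead, is a problem that urgently needs to be solved.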
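Summary
The embodiments of this application provide a data transmission method, apparatus, electronic device, and readable storage medium for reducing the transmission overhead and computation overhead of data in an electronic device.
In a first aspect, an embodiment of this application provides a data transmission method. In the method, at least one piece of data to be transmitted is first obtained from a storage unit in which N source addresses are provided, the data to be transmitted being stored in a distributed manner across the N source addresses. Then, based on a first preset relationship between source addresses and target addresses, a first transmission sub-network is used to transmit the data to be transmitted that is stored in the 1st through the (N/2)-th source addresses to the corresponding target addresses. The first preset relationship includes: when the source address is K, the corresponding target address is one of addresses 0 to K, numbered from 0. In addition, the first transmission sub-network includes multiple layers, each layer includes at least one switching node, no switching node exists at the (2^(Y-1)+1)-th through the 2^Y-th positions of layer Y, and, when at least one switching node exists at the 1st through the 2^Y-th positions of layer Y, none of those switching nodes includes an uplink connection line.
In this method, a transmission network for transmitting data between source addresses and target addresses is proposed on the basis of the first preset relationship that the source and target addresses satisfy. In the first transmission sub-network of this transmission network, no switching node exists at the (2^(Y-1)+1)-th through the 2^Y-th positions of layer Y, and, when at least one switching node exists at the 1st through the 2^Y-th positions of layer Y, none of those switching nodes includes an uplink connection line. No collisions occur when data is transmitted over this network. At the same time, compared with a conventional collision-free network, the number of switching nodes and the complexity of the network are both markedly reduced. The transmission network therefore offers fast transmission and low use of transmission resources. When it is used to transmit sparse data, it can greatly reduce transmission overhead and computation overhead and greatly improve the processing efficiency of sparse data.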
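In an optional implementation, the method further includes:
based on a second preset relationship between source addresses and target addresses, using a second transmission sub-network to transmit the data to be transmitted that is stored in the (N/2+1)-th through the N-th source addresses to the corresponding target addresses. The second preset relationship includes: when the source address is L, the corresponding target address is one of addresses M-1 down to M-1-[L%(N/2)], counted from M-1, where M is the number of target addresses and M is less than N. In addition, the second transmission sub-network includes multiple layers, each layer includes at least one switching node, no switching node exists at the (2^(S-1)+1)-th through the 2^S-th positions of layer S, and, when at least one switching node exists at the 1st through the 2^S-th positions of layer S, none of those switching nodes includes an uplink connection line.
In this implementation, a transmission network for transmitting data between source addresses and target addresses is proposed on the basis of the second preset relationship that the source and target addresses satisfy. In the second transmission sub-network of this transmission network, no switching node exists at the (2^(S-1)+1)-th through the 2^S-th positions of layer S, and, when at least one switching node exists at the 1st through the 2^S-th positions of layer S, none of those switching nodes includes an uplink connection line. No collisions occur when data is transmitted over this network. At the same time, compared with a conventional collision-free network, the number of switching nodes and the complexity of the network are both markedly reduced. The transmission network therefore offers fast transmission and low use of transmission resources. When it is used to transmit sparse data, it can greatly reduce transmission overhead and computation overhead and greatly improve the processing efficiency of sparse data.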
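In an optional implementation, the number of layers of the first transmission sub-network is log2(N)+1, and/or the number of layers of the second transmission sub-network is log2(N)+1.
In an optional implementation, when the first transmission sub-network is used to transmit the data to be transmitted that is stored in the 1st through the (N/2)-th source addresses to the corresponding target addresses, the target address corresponding to that data may first be obtained, the target address being represented as a binary value. Then, starting from the LSB of the target address, the transmission path of the data to be transmitted in the first transmission sub-network is determined according to the value of each bit of the target address, and the data to be transmitted is transmitted to the target address over that path.
In an optional implementation, when the second transmission sub-network is used to transmit the data to be transmitted that is stored in the (N/2+1)-th through the N-th source addresses to the corresponding target addresses, the target address corresponding to that data may first be obtained, the target address being represented as a binary value. Then, starting from the LSB of the target address, the transmission path of the data to be transmitted in the second transmission sub-network is determined according to the value of each bit of the target address, and the data to be transmitted is transmitted to the target address over that path.
In the two optional implementations above, the transmission network routes the data to be transmitted to the target address according to the LSB, which further increases the data transmission speed.
In an optional implementation, the target address is an address in a computing module, and the computing module includes at least M addresses.
In an optional implementation, before the first transmission sub-network is used, based on the first preset relationship between source addresses and target addresses, to transmit the data to be transmitted that is stored in the 1st through the (N/2)-th source addresses to the corresponding target addresses, it may first be determined whether the number of pieces of data to be transmitted is greater than M. If it is greater than M, the at least one piece of data to be transmitted is divided into multiple groups of sub-data, and each group of sub-data is transmitted in one transmission clock.
In this implementation, when the number of pieces of data to be transmitted is greater than the number of target addresses, the data to be transmitted is divided into multiple groups of sub-data and the groups are transmitted in different clocks, which avoids conflicts in data transmission and computation and ensures that both are correct.
In an optional implementation, N is 8 and M is 4.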
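In a second aspect, an embodiment of this application provides a data transmission apparatus that includes a storage unit, a target module, a transmission network, and a control module.
N source addresses are provided in the storage unit, and multiple target addresses are provided in the target module.
The transmission network is connected to the storage unit and to the target module.
The transmission network includes a first transmission sub-network. The first transmission sub-network includes multiple layers, each layer includes at least one switching node, no switching node exists at the (2^(Y-1)+1)-th through the 2^Y-th positions of layer Y, and, when at least one switching node exists at the 1st through the 2^Y-th positions of layer Y, none of those switching nodes includes an uplink connection line.
The control module is configured to obtain at least one piece of data to be transmitted from the storage unit, the data to be transmitted being stored in a distributed manner across the N source addresses, and, based on a first preset relationship between source addresses and target addresses, to use the first transmission sub-network to transmit the data to be transmitted that is stored in the 1st through the (N/2)-th source addresses to the corresponding target addresses, where the first preset relationship includes: when the source address is K, the corresponding target address is one of addresses 0 to K, numbered from 0.
In an optional implementation, the transmission network further includes a second transmission sub-network.
The second transmission sub-network includes multiple layers, each layer includes at least one switching node, no switching node exists at the (2^(S-1)+1)-th through the 2^S-th positions of layer S, and, when at least one switching node exists at the 1st through the 2^S-th positions of layer S, none of those switching nodes includes an uplink connection line.
The control module is further configured to use the second transmission sub-network, based on a second preset relationship between source addresses and target addresses, to transmit the data to be transmitted that is stored in the (N/2+1)-th through the N-th source addresses to the corresponding target addresses, where the second preset relationship includes: when the source address is L, the corresponding target address is one of addresses M-1 down to M-1-[L%(N/2)], counted from M-1, M is the number of target addresses, and M is less than N.
In an optional implementation, the number of layers of the first transmission sub-network is log2(N)+1, and/or the number of layers of the second transmission sub-network is log2(N)+1.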
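In an optional implementation, the control module is specifically configured to:
obtain the target address corresponding to the data to be transmitted that is stored in the 1st through the (N/2)-th source addresses, the target address being represented as a binary value; and, starting from the LSB of the target address, determine the transmission path of the data to be transmitted in the first transmission sub-network according to the value of each bit of the target address, and transmit the data to be transmitted to the target address over that path.
In an optional implementation, the control module is specifically configured to:
obtain the target address corresponding to the data to be transmitted that is stored in the (N/2+1)-th through the N-th source addresses, the target address being represented as a binary value; and, starting from the LSB of the target address, determine the transmission path of the data to be transmitted in the second transmission sub-network according to the value of each bit of the target address, and transmit the data to be transmitted to the target address over that path.
In each of the optional implementations above, the target module is a computing module, and the computing module includes at least M addresses.
In an optional implementation, the control module is further configured to:
when the number of pieces of data to be transmitted is greater than M, divide the at least one piece of data to be transmitted into multiple groups of sub-data, each group of sub-data being transmitted in one transmission clock.
In an optional implementation, N is 8 and M is 4.
In a third aspect, an embodiment of this application provides an electronic device that includes a memory and a processor.
The processor is configured to be coupled to the memory and to read and execute instructions in the memory so as to implement the method steps of the first aspect.
In a fourth aspect, an embodiment of this application provides a computer program product that includes computer program code; when the computer program code is executed by a computer, the computer is caused to perform the method of the first aspect.
In a fifth aspect, an embodiment of this application provides a computer-readable storage medium that stores computer instructions; when the computer instructions are executed by a computer, the computer is caused to execute the instructions of the method of the first aspect.
In a sixth aspect, an embodiment of this application provides a chip that is connected to a memory and is configured to read and execute a software program stored in the memory so as to implement the method provided in the first aspect.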
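Brief Description of the Drawings
Figure 1 is an example of the process of convolving a segment of parameters (weights) with a segment of a feature map in a neural network;
Figure 2 is a schematic flowchart of the data transmission method provided by an embodiment of this application;
Figure 3 is a schematic structural diagram of a conventional butterfly network;
Figure 4 is a schematic structural diagram of a reverse butterfly network;
Figure 5(a) shows the evolution of the first-half sub-network;
Figure 5(b) is a structural diagram of the transmission network after the evolution;
Figure 6(a) shows the evolution of the second-half sub-network;
Figure 6(b) is a structural diagram of the network after the evolution;
Figure 7 is a schematic structural diagram of the network obtained by applying both of the sub-network evolution methods described above;
Figure 8 is a schematic flowchart of the data transmission method provided by an embodiment of this application;
Figure 9 is a module structure diagram of the data transmission apparatus provided by an embodiment of this application;
Figure 10 is a schematic structural diagram of an electronic device provided by an embodiment of this application.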
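Detailed Description
The data sparsity ratio is first explained by way of an example.
Figure 1 is an example of the process of convolving a segment of parameters (weights) with a segment of a feature map in a neural network. As shown in Figure 1, the neural network contains a segment of parameters and a segment of a feature map. The parameter segment consists of multiple data items, some of which are 0, and the feature-map segment likewise consists of multiple data items, some of which are 0. To convolve the two segments, the feature-map data and the parameter data at the same index are multiplied, and the products are then accumulated. Because both segments contain zeros, only the data at index 2 and index 7 yield non-zero products. These data contribute to the final result, whereas the data at the remaining indices 0, 1, 3, 4, 5, and 6 do not. In other words, 6 of the 8 positions do not contribute, so the data sparsity ratio is 75%.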
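The arithmetic of this example can be checked with a short sketch. Only the weight values (0, 0, 1, 0, 0, 0, 0, -1) and the feature-map values 3 and 5 at indices 2 and 7 come from the text; the remaining feature-map values below are assumptions chosen purely for illustration.

```python
# Weight segment as described for Figure 1.
weights = [0, 0, 1, 0, 0, 0, 0, -1]
# Feature-map segment: the 3 (index 2) and 5 (index 7) come from the text,
# every other value is an assumed placeholder.
feature_map = [0, 4, 3, 0, 0, 2, 0, 5]

# Multiply the elements that share an index, then accumulate.
products = [w * f for w, f in zip(weights, feature_map)]
result = sum(products)

# Indices whose product is non-zero are the only ones that contribute.
contributing = [i for i, p in enumerate(products) if p != 0]
sparsity_ratio = 1 - len(contributing) / len(products)

print(contributing, sparsity_ratio, result)
```

With these assumed values, only indices 2 and 7 contribute and the sparsity ratio is 6/8.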
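In an electronic device that performs computation, data is first kept in a storage medium and is then sent to a computing module when it is to be processed. After a computing module has finished, its result may in turn need to be sent to the next computing module, and so on.
For ease of description, in the embodiments of this application the address that holds data before computation is called the source address, and the address of the data in the computing module during computation is called the target address. Data is transmitted from the source address to the target address and then processed.
It is worth noting that, in the embodiments of this application, a source address may be an address in a storage medium such as static RAM (SRAM) or dynamic RAM (DRAM), or it may be an address in a computing module. A target address may be an address in a computing module.
In addition, in the embodiments of this application, "data" refers to data that can be used in computation, such as half-precision floating-point numbers, full-precision floating-point numbers, and integers. "Data" may be represented in decimal or in binary; the embodiments of this application do not limit the representation. Taking the parameters in Figure 1 as an example, the parameter segment consists of the 8 data items 0, 0, 1, 0, 0, 0, 0, -1, each of which is an integer represented in decimal.
In the Figure 1 example, the parameter segment and the feature-map segment can each be called a data sequence. During computation, a data sequence is transmitted as a whole from source addresses to target addresses. Specifically, each data item in the sequence is stored in one source address, each source address corresponds to one target address, and the data in each source address is transmitted to its corresponding target address.
In one possible design, before a data sequence is transmitted from the source addresses to the target addresses, the valid data in the sequence is first filtered out, where valid data means data in the sequence that contributes to the result. The valid data is then transmitted over the transmission network between the source module, in which the source addresses are located, and the target module, in which the target addresses are located. The source module includes multiple addresses, as does the target module. The data held in the addresses of the source module is transmitted over this network to the addresses of the target module. For example, suppose the source module is an SRAM with 8 addresses and the target module is a computing module A with 4 addresses; the data stored in the 8 SRAM addresses can then be transmitted to the 4 addresses of computing module A over the transmission network between the SRAM and computing module A. If, in addition to filtering out the valid data, a transmission network with fast transmission and low resource use can be provided, the transmission overhead and computation overhead of sparse data can be greatly reduced.
The following embodiments of this application aim to provide a data transmission method based on a transmission network with fast transmission and low resource use, so that transmitting sparse data over this network greatly reduces transmission overhead and computation overhead.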
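The method can be applied to any electronic device that includes a storage medium and a computing module. For example, the electronic device may be a communication device such as a terminal device or a network device, or it may be a server or the like.
Taking a terminal device as an example, the terminal device may also be called a terminal, user equipment (UE), a mobile station (MS), a mobile terminal (MT), and so on. The terminal device may be a mobile phone, a tablet computer (pad), a computer with wireless transceiver functions, a virtual reality (VR) terminal device, an augmented reality (AR) terminal device, a wireless terminal in industrial control, a wireless terminal in self driving, a wireless terminal in remote medical surgery, a wireless terminal in a smart grid, a wireless terminal in transportation safety, a wireless terminal in a smart city, a wireless terminal in a smart home, and so on.
Taking a network device as an example, the network device may be a base station, for example a base transceiver station (BTS) in the global system for mobile communication (GSM) or code division multiple access (CDMA), a NodeB in wideband code division multiple access (WCDMA), an evolved NodeB (eNB or e-NodeB, evolutional Node B) in LTE, or a gNB in NR. The base station may also be a wireless controller in a cloud radio access network (CRAN) scenario, or it may be a relay station, an access point, a vehicle-mounted device, a wearable device, a network device in a 5G network, or a network device in a future evolved PLMN network.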
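Figure 2 is a schematic flowchart of the data transmission method provided by an embodiment of this application. As shown in Figure 2, the method may include:
S201. Obtain at least one piece of data to be transmitted from a storage unit in which N source addresses are provided, the data to be transmitted being stored in a distributed manner across the N source addresses.
The at least one piece of data to be transmitted may be data in a data sequence, and the data sequence may be any data sequence in the electronic device that needs to be transmitted to a computing module for processing, for example the parameter segment or the feature-map segment in Figure 1.
In the embodiments of this application, a computing module may also be called a computing unit.
Optionally, the storage unit may be an SRAM, a DRAM, or the like. The storage unit includes multiple source addresses, and each source address can store one piece of data to be transmitted.
Optionally, the data to be transmitted may be the valid data in a data sequence. Before the data to be transmitted is saved to the storage unit, the electronic device may mark the valid data in the data sequence in advance. For example, based on the other data sequence that the data sequence is to be combined with and on the operation between the two, the electronic device may determine the valid data in both sequences and mark it. Take the zero-containing parameter segment of the neural network in Figure 1 as the data sequence and the zero-containing feature-map segment in Figure 1 as the other data sequence. The electronic device first reads the parameter segment and the feature-map segment it is to be combined with, and learns that the two are to be multiplied; it then marks as valid the data in both segments whose products are non-zero. Specifically, the data at indices 2 and 7 in Figure 1 are marked as valid: in the parameter segment the valid data are 1 and -1, and in the feature-map segment the valid data are 3 and 5. After the valid data has been marked, it is stored in a distributed manner across the N source addresses of the storage unit.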
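The marking step can be sketched as follows. This is a minimal illustration, not the marking logic of the embodiment: the helper names `mark_valid` and `valid_values` are hypothetical, and the feature-map values other than 3 and 5 are assumed.

```python
def mark_valid(seq_a, seq_b):
    """Flag each index whose element-wise product is non-zero (hypothetical helper)."""
    return [a * b != 0 for a, b in zip(seq_a, seq_b)]


def valid_values(seq, flags):
    """Keep only the elements of seq that were flagged as valid."""
    return [value for value, keep in zip(seq, flags) if keep]


weights = [0, 0, 1, 0, 0, 0, 0, -1]        # parameter segment from Figure 1
feature_map = [0, 4, 3, 0, 0, 2, 0, 5]     # 3 and 5 from the text, the rest assumed

flags = mark_valid(weights, feature_map)
valid_weights = valid_values(weights, flags)
valid_features = valid_values(feature_map, flags)
print(valid_weights, valid_features)
```

Note that the feature-map values 4 and 2 are non-zero but are still not marked, because their counterpart weights are zero; validity depends on the product, not on either operand alone.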
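Optionally, the N source addresses may be all of the addresses in the module that stores the data to be transmitted, or they may be some of those addresses; the embodiments of this application do not limit this.
Optionally, N is an even number.
Continuing with the Figure 1 example, the parameter segment includes 8 data items, namely 0, 0, 1, 0, 0, 0, 0, -1, which are stored in 8 source addresses of the storage unit. In this example, N is 8.
S202. Based on a first preset relationship between source addresses and target addresses, use a first transmission sub-network to transmit the data to be transmitted that is stored in the 1st through the (N/2)-th source addresses to the corresponding target addresses, where the first preset relationship includes: when the source address is K, the corresponding target address is one of addresses 0 to K, numbered from 0.
Optionally, each of the at least one piece of data to be transmitted is transmitted to one target address in the computing module. The computing module in which the target addresses are located may include at least M addresses, and the data in the N source addresses is transmitted to the M addresses of the computing module. In the embodiments of this application, M is less than N; that is, the number of data items the computing module processes at a time is smaller than the number of data items stored in the module containing the source addresses, so that the valid data can be aligned.
When the data in the N source addresses is transmitted to the M addresses of the computing module, a first source address among the N source addresses and the target address corresponding to it satisfy the first preset relationship, which includes: when the source address is K, the target address is one of addresses 0 to K, numbered from 0. Here K is a number greater than or equal to 0, and the first source address is any one of the 1st through the (N/2)-th source addresses.
For example, the mapping between source addresses and target addresses can be represented by means of table entries. In a conventional network such as a Crossbar network, the mapping between source and target addresses is fully connected, that is, the data stored in a source address may be transmitted to any target address. In the embodiments of this application, by contrast, for a source address K among the 1st through the (N/2)-th source addresses, the corresponding target address is no longer an arbitrary target address but one of addresses 0 to K. This design simplifies the transmission network while still ensuring that data is transmitted normally.
It is worth noting that, for ease of description, the embodiments of this application assume that source addresses and target addresses are numbered from 0. "One of 0 to K" therefore means the 1st through the (K+1)-th target address. For example, suppose the target addresses are addresses in a computing module that includes M addresses numbered from 0; target address 0 then denotes the first target address in the computing module, and target address M-1 denotes the M-th.
For the 1st through the (N/2)-th source addresses in the module that stores the data to be transmitted, that is, the first half of the source addresses in that module, the data held in these addresses is transmitted to addresses in the computing module starting from 0, and the data in source address K is transmitted to one of target addresses 0 to K. For the first half of the source addresses, the corresponding target addresses are arranged in forward order.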
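The first preset relationship can be expressed as a small helper that lists the target addresses a given source address may use. This is a sketch under stated assumptions: the function names are hypothetical, addresses are 0-based as in the text, and the clamp to M-1 is an added assumption for the case where K could exceed the largest target address (it has no effect when N is 8 and M is 4).

```python
def first_relation_targets(k, n, m):
    """Targets allowed for source address k in the first half (0 .. n/2 - 1)."""
    if not 0 <= k < n // 2:
        raise ValueError("source address is not in the first half")
    # Targets 0..k, clamped to the highest existing target address m - 1 (assumption).
    return list(range(0, min(k, m - 1) + 1))


def satisfies_first_relation(k, target, n, m):
    """True if sending source k to `target` respects the first preset relationship."""
    return target in first_relation_targets(k, n, m)


for k in range(4):
    print(k, first_relation_targets(k, n=8, m=4))
```

Under a fully connected mapping every source could reach all M targets; here source 0 can reach only target 0, and only source 3 can reach all four.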
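Data is transmitted between the module containing the source addresses and the computing module over a specific transmission network.
In the embodiments of this application, the transmission network includes a first transmission sub-network, which is used to transmit the data to be transmitted that is stored in the 1st through the (N/2)-th source addresses. The first transmission sub-network includes multiple layers, each layer includes at least one switching node, no switching node exists at the (2^(Y-1)+1)-th through the 2^Y-th positions of layer Y, and, when at least one switching node exists at the 1st through the 2^Y-th positions of layer Y, none of those switching nodes includes an uplink connection line.
In the embodiments of this application, the symbol "^" denotes exponentiation; for example, 2^Y denotes 2 to the power Y. This is not explained again below.
It is worth noting that, in the embodiments of this application, a switching node may be a logic device implemented in circuit logic. For example, a switching node may be a 2-2 multiplexer (MUX).
In the transmission network, the first transmission sub-network is used to transmit the data to be transmitted that is stored in the 1st through the (N/2)-th source addresses, that is, the data in the first half of the source addresses. In the embodiments of this application, the number of layers of the transmission network can be set flexibly. As an optional implementation, it can be determined from the number of source addresses: when the number of source addresses is N, the number of layers of the transmission network may be log2(N)+1 rounded up.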
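As a quick check of the layer-count rule, the following sketch computes log2(N)+1 rounded up. The function name `layer_count` is an assumption made for illustration.

```python
import math


def layer_count(n_sources):
    """Number of layers: log2(N) + 1, rounded up."""
    return math.ceil(math.log2(n_sources) + 1)


# For N = 8 the network has 4 layers, numbered layer 0 to layer 3.
print(layer_count(8))
```

For N = 8 this gives the four layers (layer 0 through layer 3) used in the 8-source examples in the text; rounding up only matters when N is not a power of two.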
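In a specific implementation, the transmission network of the embodiments of this application can be derived from a conventional transmission network.
The following uses a conventional butterfly network as an example to describe the characteristics of the transmission network of the embodiments of this application.
Figure 3 is a schematic structural diagram of a conventional butterfly network. As shown in Figure 3, the butterfly network is responsible for transmitting the data in 8 source addresses to 4 target addresses. Data in different source addresses may need to use the same transmission line, which can cause collisions. For example, the data stored in source address 0 and the data stored in address 4 may both need to use the transmission line between node 1 and node 2 at the same time, resulting in a collision.
Based on the conventional butterfly network of Figure 3, the embodiments of this application first propose a reverse butterfly network structure. Figure 4 is a schematic structural diagram of the reverse butterfly network. As shown in Figure 4, the network includes two transmission sub-networks: one (the first-half transmission sub-network) is responsible for transmitting the data in the first half of the source addresses to the target addresses, and the other (the second-half transmission sub-network) is responsible for transmitting the data in the second half of the source addresses to the target addresses. Assuming the network includes N source addresses, the first half of the source addresses means source addresses 0 to N/2-1, and the second half means N/2 to N-1. Both transmission sub-networks include multiple layers, and each layer includes multiple switching nodes. Each node in the first layer of each transmission sub-network is connected to a source address, and each node in the last layer of each transmission sub-network is connected to a target address. It is worth noting that, in the network structure of Figure 4, switching nodes A, B, C, and D, which are connected to the target addresses, belong to both transmission sub-networks.
On the basis of the network structure of Figure 4 and of the first preset relationship, the first-half transmission sub-network can be evolved to obtain the transmission network described in step S202. Figure 5(a) shows the evolution of the first-half sub-network, and Figure 5(b) is a structural diagram of the transmission network after the evolution. As shown in Figures 5(a) and 5(b), the transmission network includes a first transmission sub-network and a second transmission sub-network. The first transmission sub-network is responsible for transmitting the data in the first half of the source addresses to the target addresses, and the second transmission sub-network is responsible for transmitting the data in the second half of the source addresses to the target addresses. Both sub-networks include multiple layers, and each layer includes multiple switching nodes. Each node in the first layer of each sub-network is connected to a source address, and each node in the last layer of each sub-network is connected to a target address. It is worth noting that, in the network structure of Figure 5(b), the switching nodes connected to the target addresses belong to both sub-networks. The first transmission sub-network is obtained, based on the first preset relationship, by evolving the first-half sub-network of the reverse butterfly network of Figure 4.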
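For layer Y of the first transmission sub-network, the evolution is as follows, where Y is greater than or equal to 0 and less than or equal to the number of layers of the transmission network minus 1. For example, when the number of layers of the first transmission sub-network is log2(N)+1 rounded up, Y is greater than or equal to 0 and less than or equal to log2(N) rounded up.
The number of layers of the first transmission sub-network and the number of layers of the second transmission sub-network are each the same as the number of layers of the transmission network.
(1) In layer Y, omit the uplink connection lines of switching node 0 through switching node 2^Y-1.
Optionally, the switching nodes in each layer of the first transmission sub-network may be numbered as follows:
A. Switching nodes are numbered from 0. For example, switching node 0 denotes the 1st switching node, and switching node 2^Y-1 denotes the 2^Y-th switching node.
B. In layer 0, the switching node connected to the smallest source address has the smallest number, and so on. For example, in the first transmission sub-network shown in Figure 5(a), the layer-0 switching node connected to source address 0 is switching node 0, the layer-0 switching node connected to source address 1 is switching node 1, and so on.
C. In every layer other than layer 0 and the last layer of the first transmission sub-network, each switching node has the same number as the layer-0 switching node at the same position. For example, layer 1 includes 4 switching nodes. The bottom one is at the same position as switching node 0 of layer 0, that is, both are the bottom switching node of their layer, so the bottom switching node of layer 1 is switching node 0. The second from the bottom is at the same position as switching node 1 of layer 0, so it is switching node 1. The numbers of the switching nodes in all layers other than layer 0 and the last layer of the first transmission sub-network are obtained in the same way.
D. In the last layer, the switching node connected to the smallest target address has the smallest number, and so on. For example, in the first transmission sub-network shown in Figure 5(a), the layer-3 switching node connected to target address 0 is switching node 0, the layer-3 switching node connected to target address 1 is switching node 1, and so on.
In addition, in the embodiments of this application, source addresses and target addresses are also numbered from 0. For example, source address 0 denotes the 1st source address, and so on.
Referring to Figures 4, 5(a), and 5(b), take Y=1 as an example. Under the first preset relationship, the data in source address K can only be transmitted to one of target addresses 0 to K. For layer 1, therefore, data from the source addresses that passes through switching node 0 or switching node 1 of layer 1 never needs to be transmitted further upward, so omitting the uplink connection lines of switching nodes 0 and 1 in layer 1 does not affect normal transmission of the data in the source addresses.
Upward transmission means that a lower-numbered switching node transmits data to a higher-numbered switching node. For example, referring to Figure 5(a), switching node 0 of layer 1 transmitting data to switching node 2 of layer 2 is upward transmission.
Correspondingly, an uplink connection line is a connection from a lower-numbered switching node in a lower-numbered layer to a higher-numbered switching node in a higher-numbered layer. For example, of layer 1 and layer 2, layer 1 is the lower-numbered layer and layer 2 is the higher-numbered layer. Of switching node 0 in layer 1 and switching node 2 in layer 2, the former is the lower-numbered switching node and the latter is the higher-numbered one, so the connection from switching node 0 in layer 1 to switching node 2 in layer 2 is an uplink connection line.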
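(2) In layer Y, delete the switching nodes at the (2^(Y-1)+1)-th through the 2^Y-th positions.
Here layer Y is any layer of the first transmission sub-network other than the first layer and the last layer.
It is worth noting that, after the switching nodes at the (2^(Y-1)+1)-th through the 2^Y-th positions are deleted, those positions still exist but no longer hold switching nodes.
The switching nodes deleted in this step include 2*2 switching nodes and 2*1 switching nodes. A 2*2 switching node is a node with 2 input connections and 2 output connections, and a 2*1 switching node is a node with 2 input connections and 1 output connection.
Continuing with Figures 4, 5(a), and 5(b), take Y=1 as an example. After (1) above has been performed, switching node 1 in layer 1 serves only to connect switching node 1 of layer 0 to switching node 1 of layer 2. Deleting switching node 1 of layer 1 and connecting switching node 1 of layer 0 directly to switching node 1 of layer 2 therefore does not affect normal transmission of the data in the source addresses. By the same principle, switching nodes 2 and 3 of layer 2 are also deleted; switching node 3 of layer 1 is connected directly to switching node 3 of layer 3, and switching node 2 of layer 1 is connected directly to switching node 2 of layer 3.
(3) When Y>=1, change switching node 0 through switching node 2^(Y-1)-1 from 2x2 nodes to 2x1 nodes or 1x2 nodes.
Optionally, this step may be performed independently of (1) and (2) above; alternatively, once (1) and (2) have been performed, the result of this step is already satisfied.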
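After the above evolution, layer Y of the resulting transmission network satisfies the following conditions:
(1) No switching node exists at the (2^(Y-1)+1)-th through the 2^Y-th positions of layer Y.
(2) When at least one switching node exists at the 1st through the 2^Y-th positions of layer Y, none of those switching nodes includes an uplink connection line.
Referring to Figures 4, 5(a), and 5(b), after the above evolution, switching node 1 in layer 1 and switching nodes 2 and 3 in layer 2 of the first transmission sub-network have been deleted.
Specifically, the transmission network in Figure 5(b) is used to transmit the data in 8 source addresses to 4 target addresses, and its first transmission sub-network is used to transmit the data in source addresses 0 to 3. The first transmission sub-network includes 4 layers: layer 0, layer 1, layer 2, and layer 3. Layer 0 includes 4 switching nodes, namely switching nodes 0, 1, 2, and 3. Layer 1 includes 3 switching nodes, namely nodes 0, 2, and 3. Layer 2 includes 2 switching nodes, namely nodes 0 and 1. Layer 3 includes 4 switching nodes. For how each switching node in each layer is connected, refer to Figure 5(b); the connections are not described one by one here.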
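The node-deletion rule can be checked against the node lists just given for Figure 5(b). The sketch below models only which positions still hold a switching node in each layer; it does not model the wiring, which is defined by the figures. The function name and the representation of a layer as a list of 0-based node numbers are assumptions.

```python
def remaining_nodes(n_layers, width):
    """For each layer, list the 0-based node numbers that still hold a switching node.

    In every layer Y except the first and last, the nodes at 1-based positions
    2^(Y-1)+1 .. 2^Y (0-based numbers 2^(Y-1) .. 2^Y - 1) are removed.
    """
    layers = []
    for y in range(n_layers):
        nodes = list(range(width))
        if 0 < y < n_layers - 1:
            removed = set(range(2 ** (y - 1), 2 ** y))
            nodes = [i for i in nodes if i not in removed]
        layers.append(nodes)
    return layers


# First sub-network for N = 8: 4 layers, 4 positions per layer.
for y, nodes in enumerate(remaining_nodes(n_layers=4, width=4)):
    print("layer", y, nodes)
```

For the 8-source case this reproduces the description above: layer 1 keeps nodes 0, 2, and 3, layer 2 keeps nodes 0 and 1, and layers 0 and 3 keep all four nodes.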
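Because the source addresses and target addresses in the embodiments of this application satisfy the first preset relationship, no collision occurs when the data in the first half of the source addresses is transmitted over the first transmission sub-network of Figure 5(b). At the same time, compared with a conventional collision-free transmission network such as a crossbar switch network (Crossbar network), the first transmission sub-network of Figure 5(b) has markedly fewer switching nodes and markedly lower complexity.
In this embodiment, a transmission network for transmitting data between source addresses and target addresses is proposed on the basis of the first preset relationship that the source and target addresses satisfy. In the first transmission sub-network of this transmission network, no switching node exists at the (2^(Y-1)+1)-th through the 2^Y-th positions of layer Y, and, when at least one switching node exists at the 1st through the 2^Y-th positions of layer Y, none of those switching nodes includes an uplink connection line. No collisions occur when data is transmitted over this network. At the same time, compared with a conventional collision-free network, the number of switching nodes and the complexity of the network are both markedly reduced. The transmission network therefore offers fast transmission and low use of transmission resources. When it is used to transmit sparse data, it can greatly reduce transmission overhead and computation overhead and greatly improve the processing efficiency of sparse data.
As an optional implementation, among the N source addresses described above, a second source address and the target address corresponding to it satisfy a second preset relationship, which includes: when the source address is L, the target address is one of addresses M-1 down to M-1-[L%(N/2)], counted from M-1. The second source address is any one of the (N/2+1)-th through the N-th source addresses, and L is a number greater than 0.
For the (N/2+1)-th through the N-th source addresses in the module that stores the data to be transmitted, that is, the second half of the source addresses in that module, the data held in these addresses is transmitted to addresses in the computing module starting from M-1, and the data in source address L is transmitted to one of target addresses M-1 down to M-1-[L%(N/2)]. For the second half of the source addresses, the corresponding target addresses are arranged in reverse order.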
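The second preset relationship mirrors the first one from the top of the target range. The sketch below lists the allowed targets for a second-half source address. The function name is hypothetical, addresses are 0-based, and the clamp at target 0 is an added assumption (it has no effect when N is 8 and M is 4).

```python
def second_relation_targets(l, n, m):
    """Targets allowed for source address l in the second half (n/2 .. n - 1)."""
    half = n // 2
    if not half <= l < n:
        raise ValueError("source address is not in the second half")
    lowest = m - 1 - (l % half)
    # Count down from m - 1 to `lowest`, never going below target 0 (assumption).
    return list(range(m - 1, max(lowest, 0) - 1, -1))


for l in range(4, 8):
    print(l, second_relation_targets(l, n=8, m=4))
```

With N = 8 and M = 4, source address 4 can reach only target 3, source address 5 can reach targets 3 and 2, and source address 7 can reach all four targets.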
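In the embodiments of this application, the transmission network further includes a second transmission sub-network, which is used to transmit the data to be transmitted that is stored in the (N/2+1)-th through the N-th source addresses. The second transmission sub-network includes multiple layers, each layer includes at least one switching node, no switching node exists at the (2^(S-1)+1)-th through the 2^S-th positions of layer S, and, when at least one switching node exists at the 1st through the 2^S-th positions of layer S, none of those switching nodes includes an uplink connection line.
Based on the second preset relationship, the second transmission sub-network can be used to transmit the data to be transmitted that is stored in the (N/2+1)-th through the N-th source addresses to the corresponding target addresses.
Based on the second preset relationship and on the network structure of Figure 4, the evolution process described above can be used to derive the second transmission sub-network of the transmission network. Figure 6(a) shows the evolution of the second-half sub-network, and Figure 6(b) is a structural diagram of the network after the evolution. Based on the second preset relationship, the second-half sub-network of the reverse butterfly network of Figure 4 undergoes the following evolution. It is worth noting that, in Figures 6(a) and 6(b), the second transmission sub-network is first connected in the reverse arrangement described above; specifically, node 0 of layer 2 is connected to node 3 of layer 3, node 1 of layer 2 is connected to node 2 of layer 3, and so on.
For layer S of the second transmission sub-network, the evolution is as follows, where S is greater than or equal to 0 and less than or equal to the number of layers of the second transmission sub-network minus 1. For example, when the number of layers of the second transmission sub-network is log2(N)+1 rounded up, S is greater than or equal to 0 and less than or equal to log2(N) rounded up.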
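(1) In layer S, omit the uplink connections of switching node 0 through switching node 2^S-1.
Optionally, the switching nodes in each layer of the second transmission sub-network may be numbered as follows:
A. Switching nodes are numbered from 0. For example, switching node 0 denotes the 1st switching node, and switching node 2^S-1 denotes the 2^S-th switching node.
B. In layer 0, the switching node connected to the smallest source address has the smallest number, and so on. For example, in the second transmission sub-network shown in Figure 6(a), the layer-0 switching node connected to source address 0 is switching node 0, the layer-0 switching node connected to source address 1 is switching node 1, and so on.
C. In every layer other than layer 0 and the last layer of the second transmission sub-network, each switching node has the same number as the layer-0 switching node at the same position. For example, layer 1 includes 4 switching nodes. The bottom one is at the same position as switching node 0 of layer 0, that is, both are the bottom switching node of their layer, so the bottom switching node of layer 1 is switching node 0. The second from the bottom is at the same position as switching node 1 of layer 0, so it is switching node 1. The numbers of the switching nodes in all layers other than layer 0 and the last layer of the second transmission sub-network are obtained in the same way.
D. In the last layer, the switching node connected to the smallest target address has the smallest number, and so on. For example, in the second transmission sub-network shown in Figure 6(a), the layer-3 switching node connected to target address 0 is switching node 0, the layer-3 switching node connected to target address 1 is switching node 1, and so on.
Referring to Figures 4, 6(a), and 6(b), take S=1 as an example. Under the second preset relationship, the data in source address L can only be transmitted to one of target addresses M-1 down to M-1-[L%(N/2)]. For layer 1, therefore, data from the source addresses that passes through switching node 0 or switching node 1 of layer 1 never needs to be transmitted further upward, so omitting the uplink connection lines of switching nodes 0 and 1 in layer 1 does not affect normal transmission of the data in the source addresses.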
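(2) In layer S, delete the switching nodes at the (2^(S-1)+1)-th through the 2^S-th positions.
Here layer S is any layer of the second transmission sub-network other than the first layer and the last layer.
It is worth noting that, after the switching nodes at the (2^(S-1)+1)-th through the 2^S-th positions are deleted, those positions still exist but no longer hold switching nodes.
The switching nodes deleted in this step include 2*2 switching nodes and 2*1 switching nodes.
Continuing with Figures 4, 6(a), and 6(b), take S=1 as an example. After (1) above has been performed, switching node 1 in layer 1 serves only to connect switching node 1 of layer 0 to switching node 1 of layer 2. Deleting switching node 1 of layer 1 and connecting switching node 1 of layer 0 directly to switching node 1 of layer 2 therefore does not affect normal transmission of the data in the source addresses. By the same principle, switching nodes 2 and 3 of layer 2 are also deleted; switching node 3 of layer 1 is connected directly to switching node 0 of layer 3, and switching node 2 of layer 1 is connected directly to switching node 1 of layer 3.
(3) When S>=1, change switching node 0 through switching node 2^(S-1)-1 from 2x2 nodes to 2x1 nodes or 1x2 nodes.
Optionally, this step may be performed independently of (1) and (2) above; alternatively, once (1) and (2) have been performed, the result of this step is already satisfied.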
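After the above evolution, layer S of the resulting transmission network satisfies the following conditions:
(1) No switching node exists at the (2^(S-1)+1)-th through the 2^S-th positions of layer S.
(2) When at least one switching node exists at the 1st through the 2^S-th positions of layer S, none of those switching nodes includes an uplink connection line.
Referring to Figures 4, 6(a), and 6(b), after the above evolution, switching node 1 in layer 1 and switching nodes 2 and 3 in layer 2 of the second transmission sub-network have been deleted.
Specifically, the transmission network in Figure 6(b) is used to transmit the data in 8 source addresses to 4 target addresses, and its second transmission sub-network is used to transmit the data in source addresses 4 to 7. The second transmission sub-network includes 4 layers: layer 0, layer 1, layer 2, and layer 3. Layer 0 includes 4 switching nodes, namely switching nodes 0, 1, 2, and 3. Layer 1 includes 3 switching nodes, namely nodes 0, 2, and 3. Layer 2 includes 2 switching nodes, namely nodes 0 and 1. Layer 3 includes 4 switching nodes. For how each switching node in each layer is connected, refer to Figure 6(b); the connections are not described one by one here.
Because the source addresses and target addresses in the embodiments of this application satisfy the second preset relationship, no collision occurs when the data in the second half of the source addresses is transmitted over the second transmission sub-network of Figure 6(b). At the same time, compared with a conventional collision-free transmission network such as a Crossbar network, the second transmission sub-network of Figure 6(b) has markedly fewer switching nodes and markedly lower complexity.
In this embodiment, a transmission network for transmitting data between source addresses and target addresses is proposed on the basis of the second preset relationship that the source and target addresses satisfy. In the second transmission sub-network of this transmission network, no switching node exists at the (2^(S-1)+1)-th through the 2^S-th positions of layer S, and, when at least one switching node exists at the 1st through the 2^S-th positions of layer S, none of those switching nodes includes an uplink connection line. No collisions occur when data is transmitted over this network. At the same time, compared with a conventional collision-free network, the number of switching nodes and the complexity of the network are both markedly reduced. The transmission network therefore offers fast transmission and low use of transmission resources. When it is used to transmit sparse data, it can greatly reduce transmission overhead and computation overhead and greatly improve the processing efficiency of sparse data.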
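In a specific implementation, the transmission network may use the structure shown in Figure 5(b), in which only the first transmission sub-network uses the evolved structure, or the structure shown in Figure 6(b), in which only the second transmission sub-network uses the evolved structure. Alternatively, the transmission network may use the structure shown in Figure 7, which is a schematic structural diagram of the network obtained by applying both of the sub-network evolution methods described above. In Figure 7, the first transmission sub-network has the same structure as the first transmission sub-network in Figure 5(b), and the second transmission sub-network has the same structure as the second transmission sub-network in Figure 6(b); the details are not repeated here.
Table 1 below compares the networks of Figure 5(b) and Figure 7 with conventional transmission networks. As Table 1 shows, compared with a conventional Crossbar network, the networks of Figure 5(b) and Figure 7 need far fewer 2*2 switching nodes and connection lines; compared with a conventional butterfly network, they avoid collisions.
Table 1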
Figure PCTCN2019099262-appb-000001
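The following describes the specific process of data transmission in step S202 on the basis of the transmission network described above.
Figure 8 is a schematic flowchart of the data transmission method provided by an embodiment of this application. As shown in Figure 8, the process of using the first transmission sub-network to transmit the data to be transmitted to the target addresses includes:
S801. Obtain the target address corresponding to the data to be transmitted that is stored in the 1st through the (N/2)-th source addresses, the target address being represented as a binary value.
Optionally, the target address of a piece of data to be transmitted can be obtained from a preset correspondence between the sequence number of the data to be transmitted and the target address. For example, suppose 8 data items are held in 8 source addresses and 2 of them are data to be transmitted; the target address of the first piece of data to be transmitted is then address 0, and the target address of the second is address 1.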
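One way to picture this correspondence is to number the pieces of data to be transmitted in source-address order. The sketch below is an illustration under stated assumptions, not the embodiment's table: the function name is hypothetical, and it covers only the forward numbering used for the first half of the source addresses (the reverse arrangement of the second half is not modeled).

```python
def assign_targets(valid_flags):
    """Map each flagged source address to the next free target, counting from 0."""
    targets = {}
    next_target = 0
    for source, is_valid in enumerate(valid_flags):
        if is_valid:
            targets[source] = next_target
            next_target += 1
    return targets


# Assumed example: first-half source addresses 0..3, with valid data at 1 and 3.
mapping = assign_targets([False, True, False, True])
print(mapping)
```

Because the k-th flagged item (counting from 0) can never sit at a source address smaller than k, a numbering of this kind always yields a target address no greater than the source address, which is consistent with the first preset relationship.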
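S802. Starting from the least significant bit (LSB) of the target address, determine the transmission path of the data to be transmitted in the transmission network according to the value of each bit of the target address, and transmit the data to be transmitted to the target address over that path.
Take the feature-map segment of the neural network in Figure 1 as the data sequence and the network of Figure 7 as the transmission network. The feature-map segment includes 2 valid data items, which are the data to be transmitted. The segment is stored in the 8 source addresses shown in Figure 7, with data 5 held in source address 0 and the remaining data stored in order. As described above, the valid data in this segment are 5 and 3, and data 3 is held in source address 5. Data 3 can therefore be transmitted over the second transmission sub-network and, according to the second preset relationship, can be transmitted to target address 2. The binary value of target address 1 is 001, so data 3 is routed through the second transmission sub-network starting from the LSB of 001. Specifically, the LSB of 001 is 1, so data 3 is routed from switching node 1 of layer 0 to switching node 1 of layer 2, then directly from switching node 1 of layer 2 to switching node 2 of layer 3, and is finally transmitted to target address 2.
In this embodiment, the transmission network routes the data to be transmitted to the target address according to the LSB, which further increases the data transmission speed.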
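The order in which the bits of a target address are consulted can be sketched as follows. This only enumerates the bits from the LSB upward; how each bit value selects an outgoing line at a switching node depends on the wiring in the figures and is not reproduced here. The function name and the fixed bit width are assumptions.

```python
def bits_lsb_first(address, width):
    """Return the bits of `address`, least significant bit first."""
    return [(address >> i) & 1 for i in range(width)]


# The binary value 001 from the example: the first bit consulted is 1.
print(bits_lsb_first(0b001, width=3))
```

Reading the address LSB-first means the earliest routing decision uses the lowest bit, and later layers use progressively higher bits.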
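Similarly to the process shown in Figure 8, when the second transmission sub-network is used to transmit data, the target address corresponding to the data to be transmitted that is stored in the (N/2+1)-th through the N-th source addresses may first be obtained, the target address being represented as a binary value. Then, starting from the LSB of the target address, the transmission path of the data to be transmitted in the second transmission sub-network is determined according to the value of each bit of the target address, and the data to be transmitted is transmitted to the target address over that path. The specific procedure is the same as the processing of the first transmission sub-network in Figure 8 and is not repeated here.
In the embodiments above, the number M of target addresses is less than the number N of source addresses. For example, M may be 4 and N may be 8. In this case, if the number of pieces of data to be transmitted held in the source addresses is greater than M, not all of the data to be transmitted can be transmitted to the target addresses for processing in one pass. To address this, as an optional implementation, if the number of pieces of data to be transmitted is greater than M, the data to be transmitted can be divided into multiple groups of sub-data, and in each transmission clock the transmission network is used to transmit one group of sub-data to the corresponding target addresses.
Optionally, the data to be transmitted can be divided according to source address. For example, if there are 8 source addresses and 4 target addresses, the data in source addresses 0 to 3 forms the first group of sub-data and the data in source addresses 4 to 7 forms the second group. The data to be transmitted in the first group is then transmitted over the transmission network to the target addresses for computation in one clock, and the data to be transmitted in the second group is transmitted over the transmission network to the target addresses for computation in another clock.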
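The grouping by source address can be sketched as below. The text gives only the 8-source, 4-target case; treating each block of M consecutive source addresses as one group is an assumption that coincides with that case. The function name and the dictionary representation of pending data are also assumptions.

```python
def split_into_groups(pending, m):
    """Group pending data by blocks of m consecutive source addresses.

    `pending` maps source address -> value for the data to be transmitted.
    Each returned group is meant to be sent in its own transmission clock.
    """
    groups = {}
    for source in sorted(pending):
        groups.setdefault(source // m, []).append((source, pending[source]))
    return [groups[block] for block in sorted(groups)]


# Five pending items exceed M = 4 targets, so two clocks are needed.
pending = {0: "a", 1: "b", 3: "c", 4: "d", 6: "e"}
groups = split_into_groups(pending, m=4)
for clock, group in enumerate(groups):
    print("clock", clock, group)
```

Each group holds at most M items, so one group fits the target addresses in a single clock.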
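In this embodiment, when the number of pieces of data to be transmitted is greater than the number of target addresses, the data to be transmitted is divided into multiple groups of sub-data and the groups are transmitted in different clocks, which avoids conflicts in data transmission and computation and ensures that both are correct.
Figure 9 is a module structure diagram of the data transmission apparatus provided by an embodiment of this application. The apparatus may be the electronic device described in the foregoing embodiments, or an apparatus within the electronic device that can implement the functions of the method provided by the embodiments of this application, for example an apparatus or a chip system in the electronic device. As shown in Figure 9, the apparatus includes:
a storage unit 901, a target module 902, a transmission network 903, and a control module 904.
N source addresses are provided in the storage unit 901, and multiple target addresses are provided in the target module 902.
The transmission network 903 is connected to the storage unit 901 and to the target module 902.
The transmission network 903 includes a first transmission sub-network. The first transmission sub-network includes multiple layers, each layer includes at least one switching node, no switching node exists at the (2^(Y-1)+1)-th through the 2^Y-th positions of layer Y, and, when at least one switching node exists at the 1st through the 2^Y-th positions of layer Y, none of those switching nodes includes an uplink connection line.
The control module 904 may be connected to the storage unit 901, the target module 902, and the transmission network 903.
The control module 904 is configured to obtain at least one piece of data to be transmitted from the storage unit 901, the data to be transmitted being stored in a distributed manner across the N source addresses, and, based on a first preset relationship between source addresses and target addresses, to use the first transmission sub-network to transmit the data to be transmitted that is stored in the 1st through the (N/2)-th source addresses to the corresponding target addresses, where the first preset relationship includes: when the source address is K, the corresponding target address is one of addresses 0 to K, numbered from 0.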
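In an optional implementation, the transmission network 903 further includes a second transmission sub-network.
The second transmission sub-network includes multiple layers, each layer includes at least one switching node, no switching node exists at the (2^(S-1)+1)-th through the 2^S-th positions of layer S, and, when at least one switching node exists at the 1st through the 2^S-th positions of layer S, none of those switching nodes includes an uplink connection line.
The control module 904 is further configured to use the second transmission sub-network, based on a second preset relationship between source addresses and target addresses, to transmit the data to be transmitted that is stored in the (N/2+1)-th through the N-th source addresses to the corresponding target addresses, where the second preset relationship includes: when the source address is L, the corresponding target address is one of addresses M-1 down to M-1-[L%(N/2)], counted from M-1, M is the number of target addresses, and M is less than N.
In an optional implementation, the number of layers of the first transmission sub-network is log2(N)+1, and/or the number of layers of the second transmission sub-network is log2(N)+1.
In an optional implementation, the control module 904 is specifically configured to:
obtain the target address corresponding to the data to be transmitted that is stored in the 1st through the (N/2)-th source addresses, the target address being represented as a binary value; and, starting from the LSB of the target address, determine the transmission path of the data to be transmitted in the first transmission sub-network according to the value of each bit of the target address, and transmit the data to be transmitted to the target address over that path.
In an optional implementation, the control module 904 is specifically configured to:
obtain the target address corresponding to the data to be transmitted that is stored in the (N/2+1)-th through the N-th source addresses, the target address being represented as a binary value; and, starting from the LSB of the target address, determine the transmission path of the data to be transmitted in the second transmission sub-network according to the value of each bit of the target address, and transmit the data to be transmitted to the target address over that path.
As an optional implementation, the target module 902 may be a computing module, and the computing module includes at least M addresses.
When the number of pieces of data to be transmitted is greater than M, the control module 904 is further configured to divide the at least one piece of data to be transmitted into multiple groups of sub-data, each group of sub-data being transmitted in one transmission clock.
The data transmission apparatus provided by the embodiments of this application can perform the method steps of the foregoing method embodiments; its implementation principles and technical effects are similar and are not repeated here.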
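It should be noted that the division of the above apparatus into modules is merely a division of logical functions. In an actual implementation, the modules may be fully or partially integrated into one physical entity, or they may be physically separate. The modules may all be implemented as software invoked by a processing element, or all as hardware, or some may be implemented as software invoked by a processing element and others as hardware. For example, the determining module may be a separately provided processing element, or it may be integrated into a chip of the apparatus; it may also be stored in the memory of the apparatus in the form of program code and be invoked by a processing element of the apparatus to perform the functions of the determining module. The other modules are implemented similarly. In addition, all or some of these modules may be integrated together or implemented independently. The processing element described here may be an integrated circuit with signal processing capability. During implementation, the steps of the above method or the above modules may be completed by integrated logic circuits of hardware in the processor element or by instructions in the form of software.
For example, the above modules may be one or more integrated circuits configured to implement the above method, such as one or more application specific integrated circuits (ASIC), one or more digital signal processors (DSP), or one or more field programmable gate arrays (FPGA). As another example, when one of the above modules is implemented as program code scheduled by a processing element, the processing element may be a general-purpose processor, such as a central processing unit (CPU) or another processor that can invoke program code. As yet another example, these modules may be integrated together and implemented in the form of a system-on-a-chip (SOC).
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions described in the embodiments of this application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, they may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (such as infrared, radio, or microwave). The computer-readable storage medium may be any usable medium that a computer can access, or a data storage device such as a server or data center that integrates one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, hard disk, or magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid state disk (SSD)).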
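Figure 10 is a schematic structural diagram of an electronic device provided by an embodiment of this application. As shown in Figure 10, the electronic device 1000 may include a processor 101 (for example, a CPU), a memory 102, and a transceiver 103. The transceiver 103 is coupled to the processor 101, and the processor 101 controls the sending and receiving actions of the transceiver 103. The memory 102 can store various instructions for completing various processing functions and implementing the method steps performed by the electronic device in the embodiments of this application. Optionally, the electronic device in the embodiments of this application may further include a power supply 104, a system bus 105, and a communication port 106. The transceiver 103 may be integrated into the transceiver unit of the electronic device, or it may be an independent transmit/receive antenna on the electronic device. The system bus 105 is used to implement communication connections between components. The communication port 106 is used to implement connection and communication between the electronic device and other peripherals.
In the embodiments of this application, the processor 101 is configured to be coupled to the memory 102 and to read and execute the instructions in the memory 102 so as to implement the method steps performed by the electronic device in the foregoing method embodiments. The implementation principles and technical effects are similar and are not repeated here.
The system bus mentioned in Figure 10 may be a peripheral component interconnect (PCI) bus, an extended industry standard architecture (EISA) bus, or the like. The system bus can be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one thick line is used in the figure, but this does not mean that there is only one bus or one type of bus. The communication interface is used to implement communication between the database access apparatus and other devices (such as clients, read-write libraries, and read-only libraries). The memory may include random access memory (RAM) and may also include non-volatile memory, for example at least one disk memory.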
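The above processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), etc.; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
Optionally, an embodiment of this application further provides a computer-readable storage medium in which instructions are stored; when the instructions are run on a computer, they cause the computer to execute the processing procedure of the electronic device in the foregoing embodiments.
Optionally, an embodiment of this application further provides a chip for running instructions, the chip being used to execute the processing procedure of the electronic device in the foregoing embodiments.
An embodiment of this application further provides a program product that includes a computer program stored in a storage medium; at least one processor can read the computer program from the storage medium, and the at least one processor executes the processing procedure of the electronic device in the foregoing embodiments.
In the embodiments of this application, "at least one" means one or more, and "multiple" means two or more. "And/or" describes an association between associated objects and indicates that three relationships are possible; for example, "A and/or B" can mean: A exists alone, both A and B exist, or B exists alone, where A and B can each be singular or plural. The character "/" generally indicates an "or" relationship between the objects before and after it; in a formula, the character "/" indicates a "division" relationship between the objects before and after it. "At least one of the following items" or a similar expression refers to any combination of those items, including any combination of a single item or of plural items. For example, "at least one of a, b, or c" can mean: a, b, c, a-b, a-c, b-c, or a-b-c, where each of a, b, and c can be single or multiple.
It can be understood that the various numerical designations in the embodiments of this application are used only for ease of description and are not intended to limit the scope of the embodiments of this application.
It can be understood that, in the embodiments of this application, the magnitude of the sequence numbers of the above processes does not imply an order of execution. The execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of this application.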

Claims (17)

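  1. A data transmission method, characterized by comprising:
    obtaining at least one piece of data to be transmitted from a storage unit, wherein N source addresses are provided in the storage unit and the data to be transmitted is stored in a distributed manner across the N source addresses;
    based on a first preset relationship between source addresses and target addresses, using a first transmission sub-network to transmit the data to be transmitted that is stored in the 1st through the (N/2)-th source addresses to corresponding target addresses, wherein the first preset relationship comprises: when the source address is K, the corresponding target address is one of addresses 0 to K, numbered from 0;
    wherein the first transmission sub-network comprises multiple layers, each layer comprises at least one switching node, no switching node exists at the (2^(Y-1)+1)-th through the 2^Y-th positions of layer Y, and, when at least one switching node exists at the 1st through the 2^Y-th positions of layer Y, none of the at least one switching node comprises an uplink connection line.
  2. The method according to claim 1, characterized in that the method further comprises:
    based on a second preset relationship between source addresses and target addresses, using a second transmission sub-network to transmit the data to be transmitted that is stored in the (N/2+1)-th through the N-th source addresses to corresponding target addresses, wherein the second preset relationship comprises: when the source address is L, the corresponding target address is one of addresses M-1 down to M-1-[L%(N/2)], counted from M-1, M is the number of target addresses, and M is less than N;
    the second transmission sub-network comprises multiple layers, each layer comprises at least one switching node, no switching node exists at the (2^(S-1)+1)-th through the 2^S-th positions of layer S, and, when at least one switching node exists at the 1st through the 2^S-th positions of layer S, none of the at least one switching node comprises an uplink connection line.
  3. The method according to claim 2, characterized in that the number of layers of the first transmission sub-network is log2(N)+1, and/or the number of layers of the second transmission sub-network is log2(N)+1.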
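  4. The method according to any one of claims 1-3, characterized in that using the first transmission sub-network to transmit the data to be transmitted that is stored in the 1st through the (N/2)-th source addresses to the corresponding target addresses comprises:
    obtaining the target address corresponding to the data to be transmitted that is stored in the 1st through the (N/2)-th source addresses, the target address being represented as a binary value;
    starting from the least significant bit (LSB) of the target address, determining the transmission path of the data to be transmitted in the first transmission sub-network according to the value of each bit of the target address, and transmitting the data to be transmitted to the target address over the transmission path in the first transmission sub-network.
  5. The method according to claim 2 or 3, characterized in that using the second transmission sub-network to transmit the data to be transmitted that is stored in the (N/2+1)-th through the N-th source addresses to the corresponding target addresses comprises:
    obtaining the target address corresponding to the data to be transmitted that is stored in the (N/2+1)-th through the N-th source addresses, the target address being represented as a binary value;
    starting from the LSB of the target address, determining the transmission path of the data to be transmitted in the second transmission sub-network according to the value of each bit of the target address, and transmitting the data to be transmitted to the target address over the transmission path in the second transmission sub-network.
  6. The method according to any one of claims 1-5, characterized in that the target address is an address in a computing module, and the computing module comprises at least M addresses.
  7. The method according to claim 6, characterized in that, before using the first transmission sub-network, based on the first preset relationship between source addresses and target addresses, to transmit the data to be transmitted that is stored in the 1st through the (N/2)-th source addresses to the corresponding target addresses, the method further comprises:
    if the number of pieces of data to be transmitted is greater than M, dividing the at least one piece of data to be transmitted into multiple groups of sub-data, each group of sub-data being transmitted in one transmission clock.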
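  8. A data transmission apparatus, characterized by comprising: a storage unit, a target module, a transmission network, and a control module;
    N source addresses are provided in the storage unit;
    multiple target addresses are provided in the target module;
    the transmission network is connected to the storage unit and to the target module;
    the transmission network comprises a first transmission sub-network, the first transmission sub-network comprises multiple layers, each layer comprises at least one switching node, no switching node exists at the (2^(Y-1)+1)-th through the 2^Y-th positions of layer Y, and, when at least one switching node exists at the 1st through the 2^Y-th positions of layer Y, none of the at least one switching node comprises an uplink connection line;
    the control module is configured to obtain at least one piece of data to be transmitted from the storage unit, the data to be transmitted being stored in a distributed manner across the N source addresses, and, based on a first preset relationship between source addresses and target addresses, to use the first transmission sub-network to transmit the data to be transmitted that is stored in the 1st through the (N/2)-th source addresses to corresponding target addresses, wherein the first preset relationship comprises: when the source address is K, the corresponding target address is one of addresses 0 to K, numbered from 0.
  9. The apparatus according to claim 8, characterized in that the transmission network further comprises a second transmission sub-network;
    the second transmission sub-network comprises multiple layers, each layer comprises at least one switching node, no switching node exists at the (2^(S-1)+1)-th through the 2^S-th positions of layer S, and, when at least one switching node exists at the 1st through the 2^S-th positions of layer S, none of the at least one switching node comprises an uplink connection line;
    the control module is further configured to use the second transmission sub-network, based on a second preset relationship between source addresses and target addresses, to transmit the data to be transmitted that is stored in the (N/2+1)-th through the N-th source addresses to corresponding target addresses, wherein the second preset relationship comprises: when the source address is L, the corresponding target address is one of addresses M-1 down to M-1-[L%(N/2)], counted from M-1, M is the number of target addresses, and M is less than N.
  10. The apparatus according to claim 9, characterized in that the number of layers of the first transmission sub-network is log2(N)+1, and/or the number of layers of the second transmission sub-network is log2(N)+1.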
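  11. The apparatus according to any one of claims 8-10, characterized in that the control module is specifically configured to:
    obtain the target address corresponding to the data to be transmitted that is stored in the 1st through the (N/2)-th source addresses, the target address being represented as a binary value; and
    starting from the least significant bit (LSB) of the target address, determine the transmission path of the data to be transmitted in the first transmission sub-network according to the value of each bit of the target address, and transmit the data to be transmitted to the target address over the transmission path in the first transmission sub-network.
  12. The apparatus according to claim 9 or 10, characterized in that the control module is specifically configured to:
    obtain the target address corresponding to the data to be transmitted that is stored in the (N/2+1)-th through the N-th source addresses, the target address being represented as a binary value; and
    starting from the LSB of the target address, determine the transmission path of the data to be transmitted in the second transmission sub-network according to the value of each bit of the target address, and transmit the data to be transmitted to the target address over the transmission path in the second transmission sub-network.
  13. The apparatus according to any one of claims 8-12, characterized in that the target module is a computing module, and the computing module comprises at least M addresses.
  14. The apparatus according to claim 13, characterized in that the control module is further configured to:
    when the number of pieces of data to be transmitted is greater than M, divide the at least one piece of data to be transmitted into multiple groups of sub-data, each group of sub-data being transmitted in one transmission clock.
  15. An electronic device, characterized by comprising: a memory and a processor;
    the processor is configured to be coupled to the memory and to read and execute instructions in the memory so as to implement the method steps according to any one of claims 1-7.
  16. A computer program product, characterized in that the computer program product comprises computer program code which, when executed by a computer, causes the computer to perform the method according to any one of claims 1-7.
  17. A computer-readable storage medium, characterized in that the computer storage medium stores computer instructions which, when executed by a computer, cause the computer to execute the instructions of the method according to any one of claims 1-7.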
PCT/CN2019/099262 2019-08-05 2019-08-05 数据传输方法、装置、电子设备及可读存储介质 WO2021022441A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201980098672.1A CN114144793A (zh) 2019-08-05 2019-08-05 数据传输方法、装置、电子设备及可读存储介质
PCT/CN2019/099262 WO2021022441A1 (zh) 2019-08-05 2019-08-05 数据传输方法、装置、电子设备及可读存储介质

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/099262 WO2021022441A1 (zh) 2019-08-05 2019-08-05 数据传输方法、装置、电子设备及可读存储介质

Publications (1)

Publication Number Publication Date
WO2021022441A1 true WO2021022441A1 (zh) 2021-02-11

Family

ID=74502548

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/099262 WO2021022441A1 (zh) 2019-08-05 2019-08-05 数据传输方法、装置、电子设备及可读存储介质

Country Status (2)

Country Link
CN (1) CN114144793A (zh)
WO (1) WO2021022441A1 (zh)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107003943A (zh) * 2016-12-05 2017-08-01 华为技术有限公司 NVMe over Fabric架构中数据读写命令的控制方法、存储设备和***
CN109165728A (zh) * 2018-08-06 2019-01-08 济南浪潮高新科技投资发展有限公司 一种卷积神经网络的基本计算单元及计算方法
CN109214543A (zh) * 2017-06-30 2019-01-15 华为技术有限公司 数据处理方法及装置
CN109284130A (zh) * 2017-07-20 2019-01-29 上海寒武纪信息科技有限公司 神经网络运算装置及方法
US20190147342A1 (en) * 2017-11-13 2019-05-16 Raytheon Company Deep neural network processor with interleaved backpropagation

Also Published As

Publication number Publication date
CN114144793A (zh) 2022-03-04

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19940350

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19940350

Country of ref document: EP

Kind code of ref document: A1