CN103493039B - Data processing method, data processing equipment, access device and subscriber equipment - Google Patents

Info

Publication number
CN103493039B
Authority
CN
China
Prior art keywords
radix
butterfly operation
data
level
butterfly
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201280000317.4A
Other languages
Chinese (zh)
Other versions
CN103493039A (en)
Inventor
周扬
刘彬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/14 Fourier, Walsh or analogous domain transformations, e.g. Laplace, Hilbert, Karhunen-Loeve, transforms
    • G06F17/141 Discrete Fourier transforms
    • G06F17/142 Fast Fourier transforms, e.g. using a Cooley-Tukey type algorithm

Landscapes

  • Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Discrete Mathematics (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Traffic Control Systems (AREA)
  • Complex Calculations (AREA)

Abstract

A data processing method, a data processing apparatus, an access device, and a user equipment. The data processing method may include: inputting y paths of data to be subjected to a z-th-level radix-y butterfly operation into a radix-y butterfly operator to perform the z-th-level radix-y butterfly operation, where y and z are positive integers; when the z-th-level radix-y butterfly operation is the last-level radix-y butterfly operation, outputting the result data obtained by the z-th-level radix-y butterfly operation; and when the z-th-level radix-y butterfly operation is not the last-level radix-y butterfly operation, inputting the result data obtained by the z-th-level radix-y butterfly operation into the radix-y butterfly operator to perform the (z+1)-th-level radix-y butterfly operation. The technical solution provided by the embodiments of the present invention helps reduce the resource consumption required for performing butterfly operations on data.

Description

Data processing method, data processing device, access equipment and user equipment
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a data processing method, a data processing apparatus, an access device, and a user equipment.
Background
At present, data processing is needed in processes such as channel estimation, signal analysis, and signal comparison in the communication field. Fourier analysis is a widely used method for this purpose: the data is processed after Fourier analysis, which simplifies channel estimation, signal analysis, and similar processes. Fourier analysis can be implemented by a Fast Fourier Transform (FFT) operation processing unit, and it has similar applications in other data processing fields.
The FFT operation processing unit in an existing data processing system can be implemented with various algorithms (e.g., radix-2 butterfly, radix-4 butterfly, and mixed-radix algorithms). However, the utilization of butterfly computation resources is still relatively low. For example, 256-point data requires 4 levels of radix-4 butterfly operation, which consumes 4 sets of operation circuits and storage resources, so the resource consumption is relatively high.
Disclosure of Invention
The embodiment of the invention provides a data processing method, a data processing device, access equipment and user equipment, which are used for reducing resource consumption required by butterfly operation on data.
An aspect of an embodiment of the present invention provides a data processing method, which may include:
inputting y paths of data to be subjected to the z-th-level radix y butterfly operation into a radix y butterfly operator to perform the z-th-level radix y butterfly operation, wherein y and z are positive integers;
under the condition that the z-th-stage radix y butterfly operation is the last-stage radix y butterfly operation, outputting result data obtained by the z-th-stage radix y butterfly operation;
and under the condition that the z-th-stage radix y butterfly operation is not the last-stage radix y butterfly operation, inputting result data obtained by the z-th-stage radix y butterfly operation into the radix y butterfly operator to perform z + 1-th-stage radix y butterfly operation.
Another aspect of the embodiments of the present invention further provides a data processing apparatus, including:
a controller and a radix-y butterfly operator;
the controller is used for inputting y paths of data to be subjected to the z-th-level radix y butterfly operation into the radix y butterfly operation device to be subjected to the z-th-level radix y butterfly operation, and outputting result data obtained by the z-th-level radix y butterfly operation under the condition that the z-th-level radix y butterfly operation is the last-level radix y butterfly operation; and under the condition that the z-th-stage radix y butterfly operation is not the last-stage radix y butterfly operation, inputting result data obtained by the z-th-stage radix y butterfly operation into the radix y butterfly operator to perform z + 1-th-stage radix y butterfly operation, wherein y and z are positive integers.
In another aspect, an embodiment of the present invention further provides an access device, where the data processing apparatus according to the foregoing embodiment is deployed in the access device.
In another aspect, an embodiment of the present invention further provides a user equipment, where the data processing apparatus according to the foregoing embodiment is deployed in the user equipment.
Another aspect of embodiments of the present invention also provides a computer storage medium. The computer storage medium stores a program which, when executed, performs some or all of the steps of the data processing method described above.
As can be seen from the above, in the embodiment of the present invention, y paths of data to be subjected to the z-th-level radix-y butterfly operation are input to a radix-y butterfly operator to perform the z-th-level radix-y butterfly operation. When the z-th-level radix-y butterfly operation is the last-level radix-y butterfly operation, the result data obtained by the z-th-level radix-y butterfly operation is output; when it is not the last-level radix-y butterfly operation, the result data is input into the radix-y butterfly operator to perform the (z+1)-th-level radix-y butterfly operation. This mechanism multiplexes the radix-y butterfly operator: two levels of radix-y butterfly operation share the same radix-y butterfly operator. For example, the time taken by data input can be used to time-multiplex the radix-y butterfly operator, which helps reduce the idle time of the butterfly operator and the resource consumption.
Drawings
In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the following briefly introduces the drawings used in the description of the embodiments and of the prior art. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1-a is a schematic diagram of a radix-2 butterfly model according to an embodiment of the invention;
FIG. 1-b is a schematic diagram of a radix-4 butterfly model according to an embodiment of the invention;
FIG. 2 is a schematic diagram of a 4-level radix-4 butterfly circuit according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating read and write operations of a shift register;
FIG. 4 is a diagram illustrating a data processing method according to an embodiment of the present invention;
FIG. 5 is a diagram of a data processing apparatus according to an embodiment of the present invention;
FIG. 6 is a diagram illustrating the operation of a shift register according to an embodiment of the present invention;
FIG. 7-a is a schematic diagram of a butterfly order arrangement according to an embodiment of the invention;
FIG. 7-b is a schematic diagram of another butterfly order arrangement according to an embodiment of the invention;
FIG. 8-a is a schematic diagram of a data processing apparatus according to an embodiment of the present invention;
FIG. 8-b is a schematic diagram of another data processing apparatus provided by an embodiment of the present invention;
FIG. 9 is a schematic diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The embodiment of the invention provides a data processing method, a data processing device, access equipment and user equipment, which are used for reducing resource consumption required by butterfly operation on data.
In order to make the objects, features and advantages of the present invention more obvious and understandable, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the embodiments described below are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
First, it can be derived from the FFT formula:
$$
\begin{aligned}
X(k) = \mathrm{FFT}[x(n)] &= \sum_{n=0}^{N-1} x(n)\, W_N^{kn} \\
&= \sum_{n=0}^{N/2-1} x(n)\, W_N^{kn} + \sum_{n=N/2}^{N-1} x(n)\, W_N^{kn} \\
&= \sum_{n=0}^{N/2-1} x(n)\, W_N^{kn} + \sum_{n=0}^{N/2-1} x\!\left(n + \tfrac{N}{2}\right) W_N^{k\left(n + \frac{N}{2}\right)} \\
&= \sum_{n=0}^{N/2-1} \left[ x(n) + W_N^{k\frac{N}{2}}\, x\!\left(n + \tfrac{N}{2}\right) \right] W_N^{kn}
\end{aligned}
$$
wherein $W_N^{kn} = e^{-j\frac{2\pi}{N}kn}$, $0 \le k \le N-1$;
x(n) is the data on which the FFT operation is to be performed, X(k) is the result data of the FFT operation on x(n), and N is the number of points of the data on which the FFT operation is to be performed.
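The derivation above can be checked numerically. The following is a minimal sketch, not part of the patent text: the function names `twiddle`, `dft_direct`, and `dft_half_split` and the 8-point test vector are illustrative assumptions. It evaluates the first and the last expression of the derivation and compares them.

```python
import cmath

def twiddle(N, e):
    """Return W_N^e = exp(-j*2*pi*e/N) (hypothetical helper name)."""
    return cmath.exp(-2j * cmath.pi * e / N)

def dft_direct(x):
    """First expression: sum over n = 0..N-1 of x(n) * W_N^(k*n)."""
    N = len(x)
    return [sum(x[n] * twiddle(N, k * n) for n in range(N)) for k in range(N)]

def dft_half_split(x):
    """Last expression: sum over n = 0..N/2-1 of
    [x(n) + W_N^(k*N/2) * x(n + N/2)] * W_N^(k*n)."""
    N = len(x)
    half = N // 2
    return [
        sum((x[n] + twiddle(N, k * half) * x[n + half]) * twiddle(N, k * n)
            for n in range(half))
        for k in range(N)
    ]

# Illustrative 8-point input (an assumption, not data from the source).
x = [1, 2, 3, 4, 5, 6, 7, 8]
direct = dft_direct(x)
split = dft_half_split(x)
max_err = max(abs(p - q) for p, q in zip(direct, split))
```

Both forms agree up to floating-point rounding, which is what allows an N-point transform to be built from butterflies that pair x(n) with x(n + N/2).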
The radix-2 butterfly may be as shown in FIG. 1-a.
It can be understood that the FFT pipeline for 2^k-point data can be obtained by cascading k levels of radix-2 butterfly operation.
Two levels of radix-2 butterfly operation are equivalent to one level of radix-4 butterfly operation. The radix-4 butterfly operation divides the N-point input data equally into 4 parts and takes the corresponding data from each part for every calculation; the radix-4 butterfly operation may be as shown in FIG. 1-b. An FFT operation on 4^k-point data requires a cascade of k levels of radix-4 butterfly operation. For example, an FFT operation on 256-point data requires 4 levels of radix-4 butterfly operation.
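The equivalence between two radix-2 levels and one radix-4 level can be illustrated on four samples. This is a hedged sketch under the standard decimation-in-frequency convention, which the source does not spell out; the function names are hypothetical.

```python
def radix4_butterfly(a, b, c, d):
    """One radix-4 butterfly: the 4-point DFT of (a, b, c, d)."""
    return (a + b + c + d,
            a - 1j * b - c + 1j * d,
            a - b + c - d,
            a + 1j * b - c - 1j * d)

def two_radix2_stages(a, b, c, d):
    """The same 4-point DFT computed as two cascaded radix-2 levels."""
    # Level 1: pair samples that are N/2 = 2 apart.
    s0, s1 = a + c, b + d
    d0, d1 = a - c, (b - d) * (-1j)   # W_4^1 = -j applied to the difference
    # Level 2: radix-2 butterflies on each half, outputs listed as X0..X3.
    return (s0 + s1, d0 + d1, s0 - s1, d0 - d1)

sample = (1 + 2j, -3, 0.5j, 4 - 1j)   # illustrative values only
r4 = radix4_butterfly(*sample)
r2 = two_radix2_stages(*sample)
max_diff = max(abs(p - q) for p, q in zip(r4, r2))
```

The two functions return the same four values, so a radix-4 operator can stand in for a pair of radix-2 levels.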
Referring to FIG. 2, FIG. 2 is a schematic diagram of an exemplary 4-level radix-4 butterfly circuit. Generally, when a radix-4 butterfly operation is performed on N-point data and writing 1 point of data takes 1 clock cycle, 3N/4 points of data are stored in sequence in three shift registers during the first 3N/4 cycles, and the radix-4 butterfly operation circuit starts operating on the data only when the last N/4 points of data are input. The radix-4 butterfly operation circuit is therefore idle for about 3/4 of the time, and the utilization of operation resources is relatively low. For example, 256-point data requires 4 levels of radix-4 butterfly operation, which consumes 4 sets of operation circuits and storage resources, so the resource consumption is relatively high.
To facilitate understanding of the operation mechanism of the architecture shown in FIG. 2, the following takes the first-level radix-4 butterfly operation as an example. N-point data is input; assuming that one point of data is input per clock cycle, N clock cycles are required. During the first N/4 clock cycles, the input data is buffered into shift register R0; from N/4 to N/2 clock cycles, the input data is buffered into shift register R1; from N/2 to 3N/4 clock cycles, the input data is buffered into shift register R2. During the last N/4 cycles, data is read from the three shift registers simultaneously and input, in step with the last N/4 points of data, into the radix-4 butterfly operator for the butterfly operation. The operation yields 4 paths of N/4 points of data at the same time, on which N/4-point FFT operations still need to be performed. The first path of N/4 points is sent to the next-level butterfly operator, and the remaining three paths of N/4 points are written back to the corresponding buffers. After all N/4 points of the first path have been sent to the next-level radix-4 butterfly operator, N/4 points of data are read from shift register 0 and sent to the next-level radix-4 butterfly operator, then the corresponding N/4 points are read from shift register 1 and sent to the next-level radix-4 butterfly operator, then the corresponding data is read from shift register 2 and sent to the next-level radix-4 butterfly operator, and so on.
As an example, FIG. 3 illustrates the operation of the shift registers when a radix-4 butterfly operation is performed on 8-point data. A1-A8 are the 8 points of data, and A_1-A_8 are the data obtained after A1-A8 undergo the radix-4 butterfly operation. As shown in states P1-P6 in FIG. 3, the data is stored in sequence in 3 shift registers (shift registers are used here as an example) during the first 3N/4 cycles. The butterfly operation is completed during the last N/4 cycles (P7, P8); A7 and A8 are not sent to the shift registers but are sent directly to the radix-4 butterfly operator together with the previously stored data. After each operation, the last 3 of the 4 result data are sent back to the shift registers (i.e., backfilled, as shown in P8), and the first result data (i.e., A_1, A_2) is sent to the next level for the radix-4 butterfly operation (e.g., P9). Then N/4 points of data are fetched in turn from the 3 shift registers and sent to the next-level radix-4 butterfly operation; at the same time, the next set of N input data (i.e., B1-B8, as shown in P10 and P11) can be written into the shift registers.
In FIG. 3, P1-P6 are the data preparation stage, during which the radix-4 butterfly operator is idle; P7-P9 are the data operation stage; and P10-P11 are the data write-back states. The calculation process for data with other numbers of points follows by analogy.
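The buffering phase described above can be sketched as a small cycle-by-cycle model. This is an illustrative simplification that only covers the first-level input schedule (states P1-P8), not the write-back; the function name `first_stage_schedule` and the register list are assumptions, not names from the source.

```python
def first_stage_schedule(samples):
    """Model one sample arriving per clock cycle.

    The first 3N/4 samples are buffered in three shift registers (R0, R1, R2);
    each of the last N/4 samples goes straight to the radix-4 butterfly
    together with one buffered sample from each register.
    Returns the butterfly input tuples and the fraction of cycles in which
    the butterfly operator is busy.
    """
    n = len(samples)
    quarter = n // 4
    registers = [[], [], []]          # R0, R1, R2
    butterfly_inputs = []
    busy_cycles = 0
    for cycle, sample in enumerate(samples):
        bank = cycle // quarter
        if bank < 3:
            registers[bank].append(sample)   # preparation: operator is idle
        else:
            i = cycle - 3 * quarter
            butterfly_inputs.append(
                (registers[0][i], registers[1][i], registers[2][i], sample))
            busy_cycles += 1
    return butterfly_inputs, busy_cycles / n

inputs, utilization = first_stage_schedule(
    ["A1", "A2", "A3", "A4", "A5", "A6", "A7", "A8"])
```

For the 8-point example the operator works in only 2 of the 8 input cycles, matching the statement that it is idle for about 3/4 of the time.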
The scheme provided by the embodiment of the invention is beneficial to reducing the resource consumption required by butterfly operation on the data.
The following are detailed descriptions of the respective embodiments.
In one embodiment of the data processing method of the present invention, the method may include: inputting y paths of data to be subjected to the z-th-level radix-y butterfly operation into a radix-y butterfly operator to perform the z-th-level radix-y butterfly operation, wherein y and z are positive integers; when the z-th-level radix-y butterfly operation is the last-level radix-y butterfly operation, outputting the result data obtained by the z-th-level radix-y butterfly operation; and when the z-th-level radix-y butterfly operation is not the last-level radix-y butterfly operation, inputting the result data obtained by the z-th-level radix-y butterfly operation into the radix-y butterfly operator to perform the (z+1)-th-level radix-y butterfly operation.
Referring to fig. 4, a data processing method provided in an embodiment of the present invention may include the following steps:
401. Inputting y paths of data to be subjected to the z-th-level radix-y butterfly operation into a radix-y butterfly operator to perform the z-th-level radix-y butterfly operation;
It is understood that y and z are positive integers. Assume that an FFT operation needs to be performed on 256-point (4^4) data: if the FFT operation is performed through radix-2 butterfly operations, y is 2; if it is performed through radix-4 butterfly operations, y is 4; if it is performed through radix-3 butterfly operations, y is 3; if it is performed through radix-8 butterfly operations, y is 8; if it is performed through radix-16 butterfly operations, y is 16; and so on.
In some embodiments of the present invention, for an N-point FFT operation processed with radix-y butterflies, M = log_y N levels of radix-y butterfly operation are needed, and z is less than or equal to M. Generally, N is a positive integer power of 2. When N is not a positive integer power of 2, the data may be padded: for example, a number of zeros may be appended to the original N-point data so that the total number of points after zero padding is a positive integer power of 2.
402. Under the condition that the z-th-stage radix y butterfly operation is the last-stage radix y butterfly operation, outputting result data obtained by the z-th-stage radix y butterfly operation; and under the condition that the z-th-level radix y butterfly operation is not the last-level radix y butterfly operation, inputting result data obtained by the z-th-level radix y butterfly operation into the radix y butterfly operator to perform z + 1-th-level radix y butterfly operation.
It can be seen that, in this embodiment, y paths of data to be subjected to the z-th-level radix-y butterfly operation are input to a radix-y butterfly operator to perform the z-th-level radix-y butterfly operation. When the z-th-level radix-y butterfly operation is the last-level radix-y butterfly operation, the result data obtained by the z-th-level radix-y butterfly operation is output; when it is not the last-level radix-y butterfly operation, the result data is input into the radix-y butterfly operator to perform the (z+1)-th-level radix-y butterfly operation. This mechanism multiplexes the radix-y butterfly operator: two levels of radix-y butterfly operation share the same radix-y butterfly operator. For example, the time taken by data input can be used to time-multiplex the radix-y butterfly operator, which helps reduce the idle time of the butterfly operator and the resource consumption.
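Steps 401 and 402 can be sketched in software as a loop that feeds the output of level z back into the same butterfly routine for level z+1. The sketch below is one possible reading for y = 4, using a standard decimation-in-frequency radix-4 FFT; the twiddle-factor placement, the digit-reversed output order, and all names (`radix4_butterfly`, `digit_reverse`, `fft_reusing_one_butterfly`) are assumptions that the source does not specify.

```python
import cmath

def radix4_butterfly(a, b, c, d):
    """The single shared operator: a 4-point DFT of its four inputs."""
    return (a + b + c + d,
            a - 1j * b - c + 1j * d,
            a - b + c - d,
            a + 1j * b - c - 1j * d)

def digit_reverse(k, stages, radix=4):
    """Reverse the base-`radix` digits of k over `stages` digits."""
    r = 0
    for _ in range(stages):
        r = r * radix + k % radix
        k //= radix
    return r

def fft_reusing_one_butterfly(x):
    """Radix-4 FFT in which every level calls the same butterfly routine."""
    n = len(x)
    stages, size = 0, 1
    while size < n:
        size *= 4
        stages += 1
    if size != n:
        raise ValueError("length must be a power of 4")
    data = [complex(v) for v in x]
    block = n
    for z in range(1, stages + 1):        # level z of M = stages levels
        quarter = block // 4
        for base in range(0, n, block):
            for i in range(quarter):
                idx = [base + i + r * quarter for r in range(4)]
                outs = radix4_butterfly(*(data[j] for j in idx))
                for r in range(4):
                    w = cmath.exp(-2j * cmath.pi * r * i / block)
                    data[idx[r]] = outs[r] * w
        # If z is not the last level, `data` is fed back to the same operator.
        block = quarter
    # After the last level, results sit in digit-reversed order.
    return [data[digit_reverse(k, stages)] for k in range(n)]

x = [complex(i, -i) for i in range(16)]
reference = [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / 16)
                 for n in range(16)) for k in range(16)]
result = fft_reusing_one_butterfly(x)
max_err = max(abs(p - q) for p, q in zip(reference, result))
```

Only one butterfly routine exists in this sketch, in contrast to the cascade of FIG. 2, where each level has its own operation circuit. The list `data` plays the role of the intermediate storage between levels.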
In some embodiments of the present invention, a buffer may also be introduced to store, for example, the intermediate result data of the radix-y butterfly operations. For example, when the z-th-level radix-y butterfly operation is not the last-level radix-y butterfly operation, the y paths of result data obtained by the z-th-level radix-y butterfly operation may be written into y cache regions respectively; each path of result data written into the y cache regions is then equally divided into y parts, which are input, as y paths of data to be subjected to the (z+1)-th-level radix-y butterfly operation, into the radix-y butterfly operator to perform the (z+1)-th-level radix-y butterfly operation.
In addition, when the (z+1)-th-level radix-y butterfly operation is the last-level radix-y butterfly operation, the result data of the (z+1)-th-level radix-y butterfly operation can be output; when it is not the last-level radix-y butterfly operation, the result data obtained by the (z+1)-th-level radix-y butterfly operation can be input into the radix-y butterfly operator (or another radix-y butterfly operator) to perform the (z+2)-th-level radix-y butterfly operation. In the latter case, for example, each path of the result data obtained by the (z+1)-th-level radix-y butterfly operation may be written into the y cache regions; each path of result data written into the y cache regions is then equally divided into y parts, which are input, as y paths of data to be subjected to the (z+2)-th-level radix-y butterfly operation, into the radix-y butterfly operator to perform the (z+2)-th-level radix-y butterfly operation. If the result data requires further radix-y butterfly operations, processing loops in the same manner as for the (z+2)-th level until all levels of radix-y butterfly operation are complete.
The y cache regions may be located in y buffers respectively; alternatively, the y cache regions may be located in at least one buffer, where the buffer may be, for example, a shift buffer (e.g., a shift register or another shift buffer).
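The equal division of a buffered path into y parts can be sketched as follows. This is an illustrative reading only: the helper names `split_path` and `next_level_inputs`, and the use of integer indices in place of data points, are assumptions.

```python
def split_path(path, y):
    """Equally divide one buffered path of result data into y parts."""
    if len(path) % y:
        raise ValueError("path length must be divisible by y")
    part = len(path) // y
    return [path[i * part:(i + 1) * part] for i in range(y)]

def next_level_inputs(path, y):
    """Pair up corresponding elements of the y parts.

    Each returned tuple is one set of y inputs for the next-level
    radix-y butterfly operation.
    """
    return list(zip(*split_path(path, y)))

# Illustration: a 64-point transform with y = 4. After level 1 there are
# 4 paths of 16 points, one per cache region (indices stand in for data).
cache_regions = [list(range(r * 16, (r + 1) * 16)) for r in range(4)]
inputs_region0 = next_level_inputs(cache_regions[0], 4)
```

Here `inputs_region0` lists which buffered positions of the first cache region enter the operator together at the next level.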
It can be understood that, if the FFT operation result has not been obtained after the current-level radix-y butterfly operation, that is, when the current-level radix-y butterfly operation is not the last-level radix-y butterfly operation, processing may loop based on this mechanism: the current-level result data written into the y cache regions is read from the y cache regions, each path of that result data is equally divided into y parts, and the y parts are input, as y paths of data to be subjected to the next-level radix-y butterfly operation, into the radix-y butterfly operator to perform the next-level radix-y butterfly operation. The FFT operation result of the data is finally obtained through this loop.
For example, take an FFT operation on 256-point (4^4) data, which requires 4 levels of radix-4 butterfly operation. Level 1 performs the radix-4 butterfly operation on the 256-point data to obtain 4 groups of data with 64 points per group; level 2 performs the radix-4 butterfly operation on the 4 groups of 64 points to obtain 16 groups of data with 16 points per group; level 3 performs the radix-4 butterfly operation on the 16 groups of 16 points to obtain 64 groups of data with 4 points per group; and level 4 performs the radix-4 butterfly operation on the 64 groups of 4 points to obtain the FFT operation result of the 256-point data. The FFT operation on data with other numbers of points follows by analogy.
Referring to FIG. 5, the data processing apparatus in FIG. 5 includes 4 shift registers, each with a depth of at least 64 (i.e., each storing at least 64 points of data), and one radix-4 butterfly operator. After each operation is completed, the operation result data is written back to the shift registers; when the next-level radix-4 butterfly operation starts, the data to be subjected to the next-level radix-4 butterfly operation is read from the shift registers and processed, until the FFT operation result is obtained and output.
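The group counts quoted above follow directly from dividing each group by the radix at every level. A small sketch (the function name `groups_per_level` is hypothetical) tabulates the number of groups and points per group produced by each level:

```python
def groups_per_level(n_points, radix):
    """Return [(groups, points_per_group), ...] produced by each level."""
    plan = []
    groups, size = 1, n_points
    while size > 1:
        if size % radix:
            raise ValueError("n_points must be a power of radix")
        groups *= radix
        size //= radix
        plan.append((groups, size))
    return plan

plan_256 = groups_per_level(256, 4)
```

For 256 points and radix 4, the first three entries are 4 groups of 64, 16 groups of 16, and 64 groups of 4. The fourth entry, 256 groups of 1 point, corresponds to the finished FFT result, and the length of the list is the number of levels M = log_4 256 = 4.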
In some embodiments of the present invention, for example, after all data to be subjected to each level of radix-y butterfly operation is subjected to the level of radix-y butterfly operation, a next level of radix-y butterfly operation may be performed on all data to be subjected to a next level of radix-y butterfly operation.
For example, after the (z+1)-th-level radix-y butterfly operation has produced y×y paths of (z+1)-th-level result data, and when the (z+1)-th-level radix-y butterfly operation is not the last-level radix-y butterfly operation, each path of the y×y paths of result data may be equally divided into y parts, which are input, as y paths of data to be subjected to the (z+2)-th-level radix-y butterfly operation, into the radix-y butterfly operator (or another butterfly operator) to perform the (z+2)-th-level radix-y butterfly operation. For instance, the y×y paths of result data obtained by the (z+1)-th-level radix-y butterfly operation may be written into y cache regions; the y parts into which each path of result data written into the y cache regions is equally divided are then input, as y paths of data to be subjected to the (z+2)-th-level radix-y butterfly operation, into the radix-y butterfly operator to perform the (z+2)-th-level radix-y butterfly operation. If the (z+2)-th-level result data requires further radix-y butterfly operations, processing loops in the same manner until all levels of radix-y butterfly operation are complete. In this mode, the next level of radix-y butterfly operation is performed on all of its input data only after the current level has been performed on all of its input data.
Take an FFT operation on 256-point (4^4) data as an example. The 256-point data requires 4 levels of radix-4 butterfly operation. After level 1 completes the radix-4 butterfly operation on the 256-point data, 4 groups of data with 64 points per group are obtained. Level 2 then performs the radix-4 butterfly operation on the 4 groups of 64 points (each of the 4 groups is equally divided into 4 parts, and the 4 parts of each group are input, as 4 paths of data to be subjected to the level-2 radix-4 butterfly operation, into the radix-4 butterfly operator), yielding 16 groups of data with 16 points per group. After the radix-4 butterfly operations on all 4 groups of 64 points are complete, level 3 performs the radix-4 butterfly operation on the 16 groups of 16 points, yielding 64 groups of data with 4 points per group. After the radix-4 butterfly operations on all 16 groups of 16 points are complete, level 4 performs the radix-4 butterfly operation on the 64 groups of 4 points, yielding the FFT operation result of the 256-point data. The FFT operation on data with other numbers of points follows by analogy.
In other embodiments of the present invention, the data to be subjected to the k-th-level radix-y butterfly operation (e.g., level 2, level 3, or another non-final level) may be divided into several parts (e.g., 2 parts, 3 parts, or more). The k-th-level radix-y butterfly operation is first performed on one part of the data, followed by all remaining levels of radix-y butterfly operation for that part. The k-th-level radix-y butterfly operation is then performed on another part of the data, again followed by all remaining levels for that part. If further parts of the data to be subjected to the k-th-level radix-y butterfly operation remain, they are processed in the same way, one part at a time, until all levels of radix-y butterfly operation have been completed for all of the data and the FFT calculation result is obtained.
For example, among y paths of result data obtained by performing the z-th level radix y butterfly operation on y paths of data, y parts into which each path of data of m paths of result data (the result data written into the y buffers is also the result data calculated first) of y buffers is equally divided are firstly written as y paths of data to be subjected to the z + 1-th level radix y butterfly operation to be input into the radix y butterfly operator for performing the z + 1-th level radix y butterfly operation, and then y x m paths of result data of the z + 1-th level radix y butterfly operation can be obtained; after the y x m paths of z +1 th-level radix y butterfly operation result data are obtained, writing the y paths of result data into y parts, into which each path of data in the remaining n paths of result data of the y cache regions is equally divided, of the y paths of result data, as y paths of data to be subjected to z +1 th-level radix y butterfly operation, inputting the y paths of result data into the radix y butterfly operation device to perform z +1 th-level radix y butterfly operation, and obtaining y x n paths of z +1 th-level radix y butterfly operation result data, wherein m is smaller than y, and the sum of m and n is equal to y; m is not more than n.
Wherein, in the case where the z +1 th radix y butterfly is not the last stage radix y butterfly, and equally dividing y parts into which each path of result data in the remaining n paths of result data of the y cache regions is respectively written in the y paths of result data obtained by the z-th-level radix y butterfly operation as y paths of data to be subjected to the z + 1-th-level radix y butterfly operation before inputting the y-radix butterfly operation device to perform the z + 1-th-level radix y butterfly operation, equally dividing each path of result data in the y-m paths of z + 1-level radix y butterfly operation into y parts, taking the y parts as y paths of data to be subjected to the z + 2-level radix y butterfly operation, inputting the y parts into the y-radix butterfly operation device to perform the z + 2-level radix y butterfly operation, and obtaining y-m paths of z + 2-level radix y butterfly operation result data.
For example, taking an FFT operation on 256-point (4^4-point) data as an example, the 256-point data needs to undergo 4 levels of radix-4 butterfly operations. After the 1st-level radix-4 butterfly operation on the 256-point data is completed, 4 groups of data with 64 points per group are obtained. The 2nd-level radix-4 butterfly operation is first performed on two of these 4 groups to obtain 8 groups of data with 16 points per group. After the 2nd-level radix-4 butterfly operation on these 2 groups is completed, the 3rd-level radix-4 butterfly operation is performed on the 8 groups of 16-point data to obtain 32 groups of data with 4 points per group, and the 4th-level radix-4 butterfly operation is then performed on the 32 groups of 4-point data to obtain the FFT operation result of 128 points of data. Next, the 2nd-level radix-4 butterfly operation is performed on the remaining two of the 4 groups of 64-point data to obtain 8 groups of data with 16 points per group. After this is completed, the 3rd-level radix-4 butterfly operation is performed on these 8 groups of 16-point data to obtain 32 groups of data with 4 points per group, and the 4th-level radix-4 butterfly operation is then performed on the 32 groups of 4-point data to obtain the FFT operation result of the other 128 points of data, so that all 4 levels of radix-4 butterfly operations required for the 256-point data are completed. It is understood that FFT operations on data with other numbers of points can be handled by analogy.
In the embodiment of the invention, while data at the same level is undergoing the radix-y butterfly operation of that level, the data written into the cache region first may be input into the radix-y butterfly operator first, and the data written into the cache region later may be input into the radix-y butterfly operator later.
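The group counts in this example follow from the radix alone: each level multiplies the number of groups by 4 and divides the points per group by 4. The following is a minimal Python sketch of that arithmetic; the function name `group_shapes` is hypothetical and does not come from the embodiment.

```python
def group_shapes(n_points, radix):
    """Return [(level, groups, points_per_group)] after each butterfly level.

    Assumes n_points is an exact power of radix (illustrative helper only).
    """
    shapes = []
    groups, size, level = 1, n_points, 0
    while size > 1:
        if size % radix:
            raise ValueError("n_points must be a power of radix")
        level += 1
        groups *= radix      # every group splits into `radix` groups
        size //= radix       # each new group holds 1/radix of the points
        shapes.append((level, groups, size))
    return shapes


shapes = group_shapes(256, 4)
for level, groups, size in shapes:
    print(f"after level {level}: {groups} groups of {size} points")
```

Processing only two of the four 64-point groups first, as in the example above, halves these counts at each later level (8 groups of 16 points, then 32 groups of 4 points).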
It will be appreciated that the steps in the above described scheme may be implemented under the control of a controller, which may be independent of the radix-y butterfly and the buffer, or may have some or all of its functionality integrated into the radix-y butterfly and the buffer.
For better understanding and implementation of the above scheme, the following description will be made by taking as an example a process of performing a radix-4 butterfly of 256-point data.
Referring to fig. 6, in a possible implementation, the shift register storage operation mechanism of the data processing apparatus with the architecture shown in fig. 5 may be as follows, where the description is mainly given by taking an example that each way of data is buffered by using a different buffer.
First, a radix-4 butterfly operation of 256-point data is performed on 4-way data of 64-point data/way input, and 4-way calculation results are obtained and stored in four shift registers (shift registers R0 to R3), respectively.
The 4 shadings in fig. 6 represent the 4 paths of data obtained by the radix-4 butterfly operation. So that 4 data items of the same path (the same shading in the figure) can be read simultaneously for the radix-4 butterfly operation, the shift register used for storing a path of data may be switched after every 16 data items of that path are stored. As a result, the first 16 data items of path 1 are stored at addresses 0 to 15 of shift register R0, the 17th to 32nd data items of path 1 are stored at addresses 16 to 31 of shift register R1, the 33rd to 48th data items of path 1 are stored at addresses 32 to 47 of shift register R2, and the 49th to 64th data items of path 1 are stored at addresses 48 to 63 of shift register R3. The other paths of data (the other shadings in the figure) are stored in a similar manner.
When the next-level radix-4 butterfly operation is needed, the parts with the same shading are read from the 4 shift registers and sent to the radix-4 butterfly operator, and the data obtained by the operation may be stored back at the addresses from which the inputs were read (if the shift registers are large enough, the data obtained by the operation may also be stored at other idle positions). For example, the data obtained after the 1st path of 64-point data undergoes the radix-4 butterfly operation may still be stored at addresses 0 to 15 of shift register R0, addresses 16 to 31 of shift register R1, addresses 32 to 47 of shift register R2, and addresses 48 to 63 of shift register R3, and the other paths are stored by analogy. That is, among the 4 paths of data obtained by the radix-4 butterfly operation, the data of the same path is divided into 4 parts that are stored in different shift registers, which ensures that the data can subsequently be read out simultaneously for the next-level radix-4 butterfly operation. The address storage manner of the data obtained by the third-level and fourth-level radix-4 butterfly operations can be derived by analogy, and details are not described here.
To obtain the highest operation efficiency, the priority order of the radix-4 butterfly operations can be arranged appropriately. Several ways of ordering the radix-4 butterfly operations provided by embodiments of the present invention are described below.
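The storage rule described above can be sketched as an address mapping. In the Python sketch below, path 0 (path 1 in the text) reproduces the placement given in the description; the rotation used for the other three paths is an assumption made only for illustration, since the text says merely that they are stored analogously. The name `location` is hypothetical.

```python
REGISTERS = 4   # shift registers R0..R3
CHUNK = 16      # data items in one quarter of a 64-point path


def location(path, index):
    """Map item `index` (0..63) of `path` (0..3) to (register, address).

    Path 0 follows the placement stated in the description. The rotation
    applied to paths 1..3 is an assumed layout, not taken from the source.
    """
    chunk, offset = divmod(index, CHUNK)
    register = (chunk + path) % REGISTERS   # each quarter goes to a different register
    address = chunk * CHUNK + offset        # quarter c occupies addresses 16c..16c+15
    return register, address


# The four inputs of one butterfly (one item from each quarter of a path)
# sit in four different registers, so they can be read in the same cycle.
print([location(0, t) for t in (0, 16, 32, 48)])
```

Under this assumed layout no two items share a register address, and every butterfly reads exactly one item from each shift register.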
It can be understood that the precondition of the latter stage butterfly operation is that the result of the former stage butterfly operation is already obtained, and the order of each stage butterfly operation of each path of data can be properly arranged according to the needs in consideration of the situations of calculation delay, control delay and the like.
Taking the FFT operation on 256-point (4^4-point) data as an example, one arrangement of the butterfly calculations is shown in fig. 7-a.
Step one: perform the first-level radix-4 butterfly operation on the 256-point data to obtain 4 paths of data with 64 points per path, which takes 64 clock cycles (cycles);
Step two: perform the second-level radix-4 butterfly operation on 1 path of 64-point data (for example, the 1st path) to obtain 4 paths of data with 16 points per path, which takes 16 cycles;
Step three: perform the third-level radix-4 butterfly operation on the 4 paths of 16-point data to obtain 16 paths of data with 4 points per path, which takes 16 cycles;
Step four: perform the fourth-level radix-4 butterfly operation on the 16 paths of 4-point data, which takes 16 cycles.
At this point, the calculation for 1 of the 4 paths of data (64 points per path) obtained by the first-level radix-4 butterfly operation on the 256-point data is complete. By analogy, steps two to four are repeated 3 more times to complete the operations on the other three paths of data obtained by the first-level radix-4 butterfly operation.
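The cycle counts quoted for this arrangement can be added up as a quick check. The Python sketch below assumes one butterfly is issued per clock cycle and ignores pipeline latency and stalls; the function name `cycles_sequential` is hypothetical.

```python
def cycles_sequential(n_points=256, radix=4):
    """Ideal cycle count for the fig. 7-a ordering.

    One full first level is followed by carrying each first-level path
    through all remaining levels in turn. Assumes one butterfly per cycle
    and no stalls (assumptions of this sketch, not statements of the source).
    """
    levels, size = 0, n_points
    while size > 1:
        size //= radix
        levels += 1
    first_level = n_points // radix            # 64 cycles for 256 points
    per_later_level = first_level // radix     # 16 cycles per later level of one path
    return first_level + radix * (levels - 1) * per_later_level


print(cycles_sequential())   # 64 + 4 * (16 + 16 + 16)
```

With these assumptions the total equals the number of butterflies, 256, so any real overhead comes from waiting for earlier results, which the following paragraphs discuss.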
Assume that the 64-point radix-4 butterfly operation corresponds to the operation process in step two of fig. 5, that inputting the 4 paths of data takes 16 cycles, and that the 64 data items are labeled as follows:
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31
32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47
48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63
The data of path 1 among the 4 paths of data obtained by performing the radix-4 butterfly operation is, for example, labeled as:
A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P; these 16 data items need to undergo the next-level radix-4 butterfly operation.
The 4 paths of input data (16 data items in total) for the next-level radix-4 butterfly operation are as follows:
A, B, C, D
E, F, G, H
I, J, K, L
M, N, O, P
Addition, subtraction, and twiddle-factor multiplication take several clock cycles when a butterfly operation is performed. In the radix-4 butterfly operation on the 16-point data, the data M, N, O, and P to be used is produced by the radix-4 butterfly operation on the 64-point data, from data 12, 13, 14, 15, 28, 29, 30, 31, 44, 45, 46, 47, 60, 61, 62, and 63, and data that has just been input into the radix-4 butterfly operator needs several clock cycles (generally more than 4 clock cycles) before the operation results M, N, O, and P are produced. Therefore, the next-level 16-point radix-4 operation may not be able to start immediately after the 4 paths of 16 data items required by the 64-point butterfly operation have been read from the shift registers: only 3 clocks have elapsed since data 12, 28, 44, and 60 were input into the radix-4 butterfly operator (the 3 clocks in which data 13, 14, and 15 are input), so data M may not yet have been calculated and stored in the shift register, and likewise N, O, and P. A certain delay may therefore be applied, so that the data required by a later butterfly operation has been produced by the corresponding earlier butterfly operation and stored in the shift register before it is needed.
Still taking the FFT operation on 256-point (4^4-point) data as an example, another arrangement of the butterfly calculation order provided by the embodiment of the invention is shown in fig. 7-b.
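The timing problem described above can be made concrete with a small model. The Python sketch below assumes that the current level issues one butterfly per cycle at cycles 0 to 15, that a result becomes usable `latency` cycles after its inputs are issued, and that the next level would otherwise start at cycle 16. These are modeling assumptions for illustration, and the name `stall_cycles` is hypothetical.

```python
def stall_cycles(latency, inputs_per_path=16, radix=4):
    """Cycles the next level must wait before issuing its first butterfly.

    Assumed model: butterfly k of the current level is issued at cycle k
    and its result is usable at cycle k + latency. The first butterfly of
    the next level reads A, E, I, M, whose inputs were issued at cycles
    0, 4, 8 and 12, and it would otherwise be issued at cycle 16.
    """
    stride = inputs_per_path // radix            # 4 items per row A..D, E..H, ...
    last_needed_issue = (radix - 1) * stride     # cycle 12, which produces M
    ready = last_needed_issue + latency          # cycle at which M is usable
    earliest_start = inputs_per_path             # cycle 16
    return max(0, ready - earliest_start)


for latency in (3, 4, 6, 8):
    print(latency, stall_cycles(latency))
```

In this model a latency of up to 4 cycles causes no wait, while a longer latency forces the next level to be delayed by the excess, which is the situation the delay processing is meant to cover.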
Step one: perform the 1st-level radix-4 butterfly operation on the 256-point data to obtain 4 paths (64-point data/path) of data to be subjected to the 2nd-level radix-4 butterfly operation;
Step two: perform the 2nd-level radix-4 butterfly operation on 2 of the 4 paths (64-point data/path) of data to be subjected to the 2nd-level radix-4 butterfly operation, to obtain 8 paths (16-point data/path) of data to be subjected to the 3rd-level radix-4 butterfly operation;
Step three: perform the 3rd-level radix-4 butterfly operation on 4 of the 8 paths (16-point data/path) of data obtained in step two, to obtain 16 paths (4-point data/path) of data to be subjected to the 4th-level radix-4 butterfly operation;
Step four: perform the 4th-level radix-4 butterfly operation on the 16 paths (4-point data/path) of data obtained in step three, to obtain 64 pieces of output data (FFT operation results).
After step four has been performed once, step three and step four are performed once more; step two is then performed again, followed by step three, step four, step three, and step four, so that all the data is processed and the operation results are output.
Step two takes 32 cycles. Two of the 4 paths of data (64-point data/path) obtained by the 1st-level radix-4 butterfly operation on the 256-point data first undergo the corresponding later-level radix-4 butterfly operations, and then the other two of the 4 paths undergo the corresponding later-level radix-4 butterfly operations. This helps ensure that the next-level radix-4 butterfly operation is performed only after the results of the current level have been stored in the shift registers, while the working cycles of the radix-4 butterfly operator are fully used and idle time is reduced, which greatly improves operation efficiency.
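The step order of this second arrangement can be written out explicitly. The Python sketch below assumes the order is one, two, three, four, three, four, two, three, four, three, four, which is how this sketch reads the description, and assumes one butterfly per clock cycle. The function name `schedule_7b` is hypothetical.

```python
def schedule_7b():
    """Step order for the fig. 7-b arrangement of a 256-point radix-4 FFT.

    Each entry is (level, paths_processed, points_per_path, cycles), where
    cycles = paths * points / 4 assumes one butterfly per clock (an
    assumption of this sketch).
    """
    def step(level, paths, points):
        return (level, paths, points, paths * points // 4)

    order = [step(1, 1, 256)]              # step one: the whole input
    for _half in range(2):                 # two halves of the level-1 output
        order.append(step(2, 2, 64))       # step two: 2 of the 4 paths, 32 cycles
        for _part in range(2):             # each half yields 8 paths of 16 points
            order.append(step(3, 4, 16))   # step three: 4 of those 8 paths
            order.append(step(4, 16, 4))   # step four: 64 FFT outputs
    return order


order = schedule_7b()
print([level for level, *_ in order])
print(sum(cycles for *_, cycles in order))
```

The ideal total is the same 256 cycles as for the fig. 7-a arrangement; the difference is that each step-three pass starts on data whose producing step-two pass covered twice as many cycles, which leaves more time for results to reach the shift registers.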
It is understood that the above examples are mainly described by taking the case of performing the radix-4 butterfly operation on data, and the process of performing other butterfly operations on data can be analogized, and thus the description is omitted here. The implementation of the above-described aspects of the embodiments of the present invention may be controlled by a controller, which may be deployed independently and connected to each cache region and the radix-y butterfly, or may be deployed directly in the cache region or the radix-y butterfly.
As can be seen from the above, in this embodiment, y paths of data to be subjected to the z-th-level radix-y butterfly operation are input into a radix-y butterfly operator to perform the z-th-level radix-y butterfly operation. Where the z-th-level radix-y butterfly operation is the last-level radix-y butterfly operation, the result data obtained by the z-th-level radix-y butterfly operation is output; where it is not the last-level radix-y butterfly operation, the result data obtained by the z-th-level radix-y butterfly operation is input into the radix-y butterfly operator to perform the (z+1)-th-level radix-y butterfly operation. Because this mechanism multiplexes the radix-y butterfly operator and the cache regions, so that the radix-y butterfly operations of two levels share the same radix-y butterfly operator (for example, the radix-y butterfly operator can be time-multiplexed by making use of the data input time), it helps reduce the time of the butterfly operation and the resource consumption.
In addition, arranging the order of the radix-y butterfly operations of each level appropriately helps reduce the delay waiting time and further improves operation efficiency.
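The reuse of one butterfly operator across all levels, as summarized above, can be illustrated in software. The following Python sketch is a standard in-place radix-4 decimation-in-frequency FFT in which a single function `butterfly4` serves every level and a list stands in for the cache regions. It is an illustration under those assumptions, not the hardware design of the embodiment, and the names `butterfly4` and `fft_radix4` are hypothetical.

```python
import cmath


def butterfly4(a, b, c, d):
    """One radix-4 butterfly (forward transform), twiddle factors excluded."""
    j = 1j
    return (a + b + c + d,
            a - j * b - c + j * d,
            a - b + c - d,
            a + j * b - c - j * d)


def fft_radix4(x):
    """Radix-4 DIF FFT that reuses butterfly4 for every level."""
    n = len(x)
    levels = 0
    while 4 ** levels < n:
        levels += 1
    if 4 ** levels != n:
        raise ValueError("length must be a power of 4")
    buf = list(x)                      # stands in for the cache regions
    size = n
    for _ in range(levels):            # the same operator serves each level
        q = size // 4
        for start in range(0, n, size):
            for t in range(q):
                idx = [start + t + m * q for m in range(4)]
                outs = butterfly4(*(buf[i] for i in idx))
                for p, (i, y) in enumerate(zip(idx, outs)):
                    # write back in place, applying the twiddle factor
                    buf[i] = y * cmath.exp(-2j * cmath.pi * p * t / size)
        size = q
    out = [0j] * n                     # undo the base-4 digit reversal
    for pos in range(n):
        rev, v = 0, pos
        for _ in range(levels):
            rev = rev * 4 + v % 4
            v //= 4
        out[rev] = buf[pos]
    return out


print([round(abs(v), 6) for v in fft_radix4([1, 0, 0, 0])])
```

Writing each result back to the position its inputs were read from mirrors the in-place storage described for the shift registers; the final digit-reversal pass is a software convenience for returning outputs in natural order.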
To better implement the above scheme of the embodiment of the invention, the embodiment of the invention also provides a related device for implementing the scheme.
Referring to fig. 8-a, an embodiment of the present invention provides a data processing apparatus 800, which may include:
a controller 810 and a radix-y butterfly operator 820;
the controller 810 is configured to control y paths of data to be subjected to a z-th-level radix y butterfly operation to be input into the radix y butterfly operator 820 to perform the z-th-level radix y butterfly operation, where y and z are positive integers; under the condition that the z-th-stage radix y butterfly operation is the last-stage radix y butterfly operation, outputting result data obtained by the z-th-stage radix y butterfly operation; under the condition that the z-th-stage radix y butterfly operation is not the last-stage radix y butterfly operation, the result data obtained by the z-th-stage radix y butterfly operation is input into the radix y butterfly operation device 820 to carry out the z + 1-th-stage radix y butterfly operation.
Referring to fig. 8-b, in some embodiments of the invention, the data processing apparatus 800 may also include y cache regions 830;
the controller 810 may further be configured to, after inputting the y ways of data to be subjected to the z-th-level radix-y butterfly operation into the radix-y butterfly operator 820 to perform the z-th-level radix-y butterfly operation to obtain y ways of result data of the z-th-level radix-y butterfly operation, equally divide each way of result data in the y ways of result data obtained by the z-th-level radix-y butterfly operation into y parts, and write the y parts into the y cache regions 830, respectively.
In some embodiments of the present invention, the controller 810 may be specifically configured to: input y paths of data to be subjected to the z-th-level radix-y butterfly operation into the radix-y butterfly operator 820 to perform the z-th-level radix-y butterfly operation; where the z-th-level radix-y butterfly operation is the last-level radix-y butterfly operation, output the result data obtained by the z-th-level radix-y butterfly operation; and where the z-th-level radix-y butterfly operation is not the last-level radix-y butterfly operation, equally divide each path of the y paths of result data obtained by the z-th-level radix-y butterfly operation into y parts, write the y parts into the y cache regions 830 respectively, take the y equal parts of each path of result data written into the y cache regions 830 as y paths of data to be subjected to the (z+1)-th-level radix-y butterfly operation, and input them into the radix-y butterfly operator 820 to perform the (z+1)-th-level radix-y butterfly operation, where y and z are positive integers.
In some embodiments of the present invention, the controller 810 may, for example, divide the data to be subjected to the k-th-level (for example, the 2nd level, the 3rd level, or another level other than the last level) radix-y butterfly operation into several parts (for example, 2 parts, 3 parts, or more). The k-th-level radix-y butterfly operation is first performed on one part of the data, and processing of that part continues until all of its remaining levels of radix-y butterfly operations are completed. The k-th-level radix-y butterfly operation is then performed on another part of the data, and processing of that part continues until all of its remaining levels of radix-y butterfly operations are completed. If a further part of the data to be subjected to the k-th-level radix-y butterfly operation remains, it is processed in the same way, and so on, until all levels of radix-y butterfly operations on all the data are completed and the FFT calculation result is obtained.
For example, the controller 810 may be specifically configured to: input y paths of data to be subjected to the z-th-level radix-y butterfly operation into the radix-y butterfly operator 820 to perform the z-th-level radix-y butterfly operation; where the z-th-level radix-y butterfly operation is the last-level radix-y butterfly operation, output the result data obtained by the z-th-level radix-y butterfly operation; where the z-th-level radix-y butterfly operation is not the last-level radix-y butterfly operation, equally divide each path of the y paths of result data obtained by the z-th-level radix-y butterfly operation into y parts and write the y parts into the y cache regions 830 respectively; first take the y equal parts of each of m paths of result data written into the y cache regions 830 as y paths of data to be subjected to the (z+1)-th-level radix-y butterfly operation, and input them into the radix-y butterfly operator 820 to perform the (z+1)-th-level radix-y butterfly operation, to obtain y×m paths of (z+1)-th-level radix-y butterfly operation result data; and, after the y×m paths of (z+1)-th-level radix-y butterfly operation result data are obtained, take the y equal parts of each of the remaining n paths of result data written into the y cache regions 830 as y paths of data to be subjected to the (z+1)-th-level radix-y butterfly operation, and input them into the radix-y butterfly operator 820 to perform the (z+1)-th-level radix-y butterfly operation, to obtain y×n paths of (z+1)-th-level radix-y butterfly operation result data, where m is less than y and the sum of m and n is equal to y.
In some embodiments of the present invention, the controller 810 may be specifically configured to: before the y equal parts of each of the remaining n paths of result data written into the y cache regions 830, among the y paths of result data of the z-th-level radix-y butterfly operation, are input into the radix-y butterfly operator 820 as y paths of data to be subjected to the (z+1)-th-level radix-y butterfly operation, equally divide each path of the y×m paths of (z+1)-th-level radix-y butterfly operation result data into y parts, take the y parts as y paths of data to be subjected to the (z+2)-th-level radix-y butterfly operation, and input them into the radix-y butterfly operator 820 to perform the (z+2)-th-level radix-y butterfly operation, so as to obtain the corresponding (z+2)-th-level radix-y butterfly operation result data.
In other embodiments of the present invention, the controller 810 may instead, for example, perform each level of radix-y butterfly operation on all the data to be subjected to that level before performing the next-level radix-y butterfly operation on any of the data.
For example, the controller 810 may be further configured to, after performing the z +1 th radix-y butterfly operation, equally divide each path of result data of the z +1 th radix-y butterfly operation into y parts, use the y parts as y paths of data to be subjected to the z +2 th radix-y butterfly operation, and input the y parts to the radix-y butterfly operator 820 to perform the z +2 th radix-y butterfly operation.
In one embodiment of the present invention, the y cache regions 830 may be respectively located in y buffers;
alternatively, the y cache regions 830 are located in at least one buffer.
It is understood that the controller 810 may be built into the radix-y butterfly operator 820, may be built into the cache regions 830, or may be provided independently. The functions of the controller 810 may also be partially or fully integrated into the radix-y butterfly operator 820 or the cache regions 830.
It is to be understood that the functions of the functional modules of the data processing apparatus 800 in this embodiment may be specifically implemented according to the method described in the foregoing method embodiment, and the specific implementation process may refer to relevant descriptions in the foregoing embodiment, and may also include an electronic device and the like for executing the foregoing method embodiment, which is not described herein again.
It can be seen that the controller in the data processing apparatus 800 of this embodiment controls y paths of data to be subjected to the z-th-level radix-y butterfly operation to be input into the radix-y butterfly operator to perform the z-th-level radix-y butterfly operation. Where the z-th-level radix-y butterfly operation is the last-level radix-y butterfly operation, the result data obtained by the z-th-level radix-y butterfly operation is output; where it is not the last-level radix-y butterfly operation, the result data obtained by the z-th-level radix-y butterfly operation is input into the radix-y butterfly operator to perform the (z+1)-th-level radix-y butterfly operation. Because this mechanism multiplexes the radix-y butterfly operator, so that the radix-y butterfly operations of two levels share the same radix-y butterfly operator (for example, the radix-y butterfly operator can be time-multiplexed by making use of the data input time), it helps reduce the time of the butterfly operation and the resource consumption.
Referring to fig. 9, an embodiment of the present invention further provides an electronic device 900, where the data processing apparatus 800 is deployed in the electronic device. The electronic device 900 may be, for example, an access device (e.g., a base station, an access point, etc.), a user terminal (e.g., a mobile phone, a portable computer, etc.), or other electronic devices that require data processing.
An embodiment of the present invention further provides a computer storage medium, where the computer storage medium may store a program, and the program, when executed, performs some or all of the steps of the data processing method described in the foregoing method embodiment.
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the invention. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required by the invention.
In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In summary, in the embodiment of the present invention, y paths of data to be subjected to the z-th-level radix-y butterfly operation are input into a radix-y butterfly operator to perform the z-th-level radix-y butterfly operation. Where the z-th-level radix-y butterfly operation is the last-level radix-y butterfly operation, the result data obtained by the z-th-level radix-y butterfly operation is output; where it is not the last-level radix-y butterfly operation, the result data obtained by the z-th-level radix-y butterfly operation is input into the radix-y butterfly operator to perform the (z+1)-th-level radix-y butterfly operation. Because this mechanism multiplexes the radix-y butterfly operator, so that the radix-y butterfly operations of two levels share the same radix-y butterfly operator (for example, the radix-y butterfly operator can be time-multiplexed by making use of the data input time), it helps reduce the time of the butterfly operation and the resource consumption.
In addition, arranging the order of the radix-y butterfly operations of each level appropriately helps reduce the delay waiting time and further improves operation efficiency.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by associated hardware instructed by a program, which may be stored in a computer-readable storage medium, which may include, for example: read-only memory, random access memory, magnetic or optical disk, and the like.
The data processing method and the related apparatus provided by the embodiment of the present invention are described in detail above, and the principle and the embodiment of the present invention are explained herein by applying a specific example, and the description of the above embodiment is only used to help understanding the method and the core idea of the present invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present invention.

Claims (12)

1. A data processing method, comprising:
inputting y paths of data to be subjected to the z-th-level radix y butterfly operation into a radix y butterfly operator to perform the z-th-level radix y butterfly operation, wherein y and z are positive integers;
equally dividing each path of result data in the y paths of result data obtained by the z-th-level radix y butterfly operation into y parts, and respectively writing the y parts into y cache regions;
under the condition that the z-th-stage radix y butterfly operation is the last-stage radix y butterfly operation, outputting result data obtained by the z-th-stage radix y butterfly operation;
under the condition that the z-th-level radix y butterfly operation is not the last-level radix y butterfly operation, inputting result data obtained by the z-th-level radix y butterfly operation into the radix y butterfly operator to perform z + 1-th-level radix y butterfly operation;
wherein the inputting of the result data obtained by the z-th-level radix-y butterfly operation into the radix-y butterfly operator to perform the (z+1)-th-level radix-y butterfly operation comprises: taking the y equal parts of each path of result data respectively written into the y cache regions as y paths of data to be subjected to the (z+1)-th-level radix-y butterfly operation, and inputting them into the radix-y butterfly operator to perform the (z+1)-th-level radix-y butterfly operation;
wherein the taking of the y equal parts of each path of result data respectively written into the y cache regions as y paths of data to be subjected to the (z+1)-th-level radix-y butterfly operation, and the inputting of them into the radix-y butterfly operator to perform the (z+1)-th-level radix-y butterfly operation, comprises:
first taking the y equal parts of each of m paths of result data, among the y paths of result data written into the y cache regions, as y paths of data to be subjected to the (z+1)-th-level radix-y butterfly operation, and inputting them into the radix-y butterfly operator to perform the (z+1)-th-level radix-y butterfly operation, to obtain y×m paths of (z+1)-th-level radix-y butterfly operation result data;
after the y×m paths of (z+1)-th-level radix-y butterfly operation result data are obtained, taking the y equal parts of each of the remaining n paths of result data, among the y paths of result data written into the y cache regions, as y paths of data to be subjected to the (z+1)-th-level radix-y butterfly operation, and inputting them into the radix-y butterfly operator to perform the (z+1)-th-level radix-y butterfly operation, to obtain y×n paths of (z+1)-th-level radix-y butterfly operation result data,
wherein m is less than y and the sum of m and n is equal to y.
2. The method of claim 1,
the method further comprises the following steps:
under the condition that the z +1 th-level radix y butterfly operation is the last-level radix y butterfly operation, outputting result data of the z +1 th-level radix y butterfly operation;
and under the condition that the z +1 th-level radix y butterfly operation is not the last-level radix y butterfly operation, inputting result data obtained by the z +1 th-level radix y butterfly operation into the radix y butterfly operator to perform the z +2 th-level radix y butterfly operation.
3. The method of claim 1,
the method further comprises: before the y equal parts of each of the remaining n paths of result data, among the y paths of result data written into the y cache regions, are taken as y paths of data to be subjected to the (z+1)-th-level radix-y butterfly operation and input into the radix-y butterfly operator to perform the (z+1)-th-level radix-y butterfly operation, equally dividing each path of the y×m paths of (z+1)-th-level radix-y butterfly operation result data into y parts, taking the y parts as y paths of data to be subjected to the (z+2)-th-level radix-y butterfly operation, and inputting them into the radix-y butterfly operator to perform the (z+2)-th-level radix-y butterfly operation.
4. The method of claim 1,
the m is less than or equal to the n.
5. The method of claim 1,
the method further comprises the following steps:
after the z +1 th-level radix y butterfly operation result data of the y x y paths obtained by the z +1 th-level radix y butterfly operation is carried out, equally dividing each path of result data in the z +1 th-level radix y butterfly operation result data of the y x y paths into y parts, taking the y parts as y paths of data to be subjected to the z +2 th-level radix y butterfly operation, and inputting the y parts into the radix y butterfly operation device to carry out the z +2 th-level radix y butterfly operation.
6. The method according to any one of claims 1 to 5,
the y cache regions are located in y caches, respectively;
or the y cache regions are located in at least one cache.
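The two placements of the y cache regions allowed by claim 6 can be pictured with a short Python sketch. It is only an assumption-laden illustration: Python lists stand in for caches, and the helper names and the back-to-back offset scheme for the shared cache are hypothetical, not taken from the patent.

```python
def write_to_separate_caches(result_paths):
    """Layout A: part j of every result path goes to cache j (y separate caches)."""
    y = len(result_paths)
    size = len(result_paths[0]) // y
    caches = [[] for _ in range(y)]
    for path in result_paths:
        for j in range(y):
            caches[j].append(path[j * size:(j + 1) * size])
    return caches


def write_to_shared_cache(result_paths):
    """Layout B: the same y regions placed back to back inside one flat cache."""
    y = len(result_paths)
    size = len(result_paths[0]) // y
    region_len = y * size                      # each region holds one part per path
    cache = [None] * (y * region_len)
    for i, path in enumerate(result_paths):
        for j in range(y):
            base = j * region_len + i * size   # region j, slot i
            cache[base:base + size] = path[j * size:(j + 1) * size]
    return cache, region_len


def read_next_level_inputs(caches, i):
    """Fetch the y parts of result path i, one from each region of layout A."""
    return [region[i] for region in caches]
```

In either layout, region j holds the j-th part of every result path, so the y inputs of a next-level butterfly come from y different regions.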
7. A data processing apparatus, comprising:
a controller and a radix-y butterfly operator;
the controller is configured to input y paths of data to be subjected to a z-th-level radix y butterfly operation into the radix y butterfly operator to perform the z-th-level radix y butterfly operation; to output the result data obtained by the z-th-level radix y butterfly operation when the z-th-level radix y butterfly operation is the last-level radix y butterfly operation; and, when the z-th-level radix y butterfly operation is not the last-level radix y butterfly operation, to input the result data obtained by the z-th-level radix y butterfly operation into the radix y butterfly operator to perform a (z+1)th-level radix y butterfly operation, wherein y and z are positive integers; the data processing apparatus further comprises y cache regions;
the controller is further configured to, after inputting the y paths of data to be subjected to the z-th-level radix y butterfly operation into the radix y butterfly operator to perform the z-th-level radix y butterfly operation, equally divide each path of result data in the y paths of result data obtained by the z-th-level radix y butterfly operation into y parts, and write the y parts into the y cache regions respectively;
the controller is specifically configured to input the y paths of data to be subjected to the z-th-level radix y butterfly operation into the radix y butterfly operator to perform the z-th-level radix y butterfly operation; to output the result data obtained by the z-th-level radix y butterfly operation when the z-th-level radix y butterfly operation is the last-level radix y butterfly operation; and, when the z-th-level radix y butterfly operation is not the last-level radix y butterfly operation, to equally divide each path of result data in the y paths of result data obtained by the z-th-level radix y butterfly operation into y parts, write the y parts into the y cache regions respectively, and input the y parts into which each path of result data written into the y cache regions is equally divided, as y paths of data to be subjected to the (z+1)th-level radix y butterfly operation, into the radix y butterfly operator to perform the (z+1)th-level radix y butterfly operation, wherein y and z are positive integers;
the controller is specifically configured to input the y paths of data to be subjected to the z-th-level radix y butterfly operation into the radix y butterfly operator to perform the z-th-level radix y butterfly operation; to output the result data obtained by the z-th-level radix y butterfly operation when the z-th-level radix y butterfly operation is the last-level radix y butterfly operation; when the z-th-level radix y butterfly operation is not the last-level radix y butterfly operation, to equally divide each path of result data in the y paths of result data obtained by the z-th-level radix y butterfly operation into y parts, and write the y parts into the y cache regions respectively; first, to input the y parts into which each path of result data in m paths of result data written into the y cache regions is equally divided, as y paths of data to be subjected to the (z+1)th-level radix y butterfly operation, into the radix y butterfly operator to perform the (z+1)th-level radix y butterfly operation, obtaining y × m paths of (z+1)th-level radix y butterfly operation result data; and, after the y × m paths of (z+1)th-level radix y butterfly operation result data are obtained, to input the y parts into which each path of result data in the remaining n paths of result data written into the y cache regions is equally divided, as y paths of data to be subjected to the (z+1)th-level radix y butterfly operation, into the radix y butterfly operator to perform the (z+1)th-level radix y butterfly operation, obtaining y × n paths of (z+1)th-level radix y butterfly operation result data, wherein m is smaller than y, and the sum of m and n is equal to y.
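The two-batch handling of the (z+1)th level in claim 7 (m stored result paths first, then the remaining n = y - m) can be sketched in Python as follows. This is only a sketch under stated assumptions: `regions[j][i]` is a hypothetical indexing of "part j of result path i", the butterfly is passed in as an arbitrary callable, and `placeholder_operator` is a stand-in that performs no real butterfly arithmetic.

```python
def run_level_in_two_batches(regions, butterfly, m):
    """Run level z+1 on m stored result paths first, then on the remaining n = y - m.

    regions[j][i] holds part j of level-z result path i (one part per cache region).
    butterfly is any callable mapping y input paths to y result paths.
    """
    y = len(regions)
    if not 0 < m < y:
        raise ValueError("m must satisfy 0 < m < y")

    def process(path_ids):
        results = []
        for i in path_ids:
            inputs = [regions[j][i] for j in range(y)]   # the y parts of path i
            results.extend(butterfly(inputs))            # y result paths per invocation
        return results

    first_batch = process(range(m))        # y * m result paths
    second_batch = process(range(m, y))    # y * n result paths
    return first_batch, second_batch


def placeholder_operator(paths):
    """Hypothetical stand-in for the radix y butterfly operator (copies its inputs)."""
    return [list(p) for p in paths]
```

For y = 4 and m = 1, the first batch yields 4 result paths and the second batch yields 12, so the two batches together give the y × y paths of (z+1)th-level results.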
8. The data processing apparatus of claim 7,
the controller is further configured to, before the y parts into which each path of result data in the remaining n paths of result data written into the y cache regions is equally divided are input, as y paths of data to be subjected to the (z+1)th-level radix y butterfly operation, into the radix y butterfly operator to perform the (z+1)th-level radix y butterfly operation, equally divide each path of result data in the y × m paths of (z+1)th-level radix y butterfly operation result data into y parts, and input the y parts, as y paths of data to be subjected to the (z+2)th-level radix y butterfly operation, into the radix y butterfly operator to perform the (z+2)th-level radix y butterfly operation, so as to obtain y × m paths of (z+2)th-level radix y butterfly operation result data.
9. The data processing apparatus of claim 7,
the controller is further configured to, after y × y paths of (z+1)th-level radix y butterfly operation result data are obtained by performing the (z+1)th-level radix y butterfly operation, equally divide each path of result data in the y × y paths of (z+1)th-level radix y butterfly operation result data into y parts, and input the y parts, as y paths of data to be subjected to the (z+2)th-level radix y butterfly operation, into the radix y butterfly operator to perform the (z+2)th-level radix y butterfly operation.
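Claims 8 and 9 differ only in when the (z+2)th-level operations are issued relative to the second batch of (z+1)th-level operations. The Python sketch below lists the butterfly invocations in issue order for both variants. The tuple encoding and the name `issue_order` are hypothetical, and the sketch assumes that in the claim 8 ordering the remaining n paths also proceed to level z+2 afterwards, which the claim itself does not spell out.

```python
def issue_order(y, m, interleaved):
    """Return butterfly invocations in issue order.

    Each entry is (level_offset, source): level_offset 1 means level z+1 and
    2 means level z+2; source identifies the stored result path being consumed.
    """
    first = [(1, (i,)) for i in range(m)]        # first m paths at level z+1
    rest = [(1, (i,)) for i in range(m, y)]      # remaining n paths at level z+1

    def next_level(batch):
        # each level z+1 invocation yields y result paths,
        # and each of those feeds one level z+2 invocation
        return [(2, src + (r,)) for _, src in batch for r in range(y)]

    if interleaved:
        # claim 8 ordering: level z+2 on the first batch precedes the second batch
        return first + next_level(first) + rest + next_level(rest)
    # claim 9 ordering: all of level z+1 completes before level z+2 starts
    return first + rest + next_level(first) + next_level(rest)
```

Both orderings issue the same set of invocations; only the sequence differs.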
10. The data processing apparatus of any one of claims 7 to 9,
the y cache regions are located in y caches, respectively;
or the y cache regions are located in at least one cache.
11. An access device in which the data processing apparatus of any one of claims 7 to 10 is deployed.
12. A user equipment, characterized in that the data processing apparatus according to any of claims 7 to 10 is deployed in the user equipment.
CN201280000317.4A 2012-04-28 2012-04-28 Data processing method, data processing equipment, access device and subscriber equipment Active CN103493039B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2012/074911 WO2013159361A1 (en) 2012-04-28 2012-04-28 Data processing method and related device

Publications (2)

Publication Number Publication Date
CN103493039A CN103493039A (en) 2014-01-01
CN103493039B CN103493039B (en) 2016-06-29

Family

ID=49482181

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201280000317.4A Active CN103493039B (en) 2012-04-28 2012-04-28 Data processing method, data processing equipment, access device and subscriber equipment

Country Status (2)

Country Link
CN (1) CN103493039B (en)
WO (1) WO2013159361A1 (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106970895B (en) * 2016-01-14 2023-10-03 普天信息技术有限公司 FFT device and method based on FPGA
CN109117188B (en) * 2018-08-06 2022-11-01 合肥工业大学 Multi-path mixed-basis FFT (fast Fourier transform) reconfigurable butterfly operator
CN112051446A (en) * 2020-08-18 2020-12-08 许继集团有限公司 Mixed base FFT implementation method and device for broadband measurement of power system
CN113570612B (en) * 2021-09-23 2021-12-17 苏州浪潮智能科技有限公司 Image processing method, device and equipment

Citations (2)

Publication number Priority date Publication date Assignee Title
CN101300572A (en) * 2005-03-11 2008-11-05 高通股份有限公司 Fast fourier transform twiddle multiplication
JP2011133957A (en) * 2009-12-22 2011-07-07 Hitachi Kokusai Electric Inc Arithmetic device

Family Cites Families (4)

Publication number Priority date Publication date Assignee Title
CN101072218B (en) * 2007-03-01 2011-11-30 华为技术有限公司 FFT/IFFI paired processing system, device and method
CN101833540B (en) * 2010-04-07 2012-06-06 华为技术有限公司 Signal processing method and device
US20120041996A1 (en) * 2010-08-16 2012-02-16 Leanics Corporation Parallel pipelined systems for computing the fast fourier transform
KR101036873B1 (en) * 2010-09-13 2011-05-25 심흥섭 Flag based low-complexity, expandable split radix fft system

Non-Patent Citations (1)

Title
A HIGH-SPEED FFT PROCESSOR FOR OFDM SYSTEMS;Byung S.Son et al;《ISCAS 2002》;20020529;第281-284页 *

Also Published As

Publication number Publication date
WO2013159361A1 (en) 2013-10-31
CN103493039A (en) 2014-01-01

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant