CN106415526B - FFT processor and operation method - Google Patents


Publication number
CN106415526B
Authority
CN
China
Prior art keywords
data
read
twiddle factor
write cell
address
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201680000901.8A
Other languages
Chinese (zh)
Other versions
CN106415526A (en)
Inventor
李一帆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Goodix Technology Co Ltd
Original Assignee
Shenzhen Huiding Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Huiding Technology Co Ltd
Publication of CN106415526A
Application granted
Publication of CN106415526B
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/14 Fourier, Walsh or analogous domain transformations, e.g. Laplace, Hilbert, Karhunen-Loeve, transforms

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Algebra (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Complex Calculations (AREA)
  • Discrete Mathematics (AREA)

Abstract

The present invention relates to the field of signal processing and discloses an FFT processor and an operation method. The FFT processor comprises two data storage units, a twiddle factor storage unit, multiple butterfly processing units, a data read-write unit and a twiddle factor read-write unit. The data read-write unit is connected to the two data storage units and to each butterfly processing unit. The two data storage units are respectively used to uniformly store the N input data and the N output data of the butterfly processing units. The twiddle factor read-write unit is connected to the twiddle factor storage unit and to each butterfly processing unit. The twiddle factor storage unit stores N/2 twiddle factors. The twiddle factor read-write unit reads the N/2 twiddle factors one by one and inputs them in sequence to the butterfly processing units. The data read-write unit is also used to store the N output data one by one. The invention also discloses an FFT operation method. Embodiments of the invention implement FFT operations for multiple point counts and reduce circuit area.

Description

FFT processor and operation method
Technical field
The present invention relates to the field of signal processing, and in particular to an FFT processor and an operation method.
Background
The Fourier transform converts a signal from the time domain to the frequency domain and is an important analysis tool in signal processing. The discrete Fourier transform (Discrete Fourier Transform, "DFT") is the representation of the Fourier transform in discrete systems. However, the computational cost of the DFT is very large. The fast Fourier transform (Fast Fourier Transformation, "FFT") is an efficient algorithm for the DFT. It is obtained by improving the DFT algorithm according to properties of the DFT such as its odd, even, imaginary and real symmetries, and it greatly reduces the amount of computation.
An FFT processor is a hardware implementation of the FFT algorithm. In the prior art there are many ways to implement an FFT processor, but most of them have limitations. Some implementations support only a single point count, while others occupy a large amount of resources and have a large hardware circuit area.
Summary of the invention
An object of embodiments of the present invention is to provide an FFT processor and an operation method that realize FFT operations for multiple point counts, broadening the application scenarios of the FFT processor, while occupying a smaller circuit area, lowering circuit power consumption and reducing circuit cost.
To solve the above technical problem, embodiments of the present invention provide an FFT processor, comprising: two data storage units, a twiddle factor storage unit, multiple butterfly processing units, a data read-write unit and a twiddle factor read-write unit;
the data read-write unit is connected to the two data storage units and to each butterfly processing unit; the two data storage units are respectively used to uniformly store the N input data and the N output data of the butterfly processing units, where N = 2^k, k >= 3 and k is an integer;
the twiddle factor read-write unit is connected to the twiddle factor storage unit and to each butterfly processing unit; the twiddle factor storage unit is used to store N/2 twiddle factors;
wherein the data read-write unit is used to read the N input data one by one and to input them in sequence to the butterfly processing units; the twiddle factor read-write unit is used to read the N/2 twiddle factors one by one and to input them in sequence to the butterfly processing units; and the data read-write unit is also used to store the N output data one by one.
Embodiments of the present invention also provide an FFT operation method, comprising:
the data read-write unit uniformly stores N input data received from outside in one of the data storage units;
the twiddle factor read-write unit stores N/2 twiddle factors received from outside in the twiddle factor storage unit;
the data read-write unit reads the N input data one by one and inputs them in sequence to the butterfly processing units;
the twiddle factor read-write unit reads the N/2 twiddle factors one by one and inputs them in sequence to the butterfly processing units;
each butterfly processing unit computes its output data from the input data and twiddle factor it receives;
the data read-write unit stores the output data one by one in the other data storage unit;
wherein the output data serve as the input data of the next stage, and k stages of operation are performed in a loop.
Compared with the prior art, embodiments of the present invention store data uniformly in two data storage units, so input data of different point counts can be read with the same reading rule, which makes it possible to support multiple point counts. In addition, the data read-write unit reads the input data one by one, inputs them in sequence to the butterfly processing units, and stores the output data of each butterfly processing unit one by one. At any given time there is therefore one input datum and one output datum, so only two data storage units are needed and circuit area is saved.
In addition, the number of butterfly processing units is 4. With 4 butterfly processing units working in rotation, the units are reused as much as possible and circuit area is kept as small as possible. The 4 units read data from the data storage unit continuously and in turn, which avoids idle units and keeps the output always valid, thereby effectively improving the utilization of the butterfly processing units.
In addition, each butterfly processing unit comprises 1 multiplier and 2 adders, and each butterfly processing unit implements a radix-2 butterfly operation. The structure of each butterfly processing unit in this embodiment is relatively simple, which greatly reduces circuit area.
In addition, the value of k satisfies k <= 10. Different configured values of k support FFT operations of different point counts.
In addition, the storage addresses of the input data increase in sequence. Each data storage unit contains 1024 addresses. When k = 10 and N = 2^10 = 1024, the input data are stored at consecutive addresses; when k <= 9, the address interval between adjacent input data is constant. Input and output data occupy the whole address space uniformly, which simplifies computation and avoids configuring a different calculation scheme for each point count.
In addition, for the i-th stage of operation, where i = 0, 1, ..., k, when the data read-write unit reads the N input data one by one, the read address of each input datum is generated as follows: obtain the binary counter value corresponding to the input datum; reverse the order of the last i+1 bits of the counter value; then reverse the order of the bits of the whole resulting value, and use the result as the read address of the input datum. With the data storage addresses allocated in this way, a correct FFT operation can be completed.
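The read-address rule described above can be sketched in Python as follows. This is a minimal sketch under assumptions that the source does not state: the translated term "inverted" is read as reversing bit order, stages are numbered i = 0 to k-1, addresses are dense (stride 1, as for N = 1024), and the names reverse_bits, read_address and stage_addresses are hypothetical.

```python
def reverse_bits(value, width):
    """Return `value` with the order of its lowest `width` bits reversed."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def read_address(counter, stage, k):
    """Read address for one counter value at a given stage.

    Assumed reading of the rule: reverse the last (stage + 1) bits of the
    k-bit counter, then reverse all k bits of the result.
    """
    low_width = stage + 1
    mask = (1 << low_width) - 1
    low = reverse_bits(counter & mask, low_width)
    partially_reversed = (counter & ~mask) | low
    return reverse_bits(partially_reversed, k)


def stage_addresses(stage, k):
    """Read addresses for all N = 2**k counter values of one stage."""
    return [read_address(c, stage, k) for c in range(1 << k)]


if __name__ == "__main__":
    for stage in range(3):
        print(stage, stage_addresses(stage, 3))
```

For k = 3, stage 0 gives the order 0, 4, 2, 6, 1, 5, 3, 7 and the last stage gives 0, 1, ..., 7, so each pair of consecutive reads is separated by N/2 addresses in the first stage and by 1 address in the last stage.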
Brief description of the drawings
Fig. 1 is a structural schematic diagram of the FFT processor according to the first embodiment of the present invention;
Fig. 2 is a schematic diagram of the internal operation of a butterfly processing unit of the FFT processor according to the first embodiment of the present invention;
Fig. 3 is a flowchart of an FFT operation method according to the third embodiment of the present invention;
Fig. 4 is a flowchart of how the read addresses of the input data are generated in an FFT operation method according to the fourth embodiment of the present invention;
Fig. 5 is a flowchart of how the read address sequence of the twiddle factors is generated in an FFT operation method according to the fifth embodiment of the present invention.
Detailed description
To make the objects, technical solutions and advantages of the present invention clearer, the embodiments of the invention are explained in detail below with reference to the drawings. Those skilled in the art will understand that many technical details are given in the embodiments to help the reader understand the application. The technical solutions claimed in the application can nevertheless be realized without these details, and with various changes and modifications based on the following embodiments.
The first embodiment of the present invention relates to an FFT processor. As shown in Fig. 1, it comprises: two data storage units 11 and 12, a twiddle factor storage unit 13, multiple butterfly processing units 161 to 164, a data read-write unit 14 and a twiddle factor read-write unit 15. The data read-write unit 14 is connected to the two data storage units 11 and 12 and to each of the butterfly processing units 161 to 164. The twiddle factor read-write unit 15 is connected to the twiddle factor storage unit 13 and to each of the butterfly processing units 161 to 164.
The two data storage units 11 and 12 in this embodiment can be random access memories (Random Access Memory, "RAM"). The two RAMs are respectively used to uniformly store the N input data and the N output data of the butterfly processing units 161 to 164, where N = 2^k, k >= 3 and k is an integer. The data read-write unit 14 reads the N input data one by one according to a generated address sequence and inputs them in sequence to the butterfly processing units 161 to 164. The data read-write unit 14 is also used to store the N output data one by one.
Specifically, before computation starts, the data required by the butterfly processing units 161 to 164 must be imported into one data storage unit, for example data storage unit 11. After the enable signal of the FFT processor is set high, a suitable address sequence is generated according to the current stage. The data stored at these addresses in data storage unit 11 serve as input data; they are read by the data read-write unit 14 and passed on to the FFT computation. After the computation, the data read-write unit 14 stores the results in data storage unit 12 at storage addresses that are the same as the read addresses in data storage unit 11. That is, the read address of the data that enter a computation and the storage address of the computed result are identical.
It should be noted that the data read-write unit 14 can read input data from data storage unit 11 and can also write output data to it, and the same holds for data storage unit 12. The data storage units work as ping-pong RAMs, which reduces resource consumption: either data storage unit 11 or 12 can serve as the storage unit for input data or for output data. Data are stored uniformly in data storage units 11 and 12, which means that the address intervals between the input or output data stored in data storage unit 11 or 12 are equal.
The FFT processor in this embodiment uses a radix-2 algorithm. N is the point count supported by the FFT processor, where N = 2^k, k >= 3 and k is an integer. The minimum value of N is therefore 8, that is, the smallest supported FFT has 8 points, and the radix-2 algorithm has k stages in total. Different values of k give different point counts N and different numbers of stages. In every stage, the data read-write unit 14 reads data from one data storage unit and stores the output data of the butterfly units in the other. In the next stage, the data read-write unit 14 reads from the data storage unit that was written in the previous stage and stores the butterfly outputs in the data storage unit that was read in the previous stage. For example, the data read-write unit 14 reads data from data storage unit 11 and stores the butterfly outputs in data storage unit 12; in the next stage it reads data from data storage unit 12 and stores the butterfly outputs in data storage unit 11.
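The alternation between the two data storage units over the k stages can be sketched as follows. This is an illustrative Python sketch only; the function name pingpong_schedule is hypothetical, and the reference numerals 11 and 12 are used simply as labels for the two storage units.

```python
def pingpong_schedule(k, first_unit=11, second_unit=12):
    """List (stage, read_unit, write_unit) for the k stages of one FFT.

    Assumes the input data start in `first_unit`, as in the example in the
    text; the roles of the two units swap after every stage.
    """
    read_unit, write_unit = first_unit, second_unit
    schedule = []
    for stage in range(k):
        schedule.append((stage, read_unit, write_unit))
        # The unit written in this stage is read in the next one.
        read_unit, write_unit = write_unit, read_unit
    return schedule


if __name__ == "__main__":
    for stage, src, dst in pingpong_schedule(3):
        print(f"stage {stage}: read unit {src}, write unit {dst}")
```

Under these assumptions the final result lands in unit 12 when k is odd and in unit 11 when k is even.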
In addition, the parameters used in the FFT processor's computation can be configured by the user.
In this embodiment, the twiddle factor storage unit 13 is used to store N/2 twiddle factors.
Before computation starts, the twiddle factor table required by the butterfly processing units 161 to 164 must be imported into the twiddle factor storage unit 13. More specifically, from the definition of the twiddle factor, W_N^k = e^(-j2πk/N) = cos(2πk/N) - j·sin(2πk/N), it follows that for an FFT processor supporting at most 1024 points, the required twiddle factors can be expressed as W_1024^n, where n ranges from 0 to 511. The real and imaginary parts are each quantized as 16-bit signed numbers, and the quantized results are stored in the twiddle factor storage unit 13. Every two input data need only one twiddle factor to compute the output data, so an FFT processor supporting N points needs N/2 twiddle factors.
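The twiddle factor table described above can be sketched in Python as follows. This is a minimal sketch under assumptions: the source states only that the real and imaginary parts are quantized to 16-bit signed numbers, so the scale factor 2^15 - 1 and the rounding mode used here are assumptions, as are the names twiddle_table and twiddle_index.

```python
import math


def twiddle_table(max_points=1024, bits=16):
    """Quantized (real, imag) pairs of W_max_points^n for n = 0 .. max_points/2 - 1.

    Assumption: values are scaled by 2**(bits - 1) - 1 and rounded to the
    nearest integer; the source does not specify the quantization rule.
    """
    scale = (1 << (bits - 1)) - 1
    table = []
    for n in range(max_points // 2):
        angle = 2.0 * math.pi * n / max_points
        real = round(scale * math.cos(angle))
        imag = round(-scale * math.sin(angle))
        table.append((real, imag))
    return table


def twiddle_index(k_index, n_points, max_points=1024):
    """Table index of W_n_points^k_index, using W_N^k = W_1024^(k * 1024 / N)."""
    return k_index * (max_points // n_points)


if __name__ == "__main__":
    table = twiddle_table()
    print(len(table), table[0], table[256])
```

Because W_N^k equals W_1024^(k·1024/N), one 512-entry table covers every supported point count, which is why a smaller FFT only needs a strided view of the same table.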
In general, the twiddle factor storage unit 13 can be divided into two blocks: one block stores the real parts of the twiddle factors and the other stores the imaginary parts, and the data at the same address in the two blocks correspond to each other. The invention is not limited to this; in practice, the high and low bits of the twiddle factor storage unit 13 can also be used to store the real and imaginary parts of a twiddle factor respectively. In addition, the twiddle factors can be stored in the twiddle factor storage unit 13 in advance in the form of a table. When a twiddle factor is needed during the computation, it is read from the table previously imported into the twiddle factor storage unit 13 and used in the computation.
Further, the twiddle factor read-write unit 15 is used to read the N/2 twiddle factors one by one according to a generated address sequence and to input them in sequence to the butterfly processing units 161 to 164.
Specifically, when a twiddle factor is needed during the computation, the twiddle factor read-write unit 15 reads its value using the supplied twiddle factor address signal and read enable. The twiddle factor address changes once every two cycles, which is implemented by control logic. The twiddle factor read-write unit 15 reads the N/2 twiddle factors one by one according to the generated twiddle factor address sequence and inputs them in sequence to the butterfly processing units 161 to 164 for further computation.
In addition, the N input data of the butterfly processing units 161 to 164 are read by the data read-write unit 14 from data storage unit 11 or 12. Each butterfly processing unit, for example butterfly processing unit 161, needs 2 cycles to obtain its input data. The N/2 twiddle factors are read by the twiddle factor read-write unit 15 and written to the butterfly processing units. Because the real and imaginary parts of each twiddle factor are stored in the high and low bits of the twiddle factor storage unit 13, they can be obtained at the same time as the input data. The cycles referred to in this embodiment are clock cycles.
A butterfly processing unit such as 161 performs the following operation internally to obtain its output data, as in formulas (1) and (2):
x(k) = x1(k) + W_N^k · x2(k) (1)
x(k+N/2) = x1(k) - W_N^k · x2(k) (2)
where x1(k) and x2(k) are the input data, W_N^k is the twiddle factor, and x(k) and x(k+N/2) are the output data of butterfly processing unit 161.
The computation of the whole butterfly processing unit is shown in Fig. 2, and its output data are x(k) and x(k+N/2). Writing each output in terms of its real and imaginary parts gives formulas (3) and (4):
out1 = (xa + xb·xc - yb·yc) + (ya + xb·yc + xc·yb)j (3)
out2 = (xa - xb·xc + yb·yc) + (ya - xb·yc - xc·yb)j (4)
where xa, xb and xc are the real parts of x1(k), x2(k) and W_N^k respectively, and ya, yb and yc are the corresponding imaginary parts.
It is worth noting that each butterfly processing unit, for example butterfly processing unit 161, comprises 1 multiplier and 2 adders, and that computing the real and imaginary parts of both outputs out1 and out2 takes 7 cycles. The computation proceeds as follows:
First cycle: the multiplier computes xb*xc, and the result is recorded as mul_out;
Second cycle: adder No. 1 computes xa+xb*xc, adder No. 2 computes xa-xb*xc, and the multiplier computes yb*yc, whose result is again stored in mul_out;
Third cycle: adder No. 1 computes xa+xb*xc-yb*yc, which is the real part of the first output out1. At the same time, adder No. 2 computes xa-xb*xc+yb*yc, which is the real part of the second output out2, and the multiplier computes xb*yc;
Fourth cycle: adder No. 1 computes ya+xb*yc, adder No. 2 computes ya-xb*yc, and the multiplier computes xc*yb;
Fifth cycle: adder No. 1 computes ya+xb*yc+xc*yb, which completes the imaginary part of out1, and adder No. 2 computes ya-xb*yc-xc*yb, which completes the imaginary part of out2.
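The five listed cycles can be followed step by step in a short Python sketch. This is an illustration rather than the hardware design: it models only the arithmetic order of one multiplier and two adders, ignores bit widths, truncation and the output delay, and the name butterfly_schedule is hypothetical.

```python
def butterfly_schedule(x1, x2, w):
    """Radix-2 butterfly computed in the listed order of operations.

    x1, x2 and w are (real, imag) tuples. Returns (out1, out2) with
    out1 = x1 + w * x2 and out2 = x1 - w * x2, as in formulas (3) and (4).
    """
    xa, ya = x1
    xb, yb = x2
    xc, yc = w

    # Cycle 1: the single multiplier forms xb*xc.
    mul_out = xb * xc
    # Cycle 2: both adders use the product; the multiplier forms yb*yc.
    add1 = xa + mul_out
    add2 = xa - mul_out
    mul_out = yb * yc
    # Cycle 3: real parts of out1 and out2; the multiplier forms xb*yc.
    out1_re = add1 - mul_out
    out2_re = add2 + mul_out
    mul_out = xb * yc
    # Cycle 4: partial imaginary sums; the multiplier forms xc*yb.
    add1 = ya + mul_out
    add2 = ya - mul_out
    mul_out = xc * yb
    # Cycle 5: imaginary parts of out1 and out2.
    out1_im = add1 + mul_out
    out2_im = add2 - mul_out
    return (out1_re, out1_im), (out2_re, out2_im)


if __name__ == "__main__":
    # x1 = 3+4j, x2 = 1-2j, w = -j, so w*x2 = -2-1j.
    print(butterfly_schedule((3, 4), (1, -2), (0, -1)))
```

With x1 = 3+4j, x2 = 1-2j and w = -j, the product w·x2 is -2-j, so the sketch returns 1+3j and 5+5j, which agrees with direct complex arithmetic.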
The analysis above shows that the real parts of the two outputs out1 and out2 are completed at the same time. After the computation, the output data of the butterfly processing unit must be stored in data storage unit 11 or 12 by the data read-write unit 14, and a data storage unit can be written with only one datum per cycle. The output of adder No. 2 in the third and fifth cycles is therefore delayed by one beat. In this way the real part of out1 is obtained in the 3rd cycle, the real part of out2 in the 4th cycle, the imaginary part of out1 in the 5th cycle and the imaginary part of out2 in the 6th cycle, which satisfies the storage requirement of data storage unit 11 or 12.
Because the computed result has 17 bits and data storage unit 11 or 12 can store only 16 bits, the result must be truncated by discarding the lowest bit, that is, the output is divided by 2. This is done for the result of every stage. With k stages in total, the final result is scaled down by a factor of N, but because the relative magnitudes of the results are unaffected, the frequency can still be determined from the final spectrum distribution.
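The truncation step can be illustrated with a short Python sketch. It assumes two's-complement signed values and models dropping the lowest bit as an arithmetic right shift, which is an assumption about the hardware; the names truncate_stage_output and overall_scale are hypothetical.

```python
def truncate_stage_output(value_17bit):
    """Drop the lowest bit of a signed 17-bit result to fit 16 bits.

    Assumption: truncation is an arithmetic right shift by one bit, so
    negative values round toward minus infinity.
    """
    if not -(1 << 16) <= value_17bit < (1 << 16):
        raise ValueError("value does not fit in 17 signed bits")
    return value_17bit >> 1


def overall_scale(k):
    """Total scaling after k stages of divide-by-two, i.e. 1/N for N = 2**k."""
    return 1 / (1 << k)


if __name__ == "__main__":
    print(truncate_stage_output(65535), truncate_stage_output(-65536))
    print(overall_scale(10))
```

For a 1024-point transform (k = 10) the accumulated scaling is 1/1024, which matches the statement that the final result is reduced by a factor of N.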
Furthermore, because the write address of the output data must be the same as the read address of the input data, each butterfly processing unit, for example butterfly processing unit 161, caches the addresses of its 2 input data until the computation finishes. The only exception is the last stage, in which the output data must be rearranged once to obtain the correct storage order. For example, in a 1024-point FFT the input addresses of the first butterfly of the last stage are 0 and 1, while the output addresses should be 0 and 512 (k and k+N/2 always occur in pairs). An additional level of decision logic is therefore needed to put the output sequence in the correct order.
It is worth noting that the number of butterfly processing units is 4.
Specifically, the whole butterfly computation part of the FFT processor in this embodiment consists of 4 butterfly processing units. Each butterfly processing unit fetches two numbers from data storage unit 11 or 12 and computes on them. If the butterfly processing units 161 to 164 take part in the computation in turn, then by the time the fourth unit starts working the first unit has finished and can prepare its next fetch. Four butterfly processing units are thus enough to fetch from data storage unit 11 or 12 continuously, so the butterfly operations of each stage can be completed by 4 butterfly processing units working in rotation. This arrangement has the smallest circuit area. Each stage consists of N/2 butterfly operations.
In addition, because the whole butterfly computation part of the FFT processor in this embodiment consists of 4 butterfly processing units, every 8 cycles form one loop (each butterfly processing unit needs 2 cycles to read its input data, so 4 units working in turn need 8 cycles). After N/8 loops, the N/2 butterfly operations of one stage are finished. Because 7 more cycles are needed after the last unit reads its data before the final output is obtained, the time for each stage can be set to N (time to read the data) + 7 (computation time after the last read) + 1 (one-beat delay between read address and read data) = N + 8 cycles.
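The cycle counts stated above can be checked with a small Python sketch. The constants come from the text; the function names cycles_per_stage and loops_per_stage are hypothetical, and the sketch covers one stage only, without the final output reordering.

```python
READ_CYCLES_PER_BUTTERFLY = 2   # each unit needs 2 cycles to read its inputs
NUM_BUTTERFLY_UNITS = 4         # units working in rotation
COMPUTE_TAIL_CYCLES = 7         # cycles needed after the last read
ADDRESS_TO_DATA_DELAY = 1       # one-beat delay between read address and data


def loops_per_stage(n_points):
    """Number of 8-cycle loops needed to read all N inputs of one stage."""
    loop_length = READ_CYCLES_PER_BUTTERFLY * NUM_BUTTERFLY_UNITS
    return n_points // loop_length


def cycles_per_stage(n_points):
    """Cycles for one stage: N read cycles plus the tail and the delay."""
    return n_points + COMPUTE_TAIL_CYCLES + ADDRESS_TO_DATA_DELAY


if __name__ == "__main__":
    for n in (8, 256, 1024):
        print(n, loops_per_stage(n), cycles_per_stage(n))
```

For N = 1024 this gives 128 loops and 1032 cycles per stage.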
According to the analysis above, an FFT processor supporting 1024 points needs 1032 cycles for one stage, and a counter running from 0 to 1031 indicates these cycles. In cycles 0 to 1023, data are read from data storage unit 11 or 12 and the read enable is high. With 4 butterfly processing units computing in rotation, every 8 cycles form one loop, so the first input enable occurs once every 8 cycles and the other input enables are each delayed by one further cycle. In this way all 8 input enables of the 4 butterfly processing units are obtained in turn.
As for the output data, according to the analysis above the output of a butterfly processing unit is divided into real and imaginary parts. The real-part output enable and the imaginary-part output enable each last 2 cycles, and a butterfly processing unit needs exactly 2 cycles to read its input data. The result of the next butterfly operation therefore appears 2 cycles after the result of the previous butterfly processing unit, so that overall the output is always valid. For both the real-part and the imaginary-part outputs, every 8 cycles form one loop in which the results of the first, second, third and fourth butterfly processing units are output in turn until the stage is finished.
It should be noted that in this embodiment, with 4 butterfly processing units working in rotation, the 4 units form the smallest basic processing block. Because every two input points need one butterfly operation, the smallest FFT that can be supported has N = 8 points. Combined with N = 2^k, this is why k is an integer greater than or equal to 3.
Compared with the prior art, the main differences and effects of this embodiment are as follows. Data are stored uniformly in two data storage units, so input data of different point counts can be read with the same rule, which makes it possible to support multiple point counts. The data read-write unit reads the input data one by one according to the generated address sequence, inputs them in sequence to the butterfly processing units, and stores input and output data one by one. At any given time there is one input datum and one output datum, so only two data storage units are needed and circuit area is saved.
It is worth noting that the modules involved in this embodiment are logic modules. In practice, a logic unit can be a physical unit, part of a physical unit, or a combination of multiple physical units. In addition, to highlight the innovative part of the invention, units that are less closely related to the technical problem addressed by the invention are not introduced in this embodiment, but this does not mean that no other units exist in this embodiment.
The second embodiment of the present invention relates to an FFT processor. The second embodiment further optimizes the first embodiment. The main improvement is that in the second embodiment the value of k satisfies k <= 10 and the storage addresses of the input data increase in sequence. Each data storage unit contains 1024 addresses. When k = 10 and N = 2^10 = 1024, the input data are stored at consecutive addresses; when k <= 9, the address interval between adjacent input data is constant. Depending on the value of k, an FFT processor supporting 8 to 1024 points can be realized. That is, without changing the existing hardware environment, for example without changing the capacity of the data storage or the bit width of the address signal, FFT operations over the largest range of point counts can be supported.
Specifically, when an FFT with a smaller point count is computed, the data to be processed are not written to consecutive addresses in the data storage unit. For example, for a 512-point FFT the data are written to addresses 0, 2, 4, 6, ..., 1022, and for a 256-point FFT the data are written to addresses 0, 4, 8, 12, ..., 1020. The core idea is that the data must occupy the whole address space uniformly rather than being written contiguously into one block of it. Storing the data uniformly in this specific way lets the data be read with the same rule, which is what makes FFT operations of multiple point counts possible.
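The uniform address layout can be sketched in Python as follows. This is a minimal illustration assuming a 1024-address storage unit and an address stride of 1024/N, which reproduces the 512-point and 256-point examples in the text; the function name storage_addresses is hypothetical.

```python
def storage_addresses(k, depth=1024):
    """Addresses at which N = 2**k data are stored in a `depth`-address unit.

    The data are spread evenly over the whole address space, so the
    interval between adjacent data is depth // N.
    """
    n_points = 1 << k
    if k < 3 or n_points > depth:
        raise ValueError("k must satisfy 3 <= k and 2**k <= depth")
    stride = depth // n_points
    return [index * stride for index in range(n_points)]


if __name__ == "__main__":
    print(storage_addresses(9)[:4], storage_addresses(9)[-1])
    print(storage_addresses(8)[:4], storage_addresses(8)[-1])
```

With this layout the read-address logic written for 1024 points can be reused for smaller point counts by treating the unused low address bits as zero, which is one way to read the statement that all point counts follow the same rule.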
It should be noted that this embodiment is not limited to FFT operations of at most 1024 points; it can also support FFT operations of more than 1024 points. To support a higher point count, the only changes needed are a larger number of stages k, a larger data storage unit and a wider address signal.
Compared with the prior art, the main difference and effect of this embodiment is that, without changing the existing hardware environment, for example without changing the capacity of the data storage or the bit width of the address signal, FFT operations over the largest range of point counts can be supported.
The third embodiment of the present invention relates to an FFT operation method which, as shown in Figure 3, comprises:
Step 301: The data read/write unit uniformly stores the input data received from outside in a data storage unit.
Specifically, the data read/write unit uniformly stores N input data received from outside in one of the data storage units. N is the number of points the FFT processor supports in this embodiment; without changing the existing device, N can take any value within the maximum permitted range of point counts. The N input data on which the FFT is to be performed are loaded in advance and stored uniformly in a single data storage unit. Here, "uniformly" means that the address gap between successive stored data in the data storage unit is the same, which ensures that the data are read by the same rule and makes it possible to support multiple point counts.
Step 302: The twiddle factor read/write unit stores the twiddle factors received from outside in the twiddle factor storage unit.
Specifically, the twiddle factor read/write unit stores N/2 twiddle factors received from outside in the twiddle factor storage unit. In the FFT computation every two input data share one twiddle factor, so N input data require N/2 twiddle factors before the FFT operation can be performed. The N/2 twiddle factors are stored in the twiddle factor storage unit in advance by the twiddle factor read/write unit, in the form of a twiddle factor table.
It should be noted that there is no strict logical order between step 301 and step 302, and the two can be swapped. Whether the data read/write unit stores the input data in the data storage unit before or after the twiddle factor read/write unit stores the twiddle factors in the twiddle factor storage unit has no effect on the result of the FFT operation.
Step 303: The data read/write unit reads the input data one by one and feeds them in turn to the butterfly operation units.
Specifically, the data read/write unit reads the N input data one by one according to a generated address sequence, and feeds the N input data read in this way to the plurality of butterfly operation units in turn. The data read/write unit obtains the N input data from one of the data storage units and passes them to the plurality of butterfly operation units. The basic butterfly operation unit consists of 4 butterfly operation units. Because the data read/write unit reads and writes data one by one, there is only one input datum and one output datum at any given time, so only two data storage units are needed for data storage, which saves circuit area.
Step 304: The twiddle factor read/write unit reads the twiddle factors one by one and feeds them in turn to the butterfly operation units.
Specifically, the twiddle factor read/write unit reads the N/2 twiddle factors one by one according to a generated address sequence, and feeds the N/2 twiddle factors read in this way to the plurality of butterfly operation units in turn. That is, the twiddle factor read/write unit reads the N/2 twiddle factors out of the twiddle factor storage unit and passes them to the plurality of butterfly operation units.
It should be noted that there is no strict logical order between step 303 and step 304, and the two can be swapped. Whether the data read/write unit reads the input data and passes them to the butterfly operation units before or after the twiddle factor read/write unit reads the twiddle factors and passes them to the butterfly operation units has no effect on the result of the FFT operation.
Step 305: The butterfly operation units compute the output data.
Specifically, each butterfly operation unit computes its output data from the input data and twiddle factors it receives. The 4 butterfly operation units are used cyclically as one basic unit. In this basic structure, the data read/write unit and the twiddle factor read/write unit continuously read and write data in the data storage units and the twiddle factor storage unit; input data and twiddle factors are continuously selected, computed on, and the results output. Only the read and write addresses need to be transformed accordingly at each stage, so the entire FFT operation can be completed by continuously reusing the basic operation unit, and idle waiting time is reduced.
Step 306: The data read/write unit stores the output data in a data storage unit.
Specifically, the data read/write unit stores the output data one by one in the other data storage unit. The address at which a datum is read from the data storage unit of step 301 and the address at which the corresponding datum is stored in the other data storage unit in this step must be consistent, so that the data can be read and written conveniently.
Step 307: A counter records the current loop stage.
Specifically, each time the data read/write unit completes one storage of output data, the counter automatically records the current loop stage. The initial value of the counter is 0, meaning that the loop stage is 0 when the operation starts. Each time one storage of output data is completed, the counter is automatically incremented by one, and the incremented result is stored back in the counter.
Step 308: It is determined whether the current loop stage equals k.
Specifically, it is determined whether the current value held in the counter equals k. If it does, the method proceeds to step 309; if not, it returns to step 303. A counter value equal to k means that the whole FFT operation has been completed, so the method proceeds to step 309. Otherwise the current loop stage is less than k, that is, the k stages of loop operation have not yet been completed, so the method returns to step 303, fetches the input data and twiddle factors of the next stage, and feeds them to the butterfly operation units, until all k stages have been completed.
The output data of each stage serve as the input data of the next stage. Because the input data and twiddle factors received from outside are stored in advance in the data storage unit and the twiddle factor storage unit in steps 301 and 302, those two steps are not part of the k-stage loop.
Step 309: The counter is cleared.
Specifically, when the loop stage equals k, that is, when all k stages have been completed, the value of the counter is cleared so that the stages can be counted again from the start in the next FFT operation.
This embodiment contains 2 identical data storage units. Either unit can store input data or output data, and the 2 units store data in the manner of a ping-pong RAM. That is, for each stage of the FFT operation, the input data are taken from the first data storage unit and the computed results are written to the second data storage unit; the next stage then takes its input data from the second data storage unit and writes its computed output data to the first data storage unit.
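The stage loop and the ping-pong buffering described above can be sketched in software. This is only an illustration under stated assumptions, not the patented circuit. The names `pingpong_fft`, `bit_reverse` and `naive_dft` are hypothetical. The sketch uses a textbook radix-2 decimation-in-time flow with natural-order input and bit-reversed output; that choice, the loop order and the twiddle indexing are assumptions rather than details taken from the text. The four parallel butterfly units are modelled by an ordinary sequential loop.

```python
import cmath


def bit_reverse(value, width):
    """Reverse the lowest `width` bits of `value`."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def naive_dft(samples):
    """Direct O(N^2) DFT, used only as a reference for checking."""
    n = len(samples)
    return [sum(samples[t] * cmath.exp(-2j * cmath.pi * t * f / n)
                for t in range(n)) for f in range(n)]


def pingpong_fft(samples):
    """Radix-2 FFT over two buffers that swap roles after every stage."""
    n = len(samples)
    k = n.bit_length() - 1
    if n < 2 or (1 << k) != n:
        raise ValueError("length must be a power of two")
    # N/2 twiddle factors W_N^m, stored once before the stage loop starts.
    twiddles = [cmath.exp(-2j * cmath.pi * m / n) for m in range(n // 2)]
    src = [complex(x) for x in samples]  # first data storage unit
    dst = [0j] * n                       # second data storage unit
    stage = 0                            # stage counter, initial value 0
    while stage != k:                    # loop until the counter equals k
        half = n >> (stage + 1)          # distance between butterfly inputs
        for block in range(1 << stage):
            w = twiddles[bit_reverse(block, k - 1)]
            base = block * 2 * half
            for j in range(half):
                a = src[base + j]
                b = src[base + j + half] * w
                # Outputs go to the same addresses in the other buffer.
                dst[base + j] = a + b
                dst[base + j + half] = a - b
        src, dst = dst, src              # ping-pong: swap buffer roles
        stage += 1
    # This flow leaves the spectrum in bit-reversed order; undo that here.
    return [src[bit_reverse(m, k)] for m in range(n)]


if __name__ == "__main__":
    data = [1, 2, 3, 4, 0, -1, 2.5, 7]
    error = max(abs(g - r) for g, r in zip(pingpong_fft(data), naive_dft(data)))
    print("max deviation from direct DFT:", error)
```

Writing each butterfly's outputs to the addresses its inputs were read from mirrors the requirement that input and output addresses stay consistent, and swapping `src` and `dst` plays the role of the two storage units exchanging roles between stages.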
The choice of k determines the number of points the FFT operation can support, and the FFT operation runs k loop stages in total.
Compared with the prior art, the main differences and effects of this embodiment are as follows. The data are stored uniformly in the two data storage units, so input data of different point counts can be read with the same read rule, which makes it possible to support multiple point counts. The data read/write unit reads the input data one by one according to the generated address sequence, feeds them in turn to the plurality of butterfly operation units, and stores the output data of each butterfly operation unit one by one. At any given time there is therefore one input datum and one output datum, so only two data storage units are needed to store the data, which saves circuit area.
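The idea of uniform storage can be illustrated with a short sketch. It assumes a storage unit of 1024 addresses and an address gap of 1024/N between adjacent input data; the text only requires the gaps to be equal, so the specific gap and the function name `storage_addresses` are assumptions made for illustration.

```python
def storage_addresses(num_points, capacity=1024):
    """Return the storage address of each of the N input data.

    Assumption (not from the source): adjacent data are spaced
    capacity // N addresses apart, starting at address 0.
    """
    if num_points <= 0 or num_points & (num_points - 1):
        raise ValueError("num_points must be a power of two")
    if num_points > capacity:
        raise ValueError("num_points exceeds the storage capacity")
    stride = capacity // num_points
    return [index * stride for index in range(num_points)]


if __name__ == "__main__":
    print(storage_addresses(1024)[:4])  # consecutive addresses
    print(storage_addresses(256)[:4])   # equally spaced addresses
```

With this spacing, a smaller transform occupies the same address range as a 1024-point one, which is one way the same read rule could serve different point counts.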
It is easy to see that this embodiment is the method embodiment corresponding to the first embodiment, and the two can be implemented in cooperation with each other. The technical details mentioned in the first embodiment remain valid in this embodiment and, to reduce repetition, are not described again here. Correspondingly, the technical details mentioned in this embodiment also apply to the first embodiment.
The fourth embodiment of the present invention relates to an FFT operation method. The fourth embodiment is a further refinement of the third embodiment. The main refinement is that it provides a way of generating the read addresses of the input data: for the i-th stage of the operation, where i = 0, 1, ..., k, when the data read/write unit reads the N input data one by one according to the generated address sequence, the read address generation provided by this embodiment ensures that the correct data are read. Figure 4 shows the flow for generating the address of each input datum when the data read/write unit reads the N input data one by one in step 303 of the third embodiment. It comprises:
Step 401: The counter binary sequence corresponding to the input data is obtained.
Specifically, the counter binary sequence corresponding to each input datum is obtained. The input data address obtained by the data read/write unit is decimal; after it is obtained, the decimal address is converted into binary. For example, suppose the decimal input data address obtained is "1". In this embodiment k is less than or equal to 10; if k is 10, then N = 1024, and the decimal address "1" converts to the binary address "0000000001".
Step 402: The last i+1 bits of the counter binary sequence are reversed.
Specifically, for stage 0 (i = 0), the binary input data address obtained in step 401 is output by this step as "0000000001";
for stage 1 (i = 1), the binary input data address obtained in step 401 is output by this step as "0000000010".
Step 403: The whole word, with its last i+1 bits already reversed, is reversed in turn, and the result is used as the read address of each input datum.
Specifically, for stage 0 (i = 0), the binary address obtained in step 402 is output by this step as the binary data address "1000000000", which corresponds to the decimal data address "512";
for stage 1 (i = 1), the binary address obtained in step 402 is output by this step as the binary data address "0100000000", which corresponds to the decimal data address "256".
The read addresses therefore follow this rule: for stage i, the input data address sequence is obtained from the counter binary sequence by reversing its last i+1 bits and then reversing the entire word.
Taking a 1024-point FFT as an example, a counter first counts from 0 to 1023, and this sequence is converted into the corresponding read addresses. The converted sequence 0, 512, 256, 768, ... means that the data fed to the first butterfly operation unit are the data at address 0 and address 512, the data fed to the second butterfly operation unit are the data at address 256 and address 768, and so on. Data are fed to the butterfly operation units in turn according to this rule and are written out in turn once the computation is finished.
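Steps 401 to 403 can be sketched as a small address generator. The function names `reverse_bits` and `read_address` are hypothetical, and restricting the stage index to 0 ≤ i < k (so that the i+1 reversed bits fit inside the k-bit word) is an assumption of this sketch.

```python
def reverse_bits(value, width):
    """Reverse the order of the lowest `width` bits of `value`."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def read_address(counter, stage, k=10):
    """Map a counter value to the data read address for a given stage."""
    if not 0 <= stage < k:
        raise ValueError("stage must satisfy 0 <= stage < k")
    if not 0 <= counter < (1 << k):
        raise ValueError("counter must fit in k bits")
    low_width = stage + 1
    low_mask = (1 << low_width) - 1
    # Step 402: reverse only the last stage+1 bits, keep the upper bits.
    partly_reversed = (counter & ~low_mask) | reverse_bits(counter & low_mask, low_width)
    # Step 403: reverse the whole k-bit word.
    return reverse_bits(partly_reversed, k)


if __name__ == "__main__":
    print([read_address(c, 0) for c in range(4)])   # first stage-0 addresses
    print(read_address(1, 0), read_address(1, 1))   # counter value 1 at stages 0 and 1
```

For k = 10 this reproduces the worked values in the text: counter value 1 maps to address 512 at stage 0 and to address 256 at stage 1, and the stage-0 sequence begins 0, 512, 256, 768.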
Compared with the prior art, the main difference and effect of this embodiment are that the data storage addresses are allocated sensibly, which ensures that the FFT operation is carried out correctly.
The fifth embodiment of the present invention relates to an FFT operation method. The fifth embodiment is a further refinement of the third embodiment. The main refinement is that, for the i-th stage of the operation, it provides a way of generating the read address sequence of the N/2 twiddle factors when the twiddle factor read/write unit reads them one by one according to the generated address sequence. Figure 5 shows the flow for generating the address of each twiddle factor when the twiddle factor read/write unit reads the N/2 twiddle factors one by one in step 304 of the third embodiment. It comprises:
Step 501: A counting sequence is generated.
Specifically, a counting sequence is generated, expressed as 0, 1, 2, 3, ..., $2^i - 1$. That is, for stage 0 (i = 0), the counting sequence is 0, 0, 0, ...;
for stage 1, the counting sequence is 0, 1, 0, 1, ...;
for stage 2, the counting sequence is 0, 1, 2, 3, 0, 1, 2, 3, ...;
for stage i, the counting sequence is 0, 1, 2, 3, ..., $2^i - 1$, 0, 1, 2, 3, ...
Step 502: The counting sequence is bit-reversed to give the twiddle factor read address sequence.
Specifically, the counting sequence 0, 1, 2, 3, ..., $2^i - 1$ is expressed after bit reversal as 0, 512, 256, 768, ..., which serves as the read address sequence of the N/2 twiddle factors. For example, for stage 2, the counting sequence 0, 1, 2, 3, 0, 1, 2, 3, ... is expressed after bit reversal as 0, 512, 256, 768, 0, 512, 256, 768, ...
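Steps 501 and 502 can be sketched as follows. The function names are hypothetical. The width of the bit reversal is not stated in the text; the sketch assumes a 10-bit reversal because that width reproduces the quoted values 0, 512, 256, 768, and it is left as a parameter so other widths can be tried.

```python
def reverse_bits(value, width):
    """Reverse the order of the lowest `width` bits of `value`."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def counting_sequence(stage, count):
    """Step 501: the sequence 0, 1, ..., 2**stage - 1, repeated."""
    period = 1 << stage
    return [n % period for n in range(count)]


def twiddle_read_addresses(stage, count, width=10):
    """Step 502: bit-reverse every element of the counting sequence.

    Assumption (not from the source): the reversal spans `width` bits.
    """
    return [reverse_bits(c, width) for c in counting_sequence(stage, count)]


if __name__ == "__main__":
    for stage in range(3):
        print(stage, counting_sequence(stage, 8), twiddle_read_addresses(stage, 8))
```

The stage-2 output of this sketch is 0, 512, 256, 768 repeated, matching the example above; at stage 0 every address is 0, so a single twiddle factor is used throughout that stage.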
The twiddle factor addresses are chosen on the basis of the reducibility of the twiddle factor, $W_{N/m}^{k} = W_{N}^{mk}$. The twiddle factor values needed at every stage can therefore be converted into the form $W_{1024}^{k}$, where k ranges from 0 to 511. A single 1 KB storage unit is then enough to hold all the twiddle factors, and as the address changes the appropriate twiddle factor is read out and fed to the butterfly operation unit for computation. In other words, the values of the twiddle factors at each stage of the FFT algorithm are analyzed and converted into the form $W_{1024}^{k}$, and k is used as the address for reading the selected twiddle factor from the storage unit. The resulting address sequences follow a regular pattern, and summarizing that pattern yields the address generation method described in this embodiment.
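A single table indexed by the twiddle exponent can be sketched as below. The sketch assumes the usual definition $W_N^m = e^{-2\pi j m / N}$; the sign convention, the floating-point representation and the function name `build_twiddle_table` are assumptions, since the text does not specify how the factors are encoded.

```python
import cmath


def build_twiddle_table(num_points=1024):
    """Return the N/2 twiddle factors W_N^m for m = 0 .. N/2 - 1.

    Assumption (not from the source): W_N^m = exp(-2*pi*j*m/N).
    """
    if num_points < 2 or num_points & (num_points - 1):
        raise ValueError("num_points must be a power of two")
    return [cmath.exp(-2j * cmath.pi * m / num_points)
            for m in range(num_points // 2)]


if __name__ == "__main__":
    table = build_twiddle_table(1024)
    print(len(table))             # number of stored factors
    print(table[0], table[256])   # exponents 0 and 256
    # A factor of a 512-point transform is found at twice the exponent.
    smaller = build_twiddle_table(512)
    print(abs(smaller[3] - table[6]))
```

Because a factor of a shorter transform coincides with an entry of the 1024-point table at a scaled exponent, one table can serve every stage and every supported point count in this model.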
The division of the above methods into steps is only for clarity of description. In an implementation, steps may be merged into one step, or a step may be split into several steps; as long as the same logical relationship is included, such variants fall within the scope of protection of this patent. Adding insignificant modifications to the algorithm or flow, or introducing insignificant design changes, without altering the core design of the algorithm and flow, also falls within the scope of protection of this patent.
Those skilled in the art will understand that all or part of the steps of the methods of the above embodiments can be carried out by a program instructing the relevant hardware. The program is stored in a storage medium and includes instructions that cause a device (such as a single-chip microcontroller or a chip) or a processor to execute all or part of the steps of the methods of the embodiments of this application. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc.
Those skilled in the art will understand that the above embodiments are specific embodiments for implementing the present invention, and that in practical applications various changes may be made to them in form and detail without departing from the spirit and scope of the present invention.

Claims (8)

1. An FFT processor, characterized by comprising: two data storage units, a twiddle factor storage unit, a plurality of butterfly operation units, a data read/write unit, and a twiddle factor read/write unit;
the data read/write unit is connected to the two data storage units and to each butterfly operation unit; the two data storage units are respectively used to uniformly store the N input data and the N output data of the plurality of butterfly operation units, where $N = 2^k$, k ≥ 3, and k is an integer;
the twiddle factor read/write unit is connected to the twiddle factor storage unit and to each butterfly operation unit; the twiddle factor storage unit is used to store N/2 twiddle factors;
wherein the data read/write unit is used to read the N input data one by one and to feed the N input data read one by one to the plurality of butterfly operation units in turn; the twiddle factor read/write unit is used to read the N/2 twiddle factors one by one and to feed the N/2 twiddle factors read one by one to the plurality of butterfly operation units in turn; the data read/write unit is further used to store the N output data one by one, the addresses of the N input data being respectively consistent with the addresses of the N output data.
2. The FFT processor according to claim 1, characterized in that the number of butterfly operation units is 4.
3. The FFT processor according to claim 1, characterized in that each butterfly operation unit comprises 1 multiplier and 2 adders, and each butterfly operation unit is used to implement a radix-2 butterfly operation.
4. The FFT processor according to claim 1, characterized in that k satisfies k ≤ 10.
5. The FFT processor according to claim 4, characterized in that the storage addresses of the input data increase successively;
each data storage unit comprises 1024 addresses; when k = 10 and $N = 2^{10} = 1024$, the input data are stored consecutively; when k ≤ 9, the address gaps between adjacent input data are equal.
6. An FFT operation method, characterized in that it is applied to the FFT processor according to any one of claims 1 to 5, the FFT operation method comprising:
the data read/write unit uniformly storing N input data received from outside in one of the data storage units;
the twiddle factor read/write unit storing the N/2 twiddle factors received from outside in the twiddle factor storage unit;
the data read/write unit reading the N input data one by one and feeding the N input data read one by one to the plurality of butterfly operation units in turn;
the twiddle factor read/write unit reading the N/2 twiddle factors one by one and feeding the N/2 twiddle factors read one by one to the plurality of butterfly operation units in turn;
each butterfly operation unit computing its output data from the input data and twiddle factors it receives;
the data read/write unit storing the output data one by one in the other data storage unit, the addresses of the N input data being respectively consistent with the addresses of the N output data;
wherein the output data serve as the input data of the next stage of the operation, and k stages of loop operation are performed.
7. The FFT operation method according to claim 6, characterized in that, for the i-th stage of the operation, where i = 0, 1, ..., k, when the data read/write unit reads the N input data one by one, the read address of each input datum is generated by:
obtaining the counter binary sequence corresponding to each input datum;
reversing the last i+1 bits of the counter binary sequence;
reversing the whole word whose last i+1 bits have been reversed, and using the result as the read address of each input datum.
8. The FFT operation method according to claim 6, characterized in that, for the i-th stage of the operation, when the twiddle factor read/write unit reads the N/2 twiddle factors one by one, the read address sequence of the N/2 twiddle factors is generated by:
generating a counting sequence, the counting sequence being expressed as 0, 1, 2, 3, ..., $2^i - 1$;
bit-reversing the counting sequence, the result being expressed as 0, 512, 256, 768, ..., and using it as the read address sequence of the N/2 twiddle factors.
CN201680000901.8A 2016-08-10 2016-08-10 Fft processor and operation method Active CN106415526B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2016/094465 WO2018027706A1 (en) 2016-08-10 2016-08-10 Fft processor and algorithm

Publications (2)

Publication Number Publication Date
CN106415526A CN106415526A (en) 2017-02-15
CN106415526B true CN106415526B (en) 2019-05-24

Family

ID=58087900

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201680000901.8A Active CN106415526B (en) 2016-08-10 2016-08-10 Fft processor and operation method

Country Status (2)

Country Link
CN (1) CN106415526B (en)
WO (1) WO2018027706A1 (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108062289B (en) * 2018-01-25 2021-09-03 天津芯海创科技有限公司 Fast Fourier Transform (FFT) address order changing method, signal processing method and device
CN108319804B (en) * 2018-04-17 2023-08-08 福州大学 8192 point base 2 DIT ASIC design method for low resource call
CN110347968B (en) * 2019-07-08 2023-06-13 河海大学常州校区 FPGA-based FFT optimization algorithm and device
CN112307423B (en) * 2020-11-19 2023-09-22 天津大学 FFT processor based on base 2SDF pipeline type and implementation method thereof in ACO-OFDM system
CN113569189B (en) * 2021-07-02 2024-03-15 星思连接(上海)半导体有限公司 Fast Fourier transform calculation method and device
CN117591784B (en) * 2024-01-19 2024-05-03 武汉格蓝若智能技术股份有限公司 FPGA-based twiddle factor calculation method and FPGA chip

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101290613A (en) * 2007-04-16 2008-10-22 卓胜微电子(上海)有限公司 FFT processor data storage system and method
CN103176950A (en) * 2011-12-20 2013-06-26 中国科学院深圳先进技术研究院 Circuit and method for achieving fast Fourier transform (FFT) / inverse fast Fourier transform (IFFT)
CN103605636A (en) * 2013-12-09 2014-02-26 中国科学院微电子研究所 Device and method for realizing FFT (Fast Fourier Transform) operation
CN103970718A (en) * 2014-05-26 2014-08-06 苏州威士达信息科技有限公司 Quick Fourier transformation implementation device and method
CN104268122A (en) * 2014-09-12 2015-01-07 安徽四创电子股份有限公司 Point-changeable floating point FFT (fast Fourier transform) processor

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7047268B2 (en) * 2002-03-15 2006-05-16 Texas Instruments Incorporated Address generators for mapping arrays in bit reversed order
EP1516467B1 (en) * 2002-06-27 2023-04-26 Samsung Electronics Co., Ltd. Modulation apparatus using mixed-radix fast fourier transform
TWI298448B (en) * 2005-05-05 2008-07-01 Ind Tech Res Inst Memory-based fast fourier transformer (fft)
CN101072218B (en) * 2007-03-01 2011-11-30 华为技术有限公司 FFT/IFFI paired processing system, device and method

Also Published As

Publication number Publication date
CN106415526A (en) 2017-02-15
WO2018027706A1 (en) 2018-02-15

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant