CN104123266A - Method for achieving extremely-low-latency fast Fourier transform under gigabit sampling rate - Google Patents

Info

Publication number
CN104123266A
Authority
CN
China
Prior art keywords
data
fft
input
fourier transform
fast fourier
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201410350688.6A
Other languages
Chinese (zh)
Inventor
刘皓
何元波
候号前
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2014-07-23
Filing date
2014-07-23
Publication date
2014-10-29
Application filed by University of Electronic Science and Technology of China
Priority to CN201410350688.6A
Publication of CN104123266A
Legal status: Pending

Landscapes

  • Complex Calculations (AREA)

Abstract

The invention belongs to the technical field of digital signal processing, and in particular relates to a method for performing an extremely-low-latency fast Fourier transform (FFT) on sampled data with a programmable logic device at ultra-high sampling rates. In the method, the high-speed sampled data are fed into the Fourier transform module in parallel according to a fixed mapping rule; once the data are arranged by this rule, hardware resources are reused on top of the parallel structure, which reduces resource usage. Compared with conventional methods, the method meets the requirements of a high data rate and a low processing latency.

Description

An extremely-low-latency fast Fourier transform method for multi-GHz sampling rates
Technical field
The invention belongs to the technical field of digital signal processing, and in particular relates to performing an extremely-low-latency fast Fourier transform (FFT) on sampled data with a programmable logic device at ultra-high sampling rates.
Background
Owing to their reliability, low cost, and high precision, digital signal processing systems have developed rapidly in recent decades and are used in almost every field of engineering. One operation that arises frequently in digital signal processing is the discrete Fourier transform (DFT). Computing the DFT directly is computationally expensive: for an input signal of N points, direct computation requires $N^2$ complex multiplications and $N^2 - N$ complex additions. This becomes unacceptable when N is large.
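The operation counts quoted above can be read off the definition of the DFT. The block below is a short worked form; the symbols $s[n]$, $X[k]$, and $W_N$ are notation chosen here for illustration and do not appear in the source.

```latex
X[k] = \sum_{n=0}^{N-1} s[n]\, W_N^{nk}, \qquad W_N = e^{-j 2\pi / N}, \qquad k = 0, 1, \dots, N-1

% Each of the N outputs X[k] is a sum of N products:
%   N complex multiplications and N - 1 complex additions per output,
% so the direct computation costs
\underbrace{N \cdot N}_{\text{multiplications}} = N^2,
\qquad
\underbrace{N \cdot (N-1)}_{\text{additions}} = N^2 - N .
```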
In 1965, Cooley and Tukey proposed a method that reduces the computational cost of the DFT, the fast Fourier transform (FFT). For an N-point input signal, a radix-2 FFT requires $(N/2)\log_2 N$ complex multiplications and $N\log_2 N$ complex additions. As N increases, the operation count of the FFT grows only slightly faster than linearly. Compared with computing the DFT directly, the FFT greatly reduces the amount of computation, so practical applications compute the DFT with an FFT.
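To make the difference concrete, the following minimal Python sketch tabulates both operation counts. It assumes the textbook radix-2 counts of $(N/2)\log_2 N$ complex multiplications and $N\log_2 N$ complex additions; the function names are illustrative and not taken from the source.

```python
def direct_dft_ops(n):
    """Complex (multiplications, additions) for a direct N-point DFT."""
    return n * n, n * (n - 1)


def radix2_fft_ops(n):
    """Complex (multiplications, additions) for a radix-2 FFT.

    Assumes the textbook counts (N/2)*log2(N) and N*log2(N).
    """
    stages = n.bit_length() - 1  # log2(N) when N is a power of two
    if n < 2 or (1 << stages) != n:
        raise ValueError("N must be a power of 2 and at least 2")
    return (n // 2) * stages, n * stages


for n in (16, 256, 4096):
    print(n, direct_dft_ops(n), radix2_fft_ops(n))
```

For the 256-point case used later in the embodiment, the direct DFT needs 65536 complex multiplications while the radix-2 FFT needs 1024.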
A concrete FFT implementation must weigh several factors, such as the input data rate, the required processing latency, and the resource budget. The mature FFT IP cores currently offered by FPGA vendors use one of two architectures. The first uses a single butterfly unit and reuses it over time through timing control. The second is pipelined: it uses several butterfly units arranged in series, and the data are also input serially. The first architecture occupies very few resources, but its processing latency is very high and its input data rate cannot be very high. The second, being pipelined, reduces the processing latency, but because the butterfly operations are serial the latency cannot be made extremely low, and the input data rate is likewise limited. For a 256-point FFT, one vendor's IP core has a processing latency of about 2000 FPGA clock cycles with the first architecture and about 700 FPGA clock cycles with the second. This is adequate for ordinary FFT workloads.
In practice, however, there is another class of FFT workloads in which the sampling rate is very high, for example several giga-samples per second (GSPS), and the processing latency must at the same time be extremely low. The common FFT architectures described above are no longer suitable for this demand: neither the pipelined serial architecture nor the time-multiplexed architecture can handle such a high input rate, and neither can achieve an extremely low processing latency.
Summary of the invention
To address the shortcomings of common FFT architectures, the invention proposes an extremely-low-latency fast Fourier transform method for multi-GHz sampling rates. The method satisfies two requirements that common FFT implementations cannot meet together: a high data rate and a low processing latency.
The extremely-low-latency fast Fourier transform method for multi-GHz sampling rates is as follows:
S1. Input the sampled data in parallel: let the number of FFT points be $N = 2^L \cdot 2^M$ and let the input data rate be $F_s$. With parallel input, the operating frequency of the logic device is reduced to $F_s / 2^L$, where $2^L$ is the number of parallel input channels, a value determined by the sampling device, and $2^M$ is the number of clock cycles needed to input all the data in parallel;
S2. Determine the mapping between the sample data and the parallel channels: number the parallel input channels Chx. The n-th datum of the N-point FFT is mapped onto the x-th channel Chx according to $n = x + m \cdot 2^L$, where n is the index of the datum in the N-point FFT, x is the index of the input channel, and m is the index of the clock cycle after input begins, with $0 \le m \le 2^M - 1$ and $0 \le x \le 2^L - 1$;
S3. After S1 and S2, the FFT of S1 has been decimated M times by radix 2, which yields $2^L$ DFT modules of $2^M$ points each. The input data of each DFT module are all the data of one parallel input channel Chx of S2. The data of each DFT module enter serially, one per clock, and resources are reused through timing control;
S4. Once the $2^L$ DFT modules of S3 have produced their outputs, complete the subsequent M butterfly stages, M being the number of decimations of the FFT.
Further, the $2^L$ DFT modules of S3 operate in parallel and are implemented in mutually independent hardware.
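The reason the channel mapping of S2 produces independent per-channel DFTs can be seen by substituting $n = x + m \cdot 2^L$ into the DFT sum. The derivation below is a sketch under notation chosen here: $s[n]$ for the samples, $W_N = e^{-j2\pi/N}$, and $D_x$ for the $2^M$-point DFT of channel Chx. None of these symbols appear in the source.

```latex
\begin{aligned}
X[k] &= \sum_{n=0}^{N-1} s[n]\, W_N^{nk},
        \qquad N = 2^L \cdot 2^M \\
     &= \sum_{x=0}^{2^L-1} \sum_{m=0}^{2^M-1}
        s[x + m\,2^L]\; W_N^{(x + m\,2^L)k} \\
     &= \sum_{x=0}^{2^L-1} W_N^{xk}
        \underbrace{\sum_{m=0}^{2^M-1} s[x + m\,2^L]\; W_{2^M}^{mk}}_{D_x[k \bmod 2^M]}
        \qquad \text{since } W_N^{2^L} = W_{2^M}.
\end{aligned}
```

The inner sum uses only the data of one channel, which is why each DFT module of S3 can run on its own channel. The outer sum over the $2^L$ channels is what the butterfly stages of S4 evaluate. In this derivation the outer sum takes $L$ radix-2 stages; in the 256-point embodiment $L = M = 4$, so this coincides with the count of 4 stated there.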
The beneficial effects of the invention are:
By parallelizing the FFT structure, the invention raises the data rate that can be processed and at the same time greatly reduces the processing latency of the FFT.
Brief description of the drawings
Fig. 1 is a schematic diagram of the parallel data input of the 256-point FFT module.
Fig. 2 is the computation structure of the 256-point FFT after 4 radix-2 decimations.
Fig. 3 shows the computation and implementation structure of the 16-point DFT.
Fig. 4 is the final implementation structure of the 256-point FFT.
Detailed description
The technical solution of the invention is described in detail below with reference to an embodiment and the drawings.
The extremely-low-latency fast Fourier transform method for multi-GHz sampling rates is as follows:
S1. Input the sampled data in parallel: let the number of FFT points be N. In the FFT, N must be a power of 2, so $N = 2^L \cdot 2^M$, where $2^L$ is the number of parallel input channels, a value determined by the sampling device, and $2^M$ is the number of clock cycles needed to input all the data in parallel. Let the input data rate be $F_s$; with parallel input, the operating frequency of the logic device is reduced to $F_s / 2^L$. Parallel input of this kind lets the whole FFT module process sampled data at ultra-high rates.
S2. Determine the mapping between the sample data and the parallel channels: number the parallel input channels Chx, where $0 \le x \le 2^L - 1$. The n-th datum of the N-point FFT is mapped onto the x-th channel Chx according to $n = x + m \cdot 2^L$, where n is the index of the datum in the N-point FFT, x is the index of the input channel, and m is the index of the clock cycle after input begins, with $0 \le m \le 2^M - 1$.
S3. After S1 and S2, the FFT has been decimated M times by radix 2, which yields $2^L$ DFT modules of $2^M$ points each. The input data of each DFT module are all the data of one parallel input channel Chx of S2. The data of each DFT module enter serially, one per clock, and resources are reused through timing control; the $2^L$ DFT modules operate in parallel and are implemented in mutually independent hardware. Because all the data of one channel arrive serially, the butterfly units needed to compute the $2^M$-point DFT can be reused.
S4. Once the $2^L$ DFT modules of S3 have produced their outputs, complete the subsequent M butterfly stages, M being the number of decimations of the FFT. By controlling the output timing of each DFT module, the butterfly units of these subsequent stages can be reused, so that only one butterfly unit is needed for each point size. This saves a great deal of resources without adding excessive latency.
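Steps S2 to S4 can be checked numerically with a small software model. The Python sketch below is not the hardware design: it uses a direct DFT as a stand-in for each per-channel DFT module, ignores timing and butterfly reuse, and introduces the names `dft`, `combine`, and `parallel_fft` purely for illustration. It merges the channel DFTs with radix-2 decimation-in-time stages, log2 of the channel count in number, which equals the 4 stages of the 256-point embodiment.

```python
import cmath


def dft(seq):
    """Direct DFT; stands in for one per-channel DFT module (S3)."""
    n = len(seq)
    return [
        sum(seq[t] * cmath.exp(-2j * cmath.pi * t * k / n) for t in range(n))
        for k in range(n)
    ]


def combine(even, odd):
    """One radix-2 decimation-in-time stage merging two equal-length DFTs."""
    half = len(even)
    out = [0j] * (2 * half)
    for k in range(half):
        t = cmath.exp(-2j * cmath.pi * k / (2 * half)) * odd[k]
        out[k] = even[k] + t
        out[k + half] = even[k] - t
    return out


def parallel_fft(samples, num_channels):
    """Software model of S2-S4 for a power-of-two channel count."""
    if num_channels < 1 or num_channels & (num_channels - 1):
        raise ValueError("channel count must be a power of 2")
    if len(samples) % num_channels:
        raise ValueError("sample count must be a multiple of the channel count")
    # S2: channel x holds samples x, x + C, x + 2C, ... (C = channel count).
    # S3: one DFT per channel.
    blocks = [dft(samples[x::num_channels]) for x in range(num_channels)]
    # S4: merge channel r with channel r + half until one block remains.
    while len(blocks) > 1:
        half = len(blocks) // 2
        blocks = [combine(blocks[r], blocks[r + half]) for r in range(half)]
    return blocks[0]


if __name__ == "__main__":
    data = [complex((7 * i) % 13, (3 * i) % 5) for i in range(256)]
    err = max(abs(a - b) for a, b in zip(parallel_fft(data, 16), dft(data)))
    print("max deviation from direct DFT:", err)
```

Pairing channel r with channel r + half at each stage follows from the mapping of S2: the two channels hold the even-indexed and odd-indexed elements of the sequence that starts at sample r with half the stride.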
A 256-point FFT was implemented on a Xilinx xc7vx485t-1ffg1157 device. The sampling rate is 2.4 GSPS and the FFT computation module is clocked at 150 MHz. The sampled data enter the FFT module over 16 parallel channels, so inputting all 256 samples takes 16 clock cycles. The parallel data input structure is shown in Fig. 1.
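These figures follow directly from S1. A minimal Python sketch of the arithmetic, in which the function name `parallel_input_plan` and its parameter names are illustrative and not part of the source:

```python
def parallel_input_plan(sample_rate_hz, fft_points, num_channels):
    """Return (logic clock in Hz, clock cycles needed to load one FFT frame)."""
    if fft_points % num_channels:
        raise ValueError("FFT size must be a multiple of the channel count")
    clock_hz = sample_rate_hz / num_channels   # F_s / 2^L
    load_cycles = fft_points // num_channels   # 2^M
    return clock_hz, load_cycles


clock_hz, load_cycles = parallel_input_plan(2.4e9, 256, 16)
print(clock_hz / 1e6, "MHz,", load_cycles, "cycles")
```

With 2.4 GSPS, 256 points, and 16 channels this gives the 150 MHz clock and the 16 load cycles stated for the embodiment.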
Because the 256 FFT data are input over 16 parallel channels in 16 clock cycles, the FFT is decimated 4 times by radix 2, which yields 16 DFTs of 16 points each. To show this in more detail, Fig. 2 gives the computation structure of the 256-point FFT after 4 radix-2 decimations. Inspecting these 16 DFTs shows that the input data of each 16-point DFT are exactly the data of one input channel. For example, the input data of the first 16-point DFT are the data of input channel Ch0.
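The correspondence between one channel and one 16-point DFT can be illustrated by listing which sample indices a channel receives under the mapping $n = x + m \cdot 2^L$ of S2. The helper `channel_stream` below is a hypothetical name used only for this sketch.

```python
def channel_stream(samples, num_channels, x):
    """Data seen on channel Chx over successive clocks: n = x + m * num_channels."""
    if not 0 <= x < num_channels:
        raise ValueError("channel index out of range")
    if len(samples) % num_channels:
        raise ValueError("sample count must be a multiple of the channel count")
    cycles = len(samples) // num_channels
    return [samples[x + m * num_channels] for m in range(cycles)]


indices = list(range(256))            # use the sample index as the sample value
ch0 = channel_stream(indices, 16, 0)  # inputs of the first 16-point DFT
ch5 = channel_stream(indices, 16, 5)
print(ch0[:4], ch5[:4])
```

Channel Ch0 receives samples 0, 16, 32, and so on, one per clock, which is exactly the decimated sequence that a 16-point DFT needs after 4 radix-2 decimations.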
The 16-point DFT is implemented with butterfly operations. Because the input data of the 16-point DFT arrive serially, one per clock, the butterfly units can be reused. The 16-point DFT module contains one 2-point butterfly unit, one 4-point butterfly unit, one 8-point butterfly unit, and one 16-point butterfly unit. This implementation exploits the serial arrival of the data to reuse the butterfly units, which greatly reduces resource usage. The structure of the 16-point DFT module is shown in Fig. 3.
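The four butterfly sizes named above correspond to the four stages of a radix-2 decimation-in-time FFT of length 16. The Python sketch below shows only that arithmetic, as a standard iterative radix-2 FFT; it does not model the serial data arrival or the time-multiplexing of one butterfly unit per size, and the function names are illustrative.

```python
import cmath


def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r


def fft_by_stages(x):
    """Iterative radix-2 DIT FFT; returns (spectrum, butterfly sizes used)."""
    n = len(x)
    bits = n.bit_length() - 1
    if n < 1 or (1 << bits) != n:
        raise ValueError("length must be a power of 2")
    # Reorder the input so that each stage works on adjacent blocks.
    a = [complex(x[bit_reverse(i, bits)]) for i in range(n)]
    sizes = []
    size = 2
    while size <= n:
        half = size // 2
        for start in range(0, n, size):
            for k in range(half):
                t = cmath.exp(-2j * cmath.pi * k / size) * a[start + k + half]
                u = a[start + k]
                a[start + k] = u + t
                a[start + k + half] = u - t
        sizes.append(size)  # one stage of `size`-point butterflies is done
        size *= 2
    return a, sizes


spectrum, sizes = fft_by_stages([complex(i % 5, -(i % 3)) for i in range(16)])
print(sizes)
```

For a 16-point input the stage sizes are 2, 4, 8, and 16, matching the four butterfly units listed for the DFT module.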
Because the outputs of the 16-point DFTs are independent, timing control allows the subsequent 32-point, 64-point, 128-point, and 256-point butterfly units to be reused, which saves resources without adding much latency. The final implementation structure is shown in Fig. 4.
This embodiment uses less than 30% of the DSP48 slices and less than 20% of the logic. The input data rate is 2.4 GSPS. From the arrival of the first datum to the output of all 256 FFT results, fewer than 50 FPGA clock cycles elapse.
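For scale, the cycle count can be converted to time using the 150 MHz clock of this embodiment. The comparison with the vendor IP cores below is in clock cycles only and assumes nothing about the clock rate at which those cores were measured, since the source does not state it.

```latex
t_{\text{latency}} < \frac{50\ \text{cycles}}{150 \times 10^{6}\ \text{cycles/s}} \approx 333\ \text{ns}

\frac{700\ \text{cycles (pipelined IP core)}}{50\ \text{cycles}} = 14,
\qquad
\frac{2000\ \text{cycles (single-butterfly IP core)}}{50\ \text{cycles}} = 40
```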

Claims (2)

1. An extremely-low-latency fast Fourier transform method for multi-GHz sampling rates, characterized by comprising the following steps:
S1. Input the sampled data in parallel: let the number of FFT points be $N = 2^L \cdot 2^M$ and let the input data rate be $F_s$. With parallel input, the operating frequency of the logic device is reduced to $F_s / 2^L$, where $2^L$ is the number of parallel input channels, a value determined by the sampling device, and $2^M$ is the number of clock cycles needed to input all the data in parallel;
S2. Determine the mapping between the sample data and the parallel channels: number the parallel input channels Chx. The n-th datum of the N-point FFT is mapped onto the x-th channel Chx according to $n = x + m \cdot 2^L$, where n is the index of the datum in the N-point FFT, x is the index of the input channel, and m is the index of the clock cycle after input begins, with $0 \le m \le 2^M - 1$ and $0 \le x \le 2^L - 1$;
S3. After S1 and S2, the FFT of S1 has been decimated M times by radix 2, which yields $2^L$ DFT modules of $2^M$ points each. The input data of each DFT module are all the data of one parallel input channel Chx of S2. The data of each DFT module enter serially, one per clock, and resources are reused through timing control;
S4. Once the $2^L$ DFT modules of S3 have produced their outputs, complete the subsequent M butterfly stages, M being the number of decimations of the FFT.
2. The extremely-low-latency fast Fourier transform method for multi-GHz sampling rates according to claim 1, characterized in that the $2^L$ DFT modules of S3 operate in parallel and are implemented in mutually independent hardware.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410350688.6A CN104123266A (en) 2014-07-23 2014-07-23 Method for achieving extremely-low-latency fast Fourier transform under gigabit sampling rate

Publications (1)

Publication Number Publication Date
CN104123266A (en) 2014-10-29

Family

ID=51768680

Country Status (1)

Country Link
CN (1) CN104123266A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108205517A (en) * 2016-12-20 2018-06-26 中国航天科工集团八五研究所 FFT multiplexing method
CN108205517B (en) * 2016-12-20 2021-06-08 中国航天科工集团八五一一研究所 FFT multiplexing method
CN112732339A (en) * 2021-01-20 2021-04-30 上海微波设备研究所(中国电子科技集团公司第五十一研究所) Time division multiplexing time extraction FFT implementation method, system and medium
CN113407902A (en) * 2021-06-29 2021-09-17 哈尔滨工业大学 Input block remapping FFT method based on FPGA


Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20141029