CN101136891B - 3780-point fast Fourier transform processor with a pipelined structure

Publication number: CN101136891B
Authority: CN (China)
Prior art keywords: fft, output, input, unit, mentioned
Legal status: Expired - Fee Related (as listed by Google Patents; an assumption, not a legal conclusion)
Application number: CN2007100447161A
Other languages: Chinese (zh)
Other versions: CN101136891A (en)
Inventors: 曾晓洋, 陈赟, 林一帆, 肖昊, 许光铭
Current assignee (as listed; may be inaccurate): SHANGHAI FUDAN MICRONANO ELECTRONICS CO Ltd; Fudan University
Original assignee: SHANGHAI FUDAN MICRONANO ELECTRONICS CO Ltd; Fudan University
Application filed by: SHANGHAI FUDAN MICRONANO ELECTRONICS CO Ltd; Fudan University
Priority: CN2007100447161A
Classification: Complex Calculations

Abstract

A 3780-point fast Fourier transform (FFT) processor based on a pipelined architecture. Using a mixed-radix algorithm, the invention decomposes the 3780-point FFT into 63 × 60. Using the prime factor algorithm, the 63-point and 60-point FFTs are further decomposed into 7 × 9 and 5 × 3 × 4, respectively. The processor is characterized in that each FFT arithmetic unit uses a pipeline-oriented state design and that memory units with a "ping-pong" structure store the working data, so that the whole 3780-point FFT processor is fully pipelined. Input data can therefore be transformed without interruption, and results are output without interruption. The invention raises the data throughput of the processor and reduces the complexity of the read/write address control for the memories.

Description

3780-point fast Fourier transform processor with a pipelined structure
Technical field
The invention belongs to the field of digital signal processing and digital communication technology, and specifically relates to a fast Fourier transform (FFT) processor based on a pipelined structure.
Background art
The Chinese national standard for digital terrestrial television transmission uses time-domain synchronous OFDM (TDS-OFDM) to modulate and demodulate the signal. At the transmitter, an inverse discrete Fourier transform (IDFT) converts the frequency-domain modulation data into a time-domain signal for transmission; at the receiver, a discrete Fourier transform (DFT) converts the received time-domain signal back into a frequency-domain signal, which is then decided and demodulated to recover the frequency-domain modulation information. The DFT/IDFT thus accounts for a large share of the computation time and computing resources of a TDS-OFDM system and largely determines the power consumption and complexity of the receiver. How to implement the DFT/IDFT quickly and efficiently therefore has a major influence on the implementation of the whole TDS-OFDM communication system.
At present there are roughly two common ways to implement the 3780-point discrete Fourier transform. 1. Interpolation: the 3780 points are interpolated to 4096 points, any fast Fourier transform (FFT) algorithm for 2^N points computes the 4096-point FFT, and the result is decimated to obtain the 3780-point FFT. 2. A mixed-radix FFT algorithm: the 3780 points are decomposed into 63 × 60, the 63-point FFT is decomposed into 7 × 9 by the prime factor algorithm, the 60-point FFT is decomposed into 5 × 3 × 4 by the prime factor algorithm, and the 3-, 5-, 7- and 9-point FFTs are each implemented with the Winograd Fourier transform algorithm (WFTA). Both methods have shortcomings. In method 1, interpolating 3780 points to 4096 points reduces the hardware complexity but also introduces error and lowers the computational accuracy, so this method is generally not used in practice. In method 2, because a mixed-radix algorithm is used, the address mapping of the intermediate data is rather complicated, which makes a fully pipelined structure hard to adopt and therefore limits the data throughput of the processor.
Summary of the invention
To overcome the shortcomings of the two kinds of 3780-point FFT processor described above, the invention proposes a new 3780-point FFT processor based on a fully pipelined structure. It computes the 3780-point DFT/IDFT exactly, effectively improves the data throughput of the processor, and reduces both the hardware complexity and the power consumption of the chip.
The 3780-point FFT processor of the invention consists of the following units (see Fig. 1):
An FFT/IFFT control unit 101, which controls whether the processor computes an FFT or an IFFT (inverse fast Fourier transform);
A read/write address and control state machine 102, which generates the read/write addresses of the memory units and controls the operating state of each unit;
An input conjugate unit 103, one input of which receives the data to be transformed and the other input of which is connected to the output of the FFT/IFFT control unit 101;
A first "ping-pong" memory unit 104, one input of which is connected to the output of the input conjugate unit 103 and the other input of which is connected to the output of the read/write address and control state machine 102;
A first selector 105, whose input is connected to the output of the first "ping-pong" memory unit 104;
A 63-point FFT computing unit 106, whose input is connected to the output of the first selector 105;
A twiddle factor storage ROM 107, whose input is connected to the output of the read/write address and control state machine 102;
A twiddle factor mapping unit 108, whose input is connected to the output of the twiddle factor storage ROM 107;
A complex multiplication unit 109, one input of which is connected to the output of the 63-point FFT computing unit 106 and the other input of which is connected to the output of the twiddle factor mapping unit 108;
A second "ping-pong" memory unit 110, one input of which is connected to the output of the complex multiplication unit 109 and the other input of which is connected to the output of the read/write address and control state machine 102;
A second selector 111, whose input is connected to the output of the second "ping-pong" memory unit 110;
A 60-point FFT computing unit 112, whose input is connected to the output of the second selector 111;
An output conjugate unit 113, one input of which is connected to the output of the FFT/IFFT control unit 101 and the other input of which is connected to the output of the 60-point FFT computing unit 112; its output is the result of the whole 3780-point FFT.
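The role of the input and output conjugate units can be illustrated with a short sketch. It assumes the usual identity IDFT(x) = conj(DFT(conj(x))) / N, so that one forward transform core serves both directions. The function names and the 1/N scaling are assumptions made for illustration; the text does not say where, or whether, the scaling is applied.

```python
import cmath

def dft(x):
    """Direct O(N^2) forward DFT, standing in for the FFT core."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * t * k / n) for t in range(n))
            for k in range(n)]

def transform(x, inverse=False):
    """Forward core reused for the inverse by conjugating input and output."""
    if not inverse:
        return dft(x)
    y = dft([v.conjugate() for v in x])         # role of the input conjugate unit
    # role of the output conjugate unit; the 1/N scaling is an assumption
    return [v.conjugate() / len(x) for v in y]
```

A round trip such as `transform(transform(x), inverse=True)` should return `x` up to floating-point rounding.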
In the invention, the 63-point FFT computing unit consists of the following units (see Fig. 2):
A 7-point FFT computing unit 201, whose input receives the input data of this unit;
A read/write address and control state machine 202, which generates the read/write addresses needed by this unit and controls its state;
A "ping-pong" memory unit 203, one input of which is connected to the output of the 7-point FFT computing unit 201 and the other input of which is connected to the output of the read/write address and control state machine 202;
A selector 204, whose input is connected to the output of the "ping-pong" memory unit 203;
A 9-point FFT computing unit 205, whose input is connected to the output of the selector 204 and whose output is the output data of this unit.
In the invention, the 60-point FFT computing unit consists of the following units (see Fig. 3):
A 5-point FFT computing unit 301, whose input receives the input data of this unit;
A read/write address and control state machine 302, which generates the read/write addresses needed by this unit and controls its state;
A "ping-pong" memory unit 303, one input of which is connected to the output of the 5-point FFT computing unit 301 and the other input of which is connected to the output of the read/write address and control state machine 302;
A selector 304, whose input is connected to the output of the "ping-pong" memory unit 303;
A 3-point FFT computing unit 305, whose input is connected to the output of the selector 304;
A 4-point FFT computing unit 306, whose input is connected to the output of the 3-point FFT computing unit 305 and whose output is the output data of this unit.
The 3-, 5-, 7- and 9-point FFT computing units above consist of the following units (see Fig. 9):
A state machine 901, which controls the state transitions of the unit;
An adder group 902, made up of N adders, where N depends on the number of points being computed;
A multiplier group 903, made up of N multipliers, where N depends on the number of points being computed;
A register group 904, which stores the intermediate data of the computation;
Four selectors 905-908, controlled by the state machine 901 and connected to the adder group 902, the multiplier group 903 and the register group 904, respectively.
Based on the mixed-radix algorithm, the invention decomposes the 3780-point FFT into 63 × 60, computes the 63-point and 60-point FFTs in pipelined fashion, and stores the working data in "ping-pong" memory units, so that the whole 3780-point FFT is computed in a fully pipelined manner; this corresponds to the processing structure shown in Fig. 1. Specifically:
The 63-point fast Fourier transform (FFT) is decomposed into 7 × 9 by the prime factor algorithm. The 7-point and 9-point FFTs are computed in pipelined fashion and the working data are stored in a "ping-pong" memory unit, so that the 63-point FFT is pipelined; this corresponds to the processing structure shown in Fig. 2;
The 60-point fast Fourier transform (FFT) is decomposed into 5 × 3 × 4 by the prime factor algorithm. The 5-point, 3-point and 4-point FFTs are computed in pipelined fashion and the working data are stored in a "ping-pong" memory unit, so that the 60-point FFT is pipelined; this corresponds to the processing structure shown in Fig. 3.
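The 63 × 60 split can be sketched as a generic two-stage mixed-radix (Cooley-Tukey) transform. This is a minimal software model, not the hardware: it assumes the input index is written as n = n2·i1 + i2 and the output index as k = k1 + n1·k2, and the names `mixed_radix_dft`, `n1` and `n2` are illustrative. The patent's actual address generation is not described in enough detail to reproduce.

```python
import cmath

def dft(x):
    """Direct O(N^2) DFT used for the short sub-transforms."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * t * k / n) for t in range(n))
            for k in range(n)]

def mixed_radix_dft(x, n1, n2):
    """N = n1*n2 point DFT: n2 transforms of length n1, twiddles, n1 transforms of length n2."""
    n = n1 * n2
    assert len(x) == n
    # Stage 1: one length-n1 transform per residue i2 (n2 of them in total).
    stage1 = [dft([x[n2 * i1 + i2] for i1 in range(n1)]) for i2 in range(n2)]
    out = [0j] * n
    for k1 in range(n1):
        # Twiddle multiplication by W_N^(i2*k1) between the two stages.
        col = [stage1[i2][k1] * cmath.exp(-2j * cmath.pi * i2 * k1 / n)
               for i2 in range(n2)]
        # Stage 2: one length-n2 transform per k1 (n1 of them in total).
        y = dft(col)
        for k2 in range(n2):
            out[k1 + n1 * k2] = y[k2]
    return out
```

With `n1 = 63` and `n2 = 60` the first stage runs 60 transforms of length 63 and the second stage runs 63 transforms of length 60, which is the same count as the repeated steps in the embodiment.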
In the invention, the 3-point FFT arithmetic unit uses the Winograd Fourier transform algorithm (WFTA), and its computation follows the pipeline-oriented state transitions shown in Fig. 4 and Table 1.
The 4-point FFT arithmetic unit uses the ordinary FFT algorithm, and its computation follows the pipeline-oriented state transitions shown in Fig. 5 and Table 2.
The 5-point FFT arithmetic unit uses the WFTA, and its computation follows the pipeline-oriented state transitions shown in Fig. 6 and Table 3.
The 7-point FFT arithmetic unit uses the WFTA, and its computation follows the pipeline-oriented state transitions shown in Fig. 7 and Table 4.
The 9-point FFT arithmetic unit uses the WFTA, and its computation follows the pipeline-oriented state transitions shown in Fig. 8 and Table 5.
The 3-, 4-, 5-, 7- and 9-point FFT arithmetic units are implemented with the circuit structure shown in Fig. 9.
By adopting the circuit structure described above, the 3780-point FFT processor proposed by the invention achieves a fully pipelined structure. While preserving computational accuracy, it raises the data throughput and processing speed of the processor and avoids complicated address mapping.
Its performance fully meets the requirements of the national standard for digital terrestrial television transmission.
Description of drawings
Fig. 1 is a circuit structure diagram of the 3780-point FFT processor.
Fig. 2 is a circuit structure diagram of the 63-point FFT arithmetic unit.
Fig. 3 is a circuit structure diagram of the 60-point FFT arithmetic unit.
Fig. 4 is the pipeline-oriented state diagram for computing the 3-point FFT.
Fig. 5 is the pipeline-oriented state diagram for computing the 4-point FFT.
Fig. 6 is the pipeline-oriented state diagram for computing the 5-point FFT.
Fig. 7 is the pipeline-oriented state diagram for computing the 7-point FFT.
Fig. 8 is the pipeline-oriented state diagram for computing the 9-point FFT.
Fig. 9 is a circuit structure diagram of the Winograd Fourier transform algorithm (WFTA) implementation, in which 901 is the state machine, 902 is the adder group, 903 is the multiplier group, 904 is the register group, and 905-908 are the four selectors.
Reference numerals in the figures: 101, FFT/IFFT control unit; 102, read/write address and control state machine; 103, input conjugate unit; 104, first "ping-pong" memory unit; 105, first selector; 106, 63-point FFT computing unit; 107, twiddle factor storage ROM; 108, twiddle factor mapping unit; 109, complex multiplication unit; 110, second "ping-pong" memory unit; 111, second selector; 112, 60-point FFT computing unit; 113, output conjugate unit; 201, 7-point FFT computing unit; 202, read/write address and control state machine; 203, "ping-pong" memory unit; 204, selector; 205, 9-point FFT computing unit; 301, 5-point FFT computing unit; 302, read/write address and control state machine; 303, "ping-pong" memory unit; 304, selector; 305, 3-point FFT computing unit; 306, 4-point FFT computing unit.
Embodiment
Referring to Fig. 1, and taking one group of 3780 input points as one period, the computation proceeds as follows:
(1) The FFT/IFFT control unit 101 selects, according to an input signal, whether the whole process performs an FFT or an IFFT;
(2) In the first period, the input data pass through the input conjugate unit 103 and, under control of the read/write address and control state machine 102, are written into the upper 3780 locations of the first "ping-pong" memory unit 104;
(3) In the second period, the read/write address and control state machine 102 directs the incoming data into the lower 3780 locations of the first "ping-pong" memory unit 104; at the same time, the 3780 inputs of the previous period held in the upper 3780 locations are fed through the selector 105 into the 63-point FFT computing unit 106, which computes 63-point FFTs;
(4) The twiddle factor storage ROM 107 outputs a twiddle factor at the address supplied by the read/write address and control state machine 102; the twiddle factor mapping unit 108 converts it into the required twiddle factor, which is multiplied by the output of the 63-point FFT computing unit 106 from step (3); under control of the read/write address and control state machine 102, the products are written into the upper 3780 locations of the second "ping-pong" memory unit 110;
(5) After step (4) has been carried out 60 times, the first group of 3780 data points has completely passed through the 63-point FFT and is held in the upper 3780 locations of the second "ping-pong" memory unit 110;
(6) After step (5), the results in the upper 3780 locations of the second "ping-pong" memory unit 110 are fed through the selector 111 into the 60-point FFT computing unit 112;
(7) After step (6) has been carried out 63 times, the 3780-point FFT of the first group is complete, and the result is output through the output conjugate unit 113.
The whole computation is pipelined: data can be written into the first "ping-pong" memory unit 104 without interruption at one point per clock, and results are output at one point per clock.
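The "ping-pong" storage used in the steps above can be modelled in a few lines. This is a behavioural sketch under simple assumptions: two banks of equal depth, one sample written per clock, and a bank handed to the reader when it fills. The class and method names are hypothetical, and the real design addresses the banks through the read/write address and control state machine rather than through a software call.

```python
class PingPongBuffer:
    """Two banks: one is written while the other is read; roles swap each period."""

    def __init__(self, depth):
        self.depth = depth
        self.banks = [[0j] * depth, [0j] * depth]
        self.write_bank = 0   # index of the bank currently being filled
        self.count = 0        # samples written into that bank so far

    def push(self, value):
        """Write one sample; return the bank that just filled, otherwise None."""
        self.banks[self.write_bank][self.count] = value
        self.count += 1
        if self.count < self.depth:
            return None
        full = self.banks[self.write_bank]
        self.write_bank ^= 1  # later writes go to the other bank
        self.count = 0
        return full
```

The returned bank stays untouched for the next `depth` clocks, which is the window in which the downstream FFT unit can read it while new input keeps arriving.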
Fig. 2 shows the circuit structure of the 63-point FFT arithmetic unit. Taking 63 points as one period, the computation proceeds as follows:
(1) In one period, data enter the 7-point FFT computing unit 201, and its results are written, under control of the read/write address and control state machine 202, into the upper 63 locations of the "ping-pong" memory unit 203; at the same time, the 7-point FFT results of the previous period held in the lower 63 locations of the "ping-pong" memory unit 203 pass through the selector 204 into the 9-point FFT computing unit 205;
(2) After step (1) has been carried out 9 times, all 63 inputs have passed through the 7-point FFT and are held in the upper 63 locations of the "ping-pong" memory unit 203;
(3) After step (2), the data in the upper 63 locations of the "ping-pong" memory unit 203 are fed into the 9-point FFT computing unit 205, which completes the 63-point FFT of this group, while the next group of 63 points enters the 7-point FFT computing unit 201.
The whole 63-point FFT computation is pipelined: data can be input without interruption at one point per clock, and results are likewise output at one point per clock.
Fig. 3 shows the circuit structure of the 60-point FFT arithmetic unit. Its operating principle is similar to that of the 63-point FFT arithmetic unit shown in Fig. 2 and is not repeated here.
As described above, the 63-point FFT is decomposed into a 7-point FFT and a 9-point FFT by the prime factor algorithm, and the 60-point FFT is decomposed into 5-point, 3-point and 4-point FFTs. The 3-, 5-, 7- and 9-point FFTs are computed with the WFTA, and the 4-point FFT is computed with the ordinary FFT algorithm.
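The prime factor decomposition described here can be sketched with the Good-Thomas index maps, which need no twiddle factors between stages as long as the factors are pairwise coprime. The code below is an illustrative software model with hypothetical names; the index maps follow the textbook form (a Ruritanian input map and a Chinese-remainder output map) and are not taken from the patent, which does not spell out its maps. The order in which the short transforms are applied does not change the result.

```python
import cmath
from math import prod

def dft(x):
    """Direct O(N^2) DFT used for the short transforms."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * t * k / n) for t in range(n))
            for k in range(n)]

def pfa(x, factors):
    """Good-Thomas prime factor DFT; factors must be pairwise coprime."""
    if len(factors) == 1:
        return dft(x)
    n1, n2 = factors[0], prod(factors[1:])
    n = n1 * n2
    assert len(x) == n
    # Input map: n = (n2*i1 + n1*i2) mod N.
    rows = [[x[(n2 * i1 + n1 * i2) % n] for i2 in range(n2)] for i1 in range(n1)]
    # Length-n2 transforms along each row (recursing over remaining factors).
    rows = [pfa(r, factors[1:]) for r in rows]
    # Length-n1 transforms along each column; no twiddle factors in between.
    cols = [dft([rows[i1][k2] for i1 in range(n1)]) for k2 in range(n2)]
    # Output map from the Chinese remainder theorem (pow(a, -1, m) needs Python 3.8+).
    t1, t2 = pow(n2, -1, n1), pow(n1, -1, n2)
    out = [0j] * n
    for k1 in range(n1):
        for k2 in range(n2):
            out[(k1 * n2 * t1 + k2 * n1 * t2) % n] = cols[k2][k1]
    return out
```

Calling `pfa(x, [7, 9])` models the 63-point unit and `pfa(x, [5, 3, 4])` the 60-point unit; in hardware the short `dft` calls would be replaced by the WFTA and 4-point units.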
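The WFTA for the 3-point FFT is:

$$
\begin{aligned}
a_1 &= x(1) + x(2), & a_2 &= x(1) - x(2), & a_3 &= x(0) + a_1, \\
m_1 &= \tfrac{1}{2}\,a_1, & m_2 &= 0.86603\,a_2, & c_1 &= x(0) - m_1, \\
X(0) &= a_3, & X(1) &= c_1 - j\,m_2, & X(2) &= c_1 + j\,m_2.
\end{aligned}
$$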
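The 3-point equations translate directly into code. The sketch below uses the rounded constant 0.86603 exactly as printed, so it agrees with an exact DFT only to about four decimal places; the function name `wfta3` is an illustrative choice.

```python
import cmath

def dft(x):
    """Direct DFT used as a reference."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * t * k / n) for t in range(n))
            for k in range(n)]

def wfta3(x):
    """3-point WFTA following the equations in the text."""
    x0, x1, x2 = x
    a1 = x1 + x2
    a2 = x1 - x2
    a3 = x0 + a1
    m1 = 0.5 * a1
    m2 = 0.86603 * a2          # sin(2*pi/3), rounded as in the text
    c1 = x0 - m1
    return [a3, c1 - 1j * m2, c1 + 1j * m2]
```

Only one non-trivial multiplication (by 0.86603) is needed, since the factor 1/2 is a shift in fixed-point hardware.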
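The ordinary algorithm for the 4-point FFT is:

$$
\begin{aligned}
a_1 &= x(0) + x(2), & c_1 &= a_1 + a_3, & X(0) &= c_1, \\
a_2 &= x(0) - x(2), & c_2 &= a_2 + a_4, & X(1) &= c_2, \\
a_3 &= x(1) + x(3), & c_3 &= a_1 - a_3, & X(2) &= c_3, \\
a_4 &= x(1) - x(3), & c_4 &= a_2 - a_4, & X(3) &= c_4.
\end{aligned}
$$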
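A 4-point butterfly can be sketched as follows. One caveat: the equations as printed combine a_2 and a_4 directly, whereas a 4-point DFT needs a factor of -j on a_4 for X(1) and +j for X(3). The sketch assumes that this factor is implicit in the text (in hardware it is only a swap of real and imaginary parts with a sign change, not a multiplication) or was lost in reproduction. The function name `fft4` is illustrative.

```python
import cmath

def dft(x):
    """Direct DFT used as a reference."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * t * k / n) for t in range(n))
            for k in range(n)]

def fft4(x):
    """4-point DFT from two stages of add/subtract, with no real multiplications."""
    x0, x1, x2, x3 = x
    a1 = x0 + x2
    a2 = x0 - x2
    a3 = x1 + x3
    a4 = x1 - x3
    c1 = a1 + a3
    c2 = a2 - 1j * a4   # assumption: the text prints a2 + a4, without the -j
    c3 = a1 - a3
    c4 = a2 + 1j * a4   # assumption: the text prints a2 - a4, without the +j
    return [c1, c2, c3, c4]
```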
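The WFTA for the 5-point FFT is:

$$
\begin{aligned}
a_1 &= x(1) + x(4), & m_1 &= 0.95106\,a_5, & c_1 &= x(0) - m_5, \\
a_2 &= x(1) - x(4), & m_2 &= 1.53884\,a_2, & c_2 &= c_1 + m_4, \\
a_3 &= x(2) + x(3), & m_3 &= 0.36327\,a_4, & c_3 &= c_1 - m_4, \\
a_4 &= x(2) - x(3), & m_4 &= 0.55902\,a_6, & c_4 &= m_1 - m_3, \\
a_5 &= a_2 + a_4, & m_5 &= \tfrac{1}{4}\,a_7, & c_5 &= m_2 - m_1, \\
a_6 &= a_1 - a_3, \\
a_7 &= a_1 + a_3, \\
a_8 &= x(0) + a_7,
\end{aligned}
$$

$$
\begin{aligned}
X(0) &= a_8, \\
X(1) &= c_2 - j\,c_4, & X(4) &= c_2 + j\,c_4, \\
X(2) &= c_3 - j\,c_5, & X(3) &= c_3 + j\,c_5.
\end{aligned}
$$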
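The 5-point equations can be checked numerically with a direct transcription. The sketch keeps the five-digit constants from the text, so agreement with an exact DFT is limited to roughly four decimal places; `wfta5` is an illustrative name.

```python
import cmath

def dft(x):
    """Direct DFT used as a reference."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * t * k / n) for t in range(n))
            for k in range(n)]

def wfta5(x):
    """5-point WFTA following the equations in the text."""
    x0, x1, x2, x3, x4 = x
    a1, a2 = x1 + x4, x1 - x4
    a3, a4 = x2 + x3, x2 - x3
    a5, a6, a7 = a2 + a4, a1 - a3, a1 + a3
    a8 = x0 + a7
    m1 = 0.95106 * a5
    m2 = 1.53884 * a2
    m3 = 0.36327 * a4
    m4 = 0.55902 * a6
    m5 = 0.25 * a7
    c1 = x0 - m5
    c2, c3 = c1 + m4, c1 - m4
    c4, c5 = m1 - m3, m2 - m1
    return [a8, c2 - 1j * c4, c3 - 1j * c5, c3 + 1j * c5, c2 + 1j * c4]
```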
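The WFTA for the 7-point FFT is:

$$
\begin{aligned}
a_1 &= x(1) + x(6), & m_1 &= 0.16667\,a_7, & c_1 &= x(0) - m_1, \\
a_2 &= x(1) - x(6), & m_2 &= 0.79016\,a_8, & c_2 &= c_1 + m_2 + m_3, \\
a_3 &= x(2) + x(5), & m_3 &= 0.05585\,a_9, & c_3 &= c_1 - m_2 - m_4, \\
a_4 &= x(2) - x(5), & m_4 &= 0.73430\,a_{10}, & c_4 &= c_1 - m_3 + m_4, \\
a_5 &= x(3) + x(4), & m_5 &= 0.44096\,a_{11}, & c_5 &= m_5 + m_6 - m_7, \\
a_6 &= x(3) - x(4), & m_6 &= 0.34087\,a_{12}, & c_6 &= m_5 - m_6 - m_8, \\
a_7 &= a_1 + a_3 + a_5, & m_7 &= 0.53397\,a_{13}, & c_7 &= -m_5 - m_7 - m_8, \\
a_8 &= a_1 - a_5, & m_8 &= 0.87484\,a_{14}, \\
a_9 &= -a_3 + a_5, \\
a_{10} &= -a_1 + a_3, \\
a_{11} &= a_2 + a_4 - a_6, \\
a_{12} &= a_2 + a_6, \\
a_{13} &= -a_4 - a_6, \\
a_{14} &= -a_2 + a_4, \\
a_{15} &= x(0) + a_7,
\end{aligned}
$$

$$
\begin{aligned}
X(0) &= a_{15}, \\
X(1) &= c_2 - j\,c_5, & X(6) &= c_2 + j\,c_5, \\
X(2) &= c_3 - j\,c_6, & X(5) &= c_3 + j\,c_6, \\
X(3) &= c_4 - j\,c_7, & X(4) &= c_4 + j\,c_7.
\end{aligned}
$$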
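The 7-point equations can likewise be transcribed and compared with a direct DFT. The constants are the rounded values from the text, so the comparison tolerance is loose; `wfta7` is an illustrative name.

```python
import cmath

def dft(x):
    """Direct DFT used as a reference."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * t * k / n) for t in range(n))
            for k in range(n)]

def wfta7(x):
    """7-point WFTA following the equations in the text."""
    x0, x1, x2, x3, x4, x5, x6 = x
    a1, a2 = x1 + x6, x1 - x6
    a3, a4 = x2 + x5, x2 - x5
    a5, a6 = x3 + x4, x3 - x4
    a7 = a1 + a3 + a5
    a8, a9, a10 = a1 - a5, -a3 + a5, -a1 + a3
    a11 = a2 + a4 - a6
    a12, a13, a14 = a2 + a6, -a4 - a6, -a2 + a4
    a15 = x0 + a7
    m1, m2, m3, m4 = 0.16667 * a7, 0.79016 * a8, 0.05585 * a9, 0.73430 * a10
    m5, m6, m7, m8 = 0.44096 * a11, 0.34087 * a12, 0.53397 * a13, 0.87484 * a14
    c1 = x0 - m1
    c2 = c1 + m2 + m3
    c3 = c1 - m2 - m4
    c4 = c1 - m3 + m4
    c5 = m5 + m6 - m7
    c6 = m5 - m6 - m8
    c7 = -m5 - m7 - m8
    return [a15, c2 - 1j * c5, c3 - 1j * c6, c4 - 1j * c7,
            c4 + 1j * c7, c3 + 1j * c6, c2 + 1j * c5]
```

Eight real-coefficient multiplications are used per 7-point transform, compared with 36 non-trivial complex multiplications in the direct sum.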
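The WFTA for the 9-point FFT is:

$$
\begin{aligned}
a_1 &= x(1) + x(8), & m_1 &= 0.19740\,a_9, & c_1 &= x(0) - m_7, \\
a_2 &= x(1) - x(8), & m_2 &= 0.56858\,a_{10}, & c_2 &= m_2 - m_3, \\
a_3 &= x(2) + x(7), & m_3 &= 0.37111\,a_{11}, & c_3 &= m_1 + m_3, \\
a_4 &= x(2) - x(7), & m_4 &= 0.54254\,a_{12}, & c_4 &= m_1 + m_2, \\
a_5 &= x(4) + x(5), & m_5 &= 0.10026\,a_{13}, & c_5 &= c_1 + c_2 - c_3, \\
a_6 &= x(4) - x(5), & m_6 &= 0.44228\,a_{14}, & c_6 &= c_1 + c_3 + c_4, \\
a_7 &= x(3) + x(6), & m_7 &= \tfrac{1}{2}\,a_7, & c_7 &= c_1 - c_2 - c_4, \\
a_8 &= x(3) - x(6), & m_8 &= 0.86603\,a_8, & c_8 &= m_4 - m_6, \\
a_9 &= -a_1 + a_5, & m_9 &= \tfrac{1}{2}\,a_{15}, & c_9 &= m_5 - m_6, \\
a_{10} &= a_1 - a_3, & m_{10} &= 0.86603\,a_{16}, & c_{10} &= m_4 - m_5, \\
a_{11} &= -a_3 + a_5, & & & c_{11} &= c_8 + c_9 + m_8, \\
a_{12} &= a_2 - a_6, & & & c_{12} &= c_8 + c_{10} - m_8, \\
a_{13} &= a_2 + a_4, & & & c_{13} &= -c_9 + c_{10} + m_8, \\
a_{14} &= -a_4 - a_6, & & & c_{14} &= x(0) + a_7 - m_9, \\
a_{15} &= a_1 + a_3 + a_5, \\
a_{16} &= a_2 - a_4 - a_6, \\
a_{17} &= x(0) + a_7 + a_{15},
\end{aligned}
$$

$$
\begin{aligned}
X(0) &= a_{17}, \\
X(1) &= c_5 - j\,c_{11}, & X(8) &= c_5 + j\,c_{11}, \\
X(2) &= c_6 - j\,c_{12}, & X(7) &= c_6 + j\,c_{12}, \\
X(3) &= c_{14} - j\,m_{10}, & X(6) &= c_{14} + j\,m_{10}, \\
X(4) &= c_7 - j\,c_{13}, & X(5) &= c_7 + j\,c_{13}.
\end{aligned}
$$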
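A numerical check of the 9-point scheme is sketched below, with two deviations from the printed text that are flagged as assumptions. First, the multiplier constants are derived from cosines and sines of 2π/9 instead of being copied, because the printed 0.19740 differs slightly from the value this derivation gives (about 0.19747); the other printed constants agree to the digits shown. Second, with a_6 = x(4) - x(5), the outputs X(3) and X(6) match a direct DFT only when a_16 = a_2 - a_4 + a_6, so the sketch uses that sign where the text prints a_2 - a_4 - a_6. The names `wfta9` and `K1` to `K6` are illustrative.

```python
import cmath
import math

def dft(x):
    """Direct DFT used as a reference."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * t * k / n) for t in range(n))
            for k in range(n)]

U = 2 * math.pi / 9
CU, C2U, SU, S2U = math.cos(U), math.cos(2 * U), math.sin(U), math.sin(2 * U)
K1 = (CU - C2U) / 3       # text: 0.19740 (derived value is about 0.19747)
K2 = (2 * CU + C2U) / 3   # text: 0.56858
K3 = (CU + 2 * C2U) / 3   # text: 0.37111
K4 = (SU + S2U) / 3       # text: 0.54254
K5 = (2 * SU - S2U) / 3   # text: 0.10026
K6 = (2 * S2U - SU) / 3   # text: 0.44228
S3 = math.sin(3 * U)      # text: 0.86603

def wfta9(x):
    """9-point WFTA following the structure in the text (see assumptions above)."""
    x0, x1, x2, x3, x4, x5, x6, x7, x8 = x
    a1, a2 = x1 + x8, x1 - x8
    a3, a4 = x2 + x7, x2 - x7
    a5, a6 = x4 + x5, x4 - x5
    a7, a8 = x3 + x6, x3 - x6
    a9, a10, a11 = -a1 + a5, a1 - a3, -a3 + a5
    a12, a13, a14 = a2 - a6, a2 + a4, -a4 - a6
    a15 = a1 + a3 + a5
    a16 = a2 - a4 + a6    # assumption: the text prints a2 - a4 - a6
    a17 = x0 + a7 + a15
    m1, m2, m3 = K1 * a9, K2 * a10, K3 * a11
    m4, m5, m6 = K4 * a12, K5 * a13, K6 * a14
    m7, m8, m9, m10 = 0.5 * a7, S3 * a8, 0.5 * a15, S3 * a16
    c1 = x0 - m7
    c2, c3, c4 = m2 - m3, m1 + m3, m1 + m2
    c5, c6, c7 = c1 + c2 - c3, c1 + c3 + c4, c1 - c2 - c4
    c8, c9, c10 = m4 - m6, m5 - m6, m4 - m5
    c11, c12, c13 = c8 + c9 + m8, c8 + c10 - m8, -c9 + c10 + m8
    c14 = x0 + a7 - m9
    return [a17, c5 - 1j * c11, c6 - 1j * c12, c14 - 1j * m10, c7 - 1j * c13,
            c7 + 1j * c13, c14 + 1j * m10, c6 + 1j * c12, c5 + 1j * c11]
```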
Based on the algorithms above, the invention designs state transitions suited to a pipelined structure: one data point is input in every clock cycle, and one result is output in every clock cycle. The state transitions for computing the 3-, 4-, 5-, 7- and 9-point FFTs are shown in Fig. 4 to Fig. 8, and the operations performed in each state of those figures are listed in Tables 1 to 5 below:
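Table 1: States of the 3-point FFT shown in Fig. 4

| State | Operation |
| --- | --- |
| ST1 | Input x(0) |
| ST2 | Input x(1) |
| ST3 | Input x(2); a_1 = x(1) + x(2), a_2 = x(1) - x(2) |
| ST4 | Input x'(0); m_1 = 0.5 × a_1, m_2 = 0.86603 × a_2, a_3 = x(0) + a_1; output X(0) |
| ST5 | Input x'(1); c_1 = x(0) - m_1; output X(1) |
| ST6 | Input x'(2); a_1' = x'(1) + x'(2), a_2' = x'(1) - x'(2); output X(2) |

In Tables 1 to 5, primed symbols such as x'(0) belong to the next group of input points.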
Table 2: States of the 4-point FFT shown in Fig. 5

| State | Operation |
| --- | --- |
| ST1 | Input x(0) |
| ST2 | Input x(1) |
| ST3 | Input x(2); a_1 = x(0) + x(2), a_2 = x(0) - x(2) |
| ST4 | Input x(3); a_3 = x(1) + x(3), a_4 = x(1) - x(3) |
| ST5 | Input x'(0); c_1 = a_1 + a_3, c_3 = a_1 - a_3; output X(0) |
| ST6 | Input x'(1); c_2 = a_2 + a_4, c_4 = a_2 - a_4; output X(1) |
| ST7 | Input x'(2); a_1' = x'(0) + x'(2), a_2' = x'(0) - x'(2); output X(2) |
| ST8 | Input x'(3); a_3' = x'(1) + x'(3), a_4' = x'(1) - x'(3); output X(3) |
Table 3: States of the 5-point FFT shown in Fig. 6

| State | Operation |
| --- | --- |
| ST1 | Input x(0) |
| ST2 | Input x(1) |
| ST3 | Input x(2) |
| ST4 | Input x(3); a_3 = x(2) + x(3), a_4 = x(2) - x(3) |
| ST5 | Input x(4); a_1 = x(1) + x(4), a_2 = x(1) - x(4) |
| ST6 | Input x'(0); a_5 = a_2 + a_4, a_6 = a_1 - a_3, a_7 = a_1 + a_3, m_2 = 1.53884 a_2, m_3 = 0.36327 a_4, m_5 = 0.25 a_7 |
| ST7 | Input x'(1); c_1 = x(0) - m_5, a_8 = x(0) + a_7, m_1 = 0.95106 a_5, m_4 = 0.55902 a_6; output X(0) |
| ST8 | Input x'(2); m_2 = 1.53884 a_2, m_3 = 0.36327 a_4, c_2 = c_1 + m_4, c_4 = m_1 - m_3; output X(1) |
| ST9 | Input x'(3); c_3 = c_1 - m_4, c_5 = m_2 - m_1, a_3' = x'(2) + x'(3), a_4' = x'(2) - x'(3); output X(2) |
| ST10 | Input x'(4); a_1' = x'(1) + x'(4), a_2' = x'(1) - x'(4); output X(3) |
| ST11 | Input x''(0); a_5' = a_2' + a_4', a_6' = a_1' - a_3', a_7' = a_1' + a_3', m_2' = 1.53884 a_2', m_3' = 0.36327 a_4', m_5' = 0.25 a_7'; output X(4) |
Table 4: States of the 7-point FFT shown in Fig. 7

| State | Operation |
| --- | --- |
| ST1 | Input x(0) |
| ST2 | Input x(1) |
| ST3 | Input x(2) |
| ST4 | Input x(3) |
| ST5 | Input x(4); a_5 = x(3) + x(4), a_6 = x(3) - x(4) |
| ST6 | Input x(5); a_3 = x(2) + x(5), a_4 = x(2) - x(5) |
| ST7 | Input x(6); a_1 = x(1) + x(6), a_2 = x(1) - x(6) |
| ST8 | Input x'(0); a_7 = a_1 + a_3 + a_5, a_8 = a_1 - a_5, a_9 = -a_3 + a_5, a_10 = -a_1 + a_3, a_11 = a_2 + a_4 - a_6, a_12 = a_2 + a_6, a_13 = -a_4 - a_6, a_14 = -a_2 + a_4 |
| ST9 | Input x'(1); a_15 = x(0) + a_7, m_1 = 0.16667 a_7, m_2 = 0.79016 a_8, m_3 = 0.05585 a_9, m_4 = 0.73430 a_10, m_5 = 0.44096 a_11, m_6 = 0.34087 a_12, m_7 = 0.53397 a_13, m_8 = 0.87484 a_14 |
| ST10 | Input x'(2); c_1 = x(0) - m_1; output X(0) |
| ST11 | Input x'(3); c_2 = c_1 + m_2 + m_3, c_3 = c_1 - m_2 - m_4, c_4 = c_1 - m_3 + m_4, c_5 = m_5 + m_6 - m_7, c_6 = m_5 - m_6 - m_8, c_7 = -m_5 - m_7 - m_8; output X(1) |
| ST12 | Input x'(4); a_5' = x'(3) + x'(4), a_6' = x'(3) - x'(4); output X(2) |
| ST13 | Input x'(5); a_3' = x'(2) + x'(5), a_4' = x'(2) - x'(5); output X(3) |
| ST14 | Input x'(6); a_1' = x'(1) + x'(6), a_2' = x'(1) - x'(6); output X(4) |
| ST15 | Input x''(0); a_7' = a_1' + a_3' + a_5', a_8' = a_1' - a_5', a_9' = -a_3' + a_5', a_10' = -a_1' + a_3', a_11' = a_2' + a_4' - a_6', a_12' = a_2' + a_6', a_13' = -a_4' - a_6', a_14' = -a_2' + a_4'; output X(5) |
| ST16 | Input x''(1); a_15' = x'(0) + a_7', m_1' = 0.16667 a_7', m_2' = 0.79016 a_8', m_3' = 0.05585 a_9', m_4' = 0.73430 a_10', m_5' = 0.44096 a_11', m_6' = 0.34087 a_12', m_7' = 0.53397 a_13', m_8' = 0.87484 a_14'; output X(6) |
Table 5: States of the 9-point FFT shown in Fig. 8

| State | Operation |
| --- | --- |
| ST1 | Input x(0) |
| ST2 | Input x(1) |
| ST3 | Input x(2) |
| ST4 | Input x(3) |
| ST5 | Input x(4) |
| ST6 | Input x(5); a_5 = x(4) + x(5), a_6 = x(4) - x(5) |
| ST7 | Input x(6); a_7 = x(3) + x(6), a_8 = x(3) - x(6) |
| ST8 | Input x(7); a_3 = x(2) + x(7), a_4 = x(2) - x(7) |
| ST9 | Input x(8); a_1 = x(1) + x(8), a_2 = x(1) - x(8) |
| ST10 | Input x'(0); m_8 = 0.86603 a_8, a_9 = -a_1 + a_5, a_10 = a_1 - a_3, a_11 = -a_3 + a_5, a_12 = a_2 - a_6, a_13 = a_2 + a_4, a_14 = -a_4 - a_6, a_15 = a_1 + a_3 + a_5, a_16 = a_2 - a_4 - a_6 |
| ST11 | Input x'(1); m_7 = 0.5 a_7, m_1 = 0.19740 a_9, m_2 = 0.56858 a_10, m_3 = 0.37111 a_11, m_4 = 0.54254 a_12, m_5 = 0.10026 a_13, m_6 = 0.44228 a_14, m_9 = 0.5 a_15, m_10 = 0.86603 a_16, a_17 = x(0) + a_7 + a_15 |
| ST12 | Input x'(2); c_1 = x(0) - m_7, c_2 = m_2 - m_3, c_3 = m_1 + m_3, c_4 = m_1 + m_2; output X(0) |
| ST13 | Input x'(3); c_5 = c_1 + c_2 - c_3, c_6 = c_1 + c_3 + c_4, c_7 = c_1 - c_2 - c_4, c_11 = c_8 + c_9 + m_8, c_12 = c_8 + c_10 - m_8, c_13 = -c_9 + c_10 + m_8, c_14 = x(0) + a_7 - m_9; output X(1) |
| ST14 | Input x'(4); output X(2) |
| ST15 | Input x'(5); a_5' = x'(4) + x'(5), a_6' = x'(4) - x'(5); output X(3) |
| ST16 | Input x'(6); a_7' = x'(3) + x'(6), a_8' = x'(3) - x'(6); output X(4) |
| ST17 | Input x'(7); a_3' = x'(2) + x'(7), a_4' = x'(2) - x'(7); output X(5) |
| ST18 | Input x'(8); a_1' = x'(1) + x'(8), a_2' = x'(1) - x'(8); output X(6) |
| ST19 | Input x''(0); m_8' = 0.86603 a_8', a_9' = -a_1' + a_5', a_10' = a_1' - a_3', a_11' = -a_3' + a_5', a_12' = a_2' - a_6', a_13' = a_2' + a_4', a_14' = -a_4' - a_6', a_15' = a_1' + a_3' + a_5', a_16' = a_2' - a_4' - a_6'; output X(7) |
| ST20 | Input x''(1); m_7' = 0.5 a_7', m_1' = 0.19740 a_9', m_2' = 0.56858 a_10', m_3' = 0.37111 a_11', m_4' = 0.54254 a_12', m_5' = 0.10026 a_13', m_6' = 0.44228 a_14', m_9' = 0.5 a_15', m_10' = 0.86603 a_16', a_17' = x'(0) + a_7' + a_15'; output X(8) |
As Tables 1 to 5 show, this state design makes a pipelined FFT computation possible: one data point is input in every clock cycle, and one result is output in every clock cycle. Fig. 4 shows the state transition diagram for computing the 3-point FFT in this scheme. It operates as follows:
(1) "IDLE" is the idle state, in which no operation is performed. The state machine enters this state when the reset signal is asserted. "wfta3in_valid" is the input-valid signal; when its value is 1, the state machine moves to the next state;
(2) In states "ST1" and "ST2", registers latch the inputs x(0) and x(1);
(3) In state "ST3", x(2) is input and combined with the previously stored x(1) according to the WFTA; the resulting intermediate values are stored in registers;
(4) In state "ST4", the intermediate values stored in step (3) are processed further according to the WFTA and the new intermediate results are stored; at the same time the first result X(0) of the 3-point FFT is output and the output-valid signal "wfta3out_valid" is asserted. The inputs of the next group of 3 points start to arrive in this state, and the first of them, x(0), is latched;
(5) In state "ST5", the intermediate data stored in step (4) are processed according to the WFTA to give the second result X(1), which is output with "wfta3out_valid" asserted; at the same time the second input x(1) of the next group is latched;
(6) State "ST6" is similar to "ST5": after the corresponding operations the last result X(2) is output with "wfta3out_valid" asserted, and the last input x(2) of the next group is latched;
(7) After step (6), as long as data keep arriving, the state machine cycles through the three states "ST4", "ST5" and "ST6" and continuously outputs 3-point FFT results, which realizes the pipelined structure.
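The steady-state loop over ST4, ST5 and ST6 can be imitated with a clock-by-clock software model. This is a behavioural sketch, not the circuit: it assumes the stream length is a multiple of 3, pads three idle clocks at the end to flush the last block, and keeps x(0) of the block being finished in a separate register (`cur_x0`), which is a modelling assumption because the new x(0) is latched while the old one is still needed. The function name `wfta3_stream` is hypothetical.

```python
def wfta3_stream(samples):
    """One input per clock; after a 3-clock latency, one output per clock."""
    assert len(samples) % 3 == 0
    s = 0.86603                      # constant from Table 1
    out = []
    nxt = [0j, 0j, 0j]               # input registers for the block being received
    cur_x0 = 0j                      # x(0) of the block being finished (assumed register)
    a1 = a2 = m1 = m2 = c1 = 0j
    stream = list(samples) + [0j] * 3   # three idle clocks flush the final block
    for clk, xin in enumerate(stream):
        phase = clk % 3
        busy = clk >= 3              # from ST4 onward a previous block is in flight
        if phase == 0:               # ST1 / ST4
            if busy:
                m1, m2 = 0.5 * a1, s * a2
                out.append(cur_x0 + a1)          # X(0) = a3
            nxt[0] = xin
        elif phase == 1:             # ST2 / ST5
            if busy:
                c1 = cur_x0 - m1
                out.append(c1 - 1j * m2)         # X(1)
            nxt[1] = xin
        else:                        # ST3 / ST6
            if busy:
                out.append(c1 + 1j * m2)         # X(2)
            nxt[2] = xin
            a1, a2 = nxt[1] + xin, nxt[1] - xin  # prepare the next block
            cur_x0 = nxt[0]
    return out
```

Feeding six samples produces six outputs: the first three are the transform of the first block and the last three the transform of the second, each delayed by three clocks relative to its inputs.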
Fig. 5 to Fig. 8 are the pipeline-oriented state diagrams for computing the 4-, 5-, 7- and 9-point FFTs, respectively. Their operating principle is similar to that of the 3-point FFT described above and is not repeated here.
Fig. 9 shows the circuit structure that implements the 3-, 4-, 5-, 7- and 9-point FFTs above. The state machine 901 implements the states shown in Fig. 4 to Fig. 8; the register group 904 stores the inputs and intermediate variables of the FFT computation; the adder group 902 and the multiplier group 903 perform the corresponding operations. The multiplier group 903 can be implemented with Booth-structure multipliers or with constant-coefficient multipliers. The hardware resources used by the invention are listed in Table 6 below:
Table 6: Hardware resources used by each sub-FFT arithmetic unit
[Table 6 is reproduced only as images in the source: Figure S07144716120070907D000121 and Figure S07144716120070907D000131]
In summary, the invention uses small-point FFT computing units with a pipelined design together with "ping-pong" memory units to realize a fully pipelined 3780-point FFT processor. In the whole design, four memory units of depth 3780, two of depth 63 and two of depth 60 form the four "ping-pong" memory structures, and five pipelined sub-FFT arithmetic units are used: 3-point, 4-point, 5-point, 7-point and 9-point. With this design, input data are transformed continuously in pipelined fashion and results are output continuously, which raises the data throughput of the system. The structure also reduces the complexity of address mapping, the hardware resources required and the power consumption of the processor. Finally, the invention has been verified on an FPGA and meets the requirements of the national standard for digital terrestrial television transmission.

Claims (2)

1. 3780 of a pipeline organization FFT fast fourier transform processors is characterized in that being made up of following unit:
A FFT/IFFT control unit (101), its processor controls is calculated FFT or IFFT;
First read/write address and control state machine (102), it produces the read/write address of memory cell, controls the operating state of each unit;
An input conjugate unit (103), its an input input is the input data that will calculate, another input links to each other with the output of above-mentioned FFT/IFFT control unit (101);
First " table tennis " structure memory cell (104), its input links to each other with the output of above-mentioned input conjugate unit (103), and another input links to each other with the output of above-mentioned first read/write address with control state machine (102);
First selector (105), its input links to each other with the output of above-mentioned first " table tennis " structure memory cell (104);
One 63 FFT computing units (106), its input links to each other with the output of above-mentioned first selector (105);
A twiddle factor storage ROM (107), its input links to each other with the output of above-mentioned first read/write address with control state machine (102);
A twiddle factor map unit (108), its input links to each other with the output of above-mentioned twiddle factor storage ROM (107);
A complex multiplication unit (109), its input links to each other with the output of above-mentioned 63 FFT computing units (106), and another input links to each other with the output of above-mentioned twiddle factor map unit (108);
Second " table tennis " structure memory cell (110), its input links to each other with the output of above-mentioned complex multiplication unit (109), and another input links to each other with the output of above-mentioned first read/write address with control state machine (102);
Second selector (111), its input links to each other with the output of above-mentioned second " table tennis " structure memory cell (110);
One 60 FFT computing units (112), its input links to each other with the output of above-mentioned second selector (111);
An output conjugate unit (113), its input links to each other with the output of above-mentioned FFT/IFFT control unit (101), and another input links to each other with the output of above-mentioned 60 FFT computing units, and it is output as the result of calculation of whole 3780 FFT; Concrete processing procedure is:
(1) The FFT/IFFT control unit (101) determines, according to an input signal, whether the whole processor performs an FFT operation or an IFFT operation;
(2) In the first cycle, after the data pass through the input conjugate unit (103), the first read/write address and control state machine (102) controls the input data to be stored in the upper 3780 locations of the first ping-pong structure memory unit (104);
(3) In the second cycle, the first read/write address and control state machine (102) controls the input data to be stored in the lower 3780 locations of the first ping-pong structure memory unit (104); at the same time, the 3780 inputs stored in the upper 3780 locations of the first ping-pong structure memory unit (104) during the previous cycle are fed through the first selector (105) into the 63-point FFT computation unit (106), which computes 63-point FFTs;
(4) The twiddle factor storage ROM (107) outputs the corresponding twiddle factor according to the address provided by the first read/write address and control state machine (102); after conversion by the twiddle factor mapping unit (108), the resulting twiddle factor is multiplied by the output data of the 63-point FFT computation unit (106) of step (3), and the result is stored, under the control of the first read/write address and control state machine (102), in the upper 3780 locations of the second ping-pong structure memory unit (110);
(5) After step (4) has been performed 60 times, the first group of 3780 data have all passed through the 63-point FFT computation and are stored in the upper 3780 locations of the second ping-pong structure memory unit (110);
(6) After step (5), the results in the upper 3780 locations of the second ping-pong structure memory unit (110) pass through the second selector (111) and are input to the 60-point FFT computation unit (112) for computation;
(7) After step (6) has been performed 63 times, the 3780-point FFT computation of the first group is complete, and the result is output after passing through the output conjugate unit (113);
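As an illustration of steps (1) to (7), the following Python sketch mirrors the data flow: optional input conjugation, 60 transforms of 63 points, twiddle multiplication, 63 transforms of 60 points, and optional output conjugation. It is a minimal software model under stated assumptions, not the claimed hardware: the function names `dft`, `fft_two_stage` and `transform` are hypothetical, the index ordering (`x[n2*a + b]` on input, `X[k1 + n1*k2]` on output) is one standard Cooley-Tukey choice that the claim does not specify, a direct DFT stands in for the 63-point and 60-point units, and no 1/N scaling is applied for the IFFT because the claim mentions none.

```python
import cmath


def dft(x):
    """Direct O(n^2) DFT; a stand-in for the 63- and 60-point FFT units."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * t * k / n) for t in range(n))
            for k in range(n)]


def fft_two_stage(x, n1=63, n2=60):
    """N = n1*n2 split: n2 DFTs of n1 points, twiddles, then n1 DFTs of n2 points."""
    n = n1 * n2
    assert len(x) == n
    # Stage 1 (steps 3 and 5): n2 transforms of n1 points on x[n2*a + b].
    stage1 = [dft([x[n2 * a + b] for a in range(n1)]) for b in range(n2)]
    # Step 4: multiply by the twiddle factor W_N^(b*k1).
    for b in range(n2):
        for k1 in range(n1):
            stage1[b][k1] *= cmath.exp(-2j * cmath.pi * b * k1 / n)
    # Stage 2 (steps 6 and 7): n1 transforms of n2 points, one per k1.
    out = [0j] * n
    for k1 in range(n1):
        col = dft([stage1[b][k1] for b in range(n2)])
        for k2 in range(n2):
            out[k1 + n1 * k2] = col[k2]
    return out


def transform(x, inverse=False, n1=63, n2=60):
    """FFT, or unscaled IFFT obtained by conjugating the input and the output."""
    if inverse:
        x = [v.conjugate() for v in x]          # input conjugate unit (103)
    y = fft_two_stage(x, n1, n2)
    if inverse:
        y = [v.conjugate() for v in y]          # output conjugate unit (113)
    return y
```

With the defaults `n1=63, n2=60` the model handles 3780 points; small sizes such as `n1=7, n2=6` are convenient for checking the result against a direct DFT.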
The 63-point FFT computation unit (106) consists of the following units:
A 7-point FFT computation unit (201), whose input receives the input data of this unit;
A second read/write address and control state machine (202), which generates the read/write addresses required by this unit and controls the state of this unit;
A third ping-pong structure memory unit (203), one input of which is connected to the output of the above-mentioned 7-point FFT computation unit (201), and the other input of which is connected to the output of the above-mentioned second read/write address and control state machine (202);
A fourth selector (204), whose input is connected to the output of the above-mentioned third ping-pong structure memory unit (203);
A 9-point FFT computation unit (205), whose input is connected to the output of the above-mentioned fourth selector (204), and whose output data are the output data of this unit; the specific processing procedure is:
(A) In one cycle, data are input to the 7-point FFT computation unit (201), and the results are stored, under the control of the second read/write address and control state machine (202), in the upper 63 locations of the third ping-pong structure memory unit (203); at the same time, the 7-point FFT results of the previous cycle, stored in the lower 63 locations of the third ping-pong structure memory unit (203), are input through the fourth selector (204) to the 9-point FFT computation unit (205);
(B) After step (A) has been performed 9 times, all 63 inputs have passed through the 7-point FFT computation and are stored in the upper 63 locations of the third ping-pong structure memory unit (203);
(C) After step (B), the data in the upper 63 locations of the third ping-pong structure memory unit (203) are input to the 9-point FFT computation unit (205) for computation, which completes the 63-point FFT of one group; at the same time, another group of 63 data points is input to the 7-point FFT computation unit (201) for computation;
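The 63-point unit cascades the 7-point and 9-point stages with no twiddle multiplier between them. One way such a twiddle-free cascade can work, since 7 and 9 are coprime, is the Good-Thomas prime-factor mapping. The Python sketch below is an assumption-laden illustration: the claim does not state the index mapping, the names `dft` and `pfa_fft_63` are hypothetical, and a direct DFT stands in for the 7-point and 9-point units.

```python
import cmath


def dft(x):
    """Direct O(n^2) DFT; a stand-in for the 7- and 9-point FFT units."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * t * k / n) for t in range(n))
            for k in range(n)]


def pfa_fft_63(x, n1=7, n2=9):
    """63-point DFT as nine 7-point DFTs followed by seven 9-point DFTs."""
    n = n1 * n2
    assert len(x) == n
    # Input map (assumed): sample index (n2*a + n1*b) mod N.
    # Steps (A)/(B): nine 7-point transforms, one per value of b.
    stage1 = [dft([x[(n2 * a + n1 * b) % n] for a in range(n1)])
              for b in range(n2)]
    # Step (C): seven 9-point transforms, one per k1, with no twiddle factors.
    stage2 = [dft([stage1[b][k1] for b in range(n2)]) for k1 in range(n1)]
    # Output map via the Chinese remainder theorem: k -> (k mod 7, k mod 9).
    return [stage2[k % n1][k % n2] for k in range(n)]
```

The result agrees with a direct 63-point DFT because the factor W_63^(nk) splits exactly into W_7^(a*k) times W_9^(b*k) under this input map.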
The 60-point FFT computation unit (112) consists of the following units:
A 5-point FFT computation unit (301), whose input receives the input data of this unit;
A third read/write address and control state machine (302), which generates the read/write addresses required by this unit and controls the state of this unit;
A fourth ping-pong structure memory unit (303), one input of which is connected to the output of the above-mentioned 5-point FFT computation unit (301), and the other input of which is connected to the output of the above-mentioned third read/write address and control state machine (302);
A fifth selector (304), whose input is connected to the output of the above-mentioned fourth ping-pong structure memory unit (303);
A 3-point FFT computation unit (305), whose input is connected to the output of the above-mentioned fifth selector (304);
A 4-point FFT computation unit (306), whose input is connected to the output of the above-mentioned 3-point FFT computation unit (305), and whose output is the output data of this unit;
The specific processing procedure is the same as that of the 63-point FFT computation unit (106).
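The 5-point, 3-point and 4-point cascade of the 60-point unit can be modelled in the same spirit. The sketch below assumes a three-factor Good-Thomas mapping, which is possible because 5, 3 and 4 are pairwise coprime; the mapping and the names `dft` and `pfa_fft_60` are illustrative assumptions and are not taken from the claim.

```python
import cmath
from itertools import product


def dft(x):
    """Direct O(n^2) DFT; a stand-in for the 5-, 3- and 4-point FFT units."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * t * k / n) for t in range(n))
            for k in range(n)]


def pfa_fft_60(x):
    """60-point DFT as 5-point, then 3-point, then 4-point stages, no twiddles."""
    assert len(x) == 60
    # Assumed input map: 12 = 60/5, 20 = 60/3, 15 = 60/4.
    g = {(a, b, c): x[(12 * a + 20 * b + 15 * c) % 60]
         for a, b, c in product(range(5), range(3), range(4))}
    # 5-point stage along index a (12 transforms).
    for b, c in product(range(3), range(4)):
        col = dft([g[a, b, c] for a in range(5)])
        for a in range(5):
            g[a, b, c] = col[a]
    # 3-point stage along index b (20 transforms).
    for a, c in product(range(5), range(4)):
        col = dft([g[a, b, c] for b in range(3)])
        for b in range(3):
            g[a, b, c] = col[b]
    # 4-point stage along index c (15 transforms).
    for a, b in product(range(5), range(3)):
        col = dft([g[a, b, c] for c in range(4)])
        for c in range(4):
            g[a, b, c] = col[c]
    # Output map via the Chinese remainder theorem.
    return [g[k % 5, k % 3, k % 4] for k in range(60)]
```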
2. The pipelined 3780-point fast Fourier transform (FFT) processor according to claim 1, characterized in that each of the 3-point, 4-point, 5-point, 7-point and 9-point FFT computation units consists of the following units:
A state machine (901), which controls the state transitions of this unit;
An adder group (902), consisting of N adders, where the specific value of N is determined by the number of points being computed;
A multiplier group (903), consisting of N multipliers, where the specific value of N is determined by the number of points being computed;
A register group (904), used to store intermediate data of the computation;
A sixth selector (906), controlled by the state machine (901) and connected respectively to the adder group (902), the multiplier group (903) and the register group (904).
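To make the roles of the adder group and the multiplier group concrete, the following Python sketch computes a 3-point DFT with a handful of additions and two constant multiplications. It is a common textbook reduction shown under assumptions: the claim does not say which small-point algorithm the units use, and the name `dft3` and the intermediate variable names are hypothetical.

```python
import math

SIN_2PI_3 = math.sin(2 * math.pi / 3)  # constant held by the multiplier group


def dft3(x0, x1, x2):
    """3-point DFT built from additions and two constant multiplications."""
    s = x1 + x2                # adder
    d = x1 - x2                # adder
    y0 = x0 + s                # adder: output bin 0
    t = x0 - 0.5 * s           # multiply by 1/2 (a shift in fixed point), adder
    m = -1j * SIN_2PI_3 * d    # real constant multiply; -j is a swap and a negation
    return y0, t + m, t - m    # intermediate values t and m would sit in registers
```

In a hardware unit of the kind claimed, the state machine would sequence these operations over the shared adders and multipliers; the sketch only shows the arithmetic.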
CN2007100447161A 2007-08-09 2007-08-09 3780-point quick Fourier transformation processor of pipelining structure Expired - Fee Related CN101136891B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2007100447161A CN101136891B (en) 2007-08-09 2007-08-09 3780-point quick Fourier transformation processor of pipelining structure

Publications (2)

Publication Number Publication Date
CN101136891A CN101136891A (en) 2008-03-05
CN101136891B true CN101136891B (en) 2011-12-28

Family

ID=39160725

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2007100447161A Expired - Fee Related CN101136891B (en) 2007-08-09 2007-08-09 3780-point quick Fourier transformation processor of pipelining structure

Country Status (1)

Country Link
CN (1) CN101136891B (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101847986B (en) * 2009-03-27 2012-06-06 中兴通讯股份有限公司 Circuit and method for realizing FFT/IFFT conversion
CN102073620B (en) * 2009-11-20 2013-05-29 扬智电子科技(上海)有限公司 Fast Fourier converter, reverse fast Fourier converter and method thereof
CN102104773B (en) * 2009-12-18 2013-03-20 上海华虹集成电路有限责任公司 Radix-4 module of FFT (Fast Fourier Transform)/IFFT (Inverse Fast Fourier Transform) processor for realizing variable data number
CN102238348B (en) * 2010-04-20 2014-02-05 上海华虹集成电路有限责任公司 Data amount-variable radix-4 module for fast Fourier transform (FFT)/inverse fast Fourier transform (IFFT) processor
CN102567282B (en) * 2010-12-27 2016-03-30 北京国睿中数科技股份有限公司 In general dsp processor, FFT calculates implement device and method
CN102611667B (en) * 2011-01-25 2016-06-15 深圳市中兴微电子技术有限公司 Stochastic accessing detection FFT/IFFT treatment process and device
CN102708092B (en) * 2012-05-21 2016-01-20 复旦大学 A kind of iteration of maps method realizing hybrid base FFT final stage and reorder
CN103516656B (en) * 2012-06-29 2018-03-27 中兴通讯股份有限公司 Inverse fast fourier transform implementation method and device
CN102880592A (en) * 2012-10-09 2013-01-16 苏州威士达信息科技有限公司 High-precision processing device and high-precision processing method for 3780-point FFT (fast Fourier transform) by sequential output
CN103810144B (en) * 2012-11-08 2018-12-07 无锡汉兴电子有限公司 A kind of prime length FFT/IFFT method and apparatus
CN103412851A (en) * 2013-07-30 2013-11-27 复旦大学 High-precision and low-power-consumption FFT (fast Fourier transform) processor
CN113591022A (en) * 2021-07-02 2021-11-02 星思连接(上海)半导体有限公司 Read-write scheduling processing method and device capable of decomposing data

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1348141A (en) * 2001-11-23 2002-05-08 清华大学 Discrete 3780-point Fourier transformation processor system and its structure

Also Published As

Publication number Publication date
CN101136891A (en) 2008-03-05

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20111228

Termination date: 20140809

EXPY Termination of patent right or utility model