CN1064507C - IC processing device able to actuate cyclically discrete cosine transform and the inverse transform thereof - Google Patents

IC processing device able to actuate cyclically discrete cosine transform and the inverse transform thereof

Info

Publication number
CN1064507C
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
CN97102599A
Other languages
Chinese (zh)
Other versions
CN1169083A (en)
Inventor
徐荣富
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Winbond Electronics Corp
Original Assignee
Winbond Electronics Corp
Application filed by Winbond Electronics Corp
Priority to CN97102599A
Publication of CN1169083A
Application granted
Publication of CN1064507C
Anticipated expiration
Expired - Lifetime

Abstract

The present invention relates to an integrated circuit processor that executes the discrete cosine transform and its inverse (DCT/IDCT) cyclically. The processor comprises a butterfly arithmetic unit that performs butterfly operations; a multiplication unit that performs intrinsic multiplication; an auxiliary adder-subtractor that can be combined with the multiplication unit to perform pre-added multiplication or post-subtracted multiplication; and a storage unit that holds intermediate results during the computation. Two passes of one-dimensional DCT/IDCT are executed cyclically, and each pass cyclically executes six rounds of alternating butterfly and multiplication operations: three rounds of butterfly operations, one round of intrinsic multiplication, and two rounds of multiplication assisted by addition or subtraction.

Description

Cyclically executable discrete cosine transform and inverse transform integrated circuit processor
The present invention relates to an integrated circuit processor, and in particular to an integrated circuit processor that executes the discrete cosine transform and its inverse cyclically.
The discrete cosine transform and its inverse (Discrete Cosine Transform/Inverse Discrete Cosine Transform, hereinafter DCT/IDCT) are used, respectively, in the compression and decompression of digital image data. In digital image compression, an image is usually subdivided into many blocks of 8 × 8 pixels, and a DCT is applied to each block in turn to convert it to frequency-domain data. Decompression passes the frequency-domain data through the IDCT to restore the pixel data.
A two-dimensional DCT/IDCT can be performed by first applying a one-dimensional transform along the rows (or columns) and then applying a one-dimensional transform along the columns (or rows). The one-dimensional DCT of any row or column of an 8 × 8 block can be expressed as:
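The relation itself is not reproduced in this text. Assuming the standard 8-point DCT-II was intended (an assumption, since the original equation is not available here, and the patent may use a different normalization), it would read as follows, with S(m) and F(k) as defined in the next paragraph and C(k) a normalization factor introduced only for this sketch:

```latex
F(k) = \frac{C(k)}{2} \sum_{m=0}^{7} S(m)\,\cos\frac{(2m+1)k\pi}{16},
\qquad k = 0, 1, \dots, 7,
\qquad
C(k) =
\begin{cases}
  1/\sqrt{2}, & k = 0 \\
  1,          & k \neq 0
\end{cases}
```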
The relation above consists of a series of multiplications and accumulations, where S(m) is the pixel data in the spatial domain and F(k) is the transformed data in the frequency domain. A fast algorithm can be derived from this relation so that a one-dimensional DCT requires only 13 multiplications and 29 additions and subtractions; its flow is shown in Fig. 1. Three kinds of operation are defined in this flow. Besides intrinsic multiplication, there are two combined operations: the butterfly operation and pre-added multiplication. Fig. 2 shows the butterfly operation, Fig. 3 intrinsic multiplication, and Fig. 4 pre-added multiplication. A further combined operation, post-subtracted multiplication, is shown in Fig. 5 and is used in the IDCT flow. The one-dimensional DCT flow therefore executes 12 butterfly operations, 5 pre-added multiplications, and 8 intrinsic multiplications in total, which can be divided into six rounds executed in order:
Round 1: perform 4 butterfly operations.
Round 2: perform 2 pre-added multiplications.
Round 3: perform 4 more butterfly operations.
Round 4: perform 3 pre-added multiplications.
Round 5: perform 4 more butterfly operations.
Round 6: perform 8 intrinsic multiplications.
This completes the one-dimensional DCT. Reversing the one-dimensional DCT flow yields the one-dimensional IDCT flow, shown in Fig. 6. Likewise, the one-dimensional IDCT flow can be divided into six rounds: round 1 performs intrinsic multiplication, rounds 2, 4, and 6 perform butterfly operations, and rounds 3 and 5 perform post-subtracted multiplication.
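The six-round DCT breakdown can be cross-checked against the stated totals of 13 multiplications and 29 additions and subtractions. The sketch below is only a tally, not the algorithm itself. The names `DCT_ROUNDS`, `OP_COST`, and `tally` are hypothetical, and the per-operation costs assume that a butterfly is one addition plus one subtraction and that a pre-added multiplication is one addition plus one multiplication.

```python
# Rounds of the 1-D DCT fast algorithm as listed in the text: (operation, count).
DCT_ROUNDS = [
    ("butterfly", 4),
    ("pre_added_mul", 2),
    ("butterfly", 4),
    ("pre_added_mul", 3),
    ("butterfly", 4),
    ("intrinsic_mul", 8),
]

# Assumed cost of one operation as (multiplications, additions/subtractions).
OP_COST = {
    "butterfly": (0, 2),      # one addition and one subtraction
    "pre_added_mul": (1, 1),  # one addition followed by one multiplication
    "intrinsic_mul": (1, 0),  # one multiplication
}


def tally(rounds):
    """Return total (multiplications, additions/subtractions) over all rounds."""
    muls = sum(OP_COST[op][0] * count for op, count in rounds)
    adds = sum(OP_COST[op][1] * count for op, count in rounds)
    return muls, adds


print(tally(DCT_ROUNDS))  # (13, 29)
```

Under these assumptions the tally reproduces both totals, which suggests the round list and the operation counts in the text are mutually consistent.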
The present invention is a DCT/IDCT integrated circuit processor designed around the fast algorithm above. With its particular hardware architecture, it cyclically executes the six rounds on the data of an 8 × 8 block to complete one one-dimensional DCT/IDCT, then performs a second one-dimensional DCT/IDCT in transposed order, which completes the two-dimensional transform. Conventional DCT/IDCT processors mostly use large amounts of hardwired logic to achieve tightly pipelined operation and high processing speed, at very high cost. In practice, typical applications do not need such tight, fast timing and can still achieve real-time conversion.
The main object of the present invention is to provide a cyclically executable DCT/IDCT integrated circuit processor whose low-complexity hardware structure provides paths that can be executed cyclically, greatly reducing the area of the integrated circuit and therefore the cost.
Another object of the present invention is to provide a cyclically executable DCT/IDCT integrated circuit processor with accelerated execution that meets the real-time processing requirements of typical applications.
A further object of the present invention is to provide a cyclically executable DCT/IDCT integrated circuit processor with doubled processing speed, suitable for applications with higher bit rates.
To achieve these objects, the structure of the present invention mainly comprises a butterfly arithmetic unit that performs butterfly operations; a multiplication unit that performs intrinsic multiplication; an auxiliary adder-subtractor that is combined with the multiplication unit to perform pre-added multiplication or post-subtracted multiplication; and a storage unit with four write and read ports for storing and retrieving the intermediate results of the computation. The structure cyclically executes two passes of one-dimensional DCT/IDCT, and each pass cyclically executes six rounds of alternating butterfly and multiplication operations: three rounds of butterfly operations, one round of intrinsic multiplication, and two rounds of multiplication assisted by addition or subtraction. These are processed in parallel, in pipelined fashion, by the butterfly arithmetic unit and the multiplication unit respectively. A processor that executes cyclically in this way greatly reduces the area of the integrated circuit while meeting typical real-time processing demands. A derived structure connects two such processors in series, the two handling the first and second one-dimensional DCT/IDCT passes of the block data respectively in pipelined fashion, which doubles the processing speed and suits applications with higher bit rates.
The present invention is described in detail below with reference to the drawings and embodiments:
Fig. 1 is a flow chart of the discrete cosine transform.
Figs. 2 to 5 are schematic definitions of the butterfly operation, intrinsic multiplication, pre-added multiplication, and post-subtracted multiplication, respectively.
Fig. 6 is a flow chart of the inverse discrete cosine transform.
Fig. 7 is a block diagram of a preferred embodiment of the cyclically executable DCT/IDCT integrated circuit processor of the present invention.
Fig. 8 is a sequence diagram of the DCT performed with the structure of Fig. 7.
Fig. 9 is a sequence diagram of the IDCT performed with the structure of Fig. 7.
Fig. 10 is a block diagram of another embodiment of the cyclically executable DCT/IDCT integrated circuit processor of the present invention.
Fig. 11 is a sequence diagram of the DCT/IDCT performed with the structure of Fig. 10.
Referring to Fig. 7, a preferred embodiment of the present invention comprises an input unit 1, a butterfly arithmetic unit 2, a multiplication unit 3, a data storage unit 4, an output unit 5, and a control unit 6. The input unit 1 is a demultiplexer that routes external data Din to the butterfly arithmetic unit 2 or to the multiplication unit 3, depending on whether a DCT or an IDCT is being performed. The butterfly arithmetic unit 2 comprises a first pre-multiplexer 21 and a butterfly operator 22, the latter consisting of an adder and subtractor pair. The first pre-multiplexer 21 selects data output by the input unit 1 or by the data storage unit 4 and passes them to the butterfly operator 22 for the butterfly operation; the result is written to the data storage unit 4 or sent to the outside through the output unit 5. The multiplication unit 3 comprises a second pre-multiplexer 31, an auxiliary adder-subtractor 32, a multiplier 33, a coefficient ROM 34, and an output selection multiplexer 35. The coefficient ROM 34 stores the coefficients of the multiplications and supplies one operand of the multiplier 33. The multiplication unit 3 performs three kinds of operation: intrinsic multiplication, pre-added multiplication, and post-subtracted multiplication. Its input data can likewise arrive from the input unit 1 or from the data storage unit 4 and are sent to the second pre-multiplexer 31 or to the auxiliary adder-subtractor 32, depending on the operation. The result is either written back to the data storage unit 4 through the output selection multiplexer 35 or sent from the output of the multiplier 33 to the outside through the output unit 5.
When the multiplication unit 3 performs intrinsic multiplication, the input data are selected by the second pre-multiplexer 31, enter the multiplier 33, and are multiplied by the coefficient at the corresponding address of the coefficient ROM 34. For pre-added multiplication, the data storage unit 4 outputs two data to the auxiliary adder-subtractor 32, which adds them first; the sum is sent to the multiplier 33 and multiplied by the coefficient at the corresponding address of the coefficient ROM 34. For post-subtracted multiplication, the data storage unit 4 also outputs two data, one to the second pre-multiplexer 31 and the other to the auxiliary adder-subtractor 32. The datum entering the second pre-multiplexer 31 passes to the multiplier 33 for an intrinsic multiplication; the product is then sent to the auxiliary adder-subtractor 32, which subtracts from it the other datum that arrived earlier, and the difference is the result of the post-subtracted multiplication.
The data storage unit 4 is a four-port register file or RAM that stores the intermediate results of the computation. Its four ports form two pairs of one write port and one read port, WP1-RP1 and WP2-RP2, which serve as the data access paths of the butterfly arithmetic unit 2 and the multiplication unit 3 respectively. The output unit 5 is a multiplexer that selects the output of the butterfly operator 22 or of the multiplier 33, depending on whether a DCT or an IDCT is being performed, as the output data sent to the outside. The control unit 6 generates the control timing that governs the operation of all the units.
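The four operations carried by the two arithmetic units can be written out as plain functions. This is a minimal sketch based only on the prose description, because Figs. 2 to 5 are not reproduced here. The function names, the operand order, and the choice of addition (rather than subtraction) in the pre-added case are assumptions.

```python
def butterfly(a, b):
    """Butterfly operator 22: an adder and subtractor pair giving sum and difference."""
    return a + b, a - b


def intrinsic_mul(x, coeff):
    """Intrinsic multiplication: the datum times a coefficient read from the coefficient ROM."""
    return x * coeff


def pre_added_mul(a, b, coeff):
    """Pre-added multiplication: the auxiliary adder-subtractor adds first, then the multiplier runs."""
    return (a + b) * coeff


def post_subtracted_mul(x, earlier, coeff):
    """Post-subtracted multiplication: multiply first, then subtract the datum that arrived earlier."""
    return x * coeff - earlier


print(butterfly(5, 3))
print(pre_added_mul(1, 2, 0.5))
print(post_subtracted_mul(4, 1, 0.5))
```

Seen this way, the pre-added and post-subtracted forms differ only in whether the auxiliary adder-subtractor sits before or after the multiplier, which is why a single adder-subtractor can serve both the DCT and the IDCT.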
With the structure above, the invention performs the DCT/IDCT as shown in Figs. 8 and 9. A block N successively undergoes two one-dimensional transform passes, 1st 1-D DCT/IDCT and 2nd 1-D DCT/IDCT. Each pass cyclically executes six rounds of alternating butterfly and multiplication operations, processed in parallel by the butterfly arithmetic unit 2 and the multiplication unit 3 in pipelined fashion. For the DCT shown in Fig. 8, rounds 1, 3, and 5 perform butterfly operations, rounds 2 and 4 perform pre-added multiplication, and round 6 performs intrinsic multiplication. For the IDCT shown in Fig. 9, round 1 performs intrinsic multiplication, rounds 3 and 5 perform post-subtracted multiplication, and rounds 2, 4, and 6 perform butterfly operations. Data are input from Din during round 1 of the first pass (1st 1-D DCT/IDCT), and results are output to Dout during round 6 of the second pass (2nd 1-D DCT/IDCT).
Apart from external input and output, every round uses the data storage unit 4 as both source and destination, so the result of one round is the source for later rounds. Between the two passes of the same block, the access order of the data storage unit 4 is transposed: if the first pass accesses data in row order, the second pass must access them in column order, and vice versa. Between two adjacent blocks, that is, between the second pass of the previous block and the first pass of the next block, the access order does not change. The butterfly arithmetic unit 2 and the multiplication unit 3 each have their own write port and read port on the data storage unit 4 and can access it simultaneously, so the two arithmetic units 2 and 3 can process in parallel in pipelined fashion, while rounds of the same kind execute one after another in the same arithmetic unit. Two examples follow:
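The effect of transposing the access order between the two passes can be illustrated in software. The sketch below uses a direct, non-fast 8-point transform with the standard DCT-II normalization, which is an assumption and not the six-round algorithm of the patent. The helper names `dct_1d`, `transpose`, and `dct_2d` are hypothetical.

```python
import math

N = 8  # block size used throughout the text


def dct_1d(s):
    """Direct (non-fast) 8-point DCT-II of a sequence of 8 numbers."""
    out = []
    for k in range(N):
        c = math.sqrt(0.5) if k == 0 else 1.0
        acc = sum(s[m] * math.cos((2 * m + 1) * k * math.pi / 16) for m in range(N))
        out.append(0.5 * c * acc)
    return out


def transpose(block):
    """Swap rows and columns, mimicking the transposed access order."""
    return [list(col) for col in zip(*block)]


def dct_2d(block):
    """Two 1-D passes: the first in row order, the second in column order."""
    pass1 = [dct_1d(row) for row in block]
    pass2 = [dct_1d(col) for col in transpose(pass1)]
    return transpose(pass2)  # restore row-major layout


flat = [[1.0] * N for _ in range(N)]
coeffs = dct_2d(flat)
print(round(coeffs[0][0], 6))  # a constant block has only a DC term
```

In the hardware described here no data are physically moved for the transposition; only the order in which the data storage unit is addressed changes between the passes.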
(1) When performing a DCT, the 64 pixel data of an 8 × 8 pixel block are input from Din in row (or column) order and routed by the input unit 1 to the butterfly arithmetic unit 2 for round 1, a butterfly operation, whose results are written to the data storage unit 4 through write port WP1. They are then read through read port RP2 and sent to the multiplication unit 3 for round 2, a pre-added multiplication, whose results are written back to the data storage unit 4 through the other write port WP2. Once round 1 has finished, the other read port RP1 reads the results of the first two rounds into the butterfly arithmetic unit 2 for round 3, whose results are written back to the data storage unit 4 through WP1. Likewise, round 4 follows round 2 in the multiplication unit 3, round 5 follows round 3 in the butterfly arithmetic unit 2, and round 6 follows round 4 in the multiplication unit 3 as an intrinsic multiplication. All results are written back to the data storage unit 4, and the end of round 6 completes the first one-dimensional DCT pass. Next, data are read from the data storage unit 4 through RP1 in transposed order, that is, in column (or row) order, and sent to the butterfly arithmetic unit 2 to begin round 1 of the second one-dimensional DCT pass. Every round then proceeds as in the first pass, except that the results of the final round 6 are not written back to the data storage unit 4 but are output directly from the multiplier 33 of the multiplication unit 3 and sent to Dout through the output unit 5. This final output is the frequency-domain data of the pixel block after the DCT.
(2) When performing an IDCT, the 64 frequency-domain data enter the input unit 1 from Din in row (or column) order and are routed to the multiplication unit 3 for round 1, an intrinsic multiplication, whose results are written to the data storage unit 4 through write port WP2. They are then read through read port RP1 and sent to the butterfly arithmetic unit 2 for round 2, a butterfly operation, whose results are written back to the data storage unit 4 through the other write port WP1. Round 3 is a post-subtracted multiplication: read port RP2 reads the results of round 2 into the multiplication unit 3, the round runs after round 1 has finished, and its results are again written back to the data storage unit 4 through WP2. Likewise, round 4 follows round 2 in the butterfly arithmetic unit 2, round 5 follows round 3 in the multiplication unit 3, and round 6 follows round 4 in the butterfly arithmetic unit 2. All results are written back to the data storage unit 4, and the end of round 6 completes the first one-dimensional IDCT pass. Next, data are read from the data storage unit 4 through RP2 in column (or row) order and sent to the multiplication unit 3 to begin round 1 of the second one-dimensional IDCT pass. Every round then proceeds as in the first pass, except that the results of the final round 6 are not written back to the data storage unit 4 but are sent to Dout through the output unit 5. This final output is the pixel data restored by the IDCT.
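The two walkthroughs above can be condensed into a small routing table. The sketch below is one reading of the text rather than a description of the control unit 6 itself. The names `SCHEDULE`, `unit_and_ports`, and `plan` are hypothetical, and the table lists only the primary read port of each round (round 3 of the DCT, for example, consumes the results of both preceding rounds).

```python
# Operation performed in rounds 1..6 of each one-dimensional pass, as described in the text.
SCHEDULE = {
    "DCT": ["butterfly", "pre_added_mul", "butterfly",
            "pre_added_mul", "butterfly", "intrinsic_mul"],
    "IDCT": ["intrinsic_mul", "butterfly", "post_subtracted_mul",
             "butterfly", "post_subtracted_mul", "butterfly"],
}


def unit_and_ports(op):
    """Butterfly rounds use the WP1-RP1 pair; all multiplication rounds use WP2-RP2."""
    if op == "butterfly":
        return "butterfly unit 2", "RP1", "WP1"
    return "multiplication unit 3", "RP2", "WP2"


def plan(transform, pass_no):
    """Return (round, operation, unit, source, destination) for one pass."""
    rows = []
    for rnd, op in enumerate(SCHEDULE[transform], start=1):
        unit, source, dest = unit_and_ports(op)
        if pass_no == 1 and rnd == 1:
            source = "Din"   # first round of the first pass reads external input
        if pass_no == 2 and rnd == 6:
            dest = "Dout"    # last round of the second pass goes to the output unit
        rows.append((rnd, op, unit, source, dest))
    return rows


for row in plan("IDCT", 1):
    print(row)
```

Calling `plan("DCT", 2)` gives the second DCT pass, where round 1 reads through RP1 and round 6 leaves through Dout, matching example (1).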
Referring now to Fig. 10, another embodiment of the present invention connects two of the basic modules of Fig. 7 in series and processes the two one-dimensional DCT/IDCT passes of a block in pipelined fashion, each module cyclically executing the six rounds of one pass. It mainly comprises a first-pass one-dimensional processing unit 7, a second-pass one-dimensional processing unit 8, and a control unit 9. The first-pass processing unit 7 comprises an input unit 71; a first butterfly arithmetic unit 72, which in turn comprises a pre-multiplexer 721 and a butterfly operator 722; a first multiplication unit 73, which in turn comprises a pre-multiplexer 731, an auxiliary adder-subtractor 732, a multiplier 733, a coefficient ROM 734, and an output selection multiplexer 735; and a first data storage unit 74. It handles the input of the block data and the first one-dimensional pass, and the first data storage unit 74 doubles as the transpose memory, serving as the input data source of the second-pass processing unit 8. The second-pass processing unit 8 comprises a second butterfly arithmetic unit 81, which in turn comprises a pre-multiplexer 811 and a butterfly operator 812; a second multiplication unit 82, which in turn comprises a pre-multiplexer 821, an auxiliary adder-subtractor 822, a multiplier 823, and an output selection multiplexer 824; a second data storage unit 83; and an output unit 84. One operand input of the multiplier 823 of the second multiplication unit 82 is connected to the first multiplication unit 73 so as to share its coefficient ROM 734. The second-pass processing unit 8 performs the second one-dimensional pass of the block data and outputs the final result. The control unit 9 controls the execution flow of the first-pass and second-pass processing units 7 and 8.
With this structure, the DCT/IDCT proceeds as shown in Fig. 11: the two one-dimensional processing units 7 and 8 operate as a pipeline on the block data. While the input data of a block N enter the first-pass processing unit 7 from Din, the first-pass results of the previous block N-1 enter the second-pass processing unit 8 through read port RP1A (for a DCT) or RP2A (for an IDCT) of the first data storage unit 74. Each of the two processing units 7 and 8 cyclically executes the six rounds of one one-dimensional pass. The final result of the previous block N-1 is sent to Dout through the output unit 84, while the first-pass results of block N are stored in the first data storage unit 74. Between consecutive blocks, the first data storage unit 74 must exchange rows and columns in its data access order. This structure doubles the processing speed and suits applications with higher bit rates.
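The block-level overlap of the two processing units can be sketched as a two-stage pipeline. This is an illustrative model only: one time slot stands for one complete one-dimensional pass, the names `pipeline_timeline` and `access_order` are hypothetical, and which access order the first block uses is an assumption.

```python
def pipeline_timeline(num_blocks):
    """Return (slot, block in first-pass unit 7, block in second-pass unit 8) per slot.

    None means the unit is idle in that slot (pipeline fill or drain).
    """
    timeline = []
    for slot in range(num_blocks + 1):
        in_first = slot if slot < num_blocks else None
        in_second = slot - 1 if slot >= 1 else None
        timeline.append((slot, in_first, in_second))
    return timeline


def access_order(block_index):
    """Access order of the first data storage unit 74, alternating between consecutive blocks.

    Assumption: even-numbered blocks use row order; the text only says the order alternates.
    """
    return "row" if block_index % 2 == 0 else "column"


for slot, first, second in pipeline_timeline(3):
    print(slot, first, second)
```

With n blocks the pipeline needs n + 1 slots, whereas a single module running both passes needs 2n, which approaches the doubling of processing speed that the text describes.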
To summarize the disclosure above, the invention rests on three ideas: cyclic execution, parallel processing, and doubling the processing speed. "Cyclic execution" means reusing the same hardware structure over several iterations to complete a task. It trades time for space: the aim is to reduce the hardware and therefore the cost, at the price of a longer processing time.
"Parallel processing" is the exact opposite of cyclic execution: it trades space for time. Within the cyclic structure above, real-time speed is still desired, so two independent units process in parallel to keep the processing time from growing too long. The "parallel processing" structure in the present invention is a parallel arrangement, in contrast to the series arrangement used for "doubling the processing speed" below.
"Doubling the processing speed" applies when the structure with cyclic execution and parallel processing must handle a higher bit rate and might not reach real-time speed. Space must then be traded for time once more, this time in a series arrangement.
What is particularly neat is that the algorithm used by the present invention allows all three ideas to be built into the invention. Both the DCT and the IDCT divide into two passes of the same procedure, each pass divides into six rounds, the six rounds fall into exactly two classes of operation (butterfly operations and multiplications), and the two classes alternate. These features give rise to the structure and the manner of execution of the present invention.
Take Fig. 7, which uses the two features of cyclic execution and parallel processing. Performing a DCT/IDCT on this structure cyclically executes 12 rounds in total (two 1-D DCT/IDCT passes), and the two independent arithmetic units (the butterfly arithmetic unit and the multiplication unit) each take 6 of the 12 rounds and process them in parallel. Figs. 8 and 9 therefore show that processing one block takes almost only 6 round-times rather than 12, and that at almost any moment both arithmetic units are active. As for the execution order, every round must process 64 data (in row or column order, 8 rows or columns of 8 data each). After round 1 has processed a few data, round 2 can begin working on the results of round 1. How many "a few data" means differs slightly between the DCT and the IDCT and also depends on how long the last round of the previous block (round 6 of the 2nd 1-D pass) takes; when the very first block is processed, it is at most 8 data (one row or column) for both the DCT and the IDCT. Round 3 is the same kind of operation as round 1 and so cannot start until round 1 has processed all 64 data. By then round 2 is certain to have processed more than 8 data (one row or column), so round 3 has data to work on and need not wait.
The same holds for rounds 4, 5, and 6. This way of operating, datum after datum, round after round, pass after pass (1st 1-D, 2nd 1-D, 1st 1-D, ...), is called pipelined operation. The pipeline is briefly interrupted between the two passes of a block (between 1st 1-D and 2nd 1-D), because the data storage unit must change its access order at this boundary (rows become columns, or columns become rows). Round 1 of the 2nd 1-D pass therefore cannot start until round 6 of the 1st 1-D pass has finished completely; it cannot directly follow round 5 of the 1st 1-D pass, because the data it needs are not yet ready. Once round 1 of the 2nd 1-D pass has started, the execution order and timing are exactly the same as in the 1st 1-D pass. Between two adjacent blocks the pipeline is not interrupted: the data of the next block come in from Din, so its round 1 need not wait and can directly follow round 5 of the previous block's 2nd 1-D pass, while the previous block's data finish processing in round 6 of the 2nd 1-D pass and are sent to Dout. The access order of the data storage unit need not be exchanged between rows and columns either. This is the essence of the invention.
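The claim that one block takes roughly 6 round-times instead of 12 can be checked with a simplified timing model. Everything numeric here is an assumption made for illustration: one datum per cycle (`ROUND_CYCLES = 64`), a fixed start-up lag of 8 data between a round and its successor (`LAG = 8`), and no other stalls. The function name `schedule_block` is hypothetical, and the model ignores the overlap with neighbouring blocks.

```python
ROUND_CYCLES = 64  # assumption: one datum per cycle, 64 data per round
LAG = 8            # assumption: a round may start 8 data (one row/column) after its predecessor


def schedule_block(passes=2, rounds=6):
    """Return (pass, round, unit, begin, end) tuples for one block under the simplified model."""
    timeline = []
    unit_free = [0, 0]           # cycle at which each of the two arithmetic units becomes free
    prev_start, prev_end = 0, 0
    for p in range(1, passes + 1):
        for r in range(1, rounds + 1):
            unit = (r - 1) % 2   # odd rounds on one unit, even rounds on the other
            # Round 1 of a pass waits for the whole previous pass (transposed access order);
            # every other round only waits for the first LAG data of its predecessor.
            earliest = prev_end if r == 1 else prev_start + LAG
            begin = max(earliest, unit_free[unit])
            end = begin + ROUND_CYCLES
            timeline.append((p, r, unit, begin, end))
            unit_free[unit] = end
            prev_start, prev_end = begin, end
    return timeline


timeline = schedule_block()
finish = timeline[-1][4]
print(finish, finish / ROUND_CYCLES, 12 * ROUND_CYCLES)
```

Under these assumptions the block finishes after 400 cycles, about 6.25 round-times, against 768 cycles if the 12 rounds ran strictly one after another. The short stall at the pass boundary is visible as round 1 of the second pass starting only when round 6 of the first pass ends.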
In summary, the cyclically executable DCT/IDCT integrated circuit processor of the present invention, using the disclosed structure, does achieve the intended effects and objects and has industrial applicability.

Claims (6)

1. A method for a cyclically executable discrete cosine transform and its inverse transform, characterized in that, when applied to the inverse discrete cosine transform (IDCT), a six-round fast IDCT algorithm is used to process the input data of a succession of 8 × 8 data blocks so as to produce a series of transformed data, the fast IDCT algorithm comprising: a first round, which comprises a plurality of simple multiplication operations; second, fourth, and sixth rounds, each of which comprises a plurality of butterfly operations; and third and fifth rounds, each of which comprises a plurality of post-subtraction multiplication operations; the steps of the IDCT method comprising:
(a) providing an input unit to receive the input data;
(b) controlling the input unit to provide the input data to a multiplication unit, so as to start the multiplication unit executing the first round of the fast IDCT algorithm;
(c) controlling a data buffer to store the first-round output data of the multiplication unit;
(d) controlling the data buffer to provide the first-round output data to a butterfly operation unit, so as to start the butterfly operation unit executing the second round of the fast IDCT algorithm;
(e) controlling the data buffer to store the second-round output data of the butterfly operation unit;
(f) controlling the data buffer to provide the second-round output data to the multiplication unit, so as to start the multiplication unit executing the third round of the fast IDCT algorithm;
(g) controlling the data buffer to store the third-round output data of the multiplication unit;
(h) controlling the data buffer to provide the second-round and third-round output data to the butterfly operation unit and, after the butterfly operation unit has completed the second round of the fast IDCT algorithm, starting the butterfly operation unit executing the fourth round of the fast IDCT algorithm;
(i) controlling the data buffer to store the fourth-round output data of the butterfly operation unit;
(j) controlling the data buffer to provide the fourth-round output data to the multiplication unit, so as to start the multiplication unit executing the fifth round of the fast IDCT algorithm;
(k) controlling the data buffer to store the fifth-round output data of the multiplication unit;
(l) controlling the data buffer to provide the fourth-round and fifth-round output data to the butterfly operation unit and, after the butterfly operation unit has completed the fourth round of the fast IDCT algorithm, starting the butterfly operation unit executing the sixth round of the fast IDCT algorithm; and
(m) controlling an output unit to receive the sixth-round output data of the butterfly operation unit.
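The control sequence recited in steps (a) to (m) can be sketched as a simple schedule. This is only an illustration: the names `ROUND_SCHEDULE` and `control_sequence` are hypothetical, and the sketch models only which unit executes each round and where its output goes, not the arithmetic of the fast algorithm itself.

```python
# Illustrative sketch only. The round-to-unit assignment follows claim 1;
# all identifiers here are hypothetical and do not come from the patent.
ROUND_SCHEDULE = [
    (1, "multiplication unit", "simple multiplication"),
    (2, "butterfly operation unit", "butterfly"),
    (3, "multiplication unit", "post-subtraction multiplication"),
    (4, "butterfly operation unit", "butterfly"),
    (5, "multiplication unit", "post-subtraction multiplication"),
    (6, "butterfly operation unit", "butterfly"),
]


def control_sequence():
    """Return the ordered control actions for one one-dimensional IDCT pass."""
    actions = ["input unit -> multiplication unit"]
    last_round = ROUND_SCHEDULE[-1][0]
    for rnd, unit, operation in ROUND_SCHEDULE:
        actions.append(f"round {rnd}: {unit} executes {operation}")
        if rnd < last_round:
            # Intermediate results always pass through the data buffer.
            actions.append(f"data buffer <- round {rnd} output")
        else:
            # Only the final round is handed to the output unit.
            actions.append(f"output unit <- round {rnd} output")
    return actions


if __name__ == "__main__":
    for action in control_sequence():
        print(action)
```

The two units alternate strictly, so a single multiplication unit and a single butterfly operation unit can be reused across all six rounds, with the data buffer holding each round's result until the next round consumes it.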
2. The method for a cyclically executable discrete cosine transform and its inverse transform as claimed in claim 1, characterized in that the following steps are further included between steps (l) and (m):
(11) controlling the data buffer to store the sixth-round output data;
(12) controlling the data buffer to provide the sixth-round output data to the multiplication unit, so as to start the multiplication unit executing the first round of the fast IDCT algorithm; and
(13) repeating steps (c) to (l).
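Feeding the sixth-round output back into the first round amounts to running a second one-dimensional pass over the same buffer. A minimal sketch follows, under two stated assumptions: the 1-D transform is computed in direct form with the common orthonormal scaling (it is not the six-round fast algorithm of claim 1), and the buffer contents are transposed between passes, which is the usual row-column decomposition and is not recited in the claim. The function names are hypothetical.

```python
import math

N = 8  # block size stated in claim 1


def idct_1d(coeffs):
    """Direct-form 8-point IDCT, used here as a reference stand-in only.

    Assumption: orthonormal scaling; this is NOT the six-round fast algorithm.
    """
    samples = []
    for x in range(N):
        total = 0.0
        for u in range(N):
            scale = math.sqrt(0.5) if u == 0 else 1.0
            total += scale * coeffs[u] * math.cos((2 * x + 1) * u * math.pi / (2 * N))
        samples.append(total / 2.0)  # sqrt(2 / N) equals 1/2 for N = 8
    return samples


def idct_2d_recirculated(block):
    """Run the 1-D pass twice over one buffer, as in steps (11) to (13)."""
    buffer = [list(row) for row in block]
    for _ in range(2):
        buffer = [idct_1d(row) for row in buffer]
        # Assumed transposition between passes (row-column decomposition).
        buffer = [list(column) for column in zip(*buffer)]
    return buffer


if __name__ == "__main__":
    dc_only = [[0.0] * N for _ in range(N)]
    dc_only[0][0] = 8.0
    print(idct_2d_recirculated(dc_only)[0])
```

The second pass reuses exactly the same 1-D routine, which mirrors how the claim reuses the same multiplication and butterfly units instead of duplicating them.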
3. An integrated circuit processor for a cyclically executable discrete cosine transform and its inverse transform, characterized in that it comprises an input unit, a butterfly operation unit, a multiplication unit, a data buffer, an output unit, and a control unit, wherein: the input unit receives input data from outside;
the butterfly operation unit comprises:
a first front-end multiplexer for selecting between the data sent by the input unit and the data sent by the data buffer; and
a butterfly operator, consisting of an adder and a subtractor as a pair, which receives the data passed from the first front-end multiplexer and performs the butterfly operation of adding and subtracting those data simultaneously;
the multiplication unit comprises:
a second front-end multiplexer for selecting between the data sent by the input unit and the data sent by the data buffer;
an auxiliary adder-subtractor, connected to the second front-end multiplexer, for performing the addition part of a pre-addition multiplication operation and the subtraction part of a post-subtraction multiplication operation;
a multiplier, connected to the second front-end multiplexer and the auxiliary adder-subtractor, for performing the multiplication part of three classes of operation: simple multiplication, pre-addition multiplication, and post-subtraction multiplication;
a coefficient ROM, connected to the multiplier, which stores the coefficients of the multiplication operations and supplies them as the other operand input of the multiplier; and
an output selection multiplexer, connected to the auxiliary adder-subtractor and the multiplier, for selecting the output of the auxiliary adder-subtractor or of the multiplier for delivery to the data buffer;
the data buffer is connected to the butterfly operation unit and the multiplication unit, for storing and retrieving intermediate results during the computation;
the output unit is connected to the butterfly operation unit and the multiplication unit, for selecting between the outputs of the butterfly operation unit and the multiplication unit as the output data delivered to the outside; and
the control unit generates a control timing sequence to control the operating flow of each of the above units.
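The two arithmetic units of claim 3 can be modelled behaviourally as follows. This is a sketch under assumptions: the claim names the three operation classes but does not give their operand order, so the forms `(a + b) * coeff` for pre-addition multiplication and `a * coeff - b` for post-subtraction multiplication are assumed here, and the function and mode names are hypothetical.

```python
def butterfly(a, b):
    """Adder and subtractor working as a pair: sum and difference together."""
    return a + b, a - b


def multiplication_unit(mode, a, coeff, b=0):
    """Behavioural model of the multiplier plus the auxiliary adder-subtractor.

    Assumed operand order (not specified in the claim):
      "simple"   -> a * coeff
      "pre_add"  -> (a + b) * coeff   (auxiliary unit adds before the multiplier)
      "post_sub" -> a * coeff - b     (auxiliary unit subtracts after the multiplier)
    """
    if mode == "simple":
        return a * coeff
    if mode == "pre_add":
        return (a + b) * coeff
    if mode == "post_sub":
        return a * coeff - b
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    print(butterfly(5, 3))
    print(multiplication_unit("pre_add", 2, 4, b=1))
```

In hardware terms, the `mode` argument plays the role of the control unit's select signals for the second front-end multiplexer and the output selection multiplexer, and `coeff` stands for a value read from the coefficient ROM.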
4. The integrated circuit processor for a cyclically executable discrete cosine transform and its inverse transform as claimed in claim 3, characterized in that the input unit is a demultiplexer that selectively delivers the input data to the multiplication unit.
5. An integrated circuit processor for a cyclically executable discrete cosine transform and its inverse transform, characterized in that it comprises:
a first one-dimensional processing unit, comprising an input unit, a butterfly operation unit, a multiplication unit, and a first data buffer, wherein:
the input unit receives input data from outside;
the butterfly operation unit comprises:
a first front-end multiplexer for selecting between the data sent by the input unit and the data sent by the first data buffer; and
a butterfly operator, consisting of an adder and a subtractor as a pair, which receives the data passed from the first front-end multiplexer and performs the butterfly operation of adding and subtracting those data simultaneously;
the multiplication unit comprises:
a second front-end multiplexer for selecting between the data sent by the input unit and the data sent by the first data buffer;
an auxiliary adder-subtractor, connected to the first data buffer, for performing the addition part of a pre-addition multiplication operation and the subtraction part of a post-subtraction multiplication operation;
a multiplier, connected to the second front-end multiplexer and the auxiliary adder-subtractor, for performing the multiplication part of three classes of operation: simple multiplication, pre-addition multiplication, and post-subtraction multiplication;
a coefficient ROM, connected to the multiplier, which stores the coefficients of the multiplication operations and supplies them as the other operand input of the multiplier; and
an output selection multiplexer, connected to the auxiliary adder-subtractor and the multiplier, for selecting the output of the auxiliary adder-subtractor or of the multiplier for delivery to the first data buffer; and
the first data buffer is connected to the butterfly operation unit and the multiplication unit, for storing and retrieving intermediate results during the computation;
a second one-dimensional processing unit, comprising a butterfly operation unit, a multiplication unit, a second data buffer, and an output unit, wherein:
the butterfly operation unit comprises:
a first front-end multiplexer for selecting between the data sent by the first data buffer and the data sent by the second data buffer; and
a butterfly operator, consisting of an adder and a subtractor as a pair, which receives the data passed from the first front-end multiplexer and performs the butterfly operation of adding and subtracting those data simultaneously;
the multiplication unit comprises:
a second front-end multiplexer for selecting between the data sent by the first data buffer and the data sent by the second data buffer;
an auxiliary adder-subtractor, connected to the second data buffer, for performing the addition part of a pre-addition multiplication operation and the subtraction part of a post-subtraction multiplication operation;
a multiplier, connected to the second front-end multiplexer and the auxiliary adder-subtractor, for performing the multiplication part of three classes of operation (simple multiplication, pre-addition multiplication, and post-subtraction multiplication), and also connected to the coefficient ROM of the first one-dimensional processing unit; and
an output selection multiplexer, connected to the auxiliary adder-subtractor and the multiplier, for selecting the output of the auxiliary adder-subtractor or of the multiplier for delivery to the second data buffer;
the second data buffer is connected to the butterfly operation unit and the multiplication unit, for storing and retrieving intermediate results during the computation; and
the output unit is connected to the butterfly operation unit and the multiplication unit, for selecting between the outputs of the butterfly operation unit and the multiplication unit as the output data delivered to the outside; and
a control unit, which generates a control timing sequence to control the execution flow of the first one-dimensional processing unit and the second one-dimensional processing unit.
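With two one-dimensional processing units, one natural way for the control unit to sequence them is as a two-stage pipeline. The sketch below assumes that the two units overlap in time, with the second unit working on block k - 1 while the first works on block k; the claim states only that the control unit governs the execution flow of both units, so this overlap and the name `pipeline_schedule` are assumptions made for illustration.

```python
def pipeline_schedule(num_blocks):
    """List (time slot, block in first unit, block in second unit) tuples.

    Assumption: the second 1-D unit reads the first data buffer one slot
    after the first 1-D unit has filled it. None marks an idle unit.
    """
    slots = []
    for t in range(num_blocks + 1):
        first_unit_block = t if t < num_blocks else None
        second_unit_block = t - 1 if t >= 1 else None
        slots.append((t, first_unit_block, second_unit_block))
    return slots


if __name__ == "__main__":
    for slot in pipeline_schedule(3):
        print(slot)
```

Under this assumption, a stream of blocks needs one slot per block plus one extra slot to drain the second unit, instead of two slots per block when a single unit is recirculated as in claim 2.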
6. The integrated circuit processor for a cyclically executable discrete cosine transform and its inverse transform as claimed in claim 5, characterized in that the input unit is a demultiplexer that selectively delivers the input data to the butterfly operation unit of the first one-dimensional processing unit.
CN97102599A 1997-03-06 1997-03-06 IC processing device able to actuate cyclically discrete cosine transform and the inverse transform thereof Expired - Lifetime CN1064507C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN97102599A CN1064507C (en) 1997-03-06 1997-03-06 IC processing device able to actuate cyclically discrete cosine transform and the inverse transform thereof

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CN94104170A Division CN1053545C (en) 1994-05-05 1994-05-05 Mobile executable discrete cosine inversion and reversing integrated circuit processor

Publications (2)

Publication Number Publication Date
CN1169083A CN1169083A (en) 1997-12-31
CN1064507C true CN1064507C (en) 2001-04-11

Family

ID=5166345

Family Applications (1)

Application Number Title Priority Date Filing Date
CN97102599A Expired - Lifetime CN1064507C (en) 1997-03-06 1997-03-06 IC processing device able to actuate cyclically discrete cosine transform and the inverse transform thereof

Country Status (1)

Country Link
CN (1) CN1064507C (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7487193B2 (en) * 2004-05-14 2009-02-03 Microsoft Corporation Fast video codec transform implementations
TWI472932B (en) * 2012-12-07 2015-02-11 Nuvoton Technology Corp Digital signal processing apparatus and processing method thereof
CN103067718B (en) * 2013-01-30 2015-10-14 上海交通大学 Be applicable to the one-dimensional discrete cosine inverse transform module circuit of digital video decoding

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1050115A (en) * 1990-10-16 1991-03-20 郭静峰 Steplessly-regulated remote controller

Also Published As

Publication number Publication date
CN1169083A (en) 1997-12-31

Similar Documents

Publication Publication Date Title
JP5689282B2 (en) Computer-implemented method, computer-readable storage medium and system for transposing a matrix on a SIMD multi-core processor architecture
JP4778086B2 (en) Data processing apparatus and method for calculating cosine transform of matrix
US5053985A (en) Recycling dct/idct integrated circuit apparatus using a single multiplier/accumulator and a single random access memory
JP2628493B2 (en) Image coding device and image decoding device provided with cosine transform calculation device, calculation device and the like
EP0267729B1 (en) An orthogonal transform processor
US20070094320A1 (en) Parallel Adder-Based DCT / IDCT Design Using Cyclic Convolution
KR970703565A (en) HIGH-SPEED ARITHMETIC UNIT FOR DISCRETE COSING TRANSFORM AND ASSOCIATED OPERATION
JP2001236496A (en) Reconstructible simd co-processor structure for summing and symmetrical filtration of absolute difference
EP0884686A3 (en) Method and apparatus for performing discrete cosine transform and its inverse
JP3852895B2 (en) Method of performing two-dimensional discrete cosine transform capable of reducing multiplicative operation and its inverse transform
JP2003223433A (en) Method and apparatus for orthogonal transformation, encoding method and apparatus, method and apparatus for inverse orthogonal transformation, and decoding method and apparatus
CN1064507C (en) IC processing device able to actuate cyclically discrete cosine transform and the inverse transform thereof
US20010033617A1 (en) Image processing device
CN1053545C (en) Mobile executable discrete cosine inversion and reversing integrated circuit processor
JPH10504408A (en) Apparatus and method for performing inverse discrete cosine transform
Alam et al. A new time distributed DCT architecture for MPEG-4 hardware reference model
JP2662501B2 (en) Integrated circuit processor for discrete cosine transform and inverse transform
KR0126109B1 (en) Vlsi processor for dct/idct
Kuroda Processor architecture driven algorithm optimization for fast 2D-DCT
JP3709291B2 (en) Fast complex Fourier transform method and apparatus
JPS62239271A (en) Circuit for primary conversion of numerical signal
Weeks et al. On block architectures for discrete wavelet transform
JP2005229182A (en) Image processing method and vector processor
JP3396818B2 (en) DCT operation circuit and IDCT operation circuit
JPS62105287A (en) Signal processor

Legal Events

Date Code Title Description
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C06 Publication
PB01 Publication
C14 Grant of patent or utility model
GR01 Patent grant
C17 Cessation of patent right
CX01 Expiry of patent term

Expiration termination date: 20140505

Granted publication date: 20010411