JP2001084242A

JP2001084242A - Variable operation processor

Info

Publication number: JP2001084242A
Application number: JP25711199A
Authority: JP
Inventors: Motonobu Tonomura; 元伸外村
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 1999-09-10
Filing date: 1999-09-10
Publication date: 2001-03-30

Abstract

PROBLEM TO BE SOLVED: To provide a variable operation processor whose operation units can be made common and shared, and which is suited to efficiently executing large volumes of multimedia-oriented operations. In two-dimensional DCT/inverse DCT computation, the addition theorem of trigonometric functions is used to convert products into sums so that only one multiplication occurs on the operation path, and the result is computed directly. SOLUTION: The processor comprises a selector 160 for switching between DCT and inverse DCT operation, a selector 165 for the post-addition of the inverse DCT or the pre-addition of common-coefficient terms of the DCT, a programmable arithmetic unit 170, latches 171 and 176 for operation path instruction signals, a selector 172 for the variable operation path, a multiplier/adder 175, and a butterfly operation block 180. The flow of the input/output data path of the programmable arithmetic unit 170 is variably controlled by the selector 172, and the arithmetic elements are configured and controlled so that they are shared as much as possible. The selection control information 171 and 176 is held in memories 151 and 152, which can be accessed at high speed.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Field of the Invention] The present invention relates to an arithmetic processor, and more particularly to a variable operation processor suited to efficiently executing large volumes of multimedia-oriented operations.

[0002]

[Description of the Related Art] Today, in multimedia processing that handles large volumes of media information, data compression/decompression technology is essential to make up for limited storage and transfer capacity. Standard techniques such as MPEG-2 (Moving Picture Experts Group) are used to compress and decompress image data. One component of these techniques is the discrete cosine transform (DCT), which can compress the original image data to between one-thirtieth and one-fortieth of its original information volume. A 500 MHz class processor is considered necessary for this DCT computation. That is, because a processor generally cannot devote all of its capacity to DCT computation, if the DCT is assumed to account for about one quarter of the total load, a 500 MHz class processor is needed even with a four-way parallel SIMD (Single Instruction Stream, Multiple Data Stream) scheme. Parallel operation by SIMD is thus effective for raising computing capacity, but increasing the degree of parallelism requires dividing the operation into suitably narrow bit widths. The problem that then arises is the computational accuracy of the inverse DCT (the transform that returns DCT values to the original data values).

[0003] According to the MPEG-2 standard, the DCT/inverse DCT of an image is a two-dimensional 8×8-point algorithm, so it takes the form Σ x(i, j)·cos(i)·cos(j). The usual method is to decompose it into one-dimensional 8-point DCT/inverse DCT algorithms: first compute the DCT/inverse DCT for the 8 points of each row, then transpose the results and compute the DCT/inverse DCT for the 8 points of each column. Expressed as formulas, y(j) = Σ x(i, j)·cos(i), x = Σ y(j)·cos(j).

The DCT output value is 12 bits according to the standard. When it is computed in 16-bit fixed point with a conventional multiplier or multiply-accumulate unit, the first multiply-accumulate result is a fixed-point operation of the form (1 sign bit + 15 fixed-point bits) × (1 sign bit + 15 fixed-point bits) → (2 sign bits + 14 fixed-point bits), so one bit of precision is lost. When the multiplication is repeated a second time, the accuracy becomes critical. Because the result carries two sign bits, a method has been proposed that shifts the result left by one bit, discards the redundant sign bit, and rounds to (1 sign bit + 15 fixed-point bits) to recover precision [INTEL: US Patent 5,754,456; 5,754,457, Mar. 5, 1996].

Another approach is to propose an algorithm that computes the two-dimensional 8×8-point DCT/inverse DCT directly, turning the double product in the formula into a single product so that only one multiplication occurs on the operation path. Computing directly with the addition theorem of trigonometric functions, 2cos(i)cos(j) = cos(i + j) + cos(i − j), avoids the double product and leaves a single product. A recent example of this approach is Yuh-Ning Huang and Ia-Ling Wu, "A Refined Fast 2-D Discrete Cosine Transform Algorithm," IEEE Trans. on Signal Processing, Vol. 47, No. 3, pp. 904-907, March 1999.
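The precision issue described above can be illustrated with a small Python sketch. It models 16-bit operands with 1 sign bit and 15 fraction bits as plain integers; the function names are hypothetical, and the code illustrates the general shift-and-round idea only, not the implementation of the cited patents.

```python
Q15_ONE = 1 << 15  # scale factor for 1 sign bit + 15 fraction bits


def to_q15(value: float) -> int:
    """Quantize a real number in [-1, 1) to a 16-bit fixed-point integer."""
    return max(-Q15_ONE, min(Q15_ONE - 1, round(value * Q15_ONE)))


def mul_keep_high_word(a: int, b: int) -> int:
    """Keep only the upper 16 bits of the 32-bit product.

    The full product has 2 sign bits + 30 fraction bits, so the upper
    word holds 2 sign bits + 14 fraction bits: one bit of precision is lost.
    """
    return (a * b) >> 16


def mul_shift_and_round(a: int, b: int) -> int:
    """Discard the redundant sign bit and round to 1 sign bit + 15 fraction bits."""
    product = a * b                        # 30 fraction bits
    rounded = (product + (1 << 14)) >> 15  # round to nearest at 15 fraction bits
    return max(-Q15_ONE, min(Q15_ONE - 1, rounded))  # saturate the (-1) * (-1) case


half, quarter = to_q15(0.5), to_q15(0.25)
print(mul_keep_high_word(half, quarter) / (1 << 14))  # result read with 14 fraction bits
print(mul_shift_and_round(half, quarter) / Q15_ONE)   # result read with 15 fraction bits
print(mul_keep_high_word(3, half), mul_shift_and_round(3, half))  # small operand: low bit matters
```

The last line shows the effect on a small operand: the truncated upper word loses the value entirely, while the shifted and rounded result keeps its least significant bit.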

[0004]

[Problems to Be Solved by the Invention] The conventional method based on a one-bit left shift requires a step that restores computational accuracy, so the processing becomes heavier and it can be difficult to raise the operation speed. Recently, processors with operating frequencies of 500 to 1000 MHz have come to be required, and the number of gate stages that can be spent per machine cycle is decreasing, so any increase in processing must be avoided. As for methods using the addition theorem, most proposals concern special cases of the DCT, and almost none deal with the inverse DCT. This is probably because direct computation of the two-dimensional DCT/inverse DCT with the addition theorem is generally complicated. It must be said that conventional direct computation methods have not been suited to SIMD operation.

[0005] The present inventor focused on the periodicity of trigonometric functions and realized that, if some regularity in the operations could be found, it might be possible to provide a means of executing them efficiently.

[0006]

[Means for Solving the Problems] In the present invention, the two-dimensional DCT/inverse DCT is computed directly by using the addition theorem of trigonometric functions, 2cos(i)cos(j) = cos(i + j) + cos(i − j), to convert products into sums, so that only one multiplication occurs on the operation path. For example, when performing a two-dimensional DCT/inverse DCT, the invention exploits the periodicity of the trigonometric operations to find regularity, and performs the variable operation by efficiently combining butterfly operation units (units that perform an addition and a subtraction as a pair) and multiplication units according to a variable operation sequence control command. Once the regularity of the operations is found, the way is open to making the operation units common and shared. According to one embodiment, so that the operation paths of the butterfly operation units and the multiplication units can be shared across the computation of several terms, selection is controlled by selector instruction signals stored in registers, which provides an efficient means of computation. So that the processor can be used not only for the DCT but also for, for example, the Fourier transform and the Hartley transform, means can also be provided for replacing the contents of the registers that store the selector instruction signals and for replacing the coefficients used in the multiplication units.
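The product-to-sum conversion can be checked numerically. The sketch below assumes, as in the conventional 8-point DCT, that the cosine arguments are integer multiples of π/16; the function names are illustrative and do not come from the patent.

```python
import math


def kernel_double_product(a: int, b: int) -> float:
    """Kernel as a double product: two cosine factors multiply each sample."""
    return math.cos(a * math.pi / 16) * math.cos(b * math.pi / 16)


def kernel_as_sum(a: int, b: int) -> float:
    """Same kernel via 2cos(i)cos(j) = cos(i + j) + cos(i - j).

    Each sample now meets a single cosine coefficient per term, and the
    factor 0.5 is a one-bit shift in fixed point.
    """
    return 0.5 * (math.cos((a + b) * math.pi / 16)
                  + math.cos((a - b) * math.pi / 16))


# (2n + 1) * k stays below 106 for n, k in 0..7, so this range covers every
# argument pair that can occur in an 8x8 transform.
worst = max(abs(kernel_double_product(a, b) - kernel_as_sum(a, b))
            for a in range(106) for b in range(106))
print(worst)
```

The two forms agree to within floating-point rounding, which is what allows the double product to be replaced by a sum of single-coefficient terms.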

[0007]

[Embodiments of the Invention] FIG. 1 shows a variable operation processor 100 according to the present invention, in which a variable-structure arithmetic unit is mounted on a microprocessor of conventional configuration. The conventional microprocessor comprises a CPU core 110, a memory data bus 120, a control unit 130, an instruction cache 131, a data cache 133, a memory control unit 132, a register file 134, arithmetic units 135, a peripheral module circuit 136, and the like.

[0008] In one embodiment of the present invention, the variable-structure arithmetic unit is designed so that heavily loaded processing such as the DCT/inverse DCT can be executed efficiently. In the figure, 160 denotes a selector for switching between DCT and inverse DCT operation, 165 a selector for the post-addition of the inverse DCT or the pre-addition of common-coefficient terms of the DCT, 170 a programmable arithmetic unit, 171 and 176 latches for operation path instruction signals, 172 a selector for the variable operation path, 175 a multiplier/adder, and 180 a butterfly operation block. A butterfly operation is an arithmetic element in which an addition and a subtraction are performed as a pair, and it is an important arithmetic element of the programmable arithmetic unit 170. Examples are shown as 1020, 1040 and 1050 in FIG. 10. The flow of the input/output data path of the programmable arithmetic unit 170 is variably controlled by the selector 172, and the arithmetic elements are configured and controlled so that they are shared as much as possible. The selection control information 171 and 176 is held in memories 151 and 152, which can be accessed at high speed. The sequencer 140 controls the execution of the arithmetic units 170, 175 and 180.
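As a rough software analogy for this arrangement, the following Python sketch routes data through butterfly elements according to control words read step by step. The table `CONTROL_WORDS` is a hypothetical stand-in for the selection control information held in the high-speed memory, and the loop plays the role of the sequencer; none of the names or values are taken from the patent.

```python
def butterfly(a, b):
    """One butterfly element: an addition and a subtraction performed as a pair."""
    return a + b, a - b


# Hypothetical control words: each step lists the operand index pairs
# that the path selector routes into the butterfly elements.
CONTROL_WORDS = [
    [(0, 3), (1, 2)],  # step 0
    [(0, 2), (1, 3)],  # step 1
]


def run_sequence(data, control_words):
    """Apply one control word per step, reusing the same butterfly elements."""
    for pairs in control_words:
        routed = []
        for i, j in pairs:
            routed.extend(butterfly(data[i], data[j]))
        data = routed
    return data


result = run_sequence([1, 2, 3, 4], CONTROL_WORDS)
print(result)
```

Changing only the control words changes which operands meet in each butterfly, which is the sense in which the same arithmetic elements are shared across different computations.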

[0009] The algorithm for the two-dimensional DCT/inverse DCT operation implemented on the variable operation processor 100 of the present invention is described next. The two-dimensional 8×8 DCT/inverse DCT is expressed as follows, simplified for ease of explanation to an extent that does not lose the essence. First, the DCT is

[0010]

(Equation 1)

[0011] and the inverse DCT is

[0012]

(Equation 2)

[0013] Here, c₀ = 1/√2 and c_n = 1 (n ≠ 0).
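Equations 1 and 2 are not reproduced in this text. For orientation only, the sketch below implements the conventional 8×8 DCT pair with the same constants c₀ = 1/√2 and c_n = 1; the 1/4 scale factor and the (2n + 1)kπ/16 arguments are the textbook form and are an assumption here, since the patent's simplified expressions may omit or rearrange them.

```python
import math

N = 8


def c(n: int) -> float:
    """Normalization constant: 1/sqrt(2) for index 0, otherwise 1."""
    return 1 / math.sqrt(2) if n == 0 else 1.0


def cos_term(n: int, k: int) -> float:
    """Cosine basis value for sample index n and frequency index k."""
    return math.cos((2 * n + 1) * k * math.pi / (2 * N))


def dct_8x8(x):
    """Conventional forward 8x8 DCT (assumed form, not the patent's Equation 1)."""
    return [[0.25 * c(k1) * c(k2)
             * sum(x[n1][n2] * cos_term(n1, k1) * cos_term(n2, k2)
                   for n1 in range(N) for n2 in range(N))
             for k2 in range(N)] for k1 in range(N)]


def idct_8x8(X):
    """Conventional inverse 8x8 DCT (assumed form, not the patent's Equation 2)."""
    return [[0.25 * sum(c(k1) * c(k2) * X[k1][k2]
                        * cos_term(n1, k1) * cos_term(n2, k2)
                        for k1 in range(N) for k2 in range(N))
             for n2 in range(N)] for n1 in range(N)]


# Arbitrary test block; the inverse transform should restore it.
block = [[(3 * r + 5 * col) % 17 for col in range(N)] for r in range(N)]
restored = idct_8x8(dct_8x8(block))
max_error = max(abs(restored[r][col] - block[r][col])
                for r in range(N) for col in range(N))
print(max_error)
```

Each output here costs two cosine multiplications per input sample, which is the double product the invention sets out to avoid.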

[0014] Next, noting that Equations 1 and 2 are periodic,

[0015]

(Equation 3)

[0016] and rearranging the input data sequence in this way, they can be rewritten as

[0017]

(Equation 4)

[0018]

(Equation 5)

[0019] respectively.

[0020] Now, by the addition theorem of trigonometric functions,

[0021]

(Equation 6)

[0022] holds. Therefore, focusing on k in the argument of the trigonometric function, we put

[0023]

(Equation 7)

[0024] and, focusing on n,

[0025]

(Equation 8)
[0026] With these definitions, the values of g(k₁, k₂) for the DCT and G(n₁, n₂) for the inverse DCT are computed and shown in FIG. 2 and FIG. 3, respectively. Within each cell, the upper row gives the value of the left term inside the parentheses of Equations 7 and 8, and the lower row gives the value of the right term. FIG. 3 is simplified using the periodicity of the cosine function. The notation means g(−(k₁, k₂)) = −g(k₁, k₂) and G(n₁, n₂) = −G([n₁, n₂]). In addition, #0 indicates that the cosine value is zero (cos(i) = 0). This must not be confused with 0, which is the notation for g(0) and therefore means cos(0) = 1.

[0027] First, consider the DCT briefly with reference to FIG. 2. In FIG. 2 (200), the upper-left quarter 201 of the table, divided into four, is repeated in 202, 203 and 204 apart from sign. If this upper-left quarter 201 is itself divided into four, then when one of k₁ and k₂ is held fixed and the k that is not fixed is even, we have, for example, ((13k₁ − 5k₂) − (5k₁ − 5k₂))/16 = 8k₁/16, so the cosine values repeat apart from sign. When both k₁ and k₂ are varied, we have, for example, ((13k₁ − 5k₂) − (5k₁ − 13k₂))/16 = 8(k₁ + k₂)/16, so if (k₁ + k₂) is even the cosine values repeat along the diagonal apart from sign.
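The two repetition rules can be checked numerically. The sketch assumes the cosine argument is the quoted index expression times π/16 (the unit is not stated explicitly in the surviving text), and the helper names are illustrative.

```python
import math


def g(m: int) -> float:
    """Cosine at an integer multiple of pi/16 (angle unit assumed)."""
    return math.cos(m * math.pi / 16)


def same_magnitude(m_a: int, m_b: int) -> bool:
    """True when the two cosine values agree apart from sign."""
    return math.isclose(abs(g(m_a)), abs(g(m_b)), abs_tol=1e-12)


# One index fixed (k2) and the other even: arguments differ by 8*k1/16.
one_fixed = all(same_magnitude(13 * k1 - 5 * k2, 5 * k1 - 5 * k2)
                for k1 in range(0, 8, 2) for k2 in range(8))

# Both indices moved: arguments differ by 8*(k1 + k2)/16.
both_moved = all(same_magnitude(13 * k1 - 5 * k2, 5 * k1 - 13 * k2)
                 for k1 in range(8) for k2 in range(8)
                 if (k1 + k2) % 2 == 0)

# With odd k1 the shift is an odd multiple of pi/2, so the rule need not hold.
odd_case = same_magnitude(13 * 1 - 5 * 0, 5 * 1 - 5 * 0)

print(one_fixed, both_moved, odd_case)
```

A shift of 8k/16 in these units is kπ/2, which is a multiple of π exactly when k is even; that is why only the sign can change.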

[0028] Next, for the inverse DCT, rewriting FIG. 3 so that terms with the same value are highlighted and easy to see gives FIG. 4 (400). The inverse DCT is thus more complicated than the DCT, but it has a certain symmetry: the upper-left quarter 401 of FIG. 4 is rotationally symmetric. The actual values of G(n₁, n₂) are computed and shown in FIG. 5 for the upper-left quarter of the overall table. As shown in FIG. 6, the rest of the overall table (the lower-left, upper-right and lower-right quarters) consists of combinations of the basic pattern 400 of FIG. 4 with changed signs. There are four such combinations: 601, 602, 603 and 604. It is therefore sufficient to compute FIG. 4 first. This stage is called pre-addition.

[0029] The DCT values corresponding to FIG. 5 for the inverse DCT are computed and shown in FIG. 7. To help in understanding the explanation so far, the differences between the DCT and inverse DCT computations are summarized once more. In the DCT, the basic pattern 200 is divided into four and repeated within each 8×8 block, whereas in the inverse DCT the whole 8×8 block is divided into four, and four kinds of blocks appear, each a sign combination of the basic pattern 400. Furthermore, the DCT divides into blocks of odd terms g(2k + 1) and even terms g(2k), whereas in the inverse DCT all terms g(k) appear. These differences must be taken into account in order to share operation blocks between the DCT and the inverse DCT. For that reason, the operation procedure and operation block configuration of the inverse DCT are clarified first, and the DCT is considered afterwards.

[0030] For the inverse DCT, in order to gather common terms and perform the pre-addition 170 efficiently, the data at pixel position (i, j) is written as pij, as shown in FIG. 8 and FIG. 9. When the pre-addition procedure is laid out separately for each g(k), it turns out that it can be carried out with butterfly operations, in which an addition and a subtraction are performed as a pair. In the figures, one block's worth of operations is delimited by commas (,) roughly line by line, and the same position in each line corresponds to the operation for the same block. This pre-addition between pixel data is performed in advance by the arithmetic unit 170; the result is then multiplied by g(k) only once by the multiplier/adder 175, and the results for all k are added. The multiply-add results 175 are combined according to the sign relationships of the repeating pattern shown in FIG. 6 to compute the four kinds 1090, 1091, 1092 and 1093. This is done by the butterfly operation block 180 and is called post-addition.

[0031] To be thorough, FIG. 8 is explained in more detail. The expression in the first line for g(1),

[0032]

(Equation 9)

[0033] is a notation that combines the following two expressions.

[0034]

(Equation 10)

[0035]

(Equation 11)

[0036] The other terms are written in the same way. Likewise, in FIG. 9, expressions such as

[0037]

(Equation 12)

[0038] are a notation that combines the following two expressions.

[0039]

(Equation 13)

[0040]

(Equation 14)

[0041] FIG. 10 and FIG. 11 show a configuration in which any data path can be selected, so that, following the operation procedure of FIG. 8, the operation paths 1030, 1031, 1060, 1061 and 1062 are selected under control, the butterfly operations 1020 and 1050 are performed, and their results are multiplied by the even terms g(2k) (1081, 1082, 1083) and the odd terms g(2k + 1) (1071, 1072, 1073, 1074). The multiplication coefficients g(k) are stored in the memory 152.

[0042] FIG. 12 and FIG. 13 show the procedure of the DCT operation, and FIG. 14 and FIG. 15 show the configuration that controls its operation paths. Since FIG. 14 and FIG. 15 have the same structure as FIG. 8 and FIG. 9, it is clear that the two can be made common, and this is one of the important features of the embodiment of the present invention.

[0043] The parts common to the DCT and the inverse DCT have been described; the parts that are essentially different are now considered once more. In the DCT, as shown in FIG. 2, groups of four terms with the same coefficient are added first. In the inverse DCT, on the other hand, the four sign combinations 1090, 1091, 1092 and 1093 of the basic pattern are added after the multiplication. Since both involve two stages of addition, these operations can be made common. FIG. 16 shows an embodiment of the part that is selected by the selector 165 at the entrance (DCT) or the exit (inverse DCT) and executes the two-stage addition 180. FIG. 17 shows an embodiment of the selector 160, which selects whether the DCT or the inverse DCT is computed.
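One way to picture a shared two-stage adder is the Python sketch below, in which two layers of butterfly elements produce both the plain four-term sum (as needed at the DCT entrance) and three signed combinations (as could serve an inverse DCT exit). The actual sign patterns are those of FIG. 6, which is not reproduced here, so the combinations produced below are an assumption chosen only to illustrate the two-stage structure.

```python
def butterfly(a, b):
    """An addition and a subtraction performed as a pair."""
    return a + b, a - b


def two_stage_add(t0, t1, t2, t3):
    """Two layers of butterflies over four terms.

    The first output is the plain sum of all four terms; the remaining
    outputs are signed combinations (assumed patterns, not taken from FIG. 6).
    """
    s01, d01 = butterfly(t0, t1)    # stage 1
    s23, d23 = butterfly(t2, t3)
    total, alt_a = butterfly(s01, s23)  # stage 2
    alt_b, alt_c = butterfly(d01, d23)
    return total, alt_a, alt_b, alt_c


print(two_stage_add(1, 2, 3, 4))
```

Because the same adders serve both directions, only the selector setting decides whether they sit before the multipliers or after them.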

[0044] The DCT/inverse DCT operation procedure and the configuration of its arithmetic units have been described above; the configuration of FIG. 1 is now explained further. The selector 172 provided in FIG. 1 to control the operation path corresponds to the selectors 1030-1031 and 1060-1061 in FIG. 10 (and likewise in FIGS. 11, 14 and 15). To control these selectors 172, the control signal sequence is stored in the high-speed access memory 151, and the control signals are supplied by the sequencer 140, which consists of a sequence program 141 and an address decoder 142 for the high-speed access memory.

[0045] In the variable operation processor 100 of the present invention, variable operations other than the DCT, such as the Fourier transform or the Hartley transform, can be performed by replacing the contents of the high-speed access memories 151 and 152 via the memory bus 120 and also replacing the program 141 of the sequencer 140.

[0046] FIG. 18 shows a flip-flop circuit 1800 that can be used as a circuit element of the high-speed access memories 151 and 152. A pair of transistors is connected between each of the power supplies 1801 and 1802 and the grounds 1805 and 1806, and their gates are cross-connected. 1830 and 1831 are terminals that each serve as both input and output, and 1803 and 1804 are gates. The circuit elements 1851 and 1852 are switching elements, which act as control elements to suppress leakage current when it cannot be ignored. Details of this circuit are given in the pending application PCT/JP99/02505, "Memory circuit" (inventor: Tachibana, 橘大), the description of which is incorporated herein.

[0047]

[Effects of the Invention] The variable operation processor described above is effective because it can handle a wide range of Fourier-related transforms, including the DCT, that involve large numbers of butterfly operations and multiplications, and can do so flexibly and at high speed. Furthermore, because the programmable variable structure allows multiple operation blocks to be used in common, it can be realized with compact hardware.

[Brief description of the drawings]

FIG. 1 shows a variable operation processor built from a programmable, variable-structure arithmetic unit.

FIG. 2 shows the values of g(k₁, k₂) for the two-dimensional 8×8 DCT.

FIG. 3 shows the values of g(n₁, n₂) for the two-dimensional 8×8 inverse DCT.

FIG. 4 shows the values of g(n₁, n₂) for the two-dimensional 8×8 inverse DCT, written so as to highlight the common-coefficient relationships between pixels.

FIG. 5 shows the concrete values of g(n₁, n₂) for the two-dimensional 8×8 inverse DCT (the upper-left quarter of the overall table).

FIG. 6 shows the signs of the basic pattern in each of the four divisions of the overall table of g(n₁, n₂) for the two-dimensional 8×8 inverse DCT.

FIG. 7 shows the concrete values of g(k₁, k₂) for the two-dimensional 8×8 DCT (the overall table, but with only the upper-left quarter of each block).

FIG. 8 shows the operation procedure for the odd terms of the two-dimensional 8×8 inverse DCT.

FIG. 9 shows the operation procedure for the even terms of the two-dimensional 8×8 inverse DCT.

FIG. 10 shows the control block for the operation paths of the two-dimensional 8×8 inverse DCT (part 1).

FIG. 11 shows the control block for the operation paths of the two-dimensional 8×8 inverse DCT (part 2).

FIG. 12 shows the operation procedure for the odd terms of the two-dimensional 8×8 DCT.

FIG. 13 shows the operation procedure for the even terms of the two-dimensional 8×8 DCT.

FIG. 14 shows the control block for the operation paths of the two-dimensional 8×8 DCT (part 1).

FIG. 15 shows the control block for the operation paths of the two-dimensional 8×8 DCT (part 2).

FIG. 16 shows the selector and arithmetic circuit that select between the post-addition of the inverse DCT and the initial addition of common-coefficient terms of the DCT.

FIG. 17 shows the selector that chooses between inverse DCT and DCT operation.

FIG. 18 is a structural diagram of the flip-flop circuit used in the high-speed access memory circuit.

[Explanation of symbols]

100 ... variable operation processor
110 ... CPU core
120 ... memory bus
130 ... processor control block
131 ... instruction cache
132 ... memory control block
133 ... data cache
134 ... register file
135 ... arithmetic units
136 ... peripheral module block
140 ... sequencer of the variable-structure arithmetic unit
141 ... sequence program
142, 143 ... address decoders of the high-speed access memory
151, 152 ... high-speed access memory
160 ... selector for DCT and inverse DCT operation
165 ... selector for the post-addition of the inverse DCT or the pre-addition of common-coefficient terms of the DCT
170 ... programmable arithmetic unit
171, 176 ... latches for operation path instruction signals
172 ... selector for the variable operation path
180 ... butterfly operation block
1020, 1040, 1050 ... butterfly operation units
1030, 1031, 1060, 1061, 1062 ... selectors
1071, 1072, 1073, 1074, 1080, 1081, 1082, 1083 ... multipliers
1070, 1080 ... multiplier/adders
1800 ... flip-flop circuit

Claims

[Claims]

1. A variable operation processor comprising: a plurality of pairs of adders each selectable between addition and subtraction; a plurality of multipliers; a first plurality of variably selectable paths provided on the input side of the adders; and a second plurality of variably selectable paths provided between the adders and the multipliers, wherein the first group of paths and the second group of paths are selected under control based on a variable operation sequence control command to perform a variable operation on data.

2. The variable operation processor according to claim 1, which executes a DCT operation of two or more dimensions.

3. The variable operation processor according to claim 1, which executes an inverse DCT operation of two or more dimensions.

4. The variable operation processor according to claim 1, which executes a DCT operation and an inverse DCT operation of two or more dimensions.

5. The variable operation processor according to claim 1, which executes Fourier-related and inverse Fourier-related operations of two or more dimensions.

6. The variable operation processor according to claim 1, wherein low-leakage-current switching elements are used in a high-speed access memory circuit that stores the variable operation sequence control command.