JP2001014160A

JP2001014160A - Information processor

Info

Publication number: JP2001014160A
Application number: JP11188372A
Authority: JP
Inventors: Toru Hiraoka; 徹平岡; Tomonaga Itoi; 朋永糸井; Masashi Hakamata; 正史袴田
Original assignee: Hitachi Ltd; Hitachi ULSI Systems Co Ltd
Current assignee: Hitachi Ltd; Hitachi Solutions Technology Ltd
Priority date: 1999-07-02
Filing date: 1999-07-02
Publication date: 2001-01-19
Anticipated expiration: 2019-07-02
Also published as: JP3668643B2

Abstract

PROBLEM TO BE SOLVED: To advance pipeline processing while reducing the delay caused by reading a branch-destination instruction, by requesting memory to read the branch-destination instruction of a branch instruction into an instruction buffer when the branch instruction is decoded. SOLUTION: In an IF stage, the instructions set in BIRP 400 and BIRS 500 are decoded by a first instruction decoder 900. When the first instruction decoder 900 decodes a branch instruction, an instruction read request for the branch-destination instruction is issued to an instruction cache 100. The instructions set in BIRP 400 and BIRS 500 are stored in IFR 1000, and the instruction set in BIRP 400 is transferred to a selection circuit 1010. Thus, two instructions are decoded per machine cycle in the IF stage, which is the first instruction decoding stage, and transferred to a D stage, which is the second instruction decoding stage. In the second instruction decoding stage, the instructions set in IRP 1200 and IRS 1300 are decoded by a second instruction decoder 1700.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

Field of the Invention: The present invention relates to a pipelined information processing apparatus, and more particularly to an information processing apparatus capable of executing branch instructions efficiently.

[0002]

Description of the Related Art: FIG. 4 is a block diagram showing an example configuration of the instruction register and instruction decoder portion of a pipelined information processing apparatus according to the prior art. FIG. 5 is a block diagram showing an example configuration of an information processing apparatus including the instruction registers and the instruction decoder. FIG. 6 shows an example of the flow of an instruction sequence that includes a branch instruction. FIG. 7 is a timing chart explaining the operation for the instruction sequence shown in FIG. 6. The prior-art information processing apparatus is described below with reference to FIGS. 4 to 7.

[0003] In FIG. 4, 100 is an instruction cache holding a copy of memory; 200 is an instruction buffer (IBR) that holds a plurality of instructions read from the instruction cache; 1200 is a first instruction register (IRP) that holds the instruction to be executed next; 1300 is a second instruction register (IRS) that holds the instruction following the one in IRP 1200; 1500 is an identifier (IRPV) indicating that the instruction held in IRP 1200 is valid; 1600 is an identifier (IRSV) indicating that the instruction held in IRS 1300 is valid; 300 is an IBR control circuit that controls IBR 200, IRP 1200, IRS 1300, IRPV 1500, and IRSV 1600; and 1700 is an instruction decoder that decodes the instructions held in IRP 1200 and IRS 1300.

[0004] A plurality of instructions read from the instruction cache 100 are held in IBR 200. The instruction to be executed next is extracted from IBR 200 and set in IRP 1200. The instruction following the one set in IRP 1200 is extracted from IBR 200 at the same time and set in IRS 1300.

[0005] High-performance information processing apparatuses generally adopt a superscalar scheme in which multiple instructions are processed simultaneously. The IBR control circuit 300 determines whether the combination of instructions set in IRP 1200 and IRS 1300 can be processed superscalarly. If it can, the IBR control circuit 300 sets '1' in both IRPV 1500 and IRSV 1600 to indicate that both instructions are valid. If the combination cannot be processed superscalarly, or if the instruction following the one set in IRP 1200 has not yet been stored in IBR 200, the IBR control circuit 300 sets '1' only in IRPV 1500.

[0006] If the instruction to be set in IRP 1200 has not yet been stored in IBR 200, the IBR control circuit 300 sets '0' in both IRPV 1500 and IRSV 1600. The IBR control circuit 300 also requests IBR 200 to extract the instructions to be executed next. If both IRPV 1500 and IRSV 1600 are '1', it requests extraction starting from the instruction following the one set in IRS 1300; if only IRPV 1500 is '1', it requests extraction starting from the instruction following the one set in IRP 1200. When space becomes available in IBR 200, the IBR control circuit 300 requests the instruction cache 100 to read instructions.
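The valid-flag and extraction rules of the prior-art IBR control circuit described in the two preceding paragraphs can be summarised as a small decision function. The following Python sketch is illustrative only: the function name, its boolean inputs, and the `advance` return value (how many instructions the next extraction skips) are assumptions introduced here and do not appear in the patent.

```python
def prior_art_ibr_control(irp_ready, irs_ready, pairable):
    """Sketch of the decision made by the prior-art IBR control circuit 300.

    irp_ready: the instruction for IRP 1200 is already in IBR 200
    irs_ready: the following instruction is already in IBR 200
    pairable:  the two instructions can be processed superscalarly
    Returns (IRPV, IRSV, advance). `advance` is a hypothetical count of
    instructions consumed, i.e. where the next extraction starts.
    """
    if not irp_ready:
        # Nothing valid yet. Assumed here: extraction does not advance.
        return 0, 0, 0
    if irs_ready and pairable:
        # Both valid: next extraction starts after the IRS instruction.
        return 1, 1, 2
    # Only IRP valid: next extraction starts after the IRP instruction.
    return 1, 0, 1
```

For example, `prior_art_ibr_control(True, True, False)` models a pair that contends for the same resource: only the first instruction is marked valid, and extraction resumes from the instruction after it.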

[0007] The instruction decoder 1700 decodes the instructions stored in IRP 1200 and IRS 1300. When a branch instruction is decoded in IRP 1200 or IRS 1300, the instruction decoder 1700 requests the instruction cache 100 to read the branch-destination instruction.

[0008] Next, the processing in each pipeline stage of the prior-art information processing apparatus is described with reference to FIG. 5. In FIG. 5, 100 is the instruction cache; 200 is the IBR; 1200 is the IRP; 1300 is the IRS; 1700 is the instruction decoder; 2000 is a set of general-purpose registers consisting of, for example, 16 registers; 2100 is an operand address adder that calculates the address of the memory operand needed to execute the instruction's operation, using the contents of the general-purpose register 2000 designated for operand address calculation by the decoding result of the instruction decoder 1700 and the displacement value specified by the instruction; 2200 is an operand cache holding a copy of memory; 2300 is an aligner that rearranges the data read from the operand cache 2200 so that the portion used in the operation comes first; and 2400 is an arithmetic unit that performs the operation on the memory operand rearranged by the aligner 2300 and the contents of the general-purpose register 2000 designated for the operation by the decoding result of the instruction decoder 1700.

[0009] In the D stage, the instructions set in IRP 1200 and IRS 1300 are decoded by the instruction decoder 1700 within the same stage. Based on the decoding result, the register with the designated number is read from the general-purpose registers 2000 and transferred to the operand address adder 2100. The displacement value, which is another decoding result, is also transferred to the operand address adder 2100.

[0010] In the A stage, the operand address adder 2100 performs address calculation using the contents of the designated general-purpose register and the displacement value, and computes the operand address at which the memory operand needed for the instruction's operation is stored. The operand address obtained by the operand address adder 2100 is transferred to the operand cache 2200.

[0011] In the T stage, the operand cache is referenced. The data read from the operand cache is transferred to the aligner 2300.

[0012] In the B stage, the data read from the operand cache 2200 is rearranged so that the operand data is arranged in order.

[0013] In the L stage, the data is transferred to the arithmetic unit. In the E stage, the operation is performed using the memory operand from the aligner 2300 and the register operand from the general-purpose registers 2000. The result of the operation is written to the general-purpose registers 2000. In this way, an instruction is executed by breaking it down into six pipeline stages: D, A, T, B, L, and E.

[0014] Next, a series of processing steps for an instruction sequence containing a branch instruction is described. In the instruction sequence shown in FIG. 6, L denotes a load instruction, A an add instruction, ST a store instruction, C a compare instruction, and BC a conditional branch instruction. GR1 to GR4 denote the numbers of the general-purpose registers used in the instructions' operations, and test1 to test6 and pr1 are labels denoting areas in memory.

[0015] FIG. 7 shows a timing chart for executing the instruction sequence shown in FIG. 6. In FIG. 7 the horizontal axis represents time, and one division corresponds to one machine cycle. The numbers 1 to 21 on the horizontal axis are cycle numbers added for convenience of explanation. The processing of the instruction sequence of FIG. 6 is described below with reference to FIGS. 5 and 7.

[0016] In cycle 2, the L instruction is set in IRP 1200. At this time the A instruction that follows the L instruction is set in IRS 1300, but because both the L instruction and the A instruction require a memory operand reference, they contend for the operand address adder 2100 and the operand cache 2200, and superscalar processing cannot be performed. Therefore only the L instruction is decoded in cycle 2. Thereafter, address calculation is performed in cycle 3, the operand cache is referenced in cycle 4, the read data is aligned in cycle 5, the data is transferred to the arithmetic unit 2400 in cycle 6, and the operation is executed in cycle 7. The subsequent A, ST, L, A, ST, and L instructions are processed in the same manner.
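The cycle numbers quoted above for the L instruction follow directly from the six-stage pipeline when each stage takes one machine cycle and nothing stalls. The Python sketch below reproduces that mapping; the names `STAGES` and `stage_cycles` are hypothetical, and the one-cycle-per-stage, no-stall assumption is a simplification made for illustration.

```python
# Stage order taken from the description: D, A, T, B, L, E.
STAGES = ("D", "A", "T", "B", "L", "E")


def stage_cycles(decode_cycle):
    """Map each stage to the cycle in which it runs.

    Assumes one machine cycle per stage and no stalls after decode.
    """
    return {stage: decode_cycle + offset for offset, stage in enumerate(STAGES)}


# The L instruction decoded in cycle 2 reaches the E stage in cycle 7.
timeline = stage_cycles(2)
```

With `decode_cycle=2`, the sketch gives A in cycle 3, T in cycle 4, B in cycle 5, L in cycle 6, and E in cycle 7, matching the text.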

[0017] Next, in cycle 9, the C instruction is set in IRP 1200. At the same time, the BC instruction is set in IRS 1300. Because the BC instruction does not use the operand address adder 2100 or the operand cache 2200, it can be processed superscalarly with the C instruction. The BC instruction is therefore decoded in cycle 9, and a request to read the branch-destination instruction is issued to the instruction cache 100. From cycle 10 to cycle 12, the instruction cache 100 is read and the instructions are stored in IBR 200, and in cycle 13 the L instruction, which is the branch-destination instruction, is set in IRP 1200. The A and ST instructions are then processed in order, and execution of the ST instruction completes in cycle 20.

[0018] In the prior-art information processing apparatus described above, reading of the branch-destination instruction begins only after the branch instruction has been decoded, so three idle cycles occur before decoding of the branch-destination instruction can begin. In general, when a branch instruction occurs, decoding of the branch-destination instruction cannot start until the branch-destination instruction has been read, and as a result execution of the instructions that follow the branch instruction is delayed.
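The three idle cycles and the completion in cycle 20 can be checked with simple arithmetic on the cycle numbers given for FIG. 7. The Python sketch below is a worked example under stated assumptions: the function names are hypothetical, the fetch is taken to occupy cycles 10 to 12 as described, and each later instruction is assumed to be decoded one cycle after its predecessor.

```python
PIPELINE_DEPTH = 6  # stages D, A, T, B, L, E


def branch_timeline(branch_decode_cycle, fetch_cycles):
    """Return (target_decode_cycle, idle_cycles) for the prior-art scheme.

    The fetch starts only after the branch is decoded, so the decode
    stage sits idle for the whole fetch.
    """
    target_decode = branch_decode_cycle + fetch_cycles + 1
    idle = target_decode - branch_decode_cycle - 1
    return target_decode, idle


def execute_cycle(decode_cycle):
    """Cycle of the E stage, assuming one cycle per stage and no stalls."""
    return decode_cycle + PIPELINE_DEPTH - 1


# BC decoded in cycle 9, fetch takes cycles 10 to 12.
target_decode, idle = branch_timeline(9, 3)   # (13, 3)
# L is decoded in 13, A in 14, ST in 15; ST executes in cycle 20.
st_done = execute_cycle(target_decode + 2)
```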

[0019] Various improvements have been proposed for the disturbance (delay) in processing caused by the appearance of branch instructions in such pipeline processing. One example is Japanese Unexamined Patent Application Publication No. H07-239781. However, in all of these proposals the speed-up measures operate in response to the decoding of the instruction for execution.

[0020]

Problems to Be Solved by the Invention: In the conventional techniques described above, problems remain: a processing delay is unavoidable when a branch instruction appears in pipeline processing, and a large amount of hardware, such as a buffer memory for storing the addresses of branch-destination instructions, is required.

[0021] An object of the present invention is to solve the above problems of the prior art and to provide an information processing apparatus that, even when a branch instruction occurs, can proceed with pipeline processing, that is, instruction decoding and execution of operations, while minimizing the delay for reading the branch-destination instruction.

[0022]

Means for Solving the Problems: In the instruction prefetch scheme of the present invention, instruction decoding is divided into two stages. In the first instruction decoding stage, a first instruction decoder decodes a plurality of instructions per machine cycle from the instructions read out of the instruction buffer. When a branch instruction is decoded in the first instruction decoding stage, a request is issued to memory to read the branch-destination instruction of that branch instruction into the instruction buffer. The instructions are then decoded sequentially for execution by a second instruction decoder in the second instruction decoding stage.
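The role of the first decoding stage, scanning several instructions per cycle and issuing a read request as soon as a branch is seen, can be illustrated with a short Python sketch. Everything named here is hypothetical: the `Instr` tuple, the `BRANCH_OPCODES` set, and the idea of returning branch targets as a list stand in for the hardware request to the instruction cache and are not taken from the patent.

```python
from collections import namedtuple

# Hypothetical instruction record: an opcode and an optional branch target.
Instr = namedtuple("Instr", ["opcode", "target"])

# BC is the conditional branch in the example sequence; the set is illustrative.
BRANCH_OPCODES = {"BC"}


def first_stage_prefetch_requests(instructions):
    """Return the branch targets to request from the instruction cache.

    Models one machine cycle of the first decoding stage, which looks at
    every instruction presented to it and requests the destination of
    each branch it finds.
    """
    return [ins.target for ins in instructions if ins.opcode in BRANCH_OPCODES]
```

Calling it with a compare followed by a branch, for instance `[Instr("C", None), Instr("BC", "dest")]`, yields `["dest"]`, so the read of the destination can start while earlier instructions are still waiting for the second decoding stage.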

[0023]

Embodiments of the Invention: This embodiment targets a superscalar processing apparatus that effectively has two pipelines. However, as described in the prior-art section, the operand cache cannot be accessed for two instructions at the same time, so two instructions are executed in parallel only when no such simultaneous access is required. Although the embodiment described is a superscalar processing apparatus, the present invention can equally be applied to a scalar processing apparatus.

[0024] This embodiment takes as its example pipeline processing on the assumption that the branch is taken, and it may be combined with techniques such as branch prediction. Furthermore, in this embodiment the instruction cache and the operand cache (each a memory that stores a copy of main memory) are configured separately. The instruction cache and the operand cache may be replaced by a single cache memory that holds both instructions and operands.

[0025] An embodiment of the information processing apparatus according to the present invention is described in detail below with reference to the drawings.

[0026] FIG. 1 is a block diagram showing an example configuration of the instruction register and instruction decoder portion of a pipelined information processing apparatus according to one embodiment of the present invention. FIG. 2 is a block diagram showing an example configuration of the information processing apparatus including the instruction registers and the instruction decoders. FIG. 3 is a timing chart explaining the operation for the instruction sequence shown in FIG. 6. The information processing apparatus according to the present invention is described below with reference to FIGS. 1 to 3.

[0027] In FIG. 1, 100 is an instruction cache holding a copy of memory; 200 is an instruction buffer (IBR) that holds a plurality of instructions read from the instruction cache; 400 is a first branch instruction register (BIRP) that holds the instruction to be decoded next in the first instruction decoding stage; 500 is a second branch instruction register (BIRS) that holds the instruction following the one in BIRP 400; 600 is a set-signal latch (SETBIRD) for setting instructions in BIRP 400 and BIRS 500; 700 is an identifier (BIRPV) indicating that the instruction held in BIRP 400 is valid; 800 is an identifier (BIRSV) indicating that the instruction held in BIRS 500 is valid; 300 is an IBR control circuit that controls IBR 200, BIRP 400, BIRS 500, SETBIRD 600, BIRPV 700, and BIRSV 800; 900 is a first instruction decoder that decodes the instructions held in BIRP 400 and BIRS 500; and 1000 is an instruction flow register (IFR) that sequentially stores the instructions set in BIRP 400 and BIRS 500 and is a group of registers capable of holding, for example, eight instructions.

[0028] 1010 is a selection circuit that selects between the output of BIRP 400 and the output of IFR 1000; 1200 is a first instruction register (IRP) that holds the instruction to be decoded next in the second instruction decoding stage; 1300 is a second instruction register (IRS) that holds the instruction following the one in IRP 1200; 1500 is an identifier (IRPV) indicating that the instruction held in IRP 1200 is valid; 1600 is an identifier (IRSV) indicating that the instruction held in IRS 1300 is valid; 1100 is an IFR control circuit that controls IFR 1000, the selection circuit 1010, IRP 1200, IRS 1300, IRPV 1500, and IRSV 1600; and 1700 is a second instruction decoder that decodes the instructions held in IRP 1200 and IRS 1300.

[0029] The selection circuit 1010 serves to bypass IFR 1000 and store the instruction in BIRP 400 into IRP 1200 when IFR 1000 contains no instruction. It may be omitted if instructions are always routed through IFR 1000, under a design philosophy that accepts idle cycles in some cases. Also, in this embodiment no selection circuit is provided on the path from BIRS 500 to IRS 1300 because of the timing constraints of the stage, but depending on the design a selection circuit may be provided there as well. In addition, the first branch instruction register (BIRP) 400 and the second branch instruction register (BIRS) 500 are illustrated as two physically separate registers, but the essential point is that a plurality of instructions can be read and decoded in one cycle. A single integrated register may be used as long as the necessary writes and reads are possible, and it may then be referred to functionally as the first and second branch instruction registers.

[0030] A plurality of instructions read from the instruction cache 100 are held in IBR 200. The instruction to be decoded next is extracted from IBR 200 and set in BIRP 400. The instruction following the one set in BIRP 400 is extracted from IBR 200 at the same time and set in BIRS 500. The IBR control circuit 300 determines whether the instructions set in BIRP 400 and BIRS 500 are valid. Here, "valid" means that the entire instruction is stored in the register. If the instructions set in BIRP 400 and BIRS 500 are both valid, the IBR control circuit 300 sets '1' in both BIRPV 700 and BIRSV 800 to indicate this.

[0031] If the instruction following the one set in BIRP 400 has not yet been stored in IBR 200, the IBR control circuit 300 sets '1' only in BIRPV 700. If the instruction to be set in BIRP 400 has not yet been stored in IBR 200, the IBR control circuit 300 sets '0' in both BIRPV 700 and BIRSV 800. The IBR control circuit 300 also requests IBR 200 to extract the instructions to be executed next. If both BIRPV 700 and BIRSV 800 are '1', it requests extraction starting from the instruction following the one set in BIRS 500; if only BIRPV 700 is '1', it requests extraction starting from the instruction following the one set in BIRP 400. When space becomes available in IBR 200, the IBR control circuit 300 requests the instruction cache 100 to read instructions.

[0032] The first instruction decoder 900 decodes the instructions stored in BIRP 400 and BIRS 500 that are marked valid by BIRPV 700 and BIRSV 800. When a branch instruction is decoded in BIRP 400 or BIRS 500, the first instruction decoder 900 requests the instruction cache 100 to read the branch-destination instruction. The instructions set in BIRP 400 and BIRS 500 are stored sequentially in IFR 1000. Storage into IFR 1000 is controlled by the IFR control circuit 1100 as follows. If SETBIRD 600 is '1' (that is, IFR 1000 has free space), BIRPV 700 is '1', and BIRSV 800 is '1' (that is, both instructions are present and valid), both the instruction set in BIRP 400 and the instruction set in BIRS 500 are stored in IFR 1000. If SETBIRD 600 is '1', BIRPV 700 is '1', and BIRSV 800 is '0', only the instruction set in BIRP 400 is stored in IFR 1000. If SETBIRD 600 is '1', BIRPV 700 is '0', and BIRSV 800 is '0', or if SETBIRD 600 is '0' (that is, IFR 1000 has no free space), no instruction is stored in IFR 1000.
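The three storage cases just listed amount to a small truth table over SETBIRD, BIRPV, and BIRSV. A minimal Python sketch follows; the function name and the choice to return a count of stored instructions are assumptions made for illustration. The combination BIRPV = '0' with BIRSV = '1' is not covered by the description, so the sketch treats it as storing nothing and flags that in a comment.

```python
def ifr_store_count(setbird, birpv, birsv):
    """Number of instructions written into IFR 1000 this cycle (0, 1, or 2).

    Inputs are the flag values 0 or 1 held in SETBIRD 600, BIRPV 700,
    and BIRSV 800.
    """
    if setbird == 1 and birpv == 1 and birsv == 1:
        return 2  # store both the BIRP 400 and BIRS 500 instructions
    if setbird == 1 and birpv == 1 and birsv == 0:
        return 1  # store only the BIRP 400 instruction
    # SETBIRD = 0 (IFR full) or no valid instruction in BIRP 400.
    # Assumption: the unlisted case BIRPV = 0, BIRSV = 1 also stores nothing.
    return 0
```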

[0033] Following instructions from the IFR control circuit 1100, the selection circuit 1010 selects the output of BIRP 400 when IFR 1000 contains no instruction, and selects the first output of IFR 1000 when IFR 1000 contains an instruction (the selection circuit is as described above). The first output of IFR 1000 carries the earliest-stored of the instructions held in IFR 1000, that is, the instruction to be decoded next. The second output of IFR 1000 carries the instruction following the one on the first output. The output of the selection circuit 1010, that is, the instruction to be decoded next, is set in IRP 1200. The instruction following the one set in IRP 1200 is extracted from IFR 1000 at the same time and set in IRS 1300 via the second output of IFR 1000.
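The bypass behaviour of the selection circuit can be sketched as a function over a queue that stands in for IFR 1000. The Python below is a hedged illustration: the function name, the list representation of the IFR (oldest entry first), and the use of `None` for an empty register are assumptions. It also reflects one reading of the description, namely that IRS 1300 is fed only from the second output of IFR 1000, so nothing reaches IRS 1300 when the bypass is taken.

```python
def select_for_second_stage(ifr_entries, birp_instruction):
    """Return the pair (IRP 1200 input, IRS 1300 input) for this cycle.

    ifr_entries:      instructions in IFR 1000, oldest first
    birp_instruction: instruction currently in BIRP 400, or None
    """
    if not ifr_entries:
        # IFR empty: bypass it and take the BIRP 400 output directly.
        # There is no bypass path to IRS 1300 in this embodiment.
        return birp_instruction, None
    first_output = ifr_entries[0]
    second_output = ifr_entries[1] if len(ifr_entries) > 1 else None
    return first_output, second_output
```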

[0034] The IFR control circuit 1100 determines whether the combination of instructions set in IRP 1200 and IRS 1300 can be processed superscalarly. This determination is based on whether the two instructions would contend for reads from the memory called the operand cache. If the combination of instructions set in IRP 1200 and IRS 1300 can be processed superscalarly, the IFR control circuit 1100 sets '1' in both IRPV 1500 and IRSV 1600.

[0035] If the combination of instructions set in IRP 1200 and IRS 1300 cannot be processed superscalarly, or if the instruction following the one set in IRP 1200 has not yet been stored in IFR 1000, the IFR control circuit 1100 sets '1' only in IRPV 1500. If the instruction to be set in IRP 1200 has not yet been stored in IBR 200, the IFR control circuit 1100 sets '0' in both IRPV 1500 and IRSV 1600. The IFR control circuit 1100 also requests IFR 1000 to extract the instructions to be executed next. If both IRPV 1500 and IRSV 1600 are '1', it requests extraction starting from the instruction following the one set in IRS 1300; if only IRPV 1500 is '1', it requests extraction starting from the instruction following the one set in IRP 1200.

[0036] When the IFR control circuit 1100 detects that all eight instruction registers of IFR 1000 are in use, it issues a request to the IBR control circuit 300 to inhibit the setting of instructions in BIRP 400 and BIRS 500. When the setting of instructions in BIRP 400 and BIRS 500 is inhibited, SETBIRD 600 becomes '0' and no instruction is stored in IFR 1000, so an instruction in IFR 1000 that has not yet been decoded is never overwritten. The second instruction decoder 1700 then decodes the instructions stored in IRP 1200 and IRS 1300.
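The back-pressure described here behaves like a bounded first-in, first-out queue whose "full" condition drives SETBIRD to '0'. The Python sketch below models that with `collections.deque`. The class and method names are hypothetical, the capacity of eight follows the example in the text, and the handling of a two-instruction store when only one slot is free (accept one) is an assumption, since the description covers only the completely full case.

```python
from collections import deque


class InstructionFlowRegister:
    """Toy model of IFR 1000 as a bounded FIFO with back-pressure."""

    def __init__(self, capacity=8):
        self.capacity = capacity
        self.entries = deque()

    def setbird(self):
        """Value of SETBIRD 600: 1 while there is free space, else 0."""
        return 1 if len(self.entries) < self.capacity else 0

    def store(self, instructions):
        """Store instructions if allowed; return how many were accepted."""
        if self.setbird() == 0:
            return 0  # full: nothing is written, so nothing is overwritten
        room = self.capacity - len(self.entries)
        accepted = instructions[:room]  # assumption for the partial case
        self.entries.extend(accepted)
        return len(accepted)

    def take(self, count):
        """Remove up to `count` oldest instructions for the second stage."""
        n = min(count, len(self.entries))
        return [self.entries.popleft() for _ in range(n)]
```

Once the second decoding stage takes instructions out with `take`, `setbird()` returns 1 again and the first stage may resume storing.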

[0037] Next, the processing in each pipeline stage of the information processing apparatus according to the present invention is described with reference to FIG. 2. In FIG. 2, 100 is the instruction cache; 200 is the IBR; 400 is the BIRP; 500 is the BIRS; 900 is the first instruction decoder; 1000 is the IFR; 1010 is the selection circuit; 1200 is the IRP; 1300 is the IRS; 1700 is the second instruction decoder; 2000 is the general-purpose registers; 2100 is an operand address adder that calculates the address of the memory operand needed to execute the instruction's operation, using the contents of the general-purpose register 2000 designated for operand address calculation by the decoding result of the second instruction decoder 1700 and the displacement value specified by the instruction; 2200 is an operand cache holding a copy of memory; 2300 is an aligner that rearranges the data read from the operand cache 2200 so that the portion used in the operation comes first; and 2400 is an arithmetic unit that performs the operation on the memory operand rearranged by the aligner 2300 and the contents of the general-purpose register 2000 designated for the operation by the decoding result of the instruction decoder 1700.

【００３８】In the IF stage, the instructions set in BIRP 400 and BIRS 500 are decoded by the first instruction decoder 900. When the first instruction decoder 900 decodes a branch instruction, an instruction read request for the branch destination instruction is issued to the instruction cache 100. The instructions set in BIRP 400 and BIRS 500 are stored in IFR 1000. The instruction set in BIRP 400 is also transferred to the selection circuit 1010. Thus, the IF stage, which is the first instruction decoding stage, decodes two instructions per machine cycle and transfers them to the D stage, which is the second instruction decoding stage. The circuit that performs the IF stage is hereinafter called the instruction fetch circuit.

【００３９】In the D stage, which is the second instruction decoding stage, the instructions set in IRP 1200 and IRS 1300 are decoded by the second instruction decoder 1700. The second instruction decoder can decode two instructions simultaneously in every combination except the one in which both instructions require a memory operand reference. That is, two instructions can be decoded simultaneously when at least one of them is a register-to-register operation instruction or a branch instruction, neither of which requires a memory operand reference. Because instructions that require a memory operand reference generally occur frequently, the D stage cannot, on average, decode two instructions per machine cycle. In the D stage, the general-purpose register whose number is designated by the decoding result is also read from the general-purpose registers 2000 and transferred to the operand address adder 2100. The displacement value, another decoding result, is likewise transferred to the operand address adder 2100. The circuit that performs the D stage is hereinafter called the decode circuit.
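
The pairing condition stated above reduces to a single boolean test. The following minimal sketch uses a hypothetical function name and represents each instruction only by whether it requires a memory operand reference.

```python
def can_decode_pair(first_needs_memory: bool, second_needs_memory: bool) -> bool:
    """Two instructions can be decoded together unless both need a memory operand."""
    return not (first_needs_memory and second_needs_memory)


# Enumerate all four combinations of the two inputs.
for first in (False, True):
    for second in (False, True):
        print(first, second, can_decode_pair(first, second))
```

Only the combination in which both inputs are `True` is rejected, which matches the statement that at least one of the two instructions must be a register-to-register operation or a branch.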

【００４０】In the A stage, the operand address adder 2100 performs address calculation from the contents of the designated general-purpose register and the displacement value, yielding the operand address at which the memory operand required to execute the instruction's operation is stored. The operand address obtained by the operand address adder 2100 is transferred to the operand cache 2200.

【００４１】In the T stage, the operand cache is referenced. The data read from the operand cache is transferred to the aligner 2300.

【００４２】In the B stage, the data read from the operand cache 2200 is rearranged, and in the L stage it is transferred to the arithmetic unit.

【００４３】In the E stage, an operation is performed on the memory operand from the aligner 2300 and the register operand from the general-purpose register 2000. The result is written to the general-purpose register 2000. Thus, an instruction is decomposed into seven pipeline stages, IF, D, A, T, B, L, and E, and executed.
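
As a small worked example of the seven-stage decomposition, the sketch below assigns a cycle number to each stage of one instruction. It assumes one machine cycle per stage and no stalls between stages; the function name is hypothetical.

```python
STAGES = ("IF", "D", "A", "T", "B", "L", "E")


def stage_cycles(if_cycle: int) -> dict:
    """Map each stage to its cycle, assuming one cycle per stage and no stalls."""
    return {stage: if_cycle + offset for offset, stage in enumerate(STAGES)}


schedule = stage_cycles(1)
print(schedule)
```

Under these assumptions, an instruction that is in the IF stage in cycle 1 is in the D stage in cycle 2 and in the E stage in cycle 7.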

【００４４】Next, a series of processes for an instruction sequence including a branch instruction will be described. FIG. 3 shows a timing chart for executing the instruction sequence shown in FIG. 6. In FIG. 3, the horizontal axis represents time, and one scale division corresponds to one machine cycle. The numbers 1 to 21 on the horizontal axis are cycle numbers assigned for convenience of explanation.

【００４５】The processing of the instruction sequence shown in FIG. 6 is described below with reference to FIGS. 2 and 3. In cycle 1, the L instruction is set in BIRP 400 and the A instruction is set in BIRS 500. The instructions set in BIRP 400 and BIRS 500 in cycle 1 are decoded by the first instruction decoder 900 and transferred to IFR 1000. Because no instruction is stored in IFR 1000 in cycle 1, the selection circuit 1010 selects the output of BIRP 400.

【００４６】In cycle 2, the ST instruction and the L instruction are set in BIRP 400 and BIRS 500, respectively, decoded by the first instruction decoder 900, and transferred to IFR 1000. Subsequently, the A and ST instructions in cycle 3, and the L and C instructions in cycle 4, are set in BIRP 400 and BIRS 500, decoded by the first instruction decoder 900, and transferred to IFR 1000.

【００４７】Next, in cycle 5, the BC instruction is set in BIRP 400 and transferred to IFR 1000. When the first instruction decoder 900 decodes the BC instruction set in BIRP 400 and finds that it is a branch instruction, it requests the instruction cache 100 to read the branch destination instruction. From cycle 6 to cycle 8, the instruction cache 100 is read and the instructions are stored in IBR 200. (That is, storing instructions from the instruction cache 100 into IBR 200 is assumed here to take three machine cycles.) Accordingly, the L instruction and the A instruction, which are the branch destination instructions of the BC instruction, are set in BIRP 400 and BIRS 500 in cycle 9. They are decoded by the first instruction decoder 900 and also transferred to IFR 1000 and the selection circuit 1010. Then, in cycle 10, the ST instruction is set in BIRP 400, decoded by the first instruction decoder 900, and transferred to IFR 1000.
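
The arrival cycle of the branch destination instruction follows from simple arithmetic, sketched below. The function name is hypothetical; the default of three cycles is the cache-to-IBR latency assumed in the example above, and the final `+ 1` reflects that the instruction is set in BIRP 400 in the cycle after the store into IBR 200 completes.

```python
def target_if_cycle(branch_if_cycle: int, cache_to_ibr_cycles: int = 3) -> int:
    """Cycle in which the branch destination instruction is set in BIRP."""
    # The read request is issued in the cycle the branch is pre-decoded;
    # the cache read and the store into IBR occupy the following cycles.
    last_store_cycle = branch_if_cycle + cache_to_ibr_cycles
    return last_store_cycle + 1


print(target_if_cycle(5))
```

For the BC instruction pre-decoded in cycle 5, the cache read occupies cycles 6 to 8 and the result is cycle 9, in agreement with the description.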

【００４８】In this way, the IF stage decodes two instructions every cycle and transfers them successively to IFR 1000. Meanwhile, in cycle 2 the L instruction is selected by the selection circuit 1010 and set in IRP 1200. At this point the A instruction following the L instruction is not set in IRS 1300, because it is still being transferred to IFR 1000. Therefore, only the L instruction is decoded in cycle 2. The L instruction then undergoes address calculation in cycle 3, operand cache reference in cycle 4, alignment of the read data in cycle 5, transfer to the arithmetic unit 2400 in cycle 6, and execution of the operation in cycle 7. The following A, ST, L, A, ST, and L instructions are processed in the same manner.

【００４９】Next, in cycle 9, the C instruction is set in IRP 1200. At the same time, the BC instruction is set in IRS 1300. Because the BC instruction uses neither the operand address adder 2100 nor the operand cache 2200, it can be processed in superscalar fashion together with the C instruction. The BC instruction is therefore decoded by the second instruction decoder 1700 in cycle 9. In cycle 10, the L instruction, which is the branch destination instruction of the BC instruction, is set in IRP 1200. In this case, setting IRP 1200 from IFR 1000 would take one extra machine cycle, so the instruction is set in IRP 1200 through the selection circuit 1010. That is, the branch destination instruction enters the decode stage without an idle cycle.
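
The role of the selection circuit 1010 in cycles 1 and 10 can be illustrated with a minimal sketch. The selection policy shown here (take the oldest undecoded instruction from IFR when there is one, otherwise bypass directly from BIRP) is inferred from those two cycles and is an assumption; the function name and the list representation of IFR are hypothetical.

```python
def select_for_irp(ifr_undecoded: list, birp_output: str) -> str:
    """Choose the instruction to set in IRP for the next D-stage cycle."""
    if ifr_undecoded:
        return ifr_undecoded[0]   # normal path: oldest undecoded instruction in IFR
    return birp_output            # bypass: IFR holds nothing to decode


# Cycle 10 of the example: IFR holds no undecoded instruction,
# so the branch destination L instruction is taken from BIRP.
print(select_for_irp([], "L"))
```

The bypass avoids the extra machine cycle that the path through IFR 1000 would cost, which is why the branch destination instruction reaches the D stage without a gap.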

【００５０】If there were one more machine cycle between the IF stage, in which instructions are set in BIRP 400 and BIRS 500, and the decode stage D, the instruction could be set in IRP 1200 from IFR 1000. In general, instructions are most often set from IFR 1000. Which path is taken depends on the instruction sequence; the behavior described above is simply what results for the sequence illustrated in FIG. 6. The A instruction and the ST instruction are then processed in turn, and execution of the ST instruction completes in cycle 17.
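
The cycle numbers in this walkthrough can be reproduced with a simplified timing model. The sketch below is an illustration, not the circuit itself, and rests on several assumptions: an instruction set in BIRP can reach IRP one cycle later through the selection circuit, any other path through IFR takes two cycles, the D stage decodes in program order, and the L, A, ST, and C instructions each require a memory operand reference while BC does not (this last point is inferred from the described timing). All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    name: str
    if_cycle: int      # cycle in which it is set in BIRP/BIRS
    slot: str          # "P" = BIRP, "S" = BIRS
    needs_mem: bool    # requires a memory operand reference


def d_stage_cycles(program):
    """Return the D-stage cycle of each instruction, in program order."""
    cycles, idx, cycle = [], 0, 1
    while idx < len(program):
        cycle += 1
        first = program[idx]
        delay = 1 if first.slot == "P" else 2   # BIRP bypass vs. path through IFR
        if first.if_cycle + delay > cycle:
            continue                            # idle D-stage cycle
        cycles.append(cycle)
        idx += 1
        if idx < len(program):
            second = program[idx]
            ready = second.if_cycle + 2 <= cycle  # IRS is fed through IFR only
            if ready and not (first.needs_mem and second.needs_mem):
                cycles.append(cycle)
                idx += 1
    return cycles


PROGRAM = [
    Instr("L", 1, "P", True), Instr("A", 1, "S", True),
    Instr("ST", 2, "P", True), Instr("L", 2, "S", True),
    Instr("A", 3, "P", True), Instr("ST", 3, "S", True),
    Instr("L", 4, "P", True), Instr("C", 4, "S", True),
    Instr("BC", 5, "P", False),
    Instr("L", 9, "P", True), Instr("A", 9, "S", True),
    Instr("ST", 10, "P", True),
]

d_cycles = d_stage_cycles(PROGRAM)
print(d_cycles)
print("last E-stage cycle:", d_cycles[-1] + 5)  # E follows D by five stages
```

Under these assumptions the model places C and BC together in the D stage in cycle 9, the branch destination L instruction in cycle 10, and the final ST instruction in the E stage in cycle 17, which agrees with the description.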

【００５１】As described above, in this embodiment the branch destination instruction can easily be read ahead, and the branch destination instruction can be decoded without an idle cycle.

【００５２】In the above embodiment, the first instruction decoder decodes two instructions at a time. In the second instruction decoder, as described above, instructions are not necessarily executed two at a time even in a superscalar machine, so on average fewer than two instructions per cycle are decoded for execution. In a scalar computer, the average does not exceed one decoded instruction per cycle. Thus, to allow prefetching of the branch destination instruction, the number of instructions pre-decoded per cycle by the first decoder is kept larger than the number of instructions decoded per cycle for execution. This is what makes prefetching of the branch destination instruction possible.
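
The benefit of this rate relation can be illustrated with the numbers of the example sequence. The sketch below computes how many D-stage cycles would be left idle after a branch, depending on the cycle in which the read request for the branch destination is issued. The function name is hypothetical, the three-cycle latency is the value assumed in the example, and the second call is a purely hypothetical comparison (a request issued only when the branch reaches the D stage), not a description of any particular prior-art circuit.

```python
def idle_decode_cycles(request_cycle: int, branch_d_cycle: int,
                       cache_to_ibr_cycles: int = 3) -> int:
    """D-stage cycles left empty between the branch and its destination."""
    target_if = request_cycle + cache_to_ibr_cycles + 1  # set in BIRP
    target_d = target_if + 1                             # via the selection circuit
    return max(0, target_d - (branch_d_cycle + 1))


# Embodiment: BC is pre-decoded in cycle 5 and reaches the D stage in cycle 9.
print(idle_decode_cycles(request_cycle=5, branch_d_cycle=9))
# Hypothetical comparison: the request is issued only in cycle 9.
print(idle_decode_cycles(request_cycle=9, branch_d_cycle=9))
```

Because the first decoder runs ahead of the second, the request in the example is issued four cycles before the branch reaches the D stage, which is enough to hide the three-cycle read.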

【００５３】The present invention has been described above on the basis of an embodiment; needless to say, the invention is not limited to that embodiment and can be modified in various ways without departing from its gist.

【００５４】[0054]

【発明の効果】[Effects of the Invention] As described above, according to the present invention, even when a branch instruction occurs, the delay in decoding the instructions following the branch instruction and in executing their operations is reduced, so that an instruction sequence can be processed efficiently.

【図面の簡単な説明】[Brief description of the drawings]

【図１】FIG. 1 is a block diagram showing a configuration example of the instruction register and instruction decoder portion of a pipelined information processing apparatus according to an embodiment of the present invention.

【図２】FIG. 2 is a block diagram showing a configuration example of an information processing apparatus including the instruction register and instruction decoder portion according to an embodiment of the present invention.

【図３】FIG. 3 is a timing chart illustrating the operation of an embodiment of the present invention for the instruction sequence shown in FIG. 6.

【図４】FIG. 4 is a block diagram showing a configuration example of the instruction register and instruction decoder portion of a pipelined information processing apparatus according to the prior art.

【図５】FIG. 5 is a block diagram showing a configuration example of an information processing apparatus including the instruction register and instruction decoder portion according to the prior art.

【図６】FIG. 6 is a diagram showing an example of the flow of an instruction group including a branch instruction.

【図７】FIG. 7 is a timing chart illustrating the operation of the prior art for the instruction sequence shown in FIG. 6.

【符号の説明】[Explanation of symbols]

100: instruction cache
200: instruction buffer (IBR)
300: IBR control circuit
400: first branch instruction register (BIRP)
500: second branch instruction register (BIRS)
600: branch instruction register set signal latch (SETBIRD)
700: BIRP valid identifier (BIRPV)
800: BIRS valid identifier (BIRSV)
900: first instruction decoder
1000: instruction flow register (IFR)
1010: selection circuit
1100: IFR control circuit
1200: first instruction register (IRP)
1300: second instruction register (IRS)
1500: IRP valid identifier (IRPV)
1600: IRS valid identifier (IRSV)
1700: second instruction decoder

Continuation of front page: (72) Inventor Tomonaga Itoi, c/o Enterprise Server Division, Hitachi, Ltd., 1 Horiyamashita, Hadano-shi, Kanagawa. (72) Inventor Masashi Hakamata, c/o Enterprise Server Division, Hitachi, Ltd., 1 Horiyamashita, Hadano-shi, Kanagawa. F-terms (reference): 5B013 AA00 AA01 AA05 BB11; 5B033 AA02 AA07 AA13 BA01

Claims

【特許請求の範囲】[Claims]

【請求項１】An information processing apparatus comprising: an instruction buffer that stores instructions prefetched from a memory; a first instruction register that stores a plurality of instructions read from the instruction buffer; a first instruction decoder that decodes the plurality of instructions in the first instruction register; an instruction flow register that sequentially stores the instructions stored in the first instruction register; a second instruction register that stores an instruction output by the first instruction register or by the instruction flow register; and a second instruction decoder that decodes the instruction stored in the second instruction register, wherein an instruction read request is issued to the memory based on the analysis result of the first instruction decoder.

【請求項２】The information processing apparatus according to claim 1, wherein the first instruction decoder, upon decoding a branch instruction, requests the memory to read the branch destination instruction of that branch instruction.

【請求項３】The information processing apparatus according to claim 1, wherein the first instruction register comprises a plurality of instruction registers.

【請求項４】The information processing apparatus according to claim 1, further comprising a selection circuit that selectively stores either the output of the first instruction register or the output of the instruction flow register in the second instruction register.

【請求項５】The information processing apparatus according to claim 1, further comprising: a first register valid identifier that takes a first value when the instruction in the first register is valid; a set signal latch that takes the first value when the instruction flow register has free space; and a control circuit that causes the instruction stored in the first register to be stored in the instruction flow register when the first register valid identifier has the first value and the set signal latch has the first value.

【請求項６】The information processing apparatus according to claim 5, wherein the control circuit, upon detecting that the instruction flow register is entirely in use, inhibits the set signal latch from taking the first value.

【請求項７】An information processing apparatus comprising: an instruction buffer that stores instructions prefetched from a memory; a first instruction register that stores a plurality of instructions read from the instruction buffer; a first instruction decoder that decodes the plurality of instructions in the first instruction register; an instruction flow register that sequentially stores the instructions stored in the first instruction register; a second instruction register that stores an instruction output by the instruction flow register; and a second instruction decoder that decodes the instruction stored in the second instruction register, wherein an instruction read request is issued to the memory based on the analysis result of the first instruction decoder.

【請求項８】An information processing apparatus having an instruction pipeline processing device, comprising: an instruction cache; an instruction buffer that stores instructions prefetched from the cache; a register into which a plurality of instructions are read from the instruction buffer in one machine cycle of the pipeline processing device; a first instruction decoder that decodes the plurality of read instructions and, when a branch instruction is decoded, requests the instruction cache to prefetch the branch destination instruction; and a second instruction decoder that decodes, for instruction execution, the instructions read from the instruction buffer, wherein the number of instructions read from the instruction buffer in one machine cycle is larger than the average number of instructions decoded per machine cycle by the second instruction decoder.

【請求項９】An information processing apparatus having an instruction pipeline processing device and having first and second instruction decoders, an instruction buffer that stores prefetched instructions, and an instruction flow register that temporarily stores instructions read from the instruction buffer, the apparatus comprising: an instruction fetch circuit that reads a plurality of instructions in one machine cycle of the pipeline processing device, decodes the read instructions with the first instruction decoder, causes the branch destination instruction to be prefetched into the instruction buffer if the decoding finds a branch instruction, and transfers the read instructions to the instruction flow register; and a decode circuit that is provided subsequent to the instruction fetch stage and decodes the instructions in the instruction flow register with the second instruction decoder.

【請求項１０】The information processing apparatus according to claim 9, wherein the number of instructions read in one machine cycle in the instruction fetch stage is larger than the average number of instructions decoded in the decode stage.

【請求項１１】An information processing apparatus that is a pipelined processing device in which instructions are prefetched into an instruction buffer, comprising: a second instruction decoder for executing instructions; and a first decoder that decodes the instructions following the instruction being decoded by the second instruction decoder earlier than the second instruction decoder decodes them, wherein, in response to the first decoder detecting a branch instruction, the branch destination instruction is prefetched into the instruction buffer.