JP2006031329A

JP2006031329A - Data processor

Info

Publication number: JP2006031329A
Application number: JP2004208270A
Authority: JP
Inventors: Masahito Matsuo; 雅仁松尾
Original assignee: Renesas Technology Corp
Current assignee: Renesas Technology Corp
Priority date: 2004-07-15
Filing date: 2004-07-15
Publication date: 2006-02-02

Abstract

PROBLEM TO BE SOLVED: To provide a data processor that performs pipeline processing and can guarantee normal operation even when the block to be repeated by a repeat instruction is small.

SOLUTION: When a disp determining part 664 determines that the displacement value is "2" or less, a control signal generating part 665 receives the determination result and sends an "H" (active) instruction fetch stop signal 681 to an instruction fetch request generating part 652. On receiving the "H" instruction fetch stop signal 681, the instruction fetch request generating part 652 starts instruction fetch stop control, which temporarily stops instruction fetches. Instruction fetching then remains stopped until an "H" (active) instruction fetch stop cancellation signal 682 or an "H" jump signal 684 is received.

COPYRIGHT: (C)2006, JPO&NCIPI

Description

The present invention relates to a data processing apparatus having a block repeat instruction.

Data processing apparatuses designed for efficient digital signal processing, such as digital signal processors (DSPs), perform repeated arithmetic and transfer operations very frequently. To reduce the loop overhead of such repetition, many of these apparatuses implement a block repeat instruction, which executes the instructions of a specified repeat target block a specified number of times.

For example, the data processing apparatus disclosed in Patent Document 1 has a block repeat instruction. It eliminates loop control overhead by performing the counter decrement, the termination test on the counter value, and the conditional branch in hardware, in parallel with execution of the instructions in the repeat target block. Executing the block repeat instruction makes the settings needed for repeat processing, such as the start and end addresses of the repeat block and the repeat counter value. However, because the apparatus is pipelined to raise the operating frequency, subsequent instructions have already been fetched by the time the repeat instruction reaches its execution stage. There is therefore a limit on the size of repeat block for which operation can be guaranteed. To satisfy this constraint, the repeat target block sometimes has to be enlarged, for example by unrolling the loop.

Enlarging the repeat target block increases code size. In addition, when the amount of data processed in one pass of the repeat target block increases, the overhead of the remainder handling that the repeat instruction cannot cover may also increase. When the repeat count changes dynamically, the overhead of remainder handling grows further: more processing cycles are spent testing the repeat count, and because code is needed both for that test and for each possible count, the program becomes larger.

When the software is stored in ROM, a larger program requires a larger on-chip instruction ROM, which raises hardware cost.

Furthermore, a larger number of processing cycles means lower performance: the operating frequency needed to implement the required functions rises, and power consumption increases with it.

Moreover, because even simple repetitive processing must be written as a complicated program in order to run fast, the program development burden is heavy and bugs are more likely to be introduced.

Patent Document 1: U.S. Pat. No. 5,901,301

Because the conventional data processing apparatus is configured as described above, a minimum size has to be imposed on the repeat target block of a repeat instruction to accommodate the hardware (pipeline) processing. As a result, code size grows and product cost rises, and the number of processing cycles increases, so that high performance cannot be obtained and power consumption increases.

In addition, because programs become more complicated, software development efficiency falls and bugs are more likely to be introduced.

The present invention has been made to solve the above problems. Its object is to realize a high-performance, low-cost data processing apparatus with good code efficiency by making it possible, in a data processing apparatus that performs pipeline processing, to guarantee normal operation even when the repeat target block of a repeat instruction is small. A further object is to improve the development efficiency of programs for repetitive processing, and to reduce the introduction of bugs, by making the unit of repetition smaller.

The data processing apparatus according to claim 1 of the present invention has a pipeline processing function and a repeat function that, in accordance with a repeat setting instruction defining at least part of the repeat-related settings, repeatedly executes in hardware a repeat target block that contains at least one instruction and whose total code size is a predetermined code size. The apparatus comprises: an instruction fetch unit that fetches instructions to be pipeline processed; a repeat setting instruction determination unit that receives an instruction from the instruction fetch unit and, when that instruction is the repeat setting instruction, performs a determination operation to determine whether the settings of the repeat function for the repeat target block satisfy a predetermined condition; and repeat-time pipeline control means that, when the repeat setting instruction determination unit determines that the predetermined condition is satisfied, performs instruction fetch suppression control that temporarily suppresses the start of new fetches of instructions to be pipeline processed.

The data processing apparatus according to claim 7 of the present invention has a pipeline processing function and a repeat function that, in accordance with a repeat setting instruction defining at least part of the repeat-related settings, repeatedly executes in hardware a repeat target block that contains at least one instruction and whose total code size is a predetermined code size. The apparatus comprises: an instruction fetch unit that fetches instructions to be pipeline processed; a repeat setting instruction determination unit that receives an instruction from the instruction fetch unit and, when that instruction is the repeat setting instruction, performs a determination operation to determine whether the settings of the repeat function for the repeat target block satisfy a first condition; and repeat-time pipeline control means that, when the repeat setting instruction determination unit determines that the first condition is satisfied, executes pipeline cancel processing that cancels the pipeline processing of the instructions following the repeat setting instruction after the repeat-related setting operation defined by the repeat setting instruction has completed. The first condition includes the condition that the predetermined code size is equal to or less than a first reference value.

The data processing apparatus according to claim 9 of the present invention has a pipeline processing function and a repeat function that, in accordance with a repeat setting instruction defining at least part of the repeat-related settings, repeatedly executes in hardware a repeat target block that contains at least one instruction and whose total code size is a predetermined code size. The apparatus comprises: an instruction fetch unit that fetches instructions to be pipeline processed; a repeat setting instruction determination unit that receives an instruction from the instruction fetch unit and, when that instruction is the repeat setting instruction, performs a determination operation to determine whether the settings of the repeat function for the repeat target block satisfy a predetermined cancel condition; and repeat-time pipeline control means that, when the repeat setting instruction determination unit determines that the predetermined cancel condition is satisfied, executes pipeline cancel processing that cancels the pipeline processing of the instructions following the repeat setting instruction. The at least one instruction includes a predetermined control reference instruction code, and the predetermined cancel condition includes the case where the predetermined control reference instruction code is present among the instructions that are in the state preceding execution of the determination operation.

The data processing apparatus according to claim 1 of the present invention can temporarily suppress the start of new fetches of instructions to be pipeline processed when it determines that the settings of the repeat function for the repeat target block satisfy a predetermined condition.

Therefore, by setting as the predetermined condition those cases in which normal operation could not be guaranteed if instructions of the repeat target block were unconditionally fetched by the instruction fetch unit before the setting operation of the repeat setting instruction completes, the repeat operation can be executed normally even when the predetermined code size of the repeat target block is relatively small.

The data processing apparatus according to claim 7 of the present invention executes pipeline cancel processing, and can thereby perform a normal repeat operation while keeping overhead to a minimum, even when the predetermined code size is equal to or less than the first reference value.

The data processing apparatus according to claim 9 of the present invention executes pipeline cancel processing when the predetermined control reference instruction code is present among the instructions in the state preceding execution of the determination, and can thereby perform a normal repeat operation while keeping overhead to a minimum.

<Embodiment 1>
A data processing apparatus according to Embodiment 1 of the present invention will now be described. The apparatus is a 16-bit processor whose addresses and data are 16 bits long. It is big-endian in both bit order and byte order, so the MSB is bit 0. The apparatus actually includes many functions that are not described here; parts unrelated to the present invention are omitted for simplicity.

FIGS. 1 to 3 show the register set of the data processing apparatus.

FIG. 1 shows the general-purpose registers. The sixteen general-purpose registers GR0 to GR15 hold data and address values. GR14 is assigned as the link (LINK) register, which holds the return address for a subroutine jump. GR15 is the stack pointer (SP): the interrupt stack pointer (SPI) GR15a and the user stack pointer (SPU) GR15b are switched according to the processor status word (PSW) described later. Hereinafter, SPI and SPU are referred to collectively as the stack pointer (SP). Except in special cases, the number of each register used as an operand is specified in a 4-bit register specification field. The apparatus also has instructions that operate on two registers as a pair, for example GR0 and GR1. In such an instruction the even-numbered register is specified, and the odd-numbered register whose number is one greater is implicitly specified as the other register of the pair.

FIG. 2 shows the accumulators. There are two 40-bit accumulators, A0 and A1.

The control registers CR0 to CR3 and CR9 to CR11 in FIG. 3 are each 16 bits wide. As with the general-purpose registers, a control register is normally identified by a 4-bit register number. Control register CR0 is the processor status word (PSW), which consists of bits that specify the operating mode of the data processing apparatus and flags that indicate operation results. FIG. 4 shows the configuration of control register CR0 (PSW).

The SM bit 41 (bit 0) indicates the stack mode. When the SM bit 41 is “0”, the apparatus is in interrupt mode and SPI is used as general-purpose register GR15. When the SM bit 41 is “1”, the apparatus is in user mode and SPU is used as general-purpose register GR15.

The IE bit 42 (bit 4) specifies interrupt enable. When the IE bit 42 is “0”, interrupts are masked (ignored even if asserted); when it is “1”, interrupts are accepted.

The apparatus implements a block repeat function for zero-overhead loop processing. The RP bit 43 (bit 5) indicates the block repeat state: “0” means that no block repeat is in progress, and “1” means that a block repeat is in progress.

The F0 flag 44 (bit 12) is the execution control flag (F0 flag); results such as the outcome of a compare instruction are set in this flag.

The carry flag 45 (bit 15) is set to the carry produced when an add or subtract instruction is executed.
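Because the MSB is numbered bit 0, the PSW bit numbers above do not map directly onto the usual LSB-first shift counts. The following Python sketch illustrates the conversion for the five bits named in the text. The helper names (`psw_mask`, `decode_psw`, `active_stack_pointer`) are hypothetical and chosen only for illustration; the treatment of all other PSW bits is not described in this section and is left out.

```python
PSW_WIDTH = 16  # CR0 (PSW) is a 16-bit register

# Bit numbers exactly as given in the text; bit 0 is the MSB.
PSW_BITS = {"SM": 0, "IE": 4, "RP": 5, "F0": 12, "C": 15}


def psw_mask(bit: int) -> int:
    """Return the mask for a bit numbered MSB-first in a 16-bit word."""
    return 1 << (PSW_WIDTH - 1 - bit)


def decode_psw(value: int) -> dict:
    """Return the state of each named PSW bit as a boolean."""
    return {name: bool(value & psw_mask(bit)) for name, bit in PSW_BITS.items()}


def active_stack_pointer(value: int) -> str:
    """SM = 0 selects SPI (interrupt mode); SM = 1 selects SPU (user mode)."""
    return "SPU" if value & psw_mask(PSW_BITS["SM"]) else "SPI"
```

With this numbering the SM bit corresponds to mask 0x8000 and the carry flag to mask 0x0001.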

Control register CR2 in FIG. 3 is the program counter (PC), which indicates the address of the instruction being executed. The instructions processed by the apparatus are basically of fixed 32-bit length, and control register CR2 (PC) holds an instruction word address in which one word is 32 bits. This register is read-only.

Control register CR1 is the backup processor status word (BPSW) and control register CR3 is the backup program counter (BPC). When an exception or interrupt is detected, they save and hold the values that control register CR0 (PSW) and control register CR2 (PC), respectively, had during execution.

Control registers CR9 to CR11 are the block repeat registers. They can be read and written by the user so that interrupts can be accepted even during a repeat. Control register CR9 is the repeat counter (RPTC), which holds a count value indicating the number of repeats. Control register CR10 is the repeat block start address register (RPTS), which holds the address of the first instruction of the block to be repeated. Control register CR11 is the repeat block end address register (RPTE), which holds the address of the last instruction of the block to be repeated.
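The role of the three repeat registers and the RP bit can be illustrated with a small behavioral model in Python. This is only a sketch under stated assumptions: the text at this point does not say when RPTC is decremented or which count value ends the loop, so the model assumes that RPTC holds the number of passes still to run (including the current one) and is decremented when the instruction at RPTE completes. The names `RepeatState` and `next_pc` are hypothetical, and pipeline timing, which is the actual subject of the invention, is not modeled.

```python
from dataclasses import dataclass


@dataclass
class RepeatState:
    rp: bool    # RP bit of the PSW: True while a block repeat is in progress
    rptc: int   # CR9, repeat counter
    rpts: int   # CR10, word address of the first instruction of the block
    rpte: int   # CR11, word address of the last instruction of the block


def next_pc(pc: int, state: RepeatState) -> int:
    """Return the word address of the instruction that follows the one at pc.

    ASSUMPTION (not stated in the source): RPTC counts the remaining passes
    and is decremented each time the instruction at RPTE has been executed.
    """
    if state.rp and pc == state.rpte:
        state.rptc -= 1
        if state.rptc > 0:
            return state.rpts      # loop back without any branch instruction
        state.rp = False           # last pass finished: leave repeat mode
    return pc + 1                  # PC counts 32-bit instruction words
```

Under these assumptions, a two-instruction block at word addresses 10 and 11 with RPTC = 3 is executed three times before control falls through to address 12. The model also accepts a one-instruction block (RPTS equal to RPTE).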

The apparatus processes a two-way VLIW (Very Long Instruction Word) instruction set. FIG. 5 shows its instruction format. The basic instruction length is fixed at 32 bits, and instructions are aligned on 32-bit boundaries. Each 32-bit instruction code consists of a 2-bit format specification field (FM bits) 101, which indicates the instruction format, a 15-bit left container 102, and a 15-bit right container 103. Each of the containers 102 and 103 can hold one 15-bit short-format sub-instruction, or the two together can hold one 30-bit long-format sub-instruction. Hereinafter, for simplicity, a short-format sub-instruction is called a short instruction and a long-format sub-instruction is called a long instruction.

The FM bits 101 specify the instruction format and the execution order of two short instructions. FIG. 6 shows the details of the format and execution order specified by the FM bits 101. In the execution order, “first” denotes the instruction executed first and “second” denotes the instruction executed afterwards. When the FM bits 101 are “11”, the 30 bits of the containers 102 and 103 hold one long instruction; otherwise, each of the containers 102 and 103 holds a short instruction. When two short instructions are held, the FM bits 101 also specify their execution order. When the FM bits 101 are “00”, the two short instructions are executed in parallel. When the FM bits 101 are “01”, the short instruction held in the left container 102 is executed first, followed by the short instruction held in the right container 103. When the FM bits 101 are “10”, the short instruction held in the right container 103 is executed first, followed by the short instruction held in the left container 102. Allowing two sequentially executed short instructions to be encoded in a single 32-bit instruction in this way improves code efficiency.
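The four FM encodings can be summarized in a short Python sketch. It assumes that the FM bits occupy the two most significant bits of the 32-bit word, followed by the left container and then the right container; this matches the order in which the fields are listed, but FIG. 5 is not reproduced here, so the exact layout is an assumption. The function names and the tuple-based result are hypothetical.

```python
def split_instruction(word: int):
    """Split a 32-bit instruction into (FM bits, left container, right container).

    ASSUMPTION: field order from the MSB is FM (2 bits), left (15), right (15).
    """
    fm = (word >> 30) & 0b11
    left = (word >> 15) & 0x7FFF
    right = word & 0x7FFF
    return fm, left, right


def execution_plan(word: int):
    """Return the execution steps in order, following the FM encoding in the text."""
    fm, left, right = split_instruction(word)
    if fm == 0b11:
        # Both containers together hold one 30-bit long instruction.
        return [("long", (left << 15) | right)]
    if fm == 0b00:
        # Two short instructions issued in the same step.
        return [("parallel", left, right)]
    if fm == 0b01:
        return [("short", left), ("short", right)]   # left first, then right
    return [("short", right), ("short", left)]       # fm == 0b10: right first
```

For example, a word with FM = “10” yields the right-container instruction as the first step and the left-container instruction as the second.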

FIGS. 7 to 10 show examples of bit assignments for typical instructions. FIG. 7 shows the bit assignment of a short instruction with two operands. Fields 111 and 114 are operation code fields; field 114 may instead specify an accumulator number. Fields 112 and 113 specify, by register number or accumulator number, the location of data that is referenced or updated as an operand. Field 113 may instead specify a small 4-bit immediate value.

FIG. 8 shows the bit assignment of a short-format branch instruction, which consists of an operation code field 121 and an 8-bit branch displacement field 122. Like the PC value, the branch displacement is specified as an offset in instruction words (32 bits).
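Since both the PC and the branch displacement are counted in 32-bit instruction words, a branch target can be computed without any scaling. The Python sketch below illustrates this for the 8-bit displacement. It assumes that the displacement is a two's-complement signed value, that it is added to the PC of the branch instruction itself, and that the result wraps to 16 bits; the function names are hypothetical.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def branch_target(pc_word: int, disp8: int) -> int:
    """Return the target word address of a short-format branch.

    ASSUMPTIONS: disp8 is signed, is relative to the branch's own PC,
    and the sum wraps around within the 16-bit address space.
    """
    return (pc_word + sign_extend(disp8, 8)) & 0xFFFF
```

Under these assumptions, a displacement field of 0xFE at PC 0x0100 branches two instruction words backwards, to 0x00FE.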

FIG. 9 shows the format of three-operand instructions and load/store instructions that have a 16-bit displacement or immediate value. It consists of an operation code field 131, fields 132 and 133 that specify register numbers and the like as in the short format, and an extension data field 134 that specifies the 16-bit displacement, immediate value, or the like.

FIG. 10 shows the format of a long-format instruction that has its operation code on the right container 103 side; the 2-bit field 141 is “01”. Fields 143 and 146 are operation code fields, and fields 144 and 145 specify register numbers and the like. Field 142 is a reserved field that is used as needed to specify an operation code, register number, or the like.

Besides the above, some instructions have special bit assignments, such as instructions like NOP (no operation) in which all 15 bits are the operation code, and one-operand instructions.

The sub-instructions of the apparatus form a RISC-like instruction set. Only load/store instructions access memory data; arithmetic instructions operate on operands held in registers or accumulators and on immediate operands. There are five operand addressing modes: register indirect mode, register indirect mode with post-increment, register indirect mode with post-decrement, push mode, and register relative indirect mode. Their mnemonics are “Rsrc”, “Rsrc+”, “Rsrc-”, “-SP”, and “(disp16, Rsrc)”, respectively. Rsrc denotes the number of the register that specifies the base address, and disp16 denotes a 16-bit displacement value. Operand addresses are specified as byte addresses.

Load/store instructions in modes other than register relative indirect mode use the instruction format shown in FIG. 7. Field 113 specifies the base register number, and field 112 specifies the number of the register into which the value loaded from memory is written, or of the register that holds the value to be stored. In register indirect mode, the value of the register specified as the base register is the operand address. In register indirect mode with post-increment, the value of the register specified as the base register is the operand address, and the base register value is then post-incremented by the operand size (in bytes) and written back. In register indirect mode with post-decrement, the value of the register specified as the base register is the operand address, and the base register value is then post-decremented by the operand size (in bytes) and written back. Push mode can be used only with a store instruction whose base register is R15. The stack pointer (SP) value pre-decremented by the operand size (in bytes) is the operand address, and the decremented value is written back to SP.

Load/store instructions in register relative indirect mode use the instruction format shown in FIG. 9. Field 133 specifies the base register number, and field 132 specifies the number of the register into which the value loaded from memory is written, or of the register that holds the value to be stored. Field 134 specifies the displacement of the operand location from the base address. In register relative indirect mode, the operand address is the value of the register specified as the base register plus the 16-bit displacement value.
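The five addressing modes differ only in how the operand address is derived and whether the base register is updated. The Python sketch below models this on a list of sixteen 16-bit register values. The mode names, the function `operand_address`, and the 16-bit wraparound of all address arithmetic are assumptions made for illustration; the restriction of push mode to store instructions is noted in a comment but not modeled.

```python
ADDR_MASK = 0xFFFF  # addresses are 16 bits; wraparound is an assumption


def operand_address(mode: str, regs: list, base: int, size: int, disp16: int = 0) -> int:
    """Return the operand byte address and update regs[base] where the mode requires.

    `size` is the operand size in bytes; `disp16` is used only by "relative".
    """
    if mode == "indirect":                       # Rsrc
        return regs[base]
    if mode == "post_inc":                       # Rsrc+
        addr = regs[base]
        regs[base] = (addr + size) & ADDR_MASK   # written back after use
        return addr
    if mode == "post_dec":                       # Rsrc-
        addr = regs[base]
        regs[base] = (addr - size) & ADDR_MASK
        return addr
    if mode == "push":                           # -SP (store instructions only)
        if base != 15:
            raise ValueError("push mode requires R15 (SP) as the base register")
        regs[15] = (regs[15] - size) & ADDR_MASK  # pre-decrement, then use
        return regs[15]
    if mode == "relative":                       # (disp16, Rsrc)
        return (regs[base] + disp16) & ADDR_MASK
    raise ValueError(f"unknown addressing mode: {mode}")
```

Note the asymmetry between the modes: post-increment and post-decrement return the old base value, whereas push mode returns the already decremented stack pointer.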

The jump target of a jump instruction can be specified in register indirect mode, in which the target address is given by a register value, or in PC relative indirect mode, in which it is given by a branch displacement from the program counter (PC) value of the jump instruction. PC relative indirect mode has two formats: a short format, which specifies the branch displacement in 8 bits, and a long format, which specifies it in 16 bits. The apparatus also has a block repeat instruction that starts the repeat function, which implements loop processing without overhead.

FIG. 11 is a block diagram showing the functional block configuration of the data processing apparatus 200 of this embodiment. As shown in the figure, the data processing apparatus 200 comprises an MPU core unit 201; an instruction fetch unit 202 that accesses instruction data in response to requests from the MPU core unit 201; a built-in instruction memory 203; an operand access unit 204 that accesses operand data in response to requests from the MPU core unit 201; a built-in data memory 205; and an external bus interface unit 206 that arbitrates requests from the instruction fetch unit 202 and the operand access unit 204 and accesses memory external to the data processing apparatus 200.

The MPU core unit 201 comprises a control unit 211, a register file 221, a first arithmetic unit 222, a second arithmetic unit 223, and a PC unit 224.

The control unit 211 performs all control of the MPU core unit 201, including pipeline processing control, instruction execution control, and interface control with the instruction fetch unit 202 and the operand access unit 204.

The instruction queue 212 in the control unit 211 consists of a two-entry buffer of 32-bit instructions, valid bits, input and output pointers, and so on, and is controlled on a FIFO (first-in, first-out) basis. It temporarily holds the instruction data fetched by the instruction fetch unit 202 and sends it to the instruction decoding unit 213.
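A two-entry FIFO built from a buffer, valid bits, and input/output pointers, as described above, might be modeled as follows in Python. This is an illustrative sketch only: the class and method names are hypothetical, and the handshaking with the instruction fetch unit 202 and the instruction decoding unit 213 is reduced to simple `push` and `pop` calls.

```python
class InstructionQueue:
    """Behavioral model of a two-entry FIFO of 32-bit instruction words."""

    ENTRIES = 2

    def __init__(self):
        self.buf = [0] * self.ENTRIES          # 32-bit instruction buffer
        self.valid = [False] * self.ENTRIES    # one valid bit per entry
        self.in_ptr = 0                        # entry the next fetch is written to
        self.out_ptr = 0                       # entry the decoder reads next

    def can_accept(self) -> bool:
        """True if the entry at the input pointer is free."""
        return not self.valid[self.in_ptr]

    def push(self, word: int) -> None:
        """Store a fetched instruction word (fetch side)."""
        if not self.can_accept():
            raise OverflowError("instruction queue full")
        self.buf[self.in_ptr] = word & 0xFFFFFFFF
        self.valid[self.in_ptr] = True
        self.in_ptr = (self.in_ptr + 1) % self.ENTRIES

    def pop(self) -> int:
        """Hand the oldest instruction word to the decoder (decode side)."""
        if not self.valid[self.out_ptr]:
            raise IndexError("instruction queue empty")
        word = self.buf[self.out_ptr]
        self.valid[self.out_ptr] = False
        self.out_ptr = (self.out_ptr + 1) % self.ENTRIES
        return word
```

In this model a third `push` without an intervening `pop` is rejected, which corresponds to the fetch side having to wait while both entries are valid.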

命令デコード部２１３は、主として２つのデコーダを含み、命令キュー２１２から送られる命令コードをデコードする。第１デコーダ２１４は、第１演算部２２２で実行する命令をデコードし、第２デコーダ２１５は、第２演算部２２３で実行する命令をデコードする。３２ビットの命令のデコードの第１サイクルでは、必ず左コンテナ１０２の命令コードが第１デコーダ２１４で解析され、右コンテナ１０３の命令コードが第２デコーダ２１５で解析される。ただし、ＦＭビット１０１及び左コンテナのビット０とビット１のデータは両方のデコーダで解析される。また、拡張データの切り出しを行うために、右コンテナ１０３のデータが第１デコーダ２１４に送られるが、解析は行われない。従って、最初に実行する命令はその命令を実行する演算器に対応した位置に置かれなければならない。２つのショート命令をシーケンシャルに実行する場合、先行して実行される命令のデコード中に後で実行する側の命令が図示していないプリデコーダでデコードされ、どちらのデコーダでデコードすべきかを判定する。先行する命令のデコード後、後で実行する命令の命令コードが選択されたデコーダに取り込まれ、解析される。後で実行される命令がどちらのデコーダでも処理できる命令の場合は第１デコーダ２１４でデコードする。 The instruction decoding unit 213 mainly includes two decoders and decodes the instruction codes sent from the instruction queue 212. The first decoder 214 decodes instructions executed by the first arithmetic unit 222, and the second decoder 215 decodes instructions executed by the second arithmetic unit 223. In the first cycle of decoding a 32-bit instruction, the instruction code in the left container 102 is always analyzed by the first decoder 214 and the instruction code in the right container 103 by the second decoder 215. However, the FM bits 101 and bits 0 and 1 of the left container are analyzed by both decoders. In addition, the data in the right container 103 is sent to the first decoder 214 so that extension data can be extracted, but it is not analyzed there. The instruction to be executed first must therefore be placed in the position corresponding to the arithmetic unit that executes it. When two short instructions are executed sequentially, the instruction to be executed later is decoded by a predecoder (not shown) while the preceding instruction is being decoded, to determine which decoder should decode it. After the preceding instruction has been decoded, the instruction code of the later instruction is taken into the selected decoder and analyzed. If the later instruction can be processed by either decoder, it is decoded by the first decoder 214.

レジスタファイル２２１は、汎用レジスタＧＲ０〜ＧＲ１５の汎用レジスタ値を物理的に保持するレジスタからなり、第１演算部２２２、第２演算部２２３、ＰＣ部２２４、オペランドアクセス部２０４に複数のバスで結合されている。 The register file 221 consists of registers that physically hold the values of the general-purpose registers GR0 to GR15, and is coupled to the first arithmetic unit 222, the second arithmetic unit 223, the PC unit 224, and the operand access unit 204 by a plurality of buses.

図１２は第１演算部２２２の詳細ブロック構成を示すブロック図である。第１演算部２２２は、レジスタファイル２２１と、１６ビット幅のＳ１バス２５１、Ｓ２バス２５２、Ｓ３バス２５３で結合されており、この３つのバスでレジスタからデータを読み出し、演算器等にリードオペランドとなるデータやストアデータを転送する。また、第１演算部２２２は、レジスタファイル２２１と、１６ビット幅のＤ１バス２６１、Ｗバス２７２で結合されており、Ｄ１バス２６１で演算結果や転送データを、Ｗバス２７２でロードしたバイトデータをレジスタファイル２２１に転送する。 FIG. 12 is a block diagram showing the detailed block configuration of the first arithmetic unit 222. The first arithmetic unit 222 is coupled to the register file 221 by the 16-bit-wide S1 bus 251, S2 bus 252, and S3 bus 253; data is read from the registers over these three buses, and read operands and store data are transferred to the arithmetic units and other circuits. The first arithmetic unit 222 is also coupled to the register file 221 by the 16-bit-wide D1 bus 261 and W bus 272; operation results and transfer data are transferred to the register file 221 over the D1 bus 261, and loaded byte data over the W bus 272.

さらに、第１演算部２２２及びレジスタファイル２２１は、オペランドアクセス部２０４と３２ビットのＯＤバス２７１で結合されており、１ワード、もしくは、２ワードのデータを転送することが可能である。 Further, the first arithmetic unit 222 and the register file 221 are coupled to the operand access unit 204 via a 32-bit OD bus 271 and can transfer one word or two words of data.

ＡＡラッチ３０２、ＡＢラッチ３０３は、ＡＬＵ３０１の入力ラッチである。ＡＡラッチ３０２は、Ｓ１バス２５１もしくはＳ３バス２５３を介して読み出されたレジスタ値を取り込む。ゼロクリアする機能も備えている。ＡＢラッチ３０３は、Ｓ３バス２５３を介して読み出されたレジスタ値もしくは第１デコーダ２１４でデコードの結果生成された１６ビットの即値を取り込む。ゼロクリアする機能も備えている。ＡＬＵ３０１では、主として比較、算術論理演算、オペランドアドレスの計算／転送、オペランドアドレス値のインクリメント／デクリメント、ジャンプ先アドレスの計算／転送等が行われる。演算結果やアドレスモディファイの結果はセレクタ３０５、Ｄ１バス２６１を介して、レジスタファイル２２１中の命令で指定されたレジスタに書き戻される。ＯＡラッチ３０６は、オペランドのアドレスを保持するラッチであり、ＡＬＵ３０１でのアドレス計算結果もしくはＡＡラッチ３０２に保持されたベースアドレスの値を選択的に保持し、ＯＡバス２７３を介してオペランドアクセス部２０４に出力する。また、ジャンプ先アドレスやリピートブロックエンドアドレスなどを計算した場合には、ＡＬＵ３０１の出力が、ＪＡバス２７４を介してＰＣ部２２４に転送される。ラッチ３０４は、制御レジスタ値や汎用レジスタ値の転送時に転送する値を保持するラッチであり、Ｓ１バス２５１もしくはＳ３バス２５３を介して転送された値をセレクタ３０５に出力する。転送時にはラッチ３０４の値が、Ｄ１バス２６１を介して、レジスタファイル２２１中の命令で指定されたレジスタや、第１演算部２２２もしくはＰＣ部２２４内の制御レジスタに書き込まれる。 The AA latch 302 and the AB latch 303 are the input latches of the ALU 301. The AA latch 302 takes in a register value read via the S1 bus 251 or the S3 bus 253, and can also be cleared to zero. The AB latch 303 takes in either a register value read via the S3 bus 253 or a 16-bit immediate value generated by the first decoder 214 as a result of decoding, and can also be cleared to zero. The ALU 301 mainly performs comparison, arithmetic and logic operations, calculation/transfer of operand addresses, increment/decrement of operand address values, and calculation/transfer of jump destination addresses. Operation results and address modification results are written back, via the selector 305 and the D1 bus 261, to the register in the register file 221 designated by the instruction. The OA latch 306 holds the operand address: it selectively holds either the address calculated by the ALU 301 or the base address value held in the AA latch 302, and outputs it to the operand access unit 204 via the OA bus 273. When a jump destination address, a repeat block end address, or the like is calculated, the output of the ALU 301 is transferred to the PC unit 224 via the JA bus 274. The latch 304 holds the value to be transferred when a control register value or a general-purpose register value is transferred, and outputs the value received via the S1 bus 251 or the S3 bus 253 to the selector 305. During a transfer, the value of the latch 304 is written, via the D1 bus 261, to the register in the register file 221 designated by the instruction or to a control register in the first arithmetic unit 222 or the PC unit 224.

ストアデータ（ＳＤ）レジスタ３１１は、３２ビットのレジスタであり、Ｓ１バス２５１もしくはＳ２バス２５２の一方、もしくは、Ｓ１バス２５１とＳ２バス２５２の両方に出力されたストアデータを一時保持する。ＳＤレジスタ３１１に保持されたデータは、ラッチ３１２を介して整置回路３１３に転送される。整置回路３１３では、オペランドのアドレスに従ってストアデータが３２ビット境界に整置され、整置後のストアデータがラッチ３１４、ＯＤバス２７１を介してオペランドアクセス部２０４に出力される。 The store data (SD) register 311 is a 32-bit register, and temporarily stores the store data output to one of the S1 bus 251 and the S2 bus 252 or both the S1 bus 251 and the S2 bus 252. The data held in the SD register 311 is transferred to the alignment circuit 313 via the latch 312. In the alignment circuit 313, the store data is aligned on a 32-bit boundary according to the operand address, and the stored data after alignment is output to the operand access unit 204 via the latch 314 and the OD bus 271.

また、オペランドアクセス部２０４でロードされたバイトデータは、ＯＤバス２７１を介して、１６ビットのロードデータ（ＬＤ）レジスタ３１５に取り込まれる。ＬＤレジスタ３１５の値は、整置回路３１６に転送される。整置回路３１６では、バイト整置及びバイトデータのゼロ／符号拡張を行う。整置、拡張後のデータが、ラッチ３１７、Ｗバス２７２を介してレジスタファイル２２１中の指定されたレジスタに書き込まれる。１ワード（１６ビット）、あるいは、２ワード（３２ビット）ロードの場合には、ＬＤレジスタ３１５を介さず、ＯＤバス２７１からレジスタファイル２２１にロードした値が直接書き込まれる。 Further, the byte data loaded by the operand access unit 204 is taken into the 16-bit load data (LD) register 315 via the OD bus 271. The value of the LD register 315 is transferred to the alignment circuit 316. The alignment circuit 316 performs byte alignment and zero / sign extension of byte data. The aligned and expanded data is written to the designated register in the register file 221 via the latch 317 and the W bus 272. In the case of loading 1 word (16 bits) or 2 words (32 bits), the value loaded from the OD bus 271 to the register file 221 is directly written without going through the LD register 315.
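
The zero/sign extension step performed by the alignment circuit 316 can be illustrated with a short sketch. It covers only the extension of an already-selected byte to a 16-bit register value; how the byte is selected from the LD register (byte order, address bit used) is not described in this passage and is left out. The function name `extend_byte` is hypothetical.

```python
def extend_byte(byte: int, signed: bool) -> int:
    """Extend an 8-bit value to 16 bits.

    signed=False: zero extension (upper byte becomes 0x00).
    signed=True:  sign extension (upper byte copies bit 7).
    """
    byte &= 0xFF
    if signed and (byte & 0x80):
        return 0xFF00 | byte
    return byte
```

For example, the byte 0x80 becomes 0x0080 with zero extension and 0xFF80 with sign extension.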

制御部２１１中のＰＳＷ部３２３は図３の制御レジスタＣＲ０（ＰＳＷ）の値を物理的に保持するラッチや、ＰＳＷ更新回路等からなり、演算結果や命令の実行によりＰＳＷの値を更新する。制御部２１１中の制御レジスタに値を転送する場合、Ｓ３バス２５３に出力されたデータがＣＮＴＩＦラッチ３２１を介して、ＰＳＷ部３２３に転送される。また、制御部２１１中の制御レジスタの値を読み出す場合には、ＰＳＷ部３２３から読み出し対象となる制御レジスタの値がＤ１バス２６１に出力され、レジスタファイル２２１に書き込まれる。ＢＰＳＷレジスタ３２２は図３の制御レジスタＣＲ１（ＢＰＳＷ）の値を物理的に保持するレジスタである。例外処理等の起動にともなうＰＳＷ値の待避時には、Ｄ１バス２６１に出力された制御レジスタＣＲ０（ＰＳＷ）の値がＢＰＳＷレジスタ３２２に書き込まれる。例外処理等からの復帰時には、ＢＰＳＷレジスタ３２２の値は、直接ＣＮＴＩＦラッチ３２１を介して、ＰＳＷ部３２３に転送される。また、ＢＰＳＷレジスタ３２２はＳ３バス２５３への出力経路、及び、Ｄ１バス２６１からの入力経路を備える。 The PSW unit 323 in the control unit 211 includes a latch that physically holds the value of the control register CR0 (PSW) in FIG. 3, a PSW update circuit, and the like, and updates the value of the PSW by executing an operation result or an instruction. When transferring a value to the control register in the control unit 211, the data output to the S3 bus 253 is transferred to the PSW unit 323 via the CNTIF latch 321. When reading the value of the control register in the control unit 211, the value of the control register to be read is output from the PSW unit 323 to the D1 bus 261 and written to the register file 221. The BPSW register 322 is a register that physically holds the value of the control register CR1 (BPSW) in FIG. When the PSW value is saved due to the start of exception processing or the like, the value of the control register CR0 (PSW) output to the D1 bus 261 is written to the BPSW register 322. When returning from exception processing or the like, the value of the BPSW register 322 is directly transferred to the PSW unit 323 via the CNTIF latch 321. The BPSW register 322 includes an output path to the S3 bus 253 and an input path from the D1 bus 261.

図１３はＰＣ部２２４の内部構成の詳細を示すブロック図である。同図に示すように、命令アドレス（ＩＡ）レジスタ３３７は、次にフェッチする命令のアドレスを保持し、次にフェッチする命令のアドレスを命令フェッチ部２０２に出力する。引き続き後続の命令をフェッチする場合には、ＩＡレジスタ３３７からラッチ３３８を介して転送されたアドレス値がインクリメンタ３３９で１インクリメントされて、ＩＡレジスタ３３７に書き戻される。ジャンプやブロックリピート等によりシーケンスが切り替わる場合には、ＩＡレジスタ３３７はＪＡバス２７４を介して転送されるジャンプ先アドレスや、リピートブロックスタートアドレスを取り込む。 FIG. 13 is a block diagram showing details of the internal configuration of the PC unit 224. As shown in the figure, the instruction address (IA) register 337 holds the address of the instruction to be fetched next, and outputs the address of the instruction to be fetched next to the instruction fetch unit 202. When the subsequent instruction is subsequently fetched, the address value transferred from the IA register 337 via the latch 338 is incremented by 1 by the incrementer 339 and written back to the IA register 337. When the sequence is switched by jump, block repeat or the like, the IA register 337 takes in the jump destination address transferred via the JA bus 274 and the repeat block start address.

ＲＰＴＳレジスタ３４１、ＲＰＴＥレジスタ３４３、及びＲＰＴＣレジスタ３４５はブロックリピート制御用の制御レジスタであり、それぞれ図３の制御レジスタＣＲ１０、制御レジスタＣＲ１１、及び制御レジスタＣＲ９に対応する値を物理的に保持する。ＲＰＴＳレジスタ３４１、ＲＰＴＥレジスタ３４３、及びＲＰＴＣレジスタ３４５は、Ｄ１バス２６１からの入力ポートとＳ３バス２５３への出力ポートを持ち、必要に応じてブロックリピート時の初期設定や待避、復帰が行なわれる。 The RPTS register 341, the RPTE register 343, and the RPTC register 345 are control registers for block repeat control, and physically hold values corresponding to the control register CR10, the control register CR11, and the control register CR9 in FIG. The RPTS register 341, the RPTE register 343, and the RPTC register 345 have an input port from the D1 bus 261 and an output port to the S3 bus 253, and are initialized, saved, and restored during block repeat as necessary.

ＲＰＴＳレジスタ３４１はリピートブロックの開始命令アドレスを保持する。ＲＰＴＳレジスタ３４１初期設定直後には、ラッチ３４２も更新される。ブロックリピート処理中で、リピートブロックの先頭命令に戻る場合は、ラッチ３４２の値が、ＪＡバス２７４を介して、ＩＡレジスタ３３７に転送される。 The RPTS register 341 holds the start instruction address of the repeat block. Immediately after the initial setting of the RPTS register 341, the latch 342 is also updated. When returning to the head instruction of the repeat block during the block repeat process, the value of the latch 342 is transferred to the IA register 337 via the JA bus 274.

ＲＰＴＥレジスタ３４３はリピートブロックの最終命令のアドレスを保持する。この最終アドレスは、ブロックリピート命令処理時に第１演算部２２２で計算され、ＪＡバス２７４を介してＲＰＴＥレジスタ３４３に取り込まれる。比較器３４４は、ＲＰＴＥレジスタ３４３の値と、命令フェッチアドレスを保持しているＩＡレジスタ３３７の値とを比較し、一致情報６９２を制御部２１１へ出力する。 The RPTE register 343 holds the address of the last instruction of the repeat block. This final address is calculated by the first arithmetic unit 222 during block repeat instruction processing, and is taken into the RPTE register 343 via the JA bus 274. The comparator 344 compares the value of the RPTE register 343 with the value of the IA register 337 holding the instruction fetch address, and outputs the match information 692 to the control unit 211.

ＲＰＴＣレジスタ３４５及びＴＲＰＴＣレジスタ３４８は、リピートブロックの実行回数を管理するためのカウント値を保持する。ＴＲＰＴＣレジスタ３４８は、パイプライン処理における命令フェッチ段階での先行更新情報を保持する。ＴＲＰＴＣレジスタ３４８はＤ１バス２６１からの入力ポートを備えており、ＲＰＴＣレジスタ３４５の初期設定時に、同時に初期化される。リピートブロック最終命令のフェッチを行った場合、ＴＲＰＴＣレジスタ３４８の値がラッチ３５０を介してデクリメンタ３５１に転送され、デクリメントされてＴＲＰＴＣレジスタ３４８に書き戻される。１検出回路（ONE）３４９は、ＴＲＰＴＣレジスタ３４８が“１”である事を検出し、検出結果情報６９３を制御部２１１へ出力する。ＲＰＴＣレジスタ３４５は、マスタとなる実行段階でのカウント値を保持する。リピートブロック最終命令が実行されると、ＲＰＴＣレジスタ３４５の値がラッチ３４６を介してデクリメンタ３４７に転送され、デクリメントされてＲＰＴＣレジスタ３４５に書き戻される。また、ジャンプが起こった場合にＴＲＰＴＣレジスタ３４８の値を初期化するために、ＲＰＴＣレジスタ３４５から、ラッチ３５２を介し、ＴＲＰＴＣレジスタ３４８へ転送する経路がある。 The RPTC register 345 and the TRPTC register 348 hold count values for managing the number of executions of the repeat block. The TRPTC register 348 holds the preceding update information at the instruction fetch stage in the pipeline processing. The TRPTC register 348 has an input port from the D1 bus 261 and is initialized at the same time when the RPTC register 345 is initialized. When the repeat block final instruction is fetched, the value of the TRPTC register 348 is transferred to the decrementer 351 via the latch 350, decremented, and written back to the TRPTC register 348. The 1 detection circuit (ONE) 349 detects that the TRPTC register 348 is “1”, and outputs detection result information 693 to the control unit 211. The RPTC register 345 holds a count value at the execution stage as a master. When the repeat block final instruction is executed, the value of the RPTC register 345 is transferred to the decrementer 347 via the latch 346, decremented, and written back to the RPTC register 345. Further, there is a path for transferring from the RPTC register 345 to the TRPTC register 348 via the latch 352 in order to initialize the value of the TRPTC register 348 when a jump occurs.
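
The relationship between the two counters can be summarized in a small behavioral model. This is a sketch under assumptions: the 16-bit register width and the wrap-around on decrement are not stated in this passage, and the class and method names are hypothetical.

```python
class RepeatCounters:
    """RPTC (execution-stage master count) and TRPTC (fetch-stage look-ahead count)."""

    MASK = 0xFFFF  # assumed 16-bit registers

    def __init__(self, count: int) -> None:
        # Both registers are initialized together when RPTC is set up.
        self.rptc = count & self.MASK
        self.trptc = count & self.MASK

    def trptc_is_one(self) -> bool:
        """Models the 1-detection circuit (ONE) 349."""
        return self.trptc == 1

    def on_last_instruction_fetched(self) -> None:
        """Fetching the last instruction of the block decrements TRPTC."""
        self.trptc = (self.trptc - 1) & self.MASK

    def on_last_instruction_executed(self) -> None:
        """Executing the last instruction of the block decrements RPTC."""
        self.rptc = (self.rptc - 1) & self.MASK

    def on_jump(self) -> None:
        """A jump discards fetch-stage look-ahead: TRPTC is reloaded from RPTC."""
        self.trptc = self.rptc
```

Because TRPTC runs ahead of RPTC, a jump taken in the execution stage leaves TRPTC reflecting fetches that were never executed, which is why the reload path from RPTC exists.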

実行ステージＰＣ（ＥＰＣ）３３４は実行中の命令のＰＣ値を保持し、次命令ＰＣ（ＮＰＣ）３３１は次に実行する命令のＰＣ値を保持する。ＮＰＣ３３１は、実行段階でジャンプが起こった場合、ＪＡバス２７４上のジャンプ先アドレス値を取り込む。リピートブロックの処理を繰り返す場合には、ラッチ３４２からリピートを行うブロックの先頭アドレスを取り込む。処理シーケンスの変更なく命令の実行が進むの場合には、１命令の実行が終了する毎にラッチ３３２を介して転送されたＮＰＣ３３１の値が、インクリメンタ３３３でインクリメントされ、ＮＰＣ３３１に書き戻される。サブルーチンジャンプ命令の場合には、ラッチ３３２の値が戻り先アドレスとしてＤ１バス２６１に出力され、レジスタファイル２２１中のリンクレジスタとして定義されているＲ１４に書き込まれる。 The execution stage PC (EPC) 334 holds the PC value of the instruction being executed, and the next instruction PC (NPC) 331 holds the PC value of the instruction to be executed next. The NPC 331 takes in the jump destination address value on the JA bus 274 when a jump occurs in the execution stage. When the repeat block processing is repeated, the start address of the block to be repeated is fetched from the latch 342. When the execution of an instruction proceeds without changing the processing sequence, the value of the NPC 331 transferred via the latch 332 is incremented by the incrementer 333 and written back to the NPC 331 every time execution of one instruction is completed. In the case of a subroutine jump instruction, the value of the latch 332 is output as a return address to the D1 bus 261 and written to R14 defined as a link register in the register file 221.

次に実行する命令のＰＣを参照する場合には、ＮＰＣ３３１の値がＳ３バス２５３に出力され、第１演算部２２２に転送される。また、次の命令が実行状態に入る場合には、ラッチ３３２の値がＥＰＣ３３４に転送される。実行中の命令のＰＣ値を参照する場合には、ＥＰＣ３３４の値がＳ３バス２５３に出力され、第１演算部２２２に転送される。ＢＰＣ３３６は、図３の３４のＣＲ３に対応する値を物理的に保持する。例外や割り込み等が検出された場合には、ＥＰＣ３３４の値がラッチ３３５を介してＢＰＣ３３６に転送される。ＢＰＣ３３６は、Ｄ１バス２６１からの入力ポートとＳ３バス２５３への出力ポートを持ち、必要に応じて待避、復帰が行なわれる。 When the PC of the instruction to be executed next is referenced, the value of the NPC 331 is output to the S3 bus 253 and transferred to the first arithmetic unit 222. When the next instruction enters the execution state, the value of the latch 332 is transferred to the EPC 334. When the PC value of the instruction being executed is referenced, the value of the EPC 334 is output to the S3 bus 253 and transferred to the first arithmetic unit 222. The BPC 336 physically holds the value corresponding to the control register CR3 (34) in FIG. 3. When an exception, an interrupt, or the like is detected, the value of the EPC 334 is transferred to the BPC 336 via the latch 335. The BPC 336 has an input port from the D1 bus 261 and an output port to the S3 bus 253, and is saved and restored as necessary.

第２演算部２２３は、レジスタファイル２２１と、複数のバスで接続されており、参照／更新するレジスタ値を転送する。第２演算部２２３は、ＡＬＵ、バレルシフタ、プライオリティエンコーダ、積和演算器等を含み命令で指定された演算実行等を行う。図２で示したアキュムレータＡ０，Ａ１の２本の４０ビットアキュムレータを物理的に保持するアキュムレータも、第２演算部２２３に含まれる。 The second arithmetic unit 223 is connected to the register file 221 by a plurality of buses, and transfers register values to be referred / updated. The second arithmetic unit 223 includes an ALU, a barrel shifter, a priority encoder, a product-sum arithmetic unit, etc., and performs arithmetic operations designated by instructions. An accumulator that physically holds the two 40-bit accumulators A0 and A1 shown in FIG. 2 is also included in the second arithmetic unit 223.

次に本実施の形態におけるデータ処理装置のパイプライン処理について説明する。図１４はパイプライン処理を示す説明図である。本データ処理装置は、命令データのフェッチを行う命令フェッチ（ＩＦ）ステージ４０１、命令の解析を行う命令デコード（Ｄ）ステージ４０２、演算実行を行う命令実行（Ｅ）ステージ４０３、データメモリのアクセスを行うメモリアクセス（Ｍ）ステージ４０４、メモリからロードしたバイトオペランドをレジスタへ書き込むライトバック（Ｗ）ステージ４０５の５段のパイプライン処理を行う。 Next, pipeline processing of the data processing device in this embodiment will be described. FIG. 14 is an explanatory diagram showing the pipeline processing. The data processing device performs five-stage pipeline processing consisting of an instruction fetch (IF) stage 401 that fetches instruction data, an instruction decode (D) stage 402 that analyzes instructions, an instruction execution (E) stage 403 that executes operations, a memory access (M) stage 404 that accesses data memory, and a write-back (W) stage 405 that writes byte operands loaded from memory to registers.
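
To make the five-stage flow concrete, the sketch below reports which stage an instruction occupies in a given clock cycle. It assumes an idealized pipeline in which each stage takes exactly one cycle and no stalls, jumps, or interference occur; the real device inserts wait cycles in the situations described later. The names `STAGES` and `stage_at` are hypothetical.

```python
STAGES = ["IF", "D", "E", "M", "W"]  # stages 401 to 405


def stage_at(fetch_cycle: int, cycle: int):
    """Stage occupied in `cycle` by an instruction fetched in `fetch_cycle`.

    Idealized model: one cycle per stage, no stalls. Returns None when the
    instruction is not in the pipeline during that cycle.
    """
    index = cycle - fetch_cycle
    if 0 <= index < len(STAGES):
        return STAGES[index]
    return None
```

In this idealized model, an instruction fetched in cycle 0 is in the E stage in cycle 2 while the instruction fetched in cycle 1 is in the D stage.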

Ｅステージ４０３での演算結果のレジスタへの書き込みはＥステージ４０３、ワード（２バイト）、２ワード（４バイト）ロード時のレジスタへの書き込みはＭステージ４０４で完了する。積和／積差演算、倍精度演算に関しては、更に乗算と加算の２段のパイプラインで命令の実行を行う。後段の処理を命令実行２（Ｅ２）ステージ４０６と呼ぶ。連続する積和／積差演算を１回／１クロックサイクルのスループットで実行できる。 Writing of the operation result in the register in the E stage 403 is completed in the E stage 403, and writing in the register at the time of loading word (2 bytes) or 2 words (4 bytes) is completed in the M stage 404. For product-sum / product-difference operations and double-precision operations, instructions are further executed in a two-stage pipeline of multiplication and addition. The subsequent process is called an instruction execution 2 (E2) stage 406. Successive product-sum / product-difference operations can be executed with a throughput of one time / one clock cycle.

ＩＦステージ４０１では、主として命令のフェッチ、命令キュー２１２の管理、ブロックリピート制御が行われる。命令フェッチ部２０２、内蔵命令メモリ２０３、外部バスインターフェイス部２０６、ＰＣ部２２４のＩＡレジスタ３３７、ラッチ３３８、インクリメンタ３３９、ＴＲＰＴＣレジスタ３４８、ラッチ３５０、デクリメンタ３５１、１検出回路３４９、比較器３４４等や制御部２１１のＩＦステージステージ制御、命令フェッチ制御、命令キュー２１２、ＰＣ部２２４制御等を行う部分が、このＩＦステージ４０１の制御で動作する。ＩＦステージ４０１は、Ｅステージ４０３のジャンプで初期化される。 The IF stage 401 mainly performs instruction fetch, management of the instruction queue 212, and block repeat control. The following operate under the control of the IF stage 401: the instruction fetch unit 202, the built-in instruction memory 203, the external bus interface unit 206; the IA register 337, latch 338, incrementer 339, TRPTC register 348, latch 350, decrementer 351, 1-detection circuit 349, and comparator 344 of the PC unit 224; and the portions of the control unit 211 that perform IF stage control, instruction fetch control, instruction queue 212 control, and PC unit 224 control. The IF stage 401 is initialized by a jump in the E stage 403.

命令フェッチアドレスは、ＩＡレジスタ３３７で保持される。Ｅステージ４０３でジャンプが起こるとＪＡバス２７４を介してジャンプ先アドレスを取り込み、初期化を行う。シーケンシャルに命令データをフェッチする場合には、インクリメンタ３３９でアドレスをインクリメントする。ブロックリピート処理中で、リピートブロックの最終命令処理後リピートブロックの先頭に戻る場合、ＩＦステージ４０１で命令処理シーケンスの切り替え制御が行われる。この際、ＲＰＴＳレジスタ３４１に保持されているアドレスが、ラッチ３４２、ＪＡバス２７４を介してＩＡレジスタ３３７に転送される。 The instruction fetch address is held in the IA register 337. When a jump occurs in the E stage 403, the jump destination address is fetched via the JA bus 274 and initialization is performed. When the instruction data is fetched sequentially, the incrementer 339 increments the address. When returning to the beginning of the repeat block after the final instruction processing of the repeat block during the block repeat process, the IF stage 401 controls the switching of the instruction processing sequence. At this time, the address held in the RPTS register 341 is transferred to the IA register 337 via the latch 342 and the JA bus 274.
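
The three ways the IA register 337 is updated can be sketched as a single selection function. This is an illustration only: the priority of a jump over a repeat-back is an assumption, as is the 16-bit address width, and the function and parameter names are hypothetical.

```python
def next_fetch_address(ia: int, rpts: int, jump_target=None,
                       repeat_back: bool = False) -> int:
    """Select the next value of the instruction address (IA) register.

    jump_target: destination supplied when a jump occurs in the E stage.
    repeat_back: True when the last instruction of the repeat block has been
                 fetched and the block is to be repeated.
    rpts:        repeat block start address (RPTS register / latch 342).
    """
    if jump_target is not None:      # E-stage jump (assumed to take priority)
        return jump_target & 0xFFFF
    if repeat_back:                  # return to the head of the repeat block
        return rpts & 0xFFFF
    return (ia + 1) & 0xFFFF         # sequential fetch via the incrementer
```

The repeat-back case is what removes the loop overhead: the sequence switch happens in the IF stage, so no jump has to be executed in the E stage at the end of each iteration.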

ＩＡレジスタ３３７の値は命令フェッチ部２０２に送られ、命令フェッチ部２０２が命令データのフェッチを行う。対応する命令データが内蔵命令メモリ２０３にある場合には、内蔵命令メモリ２０３から命令コードを読み出す。この場合、１クロックサイクルで３２ビットの命令のフェッチを完了する。対応する命令データが内蔵命令メモリ２０３にない場合には、外部バスインターフェイス部２０６に命令フェッチ要求を出す。外部バスインターフェイス部２０６は、オペランドアクセス部２０４からの要求とを調停し、命令の取り込みが可能になったら、外部のメモリから命令データを取り込み、命令フェッチ部２０２に送る。外部バスインターフェイス部２０６は、最小２クロックサイクルで外部メモリのアクセスを行うことが可能である。命令フェッチ部２０２は取り込まれた命令を、命令キュー２１２に転送する。 The value of the IA register 337 is sent to the instruction fetch unit 202, and the instruction fetch unit 202 fetches the instruction data. If the corresponding instruction data is in the internal instruction memory 203, the instruction code is read from the internal instruction memory 203. In this case, a 32-bit instruction fetch is completed in one clock cycle. If the corresponding instruction data is not in the built-in instruction memory 203, an instruction fetch request is issued to the external bus interface unit 206. The external bus interface unit 206 arbitrates the request from the operand access unit 204, and when the instruction can be fetched, fetches the instruction data from the external memory and sends it to the instruction fetch unit 202. The external bus interface unit 206 can access the external memory in a minimum of 2 clock cycles. The instruction fetch unit 202 transfers the fetched instruction to the instruction queue 212.

命令キュー２１２は２エントリのキューになっており、ＦＩＦＯ制御で取り込まれた命令コードを、命令デコード部２１３に出力する。ブロックリピート処理中で命令フェッチアドレスがＲＰＴＥレジスタ３４３と一致した事を示すリピートブロック最終命令情報と、ブロックリピート処理中で、命令フェッチアドレスがＲＰＴＥレジスタ３４３と一致し、かつ、更新前のＴＲＰＴＣレジスタ３４８が“１”であった事を示すブロックリピート処理終了情報が、命令キューに対応する命令コードとともに保持され、対応する命令コードとともに命令デコード部２１３に出力される。以降のステージでは、この情報の基づき、ブロックリピート処理に関する命令非依存のハードウェア制御が行われる。 The instruction queue 212 is a two-entry queue that outputs the fetched instruction codes to the instruction decoding unit 213 under FIFO control. Two pieces of information are held in the instruction queue together with the corresponding instruction code and output with it to the instruction decoding unit 213: repeat block final instruction information, which indicates that the instruction fetch address matched the RPTE register 343 during block repeat processing, and block repeat processing end information, which indicates that, during block repeat processing, the instruction fetch address matched the RPTE register 343 and the TRPTC register 348 held "1" before being updated. In the subsequent stages, instruction-independent hardware control of block repeat processing is performed based on this information.
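
The two flags attached to each queued instruction can be expressed as simple predicates on the fetch-stage state. The sketch below is a software illustration of the conditions stated above; the function and parameter names are hypothetical.

```python
def repeat_flags(fetch_addr: int, rpte: int, trptc: int, repeating: bool):
    """Return (last_instruction, repeat_end) for the instruction being fetched.

    last_instruction: block repeat is active and the fetch address matches RPTE.
    repeat_end:       last_instruction holds and TRPTC (before its update) is 1.
    """
    last_instruction = repeating and fetch_addr == rpte
    repeat_end = last_instruction and trptc == 1
    return last_instruction, repeat_end
```

Carrying the flags along with the instruction code lets the later stages update the NPC, RPTC, and PSW without depending on which instruction happens to be last in the block.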

Ｄステージ４０２では、命令デコード部２１３でオペレーションコードの解析を行い、第１演算部２２２、第２演算部２２３、ＰＣ部２２４等で命令の実行を行うための制御信号群を生成する。Ｄステージ４０２は、Ｅステージ４０３のジャンプで初期化される。命令キュー２１２から送られてくる命令コードが無効な場合には、アイドルサイクルとなり、有効な命令コードが取り込まれるまで待つ。Ｅステージ４０３が次の処理を開始できない場合には、演算器等に送る制御信号を無効化し、Ｅステージ４０３での先行命令の処理の終了を待つ。例えば、Ｅステージ４０３で実行中の命令がメモリアクセスを行う命令であり、Ｍステージ４０４でのメモリアクセスが終了していない場合にこのような状態になる。 In the D stage 402, an operation code is analyzed by the instruction decoding unit 213, and a control signal group for executing instructions by the first calculation unit 222, the second calculation unit 223, the PC unit 224, and the like is generated. The D stage 402 is initialized by the jump of the E stage 403. If the instruction code sent from the instruction queue 212 is invalid, an idle cycle is entered, and a wait is made until a valid instruction code is fetched. If the E stage 403 cannot start the next process, the control signal sent to the arithmetic unit or the like is invalidated, and the process of the preceding instruction in the E stage 403 is awaited. For example, when the instruction being executed at the E stage 403 is an instruction for performing memory access, and the memory access at the M stage 404 is not completed, this state is obtained.

Ｄステージ４０２では、シーケンシャル実行を行う２命令の分割や、２サイクル実行命令のシーケンス制御も行う。さらに、Ｅステージ４０３で参照もしくは更新するレジスタ値のロードが完了しているかどうかを判定するロードオペランドの干渉チェックや第２演算部２２３の演算器のＥ２ステージ４０６とＥステージ４０３の干渉チェック等も行い、干渉が検出された場合には、干渉が解消されるまで制御信号の出力を抑止する。 The D stage 402 also divides two instructions that perform sequential execution and performs sequence control of two-cycle execution instructions. Furthermore, a load operand interference check for determining whether or not a register value to be referred to or updated in the E stage 403 has been completed, an interference check between the E2 stage 406 and the E stage 403 of the arithmetic unit of the second arithmetic unit 223, and the like. If interference is detected, the output of the control signal is suppressed until the interference is resolved.

第１デコーダ２１４は、主として第１演算部２２２のすべて、ＰＣ部２２４のＩＦステージ４０１で制御される部分以外、レジスタファイル２２１のＳ１バス２５１、Ｓ２バス２５２、Ｓ３バス２５３への読み出し制御とＤ１バス２６１からの書き込み制御に関する実行制御信号を生成する。命令に依存するＭステージ４０４やＷステージ４０５での処理に必要な制御信号もここで生成され、パイプラインの処理の流れに付随して転送される。第２デコーダ２１５は、主として第２演算部２２３での実行制御、レジスタファイル２２１の第２演算部２２３にデータを転送する複数のバスへの読み出し制御と第２演算部２２３から出力されるデータの書き込み制御に関する実行制御信号を生成する。 The first decoder 214 mainly generates execution control signals for the whole of the first arithmetic unit 222, for the portions of the PC unit 224 other than those controlled by the IF stage 401, and for read control of the register file 221 onto the S1 bus 251, S2 bus 252, and S3 bus 253 and write control from the D1 bus 261. Instruction-dependent control signals needed for processing in the M stage 404 and the W stage 405 are also generated here and are passed along with the flow of pipeline processing. The second decoder 215 mainly generates execution control signals for execution in the second arithmetic unit 223, for read control of the register file 221 onto the buses that transfer data to the second arithmetic unit 223, and for write control of the data output from the second arithmetic unit 223.

命令キュー２１２から取り込まれたリピートブロック最終命令情報とブロックリピート処理終了情報をもとに、命令に依存しないブロックリピート処理に関するＮＰＣ３３１の更新制御信号、ＲＰＴＣレジスタ３４５の更新制御信号や、制御レジスタＣＲ０（ＰＳＷ）のＲＰビット４３のクリアに関する更新制御信号などが生成される。 Based on the repeat block final instruction information and the block repeat process end information fetched from the instruction queue 212, the NPC 331 update control signal, the RPTC register 345 update control signal, and the control register CR0 ( An update control signal for clearing the RP bit 43 of (PSW) is generated.

Ｅステージ４０３では、演算、比較、制御レジスタを含むレジスタ間転送、ロード／ストア命令のオペランドアドレス計算、ジャンプ命令のジャンプ先アドレスの計算、ジャンプ処理、ＥＩＴ（例外、割り込み、トラップの総称）検出と各ＥＩＴのベクタアドレスへのジャンプ等、メモリアクセスと積和／積差演算命令の加算処理を除く命令実行に関するほとんどすべての処理を行う。 The E stage 403 performs almost all processing related to instruction execution, except for memory access and the addition step of product-sum / product-difference instructions. This includes arithmetic operations, comparison, register-to-register transfer including control registers, operand address calculation for load/store instructions, jump destination address calculation for jump instructions, jump processing, and detection of EITs (a generic term for exceptions, interrupts, and traps) together with the jump to the vector address of each EIT.

割り込みイネーブルの場合の割り込みの検出は、必ず３２ビット命令の切れ目で行われる。３２ビット命令の中にシーケンシャルに実行する２つのショート命令がある場合も、この２つのショート命令間で割り込みを受け付けることはない。 Detection of an interrupt when the interrupt is enabled is always performed at a break of a 32-bit instruction. Even when there are two short instructions to be executed sequentially in the 32-bit instruction, no interrupt is accepted between the two short instructions.

Ｅステージ４０３で処理中の命令がオペランドアクセスを行う命令であり、Ｍステージ４０４でメモリアクセスが完了しない場合には、Ｅステージ４０３での完了は待たされる。ステージ制御は制御部２１１で行われる。 If the instruction being processed in the E stage 403 is an instruction for performing operand access, and the memory access is not completed in the M stage 404, the completion in the E stage 403 is awaited. Stage control is performed by the control unit 211.

Ｅステージ４０３において、第１演算部２２２内のＡＬＵ３０１で、算術論理演算、比較、転送、メモリオペランドのアドレスや分岐先のアドレス計算等が行われる。オペランドとして指定されたレジスタの値が、Ｓ１バス２５１、Ｓ２バス２５２、Ｓ３バス２５３に読み出され、必要に応じて別途取り込まれる即値、変位等の拡張データを使用して、ＡＬＵ３０１で演算が行われ、演算結果がＤ１バス２６１を介してレジスタファイル２２１に書き戻される。ロード／ストア命令の場合には、演算結果はＯＡラッチ３０６、ＯＡバス２７３を介して、オペランドアクセス部２０４に送られる。ジャンプ命令の場合には、ジャンプ先アドレスがＪＡバス２７４を介して、ＰＣ部２２４に送られる。ストアデータはＳ１バス２５１、Ｓ２バス２５２を介して、レジスタファイル２２１から読み出され、ＳＤレジスタ３１１、ラッチ３１２を介して転送後、整置回路３１３で整置が行われる。また、ＰＣ部２２４では、実行中の命令のＰＣ値の管理、次に実行する命令のアドレスの生成が行われる。第１演算部２２２、ＰＣ部２２４に含まれる制御レジスタ（アキュムレータを除く）とレジスタファイル２２１間の転送は、Ｓ３バス２５３、Ｄ１バス２６１を介して行われる。 In the E stage 403, arithmetic logic operation, comparison, transfer, memory operand address, branch destination address calculation, and the like are performed by the ALU 301 in the first arithmetic unit 222. The value of the register specified as the operand is read to the S1 bus 251, S2 bus 252, and S3 bus 253, and the ALU 301 performs an operation using the extension data such as the immediate value and the displacement that are separately fetched as necessary. The operation result is written back to the register file 221 via the D1 bus 261. In the case of a load / store instruction, the operation result is sent to the operand access unit 204 via the OA latch 306 and the OA bus 273. In the case of a jump instruction, the jump destination address is sent to the PC unit 224 via the JA bus 274. Store data is read from the register file 221 via the S1 bus 251 and S2 bus 252, transferred via the SD register 311 and latch 312, and then aligned by the alignment circuit 313. The PC unit 224 manages the PC value of the instruction being executed and generates the address of the instruction to be executed next. Transfer between the control registers (excluding the accumulator) included in the first arithmetic unit 222 and the PC unit 224 and the register file 221 is performed via the S3 bus 253 and the D1 bus 261.

Ｅステージ４０３において、第２演算部２２３では、算術論理演算、比較、転送、シフト他、積和演算の加算以外のすべての演算実行が行われる。オペランドの値が、レジスタファイル２２１やアキュムレータ等からバスを介して各演算器に転送され、指定された演算を行い、演算結果がアキュムレータや、Ｄ２バス２６２を介してレジスタファイル２２１に書き戻される。 In the E stage 403, the second arithmetic unit 223 executes all of its operations, including arithmetic and logic operations, comparison, transfer, and shift, except for the addition step of product-sum operations. Operand values are transferred from the register file 221, the accumulators, and so on to each arithmetic unit via the buses, the designated operation is performed, and the result is written back to an accumulator or, via the D2 bus 262, to the register file 221.

第１演算部２２２及び第２演算部２２３での演算結果によるＰＳＷ中のフラグ値の更新制御も、Ｅステージ４０３で行われる。しかし、演算結果の確定がＥステージ４０３の遅い時期になるため、実際のＰＳＷ値の更新は、次サイクルで行われる。データ転送によるＰＳＷの更新は、対応するサイクルで完了する。 The E stage 403 also performs update control of the flag value in the PSW based on the calculation results in the first calculation unit 222 and the second calculation unit 223. However, since the calculation result is confirmed later in the E stage 403, the actual PSW value is updated in the next cycle. The update of the PSW by data transfer is completed in the corresponding cycle.

Ｅステージ４０３では、実行する命令に依存しないＰＣ値の更新やブロックリピート制御も行われる。新しい３２ビット命令の処理を開始するたびに、ラッチ３３２の値をＥＰＣ３３４に転送する。ＮＰＣ３３１は次に処理する命令のアドレスを保持する。Ｅステージ４０３でジャンプが起こった場合には、ＡＬＵ３０１で生成されるジャンプ先アドレスがＪＡバス２７４を介してＮＰＣ３３１に書き込まれ、初期化される。シーケンシャルに命令の処理が継続する場合には、３２ビット命令の処理を開始するたびに、インクリメンタ３３３で１インクリメントされた値がＮＰＣ３３１に書き戻される。ブロックリピート継続でリピートブロック最終命令の処理を開始する際には、ラッチ３４２からリピートブロックの先頭アドレスを取り込む。リピートブロック最終命令の処理を終了するサイクルで、ＲＰＴＣレジスタ３４５の値がラッチ３４６を介してデクリメンタ３４７でデクリメントして書き戻される。ブロックリピート処理を終了する場合、リピートブロック最終命令の処理を終了するサイクルで、ＰＳＷのＲＰビット４３を“０”にクリアする。 In the E stage 403, updating of the PC value and block repeat control independent of the instruction to be executed are also performed. Each time processing of a new 32-bit instruction is started, the value of the latch 332 is transferred to the EPC 334. The NPC 331 holds the address of the instruction to be processed next. When a jump occurs in the E stage 403, the jump destination address generated by the ALU 301 is written to the NPC 331 via the JA bus 274 and initialized. When instruction processing is continued sequentially, a value incremented by 1 by the incrementer 333 is written back to the NPC 331 every time processing of a 32-bit instruction is started. When the processing of the repeat block final instruction is started by continuing the block repeat, the start address of the repeat block is fetched from the latch 342. In the cycle in which the processing of the repeat block final instruction is completed, the value of the RPTC register 345 is decremented by the decrementer 347 via the latch 346 and written back. When the block repeat process ends, the RP bit 43 of the PSW is cleared to “0” in a cycle in which the repeat block final instruction process is completed.

The memory access information and load register information for load/store instructions generated by the first decoder 214 are held under the control of the E stage 403 and passed to the M stage 404. Likewise, the operation control signals generated by the second decoder 215 for the addition/subtraction part of multiply-add/multiply-subtract operations are held under the control of the E stage 403 and passed to the E2 stage 406. Stage control of the E stage 403 is also performed by the control unit 211.

In the M stage 404, the operand is accessed at the address sent from the first operation unit 222. When the operand is in the internal data memory 205 or in on-chip I/O (not shown), the operand access unit 204 performs one operand read or write per clock cycle on the internal data memory 205 or the on-chip I/O. When the operand is in neither the internal data memory 205 nor on-chip I/O, the operand access unit 204 issues a data access request to the external bus interface unit 206. The external bus interface unit 206 accesses the external memory and, for a load, transfers the data that was read to the operand access unit 204.

The external bus interface unit 206 can access the external memory in a minimum of two clock cycles. For a load, the operand access unit 204 transfers the data that was read via the OD bus 271. Byte data is written to the LD register 315; word or two-word data is written directly to the register file 221. For a store, the aligned store data is transferred from the alignment circuit 313 through the latch 314 and the OD bus 271 to the operand access unit 204 and is written to the target memory. Stage control of the M stage 404 is also performed by the control unit 211.

In the W stage 405, the load operand (a byte) held in the LD register 315 is aligned and zero-extended or sign-extended by the alignment circuit 316, transferred to the latch 317, and written to the register file 221 via the W bus 272.

In the E2 stage 406, the addition/subtraction part of a multiply-add/multiply-subtract operation is performed, and the result is written back to the accumulator.

This data processing apparatus performs internal control based on an input clock. In the shortest case, each pipeline stage completes its processing in one internal clock cycle. The details of clock control are not directly related to the present invention and are therefore omitted here.

Processing examples for each sub-instruction are described next. Operation instructions such as addition/subtraction, logical operations, and comparison, as well as register-to-register transfer instructions, complete in three stages: the IF stage 401, the D stage 402, and the E stage 403. The operation or data transfer is performed in the E stage 403.

A multiply-add/multiply-subtract instruction executes over two clock cycles, with multiplication in the E stage 403 and addition/subtraction in the E2 stage 406, and is therefore processed in four stages.

A byte load instruction completes in five stages: the IF stage 401, the D stage 402, the E stage 403, the M stage 404, and the W stage 405. One-word and two-word load and store instructions complete in four stages: the IF stage 401, the D stage 402, the E stage 403, and the M stage 404.
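The stage counts listed above can be summarized in a small lookup table. The dictionary keys and the helper name below are labels invented for this sketch; only the stage sequences come from the description.

```python
# Pipeline stages traversed by each class of sub-instruction.
# The class names are illustrative labels, not terms from the specification.
STAGES = {
    "alu_or_register_transfer": ["IF", "D", "E"],
    "multiply_add_or_subtract": ["IF", "D", "E", "E2"],
    "byte_load": ["IF", "D", "E", "M", "W"],
    "word_or_two_word_load_store": ["IF", "D", "E", "M"],
}


def pipeline_depth(kind):
    """Return the number of pipeline stages for an instruction class."""
    return len(STAGES[kind])


for kind in STAGES:
    print(kind, pipeline_depth(kind))
```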

An unaligned access is split by the operand access unit 204, under the control of the M stage 404, into two aligned accesses, and the memory accesses are then performed.

An instruction that takes two cycles to execute is processed over two cycles by the first and second instruction decoders 214 and 215, which output execution control signals in each cycle, and the operation is executed over two cycles.

For a long instruction, one 32-bit instruction consists of a single long instruction, and processing that long instruction completes execution of the 32-bit instruction. When two short instructions are executed in parallel, the processing time is determined by whichever of the two takes more cycles. For example, a combination of a two-cycle instruction and a one-cycle instruction takes two cycles. When two short instructions are executed sequentially, the processing is the combination of the two sub-instructions: each instruction is decoded in sequence at the decode stage and then executed. For example, for two add instructions that each complete in one cycle in the E stage 403, both the D stage 402 and the E stage 403 spend one cycle on each instruction, for a total of two cycles. Decoding of the following instruction in the D stage 402 proceeds in parallel with execution of the preceding instruction in the E stage 403.
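The cycle counts for a pair of short instructions can be written as a one-line rule: the longer of the two for parallel execution, the sum for sequential execution. The function and parameter names below are assumptions made for this sketch.

```python
def pair_execution_cycles(cycles_a, cycles_b, parallel):
    """E-stage cycles for two short instructions in one 32-bit word.

    parallel=True:  both execute together, bounded by the slower one.
    parallel=False: they execute one after the other.
    """
    if parallel:
        return max(cycles_a, cycles_b)
    return cycles_a + cycles_b


# A two-cycle and a one-cycle instruction executed in parallel.
print(pair_execution_cycles(2, 1, parallel=True))
# Two one-cycle add instructions executed sequentially.
print(pair_execution_cycles(1, 1, parallel=False))
```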

Next, the block repeat operation of the data processing apparatus in this embodiment is described in detail. The repeat instruction of this data processing apparatus has two instruction formats.

FIG. 15 and FIG. 16 are explanatory diagrams showing the bit assignments of the repeat instructions. FIG. 15 shows the bit assignment of “REP Rs, disp”, which specifies the repeat count with a register value, and FIG. 16 shows the bit assignment of “REPI imm, disp”, which specifies the repeat count with an 8-bit immediate value. Both use the long instruction format, with an instruction bit assignment close to the one shown in FIG. 9.

The FM bits 101 are “11”, indicating the long format. In FIG. 15 and FIG. 16, 501 and 511 are operation codes and 502 is a reserved field. disp 504 and disp 513 specify the address of the final instruction of the repeat block as a displacement from the address of the repeat instruction, in units of 32-bit instruction words. The repeat block (the block of instructions to be repeated) extends from the instruction following the repeat instruction to the instruction specified by disp 504 or disp 513. Rs 503 is the number of the register that specifies the repeat count of the repeat block. imm 512 is an immediate field that specifies the repeat count of the repeat block. In this embodiment, operation is not guaranteed when the repeat count is “0”.
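As a worked example of how the displacement delimits the repeat block, the sketch below computes the first and last instruction addresses. It assumes addresses are counted in 32-bit instruction words, so that one instruction corresponds to an address step of 1; the function name is hypothetical.

```python
def repeat_block_range(repeat_addr, disp):
    """Return (first, last) addresses of the repeat block.

    repeat_addr: address of the repeat instruction itself.
    disp:        displacement to the final instruction, in 32-bit words.
    Assumes word-granular addresses (an assumption of this sketch).
    """
    if disp < 1:
        raise ValueError("disp must be at least 1")
    first = repeat_addr + 1      # instruction following the repeat instruction
    last = repeat_addr + disp    # instruction designated by disp
    return first, last


first, last = repeat_block_range(100, 3)
print(first, last, last - first + 1)  # block size equals disp
```

With this convention the block size in instructions equals the displacement value, which matches the classification by displacement used in the rest of the description.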

Repeat instructions are basically classified into three types according to the size of the repeat block, and each type is handled differently. The first type is a repeat block of one instruction (displacement value “1”): a jump is performed after the repeat instruction executes, and the pipeline is canceled.

The second type is a repeat block of two instructions (displacement value “2”): instruction fetch is temporarily stopped (inhibited) when the repeat instruction is decoded, and the inhibition is released when execution of the repeat instruction ends.

The third type is a repeat block of three or more instructions (displacement value “3” or more): pipeline processing simply continues, with neither a jump after the repeat instruction executes nor temporary inhibition of instruction fetch.
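The three cases above amount to a classification on the displacement value. The following sketch expresses that classification; the enum and function names are invented for illustration.

```python
from enum import Enum


class RepeatHandling(Enum):
    JUMP_AND_CANCEL = "jump after the repeat instruction, cancel pipeline"
    INHIBIT_FETCH = "inhibit fetch at decode, release at end of execution"
    CONTINUE = "no special handling, pipeline continues"


def classify_repeat(disp):
    """Classify a repeat instruction by its displacement (block size)."""
    if disp < 1:
        raise ValueError("disp must be at least 1")
    if disp == 1:
        return RepeatHandling.JUMP_AND_CANCEL
    if disp == 2:
        return RepeatHandling.INHIBIT_FETCH
    return RepeatHandling.CONTINUE


for d in (1, 2, 3, 8):
    print(d, classify_repeat(d).name)
```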

FIG. 17 is a block diagram schematically showing only the part of the control unit 211 of the data processing apparatus according to the embodiment of the present invention that is needed to explain repeat control. The actual logic is more complex, and some of it cannot be divided into blocks as cleanly as FIG. 17 suggests. The figure is a conceptual block diagram drawn for ease of explanation.

The IF stage control unit 651 controls the instruction fetch stage. The instruction fetch request generation unit 652, which resides in the IF stage control unit 651 and also functions as part of the repeat-time pipeline control means, generates the instruction fetch request signal 691 based on the free space in the instruction queue and on the repeat-related control signals (the instruction fetch inhibit signal 681 and the instruction fetch inhibit release signal 682).

The repeat control unit 653 performs repeat control. From the PC unit 224 it receives match information 692, which is the output of the comparator 344 indicating whether the instruction fetch address (the value of the IA register 337) matches RPTE (the value of the RPTE register 343), and detection result information 693, which is the output of the 1-detection circuit 349 indicating whether the value of the TRPTC register 348 is “1”. The TRP bit 654 is a repeat state bit that holds information corresponding to the RP bit 43 of the control register CR0 (PSW). The definitive repeat state is managed in the execution stage; the TRP bit 654 is a copy kept so that it can be updated ahead of time at the instruction fetch stage of the pipeline. The instruction queue 212 is a buffer that holds instruction data fetched by the instruction fetch unit 202 and can queue two 32-bit instructions.

The D stage control unit 661 controls the instruction decode stage. The instruction register 662 in the instruction decode unit 213 holds the instruction being decoded. Repeat instructions are processed only by the first decoder 214. The decoder 663 decodes the operation code. The disp determination unit 664, which also functions as a repeat setting determination unit, performs a determination operation that checks whether the displacement value of the repeat instruction satisfies a predetermined condition, namely whether it is “1”, “2”, or “3” or more.

The control signal generation unit 665, which also functions as part of the repeat-time pipeline control means, generates the control signals needed to execute an instruction based on inputs from the decoder 663, the disp determination unit 664, the instruction register 662, and other parts not shown. When a repeat instruction starts and the displacement value is “2” or less, it outputs an active instruction fetch inhibit signal 681 to the instruction fetch request generation unit 652 via a predetermined signal line.

The E stage control unit 671 controls the instruction execution stage. The jump control unit 672 in the E stage control unit 671 controls jumps. When a jump is to be performed, it sends an active jump signal 684 via predetermined signal lines to the IF stage control unit 651 and the D stage control unit 661, canceling the processing already under way in the earlier pipeline stages. In other words, the jump control unit 672 functions as part of the repeat-time pipeline control means.

The E stage control signal latch 673, which also functions as part of the repeat-time pipeline control means, holds the control signals needed for control in the execution stage. At the end of a repeat instruction whose displacement value is “2” (the first reference value), the active instruction fetch inhibit release signal 682 held in the E stage control signal latch 673 is sent, in the E stage of REPI(2), to the instruction fetch request generation unit 652 via a predetermined signal line. At the end of a repeat instruction whose displacement value is “1” (the second reference value, the predetermined reference value), a jump request signal 683 requesting a jump is sent in the E stage to the jump control unit 672 via a predetermined signal line.
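The conditions under which the repeat-related control signals become active can be collected in one place. The sketch below returns, for a given displacement value, which of the signals 681, 682, and 683 would be asserted during a repeat instruction; the dictionary keys are labels chosen for this sketch, and the timing (decode versus E stage) is noted only in comments.

```python
def repeat_control_signals(disp):
    """Which repeat-related signals are asserted for a displacement value.

    Timing is not modeled: 681 is raised at the start of the repeat
    instruction (decode), 682 and 683 in the E stage at its end.
    """
    if disp < 1:
        raise ValueError("disp must be at least 1")
    return {
        "fetch_inhibit_681": disp <= 2,          # block of 1 or 2 instructions
        "fetch_inhibit_release_682": disp == 2,  # first reference value
        "jump_request_683": disp == 1,           # second reference value
    }


for d in (1, 2, 3):
    print(d, repeat_control_signals(d))
```

For a displacement of “1” the inhibition is not released by signal 682; the instruction fetch request generation unit leaves the inhibited state when it receives the jump signal 684 that follows the jump request.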

The RP bit 675 in the PSW unit 323 is the latch that physically holds the RP bit 43 of the control register CR0 (PSW); it is shown explicitly here. When a jump occurs, the value of the RP bit 675 is transferred to the TRP bit 654 to reset it, so that the processing in the earlier pipeline stages is canceled.

Next, using several program examples, the operation of the data processing apparatus of this embodiment when executing a repeat instruction and during block repeat processing is described in detail.

FIG. 18 is an explanatory diagram showing a program example in which the repeat (target) block contains three instructions (the total code size corresponding to this number of instructions is the predetermined code size). The particular instructions executed are not important to the description of the present invention, so they are written in abbreviated form, such as Ia and I1. Each instruction is assumed to be either a long instruction that completes in one cycle or a pair of instructions executed in parallel. The instructions Ia to Ic on command lines 701 to 703 are executed before the REPI (repeat) instruction 704, which is a repeat setting instruction that defines the repeat-related settings. The instructions I1 to I3 on command lines 705 to 707 form the repeat block and are executed twice. The instructions from command line 708 onward, such as I4, are the instructions that follow the repeat block.

FIG. 19 is a timing diagram showing the pipeline processing during repeat processing when the data processing apparatus of the first embodiment executes the program shown in FIG. 18. The figure shows the case in which instructions stored in the internal instruction memory 203 are executed.

In the figure, the upper part shows the processing in each pipeline stage (IF stage, D stage, E stage, M/E2 stage), and the lower part shows the values of the registers and signals related to repeat processing. “Instruction fetch request” is the value of the instruction fetch request signal 691 in FIG. 17. IA is the value of the IA register 337 in FIG. 31, RPTE the RPTE register 343, RPTS the RPTS register 341, TRPTC the TRPTC register 348, RPTC the RPTC register 345, NPC the NPC 331, and EPC the EPC 334.

“IA==RPT_E” is the match information 692 shown in FIG. 13 and FIG. 17, and “TRPTC==1” is the detection result information 693 shown in FIG. 13 and FIG. 17. The RP bit is the value of the RP bit 675 shown in FIG. 17, and the TRP bit is the value of the TRP bit 654 shown in FIG. 17. In FIG. 19, a period in which “H” and “L” overlap (partly shown hatched) is an undefined period. The circles on “IA==RPT_E” and “TRPTC==1” mark the times at which those signals are recognized.

The repeat instruction is decomposed into two steps in the D stage 402, and each step is executed in one cycle in the E stage 403, for a total of two cycles. In FIG. 19 the two steps are labeled “REPI(1)” and “REPI(2)”. The first step (REPI(1)) sets the RPTE register 343 and the RPTS register 341, and the second step (REPI(2)) sets the RPTC register 345, the TRPTC register 348, the RP bit 675, and the TRP bit 654. When the REPI instruction is decoded, the disp determination unit 664 evaluates the displacement value of the repeat instruction. When the displacement value is “3” or more, neither the temporary instruction fetch stop control (instruction fetch inhibit and release) nor the jump control in the E stage (pipeline cancel), both described later for displacement values of “2” or less, is performed.

In cycle T3, the first step of the REPI instruction (REPI(1)) is executed. The value of the EPC 334, which holds the address of the REPI instruction 704, is loaded into the AA latch 302 via the S3 bus 253, and the value of the disp field 513 of the REPI instruction is loaded from the first decoder 214 into the AB latch 303. The ALU 301 adds the two values to compute the repeat block end address, that is, the address of instruction I3, and the sum is written to the RPTE register 343 via the JA bus 274. In addition, the address of instruction I1, the first instruction of the repeat block, which is held in the NPC 331, is written to the RPTS register 341 via the latch 332 and the D1 bus 261, and is then also written to the latch 342.

In cycle T4, the second step of the REPI instruction (REPI(2)) is executed. The AA latch 302 is set to “0”. The value of the imm field 512 of the REPI instruction is loaded from the first decoder 214 into the AB latch 303, the ALU 301 adds it to “0”, and the sum is written to the RPTC register 345 and the TRPTC register 348 via the D1 bus 261. The RP bit 675 and the TRP bit 654, which indicate that repeat processing is in progress, are set to “1”. The repeat-related settings made by the REPI instruction are complete in cycle T4, so the instruction fetch sequence can be switched by the repeat mechanism for instructions fetched in cycle T5 and later.
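The register settings made by the two steps of the REPI instruction can be sketched as follows. This is a simplified software model, not the hardware data path: the function name and the dictionary representation of the registers are assumptions, and addresses are counted in 32-bit instruction words.

```python
def execute_repi(epc, npc, disp, imm):
    """Model the register state produced by REPI(1) and REPI(2).

    epc:  address of the REPI instruction (value held in EPC).
    npc:  address of the next instruction, i.e. the first block instruction.
    disp: displacement to the final block instruction.
    imm:  repeat count (a count of 0 is not guaranteed to work).
    """
    regs = {}
    # Step 1, REPI(1): block end and start addresses.
    regs["RPTE"] = epc + disp   # ALU adds EPC and disp
    regs["RPTS"] = npc          # start address taken from NPC
    # Step 2, REPI(2): repeat count and repeat state bits.
    count = 0 + imm             # ALU adds "0" and imm
    regs["RPTC"] = count
    regs["TRPTC"] = count
    regs["RP"] = 1
    regs["TRP"] = 1
    return regs


# REPI at address 100 with a three-instruction block repeated twice.
print(execute_repi(epc=100, npc=101, disp=3, imm=2))
```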

This program example uses the REPI instruction as the repeat setting instruction. For the REP instruction, which specifies the repeat count with a register value, the only difference is that the value (the repeat count) of the register specified by the register number field 503 is read from the register file 221 onto the S3 bus 253 and transferred to the AB latch 303.

Because this data processing apparatus has a two-entry instruction queue, instructions I1 and I2 are fetched while the REPI instruction is being decoded, but the instruction queue 212 remains full until cycle T4, when decoding of instruction I1 starts, so no instruction fetch request for instruction I3 is issued before then. When decoding of instruction I1 starts in cycle T4, the entry of the instruction queue 212 that held instruction I1 becomes free, and instruction I3 is fetched in cycle T5. This example shows the case in which the instruction queue is at its fullest. In other words, when the repeat block contains three or more instructions, block repeat processing can be started without any special consideration of the state of the earlier pipeline stages.

Next, the repeat control that is performed in hardware, independently of the instruction being executed, is described. Switching of the instruction processing sequence takes place in the IF stage 401. Repeat sequence control is performed while the TRP bit 654 is “1”. While the TRP bit 654 is “1”, the comparator 344 compares the instruction fetch address in the IA register 337 with the address of the final instruction of the repeat block in the RPTE register 343. Also while the TRP bit 654 is “1”, the 1-detection circuit 349 checks whether the value of the TRPTC register 348 is “1”.

When the TRP bit 654 is “1” and the value of the TRPTC register 348 is not “1”, a match between the value of the IA register 337 and the RPTE register 343 indicates that the final instruction of the repeat block is being fetched and that the repeat continues. The processing sequence is switched, and the instruction at the start address of the repeat block (I1) is fetched next. That is, after the final instruction of the repeat block (I3) is fetched in cycle T5, instruction I1 is fetched in cycle T6. In cycle T5, the address of instruction I1 held in the latch 342 is transferred to the IA register 337 via the JA bus 274 and used as the instruction fetch address. Whenever the TRP bit 654 is “1” and the value of the IA register 337 matches the RPTE register 343, the value of the TRPTC register 348 is decremented by 1. That is, the value of the TRPTC register 348 is decremented by 1 in cycle T5.

When the TRP bit 654 is “1” and the value of the TRPTC register 348 is “1”, a match between the value of the IA register 337 and the RPTE register 343 indicates that repeat processing is ending. In this case the instructions of the repeat block have been repeated the specified number of times, so the instruction fetch sequence is not switched, and the instruction following the repeat block (I4) is fetched next. That is, after instruction I3 is fetched in cycle T8, instruction I4 is fetched in cycle T9. The value of the TRPTC register 348 is decremented in cycle T8 in this case as well. Because repeat processing has finished at the instruction fetch stage, the TRP bit 654 is cleared to “0” in cycle T8. Sequential instruction fetch continues thereafter.
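The fetch-address sequencing in the IF stage can be modeled with a short loop. The sketch below is a simplification: it treats the TRP bit as already set before the first fetch, ignores the instruction queue and cycle timing, and counts addresses in 32-bit instruction words. The function and variable names are assumptions.

```python
def fetch_sequence(start, rpts, rpte, count, n_fetches):
    """Return the list of fetch addresses produced by repeat sequencing.

    start: first fetch address; rpts/rpte: block start/end addresses;
    count: repeat count loaded into TRPTC; n_fetches: fetches to simulate.
    """
    ia, trp, trptc = start, 1, count
    fetched = []
    for _ in range(n_fetches):
        fetched.append(ia)
        if trp == 1 and ia == rpte:      # comparator 344 reports a match
            if trptc == 1:               # 1-detection: last pass of the block
                trp = 0                  # repeat ends at the fetch stage
                next_ia = ia + 1         # continue with the following instruction
            else:
                next_ia = rpts           # loop back to the block start
            trptc -= 1                   # decremented on every match
        else:
            next_ia = ia + 1             # ordinary sequential fetch
        ia = next_ia
    return fetched


# Block I1..I3 at addresses 101..103, repeated twice, then I4 at 104.
print(fetch_sequence(start=101, rpts=101, rpte=103, count=2, n_fetches=7))
```

The printed sequence visits the block twice and then moves on to the address after the block, mirroring the order I1, I2, I3, I1, I2, I3, I4 of the three-instruction example.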

Two pieces of information are held and passed along with the instruction code: repeat block final instruction information, which indicates that the instruction fetch address matched the RPTE register 343 during block repeat processing, and block repeat end information, which indicates that the instruction fetch address matched the RPTE register 343 during block repeat processing and that the TRPTC register 348 was “1” before being updated. Based on this information, the later stages perform the instruction-independent hardware control related to block repeat processing.

When the final instruction of the repeat block (I3) is processed and the repeat continues, the address of the first instruction of the repeat block (I1) is written from the latch 342 to the NPC 331 as the address of the next instruction to execute in cycle T6, before the instruction is executed in the E stage 403. In cycles T7 and T10, when the final instruction of the repeat block (I3) is executed, the value of the RPTC register 345 is decremented. In cycle T10, when the final instruction of the repeat block (I3) that ends repeat processing is executed, the RP bit 675 is cleared to zero.

Performing repeat control independently of the instructions being executed in this way achieves repetition with no overhead. However, when the repeat block contains “2” or fewer instructions, the fetch of the final instruction of the repeat block would already be finishing by cycle T4, which is the first cycle in which the various determinations for repeat control can be made correctly. In a configuration in which the instruction fetch unit fetches instructions unconditionally, correct repeat control therefore cannot be achieved. The execution control of this data processing apparatus when the repeat block contains two or fewer instructions is described below.

FIG. 20 is an explanatory diagram showing a program example in which the repeat block contains two instructions. FIG. 21 is a timing diagram showing the pipeline processing during repeat processing when the data processing apparatus of this embodiment executes the program shown in FIG. 20. As shown in FIG. 20, the repeat block consists of the two instructions I1 and I2, and these instructions are executed twice.

The following description focuses on the differences from the program processing example shown in FIG. 18 and FIG. 19. In FIG. 21, a period in which “H” and “L” overlap (partly shown hatched) is an undefined period. The circles on “IA==RPT_E” and “TRPTC==1” mark the times at which those signals are recognized.

At the beginning of cycle T2, in which decoding of the repeat instruction starts, the instruction queue holds two instructions, the REPI instruction and instruction I1, so the fetch of instruction I2 has not yet started, regardless of the earlier pipeline state. Decoding of the REPI instruction starts in cycle T2. When the repeat instruction is decoded, the disp determination unit 664 evaluates the displacement value that designates the final instruction of the repeat block (the predetermined code size) and determines that it is “2”.

In the data processing apparatus of Embodiment 1, when the disp determination unit 664 determines, during decoding of the first step of the REPI instruction (REPI(1)), that the displacement value is “2” or less, the control signal generation unit 665 receives the result and sends an “H” (active) instruction fetch suppression signal 681 to the instruction fetch request generation unit 652 over a predetermined signal line. At the same time, a signal relating to the instruction fetch suppression release signal 682 is sent to the E stage control signal latch 673.

On receiving the “H” instruction fetch suppression signal 681, the instruction fetch request generation unit 652 in the IF stage control unit 651 starts instruction fetch suppression control, which temporarily stops instruction fetching, and stops asserting the instruction fetch request signal 691. It then maintains the instruction fetch suppressed state until it receives an “H” (active) instruction fetch suppression release signal 682 or an “H” jump signal 684.
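The suppress-and-hold behaviour of the instruction fetch request generation unit can be pictured as a small latch. The following Python sketch is only an illustration under stated assumptions: the class and argument names are hypothetical, one call to `step` stands for one clock cycle, and the priority given to release or jump over a simultaneous suppress is an assumption that the description does not specify.

```python
from dataclasses import dataclass


@dataclass
class FetchRequestGenerator:
    """Toy model of the fetch-suppression latch (all names are illustrative)."""

    suppressed: bool = False

    def step(self, suppress: bool = False, release: bool = False,
             jump: bool = False) -> bool:
        """Advance one cycle and report whether a fetch request may be asserted.

        suppress models an "H" suppression signal (681), release an "H"
        suppression release signal (682), jump an "H" jump signal (684).
        """
        if release or jump:
            # Either signal ends the suppressed state (assumed to win over suppress).
            self.suppressed = False
        elif suppress:
            # Enter the suppressed state and hold it on later cycles.
            self.suppressed = True
        return not self.suppressed


gen = FetchRequestGenerator()
requests = [
    gen.step(),               # normal fetching
    gen.step(suppress=True),  # REPI(1) decode with disp <= 2
    gen.step(),               # state is held
    gen.step(release=True),   # REPI(2) execution releases it
]
```

The point of modelling it as a latch is that the suppression signal only needs to be active for one cycle; the stopped state persists until one of the two terminating signals arrives.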

The setup performed by the REPI instruction then completes in the T4 cycle, in which the second step of the REPI instruction (REPI(2)) is executed. In the T4 cycle, the “H” instruction fetch suppression release signal 682 is sent from the E stage control signal latch 673 to the instruction fetch request generation unit 652 over a predetermined signal line. In response, the IF stage control unit 651 leaves the instruction fetch suppressed state, issues the fetch request for the I2 instruction in the T4 cycle, and fetches the I2 instruction in the following T5 cycle.

By providing an instruction fetch suppressed state in which instruction fetching is paused until the setup by the repeat setting instruction (REPI) completes, the apparatus guarantees that the repeat-related determinations and control at the instruction fetch stage are performed correctly from the fetch of the I2 instruction onward.

In Embodiment 1, however, pausing instruction fetch introduces, in the processing example of FIG. 21, an overhead of one clock cycle before execution of the I2 instruction starts. This overhead arises only on the first pass; subsequent iterations incur none. Depending on the pipeline state, there may be no overhead at all. For example, if the I1 instruction contains two short sub-instructions that are executed sequentially, executing I1 takes two cycles, so no overhead arises.

By pausing instruction fetch during repeat instruction processing in this way, repeat processing can be realized with low overhead even when the repeat block contains two instructions.

That is, even when the number of instructions in the repeat block (predetermined code size) is “2” (first reference value) or less, the data processing apparatus of Embodiment 1 stops fetching the instructions to be pipelined from the time the disp determination unit 664 determines that the number of instructions is “2” or less until the repeat-related setup specified by the repeat setting instruction is finished. It can thereby perform a normal repeat operation while keeping overhead to a minimum.

FIG. 22 is an explanatory diagram showing a program example in which the repeat block contains one instruction. FIG. 23 is a timing chart showing the pipeline processing during repeat processing when the data processing apparatus of the present embodiment executes the program shown in FIG. 22. As shown in FIG. 22, only the I1 instruction forms the repeat block, and it is executed twice. The following description focuses on the differences from the program processing example shown in FIGS. 18 and 19. In FIG. 23, a period in which “H” and “L” overlap (partly shown hatched) denotes an undefined period. The circles on “IA==RPT_E” and “TRPTC==1” mark the timing at which each condition is recognized.

At the start of the T2 cycle, in which decoding of the repeat instruction begins, the fetch of the I1 instruction has already completed even though the repeat-related setup has not yet begun. If processing simply continued, the hardware configuration described so far could not produce correct results. The reason is as follows.

In a data processing apparatus equivalent to the present embodiment, the repeat-related determinations start in the second half of the cycle in which the instruction fetch request is output. The address match and count value determinations would therefore start in the second half of the T1 cycle, in which the fetch request for the I1 instruction is output.

However, the repeat-related control registers and other state needed for those determinations are set in the T3 and T4 cycles, in which the repeat instruction is executed. A correct determination therefore cannot be made in the T1 cycle, and after the I1 instruction is fetched in the T2 cycle, the apparatus cannot decide that repeat processing requires I1 to be fetched again next. The TRPTC decrement and related processing that should accompany the first fetch of I1 in the T2 cycle cannot be performed either. Because the repeat-related processing cannot be carried out correctly, normal operation of the subsequent instructions cannot be guaranteed.

For this reason, when the repeat block consists of one instruction, the data processing apparatus of Embodiment 1 cancels the processing in the earlier pipeline stages after the repeat-related setup finishes and redoes the processing starting from the first fetch of the I1 instruction.

Decoding of the REPI instruction starts in the T2 cycle. When the repeat instruction is decoded, the disp determination unit 664 examines the displacement value (predetermined code size) that designates the last instruction of the repeat block and determines that it is “1”. Because the displacement value is “2” or less during decoding of the first step of the REPI instruction (REPI(1)), the control signal generation unit 665 sends the “H” instruction fetch suppression signal 681 to the instruction fetch request generation unit 652 over a predetermined signal line, as described above. At the same time, information relating to the jump request signal 683 is latched in the E stage control signal latch 673.

The IF stage control unit 651 enters the instruction fetch suppressed state in response to the “H” instruction fetch suppression signal 681. It therefore pauses instruction fetching until it subsequently receives an “H” instruction fetch suppression release signal 682 or an “H” jump signal 684.

When the repeat block consists of one instruction, the pipeline processing is cancelled by the “H” jump signal 684 in any case, so stopping instruction fetch is not strictly necessary. It is stopped nonetheless in order to avoid unnecessary instruction fetches as far as possible.

The second step of the REPI instruction (REPI(2)) is executed in the T4 cycle. In this step, in addition to the processing performed when the repeat block contains “3” or more instructions, a jump to the first instruction of the repeat block is performed. For this purpose, the jump request signal 683, which requests a jump, is sent from the E stage control signal latch 673 to the jump control unit 672 in the E stage control unit 671.

As a result, the instruction address of I1, the first instruction of the repeat block, is transferred from the NPC 331 to the IA register 337 through the latch 332 and the JA bus 274, and the jump control unit 672 sends the “H” jump signal 684 to the IF stage control unit 651 and the instruction fetch request generation unit 652. On receiving the “H” jump signal 684, the IF stage control unit 651 and the D stage control unit 661 each cancel the processing in the earlier pipeline stages.

By refetching the I1 instruction, whose fetch had already completed, the IF stage control unit 651 ensures that I1 is fetched after the setup by the repeat instruction is done, so repeat control is performed correctly.

In this case, however, the processing in the earlier pipeline stages is cancelled, so in the processing example of FIG. 23 an overhead of two clock cycles arises before execution of the I1 instruction starts. This overhead arises only on the first pass; subsequent iterations incur none.

Thus, because the data processing apparatus of Embodiment 1 has a pipeline cancel function that performs a jump when the setup by the repeat setting instruction completes, it can realize repeat processing even when the repeat block contains a single instruction.

That is, by means of the pipeline cancel function, the data processing apparatus of Embodiment 1 can perform a normal repeat operation while keeping overhead to a minimum even when the number of instructions in the repeat block (predetermined code size) is “1” (second reference value).

As described above, the data processing apparatus of Embodiment 1 has an instruction fetch suppression and release function, which pauses instruction fetching during repeat instruction processing, and a pipeline cancel function, which cancels the processing in the earlier pipeline stages by means of a jump. Together these guarantee normal operation even when the block repeated by the repeat instruction is small.
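How the two functions are selected from the displacement value can be summarized in a few lines. The Python sketch below is a hedged illustration, not the circuit itself: the function name and the returned keys are hypothetical, and the displacement is assumed to count fixed-length instructions, as in the two-instruction and one-instruction examples.

```python
def plan_repeat_handling(disp: int) -> dict:
    """Return which special controls apply for a given displacement value.

    disp is assumed to equal the number of instructions in the repeat block.
    """
    if disp < 1:
        raise ValueError("a repeat block must contain at least one instruction")
    return {
        # Fetch is paused at REPI(1) decode whenever disp <= 2
        # (first reference value).
        "suppress_fetch": disp <= 2,
        # For disp == 1 (second reference value) the earlier pipeline stages
        # are cancelled by a jump to the block's first instruction at REPI(2).
        "jump_at_setup_end": disp == 1,
    }


plans = {d: plan_repeat_handling(d) for d in (1, 2, 3)}
```

For a block of three or more instructions neither control is needed, which matches the zero-overhead case described before the small-block discussion.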

As a result, programs are no longer constrained in the size of the repeat block that a repeat instruction can execute. Loops need not be unrolled unnecessarily, and the code that handles remainder iterations which the repeat instruction could not otherwise cover, including the case where the iteration count changes dynamically, becomes simpler. Both code size and the number of processing cycles can be reduced.

In digital signal processing in particular, repeat processing accounts for a very large share of the work, so reducing the code size of repeat processing is effective. Moreover, when the program is placed in ROM, the code efficiency of the program directly affects chip size, so this also contributes significantly to cost reduction.

Thus, with the instruction fetch suppression and release function and the pipeline cancel function, the data processing apparatus of Embodiment 1 pauses instruction fetching from the D stage 402 of the first step of the repeat setting instruction (REPI(1)), and at the E stage 403 of the second step (REPI(2)) either releases the suppression or cancels the processing in the earlier pipeline stages by means of a jump. This yields a data processing apparatus with good code efficiency, high performance, low power consumption and low cost. Because programs themselves become simpler, program development productivity improves and fewer bugs are introduced.

The configuration of the data processing apparatus in Embodiment 1 described above is only one example and does not limit the scope of the present invention. In principle, the invention yields the same effect in any data processing apparatus that performs pipeline processing and has a hardware repeat function; it is not limited to the configuration shown in the above embodiment.

The above embodiment applies the technique of the present invention to a VLIW processor, but the technique is applicable to data processing apparatuses of essentially any architecture, such as RISC processors, CISC processors and DSPs. It may also be applied to a processor with a variable-length instruction set architecture. With a variable-length instruction set, however, alignment must be taken into account, which makes control somewhat more complex. In addition, because the number of instructions and the code size do not correspond one-to-one, the handling must be decided from the code size of the repeat block.

The above embodiment implements both the instruction fetch suppression and release function, based on the instruction fetch suppression signal 681 and the instruction fetch suppression release signal 682, and the pipeline cancel function, based on the jump signal 684, but either function may be implemented alone. For example, only the pipeline cancel function may be implemented, so that only pipeline cancellation based on the “H” jump signal 684 at the end of the repeat instruction's setup is performed, and the jump described above is always performed when the repeat block contains two or fewer instructions.

Alternatively, only the instruction fetch suppression and release function may be implemented, handling a repeat block of two instructions but not guaranteeing operation for a repeat block of one instruction. Implementing only one of the two functions reduces the benefit or adds restrictions; the choice should be made at design time by weighing development cost against performance.

For simplicity, Embodiment 1 shows a simple example with a relatively shallow pipeline and two instruction queues, but the invention may be applied to pipeline configurations different from that embodiment. When the pipeline is made finer-grained for higher performance (less work per stage and more stages) so that many stages precede the stage where the repeat-related setup is performed, or when the instruction queue has many entries, the maximum number of instructions taken in before the repeat-related setup completes grows, and the smallest repeat block size that can be guaranteed without countermeasures grows with it. The technique of the present invention is particularly effective in such cases.

Embodiment 1 implements a single level of repeat, but the invention is also applicable when nested, multi-level repeats are implemented.

The present invention is also applicable to a configuration that fetches instructions for multiple processing sequences, covering both the taken and not-taken directions of a conditional branch. In more detail, to reduce the branch penalty in conditional branch processing (the idle E-stage cycles between the fetch and the execution of the branch target instruction when a branch cancels the pipeline), some high-performance processors fetch instructions from both the taken and the not-taken sequence, and sometimes from three or more sequences. Even then, the same processing as in the present embodiment need only be applied, during repeat instruction processing (instruction decode and execution), to the instructions of the sequence that contains the repeat instruction being processed. In other words, even when instructions of multiple sequences are fetched, the same processing applies to each sequence considered individually.

In the above embodiment the repeat instruction takes two cycles to execute, but it may take one cycle, or three or more. Also, when a jump is performed after the repeat instruction executes, the above embodiment performs it in the second cycle, in which the second step (REPI(2)) is executed; a third step may instead be added and the jump performed there. The jump may be performed at any time from the point at which correct sequence control can be guaranteed at the instruction fetch stage. For example, if the repeat-related setup is performed before the execution stage, the pipeline cancellation (jump) following the repeat setting instruction may be performed correspondingly earlier.

In Embodiment 1 the instruction fetch suppression is released when the final step of the repeat instruction executes, but it may be released at any time from the point at which the hardware can guarantee correct operation. For example, if the repeat-related setup is performed before the execution stage, the suppression may be released correspondingly earlier.

In Embodiment 1, instruction fetch is also suppressed when the processing in the earlier pipeline stages is cancelled by a jump, but this suppression may be omitted. Suppressing instruction fetch does, however, avoid wasted fetches and leaves the apparatus ready to accept an instruction fetch request immediately when the jump occurs. When, for example, an instruction fetch takes multiple cycles, a jump request may not be accepted immediately; if fetching has been stopped, the instruction fetch can be accepted at once.

In Embodiment 1, a single repeat instruction performs all the repeat-related setup and starts repeat processing, but setup and start may be split across several instructions, and each setting value may be specified in a different way. Whatever the instruction specification or control register hardware specification, it suffices to provide, for the instruction that performs the last repeat-related setting, a means of determining the code size from that instruction to the instruction at the fetch address used to decide when the repeat function switches the processing sequence, and to provide the instruction fetch suppression and release function and the pipeline cancel function described above so that correct operation is guaranteed with the pipeline taken into account.

For example, the iteration count may be set in advance by a separate instruction. The iteration count does not affect how the handling of the repeat instruction is chosen, such as whether to invoke the instruction fetch suppression and release function or the pipeline cancel function, so control can proceed as in the above embodiment. The above embodiment specifies the number of executions of the repeat block as the iteration count, but a different instruction specification is also acceptable, for example one that sets and manages the count as “iteration count − 1”. In that case, where the above embodiment tests for a count value of “1” to end the repeat, the test is for “0” instead.
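The two count conventions can be compared with a short model. This Python sketch is a simplification under stated assumptions: the function name and the `minus_one_encoding` flag are hypothetical, the counter is tested once per pass through the block, and the fetch-stage timing of the real TRPTC update is not modelled.

```python
def run_repeat(executions: int, minus_one_encoding: bool = False) -> int:
    """Count the passes through the repeat block under either convention.

    With the convention of the embodiment the counter holds the number of
    executions and the repeat ends when it reads 1. With the alternative
    convention it holds (executions - 1) and the repeat ends when it reads 0.
    """
    if executions < 1:
        raise ValueError("the block must execute at least once")
    counter = executions - 1 if minus_one_encoding else executions
    terminal = 0 if minus_one_encoding else 1
    passes = 0
    while True:
        passes += 1              # one pass through the repeat block
        if counter == terminal:  # termination test for this convention
            return passes
        counter -= 1             # decrement for the next pass


# Both conventions give the same number of passes for the same request.
same = all(run_repeat(n) == run_repeat(n, True) == n for n in range(1, 6))
```

Only the stored value and the constant in the termination test change; the decision about fetch suppression or pipeline cancellation does not depend on either.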

In Embodiment 1, the repeat instruction is executed immediately before the repeat block, and the first instruction of the repeat block is implicitly the instruction following the repeat instruction. Alternatively, the start instruction of the repeat block may be specified explicitly by the repeat instruction, implicitly as an instruction several instructions ahead, or in advance by a separate instruction. Where the first instruction of the repeat block lies is irrelevant to determining the amount of code from the repeat instruction to the last instruction of the repeat block.

In Embodiment 1, the last instruction of the repeat block is specified by a displacement from the repeat instruction. The start and last instructions of the repeat block may instead be specified by absolute addresses, or the last instruction may be specified by a displacement from the instruction following the repeat instruction. The instruction specification may also give, as the address information for the last instruction, a displacement to the instruction following the last instruction of the repeat block. In that case, the hardware may compute and manage the address of the last instruction internally, or the address comparison may be moved earlier: the fetch address is compared with the address of the instruction following the last instruction, and on a match that instruction is not fetched and the instruction at the address of the first instruction of the repeat block is fetched instead.

When the address of the last instruction of the repeat block is specified as an absolute address, the code size up to the last instruction of the repeat block can be determined by taking the difference between the address of the repeat instruction and the address of the last instruction.
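The address-difference determination amounts to a subtraction followed by a threshold comparison. In the Python sketch below, the function names, the 8-byte instruction width and the addresses are all hypothetical values chosen for illustration; the description does not state them.

```python
def code_size_to_last(repeat_addr: int, last_addr: int) -> int:
    """Code size (in bytes) from the repeat instruction to the last instruction."""
    if last_addr <= repeat_addr:
        raise ValueError("the last instruction must follow the repeat instruction")
    return last_addr - repeat_addr


def needs_special_handling(repeat_addr: int, last_addr: int,
                           instr_bytes: int = 8,
                           threshold_instrs: int = 2) -> bool:
    """True when the block is small enough to require fetch suppression.

    instr_bytes is an assumed fixed instruction width; threshold_instrs
    plays the role of the first reference value ("2" in the embodiment).
    """
    size = code_size_to_last(repeat_addr, last_addr)
    return size <= threshold_instrs * instr_bytes


# A repeat instruction at 0x1000 with its last block instruction at 0x1010
# gives a 16-byte displacement, i.e. two 8-byte instructions.
small = needs_special_handling(0x1000, 0x1010)
large = needs_special_handling(0x1000, 0x1018)
```

Comparing a byte count rather than an instruction count also fits the variable-length case, where the determination has to be based on code size.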

When the address of the last instruction of the repeat block has been set in advance by a separate instruction, the code size up to the last instruction of the repeat block can be determined by referring to the values of the registers and other state that have already been set.

In Embodiment 1 the repeat-related determinations are made when the last instruction of the repeat block is fetched. To make them earlier, the hardware may hold the address of the instruction immediately before the last instruction of the repeat block and compare that address with the fetch address.

Alternatively, the comparison may be made against the address of the instruction to be fetched next, which is prepared at fetch time (for example, the output of the incrementer 339). If the hardware can switch sequence control correctly even after the fetch of the last instruction of the repeat block has been accepted, the determination of the repeat block size should take that into account. This point is described in detail below.

In Embodiment 1, the determinations start when the instruction fetch request is output, and TRPTC and the TRP bit are updated during the instruction fetch. In the example of FIG. 23, the instruction fetch of I1 (T2 cycle) therefore finishes before the repeat-related setup (T4 cycle), which is why that case cannot be handled as is.

In this case, however, the instruction fetch of I2 has not yet started in the T4 cycle. Suppose the hardware separately holds the address of I1, the instruction fetched most recently, compares against that held address in the T4 cycle, when the repeat-related setup is complete and a correct determination is possible, and can then switch the processing sequence and update TRPTC and the TRP bit in the T5 cycle. The I1 instruction decoded in the T4 cycle can then be executed in the T5 cycle, which removes one idle cycle. With such hardware, a repeat block of “1” instruction can operate normally using the instruction fetch suppression and release function, just as for “2” instructions, and the pipeline cancel function becomes unnecessary. Beyond this example, in every case it suffices to provide the instruction fetch suppression and release function and the pipeline cancel function as needed to guarantee correct operation.

In Embodiment 1, instruction fetch suppression control is in both cases performed at the instruction decode stage of the repeat setting instruction. Alternatively, an instruction may be predecoded immediately after its fetch completes, so that the fetch suppression control for the repeat instruction performed in Embodiment 1 takes place at an earlier stage. When the pipeline up to the instruction decode stage is deep and the instruction queue has many entries, a repeat instruction may not be executable immediately after its fetch completes. Performing the fetch suppression control for repeat processing immediately after fetch completion reduces the number of repeat blocks that require pipeline cancel processing, so performance improves further. In a hardware configuration that performs no fetch suppression and release control and simply executes the pipeline cancel function whenever operation cannot be guaranteed, the size of the repeat block may be determined at the instruction execution stage. In short, the handling of a repeat instruction may be determined at any stage.

<Embodiment 2>
In Embodiment 1 described above, the method of processing a repeat instruction is determined solely from the result of judging the number of instructions in the repeat block. When the instruction queue has few entries and the pipeline is relatively shallow, as in Embodiment 1, this control is simple and strikes a good balance between benefit and cost, including development cost (which reflects control complexity) and product cost.

When the pipeline has many stages or the instruction queue has many entries, however, deciding how to process the repeat instruction solely from the number of instructions in the repeat block, sized for the maximum number of instructions that can be in the pipeline by the time of instruction decode, may incur a large performance overhead. In such cases, the determination should cover not only the number of instructions in the repeat block but also the number of instructions already taken into the instruction queue, that is, the number of instructions waiting for decode after the instruction fetch stage and before the instruction decode stage, so that the processing method for the repeat instruction is decided dynamically.

The configuration of Embodiment 2, which uses this different control method, is described briefly with reference to the drawings. Although the pipeline configuration, the number of instruction queue entries, and so on differ, the overall control is assumed to be the same as in Embodiment 1. Detailed description is therefore omitted and only the key points are explained.

FIG. 24 is an explanatory diagram schematically showing the pipeline processing. It shows an example with up to ten pipeline stages. The instruction fetch 1 (IF1) stage 801 outputs an instruction fetch request. The instruction fetch 2 (IF2) stage 802 performs the memory access for the instruction fetch. The instruction fetch 3 (IF3) stage 803 waits for the instruction fetch to complete and, on completion, transfers the fetched data to the instruction queue. Stages 801 to 803 are processed in the pipeline in that order.

The instruction decode 1 (D1) stage 804 performs first-stage decoding of the instruction code. The instruction decode 2 (D2) stage 805 performs second-stage decoding of the instruction code and reads operand data from the register file. The instruction execution 1 (E1) stage 806 executes the operation of instructions that complete in one cycle and performs execution steps such as address calculation for load, store, and branch instructions. Stages 804 to 806 are processed in that order.

The memory access 1 (M1) stage 807 outputs a data access request. The memory access 2 (M2) stage 808 performs the memory access for the data access. The memory access 3 (M3) stage 809 waits for the data access to complete and, on completion, transfers the load data. The write (WM) stage 810 writes the load data to the register file. Stages 807 to 810 follow the instruction execution 1 stage 806 and are processed in that order.

The write 1 (W1) stage 811 writes to a register the result of an instruction whose operation completes in one cycle. The write 1 stage 811 follows the instruction execution 1 stage 806 in the pipeline.

The instruction execution 2 (E2) stage 812 performs the second execution stage of instructions that need two cycles to execute, such as multiply-accumulate instructions. The write 2 (W2) stage 813 writes the result of the E2 stage 812 to a register. Stages 812 and 813 follow the instruction execution 1 stage 806 and are processed in that order.
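As an illustration only, the stage sequences of FIG. 24 described above can be written down as a small Python table. The stage names come from the description; the instruction-class labels (`single_cycle`, `two_cycle`, `load`) and the function name `stage_path` are assumptions made for this sketch and do not appear in the specification.

```python
# Stages shared by every instruction, in pipeline order (stages 801 to 806).
FRONT_END = ["IF1", "IF2", "IF3", "D1", "D2", "E1"]

# Stages that follow E1, keyed by a hypothetical instruction class.
BACK_END = {
    "single_cycle": ["W1"],            # result written by stage 811
    "two_cycle": ["E2", "W2"],         # e.g. multiply-accumulate, stages 812 and 813
    "load": ["M1", "M2", "M3", "WM"],  # data access, then load-data write (807 to 810)
}


def stage_path(kind):
    """Return the ordered stages an instruction of class `kind` passes through."""
    return FRONT_END + BACK_END[kind]
```

Under this reading the load path is the longest one, which is consistent with the stated maximum of ten stages.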

FIG. 25 is a block diagram showing the portion of the control unit of the data processing apparatus of Embodiment 2 that is needed to explain repeat control. The overall configuration is assumed to be the same as that of Embodiment 1 shown in FIG. 17.

As shown in the figure, the MPU core unit 851 contains a control unit 853 and receives instructions through an external instruction fetch unit 852. The control unit 853 contains an instruction fetch control unit 854, a D1 stage control unit 856, a D2 stage control unit 858, an E1 stage control unit 860, an instruction queue 855, a first instruction decode unit 857, a second instruction decode unit 859, and a control signal generation unit 861.

The instruction fetch control unit 854, which also functions as part of the repeat-time pipeline control means, controls instruction fetch, including stage control of the IF1 stage 801 to the IF3 stage 803. It has functions equivalent to those of the IF stage control unit 651 of Embodiment 1 and, in addition, the functions described later.

The instruction queue 855 has eight entries. The D1 stage control unit 856 controls the D1 stage 804, and the first instruction decode unit 857 performs first-stage decoding of instructions in the D1 stage 804. The D2 stage control unit 858 controls the D2 stage 805, and the second instruction decode unit 859 performs second-stage decoding of instructions in the D2 stage 805. The E1 stage control unit 860 controls instruction execution in the E1 stage 806, and the control signal generation unit 861 generates, among other things, the execution control signals for the E1 stage 806.

The first instruction decode unit 857 contains an instruction register 871, a repeat processing determination unit 872, a decoder 873, and a control signal generation unit 874.

The instruction register 871 holds the instruction to be decoded by the first instruction decode unit 857. The repeat processing determination unit 872, which functions as the determination unit for repeat setting instructions, determines how a repeat instruction is to be processed. The decoder 873 decodes the instruction.

The control signal generation unit 874, which functions as part of the repeat-time pipeline control means, mainly generates the instruction fetch suppression information 884, the control signals used for control in the D2 stage, and information needed in later pipeline stages (including information relating to the instruction fetch suppression release signal 883 and the jump request signal 885). The actual block configuration is very complex; FIG. 25 shows it in simplified form for ease of explanation.

The following focuses on the points that differ from Embodiment 1 and describes only those differences briefly. In this example the instruction queue 855 has eight entries, so at the start of decoding a repeat instruction up to seven subsequent instructions may already have been taken into the instruction queue 855. How many instructions have actually accumulated in the queue, however, varies widely depending on the preceding and following instructions in the pipeline, on where the instructions are stored, and so on. If the same control as in Embodiment 1 were always applied on the worst-case assumption, a jump would have to be executed in the execution stage whenever the repeat block contains seven or fewer instructions. Because the pipeline is deep, the cycle penalty of such a branch is also large. In practice the instruction queue rarely fills to that extent.
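To make the cost of the worst-case assumption concrete, the following Python sketch applies the static rule described above. The function name and the reduction of the rule to a single comparison are assumptions for illustration; only the eight-entry queue and the resulting seven-instruction limit come from the description.

```python
QUEUE_ENTRIES = 8  # instruction queue 855 has eight entries


def static_rule_needs_jump(block_len, queue_entries=QUEUE_ENTRIES):
    """Worst-case rule: assume queue_entries - 1 following instructions are
    already fetched, so any block that fits inside them must use a jump."""
    worst_case_fetched = queue_entries - 1
    return block_len <= worst_case_fetched


# Block sizes (in instructions) that would always pay the jump penalty.
always_jump = [n for n in range(1, 11) if static_rule_needs_jump(n)]
print(always_jump)  # [1, 2, 3, 4, 5, 6, 7]
```

Every block of one to seven instructions takes the jump under this rule, even when the queue is in fact almost empty, which is the overhead Embodiment 2 sets out to avoid.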

In Embodiment 2, at the start of instruction decoding, the instruction fetch control unit 854 outputs fetch information 882 over a predetermined signal line to the repeat processing determination unit 872 in the first instruction decode unit 857. The fetch information 882 indicates how many instructions beyond the instruction being decoded have had their fetch accepted, that is, which instructions are in the pre-decode-execution state (which includes both waiting for decode and waiting for instruction fetch completion). The pre-decode-execution state also covers the state before the repeat processing determination unit 872 performs its determination operation.

When processing a repeat instruction, the repeat processing determination unit 872 examines the size of the repeat block specified by the repeat instruction and, based on the fetch information 882, the number of subsequent instructions whose fetch has already started, and from these decides how to process the repeat instruction.

That is, as in the data processing apparatus of Embodiment 1, an instruction fetch request that is about to be output in the current cycle can still be suppressed, so if the instruction fetch request for the last instruction of the repeat block has not been accepted, there is no need to cancel pipeline processing. Accordingly, if the instruction fetch request for the last instruction of the repeat block (the predetermined control reference instruction) has already been accepted and that instruction is in the pre-decode-execution state, the jump processing described later is performed after the repeat-related settings have been made by executing the repeat setting instruction. Otherwise, active instruction fetch suppression information 884 is sent to the instruction fetch control unit 854.
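The decision described above can be sketched in Python as follows. This is a simplified model, not the circuit of FIG. 25: it assumes the repeat block starts immediately after the repeat setting instruction and that each accepted fetch request brings in exactly one instruction, so the last instruction of the block has been accepted once `fetched_after` reaches `block_len`. The function name, both parameter names, and the returned labels are hypothetical.

```python
def decide_repeat_handling(block_len, fetched_after):
    """Choose how to handle a repeat setting instruction.

    block_len     -- number of instructions in the repeat block (at least 1)
    fetched_after -- number of instructions after the repeat setting
                     instruction whose fetch request is already accepted
                     (the role played by fetch information 882)
    """
    if block_len < 1 or fetched_after < 0:
        raise ValueError("block_len must be >= 1 and fetched_after >= 0")
    # Assumption: instruction k of the block is the k-th instruction fetched
    # after the repeat setting instruction.
    last_instruction_accepted = fetched_after >= block_len
    return "jump" if last_instruction_accepted else "suppress"
```

For example, a three-instruction block with only one following instruction fetched yields `"suppress"`, whereas the same block with three or more following instructions already fetched yields `"jump"`.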

In this way, when a predetermined condition based on the number of instructions in the pre-decode-execution state is satisfied, the data processing apparatus of Embodiment 2 makes effective use of those instructions and performs a correct repeat operation while keeping overhead to a minimum.

The instruction fetch suppression information 884 also includes the number of instructions that may still be fetched before instruction fetch is paused. Based on this information, the instruction fetch control unit 854 can allow instruction fetch to continue up to the instruction immediately before the last instruction of the repeat block. This control keeps the period during which instruction prefetch is stopped to a minimum and reduces the loss of performance.

Accordingly, based on the instruction fetch suppression information 884, the instruction fetch control unit 854 fetches the specified number of instructions and then suppresses the output of instruction fetch requests until it receives an active instruction fetch suppression release signal 883. Naturally, if fetching has already progressed up to the instruction immediately before the last instruction of the repeat block when the active instruction fetch suppression information 884 is received, output of instruction fetch requests is suppressed at once.

In this way, when the last instruction of the repeat block, which serves as the predetermined control reference instruction, is not in the pre-decode-execution state, the data processing apparatus of Embodiment 2 applies instruction fetch suppression control to the instruction fetch unit 852 after the instruction fetch unit 852 has completed the fetch of the instruction immediately before that last instruction. It can thus perform a correct repeat operation while making effective use of the instructions in the pre-decode-execution state.

The repeat-related settings are completed in the E1 stage 806. At a timing at which the instruction fetch sequence can be controlled correctly, an active instruction fetch suppression release signal 883 is sent over a predetermined signal line to the instruction fetch control unit 854. The instruction fetch suppression release signal 883 is output from the control signal generation unit 861 at the time the repeat instruction is executed, after the second instruction decode unit 859 has finished decoding it. The control signal generation unit 861 also functions as part of the repeat-time pipeline control means.

In response to the active instruction fetch suppression release signal 883, the instruction fetch control unit 854 resumes fetching instructions. If, on the other hand, the active instruction fetch suppression information 884 has been received but the fetch-suppressed state has not yet been entered when the active release signal 883 arrives, the suppression information 884 is invalidated and fetching continues without being suppressed.
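The interaction between the suppression information 884 and the release signal 883 described above can be modelled as a small state machine. The Python sketch below is a behavioural illustration under simplifying assumptions (one fetch request per call, and an integer allowance carried by the suppression information); the class and method names are invented for this example and are not taken from the specification.

```python
class FetchController:
    """Toy model of fetch suppression and release in a fetch control unit."""

    def __init__(self):
        self.allowance = None    # None means no suppression information is pending
        self.suppressed = False  # True while fetch requests are being withheld

    def on_suppress_info(self, allowance):
        """Suppression information arrives; `allowance` more fetches are permitted."""
        self.allowance = allowance
        # If nothing more may be fetched, suppression starts immediately.
        self.suppressed = allowance == 0

    def on_release(self):
        """Release signal arrives: resume fetching and drop any pending suppression."""
        self.allowance = None
        self.suppressed = False

    def try_fetch(self):
        """Return True if a fetch request may be issued in this cycle."""
        if self.suppressed:
            return False
        if self.allowance is not None:
            self.allowance -= 1
            if self.allowance == 0:
                self.suppressed = True  # allowance used up; later requests are withheld
        return True
```

With an allowance of two, the model issues two more fetch requests and then withholds requests until `on_release` is called. If `on_release` is called while some allowance remains, the pending suppression is discarded and fetching simply continues, matching the invalidation case described above.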

The jump processing mentioned above is performed as follows. After the second instruction decode unit 859 has finished decoding the repeat instruction, the control signal generation unit 861 outputs the jump request signal 885 to the E1 stage control unit 860 at the time the repeat instruction is executed.

The E1 stage control unit 860, which also functions as part of the repeat-time pipeline control means, then outputs the jump signal 886 to the instruction fetch control unit 854, the D1 stage control unit 856, and the D2 stage control unit 858. This cancels the pipeline processing in the IF1 stage 801 to the IF3 stage 803, the D1 stage 804, and the D2 stage 805.
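The effect of the jump signal 886 on the pipeline can be pictured with a short Python sketch. The dictionary representation of the pipeline (stage name mapped to the instruction occupying it, or `None`) and the function name `apply_jump` are assumptions for illustration; only the list of cancelled stages comes from the description.

```python
# Stages whose contents are cancelled when the jump signal is asserted.
CANCELLED_BY_JUMP = ("IF1", "IF2", "IF3", "D1", "D2")


def apply_jump(pipeline):
    """Return a copy of `pipeline` with the fetch and decode stages emptied.

    `pipeline` maps a stage name to the instruction in that stage (or None).
    Stages from E1 onward are left untouched.
    """
    return {
        stage: (None if stage in CANCELLED_BY_JUMP else instr)
        for stage, instr in pipeline.items()
    }


before = {"IF1": "I5", "IF2": "I4", "IF3": "I3", "D1": "I2", "D2": "I1", "E1": "REPEAT"}
after = apply_jump(before)
```

In this example the repeat instruction in E1 survives while the five younger instructions are discarded and must be fetched again, which is why the jump carries a larger cycle penalty in a deep pipeline.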

In this way, because it has a pipeline cancel function driven by the jump signal 886, the data processing apparatus of Embodiment 2 can perform a correct repeat operation while keeping overhead to a minimum even when the last instruction of the repeat block, the predetermined control reference instruction, is already in the pre-decode-execution state.

The jump processing may also be issued at the timing of the instruction fetch suppression information 884. That is, together with the instruction fetch suppression information 884, the control signal generation unit 874 outputs a signal corresponding to an instruction fetch cancel signal that initializes only the instruction fetch control unit 854 (the D1 stage control unit 856 and the D2 stage control unit 858 are not initialized), so that the pipeline cancel function takes effect for instruction fetch only. The instruction fetch control unit 854 then fetches again up to the instruction immediately before the last instruction of the repeat block and enters the fetch-suppressed state. Controlling it in this way keeps overhead to a minimum.

In this way, after executing pipeline cancel processing, the data processing apparatus of Embodiment 2 has the instruction fetch control unit 854 apply instruction fetch suppression control once the fetch request of the instruction fetch unit 852 for the instruction immediately before the last instruction of the repeat block (the predetermined control reference instruction) has been accepted. It can thereby perform a correct repeat operation while keeping overhead to a minimum.

In Embodiment 2, fetching can continue up to the instruction one before the last instruction of the repeat block. To simplify the control, however, fetching may simply be stopped at the moment the active instruction fetch suppression information 884 is received, as in Embodiment 1. In that case the reduction in cycle overhead may be somewhat smaller.

Such a configuration also provides the same effects as Embodiment 1. In addition, Embodiment 2 takes the instruction fetch state of the pipeline into account when deciding how to handle a repeat instruction, and so achieves a further effect beyond Embodiment 1: the cycle overhead can be reduced further.

Because the pipeline here has many stages and the instruction queue is large, disturbance of the pipeline can be reduced further by predecoding at the stage where a fetched instruction is written into the instruction queue, as described for Embodiment 1, deciding how to process the repeat instruction there, and controlling instruction fetch early.

Like Embodiment 1, Embodiment 2 is only one example configuration and does not limit the scope of application of the present invention. In principle, the same effects are obtained in any data processing apparatus that performs pipeline processing and has a hardware repeat function. The invention is not limited to the configuration of Embodiment 2 shown in FIG. 25 and, as with Embodiment 1, is applicable when a different architecture or hardware implementation is adopted.

In Embodiment 2, when the instruction fetch of the last instruction of the repeat block has already been accepted, a jump is always performed after the repeat-related settings are complete. Alternatively, the instruction fetch control unit 854 may be made to re-execute instruction fetch from the first instruction of the repeat block up to the instruction one before its last instruction, and then pause instruction fetch if fetch suppression has not yet been released. This further reduces disturbance of the pipeline. It does, however, require additional functions to manage the address of the instruction following the one being decoded and to jump to that address at the instruction fetch stage, which complicates the control. Furthermore, if a function is provided to discard only part of the fetched instructions (those after the last instruction of the repeat block), wasteful re-execution of instruction fetches can be reduced further.
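The partial-discard variation mentioned last can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that the fetched instructions are held oldest first in a list and that the repeat block begins with the first of them; the function name is hypothetical and the specification does not describe how such a discard would be implemented.

```python
def discard_past_block_end(fetched, block_len):
    """Split already-fetched instructions into those inside the repeat block,
    which are kept, and those beyond its last instruction, which are dropped.

    fetched   -- instructions fetched after the repeat setting instruction,
                 oldest first
    block_len -- number of instructions in the repeat block
    """
    kept = fetched[:block_len]
    discarded = fetched[block_len:]
    return kept, discarded


kept, discarded = discard_past_block_end(["I1", "I2", "I3", "I4", "I5"], 3)
```

Here the three block instructions remain available for the first pass through the block and only the two over-fetched instructions are thrown away, in contrast to the jump processing, which cancels everything in the fetch and decode stages.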

FIG. 1 is an explanatory diagram showing the general-purpose registers.
FIG. 2 is an explanatory diagram showing the configuration of the accumulators.
FIG. 3 is an explanatory diagram showing the control registers.
FIG. 4 is an explanatory diagram showing the configuration of the PSW stored in control register CR0.
FIG. 5 is an explanatory diagram showing the instruction format of the data processing apparatus.
FIG. 6 is an explanatory diagram showing details of the format of the FM bits and the specification of execution order.
FIG. 7 is an explanatory diagram showing an example (part 1) of the bit allocation of typical instructions.
FIG. 8 is an explanatory diagram showing an example (part 2) of the bit allocation of typical instructions.
FIG. 9 is an explanatory diagram showing an example (part 3) of the bit allocation of typical instructions.
FIG. 10 is an explanatory diagram showing an example (part 4) of the bit allocation of typical instructions.
FIG. 11 is a block diagram showing the functional block configuration of the data processing apparatus of the present embodiment.
FIG. 12 is a block diagram showing the detailed block configuration of the first operation unit.
FIG. 13 is a block diagram showing details of the internal configuration of the PC unit.
FIG. 14 is an explanatory diagram showing the pipeline processing.
FIG. 15 is an explanatory diagram showing the bit allocation of the repeat instruction (part 1).
FIG. 16 is an explanatory diagram showing the bit allocation of the repeat instruction (part 2).
FIG. 17 is a block diagram showing the portion of the control unit of the data processing apparatus of Embodiment 1 of the present invention that is needed to explain repeat control.
FIG. 18 is an explanatory diagram showing a program example in which the repeat block contains three instructions.
FIG. 19 is a timing chart showing the pipeline processing during repeat processing when the data processing apparatus of Embodiment 1 executes the program shown in FIG. 18.
FIG. 20 is an explanatory diagram showing a program example in which the repeat block contains two instructions.
FIG. 21 is a timing chart showing the pipeline processing during repeat processing when the data processing apparatus of Embodiment 1 executes the program shown in FIG. 20.
FIG. 22 is an explanatory diagram showing a program example in which the repeat block contains one instruction.
FIG. 23 is a timing chart showing the pipeline processing during repeat processing when the data processing apparatus of Embodiment 1 executes the program shown in FIG. 22.
FIG. 24 is an explanatory diagram schematically showing the pipeline processing.
FIG. 25 is a block diagram showing the portion of the control unit of the data processing apparatus of Embodiment 2 of the present invention that is needed to explain repeat control.

Explanation of symbols

201 MPU core unit; 202 instruction fetch unit; 211 control unit; 212, 855 instruction queue; 213 instruction decode unit; 214 first decoder; 224 PC unit; 323 PSW unit; 651 IF stage control unit; 652 instruction fetch request generation unit; 653 repeat control unit; 654 TRP bit; 661 D stage control unit; 662, 871 instruction register; 663, 873 decoder; 664 disp determination unit; 665, 674, 861, 874 control signal generation unit; 671 E stage control unit; 672 jump control unit; 673 E stage control signal latch; 675 RP bit; 681 instruction fetch suppression signal; 682 instruction fetch suppression release signal; 684 jump signal; 854 instruction fetch control unit; 856 D1 stage control unit; 858 D2 stage control unit; 860 E1 stage control unit; 872 repeat processing determination unit.

Claims

A data processing apparatus having a pipeline processing function and a repeat function that, in accordance with a repeat setting instruction specifying at least part of the repeat-related settings, repeatedly executes in hardware a repeat target block that contains at least one instruction and whose total code size is a predetermined code size, the data processing apparatus comprising:
an instruction fetch unit that fetches instructions to be pipeline processed;
a repeat setting instruction determination unit that receives an instruction from the instruction fetch unit and, when the instruction is the repeat setting instruction, performs a determination operation of determining whether the settings of the repeat function for the repeat target block satisfy a predetermined condition; and
repeat-time pipeline control means that, when the repeat setting instruction determination unit determines that the predetermined condition is satisfied, performs instruction fetch suppression control that temporarily suppresses the start of new fetches of instructions to be pipeline processed.

The data processing apparatus according to claim 1, wherein
the predetermined condition includes a condition that the predetermined code size is equal to or smaller than a first reference value, and
the repeat pipeline control means
starts the instruction fetch suppression control when the determination operation of the repeat setting instruction determination unit determines that the predetermined condition is satisfied, and releases the instruction fetch suppression control after the repeat-related setting operation defined in the repeat setting instruction is completed.
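The start and release of instruction fetch suppression described in this claim can be illustrated with a minimal software sketch. This is only a toy model under stated assumptions, not the claimed hardware: the class name `RepeatFetchControl`, its method names, and the numeric value chosen for the first reference value are hypothetical and do not appear in the claims.

```python
class RepeatFetchControl:
    """Toy model of repeat pipeline control for a small repeat target block.

    All names here are hypothetical; the claims define only the behaviour.
    """

    def __init__(self, first_reference: int):
        # Assumed threshold for the code size of the repeat target block.
        self.first_reference = first_reference
        self.fetch_suppressed = False

    def on_repeat_setting_decoded(self, block_code_size: int) -> bool:
        """Determination operation, run when a repeat setting instruction is decoded."""
        condition_met = block_code_size <= self.first_reference
        if condition_met:
            # Start instruction fetch suppression control.
            self.fetch_suppressed = True
        return condition_met

    def on_repeat_setup_complete(self) -> None:
        """Called once the repeat-related setting operation has completed."""
        # Release instruction fetch suppression control.
        self.fetch_suppressed = False

    def may_start_fetch(self) -> bool:
        """True when a new instruction fetch may be started."""
        return not self.fetch_suppressed
```

In this sketch a block whose code size is at or below the assumed reference value blocks new fetches from the determination until the setting operation completes, while a larger block leaves fetching untouched.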

The data processing apparatus according to claim 1, wherein
the predetermined condition includes a condition that the predetermined code size is equal to or smaller than a second reference value, and
the repeat pipeline control means
starts the instruction fetch suppression control when the determination operation of the repeat setting instruction determination unit determines that the predetermined condition is satisfied, and, after the repeat-related setting operation defined in the repeat setting instruction is completed, executes pipeline cancel processing for canceling the pipeline processing of instructions subsequent to the repeat setting instruction.

The data processing apparatus according to claim 1, wherein
the predetermined condition includes a condition based on the code size of instructions that follow the repeat setting instruction and are in a state after the start of instruction fetch and before execution of the determination operation, and
the repeat pipeline control means
executes the instruction fetch suppression control when the determination operation of the repeat setting instruction determination unit determines that the predetermined condition is satisfied, and releases the instruction fetch suppression control after the repeat-related setting operation defined in the repeat setting instruction is completed.

The data processing apparatus according to claim 4, wherein
the at least one instruction in the repeat target block includes a predetermined control reference instruction code, and the predetermined condition includes a case where the predetermined control reference instruction code does not exist among the instructions in the state before execution of the determination operation, and
the repeat pipeline control means,
when the repeat setting instruction determination unit determines that the predetermined condition is not satisfied, executes pipeline cancel processing for canceling the pipeline processing of instructions subsequent to the repeat setting instruction.

The data processing apparatus according to claim 1, wherein
the at least one instruction in the repeat target block includes a predetermined control reference instruction code, and the predetermined condition includes a case where the predetermined control reference instruction code does not exist among the instructions in the state before execution of the determination operation, and
the repeat pipeline control means,
when the repeat setting instruction determination unit determines that the predetermined condition is satisfied, starts the instruction fetch suppression control after the fetch unit has started the fetch operation of the instruction code immediately preceding the predetermined control reference instruction code, and releases the instruction fetch suppression control after the repeat-related setting operation defined in the repeat setting instruction is completed, and,
when the repeat setting instruction determination unit determines that the predetermined condition is not satisfied, executes pipeline cancel processing for canceling the pipeline processing of instructions subsequent to the repeat setting instruction.

A data processing apparatus having a pipeline processing function and a repeat function that, in accordance with a repeat setting instruction defining at least a part of repeat-related settings, repeatedly executes in hardware a repeat target block that includes at least one instruction and whose total code size is a predetermined code size, the data processing apparatus comprising:
an instruction fetch unit that fetches instructions to be pipeline-processed;
a repeat setting instruction determination unit that receives an instruction from the instruction fetch unit and, when that instruction is the repeat setting instruction, executes a determination operation of determining whether or not the setting contents of the repeat function for the repeat target block satisfy a first condition; and
repeat pipeline control means that, when the repeat setting instruction determination unit determines that the first condition is satisfied, executes, after the repeat-related setting operation defined in the repeat setting instruction is completed, pipeline cancel processing for canceling the pipeline processing of instructions subsequent to the repeat setting instruction, wherein
the first condition includes a condition that the predetermined code size is equal to or smaller than a first reference value.

The data processing apparatus according to claim 7, wherein
the repeat setting instruction determination unit further determines, as the setting contents of the repeat function for the repeat target block, whether or not a second condition is satisfied, the second condition being that the predetermined code size is larger than the first reference value and equal to or smaller than a second reference value, and
the repeat pipeline control means,
when the determination operation of the repeat setting instruction determination unit determines that the second condition is satisfied, starts instruction fetch suppression control for temporarily suppressing the start of new fetches of instructions to be pipeline-processed, and releases the instruction fetch suppression control after the repeat-related setting operation defined in the repeat setting instruction is completed.
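The two-threshold selection set out in claims 7 and 8 can be sketched as a simple decision function. This is a hedged illustration only: the names `RepeatAction` and `select_repeat_action` are hypothetical, and the claims do not state what happens when neither condition holds, so the `NONE` case is an assumption.

```python
from enum import Enum


class RepeatAction(Enum):
    """Hypothetical labels for the control selected by the determination."""
    CANCEL_PIPELINE = "cancel"   # first condition: pipeline cancel processing
    SUPPRESS_FETCH = "suppress"  # second condition: instruction fetch suppression
    NONE = "none"                # assumed: no special control otherwise


def select_repeat_action(block_code_size: int,
                         first_reference: int,
                         second_reference: int) -> RepeatAction:
    """Choose the control for a repeat target block of the given code size."""
    if first_reference >= second_reference:
        # The second condition is only meaningful if the range is non-empty.
        raise ValueError("first_reference must be smaller than second_reference")
    if block_code_size <= first_reference:
        return RepeatAction.CANCEL_PIPELINE
    if block_code_size <= second_reference:
        return RepeatAction.SUPPRESS_FETCH
    return RepeatAction.NONE
```

The ordering of the checks mirrors the claims: the first condition is tested first, and the second condition applies only to code sizes above the first reference value.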

A data processing apparatus having a pipeline processing function and a repeat function that, in accordance with a repeat setting instruction defining at least a part of repeat-related settings, repeatedly executes in hardware a repeat target block that includes at least one instruction and whose total code size is a predetermined code size, the data processing apparatus comprising:
an instruction fetch unit that fetches instructions to be pipeline-processed;
a repeat setting instruction determination unit that receives an instruction from the instruction fetch unit and, when that instruction is the repeat setting instruction, executes a determination operation of determining whether or not the setting contents of the repeat function for the repeat target block satisfy a predetermined cancel condition; and
repeat pipeline control means that, when the repeat setting instruction determination unit determines that the predetermined cancel condition is satisfied, executes pipeline cancel processing for canceling the pipeline processing of instructions subsequent to the repeat setting instruction, wherein
the at least one instruction includes a predetermined control reference instruction code, and the predetermined cancel condition includes a case where the predetermined control reference instruction code exists among the instructions in the state before execution of the determination operation.

The data processing apparatus according to claim 9, wherein
the repeat pipeline control means,
when the predetermined cancel condition is satisfied, executes the pipeline cancel processing at the execution timing of the determination operation, then, after the fetch unit has started the fetch operation of the instruction code immediately preceding the predetermined control reference instruction code, performs instruction fetch suppression control for temporarily suppressing the start of new fetches of instructions to be pipeline-processed, and releases the instruction fetch suppression control after the repeat-related setting operation defined in the repeat setting instruction is completed.