JPH1091597A - Token-based serialization of instructions in a multiprocessor system

Info

Publication number: JPH1091597A
Application number: JP9197035A
Authority: JP
Inventors: Gaerutonaa Ute; ウテ・ガエルトナー; Yorugu Getsutsurafu Klaus; クラウス・ヨルグ・ゲッツラフ; Puetsufuaa Irvine; アーヴィン・プェッファー; Tasuto Hans-Werner; ハンス−ヴェルナー・タスト
Original assignee: International Business Machines Corp
Current assignee: International Business Machines Corp
Priority date: 1996-08-13
Filing date: 1997-07-23
Publication date: 1998-04-10

Abstract

PROBLEM TO BE SOLVED: To provide a process that uses a token to serialize instructions that must be processed serially in a multiprocessor system. SOLUTION: On request, the token is allocated to one of the processors 601, which then has the right to execute the command. When the command is composed of distributed tasks, the token remains blocked until the last dependent task belonging to the command has also been executed. Only then can the token be allocated for another instruction. The device that manages the token is characterized by three states: in the first state the token is available; in the second state the token is allocated to one of the processors 601; in the third state dependent tasks are still being executed, so the token is blocked.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

BACKGROUND OF THE INVENTION 1. Field of the Invention The present invention relates to a process for serializing instructions of a multiprocessor system using a token, and to a device for managing the token. In particular, it relates to commands such as IPTE (Invalidate Page Table Entry) and SSKE (Set Storage Key Extended), which change resources common to all processors.

[0002]

2. Description of the Related Art In a multiprocessor system, each processor processes the instruction stream assigned to it by the operating system. For example, a first processor processes a program stored on a specified page in memory, while a second processor executes an operating system program stored on another page. In this example, the assignment of individual programs to the executing processors is performed by special routines of the operating system. When a processor finishes processing the program assigned to it, the operating system designates a new program to be assigned to the processor that has become available again.

[0003] A collision occurs when two processors, each executing a program, access the same memory address. Initially, only the private level 1 (L1) cache of each processor is actually modified. Afterwards, however, it must be possible to determine which of the two privilege levels stored in the L1 caches is valid, in order to guarantee data integrity.

[0004] A more serious data-integrity problem arises when one of the processors of a multiprocessor system changes a resource that is common to all processors.

[0005] In that case, it must be ensured that the change does not leave the status of the resource undefined for instructions currently being executed on other processors. Otherwise, data integrity is violated.

[0006] When a command that changes a resource common to all processors is executed, all other processors may have to take note of the change being made, for example because local copies of the shared resource must be updated. A command that changes a shared resource can therefore trigger a series of daughter processes that must be executed on all other processors.

[0007] What matters in executing all of these processes is correct scheduling with respect to maintaining data integrity.

[0008] Examples of this type of instruction are found in multiprocessor systems that address memory by virtual addressing. Typically, the translation of a virtual address into a real address is performed using several interrelated tables stored in memory. FIG. 1 shows the structure of these tables. The initial address of the first table required for the translation, the segment table, is stored in a control register of the CPU. Starting from the segment table designated in this way, the segment index (formed by bits 1 to 11 of the virtual address) is used to reach the entry in the segment table that contains the initial address of the page table needed to continue the translation.

[0009] FIG. 2 shows a segment table entry. Each entry gives not only the initial address of the referenced page table (page table origin) but also its length. When a page table is no longer to be used for address translation, it must be marked as no longer valid. This is done by setting the invalid bit of the page table. Only page tables whose page-table-invalid bit is not set, that is, equal to 0, may be used for address translation.

[0010] In the next stage of address translation, the page table found in this way is used to determine the initial address of the required page frame. FIG. 1 illustrates this procedure. The middle portion of the virtual address, the page index (bits 12 to 19 of the virtual address), serves as the index for reaching the relevant entry in the page table. The page table entry found in this way refers to the initial address of the page frame in which the required page is stored. A page frame here is an address range in memory that can hold one page.

[0011] FIG. 3 shows a page table entry. It contains the initial address of the page frame. It also contains a page-invalid bit, which indicates whether the data of the designated page is valid. A page table entry may be used for address translation only if this page-invalid bit is not set, that is, equal to 0.

[0012] To determine the real address to be accessed, the byte index of the virtual address (bits 20 to 31) must be added to the initial address of the page frame, as shown in FIG. 1. To this end, the byte index is appended to the page frame address. The address obtained in this way is the real address. To obtain an absolute address from this real address, a so-called prefix may have to be applied to the real address. Memory can be accessed directly with the absolute address obtained in this way.
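The lookup through segment table and page table described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the hardware mechanism itself: the dictionary-based tables, the `PageTableEntry` class, and both function names are hypothetical, the address is assumed to be 32 bits wide with bit 0 as the most significant bit (so that bits 20 to 31 are the low-order 12 bits), a missing dictionary key stands in for the page-table-invalid bit, and prefixing is left out.

```python
from dataclasses import dataclass


@dataclass
class PageTableEntry:
    frame_origin: int      # initial address of the page frame (4 KiB aligned)
    invalid: bool = False  # page-invalid bit


def split_virtual_address(va: int) -> tuple[int, int, int]:
    """Return (segment index, page index, byte index) of a 32-bit address."""
    segment_index = (va >> 20) & 0x7FF  # bits 1-11
    page_index = (va >> 12) & 0xFF      # bits 12-19
    byte_index = va & 0xFFF             # bits 20-31
    return segment_index, page_index, byte_index


def translate(va: int, segment_table: dict[int, dict[int, PageTableEntry]]) -> int:
    """Walk segment table and page table; raise LookupError on a fault."""
    segment_index, page_index, byte_index = split_virtual_address(va)
    page_table = segment_table.get(segment_index)
    if page_table is None:
        raise LookupError("segment not mapped")
    entry = page_table.get(page_index)
    if entry is None or entry.invalid:
        raise LookupError("page not in memory")
    # Appending the byte index to the aligned frame address gives the real address.
    return entry.frame_origin | byte_index


# Hypothetical example data: segment 3, page 5 lives in the frame at 0x7000.
segment_table = {3: {5: PageTableEntry(frame_origin=0x7000)}}
va = (3 << 20) | (5 << 12) | 0x1A4
print(hex(translate(va, segment_table)))  # 0x71a4
```

Setting `invalid = True` on the entry makes `translate` raise, which mirrors the rule that an entry with the page-invalid bit set may not be used for translation.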

[0013] The multi-stage translation from virtual to real address described so far is very time-consuming, because the entire table hierarchy must be traversed. Translating a single address requires about 50 processor cycles.

[0014] For this reason, an address translation cache (translation lookaside buffer, TLB) is used, which allows the real initial address of a page frame to be looked up quickly, bypassing the detailed translation through the segment table and page table. To this end, an address translation cache entry for a page associates the upper part of the virtual address (that is, the segment index and page index) with the real initial address of the page frame. This entry is created when the page is first addressed and is used in all subsequent accesses to the page. Because a lookup in the address translation cache takes only a few processor cycles, it offers a large speed advantage over explicit translation through the segment and page tables stored in memory.

[0015] Each processor has its own dedicated address translation cache. The cache stores the results of the address translations that the processor has performed in the course of its instruction sequence. To translate a virtual address into the assigned real address, the processor first checks whether its address translation cache holds an entry for the page. Only if it does not does the processor translate the virtual address using the tables in memory.
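The check-the-cache-first behaviour can be illustrated with a small sketch. The `Tlb` class, the `slow_walk` stand-in for the table walk, and the sample addresses are all hypothetical; the sketch also assumes 4 KiB pages and addresses whose bit 0 is zero, so that shifting right by 12 leaves exactly the segment index and page index.

```python
from typing import Callable


class Tlb:
    """Per-processor cache: (segment index, page index) -> page-frame address."""

    def __init__(self) -> None:
        self.entries: dict[int, int] = {}

    def translate(self, va: int, walk_tables: Callable[[int], int]) -> int:
        page_key = va >> 12          # segment index and page index together
        byte_index = va & 0xFFF
        frame_origin = self.entries.get(page_key)
        if frame_origin is None:
            # Miss: fall back to the slow walk through the tables in memory,
            # then remember the result for later accesses to the same page.
            frame_origin = walk_tables(va) & ~0xFFF
            self.entries[page_key] = frame_origin
        return frame_origin | byte_index

    def invalidate(self, va: int) -> None:
        """Drop the cached entry for the page containing va, if any."""
        self.entries.pop(va >> 12, None)


walks = []


def slow_walk(va: int) -> int:
    walks.append(va)                 # record each use of the slow path
    return 0x7000 | (va & 0xFFF)     # pretend every page maps to frame 0x7000


tlb = Tlb()
first = tlb.translate(0x003051A4, slow_walk)
second = tlb.translate(0x00305FF0, slow_walk)   # same page, different byte
print(hex(first), hex(second), len(walks))      # 0x71a4 0x7ff0 1
```

Only the first access walks the tables; the second access to the same page is served from the cached entry.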

[0016] The address space opened up by virtual addressing is many times larger than the available memory. Consequently, at any given time only some of the addressable pages are actually present in memory. The other pages reside on external storage media.

[0017] When a processor accesses a page that is not present in memory, another page must be transferred from memory to an external storage medium. The new page that the processor wants to access can then be stored in the free slot created in memory. This process of dynamically storing and transferring pages is called "swapping".

[0018] When a page is transferred from memory to an external storage medium, the page table entry relating to this page must also be invalidated, because that entry still refers to the real initial address of the page in memory. This is done by setting the page-invalid bit in the relevant page table entry. FIG. 3 shows the structure of a page table entry. In this example the page-invalid bit is bit 21. A page table entry whose page-invalid bit is set may no longer be used for address translation.

[0019] Transferring a page to an external storage medium and invalidating the relevant page table entry creates a free slot in memory in which another page can be stored.

[0020] When a processor addresses a page that is not yet present in memory, that page must be brought into memory from the external storage medium (demand paging). The new page then occupies the free slot created by invalidating the old page and transferring it elsewhere. A new page table entry is created for the new page. Through this new page table entry, an entirely different virtual address range is mapped onto the same range of absolute memory addresses that the old page occupied.

[0021] So far, it has been described that page table entries must be changed when pages are brought into memory or transferred from it to external storage. However, entries referring to a page that has meanwhile been transferred to external storage also exist in the address translation caches assigned to the individual processors. Together with the page table entry, these TLB entries must also be declared "invalid", so that the processor to which the address translation cache is assigned can no longer use them and therefore cannot attempt to address a page that has already been transferred out of memory and whose place in memory is occupied by another page. If a TLB entry referring to the transferred page were to remain in the TLB, the assigned processor would address a page entirely different from the intended one.

[0022] To invalidate a page that is transferred out of memory, it is therefore necessary to invalidate the corresponding page table entry and also all entries relating to that page in all address translation caches assigned to the individual processors.

[0023] Consequently, invalidating one page table entry requires, on the one hand, that the processor initiating the invalidation invalidate the page table entry itself (and the entry relating to the transferred page in the TLB assigned to the initiating processor) and, on the other hand, that dependent tasks be started on all other processors in order to invalidate the entries relating to the transferred page in the address translation caches assigned to those processors.
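The split into one initial task and several dependent tasks can be made concrete with a toy model. The function name, the dictionary layout of the page table, and the per-processor TLB dictionaries are assumptions made for illustration; the sketch runs sequentially and deliberately ignores the serialization question that the rest of the document addresses.

```python
def invalidate_page(page_key, page_table, tlbs, initiator):
    """Invalidate one page everywhere; return the processors that ran a dependent task."""
    # Initial task on the initiating processor: shared page table plus own TLB.
    page_table[page_key]["invalid"] = True
    tlbs[initiator].pop(page_key, None)
    # Dependent tasks: every other processor purges its private TLB copy.
    responders = [cpu for cpu in tlbs if cpu != initiator]
    for cpu in responders:
        tlbs[cpu].pop(page_key, None)
    return responders


# Hypothetical state: three processors, two of which have cached page 0x305.
page_table = {0x305: {"frame_origin": 0x7000, "invalid": False}}
tlbs = {0: {0x305: 0x7000}, 1: {0x305: 0x7000}, 2: {}}
print(invalidate_page(0x305, page_table, tlbs, initiator=0))  # [1, 2]
```

After the call, the shared entry is marked invalid and no private TLB still holds a copy, including the TLB of processor 2, which never cached the page but still has to run its dependent task.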

[0024] In this case, the common resource being changed is the relevant page table in memory, which is of course accessed by all processors. The corresponding entries in the address translation caches represent local copies of the common resource. When the common resource is changed, the local copies must also be updated.

[0025] It is essential that data integrity with respect to address translation be preserved while the initial task executes on the initiating processor and the tasks that depend on it execute on the other processors (responders). An instruction executing on any processor must be able to perform the address translations it needs consistently throughout its execution. Moreover, this must hold regardless of whether the address translation takes the long path through the table hierarchy in memory or the short path through the address translation cache.

[0026] The initial task (executed on the initiating processor) and the tasks that depend on it (executed on the other processors) must therefore be serialized in a suitable manner with the other instructions executed on the respective processors.

[0027] Multiprocessor systems have a whole series of computer commands that follow the scheme described so far: a resource common to all processors is changed by an initial task, while local copies of the common resource exist that are updated by tasks dependent on the initial task. The invalidation of a page table entry (IPTE) described here is one example of this type of command.

[0028] Another example of this type of computer command concerns changing the key information that is assigned to each page and determines the access rights to that page. The key information is stored in the required storage area and also in the entries of the address translation caches. Changing the key information assigned to a page therefore requires changing the central resource, the required storage area, and also updating the local copies of the key information in the address translation caches of the individual processors.

[0029] Here too, an initial task is executed to change the common resource, and tasks dependent on this initial task are executed to update the local copies.

[0030] For these commands as well, data integrity must be guaranteed. While an instruction is executing, the key of a page addressed by that instruction must be prevented from changing. This requires appropriate scheduling of the initial task and the dependent tasks.

[0031] One solution to this problem is the so-called "quiesce" scheme shown in FIG. 3. In this scheme, the processor that wants to execute an "IPTE" command (invalidate page table entry) sends a "quiesce request" to all other processors. These processors only finish executing the instructions that were started before the "quiesce request" (301). Once a processor has completely executed such an instruction (302), it inserts a waiting period (304) until the last processor has also completed the instruction still pending on it (303). From this point on, all processors are in "quiesce mode" (305). The initiating processor, PU0, can now invalidate the corresponding page table entry and the entry in its own TLB relating to the page being transferred to external storage. The other processors are prompted to delete the entries relating to this page in the TLBs assigned to them.

[0032] After the IPTE command has been completed, the initiating PU sends a "quiesce reset" to all other processors (308). These processors then proceed with the next instruction in their instruction streams (309). After sending the "quiesce reset", PU0 likewise proceeds with its subsequent instructions.

[0033] The drawback of this scheme is the long waiting time on the processors that execute the dependent tasks, during which they cannot execute further instructions. On each processor, the waiting time begins when the instruction that was in progress when the initiating processor sent its "quiesce request" completes (302), and ends when the last processor has finished executing the command that was pending at the time of the "quiesce request" (303). While the initiating processor invalidates the page table entry, the other processors again enter a waiting period (307), which lasts until the request to execute the dependent task arrives. These waiting times on the responder PUs are not negligible and reduce the achievable performance.
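The first of these waiting periods follows directly from the completion times of the pending instructions, as a short calculation shows. The function name and the cycle counts below are invented purely for illustration; the sketch covers only the wait between points 302 and 303 and not the second wait (307).

```python
def quiesce_wait_times(finish_times: list[int]) -> list[int]:
    """Idle cycles per responder between finishing its pending instruction (302)
    and the moment the last responder finishes (303)."""
    last_finish = max(finish_times)
    return [last_finish - t for t in finish_times]


# Hypothetical cycle counts at which three responders finish their pending instruction.
waits = quiesce_wait_times([3, 10, 6])
print(waits, sum(waits))  # [7, 0, 4] 11
```

Every responder except the slowest one idles, so the total lost time grows with both the number of processors and the spread of instruction lengths.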

[0034] A method of avoiding these waiting times on the responder PUs is described in U.S. Pat. No. 3,947,823 to Padegs et al. The aim of the solution proposed there is to keep the memory addresses (and their contents) used by each instruction pending on any processor available to that instruction until it completes. This is achieved by introducing a special buffer memory for the memory addresses and contents accessed by an instruction. As a result, instructions become independent of the real memory pages, addresses, and contents, since these are no longer accessed during instruction execution. The invalidation of page table entries and address translation cache entries, and the transfer of pages to external I/O devices, can therefore be carried out without regard to instructions still in execution, even if those instructions access addresses of the page being transferred to external storage. With this method there is consequently no need to attend to correct scheduling with respect to data integrity, and the waiting times associated with executing the dependent tasks can be avoided.

[0035] The drawback of this method, however, is that expensive additional hardware is needed to temporarily store all addresses and address contents required by the instructions currently being executed.

[0036] In "Low-Synchronization Translation Lookaside Buffer Consistency Algorithm", published in the IBM Technical Disclosure Bulletin, Vol. 33, No. 6B, November 1990, pp. 428-433, B. S. Rosenburg describes a process for invalidating page table entries in which waiting times on the responder processors are avoided. To this end, the initiating processor first invalidates the corresponding page table entry and identifies the changed page table. An interrupt is then sent to the responder processors, in response to which each of them, at an interruptible point in its instruction stream, searches the active page tables for changes. If a page table entry has been invalidated, the responder processor invalidates the corresponding local TLB entry, if one exists. The responder processor then proceeds with the next instruction in its instruction stream.

[0037] The drawback of this method is that every responder PU must search all page tables for flags and invalidate the entries. Such processing takes considerable time. A further drawback is that nothing is said about data integrity for instructions that access the changed page table. A technical solution for serializing commands composed of distributed tasks cannot ignore the aspect of data integrity.

[0038]

SUMMARY OF THE INVENTION It is an object of the present invention to provide a method for serializing instructions in a multiprocessor system. The need to serialize the specified commands arises from the fact that these commands change resources common to all processors.

[0039] Another object of the present invention is to provide a suitable serialization method for executing commands composed of one initial task, executed on the initiating processor, and tasks that depend on this initial task and are executed on other processors. Executing dependent tasks may be necessary when local copies of a shared resource must be updated.

[0040] Another object of the present invention is to minimize the waiting times that arise when the various tasks are executed, in order to ensure high performance.

[0041] It is also important to guarantee data integrity when executing commands composed of distributed tasks.

[0042] Another object of the present invention is to propose an improved implementation for the command that invalidates a page table entry, IPTE, and for the command that changes the key information relating to a page, SSKE.

[0043] Another object of the present invention is to disclose a device for managing the token used to carry out the serialization process provided here.

[0044] Another object of the present invention is to implement this device at low hardware cost.

[0045]

MEANS FOR SOLVING THE PROBLEMS According to the present invention, the object is achieved by a device that manages a token for serializing instructions that are to be processed serially in a multiprocessor system. A processor can execute an instruction that is to be processed serially only if it holds the token. The device that manages the token is characterized by the following states. In the first state, the token is not assigned to any processor and can be assigned to a processor that requests it. In the second state, the token is assigned to one of the processors and cannot be assigned to a processor that requests it. In the third state, the token is not assigned to any processor, but it also cannot be assigned to a processor that requests it.

[0046] The serialization of the instructions to be serialized is thus governed by the assignment of the token. In addition, the third state provides a mechanism by which the token can be blocked. This makes it possible to couple the execution of instructions to be serialized to external conditions: an external condition can block the token and thereby prevent the execution of an instruction to be serialized.

[0047] According to the present invention, a device for managing the token in a multiprocessor system consisting of n processors is implemented by providing, for each processor i, one signal Ai that is set when processor i holds the token; for each processor i, one signal Bi that is set when a task is pending on processor i that has not yet been completely executed (a task that arises in the execution of one of the instructions to be processed serially and because of which the token cannot be assigned to a processor that requests it); and a signal C common to the entire multiprocessor system. Signal C is produced by ORing the signals A1, ..., An, B1, ..., Bn of all processors listed above, and is set when the token cannot be assigned to a requesting processor. The first state of the device that manages the token is characterized in that none of the signals A1, ..., An, B1, ..., Bn is set. The second state is characterized in that one of the signals A1, ..., An is set. The third state is characterized in that none of the signals A1, ..., An is set, but at least one of the signals B1, ..., Bn is set.
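How the three states follow from the signals can be written down compactly. The sketch below is a software illustration of what the text describes as a hardware OR circuit: the `TokenState` enum, the function names, and the representation of the signals as Python lists of booleans are assumptions, and the check that at most one Ai is set reflects the reading that there is a single token.

```python
from enum import Enum


class TokenState(Enum):
    AVAILABLE = 1   # first state: token free and assignable
    ASSIGNED = 2    # second state: one processor holds the token
    BLOCKED = 3     # third state: token free but dependent tasks still pending


def signal_c(a: list[bool], b: list[bool]) -> bool:
    """Common signal C: OR over all A_i and B_i; set means 'token not assignable'."""
    return any(a) or any(b)


def token_state(a: list[bool], b: list[bool]) -> TokenState:
    """Classify the token manager's state from the per-processor signals."""
    if sum(a) > 1:
        raise ValueError("at most one processor may hold the token")
    if any(a):
        return TokenState.ASSIGNED
    if any(b):
        return TokenState.BLOCKED
    return TokenState.AVAILABLE


print(token_state([False, False, False], [False, False, False]))  # TokenState.AVAILABLE
print(token_state([False, True, False], [False, False, True]))    # TokenState.ASSIGNED
print(token_state([False, False, False], [True, False, False]))   # TokenState.BLOCKED
```

In this model `signal_c` is false only in the first state, which matches the statement that C is set whenever the token cannot be assigned to a requesting processor.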

[0048] The advantage of this embodiment is that the state of the device that manages the token is represented in a very simple way. As a result, the required hardware cost is kept low. The basic circuitry needed for instruction serialization is limited to the OR of the signals A1, ..., An, B1, ..., Bn.

【００４９】本発明の他の利点として、各プロセッサｉ
に対して、信号ＡｉとＢｉのORチェインによって生じる
１つの信号Ｃｉの可用性が考慮されている。また、マル
チプロセッサ全体に共通であり、全てのプロセッサの信
号Ｃ１、．．．ＣｎのORチェインによって生じる信号Ｃ
の可用性も考慮されている。As a further advantage of the invention, one signal Ci, produced by ORing the signals Ai and Bi, is provided for each processor i. A signal C common to the entire multiprocessor device is also provided; it is produced by the OR chain of the signals C1, ..., Cn of all processors.

【００５０】信号Ｃは、マルチプロセッサ装置全体に共
通な信号なので、これは、回路設計の観点から、各プロ
セッサに置かれるORゲートにより、各プロセッサｉに割
当てられる信号Ａｉ及びＢｉが最初に連接され、１つの
信号Ｃｉを形成する場合は利点になる。その際、１つの
信号、つまり信号Ｃｉ、を各プロセッサから中央のORゲ
ートに供給するだけでよいからである。Since signal C is common to the entire multiprocessor device, it is advantageous from a circuit-design standpoint to first combine the signals Ai and Bi assigned to each processor i in an OR gate located at that processor, forming one signal Ci. Only one signal, namely Ci, then has to be routed from each processor to the central OR gate.

【００５１】本発明の他の利点として、直列処理される
命令は、開始側プロセッサで実行される第１のタスク
と、レスポンダ・プロセッサで実行される依存タスクと
で構成される。ここでトークンは、開始側プロセッサに
割当てなければならないので、開始側プロセッサは第１
のタスクを実行することができる。トークンは、全ての
依存タスクが完了するまで、トークンを要求するプロセ
ッサには割当てられない。As a further advantage of the invention, an instruction to be serialized consists of a first task executed on the initiating processor and dependent tasks executed on the responder processors. The token must be assigned to the initiating processor before that processor can execute the first task. The token is not assigned to any processor requesting it until all dependent tasks have completed.

【００５２】本発明により、本発明に従ってトークンが
取り得る第３の状態でどのような利点が得られるか明ら
かになる。命令が、異なるプロセッサで実行される分散
タスクで構成される場合、開始側プロセッサは、命令の
実行を開始するためには、トークンを保有していなけれ
ばならない。しかし、まだ保留されているタスクが実行
過程にある限り、トークンはブロックされたままであ
る。ただし開始側プロセッサは、そのタスクをすでに終
えており、トークンを返している可能性はある。こうし
て依存タスクにより、直列処理される新しいコマンドの
実行を開始することはできなくなる。This makes clear what advantage is gained from the third state that the token can assume according to the invention. If an instruction consists of distributed tasks executed on different processors, the initiating processor must hold the token in order to begin executing the instruction. As long as tasks that are still pending are being executed, however, the token remains blocked, even though the initiating processor may already have finished its task and returned the token. In this way the dependent tasks prevent a new command to be serialized from starting.

【００５３】本発明の他の利点として、命令バッファが
マルチプロセッサ装置の各プロセッサに割当てられ、こ
こで最初のプロセッサは、最初のタスクの実行時にコマ
ンドやアドレスをレスポンダ・プロセッサの命令バッフ
ァに書込む。これらのコマンドやアドレスは、レスポン
ダ・プロセッサで実行される依存タスクを指定する。As a further advantage of the invention, an instruction buffer is assigned to each processor of the multiprocessor device, and the initiating processor, while executing the first task, writes commands and addresses into the instruction buffers of the responder processors. These commands and addresses specify the dependent tasks to be executed on the responder processors.

【００５４】このようにして、最初のタスクは、レスポ
ンダ・プロセッサに割込む必要なく、必要な依存タスク
を開始することができる。レスポンダ・プロセッサが、
依存タスクを実行できる点に達したときにのみ、命令バ
ッファに書込まれた情報が考慮される。In this way, the first task can start the required dependent tasks without having to interrupt the responder processors. The information written into the instruction buffer is taken into account only when the responder processor reaches a point at which it can execute the dependent task.

【００５５】本発明の他の利点として、レスポンダＰＵ
の命令バッファへの書込みは、ブロードキャスト強制操
作によって行われる。As a further advantage of the invention, the instruction buffers of the responder PUs are written by means of a forced broadcast operation.

【００５６】更に、信号Ｂｉ（"コマンド保留"）を、レ
スポンダ・プロセッサｉの命令バッファへのコマンドま
たはアドレスの書込みと一緒にセットすることも利点に
なる。It is further advantageous to set the signal Bi ("command pending") together with the writing of a command or address to the instruction buffer of the responder processor i.

【００５７】これと共に、レスポンダ・プロセッサｉの
それぞれの依存タスクの終了時に信号Ｂｉをリセットす
ることも利点になる。In addition, it is advantageous to reset the signal Bi at the end of each dependent task of the responder processor i.

【００５８】また本発明では、依存タスクは、レスポン
ダ・プロセッサの命令フローの中、割込み可能な点に挿
入され実行される。In the present invention, the dependent task is inserted and executed at an interruptible point in the instruction flow of the responder processor.

【００５９】これによりレスポンダ・プロセッサの待機
時間が回避され、パフォーマンスは大幅に向上する。This avoids the wait time of the responder processor and significantly improves performance.

【００６０】リマインダをもとにした解決方法で特に大
きな利点のあるコマンドは、ＳＳＫＥ（記憶キー拡張セ
ット）コマンドである。A command for which the reminder-based solution is particularly advantageous is the SSKE (Set Storage Key Extended) command.

【００６１】マルチプロセッサ装置はメモリで構成され
るが、メモリをアドレスするためには、仮想アドレスか
ら実アドレスへの変換を、アドレス変換テーブルを利用
して行うことができる。すでに行われたアドレス変換の
結果は、アドレス変換キャッシュに保存される。アドレ
ス変換キャッシュはそれぞれプロセッサの１つに割当て
られる。ページに対する個々のプロセッサのアクセス権
を決定するためのキー情報は、メモリの一部分、つまり
キー記憶域に保存される。直列処理される命令は、キー
情報を変更するための命令（ＳＳＫＥ）である。開始側
プロセッサは、最初のタスクの実行時にキー記憶域にあ
るキー情報を変更する。そして、依存タスクの実行時、
他のプロセッサは、他のプロセッサに割当てられたアド
レス変換キャッシュのキー情報を変更する。The multiprocessor device includes a memory; to address the memory, virtual addresses can be translated to real addresses using address translation tables. The results of address translations already performed are stored in address translation caches, each of which is assigned to one of the processors. Key information that determines each processor's access rights to a page is stored in a part of the memory, the key storage. The instruction to be serialized is the instruction for changing key information (SSKE). The initiating processor changes the key information in the key storage when executing the first task. When executing the dependent tasks, the other processors change the key information in the address translation caches assigned to them.

【００６２】プロセッサは各々、キー記憶域のキーの変
更によって影響を受ける。ＳＳＫＥの場合にデータの保
全性の観点からコマンドの直列化が必要である。依存タ
スクの実行時には他のコマンドへのトークンの割当ても
防止しなければならない。Every processor is affected by a change to a key in the key storage. In the case of SSKE, commands must be serialized for reasons of data integrity. Assignment of the token to another command must also be prevented while dependent tasks are executing.

【００６３】リマインダをもとにした解決方法が特に都
合がよい１つのコマンドは、ページ・テーブルのエント
リを無効化するコマンド、ＩＰＴＥ（ページ・テーブル
・エントリ無効化）である。その場合、マルチプロセッ
サ装置はメモリで構成されるが、メモリをアドレスする
ために、仮想アドレスから実アドレスへの変換を、アド
レス変換テーブルを利用して行うことができる。すでに
行われたアドレス変換の結果は、アドレス変換キャッシ
ュに保存される。アドレス変換キャッシュはそれぞれプ
ロセッサの１つに割当てられる。直列処理される命令
は、ページ・テーブル・エントリを無効化する命令（Ｉ
ＰＴＥ）である。開始側プロセッサは、最初のタスクの
実行時にページ・テーブル・エントリを無効化する。そ
して、依存タスクの実行時、他のプロセッサは、他のプ
ロセッサに割当てられたアドレス変換キャッシュの対応
するエントリを無効化する。One command for which the reminder-based solution is particularly well suited is IPTE (Invalidate Page Table Entry), the command that invalidates a page table entry. Here too, the multiprocessor device includes a memory; to address the memory, virtual addresses can be translated to real addresses using address translation tables. The results of address translations already performed are stored in address translation caches, each of which is assigned to one of the processors. The instruction to be serialized is the instruction that invalidates a page table entry (IPTE). The initiating processor invalidates the page table entry when executing the first task. When executing the dependent tasks, the other processors invalidate the corresponding entries in the address translation caches assigned to them.

【００６４】直列化は、特に、ページ・テーブル・エン
トリを無効化する場合にデータの保全性の観点から重要
である。プロセッサは全て、アドレス変換のためにメモ
リのテーブルにアクセスするので、当然、ページ・テー
ブル・エントリの無効化は全てのプロセッサに影響を与
える。また、まだ依存タスクの実行時に新しいＩＰＴＥ
を開始することも不可能である。依存タスクは、ＩＰＴ
Ｅの場合には、ＴＬＢエントリを無効化する。特にこの
ために、返された後でもトークンをブロック状態にする
ことができるトークン管理装置が有利である。Serialization is important for data integrity, particularly when page table entries are invalidated. Because all processors access the tables in memory for address translation, invalidating a page table entry naturally affects all processors. A new IPTE must also not be started while dependent tasks are still executing. In the case of IPTE, the dependent tasks invalidate TLB entries. For this reason in particular, a token management device that can keep the token blocked even after it has been returned is advantageous.

【００６５】本発明の利点として、特に、ページ・テー
ブル・エントリの無効化に関係して、ｎ次関連アドレス
変換キャッシュの導入がある。この場合、ｎ回を超える
アドレス変換を要する命令は、ページ・テーブル・エン
トリを無効化する命令（ＩＰＴＥ）で直列化される。An advantage of the invention, particularly in connection with invalidating page table entries, arises when n-way associative address translation caches are used. In that case, instructions requiring more than n address translations are serialized with the instruction that invalidates a page table entry (IPTE).

【００６６】ＩＰＴＥは、次の手段により、ｎ回を超え
るアドレス変換を要するこの種のコマンドで有利に直列
化することができる。トークンは、プロセッサｉに割当
てなければならない。これによりプロセッサｉは、ｎ回
を超えるアドレス変換を要する命令を実行することがで
きる。IPTE can advantageously be serialized with commands of this kind, which require more than n address translations, as follows: the token must be assigned to processor i before processor i can execute an instruction requiring more than n address translations.

【００６７】このような手段によりデータの保全性の保
護を保証できる。ｎタプル関連アドレス変換キャッシュ
により、ｎ回より少ないまたは等しいアドレス変換を要
するコマンドでＩＰＴＥを直列化する必要がない。これ
で時間とコストが節約される。These measures ensure that data integrity is protected. With an n-way associative address translation cache, IPTE does not need to be serialized with commands requiring n or fewer address translations. This saves time and cost.

【００６８】マルチプロセッサ装置で直列処理される命
令を直列化するため、本発明に従ったプロセスについて
述べる。直列処理される命令の１つの実行は、開始側プ
ロセッサでの第１のタスクの実行から構成され、開始側
プロセッサは、トークンを保有している場合にのみ第１
のタスクを実行することができ、トークンは、使用可能
な場合、プロセッサの１つにのみ割当てることができ
る。このプロセスは次のステップで構成される。（１）
開始側プロセッサにトークンを要求するステップ、
（２）トークンが使用可能な場合に開始側プロセッサに
トークンを割当てるステップ、（３）直列処理される命
令を実行するステップ、（４）直列処理される命令の最
初のタスクが完了した後にトークンを返すステップ（そ
の結果、トークンは、必ずしも他のプロセッサから使用
できるようにする必要がない）、（５）直列処理される
命令の実行が完了した後にトークンの可用性を確立する
ステップ。A process according to the invention for serializing instructions to be serialized in a multiprocessor device is described next. Executing one of the instructions to be serialized comprises executing a first task on the initiating processor; the initiating processor can execute the first task only while it holds the token, and the token, when available, can be assigned to only one of the processors. The process consists of the following steps: (1) the initiating processor requests the token; (2) the token is assigned to the initiating processor if it is available; (3) the instruction to be serialized is executed; (4) the token is returned after the first task of the instruction to be serialized has completed (returning the token does not necessarily make it available to other processors); (5) the availability of the token is established after execution of the instruction to be serialized has completed.
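The five steps can be sketched as a small token manager. This is an illustrative single-threaded model under stated assumptions, not the patent's hardware; the class `TokenManager` and its method names (`request`, `broadcast`, `give_back`, `complete`) are hypothetical.

```python
class TokenManager:
    """Toy model of the token for n processors (names are illustrative)."""

    def __init__(self, n):
        self.holder = None          # index of the processor holding the token
        self.pending = [False] * n  # dependent task pending on processor i

    def available(self):
        # The token is available only if nobody holds it and no
        # dependent task is still outstanding.
        return self.holder is None and not any(self.pending)

    def request(self, pu):
        """Steps (1) and (2): request the token; grant it if available."""
        if not self.available():
            return False
        self.holder = pu
        return True

    def broadcast(self, pu):
        """Part of step (3): the first task starts dependent tasks elsewhere."""
        if self.holder != pu:
            raise RuntimeError("only the token holder may run the first task")
        for i in range(len(self.pending)):
            if i != pu:
                self.pending[i] = True

    def give_back(self, pu):
        """Step (4): return the token; it may still be blocked afterwards."""
        if self.holder != pu:
            raise RuntimeError("processor does not hold the token")
        self.holder = None

    def complete(self, pu):
        """Step (5) is reached once every dependent task has completed."""
        self.pending[pu] = False
```

In this sketch, returning the token in step (4) clears only the holder; availability in step (5) follows automatically once the last `complete` call clears the final pending flag.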

【００６９】このプロセスの場合、最初のタスクのパフ
ォーマンスは、トークンを保有した実行側プロセッサに
依存する。このプロセッサはまた、最初のタスクの実行
を完了したときにトークンを返す。In this process, execution of the first task depends on the executing processor holding the token. That processor returns the token when it has completed the first task.

【００７０】本発明に従った利点は、それにもかかわら
ず、コマンドに属し、他のプロセッサで実行されるタス
クが、すでに完全に完了したかどうかを考慮できるとい
う点にある。他のプロセッサのタスクを含めて、コマン
ド全体が完了したときにのみ、直列処理される新しいコ
マンドを実行することができる。本発明に従って、その
時点までトークンの可用性が再確立されることはない。The advantage of the invention is that it nevertheless takes into account whether the tasks that belong to the command and are executed on other processors have already completed. A new command to be serialized can be executed only when the entire command, including the tasks on other processors, has completed. According to the invention, the availability of the token is not re-established until that point.

【００７１】[0071]

【発明の実施の形態】図５及び図６は、開始側プロセッ
サで実行される最初のタスクのシーケンスを示す。例に
示すように、ページ・テーブル・エントリの無効化が考
慮されているが、それでもコマンドを実行する基本シー
ケンスは、最初のタスク、及びこの最初のタスクに依存
するタスクで構成された全てのコマンドで同じである。
ＩＰＴＥ（ページ・テーブル・エントリ無効化）の例を
続けるが、最初のタスクが実行されているとき、ページ
無効ビットをセットすることによって、ページ・テーブ
ルのページに関係するエントリを最初に無効化しなけれ
ばならない。次のステップでは、開始側ＰＵのＴＬＢの
ページに関係するエントリを無効化しなければならな
い。更に、開始側プロセッサは他のプロセッサ、レスポ
ンダＰＵに、他のプロセッサにも割当てられたＴＬＢの
ページに関係するエントリを無効化するよう要求しなけ
ればならない。FIG. 5 and FIG. 6 show the sequence of the first task, executed on the initiating processor. Invalidation of a page table entry is used as the example, but the basic sequence is the same for every command that consists of a first task and tasks dependent on it. In the IPTE (Invalidate Page Table Entry) example, when the first task is executed, the page table entry for the page must first be invalidated by setting the page-invalid bit. In the next step, the entry for the page in the TLB of the initiating PU must be invalidated. In addition, the initiating processor must request the other processors, the responder PUs, to invalidate the entries for the page in the TLBs assigned to them.

【００７２】メモリに保存されたアドレス変換テーブル
にアクセスするコマンドが妥当な直列化レベルに達する
には、無効化を行うプロセッサは、トークンを要求し、
トークンを割当てられなければならない。トークンを保
有したプロセッサだけが、アドレス変換テーブルに対す
る修正アクセスを行うことができる。更に、１度にトー
クンを保有できるのは１次のプロセッサだけである。リ
レー・バトンにたとえることのできるトークンを利用す
ることで、共通リソースに対するプロセッサのアクセス
権が規定される。For commands that access the address translation tables stored in memory to be properly serialized, the processor performing the invalidation must request the token and have it assigned. Only the processor holding the token may perform modifying accesses to the address translation tables, and only one processor can hold the token at a time. The token, which can be compared to a relay baton, thus governs the processors' access rights to the shared resource.

【００７３】あるページが、あるプロセッサによって無
効化されるとき、プロセッサは、最初のステップ（４０
０）で、無効化トークンが使用可能か（"トークン使用
可能"）どうか確認する。トークンが使用できない場
合、最初のタスクの実行は延期される。トークンが使用
できる場合、次のステップ（４０１）で開始側プロセッ
サがトークンを要求できる。次に、無効化トークンを要
求側プロセッサに割当てることができた（"トークン受
信"）かどうかの問い合わせがなされる（４０２）。そ
うでなければ、途中でトークンの可用性に変化があった
ことになる。そのため、処理はシーケンスの先頭へ返る
（４００）。トークンを割当てることができた（"トー
クン受信"）場合、開始側プロセッサは最初のタスクの
実行を開始することができる。When a page is to be invalidated by a processor, the processor checks in the first step (400) whether the invalidation token is available ("token available"). If it is not, execution of the first task is postponed. If it is, the initiating processor can request the token in the next step (401). A query (402) then determines whether the invalidation token could be assigned to the requesting processor ("token received"). If not, the availability of the token changed in the meantime, and processing returns to the start of the sequence (400). If the token was assigned ("token received"), the initiating processor can begin executing the first task.

【００７４】初めに、外部記憶域に転送されるページに
関係したページ・テーブル・エントリが無効化される
（４０３）。次のステップ（４０４）で、開始側ＰＵ
は、依存タスクを実行するコマンドを全てのレスポンダ
ＰＵに送る。これは、いわゆるブロードキャスト強制操
作によって行われる。つまり、このページに関係したＴ
ＬＢエントリを、無効化対象のページのアドレスと共に
無効化するコマンドが、全てのレスポンダＰＵに送ら
れ、レスポンダＰＵのそれぞれの無効化バッファに書込
まれる。従って開始側ＰＵは、全てのレスポンダＰＵ
に、処理すべき内容を通知している。ページに関係した
ＴＬＢエントリの無効化がそれぞれいつ行われるかは、
レスポンダＰＵの問題になる。しかし、レスポンダＰＵ
がそのＴＬＢを変更するのに必要とする情報は全て、無
効化バッファに保存されている。レスポンダＰＵのステ
ータスはここで"コマンド保留"になる。つまり別の依存
タスクが保留されている。First, the page table entry for the page being transferred to external storage is invalidated (403). In the next step (404), the initiating PU sends the command to execute the dependent task to all responder PUs by means of a so-called forced broadcast operation: the command to invalidate the TLB entry for this page, together with the address of the page to be invalidated, is sent to all responder PUs and written into their respective invalidation buffers. The initiating PU has thus told every responder PU what has to be done. When each responder PU actually invalidates its TLB entry for the page is up to that PU, but all the information it needs to change its TLB is stored in its invalidation buffer. The status of the responder PUs is now "command pending", meaning that a dependent task is still outstanding.

【００７５】ブロードキャスト強制操作が実行される
と、初期タスクの終わりに達し、初期タスクは従って、
ステップ（４０５）で無効化トークンを返すことができ
る。その結果、開始側ＰＵはトークンを保有しなくなる
が、トークンはそれでも、トークンを要求する他のどの
ＰＵにも割当てることはできない。依存タスクはまだレ
スポンダＰＵ側で保留されているからである。Once the forced broadcast operation has been performed, the end of the initiating task is reached, and the invalidation token can be returned in step (405). The initiating PU then no longer holds the token, but the token still cannot be assigned to any other PU requesting it, because dependent tasks are still pending on the responder PUs.

【００７６】ステップ（４０６）は、レスポンダＰＵの
無効化バッファに保存されたコマンドがレスポンダＰＵ
によってどのように処理されるかを示す。そのため、依
存タスク（ページ・テーブルの無効化では、対応するＴ
ＬＢエントリを削除するはずのタスク）は、対応するレ
スポンダＰＵの命令フローで割込み可能点においてルー
プをなし、実行される。レスポンダＰＵの"コマンド保
留"ステータスは、依存タスクの実行が完全に終了する
と消失する。まだ保留されている最後の依存タスクがそ
のプロセッサによって完全に実行されたとき、要求側プ
ロセッサに対するトークンの可用性が再確立される（４
０７）。トークンは再び"使用可能"になる。つまり、要
求側プロセッサにトークンを割当てることができ、従っ
て、ページ・テーブル・エントリの無効化を行う立場に
置くことができる。ステップ（４０７）では、開始側Ｐ
Ｕの開始側タスクが完了し、開始側ＰＵはそこで命令フ
ローの次の命令を処理できる（４０８）。Step (406) shows how the commands stored in the invalidation buffers of the responder PUs are processed by those PUs. The dependent task (for a page table entry invalidation, the task that deletes the corresponding TLB entry) is inserted at an interruptible point in the instruction flow of the respective responder PU and executed there. The "command pending" status of a responder PU is cleared when its dependent task has finished executing. When the last dependent task still pending has been completely executed by its processor, the availability of the token to requesting processors is re-established (407). The token is "available" again; that is, it can be assigned to a requesting processor, which is thereby put in a position to invalidate a page table entry. At step (407) the initiating task of the initiating PU is complete, and the initiating PU can then process the next instruction in its instruction flow (408).

【００７７】ステップ（４００）で、開始側プロセッサ
は、無効化トークンが使用可能かどうかチェックする。
使用できない場合、つまり、"トークン使用可能"ステー
タスが存在しない場合、開始側プロセッサ、ここでは、
開始側タスクを実行を妨げられているプロセッサは、ス
テップ（４１０）で、他のＰＵからの依存タスクが命令
バッファに保存されているかどうかチェックする。ＩＰ
ＴＥの例では、これはＴＬＢエントリを無効化するコマ
ンドである。この種の依存タスクが実行されるのを待っ
ていない場合、プロセッサは最初の問い合わせに返る
（４００）。最初の問い合わせでトークンの可用性が再
びチェックされる。しかし、他のＰＵからの依存タスク
が命令バッファに保存されていると、この依存タスクは
ステップ（４１１）で実行される。従ってＩＰＴＥの場
合、外部記憶域に転送されるページに対応したＴＬＢエ
ントリが無効化される。無効化に必要なデータは命令バ
ッファに保存される。依存タスクが完了すると、"コマ
ンド保留"信号がステップ（４１２）でリセットされ
る。次に、プロセッサは、無効化トークンが途中で使用
可能になったかどうか再確認するために問い合わせに返
る（４００）。In step (400), the initiating processor checks whether the invalidation token is available. If it is not, that is, if the "token available" status is not present, the initiating processor, which is now prevented from executing its initiating task, checks in step (410) whether a dependent task from another PU is stored in its instruction buffer. In the IPTE example, this is a command to invalidate a TLB entry. If no such dependent task is waiting, the processor returns to the initial query (400), where the availability of the token is checked again. If a dependent task from another PU is stored in the instruction buffer, it is executed in step (411); for IPTE, the TLB entry for the page being transferred to external storage is invalidated. The data required for the invalidation is held in the instruction buffer. When the dependent task has completed, the "command pending" signal is reset in step (412). The processor then returns to the query (400) to check again whether the invalidation token has become available in the meantime.
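The flow of steps (400) to (412) can be sketched as one pass through the flowchart for a PU that wants to start an IPTE. This is a single-threaded illustration under assumptions: the `System` class, its fields, and `initiator_pass` are hypothetical names, each instruction buffer holds at most one page address, and the grant in steps (401)/(402) always succeeds because nothing else runs concurrently in this model.

```python
class System:
    """Toy model of n PUs sharing one invalidation token (illustrative only)."""

    def __init__(self, n):
        self.holder = None                     # PU currently holding the token
        self.pending = [False] * n             # "command pending" signals Bi
        self.buffers = [None] * n              # per-PU buffer: page address or None
        self.tlbs = [set() for _ in range(n)]  # pages cached in each PU's TLB
        self.page_table_valid = {}             # page -> validity of its table entry

    def token_available(self):
        return self.holder is None and not any(self.pending)


def initiator_pass(system, pu, page):
    """One pass for PU `pu` trying to invalidate `page`.

    Returns True if the initiating task ran, False if the PU must retry.
    """
    if not system.token_available():                    # step 400
        if system.buffers[pu] is not None:              # step 410
            system.tlbs[pu].discard(system.buffers[pu]) # step 411
            system.buffers[pu] = None
            system.pending[pu] = False                  # step 412
        return False                                    # back to step 400
    system.holder = pu                                  # steps 401/402
    system.page_table_valid[page] = False               # step 403
    system.tlbs[pu].discard(page)                       # initiator's own TLB
    for i in range(len(system.buffers)):                # step 404: broadcast
        if i != pu:
            system.buffers[i] = page
            system.pending[i] = True
    system.holder = None                                # step 405: return token
    return True
```

Servicing its own buffer while waiting is what keeps two PUs that both want to start an IPTE from blocking each other: the waiting PU clears its own "command pending" signal, which is a precondition for the token becoming available again.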

【００７８】レスポンダＰＵの観点から見た対応する方
式を図７に示す。レスポンダＰＵが、実行されたばかり
の命令の処理を終えると、問い合わせステップ（５０
０）が実行される。ここでは、他のＰＵが、ＴＬＢエン
トリを無効化するコマンドを当該レスポンダＰＵの無効
化バッファにブロードキャスト強制操作によって書込ん
だかどうかチェックされる。書込んでいない場合、レス
ポンダＰＵは命令フローの次の命令の実行に進む（５０
３）。しかし依存タスクが保留されている場合、従って
レスポンダＰＵは"コマンド保留"ステータスにあり、依
存タスクは、ステップ（５０１）で命令バッファに保存
されたコマンドに従って実行される。ページ・テーブル
・エントリの無効化の場合、外部記憶域に転送されるペ
ージに関係したＴＬＢエントリが無効化される。その
後、ステップ（５０２）で、当該レスポンダＰＵで保留
されている依存タスクはなくなったことを示すために"
コマンド保留"信号がリセットされる。次にステップ
（５０３）で命令フローの次の命令が処理される。FIG. 7 shows the corresponding procedure from the viewpoint of a responder PU. When the responder PU has finished the instruction it was executing, query step (500) is performed: it checks whether another PU has written a command to invalidate a TLB entry into this responder PU's invalidation buffer by a forced broadcast operation. If not, the responder PU proceeds to the next instruction in its instruction flow (503). If a dependent task is pending, so that the responder PU is in "command pending" status, the dependent task is executed in step (501) according to the command stored in the instruction buffer. For a page table entry invalidation, the TLB entry for the page being transferred to external storage is invalidated. In step (502), the "command pending" signal is then reset to indicate that no dependent task remains pending on this responder PU. The next instruction in the instruction flow is then processed in step (503).
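The responder side of FIG. 7 can be illustrated with a short sketch in which the check of step (500) runs after every instruction, standing in for an interruptible point. The class `ResponderPU`, its attributes, and the string-based instruction log are assumptions made for this example only.

```python
class ResponderPU:
    """Toy responder: checks its invalidation buffer between instructions."""

    def __init__(self, tlb):
        self.tlb = set(tlb)           # pages currently cached in the TLB
        self.buffer = None            # invalidation buffer: page address or None
        self.command_pending = False  # signal Bi of this PU
        self.log = []                 # what the PU did, in order

    def receive_broadcast(self, page):
        # Written by the initiating PU; Bi is set together with the buffer.
        self.buffer = page
        self.command_pending = True

    def execute(self, instruction):
        self.log.append(instruction)  # the PU's ordinary instruction
        if self.command_pending:                           # step 500
            self.tlb.discard(self.buffer)                  # step 501
            self.log.append(f"invalidate {self.buffer}")
            self.buffer = None
            self.command_pending = False                   # step 502
        # step 503: the caller continues with the next instruction
```

The broadcast never interrupts the instruction in progress; the dependent task is picked up only at the next check, which is the point the text makes about avoiding responder wait time.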

【００７９】図８に、直列化に用いられるトークンが取
り得るさまざまな状態の表現を示す。複数のプロセッサ
（６０１）を有するコンピュータ装置（６００）が示し
てある。FIG. 8 shows representations of various states that can be taken by a token used for serialization. A computing device (600) having a plurality of processors (601) is shown.

【００８０】装置の第１の状態（６０６）で、トークン
（６０３）はどのプロセッサにも割当てられていない
が、使用はできる。つまりプロセッサの１つから要求が
あれば、そのプロセッサに割当てることができる。In the first state of the device (606), the token (603) has not been assigned to any processor but can be used. That is, if there is a request from one of the processors, it can be assigned to that processor.

【００８１】その場合には、装置は第２の状態（６０
７）にシフトする。ここでトークンはプロセッサの１
つ、この例ではプロセッサ２（６０４）に割当てられ
る。プロセッサ２はこうして、直列処理される特別な命
令を実行する許可を受ける。これらの命令には、開始側
タスクが全てのプロセッサに共通のリソースを変更し、
他のプロセッサで依存タスクを起動する全ての命令が含
まれる。他のプロセッサは、共通リソースのローカル・
コピーを変更する。ＰＵ２の場合、トークンを保有する
ことは、この種の命令の開始側タスクを実行できる権利
を意味する。On such a request, the device shifts to the second state (607), in which the token is assigned to one of the processors, in this example processor 2 (604). Processor 2 is thereby authorized to execute the special instructions that must be serialized. These include all instructions whose initiating task modifies a resource common to all processors and starts dependent tasks on the other processors, which then modify their local copies of the common resource. For PU2, holding the token means having the right to execute the initiating task of such an instruction.

【００８２】開始側タスクが完全に実行されると、ＰＵ
２はトークンを返す。これは図５のステップ（４０５）
に対応する。しかし、まだ完全に実行されていない依存
タスクがレスポンダＰＵで保留されている場合は、トー
クンは、返されたときにまだ他のプロセッサから使用で
きない。従ってトークンを管理する装置は第３の状態
（６０８）にシフトする。トークン（６０５）がどのＰ
Ｕ（６０１）にも割当てられていないとき、それでもト
ークンはまだ"トークン使用可能"ステータスにはない。
ＰＵ（６０１）の少なくとも１つはまだその"コマンド
保留"信号をリセットしていないからである。これが生
じるとき、つまり、全てのレスポンダＰＵがそれらに割
当てられた依存タスクを終了しているとき、トークンは
再び開放され、また第１の状態（６０６）に達する。こ
れは図５のステップ（４０７）に対応する。ＰＵ（６０
１）の１つからの要求によりトークン（６０３）をその
ＰＵに割当てることができる。When the initiating task has been completely executed, PU2 returns the token; this corresponds to step (405) in FIG. 5. If dependent tasks that have not yet been completely executed are pending on responder PUs, however, the returned token is not yet available to other processors, and the device managing the token shifts to the third state (608). The token (605) is then not assigned to any PU (601), yet it is still not in "token available" status, because at least one of the PUs (601) has not yet reset its "command pending" signal. Once that has happened, that is, once all responder PUs have finished the dependent tasks assigned to them, the token is released again and the first state (606) is reached. This corresponds to step (407) in FIG. 5. The token (603) can then be assigned to one of the PUs (601) on request.

【００８３】状態１（６０６）、状態２（６０７）、状
態３（６０８）を順次に実行する代わりに、第２の状態
から第１の状態への切り替え（６０９）も可能である。
つまりプロセッサの１つに割当てられているトークン
（６０４）は、返されたとき再び直接使用できるように
なる。このような処理は、１つのタスクでのみ構成され
るコマンドには理にかなっており、従って処理の実行は
依存タスクの処理で構成されない。これらのコマンド
は、ここに示した形で、開始側タスクと依存タスクで構
成される、すでに述べたコマンドで直列化できる。この
手順は、データの保全性を守るためには必要になること
がある。Instead of passing through state 1 (606), state 2 (607), and state 3 (608) in sequence, the device can also switch directly from the second state to the first (609). The token (604) assigned to one of the processors then becomes available again as soon as it is returned. This makes sense for commands that consist of a single task and therefore involve no dependent tasks. In this way such commands can be serialized with the commands described above, which consist of an initiating task and dependent tasks. This may be necessary to protect data integrity.
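The transitions between the three states, including the direct switch (609), can be written down as a small transition table. The sketch below is illustrative; the event names (`grant`, `return_with_pending`, `return_no_pending`, `last_dependent_done`) are assumptions, and the enum values simply reuse the reference numerals of FIG. 8.

```python
from enum import Enum


class State(Enum):
    AVAILABLE = 606  # first state
    HELD = 607       # second state
    BLOCKED = 608    # third state


# Allowed transitions; event names are hypothetical labels for this sketch.
TRANSITIONS = {
    (State.AVAILABLE, "grant"): State.HELD,
    (State.HELD, "return_with_pending"): State.BLOCKED,
    (State.HELD, "return_no_pending"): State.AVAILABLE,  # direct switch (609)
    (State.BLOCKED, "last_dependent_done"): State.AVAILABLE,
}


def step(state, event):
    """Return the next state, or raise if the event is not allowed."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(
            f"event {event!r} not allowed in state {state.name}"
        ) from None
```

There is deliberately no entry that grants the token from the blocked state, which is the property that keeps a new serialized command from starting while dependent tasks are outstanding.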

【００８４】図９は、トークンを管理するための基本回
路を示す。ここで、どのプロセッサｉについても、プロ
セッサｉがトークンを保有している（"トークン受信"）
かどうかを示す信号Ａｉ（７０１）を使用できる。ま
た、どのプロセッサｉについても、まだ完全に実行され
ていない依存タスクがプロセッサｉで実行を保留されて
いる（"コマンド保留"）かどうかを示す信号Ｂｉ（７０
２）を使用できる。これらの信号は全て、ORチェイン
（７００）によって信号Ｃ（７０３）になるよう処理さ
れる。信号Ａｉのいずれか１つがセットされる（つま
り、プロセッサｉがトークンを保有した状態になる）
か、信号Ｂｉの少なくとも１つがセットされる（つま
り、まだ完全に実行されていない依存タスクが保留され
ている）と、信号Ｃ（７０３）もセットされる。この信
号Ｃの意味は"トークン使用不可"である。このようにト
ークンは、信号Ｃがセットされていない場合、つまりト
ークンが"使用可能"な場合にのみ、要求側プロセッサに
割当てることができる。FIG. 9 shows a basic circuit for managing the token. For every processor i there is a signal Ai (701) indicating whether processor i holds the token ("token received"), and a signal Bi (702) indicating whether a dependent task that has not yet been completely executed is pending on processor i ("command pending"). All of these signals are combined by the OR chain (700) into signal C (703). If any one of the signals Ai is set (processor i holds the token) or at least one of the signals Bi is set (a dependent task that has not yet been completely executed is pending), signal C (703) is also set. Signal C means "token unavailable". The token can therefore be assigned to a requesting processor only when signal C is not set, that is, when the token is "available".

【００８５】次に、一方ではラインＡ１からＡｎ（７０
１）、ＢｉからＢｎ（７０２）、及びＣ（７０３）の状
態間で、他方では図８に示した状態間で、接続の確立が
試行される。トークンを管理する装置が第１の状態（６
０６）にある場合、これはつまり一方でトークン（６０
３）はどのプロセッサにも割当てられていないことを意
味する。そのため、信号Ａ１からＡｎ（７０１）はどれ
もセットされない。他方、どのプロセッサも"コマンド
保留"ステータスにはならない。トークン（６０３）は
そのとき使用できないからである。よって、ラインＢｉ
からＢｎ（７０２）のどれもまたセットされない。従っ
て、信号Ｃ（７０３）もセットされない。トークン（６
０３）は従って"トークン使用可能"ステータスにある。Next, the states of lines A1 to An (701), B1 to Bn (702), and C (703) are related to the states shown in FIG. 8. When the device managing the token is in the first state (606), the token (603) is not assigned to any processor, so none of the signals A1 to An (701) is set. Nor is any processor in "command pending" status, since the token (603) would otherwise not be available; hence none of the lines B1 to Bn (702) is set either. Signal C (703) is therefore not set, and the token (603) is in "token available" status.

【００８６】プロセッサの１つからの要求に応答して、
トークンがこのプロセッサに割当てられた場合は、信号
Ａ１からＡｎ（７０１）の１つがセットされなければな
らない。これは"トークン受信"ステータスに相当する。
従って、信号ラインＢｉからＢｎ（７０２）のステータ
スとは無関係に、信号Ｃ、"トークン使用不可"がセット
される。プロセッサに割当てられたトークン（６０４）
は、他のどのプロセッサにも割当てられなくなる。信号
のこのステータスは装置の第２の状態（６０７）に対応
する。When the token is assigned to a processor in response to that processor's request, one of the signals A1 to An (701) must be set. This corresponds to "token received" status. Signal C, "token unavailable", is therefore set regardless of the status of signal lines B1 to Bn (702). The token (604) assigned to the processor can no longer be assigned to any other processor. This signal status corresponds to the second state of the device (607).

【００８７】ここで、トークンを保有していたプロセッ
サがトークンを返したが、依存タスクはまだ完全に実行
されていない場合、装置は第３の状態（６０８）にシフ
トする。この時点までセットされていた対応する信号ラ
インＡｉ（"トークン受信"）は、トークンが返るとリセ
ットされる。しかしまだ処理されていない依存タスクが
レスポンダＰＵ側で実行を保留されているので、信号Ｂ
１からＢｎ（"コマンド保留"）のうち少なくとも１つは
セットされ、そのため、信号Ｃ（７０３）もセットされ
る。つまりすでに返されたトークン（６０５）は第３の
状態でまだ"トークン使用不可"ステータスにある。If the processor that held the token returns it while dependent tasks have not yet been completely executed, the device shifts to the third state (608). The corresponding signal line Ai ("token received"), set until this point, is reset when the token is returned. Because unprocessed dependent tasks are still pending on responder PUs, however, at least one of the signals B1 to Bn ("command pending") is set, and so signal C (703) remains set. The returned token (605) is thus still in "token unavailable" status in the third state.

【００８８】全ての依存タスクが完全に処理されたとき
だけ、Ｂ１からＢｎの全てのライン（"コマンド保留"）
もリセットされる。次に信号Ｃもリセットされ、装置は
再び第１の状態（６０６）に戻り、トークン（６０３）
は再び"使用可能"になる。Only when all dependent tasks have been completely processed are all the lines B1 to Bn ("command pending") reset. Signal C is then reset as well, the device returns to the first state (606), and the token (603) is "available" again.

【００８９】図１０は、実現の容易さという点では有利
な図９に示した回路の変更例を示す。この例で、プロセ
ッサｉに関係した信号Ａｉ（７０８、"トークン受信"）
とＢｉ（７０９、"コマンド保留"）は、ORゲート（７１
０）によって処理され、全てのプロセッサでＣｉ信号
（７１１）が形成される。前記信号Ｃｉは次に中央のOR
ゲート（７１２）の入力に印加される。この中央ORゲー
トの出力は信号Ｃ（７１３）で、これは"トークン使用
不可"の意味を有する。図９の大きなORゲート（７０
０）をｎ個の小さいORゲート（７１０）と１つの中央OR
ゲート（７１２）に分ける利点は、２つのライン（Ａｉ
とＢｉ）を使用する必要はなくなり、各プロセッサから
中央のORゲート（７１２）に１つ（Ｃｉ）だけでよいこ
とである。他の部分の回路は同じである。FIG. 10 shows a variant of the circuit of FIG. 9 that is easier to implement. Here the signals Ai (708, "token received") and Bi (709, "command pending") belonging to processor i are combined by an OR gate (710), so that a signal Ci (711) is formed at every processor. Each signal Ci is then applied to an input of the central OR gate (712). The output of the central OR gate is signal C (713), which means "token unavailable". The advantage of splitting the large OR gate (700) of FIG. 9 into n small OR gates (710) and one central OR gate (712) is that only one line (Ci), rather than two (Ai and Bi), has to run from each processor to the central OR gate (712). The rest of the circuit is unchanged.
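That the two-level arrangement of FIG. 10 computes the same signal C as the single OR chain of FIG. 9 follows from the associativity of OR. The sketch below checks this by brute force for a small n and counts the lines routed to the central gate; the function names are assumptions introduced for illustration.

```python
from itertools import product


def c_flat(a, b):
    """FIG. 9 style: one OR chain over all Ai and Bi."""
    return any(a) or any(b)


def c_two_level(a, b):
    """FIG. 10 style: per-processor OR gates feeding one central OR gate."""
    c_i = [ai or bi for ai, bi in zip(a, b)]  # local OR gate (710) per processor
    return any(c_i)                           # central OR gate (712)


def lines_to_central(n, two_level):
    """Number of signal lines routed from the n processors to the central gate."""
    return n if two_level else 2 * n


def equivalent_for(n):
    """Exhaustively compare both circuits for every input combination."""
    for bits in product([False, True], repeat=2 * n):
        a, b = bits[:n], bits[n:]
        if c_flat(a, b) != c_two_level(a, b):
            return False
    return True
```

For n processors the wiring to the central gate drops from 2n lines to n, at the cost of one small OR gate per processor.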

【００９０】図１１は、信号Ａ１乃至Ａｎ、Ｂ１乃至Ｂ
ｎ、及びＣの時系列と共に、初期タスク及び依存タスク
の時系列を表す試みである。更に、この時系列を、図８
に示している装置に可能な状態とリンクする試みもなさ
れている。FIG. 11 plots the time sequence of the initiating task and the dependent tasks together with that of the signals A1 to An, B1 to Bn, and C, and relates this sequence to the possible states of the device shown in FIG. 8.

【００９１】まず、トークン（８００）は第１の状態
（６０６）にある、つまりトークンは要求に応じてＰＵ
に割当てることができる（"トークン使用可能"）。従っ
て、信号Ａ１乃至Ａｎ、Ｂ１乃至Ｂｎ、またはＣのいず
れもセットされない。プロセッサｉは、ページ・テーブ
ル・エントリの無効化を開始しようとした場合、最初に
トークンを要求しなければならない。トークンは使用で
きるので、プロセッサの要求に応じてプロセッサに割当
てることができる。（８０１）。トークンを受信したＰ
Ｕｉは信号Ａｉ（８０４、"トークン受信"）をセットす
る。Initially the token (800) is in the first state (606); that is, it can be assigned to a PU on request ("token available"). Accordingly, none of the signals A1 to An, B1 to Bn, or C is set. If processor i wants to start invalidating a page table entry, it must first request the token. Because the token is available, it can be assigned to the processor on request (801). PUi, having received the token, sets signal Ai (804, "token received").

【００９２】ＰＵｉはトークンを保有したので、共通リ
ソースを変更する命令の初期タスク（８０２）を実行す
る権利を有する。トークンはここでＰＵｉに割当てられ
るので、トークン（８０５）を要求する他のＰＵからは
使用できなくなる。この点、で、信号Ｃ（８１１、"ト
ークン使用不可"）も、ＰＵｉがトークンを受け取った
時点からセットされなければならない。初期タスク（８
０２）の実行中に、開始側ＰＵｉは他のＰＵ、すなわち
レスポンダに要求（８０３）を送り、初期タスクに属す
る依存タスク（８０８、８０９）の実行を求める。レス
ポンダＰＵで実行される依存タスクのより詳しい仕様に
関するコマンド及びデータも、開始側ＰＵによってレス
ポンダの命令バッファに書込まれる。ブロードキャスト
強制操作によってレスポンダＰＵの命令バッファにコマ
ンド及びデータが書込まれる瞬間、各レスポンダＰＵ
の"コマンド保留"信号がセットされる。ＰＵｉによって
開始されたブロードキャスト強制操作の結果、信号Ｂｉ
を除く信号Ｂ１乃至Ｂｎ（８１０）が全てセットされ
る。Holding the token, PUi has the right to execute the initiating task (802) of an instruction that modifies the common resource. Because the token is now assigned to PUi, it is unavailable to other PUs that request it (805). Signal C (811, "token unavailable") must therefore also be set from the moment PUi receives the token. While executing the initiating task (802), the initiating PUi sends requests (803) to the other PUs, the responders, to execute the dependent tasks (808, 809) belonging to the initiating task. The commands and data that specify the dependent tasks in detail are also written by the initiating PU into the responders' instruction buffers. At the moment the commands and data are written into a responder PU's instruction buffer by the forced broadcast operation, that responder PU's "command pending" signal is set. As a result of the forced broadcast initiated by PUi, all signals B1 to Bn (810) except Bi are set.

[0093] When the initial task (802) ends, the initiating PU returns the token (806). At the same time, signal Ai, which indicates that processor i holds the token, is reset (804). Returning the token does not, however, make it available to other PUs that request it, because dependent tasks are still pending on the responder PUs (810). The device is now in the third state (608), in which no PU holds the token (807), so none of the lines A1 to An is set. Signal C remains in the "token not available" status (811). When a responder PU reaches an interruptible point in its instruction flow, the dependent task (808, 809) pending on it can be inserted and executed. When one of the responder PUs, for example PU1, finishes executing its dependent task, the "command pending" signal belonging to that responder, in this case B1, is reset (810).

[0094] When the last dependent task (809) started by the initiating PU completes, the last "command pending" signal still set is reset. The token, which has already been returned, now becomes available again. The device changes from state 3 to state 1 (812), and the status of signal C becomes "token available" (811). The token can again be assigned to a PU that requests it. The state reached here corresponds to the initial state (800).

[0095] Considering how the statuses of signals A1 to An, B1 to Bn, and C change over time, it becomes clear that signal C (811) can in fact be represented as an OR chain of the signals A1 to An and B1 to Bn, as depicted in FIGS. 9 and 10.
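The relationship between the signals and the three device states can be sketched in a few lines of Python. This is only an illustrative model, not the patented circuit: the names `TokenState`, `signal_c`, and `token_state` are assumptions introduced here, and the signals are modeled as lists of booleans.

```python
from enum import Enum


class TokenState(Enum):
    AVAILABLE = 1  # first state (606): no Ai set, no Bi set
    ASSIGNED = 2   # second state (607): one Ai set
    BLOCKED = 3    # third state (608): no Ai set, at least one Bi set


def signal_c(a, b):
    """Signal C ("token not available") as the OR of all Ai and all Bi."""
    return any(a) or any(b)


def token_state(a, b):
    """Classify the device state from the per-processor signals."""
    if any(a):
        return TokenState.ASSIGNED
    if any(b):
        return TokenState.BLOCKED
    return TokenState.AVAILABLE
```

In this model the token is available exactly when `signal_c` is false, which matches the observation that C is the OR chain of A1 to An and B1 to Bn.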

[0096] FIG. 12 shows a concrete circuit for managing the token. When PUi requests the token, signal line (901) is set to "HIGH". If signal line (902) is "HIGH" at the same time, that is, if the token is available, both inputs of AND gate (900), and therefore its output, are "HIGH". The output of the AND gate is held temporarily in latch (903). A "HIGH" signal on signal line Ai (904) means that the token was available at the time of the request and has therefore been assigned to the requesting PUi. Signal Ai thus indicates whether processor i holds the token (in which case the signal is set).

[0097] Signal line Ai (904) and signal line Bi (905, "command pending") form the inputs of OR gate (906), which corresponds to the OR gate (710) of FIG. 10. The output of this OR gate (906) is signal Ci, which is held temporarily in latch (907). The connection (909) to the central OR gate (911), which corresponds to the central OR gate (712) of FIG. 10, is established through a driver module (906). All other processors are likewise connected to the inputs of the central OR gate (911) through connections (910) that correspond to signal line (909). The inputs (909, 910) of the central OR gate (911) correspond to the signals C1 to Cn (711) of FIG. 10. The output of OR gate (911) therefore indicates whether signal Ai ("token received") or signal Bi ("command pending") is set on any PU.

[0098] The output (912) of the central OR gate (911) corresponds to signal line C (713) in FIG. 10. If signal line C is "HIGH", either one of the processors holds the token (one of the signals Ai, "token received", is set), or the token is still blocked because a dependent task that has not yet been completely executed is pending on one of the processors (one of the signals Bi, "command pending", is set). A "HIGH" status on signal line C therefore means that the token cannot be assigned to a PU that requests it; that is, the token is unavailable ("token not available"). The token is thus assigned to a PU only when signal line C is "LOW". Signal C is supplied (912) from the central OR gate (911) to a receiver (913) located in each processor. Each processor therefore has a signal line C (914, "token not available"), and the status of C can be read from the respective latch (915).

[0099] Signal C (914) is converted by an inverter (916) into signal (902), which is "HIGH" exactly when the token is available ("token available"). Signal (902) is supplied to the second input of AND gate (900), so that the token is assigned to processor i only when it is in the "token available" status (902).

[0100] Signal Bi (905, "command pending") is set when a dependent task is pending on processor i. The command that specifies the dependent task, together with the data needed to execute it, is written into the invalidation buffer (918) via line (917) by a forced broadcast operation. Latch (920) is also set via line (917). The output of this latch, signal line Bi (905, "command pending"), then goes "HIGH". This signal (905) is reset when PUi has completely executed the dependent task: the "token unlock" signal (919) applied to the reset input resets latch (920). As a result, signal Bi (905, "command pending") returns to "LOW".
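The behavior of the FIG. 12 circuit over one serialized instruction can be approximated in software. The sketch below is a simplified model under stated assumptions: the class `TokenManager` and its method names are hypothetical, latches are modeled as booleans, and timing, drivers, and receivers are ignored.

```python
class TokenManager:
    """Behavioral model of the per-processor latches Ai and Bi (assumed names)."""

    def __init__(self, n):
        self.a = [False] * n  # Ai: "token received"
        self.b = [False] * n  # Bi: "command pending"

    @property
    def c(self):
        # Central OR gate: C is HIGH while any Ai or any Bi is set.
        return any(self.a) or any(self.b)

    def request(self, i):
        # AND gate (900): the request succeeds only while inverted C is HIGH.
        if self.c:
            return False
        self.a[i] = True
        return True

    def broadcast(self, i):
        # The initiator writes a command to every other PU's buffer,
        # which sets each responder's "command pending" latch.
        if not self.a[i]:
            raise RuntimeError("initiator must hold the token")
        for j in range(len(self.b)):
            if j != i:
                self.b[j] = True

    def return_token(self, i):
        # End of the initial task: Ai is reset, but C may stay HIGH.
        self.a[i] = False

    def complete_dependent(self, j):
        # The responder finished its dependent task: Bj is reset.
        self.b[j] = False
```

With three processors, a sequence of `request(0)`, `broadcast(0)`, `return_token(0)` leaves the model in the third state: `request(1)` fails until both `complete_dependent(1)` and `complete_dependent(2)` have been called.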

[0101] Whereas the purpose of the initial task is to invalidate an entry in the page table, the dependent tasks serve to delete the TLB entries of the responder PUs that relate to that page. When a command to invalidate a TLB entry is written into the invalidation buffer (918), the "TLB entry invalidation" signal is set at the same time. Because signal Bi (905) is also "HIGH", the output of AND gate (922) is set whenever a TLB entry is to be invalidated. The TLB entry is deleted according to the command stored in the invalidation buffer.

[0102] FIG. 13 shows the structure and operation of the address translation cache. In a normal address translation, the segment index and page index of a virtual address are mapped to the real initial address of a page in memory using entries of the segment table and the page table. This translation costs a considerable number of processor cycles. Once the translation needed to address a page in memory has been performed, however, the association between the segment index and page index of the virtual address and the real initial address of the page can be recorded in a table entry. The address translation cache is a table of this kind; it maps the high-order part of a virtual address to the real initial address of a page. When a processor addresses a page for the first time, the address must be translated using the segment table and the page table, and a corresponding entry is created in the TLB. This entry maps the high-order part of the virtual address to the real initial address of the page. When the processor accesses the page a second time, it first searches the TLB assigned to it for an entry relating to that page. If such an entry is found, no explicit address translation is needed, which saves considerable time. Explicit address translation using the segment table and the page table is performed only when the processor finds no entry for the page in its TLB.

[0103] The TLB consists of arrays (1006, 1007, 1008, 1009). Its task is to map the high-order part of a virtual address to the real initial address of a page. Bits 13 to 19 (1001) of the high-order part of the virtual address are used to address one of the 128 entries of a TLB column (1006). Each entry consists of a part relating to the virtual address (1011), in which bits 1 to 12 of the virtual address are stored, and the real initial address of the page (1012).

[0104] To search, starting from a virtual address (1000), for the TLB entry relating to the page that the address specifies, one of the 128 TLB entries is first addressed using bits 13 to 19 (1001) of the virtual address. The entry found in this way agrees with the virtual address being looked up in bits 13 to 19 and relates to one of the recently performed address translations. To determine whether the entry actually represents the page being sought, however, bits 1 to 12 of the virtual address stored in the entry (1011) must be compared with bits 1 to 12 of the virtual address being looked up (1002). This comparison is performed by a comparator (1015). The comparator's output signal (1016) is set when the compared bits match. In that case, address bits 1 to 19 stored in the entry (1012) represent the high-order part of the real initial address of the page being sought. When the comparator output (1016) is set, the high-order part of the real address (1012) can be switched onto the address bus (1022) via an AND gate (1017).

[0105] Two virtual addresses may nevertheless agree in bits 13 to 19 and differ in bits 1 to 12. Because the array is addressed through bits 13 to 19, with only one column (1006) available the entry for the later-translated address would overwrite the entry for the earlier-translated address. To allow entries for virtual addresses that agree in bits 13 to 19 but differ in bits 1 to 12 to coexist in the TLB, further columns must be added in parallel with the first. FIG. 13 shows four columns; this arrangement is referred to as a "four-way associative address translation cache". Each additional column (1007, 1008, 1009), like the first, has a comparator (1018) for comparing bits 1 to 12 of the virtual address. If the virtual addresses match, the output signal (1019) of this comparator switches the real initial address (1014) of the page onto the address bus (1022) via an AND gate (1020). In the four-way associative TLB of FIG. 13, up to four different entries for virtual addresses that agree in bits 13 to 19 can exist side by side in the four columns (1006, 1007, 1008, 1009).

[0106] When one of these four virtual addresses is translated using the TLB, bits 13 to 19 of the virtual address first select four entries, one in each of the four columns. Which of the four entries actually corresponds to the virtual address being translated is determined by the comparators, which compare bits 1 to 12 of the virtual address with the corresponding bits in each entry. The entry in the column whose comparator reports a match is the one sought. The real initial address stored in that entry is then switched onto the address bus (1022) via the AND gate.

[0107] Bits 13 to 19 (1001) of the virtual address are also used to address a common invalid bit shared by all columns. An array (1010) of 128 latches is provided for this purpose. To invalidate a TLB entry, the common invalid bit (1021) belonging to that entry is set. A set common invalid bit, however, applies to all four columns.
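The lookup just described (index by bits 13 to 19, compare bits 1 to 12 in each column, one common invalid bit per row) can be illustrated with a small software model. This is a sketch under several assumptions that do not come from the source: the class `Tlb` and its methods are hypothetical, the caller passes the tag (bits 1 to 12) and the row index (bits 13 to 19) as separate integers, replacement is round-robin, and a row whose invalid bit is set is cleared when it is next written.

```python
ROWS = 128  # one row per value of virtual address bits 13 to 19
WAYS = 4    # four parallel columns (1006 to 1009)


class Tlb:
    def __init__(self):
        # Each column holds ROWS entries of (tag, real_address), or None.
        self.ways = [[None] * ROWS for _ in range(WAYS)]
        self.invalid = [False] * ROWS  # common invalid bits (array 1010)
        self.next_way = [0] * ROWS     # round-robin victim (assumption)

    def lookup(self, tag, index):
        """Return the real initial address, or None on a TLB miss."""
        if self.invalid[index]:
            return None  # the invalid bit covers all four columns
        for way in self.ways:
            entry = way[index]
            if entry is not None and entry[0] == tag:  # comparator match
                return entry[1]
        return None

    def insert(self, tag, index, real_address):
        if self.invalid[index]:
            # Assumption: an invalidated row is emptied before reuse.
            for way in self.ways:
                way[index] = None
            self.invalid[index] = False
        for way in self.ways:
            if way[index] is not None and way[index][0] == tag:
                way[index] = (tag, real_address)
                return
        victim = self.next_way[index]
        self.ways[victim][index] = (tag, real_address)
        self.next_way[index] = (victim + 1) % WAYS

    def invalidate(self, index):
        """Set the common invalid bit for one row."""
        self.invalid[index] = True
```

Four translations that share a row index can coexist in this model; a fifth evicts one of them, and `invalidate` removes all entries of the row at once, mirroring the remark that a set invalid bit applies to all four columns.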

[0108] An n-way associative TLB can hold the results of up to n mutually independent address translations at the same time. When an instruction that needs at most n address translations is executed, all the required translations can therefore be performed at the start of the instruction. This is shown in FIG. 14. All address translations needed to execute the instruction (1100) are performed at the start of the instruction (1101). To perform these translations, the instruction must be able to access the address translation tables in memory, that is, the segment table and the page table. Once the translations are complete, this access is no longer necessary, because a corresponding entry has been created in the processor's TLB for every translation performed. When the instruction (1100) accesses one of the specified addresses during its execution, it can do so using the corresponding TLB entry. If the number of address translations an instruction needs does not exceed the number of available TLB columns, the translations can be performed at the start of the instruction, and no later access to the page table is needed. Instructions of this kind therefore do not have to be serialized with the invalidation of page table entries: the instruction accesses its local TLB and never needs to translate through the page table at a later point.

[0109] The situation is different when more than n address translations are needed during execution of an instruction and only n TLB columns are available. This case is shown in FIG. 15. Not all the address translations needed by the instruction can be performed at the start of the instruction (1103), because the TLB does not have the entries that would be required. A subsequent translation (1104), which accesses the page table again, must therefore be performed at a later stage of the instruction's execution. If the page table is changed between the first and second address translations while the instruction is executing, data integrity problems can arise. Instructions of this kind, which need more than n address translations, must therefore be serialized with commands that change the page table (for example, Invalidate Page Table Entry).

[0110] According to the invention, this serialization is achieved by requiring that a command needing more than n address translations hold the token in order to be executed. The corresponding sequence is shown in FIG. 16. Initially, the token is in the first state (1200). When processor i has to execute a command that needs more than n address translations, it must request the token. If the token is available, it can be assigned to processor i (1201). Processor i thereby obtains permission to execute the command completely (1202). Because processor i holds the token, signal Ai (1203, "token received") is also set. The device is in the "token not available" status (1204), so the token cannot be assigned to any other processor that requests it. When the task executed on PUi ends (1202), execution of the command ends as well, because no dependent tasks are executed. This is why the token becomes available again (1207) as soon as it is returned (1206). The device therefore moves directly from the second state (607) to the first state (606), which corresponds to arrow (609) in FIG. 8. This can also be seen in the signal diagram of FIG. 16. When the task executed on PUi ends (1202), signal Ai (1203) is reset. Because no dependent tasks are executed, none of the signals B1 to Bn is set. Signal C (1204, "token not available") is therefore reset immediately when the task ends, and the token becomes available again.
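The decision of whether an instruction must hold the token, and the direct transition from the second state back to the first, can be sketched as follows. The names `Token` and `execute` and the returned strings are illustrative assumptions only; the instruction body itself is not modeled.

```python
class Token:
    """Minimal token model: one holder (Ai) and a set of pending PUs (Bi)."""

    def __init__(self):
        self.holder = None    # index of the PU whose Ai is set, or None
        self.pending = set()  # PUs whose Bi ("command pending") is set

    @property
    def unavailable(self):
        # Signal C: set while the token is held or dependent tasks are pending.
        return self.holder is not None or bool(self.pending)


def execute(token, pu, translations_required, tlb_ways):
    """Run one instruction on processor `pu` under the token rule."""
    if translations_required <= tlb_ways:
        # All translations fit in the TLB up front: no serialization needed.
        return "executed without token"
    if token.unavailable:
        return "must wait for token"
    token.holder = pu    # Ai set, C set: second state
    # ... the instruction would run here; the page table cannot change ...
    token.holder = None  # Ai reset; no Bi was set, so C drops: first state
    return "executed with token"
```

Because such an instruction starts no dependent tasks, nothing is ever added to `pending` on its behalf, so the token is available again as soon as `execute` returns.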

[0111] In summary, the following items are disclosed with regard to the configuration of the present invention.

[0112]
(1) A device for managing a token for serializing instructions that are to be processed serially in a multiprocessor device, wherein a processor can execute one of the instructions to be processed serially only if it holds the token, the device having: a first state in which the token is not assigned to any processor and can be assigned to a processor that requests it; a second state in which the token is assigned to one of the processors and cannot be assigned to a processor that requests it; and a third state in which the token is not assigned to any processor and cannot be assigned to a processor that requests it.
(2) The token management device according to (1), wherein the multiprocessor device consists of n processors (n being an integer of 1 or more); for each processor i, a signal Ai is available that is set when processor i holds the token; for each processor i, a signal Bi is available that is set when a task that has not yet been completely executed is pending on processor i, the task having to be serialized with the execution of one of the instructions to be processed serially, so that the token cannot be assigned to a processor that requests it; a signal C is available that is common to the entire multiprocessor device, is produced by the OR chain of the signals A1, ..., An, B1, ..., Bn of all the processors, and is set when the token cannot be assigned to a requesting processor; in the first state of the device, none of the signals A1, ..., An, B1, ..., Bn is set; in the second state, one of the signals A1, ..., An is set; and in the third state, none of the signals A1, ..., An is set and at least one of the signals B1, ..., Bn is set.
(3) The token management device according to (2), wherein, for each processor i, a signal Ci produced by the OR chain of the signals Ai and Bi is available, and the signal C common to the entire multiprocessor device is produced by the OR chain of the signals C1, ..., Cn of all the processors.
(4) The token management device according to (2), wherein, for each processor i, a signal is available that is set when processor i requests the token, and this signal is applied, together with the inverted signal C, to the inputs of an AND gate whose output is the signal Ai.
(5) The token management device according to (1), wherein the instruction to be processed serially consists of a first task executed on an initiating processor and dependent tasks executed on responder processors; the token must be assigned to the initiating processor for the initiating processor to be able to execute the first task; and the token is not assigned to a processor that requests it until all dependent tasks are complete.
(6) The token management device according to (5), wherein an instruction buffer is assigned to every processor of the multiprocessor device, and, during execution of the first task, the initiating processor writes commands and addresses that specify the dependent tasks to be executed on the responder processors into the instruction buffers of the responder processors.
(7) The token management device according to (5), wherein, during execution of the first task, the initiating processor writes commands and addresses that specify the dependent tasks to be executed on the responder processors into the instruction buffers of the responder processors by a forced broadcast operation.
(8) The token management device according to (5), wherein, during execution of the first task, the initiating processor writes commands and addresses into the instruction buffers of the responder processors, and the write to the instruction buffer of responder processor i also sets the signal Bi.
(9) The token management device according to (8), wherein the signal Bi is reset when the dependent task pending on responder processor i is complete.
(10) The token management device according to (5), wherein the dependent tasks are inserted and executed at interruptible points in the instruction flow of the responder processors.
(11) The token management device according to (5), wherein the multiprocessor device includes a memory; to address the memory, virtual addresses can be translated into real addresses using address translation tables; the results of address translations already performed are stored in address translation caches, each of which is assigned to one of the processors; the instruction to be processed serially is an instruction that invalidates a page table entry (IPTE); the initiating processor invalidates the page table entry during execution of the first task; and, during execution of the dependent tasks, the other processors invalidate the corresponding entries in the address translation caches assigned to them.
(12) The token management device according to (5), wherein the multiprocessor device includes a memory; to address the memory, virtual addresses can be translated into real addresses using address translation tables; the results of address translations already performed are stored in address translation caches, each of which is assigned to one of the processors; key information specifying the access rights of the individual processors to pages is stored in a part of the memory, the key storage; the instruction to be processed serially is an instruction that changes the key information (SSKE); the initiating processor changes the key information in the key storage during execution of the first task; and, during execution of the dependent tasks, the other processors change the key information in the address translation caches assigned to them.
(13) A process for serializing instructions that are to be processed serially in a multiprocessor device, wherein execution of one of the instructions to be processed serially comprises execution of a first task on an initiating processor; the initiating processor can execute the first task only if it holds a token; and the token can be assigned to only one of the processors, and only if it is available; the process comprising the steps of: the initiating processor requesting the token; assigning the token to the initiating processor if the token is available; executing the instruction to be processed serially; returning the token after the first task of the instruction to be processed serially is complete, whereby the token does not necessarily become available to other processors; and establishing the availability of the token after the instruction to be processed serially is complete.
(14) A process for serializing instructions to be processed serially according to (13), wherein one signal Ai is available for each processor i; the signal Ai ("token received") is set when the token is assigned to initiating processor i; and the signal Ai is reset when the token is returned after the first task is complete.
(15) A process for serializing instructions to be processed serially according to (13), wherein execution of the instruction to be processed serially consists of execution of the first task on the initiating processor and execution of dependent tasks on responder processors, and the availability of the token is established when all dependent tasks have ended.
(16) A process for serializing instructions to be processed serially according to (15), wherein one signal Bi is available for each processor i; the signal Bi is set when execution of a dependent task is pending on processor i; and the signal Bi is reset when execution of the dependent task is complete.
(17) A process for serializing instructions to be processed serially according to (15), wherein the dependent tasks are inserted and executed at interruptible points in the instruction flow of the responder processors.
(18) A process for serializing instructions to be processed serially, wherein an instruction buffer is assigned to each processor of the multiprocessor device, and when the first task is executed,
A process comprising the steps of not necessarily making the token available to other processors, and establishing the availability of the token after the serialized instruction is completed. (14) A process of serializing instructions to be serialized, wherein one signal Ai is available in each processor i and the token is assigned to the initiating processor i,
Signal Ai (token received) is set and signal A is returned after the token is returned after the first task is completed.
The process according to (13), wherein i is reset. (15) A process of serializing an instruction to be serialized, wherein the execution of the serialized instruction is performed by executing the first task on the initiating processor and executing the dependent task on the responder processor. The process of (13), wherein the process is configured and the availability of the token is established when all dependent tasks have been completed. (16) A process for serializing instructions to be serialized, wherein one signal Bi can be used for each processor i, and the signal Bi is set when execution of a dependent task is suspended in the processor i. , Wherein the signal Bi is reset when the execution of the dependent task is completed. (17) The process according to (15), wherein the serialized instruction is serialized, and the dependent task is inserted and executed at an interruptible point of an instruction flow of the responder processor. (18) A process of serializing instructions to be serialized, wherein an instruction buffer is allocated to each processor of the multiprocessor device, and when the first task is executed,
The process according to (15), wherein the initiating processor writes a command or an address specifying the dependent task executed by the responder processor to an instruction buffer of the responder processor. (19) A process of serializing an instruction to be serialized, wherein the responder.
The process according to (15), wherein the command or address specifying the dependent task to be executed by the processor is written into the instruction buffer of the responder processor by the broadcast forcing operation by the initiating processor. (20) A process of serializing an instruction to be serialized, wherein the signal Bi is the responder processor i
The process according to (18), wherein the process is also set by writing to the instruction buffer. (21) A process for serializing an instruction to be serialized, wherein the multiprocessor device includes a memory, and the conversion from the virtual address to the real address uses an address conversion table to address the memory. The result of the already executed address translation is stored in an address translation cache, wherein the address translation cache is assigned to each of the processors, and the serialized instruction stores the page table entry. Instruction to invalidate (IPTE), wherein the initiating processor executes the first
The process of (15), wherein the table entry is invalidated, and upon execution of the dependent task, another processor invalidates a corresponding entry in an address translation cache assigned to the other processor. (22) A process of serializing an instruction to be serialized, wherein the multiprocessor device includes a memory, and a conversion from a virtual address to a real address uses an address conversion table to address the memory. The result of the address translation that can be performed and has already been performed is stored in an address translation cache, wherein the address translation cache is respectively assigned to one of the processors;
The key information specifying the access right of each processor to the page is stored in a part of the memory, that is, the key storage area, and the instruction to be serially processed is an instruction to change the key information (SSKE). The initiating processor changes the key information in the key storage area during execution of the first task, and replaces the key information in the address translation cache assigned to the other processor with the other processor during execution of the dependent task. The process according to (15), wherein the processor changes the process. (23) A process of serializing an instruction to be serialized, wherein the address translation cache has an n-order relationship, and an instruction requiring more than n address translations is an instruction that invalidates a page table entry. The process according to (21), wherein the process is serialized by (IPTE). (24) A process of serializing an instruction to be serialized, wherein the token must be assigned to the processor i in order for the processor i to execute an instruction requiring address conversion more than n times. 23) The process according to the above.

[Brief description of the drawings]

FIG. 1 is a diagram showing how a virtual address is translated into a real address by means of a segment table and a page table.

FIG. 2 is a diagram showing the structure of a segment table entry.

FIG. 3 is a diagram showing the structure of a page table entry.

FIG. 4 is a diagram showing the time sequence of a command that invalidates a page table entry in a multiprocessor device, as implemented in the state of the art (quiesce method).

FIG. 5 is a flowchart of the steps according to the invention for executing the first task of a command consisting of a first task and dependent tasks, using the invalidation of a page table entry as an example.

FIG. 6 is a flowchart of the steps according to the invention for executing the first task of a command consisting of a first task and dependent tasks, using the invalidation of a page table entry as an example.

FIG. 7 is a flowchart of the steps according to the invention for executing a dependent task of a command consisting of a first task and dependent tasks, using the invalidation of a page table entry as an example.

FIG. 8 is a diagram showing the three states that the token can assume according to the invention.

FIG. 9 is a diagram of a basic circuit for managing the token.

FIG. 10 is a diagram showing another possible implementation of the basic circuit needed to manage the token.

FIG. 11 is a diagram showing the status of the circuit signals as a function of the current state of the token.

FIG. 12 is a diagram of a specific circuit for managing the token.

FIG. 13 is a diagram showing how a four-way associative address translation cache operates.

FIG. 14 is a diagram showing the restrictions to be observed when an instruction requiring fewer than n address translations is executed on a processor with an n-way associative address translation cache.

FIG. 15 is a diagram showing the restrictions to be observed when an instruction requiring more than n address translations is executed.

FIG. 16 is a diagram showing the status of the signals as a function of the state of the token for an instruction consisting of a first task only.

[Explanation of symbols]

600 Computer device
601 Processor
700, 710, 712, 906, 911 OR gate
802 Initial task
803 Request
808, 809 Dependent task
900, 922, 1017, 1020 AND gate
903, 907, 915, 920 Latch
913 Receiver
916 Inverter
918 Invalidation buffer
1000 Virtual address
1006, 1007, 1008, 1009, 1010 Array
1012, 1014 Real initial address
1015, 1018 Comparator
1016, 1019 Output signal
1021 Common invalid bit
1022 Address bus
1100 Instruction

Continued from the front page
(72) Inventor: Klaus Jörg Getzlaff, Freisenweg 26, D-71101 Schönaich, Germany
(72) Inventor: Erwin Pfeffer, Teckstrasse 12, D-71088 Holzgerlingen, Germany
(72) Inventor: Hans-Werner Tast, Hartmannstrasse 66, D-71093 Weil im Schönbuch, Germany

Claims

[Claims]

[Claim 1] A device for managing a token for serializing instructions that are processed serially in a multiprocessor device, wherein a processor can execute one of the serially processed instructions only if it holds the token, the token management device having: a first state in which the token is not assigned to any processor and can be assigned to a processor that requests it; a second state in which the token is assigned to one of the processors and cannot be assigned to a processor that requests it; and a third state in which the token is not assigned to any processor and cannot be assigned to a processor that requests it.

[Claim 2] The token management device according to claim 1, wherein the multiprocessor device comprises n processors (n being an integer of 1 or more); for each processor i, a signal Ai is available that is set while processor i holds the token; for each processor i, a signal Bi is available that is set while a task that has not yet been completely executed is pending in processor i, the task having to be serialized as part of the execution of one of the serially processed instructions, so that the token cannot be assigned to a processor that requests it; a signal C is available that is common to the entire multiprocessor device, is produced by an OR chain of the signals A1, ..., An, B1, ..., Bn of all the processors, and is set when the token cannot be assigned to a requesting processor; in the first state of the token management device, none of the signals A1, ..., An, B1, ..., Bn is set; in the second state, one of the signals A1, ..., An is set; and in the third state, none of the signals A1, ..., An is set and at least one of the signals B1, ..., Bn is set.
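As an illustration only, the relationship between the signals of claim 2 and the three states can be sketched in software. The names `TokenState`, `signal_c`, and `token_state` are hypothetical and do not appear in the patent, which describes a hardware circuit; the sketch assumes each signal is modeled as a Boolean.

```python
from enum import Enum


class TokenState(Enum):
    AVAILABLE = 1  # first state: no Ai and no Bi is set
    ASSIGNED = 2   # second state: one Ai is set
    BLOCKED = 3    # third state: no Ai is set, at least one Bi is set


def signal_c(a, b):
    """Common signal C: OR chain over A1..An and B1..Bn."""
    return any(a) or any(b)


def token_state(a, b):
    """Derive the token state from the per-processor signals Ai and Bi."""
    if len(a) != len(b):
        raise ValueError("a and b need one entry per processor")
    if sum(a) > 1:
        # Assumption of this sketch: at most one processor holds the token.
        raise ValueError("the token can be held by at most one processor")
    if any(a):
        return TokenState.ASSIGNED
    if any(b):
        return TokenState.BLOCKED
    return TokenState.AVAILABLE
```

In this model the token can be granted exactly when `signal_c` is false, which corresponds to the first state.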

[Claim 3] The token management device according to claim 2, wherein, for each processor i, one signal Ci is available that is produced by ORing the signals Ai and Bi, and the signal C common to the entire multiprocessor device is produced by an OR chain of the signals C1, ..., Cn of all the processors.

[Claim 4] The token management device according to claim 2, wherein, for each processor i, one signal is available that is set when processor i requests the token, and this signal is applied, together with the inverted signal C, to the inputs of an AND gate whose output is the signal Ai.
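The gate logic of claims 3 and 4 can be written out as Boolean functions. This is a minimal sketch with hypothetical function names; it models only the combinational logic and deliberately leaves out how simultaneous requests from several processors are arbitrated, which these claims do not specify.

```python
def per_processor_c(a_i, b_i):
    """Signal Ci of claim 3: OR of the signals Ai and Bi of processor i."""
    return a_i or b_i


def common_c(c_signals):
    """Signal C of claim 3: OR chain over C1..Cn of all processors."""
    return any(c_signals)


def grant(request_i, c):
    """AND gate of claim 4: Ai = (request of processor i) AND (NOT C)."""
    return request_i and not c


# Example: processor 1 still has a dependent task pending (B1 set),
# so a request from processor 0 is not granted.
a = [False, False]
b = [False, True]
c = common_c(per_processor_c(ai, bi) for ai, bi in zip(a, b))
granted = grant(True, c)
```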

[Claim 5] The token management device according to claim 1, wherein the serially processed instruction consists of a first task executed on an initiating processor and dependent tasks executed on responder processors; the token must be assigned to the initiating processor before the initiating processor can execute the first task; and the token is not assigned to any processor that requests it until all dependent tasks have been completed.

[Claim 6] The token management device according to claim 5, wherein an instruction buffer is allocated to every processor of the multiprocessor device, and, when executing the first task, the initiating processor writes a command or an address specifying the dependent task to be executed by the responder processor into the instruction buffer of the responder processor.

[Claim 7] The token management device according to claim 5, wherein, while executing the first task, the initiating processor writes a command or an address specifying the dependent task to be executed by the responder processor into the instruction buffer of the responder processor by a broadcast forcing operation.

[Claim 8] The token management device according to claim 5, wherein, when executing the first task, the initiating processor writes a command or an address into the instruction buffer of the responder processor, and writing to the instruction buffer of responder processor i also sets the signal Bi.

[Claim 9] The token management device according to claim 8, wherein the signal Bi is reset when the dependent task pending in responder processor i is completed.

[Claim 10] The token management device according to claim 5, wherein the dependent task is inserted and executed at an interruptible point in the instruction flow of the responder processor.

[Claim 11] The token management device according to claim 5, wherein the multiprocessor device includes a memory; to address the memory, a virtual address can be translated into a real address using an address translation table; the results of address translations already performed are stored in address translation caches, each of which is assigned to one of the processors; the serially processed instruction is an instruction that invalidates a page table entry (IPTE); the initiating processor invalidates the page table entry when executing the first task; and, when executing the dependent task, each other processor invalidates the corresponding entry in the address translation cache assigned to it.

[Claim 12] The token management device according to claim 5, wherein the multiprocessor device includes a memory; to address the memory, a virtual address can be translated into a real address using an address translation table; the results of address translations already performed are stored in address translation caches, each of which is assigned to one of the processors; key information specifying each processor's access rights to a page is stored in a part of the memory, namely the key storage area; the serially processed instruction is an instruction that changes the key information (SSKE); the initiating processor changes the key information in the key storage area when executing the first task; and, when executing the dependent task, each other processor changes the key information in the address translation cache assigned to it.

[Claim 13] A process for serializing instructions that are processed serially in a multiprocessor device, wherein the execution of one of the serially processed instructions comprises the execution of a first task on an initiating processor; the initiating processor can execute the first task only if it holds a token; and the token, when available, can be assigned to only one of the processors; the process comprising the steps of: requesting the token by the initiating processor; assigning the token to the initiating processor if the token is available; executing the serially processed instruction; returning the token after the first task of the serially processed instruction is completed, the return not necessarily making the token available to other processors; and establishing the availability of the token after the serially processed instruction is completed.

[Claim 14] The process according to claim 13, wherein one signal Ai is available for each processor i; the signal Ai (token received) is set when the token is assigned to the initiating processor i; and the signal Ai is reset when the token is returned after the first task is completed.

[Claim 15] The process according to claim 13, wherein the execution of the serially processed instruction consists of the execution of the first task on the initiating processor and the execution of dependent tasks on responder processors, and the availability of the token is established when all dependent tasks have been completed.

[Claim 16] The process according to claim 15, wherein one signal Bi is available for each processor i; the signal Bi is set while the execution of a dependent task is pending in processor i; and the signal Bi is reset when the execution of the dependent task is completed.
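The process of claims 13 to 16 can be traced with a small simulation. This is a sketch under stated assumptions, not the circuit of the patent: the class `TokenManager` and its method names are hypothetical, the signals Ai and Bi are modeled as Boolean lists, and requests are assumed to arrive one at a time.

```python
class TokenManager:
    """Software model of the signals Ai and Bi for n processors."""

    def __init__(self, n):
        self.a = [False] * n  # Ai: processor i holds the token
        self.b = [False] * n  # Bi: a dependent task is pending in processor i

    @property
    def c(self):
        # Signal C: set whenever the token cannot be assigned.
        return any(self.a) or any(self.b)

    def request(self, i):
        """Assign the token to processor i only if it is available."""
        if self.c:
            return False
        self.a[i] = True
        return True

    def post_dependent_task(self, initiator, responder):
        """First task: the token holder hands a dependent task to a responder."""
        if not self.a[initiator]:
            raise RuntimeError("only the token holder may post dependent tasks")
        self.b[responder] = True

    def return_token(self, i):
        """Ai is reset; the token stays blocked while any Bi is still set."""
        self.a[i] = False

    def complete_dependent_task(self, j):
        """Bi is reset once the dependent task in processor j has finished."""
        self.b[j] = False


mgr = TokenManager(3)
first_grant = mgr.request(0)           # processor 0 obtains the token
mgr.post_dependent_task(0, 1)
mgr.post_dependent_task(0, 2)
mgr.return_token(0)                    # first task done, token returned
blocked = mgr.request(1)               # refused: dependent tasks still pending
mgr.complete_dependent_task(1)
mgr.complete_dependent_task(2)
second_grant = mgr.request(1)          # token is available again
```

The refused request in the middle corresponds to the third state: the token has been returned, but it is not yet available to other processors.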

[Claim 17] The process according to claim 15, wherein the dependent task is inserted and executed at an interruptible point in the instruction flow of the responder processor.

[Claim 18] The process according to claim 15, wherein an instruction buffer is allocated to each processor of the multiprocessor device, and, when executing the first task, the initiating processor writes a command or an address specifying the dependent task to be executed by the responder processor into the instruction buffer of the responder processor.

[Claim 19] The process according to claim 15, wherein, when executing the first task, the initiating processor writes a command or an address specifying the dependent task to be executed by the responder processor into the instruction buffer of the responder processor by a broadcast forcing operation.

[Claim 20] The process according to claim 18, wherein the signal Bi is also set by writing to the instruction buffer of responder processor i.

[Claim 21] The process according to claim 15, wherein the multiprocessor device includes a memory; to address the memory, a virtual address can be translated into a real address using an address translation table; the results of address translations already performed are stored in address translation caches, each of which is assigned to one of the processors; the serially processed instruction is an instruction that invalidates a page table entry (IPTE); the initiating processor invalidates the page table entry when executing the first task; and, when executing the dependent task, each other processor invalidates the corresponding entry in the address translation cache assigned to it.
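The split of IPTE into a first task and dependent tasks, as recited in claim 21, can be sketched as follows. The data layout is an assumption made for illustration: the page table is a dictionary of entries with an `invalid` flag, each address translation cache is a per-processor dictionary, and the function names are hypothetical. Purging the initiator's own cache in the first task is likewise an assumption of this sketch; the claim only recites the invalidation of the page table entry there.

```python
def ipte_first_task(page_table, caches, initiator, page):
    """First task on the initiating processor: invalidate the page table entry.

    Returns the responder processors that still have to run a dependent task.
    """
    page_table[page]["invalid"] = True
    caches[initiator].pop(page, None)  # assumption: initiator purges its own cache
    return [j for j in range(len(caches)) if j != initiator]


def ipte_dependent_task(caches, responder, page):
    """Dependent task on a responder: invalidate the matching cache entry."""
    caches[responder].pop(page, None)


page_table = {0x10: {"frame": 7, "invalid": False}}
caches = [{0x10: 7}, {0x10: 7}, {}]    # one address translation cache per processor

pending = ipte_first_task(page_table, caches, initiator=0, page=0x10)
stale_before = 0x10 in caches[1]       # responder 1 still caches the old translation
for j in pending:
    ipte_dependent_task(caches, j, 0x10)
```

Until every dependent task has run, a responder could still translate through its stale cache entry, which is why the token stays blocked during that interval.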

[Claim 22] The process according to claim 15, wherein the multiprocessor device includes a memory; to address the memory, a virtual address can be translated into a real address using an address translation table; the results of address translations already performed are stored in address translation caches, each of which is assigned to one of the processors; key information specifying each processor's access rights to a page is stored in a part of the memory, namely the key storage area; the serially processed instruction is an instruction that changes the key information (SSKE); the initiating processor changes the key information in the key storage area when executing the first task; and, when executing the dependent task, each other processor changes the key information in the address translation cache assigned to it.

[Claim 23] The process according to claim 21, wherein the address translation cache is n-way associative, and instructions requiring more than n address translations are serialized with the instructions that invalidate page table entries (IPTE).

[Claim 24] The process according to claim 23, wherein the token must be assigned to processor i in order for processor i to execute an instruction requiring more than n address translations.