JPH0696032A

JPH0696032A - Parallel processing system

Info

Publication number: JPH0696032A
Application number: JP24752392A
Authority: JP
Inventors: Teruo Tanaka; 輝雄田中; Yoshiko Tamaoki; 由子玉置; Yasuhiro Inagami; 泰弘稲上; Tadayuki Sakakibara; 忠幸榊原; Masanao Ito; 昌尚伊藤
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 1992-09-17
Filing date: 1992-09-17
Publication date: 1994-04-08

Abstract

PURPOSE: To enable efficient parallel processing by fetching shared data from a shared extended storage, which couples multiple subsystems, into the main storage of each subsystem with little transfer overhead. CONSTITUTION: Multiple subsystems, each consisting of a main storage 110, an instruction processor 180, an input/output processor 190, and a local extended storage 112, are connected through the shared extended storage 300 to form a parallel system. A means is provided for transferring data from the shared extended storage 300 directly to the local extended storage 112 by an asynchronous instruction. This data transfer has no effect on main storage access by the instruction processor 180. In addition, data transfer from the local extended storage 112 to the main storage 110 is faster than transfer from the shared extended storage 300 to the main storage 110.

Description

[Detailed Description of the Invention]

[0001]

[Field of Industrial Application] The present invention relates to loosely coupled multiprocessor systems, and in particular to speeding up processing that handles large-scale data.

[0002]

[Prior Art] In recent years, with the advent of supercomputers, the demand for scientific and engineering computation has grown, and the call for ever larger scale and ever higher speed knows no bounds. To address the former, larger scale, the Hitachi HITAC S-820 supercomputer, for example, includes within the computer system an extended storage device (local extended storage device) with a larger capacity than main storage. Large-scale data on this extended storage device is divided, and each partial data set is read into the main storage device, processed, and written back to the extended storage device. Repeating this makes it possible to perform computations on data that exceeds the capacity of the main storage device.

[0003] As for the latter, higher speed, not only is the speed of a single instruction processor being raised, but parallel processing, in which multiple instruction processors operate simultaneously, is also becoming common. As one configuration that extends this parallel processing further, Japanese Patent Laid-Open No. 1-78361 discloses a system that provides multiple subsystems, each consisting of a main storage device, one or more instruction processors sharing that main storage device, and one or more input/output processors, and connects the subsystems through an extended storage device (shared extended storage device). A group of data shared among the subsystems is held on this extended storage device, which enables parallel processing using the multiple subsystems.

[0004] Here, the extended storage device within each subsystem is called the local extended storage device, and the extended storage device that connects the subsystems is called the shared extended storage device. These extended storage devices can be accessed either by synchronous instructions executed by an instruction processor or by asynchronous instructions executed by an input/output processor.

[0005] The two types of extended storage device described so far are now compared. The latter, the shared extended storage device, can be accessed from every subsystem, but because it is implemented independently of the subsystems, it is only loosely coupled to the main storage device and instruction processors in each subsystem. The former, the local extended storage device, resides inside each subsystem and can be tightly coupled to the main storage device and the instruction processors of that subsystem, so it can be accessed faster than the shared extended storage device.

[0006]

[Problems to Be Solved by the Invention] When parallel processing is performed using multiple subsystems, data shared among the subsystems is placed in the shared extended storage. Each subsystem reads the shared data it needs from the shared extended storage, places it in main storage, performs arithmetic processing with the instruction processor, and writes the result back from main storage to the shared extended storage as shared data. FIG. 8(a) shows an example of the operation of each subsystem in this case. In FIG. 8(a), three steps are repeated: data transfer from the shared extended storage to main storage (CM), arithmetic processing (E), and data transfer from main storage to the shared extended storage (MC). The amount of processing required for each step in one iteration is assumed here, in relative terms, to be 1 unit, 8 units, and 1 unit, respectively.

[0007] FIG. 8(b) shows the case where parallel processing is performed between two subsystems, each having one instruction processor and one input/output processor. In this case, accesses to the shared extended storage from the two subsystems contend with each other and are processed serially. One iteration therefore takes 11 units of processing time, and the speedup obtained from the two subsystems is 1.5 times (= 8*2/11).
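The arithmetic in this paragraph can be checked with a short sketch. The scheduling model below is an assumption made for illustration (transfers to and from the shared extended storage run one at a time, while computation runs privately in each subsystem); the function names are hypothetical and do not come from the patent.

```python
def serialized_makespan(n_subsystems, cm=1, e=8, mc=1):
    """Elapsed time of one iteration when shared-storage transfers cannot overlap.

    Assumed model: CM transfers are served one subsystem at a time, each
    subsystem then computes for `e` units, and MC transfers are again served
    one at a time in the order in which the subsystems finish computing.
    """
    storage_free = 0
    compute_done = []
    for _ in range(n_subsystems):
        storage_free += cm                      # this subsystem's CM transfer
        compute_done.append(storage_free + e)   # its computation ends here
    for t in sorted(compute_done):
        storage_free = max(t, storage_free) + mc  # MC waits for the storage
    return storage_free


def speedup(compute_units, n_subsystems, iteration_time):
    """Useful computation done by all subsystems divided by elapsed time."""
    return compute_units * n_subsystems / iteration_time


t = serialized_makespan(2)
print(t, round(speedup(8, 2, t), 2))  # 11 1.45
```

Under this model two subsystems finish one iteration in 11 units, and 8*2/11 evaluates to about 1.45, which the text rounds to 1.5.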

[0008] FIG. 8(c) shows the case where the input/output processor accesses the shared extended storage using asynchronous instructions, so that the access overlaps with arithmetic processing. Two assumptions are made here: (1) an asynchronous instruction takes twice as long as a synchronous instruction; (2) because the asynchronous instruction accesses main storage, it interferes with main storage access by the instruction processor, and the processing time of the range in which arithmetic processing (E) and data transfer processing ((CM) or (MC)) overlap is stretched by a factor of 1.5. As a result, one iteration takes 10 units, and the speedup is 1.6 times (= 8*2/10). Simply changing the access to the shared extended storage from synchronous instructions to asynchronous instructions therefore yields almost no performance improvement.
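The text states the 10-unit result without showing how it is obtained. The sketch below is one reading that reproduces the figure, and it is an assumption rather than the patent's own derivation: the CM and MC transfers (1 unit each when synchronous) each take twice as long when asynchronous, and while a transfer overlaps with computation both advance at 1/1.5 of their normal rate. The function and parameter names are hypothetical.

```python
def overlapped_iteration(e=8.0, sync_transfer=2.0, async_factor=2.0, stretch=1.5):
    """Elapsed time of one iteration under the assumed overlap model."""
    transfer_work = sync_transfer * async_factor   # CM + MC as asynchronous work
    overlap_time = transfer_work * stretch         # wall-clock time while both run
    compute_during_overlap = overlap_time / stretch
    remaining_compute = max(e - compute_during_overlap, 0.0)
    return overlap_time + remaining_compute


t = overlapped_iteration()
print(t, 8 * 2 / t)  # 10.0 1.6
```

With these assumptions 4 units of transfer work occupy 6 units of elapsed time, during which 4 of the 8 compute units complete, and the remaining 4 compute units run at full speed, for 10 units in total.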

[0009]

[Means for Solving the Problems] To solve the above problem, (Means 1) a dedicated data transfer means is provided between the shared extended storage and the local extended storage, and the required shared data on the shared extended storage is transferred to the local extended storage by asynchronous processing executed by the input/output processor, without affecting the instruction processor's main storage access; and (Means 2) the data on the local extended storage is then transferred to the main storage device by asynchronous processing executed by the input/output processor.

[0010]

[Operation] With the above parallel processing, the data transfer time is overlapped with the arithmetic processing time. Contention with arithmetic processing for main storage access is eliminated entirely in (Means 1). In (Means 2), because the local extended storage can be accessed faster than the shared extended storage, the time during which contention occurs is shortened. The stretching of the computation time (E) can thus be kept to a minimum.

[0011]

[Embodiment] An embodiment of the present invention is described below with reference to the drawings.

[0012] FIG. 1 shows the configuration of a parallel processor system that is one embodiment of the present invention. In the figure, 100 and 200 denote subsystems, 110 main storage, 112 local extended storage, 130 the main storage control unit, 150 the local extended storage control unit, 170 to 172 selectors, 180 and 182 instruction processors, and 190 and 192 input/output processors. The internal configuration of subsystems 100 and 200 is identical. In addition, 300 denotes the shared extended storage and 310 the shared extended storage control unit. The number of subsystems connected to the shared extended storage 300 may of course be three or more. To distinguish the multiple extended storages, each is assigned an identifier, ESID. In this embodiment, ESID = 1 is assigned to the local extended storage 112 in subsystem 100, ESID = 2 to the local extended storage (not shown) in subsystem 200, and ESID = 9 to the shared extended storage 300.
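The ESID assignment of this embodiment can be written down as a small lookup table. This is a minimal sketch for illustration only: the dictionary layout and the function names are assumptions, and only the three ESID values and the units they identify come from the text.

```python
# ESID -> (kind of extended storage, owning subsystem or None for the shared one)
ESID_MAP = {
    1: ("local", 100),    # local extended storage 112 in subsystem 100
    2: ("local", 200),    # local extended storage in subsystem 200
    9: ("shared", None),  # shared extended storage 300
}


def resolve_esid(esid):
    """Return (kind, owning subsystem) for an ESID, or raise for an unknown one."""
    try:
        return ESID_MAP[esid]
    except KeyError:
        raise ValueError(f"unknown ESID {esid}") from None


def is_shared(esid):
    """True when the ESID names the shared extended storage."""
    return resolve_esid(esid)[0] == "shared"


print(resolve_esid(1), is_shared(9))  # ('local', 100) True
```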

[0013] The following describes, in order, the operation of the main storage control unit 120, the local extended storage control unit 150, and the shared extended storage control unit 310, each of which controls its storage device during data transfer between storage devices.

[0014] First, the operation of the main storage control unit 120 is described with reference to FIG. 2. This unit controls main storage 110 during data transfer between main storage 110 and the instruction processors 180 and 182 or the input/output processors 190 and 192, and during data transfer between main storage 110 and the local extended storage 150 or the shared extended storage 300.

[0015] In the figure, 121 and 122 denote registers that hold the identifier ESID of the extended storage that is the data communication partner, 123 to 130 AND gates, 131 to 134 priority circuits, 135 to 139 selectors, 140 and 141 +m counters, and 142 and 143 address generation units.

[0016] Main storage access requests from the instruction processors 180 and 182 and the input/output processors 190 and 192 arrive on lines L1 to L4. Each request includes information indicating whether the data communication partner is the requesting processor or an extended storage. Priority circuit 131 arbitrates requests for data communication with a processor. Priority circuit 132 arbitrates data transfer requests to the local extended storage. Priority circuit 133 arbitrates data transfer requests to the shared extended storage. The target extended storage is identified by comparison with the ESID number held in register 121 or 122. The outputs of priority circuits 131 to 133 switch selectors 137 to 139. The main storage addresses to be accessed are sent to selectors 137 to 139 from the instruction processors 180 and 182 and the input/output processors 190 and 192 via lines L5 to L8. When the data communication partner is an extended storage, the transfer is carried out in some aggregate unit (for example, an integer multiple of 4 KB) by repeating m-byte transfers multiple times. For data communication with an extended storage, the addresses used to access main storage must therefore be generated from the address information sent on lines L5 to L8 (for example, the start address of the transfer data and the amount of data to transfer). For the shared extended storage this is done by +m counter 140 and address generation circuit 142; for the local extended storage it is done by +m counter 141 and address generation circuit 143. The generated address is then sent to main storage through selector 135, which is controlled by priority circuit 134, and the data is transferred through selector 136, which is likewise controlled by priority circuit 134. Furthermore, when the request signal from lines L1 to L4 indicates a data transfer with the shared extended storage, the request is forwarded to the shared extended storage control unit via priority circuit 133 and line L15.
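The role of a +m counter paired with an address generation circuit can be sketched in software as follows. This is an illustration under assumptions: the patent does not give a value for m, so the 64-byte step in the usage line is hypothetical, as are the function and parameter names.

```python
def main_storage_addresses(start_address, transfer_bytes, m):
    """Yield the main storage address of each m-byte access in one transfer.

    `start_address` and `transfer_bytes` stand for the address information
    received on lines L5 to L8; the step `m` plays the part of the +m counter.
    """
    if m <= 0:
        raise ValueError("m must be positive")
    if transfer_bytes % m:
        raise ValueError("transfer size must be a multiple of m")
    for offset in range(0, transfer_bytes, m):
        yield start_address + offset


# One 4 KB block moved in hypothetical 64-byte steps needs 64 accesses.
addresses = list(main_storage_addresses(0x1000, 4096, 64))
print(len(addresses), hex(addresses[0]), hex(addresses[-1]))  # 64 0x1000 0x1fc0
```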

[0017] Next, a method is described by which the input/output processors 190 and 192 carry out asynchronous data transfer between the local extended storage 150 in subsystem 100 and the shared extended storage 300, independently of the arithmetic processing performed by the instruction processors 180 and 182.

[0018] Asynchronous data transfer is executed by the input/output processor 190 or 192, which is started by an input/output instruction such as the START SUBCHANNEL instruction executed by instruction processor 180 or 182. The input/output processor 190 or 192 performs input/output processing according to the commands in channel command words (CCWs). This input/output processing proceeds asynchronously, independently of instruction execution in the instruction processors 180 and 182.

[0019] The CCWs used for asynchronous data transfer between extended storages are the preparation CCW shown in FIG. 5(a) and the execution CCW shown in FIG. 5(b). The numbers 0 and 31 in the figure indicate bit positions.

[0020] First, the case of transferring data from the shared extended storage 300 to the local extended storage 112 is described.

[0021] In the preparation CCW (FIG. 5(a)), the command field (CMD) specifies, in this case, preparation for reading from the shared extended storage, and the CCW gives the address (ES-IDAW address) of an extended storage IDAW (Indirect Data Address Word) placed in main storage. This IDAW gives the absolute address in extended storage. FIG. 6 shows the IDAW format. In the IDAW, ESID is the identifier that distinguishes the extended storages; here ESID = 9, which indicates the shared extended storage 300. The extended storage block address is the absolute block address on the extended storage device. Data transfer between extended storages is performed in units of blocks (for example, 4 KB).

[0022] In the execution CCW (FIG. 5(b)), the command field CMD directs execution of the command in the preparation CCW, and the CCW also carries flags, the transfer block length, and an ES-IDAW address. In an ordinary CCW, the ES-IDAW address field holds a main storage address as the data transfer destination. Here, because the transfer is between extended storages, a bit in the flags specifies that this field holds an ES-IDAW address.
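The pairing of a preparation CCW with an execution CCW can be modelled with a few record types. This is a sketch under stated assumptions: the field names, the `"PREPARE_READ"` mnemonic, and the block addresses are hypothetical, no bit layout is implied, and the IDAW is held inline here whereas the CCW in the text carries the main storage address of the IDAW.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class EsIdaw:
    """Extended storage IDAW: which extended storage, and which block in it."""
    esid: int
    block_address: int


@dataclass(frozen=True)
class PrepareCcw:
    command: str     # hypothetical mnemonic for "prepare to read"
    es_idaw: EsIdaw  # side of the transfer named by the preparation CCW


@dataclass(frozen=True)
class ExecuteCcw:
    es_flag: bool               # flag bit: the address field holds an ES-IDAW
    block_count: int            # transfer block length
    target: Union[EsIdaw, int]  # ES-IDAW, or a plain main storage address


def plan_read(prep: PrepareCcw, exe: ExecuteCcw):
    """Return (source, destination, blocks) for a storage-to-storage read."""
    if not exe.es_flag or not isinstance(exe.target, EsIdaw):
        raise ValueError("destination is main storage, not an extended storage")
    return prep.es_idaw, exe.target, exe.block_count


prep = PrepareCcw("PREPARE_READ", EsIdaw(esid=9, block_address=0x40))
exe = ExecuteCcw(es_flag=True, block_count=8, target=EsIdaw(esid=1, block_address=0x10))
src, dst, blocks = plan_read(prep, exe)
print(src.esid, dst.esid, blocks)  # 9 1 8
```

Keeping the flag explicit mirrors the text: the same field is read as a main storage address in an ordinary CCW and as an ES-IDAW address only when the flag bit says so.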

[0023] The input/output processor 190 or 192 decodes the preparation CCW and the execution CCW, sends the amount of data to transfer (block length) and the extended storage address to the local extended storage control unit 132 via line L7, and sends the command (read or write) to the local extended storage control unit 132 via line L3 to start it.

[0024] The operation of the local extended storage control unit 132 in subsystem 100, as directed by these CCWs, is described with reference to FIG. 3.

[0025] In the figure, 151 denotes a register that holds the identifier ESID of the extended storage that is the data communication partner, 152 to 155 AND gates, 156 to 158 priority circuits, 159 to 162 selectors, 163 a +m counter, and 164 an address generation unit.

[0026] A data transfer request from the input/output processor 190 or 192 is sent to the local extended storage control unit 132 via line L3. Because this request is a data transfer with the shared extended storage, it is compared with the identifier ESID 151 and arbitrated by priority circuit 157, which controls selector 161, and it is then forwarded to the shared extended storage control unit via line L21.

[0027] The roles of address generation unit 164 and +m counter 163 are the same as in the main storage control unit 120.

[0028] Next, the shared extended storage 300, the data transfer destination, is described. The shared extended storage 300 is accessed from multiple subsystems. From subsystem #1, request signals are sent via line L21 or line L15, and address information for the shared extended storage 300 is sent via line L30. Data is transferred in both directions via line L31. Corresponding signal lines are provided from subsystem #2: request signals are sent via line L40 or line L41, address information is sent via line L42, and line L43 is the data transfer path. The operation of the shared extended storage control unit 310, which controls this shared storage 300, is described with reference to FIG. 4.

[0029] In the figure, 311 denotes a priority circuit, 312 and 313 selectors, 314 a +m counter, and 315 an address generation unit.

[0030] Data transfer requests from subsystem #1 or subsystem #2 are arbitrated by priority circuit 311, which controls selector 311. The roles of address generation unit 315 and +m counter 314 are the same as in the main storage control unit 120.

[0031] For a request from the local extended storage of subsystem #1, the request arrives via line L21 and the data is transferred via line L31.

[0032] While this series of data transfers between extended storages is in progress, the instruction processors 180 and 182 in subsystem 100 access the main storage device 110 through the main storage control unit 120. The data transfer between extended storages performed by the input/output processor 190 or 192 has no effect on the main storage access of instruction processor 180 or 182.

[0033] Next, within subsystem 100, an input/output processor started by a separate input/output instruction performs an asynchronous data transfer from the local extended storage 112 to main storage 110. The procedure is specified with a preparation CCW and an execution CCW as before. The only difference is that the ES-IDAW address field of the preparation CCW (FIG. 5(b)) is changed so that it represents an address in main storage. This asynchronous data transfer does affect access to the main storage device 110 by the instruction processors 180 and 182. However, the data transfer time from the local extended storage 112 to main storage 110 is shorter than the data transfer time from the shared extended storage 300 to main storage 110 (because the shared extended storage 300 must serve all subsystems), so the effect on each instruction processor's main storage access can be kept small.

[0034] Asynchronous data transfer from main storage 110 to the local extended storage 112 in subsystem 100, and from the local extended storage 112 to the shared extended storage 300, can be realized simply by changing the contents of the command field (CMD) of the preparation CCW and of the command field (CMD) and flag field of the execution CCW (that is, by reversing the direction of data transfer). In this case, data read from the local extended storage 112 is written to the shared extended storage 300 from the local extended storage control unit 150 via line L20, line L31, and the shared storage control unit 310. Line L20 and line L13, which is used when data is transferred from main storage 110 to the shared extended storage 300, are merged into line L31 through selector 172. Selector 172 is controlled by the request signal sent to the shared extended storage control unit 310 through the main storage control unit 120 and line L15, or by the request signal sent to the shared extended storage control unit 310 through the local extended storage control unit 150 and line L21. Selector 172 keeps down the number of ports on the shared storage control unit 310, which must connect to multiple subsystems. This data transfer between extended storages likewise does not affect the main storage access of the instruction processors 180 and 182.

[0035] So far, a method of realizing data transfer between extended storages with asynchronous instructions has been described, but the same processing can also be realized with a synchronous instruction executed by an instruction processor.

[0036] FIG. 7(a) shows the format of the synchronous instruction. The operation code (OP) indicates a data transfer between extended storages (MOVE). FIG. 7(b) shows the contents of the two registers specified by the instruction. Register (R1) and register (R2) each hold an extended storage control parameter address in main storage. Extended storage control parameter address 1 in main storage holds, in advance, the extended storage absolute address of the data transfer destination and the number of blocks to transfer. Extended storage control parameter address 2 holds, in advance, the extended storage absolute address of the data transfer source. These extended storage absolute addresses use the format of FIG. 6, as in the asynchronous case, and give the ESID of the target extended storage and the absolute address within it. Extended storage relative addresses may be used instead of these absolute addresses. In that case, address relocation is performed to convert the relative address into an ESID and an absolute address within that extended storage. This address relocation method is disclosed in Japanese Patent Laid-Open No. 2-77867.
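The effect of the synchronous MOVE can be sketched as a block copy driven by the two parameter areas. This is a minimal software model under assumptions, not the hardware mechanism: each extended storage is represented as a dictionary from block address to block contents, and all names and block addresses are hypothetical.

```python
from typing import Dict, Tuple

EsAddress = Tuple[int, int]            # (ESID, absolute block address)
Storages = Dict[int, Dict[int, bytes]]  # ESID -> {block address: block contents}


def move_blocks(storages: Storages,
                dst_param: Tuple[EsAddress, int],
                src_param: EsAddress) -> None:
    """Copy blocks between extended storages.

    `dst_param` stands for parameter area 1 (destination address and block
    count) and `src_param` for parameter area 2 (source address).
    """
    (dst_esid, dst_block), n_blocks = dst_param
    src_esid, src_block = src_param
    for i in range(n_blocks):
        storages[dst_esid][dst_block + i] = storages[src_esid][src_block + i]


# Shared extended storage (ESID 9) holds two 4 KB blocks; local (ESID 1) is empty.
storages = {1: {}, 9: {0x40: b"A" * 4096, 0x41: b"B" * 4096}}
move_blocks(storages, dst_param=((1, 0x10), 2), src_param=(9, 0x40))
print(sorted(storages[1]))  # [16, 17]
```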

[0037]

[Effects of the Invention] According to the present invention, data transfer between the local extended storage in a subsystem and the shared extended storage can proceed in parallel without affecting main storage access from the instruction processors. Consequently, when shared data on the shared extended storage is brought into main storage, it can be fetched from the local extended storage, which is faster than the shared extended storage. FIG. 8(c) shows the time chart for this case. One iteration takes 8.5 units, and the speedup improves to 1.9 times. By realizing direct data transfer between the local extended storage and the shared extended storage as in the present invention, the time is reduced by 22%, from 11 units to 8.5 units, compared with the case of FIG. 5(a). If everything other than the 8 units of actual computation time is regarded as overhead, this means that the overhead is cut by as much as 83%, from 3 units to 0.5 units.
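The percentages quoted above follow directly from the unit counts in the text, as the short check below shows. The helper name `reduction` is hypothetical.

```python
def reduction(before, after):
    """Fractional reduction when going from `before` to `after`."""
    return (before - after) / before


compute = 8.0                   # units of actual computation per iteration
baseline, improved = 11.0, 8.5  # elapsed units per iteration, from the text

print(f"time reduction:     {reduction(baseline, improved):.1%}")  # 22.7%
print(f"overhead reduction: {reduction(baseline - compute, improved - compute):.1%}")  # 83.3%
print(f"speedup:            {compute * 2 / improved:.2f}")  # 1.88
```

The text rounds these values to 22%, 83%, and 1.9 times.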

[Brief Description of the Drawings]

FIG. 1 is a configuration diagram of a parallel computer system, centered on the storage control units.

FIG. 2 is a detailed diagram of the main storage control unit.

FIG. 3 is a detailed diagram of the local extended storage control unit.

FIG. 4 is a detailed diagram of the shared extended storage control unit.

FIG. 5 is a diagram showing an example of the format of the data transfer commands (CCWs) used between extended storage devices.

FIG. 6 is a diagram showing an example of the format of an extended storage absolute address held in main storage.

FIG. 7 is a diagram showing the format of the synchronous data transfer instruction between extended storages and its structure.

FIG. 8 is a basic time chart of arithmetic processing and data transfer processing during parallel processing.

[Explanation of Symbols]

100, 200: subsystems; 110: main storage; 130: main storage control unit; 112: local extended storage; 150: local extended storage control unit; 180, 182: instruction processors; 190, 192: input/output processors; 300: shared extended storage; 310: shared extended storage control unit.

Continuation of front page
(72) Inventor: Tadayuki Sakakibara, c/o Central Research Laboratory, Hitachi, Ltd., 1-280 Higashi-Koigakubo, Kokubunji-shi, Tokyo
(72) Inventor: Masanao Ito, c/o Central Research Laboratory, Hitachi, Ltd., 1-280 Higashi-Koigakubo, Kokubunji-shi, Tokyo

Claims

[Claims]

[Claim 1] A parallel processing system comprising: a plurality of subsystems, each comprising a main storage device, at least one instruction processing device connected to the main storage device and executing instructions, at least one input/output device, and a local extended storage device composed of random access memory and connected as secondary storage of the main storage device; and a shared extended storage device composed of random access memory and shared among the subsystems, the parallel processing system being characterized by having data transfer means for transferring data directly between the local extended storage device and the shared extended storage device.

[Claim 2] The parallel processing system of claim 1, characterized in that an inter-extended-storage data transfer instruction that designates two extended storage devices is provided in order to transfer data directly between the local extended storage device and the shared extended storage device.