JPH08255086A

JPH08255086A - Object code generation system

Info

Publication number: JPH08255086A
Application number: JP5762695A
Authority: JP
Inventors: Sadao Nakamura; 定雄中村
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 1995-03-16
Filing date: 1995-03-16
Publication date: 1996-10-01

Abstract

PURPOSE: To provide an object code generation system that generates object code whose number of execution processors can be varied, independent of the processor configuration. CONSTITUTION: A compiler that is applied to a parallel computer consisting of a plurality of processors, and that takes a source program as input and generates local code for each processor, is equipped with a local data generation unit 75 and a procedure body code generation unit 77. The local data generation unit 75 analyzes the input source program, extracts from the procedures described in it the parts that use values dependent on the processor configuration, including the processor number and size, and generates those values as variables. The procedure body code generation unit 77 replaces the values dependent on the processor configuration with the variables and generates the object code of the procedure body.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to an object code generation method suitable for a compiler that takes a source program as input and generates local code for each of the processors constituting a parallel computer, and in particular to an object code generation method capable of generating local code that does not depend on the processor configuration.

[0002]

[Prior Art] In the field of numerical computation, the Fortran language is widely used, and excellent optimizing compilers have been developed for sequential computers. In particular, the recently standardized Fortran 90 introduced array expressions and similar features, which has made optimization for vector computers easier. For distributed-memory parallel computers, however, many inconveniences remain.

[0003] To improve the performance of a distributed-memory parallel computer, it is necessary to raise the degree of parallelism by dividing the computation and to minimize communication overhead by reducing inter-processor communication as far as possible. How the data is partitioned and distributed is therefore important. However, because Fortran 90 inherits the conventional computer's notion of a memory area with contiguous addresses, optimization through partitioned distribution remains difficult.

[0004] For this reason, new parallel Fortran languages have been proposed in recent years that slightly modify the Fortran language specification and add new compiler directives for data partitioning and distribution (for example, HPFF, "High Performance Fortran Language Specification Vers 1.0", 1993).

[0005] The main feature of this parallel Fortran language is that the method of partitioning and distributing array data is embedded in the program as compiler directives. Guided by these directives, a compiler for this parallel Fortran language generates, from the source program, a program for each processing element (hereinafter, PE). The program for each PE is called local code. A compiler that partitions the array data in the source program across a plurality of PEs and generates local code for each PE is simply called a parallel compiler here.

[0006] The object code generation method of a conventional compiler is described with reference to FIGS. 11 to 16. The description assumes a distributed-memory parallel computer configured as shown in FIG. 11.

[0007] FIG. 11 shows the schematic configuration of a distributed-memory parallel computer. In FIG. 11, 11 denotes a host computer; 12, 13, 14, and 15 denote PEs; 16, 17, 18, and 19 denote the memories of PE12 to PE15, respectively; and 110 denotes an interconnection network connecting PE12 to PE15.

[0008] The local code generated by the compiler is broadcast from the host computer 11 to all of PE12 to PE15, stored in memories 16 to 19, and started simultaneously. The local program executed on all of PE12 to PE15 is the same, but each PE executes it differently because of conditional tests on the PE number.

[0009] FIG. 12 is a sample program written in the parallel Fortran language. In FIG. 12, (l31) declares two one-dimensional arrays A and B of size 10. (l32) and (l33) are compiler directives. A compiler directive has no direct bearing on the meaning of the program, but because it gives the compiler hints about how to compile, it can strongly affect the execution efficiency of the generated object code. (l32) declares a one-dimensional array P of three processors. (l33) instructs the compiler to distribute the one-dimensional arrays A and B declared in (l31) over the one-dimensional processor array P, and specifies a block size of 4 (four elements per block).

[0010] FIG. 13 shows the distribution of the one-dimensional arrays A and B that corresponds to this directive. As shown in FIG. 13, with a block size of 4 the one-dimensional arrays A and B of size 10 are divided into blocks of four elements each and distributed over the three PEs. However, because the remainder of dividing the array size 10 by the block size 4 is 2, the array held by the last PE has size 2.
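
The arithmetic behind this distribution can be checked with a small sketch. The Python below is illustrative only and is not taken from the patent or its figures; the function name `block_local_sizes` is an assumption introduced here.

```python
def block_local_sizes(array_size, block_size, num_pes):
    """Return the number of array elements each PE holds under block distribution."""
    sizes = []
    remaining = array_size
    for _ in range(num_pes):
        held = min(block_size, remaining)  # trailing PEs may get a short or empty block
        sizes.append(held)
        remaining -= held
    return sizes


print(block_local_sizes(10, 4, 3))  # [4, 4, 2]
```

For the sample directive (size 10, block size 4, three PEs), the last PE holds the two leftover elements, matching the layout described for FIG. 13.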

[0011] (l34) in FIG. 12 is an expression evaluated by a FORALL statement. With this FORALL statement, the FORALL expression is executed in parallel for the FORALL index I=1:9. The function MOD(...) in the array subscript of the left-hand-side variable in (l34) is an integer-valued function that takes no duplicate values over the range of the FORALL index I=1:9.

[0012] An outline of the operating procedure of a conventional parallel compiler is now given with reference to FIG. 14. First, at the start of compilation, the configuration 21 of the target processors is specified (in the new parallel compilers, by compiler directives). Then, in step S22, the compiler reads the source program 25, parses it, and converts it into an intermediate form 26, which is an internal representation. For example, the sample source program shown in FIG. 12 is parsed and converted into the intermediate representation.

[0013] Next, in step S23, the compiler applies various optimizing transformations to the internal representation 26 and generates a transformed intermediate form 27. For the sample program shown in FIG. 12, for example, the FORALL expression (l34) is split into communication parts and a computation part, as in (l56), (l57), and (l58) of FIG. 15.

[0014] The temporary arrays Temp1 and Temp2 that appear in (l56), (l57), and (l58) have the same size and the same processor distribution as array A. The computation in (l57) can therefore be executed without inter-processor communication.

[0015] (l56) in FIG. 15 is a shift communication that shifts the elements of array B to the right one at a time, whereas (l58) in FIG. 15 is a communication that permutes the elements of array Temp2 according to the function MOD(...). (l56) in FIG. 15 is further transformed into the form of (l59), in which the right-hand-side variable serves as the reference.

[0016] The transformations described above, from (l55) to (l59), (l510), and (l511), are applied to the intermediate representation 26 in step S23 of FIG. 14. Next, in step S24, the compiler scans the transformed intermediate form 27 and generates the object code 28 for each PE. For the sample program of FIG. 12, local code like that of FIG. 16 is generated as the object code.

[0017] In this example a C-language program is generated, but machine code could of course be generated instead. What matters for the present invention is the content of the code the compiler generates.

[0018] The content of the local code that a conventional compiler generates for the sample program of FIG. 12 is now described with reference to FIG. 16. Arrays A and B, defined in (l31) of the sample source program shown in FIG. 12, are divided into local arrays. (l66), (l67), and (l68) in FIG. 16 are the declarations of these local arrays.

[0019] As shown in FIG. 13, the local array held by the rightmost processor P(2) is smaller than those of the other processors, but here the local arrays are declared with the same size on every processor.

[0020] First, (l64) and (l65) in FIG. 16, which are executed at the very start, obtain the PE's own processor number and the size of the processor array. Next, 61 in FIG. 16 executes the shift communication corresponding to (l59) in FIG. 15.

[0021] The function rmt_write(...) in (l611) of FIG. 16 remotely writes sizeof(double) bytes of data starting at the address of b[3] to the region starting at the address of temp[0] on processor p+1. A remote write here means writing data directly into the memory of the destination processor by specifying a PE number and an address.
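
A rough simulation may help picture the shift communication and the remote write. The sketch below is not the code of FIG. 16: the per-PE memories are plain Python lists, and this `rmt_write` is a hypothetical stand-in that takes an array name and index rather than an address and a byte count. It assumes block size 4, three PEs, and 0-based local indices, so that `b[3]` is the last local element of a block.

```python
BLOCK = 4      # block size from the sample directive
NUM_PES = 3    # processors P(0)..P(2)

# One simulated local memory per PE; every PE declares full-size local arrays.
memory = [{"b": [0.0] * BLOCK, "temp": [0.0] * BLOCK} for _ in range(NUM_PES)]


def rmt_write(dest_pe, name, index, value):
    """Hypothetical stand-in for a remote write: store into another PE's memory."""
    memory[dest_pe][name][index] = value


def shift_right(pe):
    """Local code for one PE: shift b one element to the right into temp."""
    local = memory[pe]
    for i in range(BLOCK - 1):
        local["temp"][i + 1] = local["b"][i]                  # stays on this PE
    if pe + 1 < NUM_PES:
        rmt_write(pe + 1, "temp", 0, local["b"][BLOCK - 1])   # b[3] -> temp[0] on p+1


# Fill b with the global values 1..10 (PE 2 holds only two valid elements).
for g in range(1, 11):
    memory[(g - 1) // BLOCK]["b"][(g - 1) % BLOCK] = float(g)

for pe in range(NUM_PES):
    shift_right(pe)

print(memory[1]["temp"])  # [4.0, 5.0, 6.0, 7.0]
```

Only the element at the block boundary crosses between PEs; the other three elements of each block move within local memory.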

[0022] The function sync(0, 2) in (l612) of FIG. 16 starts barrier synchronization over the processor range 0 ≤ P ≤ 2 and returns control without waiting for the barrier to complete. The function wait(0, 2) in (l615) of FIG. 16, on the other hand, waits for the barrier synchronization over the processor range 0 ≤ P ≤ 2 to complete. Inserting code between sync() and wait() gives the barrier synchronization time to complete while that code runs.
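
The split between starting a barrier and waiting for it can be mimicked with threads. The class below is a one-shot sketch built on Python's `threading.Condition`; the class name `SplitBarrier` and the helper `run_pes` are assumptions for illustration. Unlike the functions described in the text, `sync()` and `wait()` here take no processor-range arguments.

```python
import threading


class SplitBarrier:
    """One-shot split-phase barrier: sync() signals arrival, wait() blocks."""

    def __init__(self, parties):
        self._parties = parties
        self._arrived = 0
        self._cond = threading.Condition()

    def sync(self):
        # Announce arrival and return immediately.
        with self._cond:
            self._arrived += 1
            if self._arrived == self._parties:
                self._cond.notify_all()

    def wait(self):
        # Block until every participant has called sync().
        with self._cond:
            self._cond.wait_for(lambda: self._arrived == self._parties)


def run_pes(num_pes=3):
    barrier = SplitBarrier(num_pes)
    results = [None] * num_pes

    def local_code(pe):
        barrier.sync()
        work = pe * pe          # independent work overlapped with the barrier
        barrier.wait()
        results[pe] = work

    threads = [threading.Thread(target=local_code, args=(pe,)) for pe in range(num_pes)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling `run_pes(3)` runs three simulated PEs; each does its independent work between the two calls, which is the overlap the paragraph describes.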

[0023] 62 in FIG. 16 is an example of local code generated for (l510) in FIG. 15. The computation in (l510) is carried out without inter-PE communication.

[0024] 63 in FIG. 16 is an example of local code generated for (l511) in FIG. 15. In 63, the corresponding global index is computed from the sender's PE number and local data; the left-hand-side index expression of the FORALL expression is then evaluated to obtain the global index of the communication partner, which is converted into the partner's PE number and local index.
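
The index conversions described here can be sketched for a block distribution as follows. This is an illustration under stated assumptions, not the code of FIG. 16: global indices are 1-based as in Fortran, PE numbers and local indices are 0-based, and all function names are introduced here. The index function passed to `route` is a placeholder supplied by the caller; the actual MOD(...) expression of the sample program is not reproduced.

```python
def local_to_global(pe, local, block_size, lower=1):
    """Global index of the element at a given local index on a given PE."""
    return lower + pe * block_size + local


def global_to_pe_local(g, block_size, lower=1):
    """PE number and local index that hold global index g."""
    offset = g - lower
    return offset // block_size, offset % block_size


def route(pe, local, index_fn, block_size):
    """Where an element must be sent: sender -> global -> target global -> (PE, local)."""
    g = local_to_global(pe, local, block_size)
    return global_to_pe_local(index_fn(g), block_size)


# Placeholder index function (i -> i + 1), chosen only to exercise the conversion.
print(route(0, 3, lambda i: i + 1, 4))  # (1, 0)
```

In the conventional local code these conversions are evaluated during execution, using a processor configuration fixed at compile time.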

[0025] As the conventional example above shows, the conventional compilation method must know at compile time how the array data is partitioned, so the configuration of the target processors must be specified before compilation.

[0026] The generated local code therefore runs only on the processor configuration specified at compile time. To change the number of processors used for execution, the source program must be recompiled. Such compilation, however, usually consumes large amounts of computing resources such as time and disk space, and the burden is considerable.

[0027]

[Problems to Be Solved by the Invention] As detailed above, the conventional compilation method must know at compile time how the array data is partitioned, so the configuration of the target processors must be specified before compilation.

[0028] Consequently, the generated local code runs only on the processor configuration specified at compile time, and the source program must be recompiled whenever the number of processors used for execution is to be changed.

[0029] The present invention has been made in view of these circumstances, and its object is to provide an object code generation method that allows the number of execution processors to be varied without recompilation and that generates efficient object code.

[0030]

[Means for Solving the Problems] The present invention is a compiler applied to a parallel computer comprising a plurality of processors, the compiler taking a source program as input and generating local code for each of the processors, characterized by comprising: means for analyzing the input source program, extracting from the procedures described in the source program the parts that use values dependent on the processor configuration, including the processor number and size, and turning these values into variables; and means for generating the object code of the procedure body with the values dependent on the processor configuration replaced by those variables.

[0031] The present invention is further characterized by comprising: means for extracting from the procedure the values determined by how the dummy arguments are distributed over the processors, and creating a data structure that defines the extracted values as its elements; and means for adding the data structure to the dummy arguments of the procedure and generating the object code of the procedure body with the values dependent on the processor distribution of the dummy arguments replaced by elements of the data structure passed as a dummy argument.

[0032] The present invention is further characterized by allocating an instance of the data structure for each procedure call pattern that differs in the size or processor distribution of the actual arguments, and generating code that makes the procedure call with that instance added as an argument.

[0033] The present invention is further characterized by comprising means for generating the code of an initialization procedure consisting of code that initializes the variables holding the values dependent on the processor configuration and, where instances of the data structure generated for each distinct procedure call pattern exist, code that initializes the elements of those structures.

[0034] That is, to solve the problems of the prior art, the present invention gives the parallel compiler the following functions. First, for the main program, subroutine, or function being compiled (hereinafter collectively called a procedure), the compiler according to the present invention has a function of identifying the values in it that are fixed at initialization time independently of the processor distribution of the dummy arguments.

[0035] The compiler according to the present invention has a function of identifying these values, which are fixed at initialization time without depending on the dummy arguments, and generating code in which they are replaced by variables. The present invention also identifies the values in a procedure that are determined by the processor distribution of the dummy arguments, and creates a structure type whose elements are these values. This structure is called a procedure reference structure.

[0036] For the procedure being compiled, the compiler according to the present invention has a function of generating code in which the identified values dependent on the processor distribution of the dummy arguments are replaced by elements of the procedure reference structure passed as a dummy argument.

[0037] The compiler according to the present invention has a function of generating code that calls the target procedure with the instance of the procedure reference structure generated for each distinct procedure reference pattern added to the arguments.

[0038] The compiler according to the present invention further has a function of generating the code of an initialization routine that initializes all values that are independent of the dummy arguments and fixed at initialization time, and that also initializes the instances of the procedure reference structure type that exist for each procedure reference pattern.

[0039] The code of this initialization routine need not merely assign constant initial values; it may be a sequence of code that computes the values of expressions using the processor array size and the PE number, both of which are known at initialization time.
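
As a sketch of what such an initialization routine might compute, the function below derives a few configuration-dependent values from the PE number and the processor array size. The function name, the dictionary keys, and the choice of values are assumptions for illustration; the array size and block size default to those of the sample program, and in a real run the PE number and processor count would come from the runtime rather than from arguments.

```python
def init_pe_values(pe, num_pes, array_size=10, block_size=4):
    """Compute once, at start-up, values that depend on the processor configuration."""
    start = pe * block_size                                   # 0-based offset of this PE's block
    local_size = max(0, min(block_size, array_size - start))  # valid elements held locally
    return {
        "pe": pe,
        "num_pes": num_pes,
        "local_size": local_size,
        "global_base": 1 + start,                 # global index of local element 0
        "has_right_neighbor": pe + 1 < num_pes,   # PE-number condition, evaluated once
    }


print(init_pe_values(2, 3)["local_size"])  # 2
```

Because these values are held in variables rather than compiled in as constants, the same code can be started on a different number of PEs.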

[0040]

[Operation] According to the object code generation method of the present invention, when a source program is input, the source program is first analyzed. From the procedures described in it, two kinds of parts are extracted: those that use values directly dependent on the processor configuration and the PE number, and those that use values dependent on the processor configuration through the processor distribution of the procedure's dummy arguments.

[0041] Values in a procedure that depend on the processor array size or the PE number include the correspondence between a global index and the PE number and local index when an array is partitioned across a plurality of PEs, the size of the partitioned local array, and so on.

[0042] Furthermore, the conventional local code shown in FIG. 16 contains a large number of conditional tests on the PE number; the values of these conditional expressions are also values that depend on the PE number.

[0043] Each of these values may or may not depend on the processor distribution of the dummy arguments. Values that depend on the processor configuration but are unrelated to the dummy arguments are turned into variables when the procedure's code is generated.

[0044] For values that depend on the processor configuration and on the processor distribution of the dummy arguments, a structure type whose elements are these values is generated. This structure type is called the procedure reference structure type.

[0045] In generating a procedure's code, the procedure reference structure type is added to the procedure's dummy arguments, and code is generated in which the values in the procedure that depend on the processor distribution of the dummy arguments are replaced by elements of the procedure reference structure passed as a dummy argument.

[0046] In the present invention, the size of the processor array is not specified at compile time. After the local code generated by the compiler is broadcast to the PEs, execution starts simultaneously, and each PE learns the processor array size and its own PE number in the initialization routine executed at the start.

[0047] Values in a procedure that do not depend on the dummy arguments but do depend on the processor configuration can be computed once the processor array size and the PE's own PE number are known. These values can therefore be computed in the initialization routine.

[0048] In general, a procedure has dummy arguments, and the values of data in the procedure or the distribution of that data over the processors may be determined indirectly by the distribution of the dummy arguments over the processors.

[0049] Normally the values of the dummy arguments themselves are not determined until run time, but how the actual arguments are distributed over the processors when the procedure is called can be determined at initialization time. The present invention therefore identifies all distinct procedure reference patterns and generates an instance of the procedure reference structure for each.

[0050] At initialization time, the processor distribution of the actual arguments is determined for each procedure call pattern, and so the processor distribution of the procedure's local variables that depend on the processor distribution of the dummy arguments is also determined for the time the procedure is called. That is, the values in a procedure that depend on the processor distribution of the dummy arguments can also be treated as values fixed at initialization time for each procedure reference pattern.

[0051] After obtaining the processor array size and the PE's own PE number, the initialization routine generated by the compiler of the present invention initializes any values in the program's procedures that depend on the processor configuration but not on the processor distribution of the dummy arguments.

[0052] Next, for a procedure with arguments, if there are values that depend on the processor distribution of the dummy arguments, that is, if the procedure has a non-empty procedure reference structure, the routine creates an instance of the procedure reference structure type for each distinct reference pattern of the procedure and initializes its values.

[0053] For a procedure call, the compiler of the present invention generates code that calls the target procedure with the instance of the procedure reference structure type corresponding to the call's reference pattern added to the arguments.
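
A call site in this style might look like the sketch below, in which one procedure is called with two differently sized and distributed arrays and therefore has two structure instances. All names (`ProcRef`, `make_ref`, `local_sum`, `init_refs`, `main_body`, and the pattern keys) and the second array's size and block size are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcRef:
    """Hypothetical procedure reference structure for one call pattern."""
    local_size: int


def make_ref(array_size, block_size, pe):
    start = pe * block_size
    return ProcRef(local_size=max(0, min(block_size, array_size - start)))


def local_sum(a_local, ref):
    """Procedure body: the valid local extent comes from the structure argument."""
    return sum(a_local[:ref.local_size])


def init_refs(pe):
    # Initialization time: one instance per distinct call pattern.
    return {
        "sum_size10_block4": make_ref(10, 4, pe),
        "sum_size6_block2": make_ref(6, 2, pe),
    }


def main_body(x_local, y_local, refs):
    # Each call passes the instance that matches its own call pattern.
    return (local_sum(x_local, refs["sum_size10_block4"])
            + local_sum(y_local, refs["sum_size6_block2"]))
```

With `refs = init_refs(2)`, the call `main_body([9, 10, 0, 0], [5, 6], refs)` sums only the two valid elements of each local array on PE 2.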

[0054] As a result, in the local code generated by the compiler of the present invention, the values that depend on the processor array size and the PE number can be fixed at the start of local code execution, which makes it possible to realize local code that is independent of the number of processors.

[0055] There may, however, be dynamic values that cannot be computed at initialization time. Such values must be computed at run time. The initialization routine of the present invention needs to be executed only once, at the start of the local procedure, so its processing time is negligible compared with the execution time of the whole program. The code quality of the initialization routine is therefore not an issue.

[0056] Moreover, the data area of the initialization routine can be released once the routine has finished, so it does not affect the run-time memory size. And because the initialization routine needs no code optimization for time or size, the processing time required to generate it is small.

[0057] In the conventional example, conditional tests on the PE number were performed inside the local code. In the present invention, by contrast, a procedure reference structure is generated for each procedure reference pattern, and the logical expressions involving the PE number are evaluated when the procedure reference structure is initialized, so they need not be evaluated during execution of the local code, and execution time is reduced. Because the present invention can compute at initialization time not only the values of these conditional expressions but every value that is computable at initialization time, run-time overhead can be greatly reduced.

[0058]

[Embodiments] Embodiments of the present invention are described below with reference to the drawings. The embodiments introduce two data structures, an array data structure type and an array reference structure type, as shown in FIG. 3.

[0059] The array data structure type is a data structure that characterizes how an array is partitioned and distributed. It has the following structure elements. First, it has a pointer to the data area of the local array itself; the actual data area is allocated at initialization time. In addition, it has the lower and upper index bounds of the array before partitioning, the size of the partitioned local array, the lower and upper bounds of the processor numbers over which the array is distributed, the block size of the partition, a global conversion table that gives the global index corresponding to a local index, a processor number table that gives the processor number corresponding to a global index, and a local conversion table that gives the local index corresponding to a global index.
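
The elements listed above can be sketched as a record type. The Python below is an illustration, not the layout of FIG. 3: the field names are assumptions, a list stands in for the pointer to the local data area, and the three tables are built for a simple block distribution with 0-based PE numbers and local indices.

```python
from dataclasses import dataclass


@dataclass
class ArrayData:
    data: list              # local array body, allocated at initialization
    index_lower: int        # index bounds of the array before partitioning
    index_upper: int
    local_size: int         # number of elements held by this PE
    pe_lower: int           # processor-number range of the distribution
    pe_upper: int
    block_size: int
    global_of_local: list   # local index  -> global index
    pe_of_global: dict      # global index -> processor number
    local_of_global: dict   # global index -> local index


def build_array_data(lower, upper, block_size, pe_lower, pe_upper, pe):
    """Fill the structure for one PE; assumes the PE range is large enough."""
    pe_of_global, local_of_global = {}, {}
    for g in range(lower, upper + 1):
        offset = g - lower
        pe_of_global[g] = pe_lower + offset // block_size
        local_of_global[g] = offset % block_size
    global_of_local = [g for g in range(lower, upper + 1) if pe_of_global[g] == pe]
    return ArrayData(
        data=[0.0] * block_size,
        index_lower=lower, index_upper=upper,
        local_size=len(global_of_local),
        pe_lower=pe_lower, pe_upper=pe_upper,
        block_size=block_size,
        global_of_local=global_of_local,
        pe_of_global=pe_of_global,
        local_of_global=local_of_global,
    )


a = build_array_data(lower=1, upper=10, block_size=4, pe_lower=0, pe_upper=2, pe=2)
print(a.local_size, a.global_of_local)  # 2 [9, 10]
```

Storing the conversions as tables filled in at initialization means the local code can look them up instead of recomputing them from a compiled-in processor count.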

[0060] The array reference structure type, in contrast, is a structure type that characterizes an array reference. It has the following members. First, it holds a pointer to the array data structure of the referenced array. It also holds the initial and final values of the local index used in the local array reference. Because an array is partitioned and distributed over multiple PEs, each PE can reference its own portion in parallel; the initial and final local index values describe the local array reference of each PE in that situation.
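The two structure types described above can be pictured as the following C declarations. This is only a sketch: FIG. 3 is not reproduced here, so every field name, the `double` element type, and the helper `array_ref_count` are assumptions introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical rendering of the array data structure type.
   All field names are illustrative, not taken from FIG. 3. */
typedef struct {
    double *data;          /* local array body; allocated at initialization */
    int global_lower;      /* index bounds of the array before partitioning */
    int global_upper;
    int local_size;        /* size of the partitioned local array */
    int pe_lower;          /* range of processor numbers holding the array */
    int pe_upper;
    int block_size;        /* block size of the partitioning */
    const int *to_global;  /* global conversion table: local -> global index */
    const int *to_pe;      /* processor number table: global index -> PE */
    const int *to_local;   /* local conversion table: global -> local index */
} ArrayData;

/* Hypothetical rendering of the array reference structure type. */
typedef struct {
    ArrayData *array;      /* array data structure of the referenced array */
    int local_first;       /* initial value of the local index on this PE */
    int local_last;        /* final value of the local index on this PE */
} ArrayRef;

/* Number of local elements a reference touches on this PE
   (0 when the range is empty, i.e. local_last < local_first). */
int array_ref_count(const ArrayRef *ref)
{
    assert(ref != NULL);
    int n = ref->local_last - ref->local_first + 1;
    return n > 0 ? n : 0;
}
```

Keeping the index range in a separate reference structure lets several references with different ranges point at one array data structure.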

[0061] FIG. 1 shows the schematic configuration of a compiler to which the object code generation system according to the present invention is applied. As shown in FIG. 1, the compiler 1 comprises an optimization conversion unit 2 and a code generation unit 3. The optimization conversion unit 2 has a procedure reference pattern analysis unit 72, a PE dependency analysis unit 73, and a procedure code conversion unit 74; the code generation unit 3 has a local data generation unit 75, an initialization routine code generation unit 76, and a procedure body code generation unit 77.

[0062] The first embodiment of the present invention is described below using the sample source program shown in FIG. 12 as an example. FIG. 2 is a block diagram of the local code that the compiler 1 according to this embodiment generates for the sample source program shown in FIG. 12.

[0063] In FIG. 2, 81 is the structure type declaration, which has already been described with reference to FIG. 3. Also in FIG. 2, 82 is the declaration of the local data, 83 is the code of the initialization routine, and 84 is the code of the procedure body; their details are shown in FIGS. 4, 5, and 7, respectively.

[0064] First, the compiler of this embodiment reads the source program shown in FIG. 12, parses it, converts it into intermediate code, and stores it in memory. Because the sample program shown in FIG. 12 contains no procedure references, the procedure reference pattern analysis unit 72 shown in FIG. 1 does not operate. Moreover, because the source program shown in FIG. 12 is a main program, it has no arguments and therefore no procedure reference structure type.

[0065] The PE dependency analysis unit 73 shown in FIG. 1 scans the intermediate representation described above and identifies every value that depends on the processor size and the PE number. The local data generation unit 75 shown in FIG. 1 then turns the identified values into variables. As a result, data declarations such as those shown in FIG. 4 are generated.

[0066] Next, the compiler according to this embodiment uses the initialization routine code generation unit 76 shown in FIG. 1 to generate the code of an initialization routine, which consists of code that initializes the local data generated above.

[0067] FIG. 5 illustrates an initialization routine generated by the compiler 1 according to this embodiment. An outline of the initialization routine shown in FIG. 5 follows.

[0068] First, at (l111) and (l112), the functions get_my_pnum() and get_psor_size() obtain the PE's own processor number and the size of the processor array.

[0069] Next, the array data structures are initialized. Initializing an array data structure includes creating the array data area at (l114) and creating the global conversion table at (l1111), the processor number table at (l1110), and the local conversion table at (l119).

[0070] As for the data area of the local array, the original array has size 10 and is divided, with a block count of 4, among psize processors, so the size of the local array is (10 + psize - 1)/psize (integer part). This value is, naturally, fixed at initialization time.
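The size expression above is an integer ceiling division. A minimal sketch in C, where the function name `local_array_size` and the generalization from 10 to an arbitrary element count `n` are assumptions made for illustration:

```c
#include <assert.h>

/* Size of each PE's local array when n elements are block-distributed
   over psize processors. For n = 10 this is the expression
   (10 + psize - 1) / psize given in the text. */
int local_array_size(int n, int psize)
{
    assert(n >= 0 && psize > 0);
    return (n + psize - 1) / psize;  /* C integer division truncates */
}
```

For example, 10 elements over 4 processors gives a local size of 3, so the last processor holds only one element that is actually in use.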

[0071] After this size is calculated at (l113), the actual data area is allocated with the malloc(...) function at (l114). The processor number table takes a global index of the array and yields the number of the processor that holds the corresponding array element. Similarly, the local conversion table yields the corresponding local array index from a global index of the array.

[0072] For block partitioning as in the example of FIG. 12, let the processor size be psize and the block size be nblock. The processor number p corresponding to global index I is then p = I/nblock (integer part), and the local index i is i = mod(I, nblock).
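The two mappings can be written directly in C. This is a sketch that assumes 0-based, non-negative global indices; the function names `owner_pe` and `local_index` are hypothetical.

```c
#include <assert.h>

/* Processor number p that holds global index I: p = I / nblock. */
int owner_pe(int I, int nblock)
{
    assert(I >= 0 && nblock > 0);
    return I / nblock;
}

/* Local index i of global index I on its owner: i = mod(I, nblock). */
int local_index(int I, int nblock)
{
    assert(I >= 0 && nblock > 0);
    return I % nblock;
}
```

With nblock = 3, global index 7 lives on processor 2 at local index 1.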

[0073] It is more efficient to hold p and i in tables prepared in advance than to compute them from these expressions each time. The processor number table and the local conversion table are provided for this reason.

[0074] At (l1110) and (l119), the initialization routine of FIG. 5 allocates the data areas of the processor number table and the local conversion table, computes their values with the expressions above, and initializes them. The global conversion table yields the corresponding global index from a local index. The contents of this table differ from PE to PE: with PE number p, the global index I corresponding to local index i is I = nblock * p + i.

[0075] At (l1111), the initialization routine of FIG. 5 allocates the data area of the global conversion table, computes its values with the expression above, and initializes it. The other array data structures are initialized in the same way. In this case the processor number table, local conversion table, and global conversion table are identical for all of the array data structures, so they can share the same data areas.
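The table setup described in these paragraphs might look roughly like the following. This is a sketch under stated assumptions, not the code of FIG. 5: the `IndexTables` type, its field names, and `build_index_tables` are hypothetical, and global indices are assumed to run from 0 to n-1.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    int *to_pe;      /* processor number table: global index -> PE, n entries */
    int *to_local;   /* local conversion table: global -> local, n entries */
    int *to_global;  /* global conversion table: local -> global on this PE,
                        nblock entries */
} IndexTables;

/* Allocate and fill the three tables for an array of n elements with
   block size nblock, on the PE numbered my_pe.
   Returns 0 on success, -1 if an allocation fails. */
int build_index_tables(IndexTables *t, int n, int nblock, int my_pe)
{
    assert(t != NULL && n > 0 && nblock > 0 && my_pe >= 0);
    t->to_pe = malloc((size_t)n * sizeof *t->to_pe);
    t->to_local = malloc((size_t)n * sizeof *t->to_local);
    t->to_global = malloc((size_t)nblock * sizeof *t->to_global);
    if (!t->to_pe || !t->to_local || !t->to_global) {
        free(t->to_pe);
        free(t->to_local);
        free(t->to_global);
        return -1;
    }
    for (int I = 0; I < n; I++) {
        t->to_pe[I] = I / nblock;     /* p = I / nblock */
        t->to_local[I] = I % nblock;  /* i = mod(I, nblock) */
    }
    /* I = nblock * p + i; on the last PE some entries may exceed n - 1. */
    for (int i = 0; i < nblock; i++)
        t->to_global[i] = nblock * my_pe + i;
    return 0;
}
```

Because the tables depend only on n, nblock, and the PE number, one `IndexTables` instance can be referenced by every array that has the same shape and distribution, which is the sharing the text describes.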

[0076] Initializing an array reference structure specifies, for each PE, the initial and final values of the local index used when referencing the local array. The sample program shown in FIG. 12 contains two FORALL index ranges, I=0:8 and I=1:9, as at (l59) and (l511) of FIG. 15. At (l1112), (l1113), and (l1114) of FIG. 5, the array reference structure xa, which corresponds to array references with FORALL index I=0:8, is initialized. The array reference structure xtemp, which corresponds to array references with FORALL index I=1:9, is initialized in the same way.
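One way to derive the per-PE initial and final local index from a FORALL range is to intersect the range with the block owned by the PE. The sketch below assumes 0-based block distribution; the `LocalRange` type and `forall_local_range` are hypothetical names, and the source does not show how FIG. 5 actually performs this step.

```c
#include <assert.h>

typedef struct {
    int first;  /* initial local index */
    int last;   /* final local index; the range is empty if last < first */
} LocalRange;

/* Local index range on processor pe for the global FORALL range lo:hi,
   with block size nblock. */
LocalRange forall_local_range(int lo, int hi, int nblock, int pe)
{
    assert(lo <= hi && nblock > 0 && pe >= 0);
    int base = nblock * pe;            /* first global index owned by pe */
    int top = base + nblock - 1;       /* last global index owned by pe */
    int g_first = lo > base ? lo : base;
    int g_last = hi < top ? hi : top;
    LocalRange r = { g_first - base, g_last - base };
    return r;
}
```

With nblock = 3, the range I=0:8 covers local indices 0 to 2 on processors 0 through 2 and is empty on processor 3, while I=1:9 starts at local index 1 on processor 0 and covers only local index 0 on processor 3.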

[0077] Conditional expressions that contain the PE number appear frequently in local code. The initialization routine shown in FIG. 5 computes the values of these conditional expressions in advance, once the PE number has been obtained. (l1115) corresponds to this computation.
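As an illustration of precomputing PE-dependent conditions, the sketch below evaluates two boundary tests once and stores the results as flags. The source does not say which conditions appear at (l1115); the two tests here, and the names `PeConditions` and `init_pe_conditions`, are assumptions chosen only as typical examples.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct {
    bool has_left_neighbor;   /* my_pe > 0 */
    bool has_right_neighbor;  /* my_pe < psize - 1 */
} PeConditions;

/* Evaluate the PE-dependent conditions once, at initialization. */
PeConditions init_pe_conditions(int my_pe, int psize)
{
    assert(psize > 0 && my_pe >= 0 && my_pe < psize);
    PeConditions c = { my_pe > 0, my_pe < psize - 1 };
    return c;
}
```

The procedure body can then test `c.has_left_neighbor` instead of re-evaluating an expression in the PE number on every pass.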

[0078] Executing the initialization routine of FIG. 5 generates the data structures shown in FIG. 6. In FIG. 6, 121 and 122 are the array reference type instances xa and xtemp2, respectively.

[0079] 123, 124, 125, and 126 are the array data structure instances a, b, temp1, and temp2, respectively. 127, 128, 129, and 1210 are the data areas of the local array bodies, each of which is an element of the corresponding array data structure instance. 1211, 1212, and 1213 are the local table, global table, and processor table, respectively; they are shared by the array data structure instances a, b, temp1, and temp2.

[0080] The procedure code conversion unit 74 of the compiler 1 according to this embodiment converts the intermediate representation of the procedure body so that values depending on the processor size and PE number refer to the local data shown in FIG. 4. The procedure body code generation unit 77 then generates code such as that shown in FIG. 7 for the procedure body.

[0081] The code shown in FIG. 7 performs exactly the same processing as FIG. 16, the code generated by the compiler described as the conventional example. However, the values that depend on the processor size and the PE number are computed once at initialization by the initialization routine shown in FIG. 6 and are treated as constants thereafter, so the code is simpler and execution efficiency improves accordingly.

[0082] Next, the second embodiment of the present invention is described with reference to the sample program shown in FIG. 8. This sample program performs the same computation as the sample program shown in FIG. 12, except that the computation of the FORALL expressions is placed in a subroutine.

[0083] In FIG. 8, (l156) declares that the dummy argument is an assumed-shape array as defined in the Fortran90 language specification (JIS X3001 Programming Language Fortran). This means that the size of each dimension of the dummy argument is taken over from the actual argument.

[0084] (l157) is the INHERIT compiler directive. It tells the compiler that the processor distribution of the dummy argument is the same as that of the actual argument.

[0085] Calling a procedure that has an assumed-shape array as a dummy argument requires an explicit procedure interface declaration on the calling side, but it is omitted in the example of FIG. 8. The compiler according to this embodiment generates the local code shown in FIG. 9 for the main program starting at (l151) of FIG. 8, and the local code shown in FIG. 10 for the subroutine starting at (l155) of FIG. 8. For brevity, only the parts that differ in content from FIGS. 4, 5, and 7, the local code generated in the first embodiment, are described here.

[0086] First, the procedure reference pattern analysis unit 72 of the compiler according to this embodiment identifies the distinct procedure reference patterns in the source program. The sample source program of FIG. 8 contains one subroutine call, so the procedure reference pattern analysis unit 72 associates one procedure reference structure with this procedure call.

[0087] 131 in FIG. 9, part of the local code generated by the compiler according to this embodiment, contains a pointer (l134) to an instance of the procedure reference structure. The instance itself is created in the initialization routine 132.

[0088] Within 132, the initialization routine init_xyz(...) corresponding to subroutine XYZ is called with, as actual arguments, the array reference structures that correspond to the dummy arguments of subroutine XYZ in the sample program. As a result, init_xyz(...) returns an initialized instance of the procedure reference structure.

[0089] The returned procedure reference structure is passed as an actual argument to the function xyz(...), which corresponds to the call of subroutine XYZ at (l154) of FIG. 8. Next, the code of FIG. 10, which the compiler according to this embodiment generates for subroutine XYZ, is described.

[0090] 141 in FIG. 10 is the declaration of the procedure reference structure, and 142 is the initialization procedure. The initialization procedure takes as dummy arguments the array reference structures corresponding to the dummy arguments of subroutine XYZ, and returns an initialized instance of the procedure reference structure. In this case, all values in the structure instance are determined at initialization.

[0091] 143, which corresponds directly to subroutine XYZ, takes the procedure reference structure as a dummy argument. In the generated code, every part of subroutine XYZ that depends on the processor configuration or on the dummy arguments is obtained from the procedure reference structure passed as that dummy argument.
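The division of labor between the initialization procedure and the procedure body can be sketched as follows. This is a simplified illustration, not the code of FIG. 10: the `ProcRef` fields are assumptions, `init_xyz` here takes plain integers rather than array reference structures, and the loop body in `xyz` is a placeholder because the FORALL computation of FIG. 8 is not reproduced in this text.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical procedure reference structure: everything the body needs
   that depends on the processor configuration. */
typedef struct {
    int local_first;  /* initial local index on this PE */
    int local_last;   /* final local index; empty range if last < first */
} ProcRef;

/* Simplified stand-in for init_xyz(...): computes, once, the local index
   range on processor my_pe for the global range lo:hi with block size
   nblock, and returns an initialized structure instance. */
ProcRef init_xyz(int lo, int hi, int nblock, int my_pe)
{
    assert(lo <= hi && nblock > 0 && my_pe >= 0);
    int base = nblock * my_pe;
    int top = base + nblock - 1;
    ProcRef ref = { (lo > base ? lo : base) - base,
                    (hi < top ? hi : top) - base };
    return ref;
}

/* Simplified stand-in for xyz(...): the body contains no PE number and no
   processor count; it reads only the structure it is given.
   The assignment is a placeholder computation. */
void xyz(const ProcRef *ref, double *a, const double *b)
{
    assert(ref != NULL && a != NULL && b != NULL);
    for (int i = ref->local_first; i <= ref->local_last; i++)
        a[i] = b[i] + 1.0;
}
```

Because `xyz` never mentions the processor count, the same compiled body can serve any number of processors; only the structure filled in at initialization changes.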

[0092] As this embodiment shows, every part that depends on the processor size and processor number is held in the procedure reference structure, and its values are set during initialization at the start of execution, after the local code has been loaded onto the PEs.

[0093] The present invention therefore allows a compiler to generate object code that does not depend on the number of PEs. Such object code is also useful for implementing libraries of parallel procedures. A user program may be run on various numbers of processors, and a library built for a fixed number of processors would have to be provided in a separate version for every possible processor count. With the present invention, by contrast, a single library that works for a variable number of processors covers every processor count, which greatly reduces the disk space needed for the library.

[0094]

[Effects of the Invention] As described in detail above, according to the present invention every part that depends on the processor size and processor number is held in a data structure whose values are set during initialization at the start of execution, after the local code has been loaded onto the PEs. Object code that does not depend on the number of PEs can therefore be generated.

[0095] Furthermore, according to the present invention, a single library for a variable number of processors covers every processor count, so the disk space needed for the library can be greatly reduced.

[Brief Description of the Drawings]

FIG. 1 is a schematic configuration diagram of a compiler to which the object code generation system according to the present invention is applied.

FIG. 2 is a configuration diagram of the local code generated for a source program according to an embodiment of the present invention.

FIG. 3 is a diagram illustrating the array data structure type and the array reference structure type according to an embodiment of the present invention.

FIG. 4 is a diagram illustrating the local data generated for a source program according to an embodiment of the present invention.

FIG. 5 is a diagram illustrating the initialization routine generated for a source program according to an embodiment of the present invention.

FIG. 6 is a diagram illustrating the procedure reference structure created when the initialization routine generated for a source program according to an embodiment of the present invention is executed.

FIG. 7 is a diagram illustrating the procedure body code generated for a source program according to an embodiment of the present invention.

FIG. 8 is a diagram showing a sample source program according to an embodiment of the present invention.

FIG. 9 is a diagram illustrating the local code generated for a source program according to an embodiment of the present invention.

FIG. 10 is a diagram illustrating the local code generated for a source program according to an embodiment of the present invention.

FIG. 11 is a schematic configuration diagram of a distributed memory parallel computer according to an embodiment of the present invention.

FIG. 12 is a diagram showing a sample source program.

FIG. 13 is a diagram showing how the array data in a source program is partitioned and distributed.

FIG. 14 is a diagram explaining the operating principle of a conventional compiler.

FIG. 15 is a diagram illustrating the code conversion that a compiler performs on a source program.

FIG. 16 is a diagram illustrating the local code generated by a conventional compiler.

[Explanation of Symbols]

1 ... Compiler, 2 ... Optimization conversion unit, 3 ... Code generation unit, 72 ... Procedure reference pattern analysis unit, 73 ... PE dependency analysis unit, 74 ... Procedure code conversion unit, 75 ... Local data generation unit, 76 ... Initialization routine code generation unit, 77 ... Procedure body code generation unit, 80 ... Local code, 81 ... Structure type declaration, 82 ... Local data declaration, 83 ... Initialization routine code, 84 ... Procedure body code.

Claims

[Claims]

1. An object code generation system for a compiler that is applied to a parallel computer comprising a plurality of processors and that takes a source program as input and generates local code for each of the processors, the system comprising: means for analyzing the input source program, extracting from the procedures described in the source program the parts that use values dependent on the processor structure, including the processor number and size, and converting those values into variables; and means for generating object code of the procedure body with the values dependent on the processor structure replaced by the variables.

2. The object code generation system according to claim 1, further comprising: means for extracting, from the procedure, values determined by how the dummy arguments are distributed over the processors, and creating a data structure that defines the extracted values as its elements; and means for adding the data structure to the dummy arguments of the procedure and generating object code of the procedure body with the values dependent on the processor distribution of the dummy arguments replaced by elements of the data structure passed as that dummy argument.

3. The object code generation system according to claim 2, wherein, for each procedure call pattern that differs in actual argument size or processor distribution, an instance of the data structure is allocated and code is generated that performs the procedure call with the instance added as an argument.

4. The object code generation system according to claim 1, 2, or 3, further comprising means for generating code of an initialization procedure consisting of code that initializes the variable-converted values dependent on the processor structure and, where instances of the data structure generated for each distinct procedure call pattern exist, code that initializes the elements of those structures.