JPH0145648B2

Info

Publication number: JPH0145648B2
Application number: JP16784580A
Authority: JP
Inventors: Makoto Okada
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1980-11-28
Filing date: 1980-11-28
Publication date: 1989-10-04
Also published as: JPS5790757A

Description

[Detailed Description of the Invention] The present invention relates to a sort/merge processing method in which the group of records to be sorted and merged is divided into a plurality of blocks, a representative key is obtained for each block, and the sort/merge is carried out using these representative keys.

FIG. 1 shows a conventional sort/merge processing method, in which R denotes the input data record group, γ an input data record, and ST a string. In the conventional method, the input data record group R is first sorted and divided into a plurality of strings ST, as shown in FIG. 1A. Next, as shown in FIG. 1B, these strings ST are merged to create a plurality of longer strings ST. This merging is repeated until two strings ST have been produced, and these two strings are then merged into a single string, as shown in FIG. 1C.
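The conventional procedure of FIG. 1 can be sketched roughly as follows. This is a minimal in-memory illustration, not the patent's implementation: records are reduced to bare integer keys, and the names `make_strings`, `merge_pass`, `memory_capacity`, and `fan_in` are hypothetical.

```python
import heapq

def make_strings(records, memory_capacity):
    """Sort phase (FIG. 1A): cut the input into sorted strings of at most
    memory_capacity records each."""
    return [sorted(records[i:i + memory_capacity])
            for i in range(0, len(records), memory_capacity)]

def merge_pass(strings, fan_in=2):
    """One merge pass (FIG. 1B): merge each group of fan_in strings into
    one longer string."""
    return [list(heapq.merge(*strings[i:i + fan_in]))
            for i in range(0, len(strings), fan_in)]

def conventional_sort_merge(records, memory_capacity, fan_in=2):
    """Repeat merge passes until a single string remains (FIG. 1C).
    Returns the sorted records and the number of merge passes used."""
    strings = make_strings(records, memory_capacity)
    passes = 0
    while len(strings) > 1:
        strings = merge_pass(strings, fan_in)
        passes += 1
    return (strings[0] if strings else []), passes
```

Every merge pass touches every record once, whatever the initial order of the input, which is the cost the invention sets out to reduce.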

Normally, the strings produced by the sort processing of FIG. 1A are placed on an external storage device such as a direct access storage device, and their number is determined by the relationship between the number of input data records and the amount of main memory available for reordering. To merge the strings as in FIG. 1B, data is moved between the external storage device and the main storage device, and the total amount of data movement (the access amount) is determined by the relationship between the number of strings created in FIG. 1A and the above amount of main memory. Because access to the external storage device is far slower than access to main memory, the speed of sort/merge processing is determined almost entirely by the amount of access (the number of accesses) to the external storage device.
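As a rough illustration of how the access amount depends on the number of strings and on the available memory, the following sketch uses a simplified textbook cost model that is not taken from the patent. It assumes that every pass reads and writes each record exactly once and that memory limits how many strings can be merged at a time (the hypothetical parameter `fan_in`).

```python
def merge_passes(num_strings, fan_in):
    """Number of merge passes needed to reduce num_strings to one string
    when at most fan_in strings are merged at a time."""
    passes = 0
    while num_strings > 1:
        num_strings = -(-num_strings // fan_in)  # ceiling division
        passes += 1
    return passes

def estimated_record_transfers(num_records, memory_capacity, fan_in):
    """Estimated record transfers between main memory and external storage:
    one read and one write per record for the sort phase and for each
    merge pass (assumed model)."""
    num_strings = -(-num_records // memory_capacity)
    return 2 * num_records * (1 + merge_passes(num_strings, fan_in))
```

Under this model the transfer count does not depend on how well ordered the input already is.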

The present invention is based on the above considerations, and its object is to provide a sort/merge processing method that reduces the amount of access to the external storage device and thereby improves the speed of sort/merge processing. To that end, the sort/merge processing method of the present invention is a method in which a group of input data records stored in an input file is sorted and merged by a computer having a processing device, a main memory, and a work file on a direct access storage device, and the sorted and merged sequence of data records is stored in an output file, the method being characterized in that:
(a) the input data record group stored in the input file is read sequentially into the main memory and processed by the processing device, whereby the input data record group is divided into a plurality of blocks, each being a set of data records, and stored in the work file; a representative key is selected for each block according to ascending or descending order; attached information is created for each block from the address of the block in the work file and the representative key of the block; and the set of attached information thus created is stored in the work file;
(b) the set of attached information in the work file is read into the main memory and sorted by the processing device according to the representative keys, and the sorted sequence of attached information is stored in the work file; and
(c) blocks are read from the work file into the main memory in accordance with the sorted sequence of attached information; a data record sequence is generated in the main memory from each block; the data record sequence is divided in two on the basis of the representative key read next; a predetermined one of the two subsequences is stored in the output file; and the remaining subsequence is merged with the data records of the block read next, the result being generated in the main memory as new records.
The present invention is described below with reference to the drawings.

FIGS. 2A, 2B, and 2C illustrate one example of the sort/merge processing method of the present invention, and FIG. 3 shows the equipment configuration needed to carry out the invention.

In FIGS. 2 and 3, B denotes a block, AI attached information, 1 an input file, 2 a processing device, 3 a main memory, 4 a work file, and 5 an output file.

In the conventional sort/merge processing method, merging is performed string by string, whereas in the present invention merging is performed in units of the blocks that make up each string. The processing procedure is as follows.

The input data record group R is stored in the input file 1. As shown in FIG. 2A, the input data record group R is read sequentially into the main memory 3, the processing device 2 performs sort processing, blocking processing, and attached information creation processing, and the results are stored in the work file 4. Sort processing creates a plurality of strings, and blocking processing divides each string ST into a plurality of blocks B. A block is the unit of data transfer between the main storage device and the external storage device. Attached information creation processing creates, for each block B, attached information AI consisting of a representative key and the address of that block in the work file 4, and writes the attached information AI to the work file. The largest or the smallest of the keys of the records belonging to the block is selected as the representative key.
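The step of FIG. 2A might be sketched as follows. This is an illustration under stated assumptions, not the patent's implementation: records are bare integer keys, the work file is modelled as a Python list whose indices serve as block addresses, ascending order is assumed so the smallest key in each block is taken as the representative key, and the names `AttachedInfo`, `build_blocks`, `string_length`, and `block_size` are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class AttachedInfo:
    representative_key: int  # smallest key in the block (ascending order assumed)
    address: int             # index of the block in the work file

def build_blocks(records, string_length, block_size):
    """Sort the input into strings, cut each string into blocks, and create
    one item of attached information per block."""
    work_file = []  # stands in for the work file on direct access storage
    attached = []
    for s in range(0, len(records), string_length):
        string = sorted(records[s:s + string_length])   # sort processing
        for b in range(0, len(string), block_size):     # blocking processing
            block = string[b:b + block_size]
            work_file.append(block)
            # attached information creation: representative key plus address
            attached.append(AttachedInfo(block[0], len(work_file) - 1))
    return work_file, attached
```

Because each block is cut from a sorted string, its first record carries the smallest key, so no extra scan is needed to pick the representative key.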

After the sort processing, blocking processing, and attached information creation processing of FIG. 2A, the set of attached information AI is read from the work file 4 into the main memory 3, as shown in FIG. 2B, and sorted according to the representative keys to create a string of attached information AI, which is written back to the work file 4.
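The step of FIG. 2B sorts only the attached information, which is far smaller than the records themselves. A minimal sketch, assuming each item of attached information is represented as a hypothetical `(representative_key, block_address)` pair:

```python
from operator import itemgetter

def sort_attached_info(attached):
    """Return the attached information ordered by representative key.
    attached is a list of (representative_key, block_address) pairs."""
    return sorted(attached, key=itemgetter(0))
```

The resulting sequence fixes the order in which blocks are later read from the work file.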

After the attached information string has been created by the sort processing of FIG. 2B, the blocks B in the work file 4 are read sequentially into the main memory 3 in the order given by the string of attached information AI and merged, creating a string of data records γ in the main memory 3. The string in the main memory 3 is divided into two parts using the representative key of the first item of attached information in the attached information string remaining in the work file 4. One part is the string of data records whose keys precede that representative key, and the other is the string of data records that follow it. The former is immediately expelled from the main memory 3 as output records and stored in the output file 5. After the string of data records has been stored in the output file 5, if the next block B can be read from the work file 4 into the main memory 3, it is read in and merged as described above. If the main memory 3 has no space to read in the next block B, free space is secured by writing unnecessary records in the latter string of data records back to the work file 4. In that case, attached information is created for the block written back to the work file 4 in the manner of FIG. 2A, and this attached information is inserted at the correct position in the attached information string created as in FIG. 2B. Processing is complete when all blocks in the work file 4, including the written-back blocks, have been merged into the output record sequence.
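The block-wise merge of FIG. 2C might look roughly like the sketch below. It rests on several assumptions that the text does not settle: records are bare integer keys in ascending order, the representative key is the smallest key of its block, attached information is a `(representative_key, address)` pair, the records written back when memory is full are the ones with the largest keys, and `memory_capacity` is at least `block_size`. The function name `block_merge` and all parameter names are hypothetical.

```python
import bisect
import heapq

def block_merge(work_file, sorted_attached, memory_capacity, block_size):
    """Merge blocks in representative-key order.
    work_file: list of sorted blocks (appended to when records are written back).
    sorted_attached: (representative_key, address) pairs sorted by key.
    Returns (output records, block reads, block write-backs)."""
    pending = list(sorted_attached)
    memory, output = [], []
    reads = writes = 0
    while pending:
        incoming = work_file[pending[0][1]]
        # No room for the next block: write the largest in-memory records
        # back as a new block and insert its attached information in order.
        while memory and len(memory) + len(incoming) > memory_capacity:
            tail = memory[-block_size:]
            del memory[-block_size:]
            work_file.append(tail)
            bisect.insort(pending, (tail[0], len(work_file) - 1))
            writes += 1
        pending.pop(0)
        reads += 1
        memory = list(heapq.merge(memory, incoming))
        # Split on the next representative key: no unread record can have a
        # smaller key, so everything up to it can be output immediately.
        cut = bisect.bisect_right(memory, pending[0][0]) if pending else len(memory)
        output.extend(memory[:cut])
        memory = memory[cut:]
    return output, reads, writes
```

In this sketch each block is read once unless memory overflows and forces a write-back, so the number of extra accesses grows with how much the key ranges of the blocks overlap.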

As is clear from the above description, the sort/merge processing method of the present invention allows sort/merge processing to be carried out efficiently. With the conventional method, even when the input data records were already almost in order, processing took roughly as long as when they were not. With the sort/merge processing method of the present invention, processing can be completed in a very short time when the input data records are almost in order.

[Brief Description of the Drawings]

FIG. 1 shows a conventional sort/merge processing method, FIG. 2 illustrates one example of the sort/merge processing method of the present invention, and FIG. 3 shows the equipment configuration needed to carry out the invention.
R: input data record group; γ: input data record; ST: string; B: block; AI: attached information; 1: input file; 2: processing device; 3: main memory; 4: work file; 5: output file.

Claims

[Claims]
1. A sort/merge processing method in which a group of input data records stored in an input file is sorted and merged by a computer having a processing device, a main memory, and a work file on a direct access storage device, and the sorted and merged sequence of data records is stored in an output file, the method being characterized in that:
(a) the input data record group stored in the input file is read sequentially into the main memory and processed by the processing device, whereby the input data record group is divided into a plurality of blocks, each being a set of data records, and stored in the work file; a representative key is selected for each block according to ascending or descending order; attached information is created for each block from the address of the block in the work file and the representative key of the block; and the set of attached information thus created is stored in the work file;
(b) the set of attached information in the work file is read into the main memory and sorted by the processing device according to the representative keys, and the sorted sequence of attached information is stored in the work file; and
(c) blocks are read from the work file into the main memory in accordance with the sorted sequence of attached information; a data record sequence is generated in the main memory from each block; the data record sequence is divided in two on the basis of the representative key read next; a predetermined one of the two subsequences is stored in the output file; and the remaining subsequence is merged with the data records of the block read next, the result being generated in the main memory as new records.