JP7133843B2 - Processing device, processing system, processing method, program, and recording medium

Info

Publication number: JP7133843B2
Application number: JP2018197597A
Authority: JP
Inventors: 輝久小松; 洋介大野; 元太郎森本; コウチョウ; 洋平小山; 真弘人泰地
Original assignee: RIKEN Institute of Physical and Chemical Research
Current assignee: RIKEN Institute of Physical and Chemical Research
Priority date: 2018-10-19
Filing date: 2018-10-19
Publication date: 2022-09-09
Anticipated expiration: 2038-10-19
Also published as: JP2023171462A; JP2020064560A; JP7376163B2; JP2022163069A

Description

The present invention relates to a processing device, a processing system, a processing method, a program, and a recording medium.

Dedicated computers for molecular dynamics simulation are known, and various techniques for accelerating simulations on such computers have been developed. For example, a technique is known in which a simulation space containing macromolecules and the like is divided into a plurality of cells for management (Patent Document 1). Even with conventional techniques, however, simulation speed is not always sufficient, and further acceleration is desired. As one example, conventional techniques handle operations such as particle movement between cells in software, and the resulting overhead remains large.
Patent Document 1: Japanese Patent Application Laid-Open No. 2006-236256

An object of the present invention is to provide a device and the like that speed up particle dynamics simulation using a dedicated computer.

To solve the above problem, a first aspect of the present invention provides a processing device, a processing system, a program, and a recording medium comprising: a particle data memory that stores particle data for each of a plurality of particles arranged in a space; a cell information memory that stores, in association with the cell number of each cell into which the space is divided, cell information indicating the storage location in the particle data memory allocated for storing the particle data of the particles in that cell; a processing unit that accesses the particle data of particles contained in a cell by specifying the cell number; and a memory controller that, in response to receiving an access to particle data that specifies a cell number, identifies the storage location of the target particle data in the particle data memory using the cell information associated with the specified cell number.

The above summary of the invention does not enumerate all necessary features of the present invention. Sub-combinations of these groups of features may also constitute inventions.

An example of how the space is shared among nodes in this embodiment.
An example of the division of a divided space into cells in this embodiment.
A block diagram of the processing device 10 in this embodiment.
An example of the contents stored in the memory 100 in this embodiment.
An example of memory processing in this embodiment.
Another example of memory processing in this embodiment.
Still another example of memory processing in this embodiment.
An example of the long-range force processing method in this embodiment.
An example of the long-range force processing method in this embodiment.
An example of the long-range force processing method in this embodiment.
An example of the long-range force processing method in this embodiment.
An example of the long-range force processing method in this embodiment.
An example of a mask pattern for the excluded-particle function.
An example of mask pattern identification information in this embodiment.
An example of a mask pattern in this embodiment.
An example of a high-exclusion mask table in this embodiment.
An example of a low-exclusion mask table in this embodiment.
An example of the oblique mode of the high-exclusion mask table in this embodiment.
An example of the oblique mode of the low-exclusion mask table in this embodiment.

The present invention is described below through embodiments of the invention, but the following embodiments do not limit the invention as defined in the claims. In addition, not all combinations of the features described in the embodiments are essential to the solution provided by the invention.

The processing system of this embodiment simulates the motion of particles in a space by calculating the forces acting between a plurality of particles arranged in that space. For example, the processing system uses a plurality of processing nodes to calculate the classical-mechanical forces, potential energies, and the like that arise in macromolecules containing a large number of atoms.

In this embodiment, each of the plurality of processing nodes is responsible for one divided space, obtained by dividing the space along each of a plurality of dimensions, and calculates the forces and potential energies within that divided space. The space handled by the processing system is the space in which the simulated particles are arranged (for example, a three-dimensional space), and is defined as a solid region of a predetermined size.

A particle may be an atom, a group of atoms, a molecule, an ion of any of these, or an electron that is the subject of a molecular dynamics simulation. One or more of mass, coordinates, velocity, acceleration, and charge may be assigned to each particle as particle data. In a molecular dynamics simulation, multiple forces act on each particle.

For example, covalent bonding forces, Coulomb forces, van der Waals forces, and the like act on the particles. Using the plurality of processing nodes, the processing system may alternately and repeatedly calculate the forces acting on the particles and update the positions of the particles. Instead of a molecular dynamics simulation, the processing system may perform a gravitational many-body simulation. In that case, the particles are point masses of large mass such as celestial bodies, and the forces acting on the particles are gravity and the like.

FIG. 1 shows an example of how the space is shared among the processing nodes in this embodiment. Each of the plurality of processing nodes in the processing system may be implemented as a purpose-designed dedicated chip (hereinafter also simply called a chip). For example, as shown in FIG. 1, the processing system may comprise 8 × 8 × 8 = 512 processing nodes (512 chips). Each processing node (each chip) may then be responsible for processing one of the 512 divided spaces into which the space is divided.

Each processing node may be responsible for calculating the forces on and positions of the particles present in its divided space. The processing nodes may be interconnected by a network and perform the communication needed for the calculation of the whole system. The processing nodes may be mounted on boards. For example, a 512-chip processing system may be built from 64 boards, each carrying 8 chips.
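As an illustration only, the mapping from a particle position to the processing node responsible for it might be sketched as follows. The cubic box, the periodic wrap, the function name `node_index`, and the row-major node numbering are all assumptions made for this sketch; the source does not specify them.

```python
# Hypothetical sketch: find the processing node that owns a particle position.
NODES_PER_AXIS = 8  # 8 x 8 x 8 = 512 nodes, as in the FIG. 1 example


def node_index(pos, box_length):
    """Return ((ix, iy, iz), flat_id) for a position in a cubic box.

    box_length, the periodic wrap, and the row-major flattening are assumptions.
    """
    coords = []
    for x in pos:
        # Wrap into [0, box_length) and find the slab along this axis.
        i = int((x % box_length) / box_length * NODES_PER_AXIS)
        coords.append(min(i, NODES_PER_AXIS - 1))  # guard against rounding up
    ix, iy, iz = coords
    flat_id = (ix * NODES_PER_AXIS + iy) * NODES_PER_AXIS + iz
    return (ix, iy, iz), flat_id
```

For an 80-unit box, `node_index((10.0, 25.0, 45.0), 80.0)` places the particle in node (1, 2, 4).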

FIG. 2 shows an example of the division of a divided space into cells in this embodiment. The processing system may use the cell index method to manage the space as a plurality of cells. A cell may be a space obtained by further dividing the divided space managed by each processing node. For example, as illustrated, a processing node may divide its divided space into 2 × 2 × 2 = 8 cells and manage them. This allows the processing system to manage charge and potential in the space at least partly on a per-cell rather than per-particle basis, improving computational efficiency.
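A minimal sketch of the cell index idea, assuming a cubic divided space and a row-major cell numbering (both are illustrative choices, as is the function name `cell_number`):

```python
# Hypothetical sketch: cell number of a particle inside one node's divided space.
CELLS_PER_AXIS = 2  # 2 x 2 x 2 = 8 cells per node, as in the FIG. 2 example


def cell_number(local_pos, sub_length):
    """Map a position local to the divided space (0 <= x < sub_length) to 0..7."""
    idx = [
        min(int(x / sub_length * CELLS_PER_AXIS), CELLS_PER_AXIS - 1)
        for x in local_pos
    ]
    # Row-major flattening of (ix, iy, iz); the ordering is an assumption.
    return (idx[0] * CELLS_PER_AXIS + idx[1]) * CELLS_PER_AXIS + idx[2]
```

Because particles are grouped by cell number, a neighbor search only needs to visit a cell and its adjacent cells instead of every particle in the space.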

As an example, the processing system may run a dynamics simulation of a molecule or group of molecules having 100 to 1,000,000 atoms. One side of a cell may be in the range of 1 to 100 nm.

FIG. 3 shows a block diagram of the processing device 10 in this embodiment. A processing node (and chip) in the processing system may be realized by the processing device 10 shown in FIG. 3. The processing device 10 has a memory 100, a processing unit 200, a memory controller 300, a network interface 400, and a convolution unit 600.

The memory 100 stores particle data, information on where the particle data is stored within the memory 100, and the like. The memory 100 includes a particle data memory 110, a cell information memory 120, a memory block information memory 130, a mask pattern memory 140, and a mask table memory 150.

The particle data memory 110, cell information memory 120, memory block information memory 130, mask pattern memory 140, and mask table memory 150 may be implemented in a single physical memory or in a plurality of physical memories.

The particle data memory 110 stores particle data for each of the plurality of particles arranged in the space. Particle data may be data indicating the physical state of a particle. For example, the particle data memory 110 may store, for each of a plurality of particles, a particle number, coordinates, mass, and charge as particle data. The particle number may include a relative number within the cell to which the particle belongs. In addition to or instead of this, the particle number may include an absolute number within the space.

The particle data memory 110 may further store information such as velocity and/or acceleration as particle data for the plurality of particles. Particle data may also include mask pattern identification information, described later, and attributes, described later. Furthermore, the particle data memory 110 may store data other than particle data.

The cell information memory 120 stores, in association with the cell number of each cell handled by the processing device 10, cell information indicating the storage location in the particle data memory 110 allocated for storing the particle data of the particles in that cell. For example, the cell information may indicate, for each cell number, a plurality of memory blocks within the particle data memory 110. As one example, the cell information may be an address translation table that converts a cell number into an address within the particle data memory 110.

As a result, specifying the cell number of a target particle and its particle number within the cell makes it possible to obtain the storage location in the particle data memory 110 that holds the particle data of that particle. The cell information memory 120 may store the number of particles in each cell as cell information. The cell information may also include attributes, described later.
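The address translation described above can be illustrated with a small sketch. The table layout, the record size, the particle count shown for cell D, and the name `particle_address` are hypothetical; the 16-particles-per-block figure and the block numbers follow the FIG. 4 example of this document.

```python
# Hypothetical sketch of cell-number-based address translation.
PARTICLES_PER_BLOCK = 16  # per the FIG. 4 example
RECORD_SIZE = 32          # assumed bytes per particle record

# Cell information: cell number -> allocated memory blocks and particle count.
cell_info = {
    "A": {"blocks": [1, 2, 3, 4], "count": 64},
    "D": {"blocks": [16], "count": 5},  # count chosen for illustration
}


def particle_address(cell, particle_no):
    """Translate (cell number, particle number within the cell) to an address."""
    info = cell_info[cell]
    if not 0 <= particle_no < info["count"]:
        raise IndexError("particle number out of range for this cell")
    block = info["blocks"][particle_no // PARTICLES_PER_BLOCK]
    slot = particle_no % PARTICLES_PER_BLOCK
    return (block * PARTICLES_PER_BLOCK + slot) * RECORD_SIZE
```

In this sketch the caller never sees block numbers: the blocks assigned to a cell can change without changing how the cell is addressed.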

The memory block information memory 130 stores memory block information for managing the used or unused memory blocks in the particle data memory 110. For example, the memory block information memory 130 may store, as memory block information, pairs consisting of an address corresponding to a memory block of the particle data memory 110 and an indicator showing whether the block is used or unused.
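One possible form of such used/unused bookkeeping is sketched below. The class name `BlockTable`, the first-fit search, and the `reserved` parameter are assumptions for illustration and are not taken from the source.

```python
# Hypothetical sketch of memory block information: one used/unused flag per block.
class BlockTable:
    def __init__(self, num_blocks, reserved=()):
        self.used = [False] * num_blocks
        for b in reserved:
            self.used[b] = True  # blocks that must never be handed out

    def allocate(self):
        """Return the lowest-numbered unused block and mark it used."""
        for b, in_use in enumerate(self.used):
            if not in_use:
                self.used[b] = True
                return b
        raise MemoryError("no unused memory block")

    def release(self, b):
        """Mark a block as unused again."""
        self.used[b] = False
```

A free list or bitmap would serve equally well; the sketch only shows that the allocator needs nothing beyond the per-block indicator.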

The mask pattern memory 140 stores a plurality of mask patterns that specify, for the calculation of the force and/or potential acting on a particle under calculation, whether each of the other particles is to be subject to at least one of high exclusion and low exclusion. For example, adjacent atoms in a molecule are joined by covalent bonds, and effects such as the Coulomb force and van der Waals force between such atoms may already be incorporated into the covalent bond.

In such cases, the processing device 10 uses a mask pattern to exclude effects such as the Coulomb force acting between adjacent atoms, so that the same force is not effectively counted twice. The exclusion of effects such as the Coulomb force and van der Waals force may be performed at a high level (for example, complete exclusion; designated "high exclusion") or at a lower level (for example, partial exclusion; designated "low exclusion"), and which of these applies may be specified by the mask pattern.

Which mask pattern each particle uses may be identified by the mask pattern identification information in the particle data stored in the particle data memory 110. Mask patterns are described in detail later.

The mask table memory 150 stores a mask table that specifies at least one of high exclusion and low exclusion between two or more particles under calculation and two or more other particles. The mask patterns stored in the mask pattern memory 140 can account for the exclusion of forces between particles whose particle numbers are relatively close. They may not, however, be able to account for the exclusion of forces between particles that are spatially close but far apart in particle number, such as adjacent atoms in a cyclic molecule or atoms joined by a disulfide bond in a large protein.

The processing device 10 therefore uses the mask table memory 150 to store, individually, designations such as high exclusion and low exclusion between particles or particle groups whose particle numbers are far apart, and can thereby account for the exclusion of forces between particles that are spatially close but far apart in particle number. The mask table is described in detail later.

The processing unit 200 calculates, for each of the plurality of particles arranged in the space, the force and electric potential exerted by each of the other particles, and calculates and updates the positions of the particles. The processing unit 200 may divide this calculation among several types of arithmetic circuits. For example, the processing unit 200 may include a plurality of pipelines 210, a plurality of cores 220, and a long-range unit 230. The processing unit 200 may include a general-purpose processor, local memory, other circuits, and the like as needed.

The plurality of pipelines 210 cooperate to calculate part of the forces and/or potentials acting between the particles. For example, the pipelines 210 cooperate to calculate the short-range Coulomb force and the van der Waals force acting between particles. Here, the short-range Coulomb force may be the Coulomb force acting within the space of one cell and its neighboring cells. In the following description, "the short-range Coulomb force and/or van der Waals force, and/or their potentials" is also referred to as "the short-range Coulomb force and the like".

The pipelines 210 may use the mask patterns stored in the mask pattern memory 140 to omit the calculation of the short-range Coulomb force and the like between some particles. For example, if "high exclusion" means complete exclusion, the pipelines 210 may skip the calculation of the short-range Coulomb force and the like between particles designated "high exclusion", or may set the results of those calculations to 0. If instead "high exclusion" means partial exclusion, the pipelines 210 may multiply the calculated short-range Coulomb force and the like between particles designated "high exclusion" by a predetermined coefficient.

The pipelines 210 may also use the mask patterns stored in the mask pattern memory 140 to attenuate the short-range Coulomb force and the like between some particles. For example, the pipelines 210 may multiply the calculated short-range Coulomb force and the like between particles designated "low exclusion" by a predetermined coefficient (for example, 1/2 or 1/4) that is larger than the coefficient used for "high exclusion". As an example, the processing unit 200 may include eight pipelines 210.
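The effect of the two exclusion levels on a pair interaction can be sketched as a scale factor applied to the computed force. The coefficient values below (0 for high exclusion, 1/2 for low exclusion), the unit constant `k`, and the names used are illustrative assumptions consistent with the examples in the text, not values fixed by the source.

```python
# Hypothetical sketch: scale a pair force according to its exclusion level.
HIGH, LOW, NONE = "high", "low", "none"

# High exclusion removes the interaction here; low exclusion attenuates it.
SCALE = {HIGH: 0.0, LOW: 0.5, NONE: 1.0}


def coulomb_force(qi, qj, r, exclusion=NONE, k=1.0):
    """Magnitude of the Coulomb force between two charges at distance r."""
    return SCALE[exclusion] * k * qi * qj / (r * r)
```

A hardware pipeline could equally skip the high-exclusion pairs altogether instead of multiplying by zero; both give the same result.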

The cores 220 calculate another part of the forces acting between the particles. For example, a core 220 calculates covalent bonding forces. A core 220 may have dedicated memory such as SRAM for uses such as instruction memory.

A core 220 may sum the forces acting on each particle, calculate the acceleration and velocity of each particle from the summed force and the mass, and then calculate the position of each particle (for example, its three-dimensional coordinates) at the next simulation time. Here, in addition to the covalent bonding forces it calculates itself and the short-range Coulomb force and the like calculated by the pipelines 210, the core 220 may include in the sum the long-range Coulomb force calculated by the convolution unit 600, described later. The core 220 may also calculate the electric potential of each particle.
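The update sequence above (sum the forces, then acceleration, velocity, and position) can be sketched for one particle as follows. The source does not name an integration scheme; the semi-implicit Euler step used here, and the function name `step`, are assumptions chosen for brevity.

```python
# Hypothetical sketch: one time step for a single particle.
def step(pos, vel, forces, mass, dt):
    """Sum the force contributions and advance position and velocity by dt.

    forces is a list of 3-vectors, e.g. covalent, short-range, and long-range
    contributions. Semi-implicit Euler is an assumed scheme.
    """
    total = [sum(f[k] for f in forces) for k in range(3)]
    acc = [f / mass for f in total]                       # a = F / m
    new_vel = [v + a * dt for v, a in zip(vel, acc)]      # v' = v + a dt
    new_pos = [x + v * dt for x, v in zip(pos, new_vel)]  # x' = x + v' dt
    return new_pos, new_vel
```

Production molecular dynamics codes commonly use velocity Verlet or leapfrog integration instead, which follow the same order of operations with better energy conservation.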

The plurality of cores 220 may share the cells handled by the processing device 10. For example, the processing unit 200 may include eight cores 220, and each core may be responsible for processing one or more cells in the space shown in FIG. 2. A plurality of cores may also be responsible for processing one cell. For example, one core 220 may calculate the forces acting on, and the positions of, the particles in cell A of FIG. 2, while another core 220 may do the same for the particles in cell B of FIG. 2. In addition to the cores 220 responsible for the cells, the processing unit 200 may separately include a core 220 for controlling the pipelines 210.

The pipelines 210 and cores 220 of the processing unit 200 access the particle data stored in the particle data memory 110 via the memory controller 300. The processing unit 200 may access the particle data of particles contained in a cell by specifying the cell number.

For example, a core 220 may write to the particle data stored in the particle data memory 110 via the memory controller 300. As one example, the core 220 may send the memory controller 300 a write request that specifies a cell number and a particle number within the cell and requests a write to the particle data.

The core 220 may also add a particle to the cell with a specified cell number via the memory controller 300 when a particle position is updated (for example, when a particle moves between cells). For example, the core 220 may send the memory controller 300 an add request instructing it to add a particle to the cell with the specified cell number. The core 220 may be realized in hardware or in software (that is, a program). In the latter case, the software (program) may be recorded on a recording medium (for example, volatile or non-volatile memory).

The long-range unit 230 calculates those forces acting between particles that are not calculated by the processing unit 200. For example, the long-range unit 230 cooperates with the convolution unit 600 to calculate the long-range Coulomb force acting between particles. As one example, the long-range unit 230 may calculate, by interpolation, the charges of the grid points contained in the divided space handled by the processing device 10 from the charges of the particles assigned to those grid points, the grid points being among those provided in the space. The long-range unit 230 may also calculate, by interpolation, the force and potential energy at each particle position from the potentials of the grid points. The operation of the long-range unit 230 is described in detail later.
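The two interpolation steps can be illustrated in one dimension. The source does not state which interpolation is used; the linear (two-point) weights, the periodic grid, the restriction to non-negative coordinates, and the function names below are assumptions for this sketch.

```python
# Hypothetical 1-D sketch of particle-to-grid and grid-to-particle interpolation.
def assign_charges(particles, num_points, spacing):
    """Spread each (position, charge) onto its two nearest grid points.

    Assumes non-negative positions and a periodic grid.
    """
    grid = [0.0] * num_points
    for x, q in particles:
        s = x / spacing
        i = int(s)   # grid point at or below the particle
        w = s - i    # fractional distance toward the next point
        grid[i % num_points] += q * (1.0 - w)
        grid[(i + 1) % num_points] += q * w
    return grid


def interpolate(grid_values, x, spacing):
    """Read a grid quantity (e.g. potential) back at a particle position."""
    n = len(grid_values)
    s = x / spacing
    i = int(s)
    w = s - i
    return grid_values[i % n] * (1.0 - w) + grid_values[(i + 1) % n] * w
```

Using the same weights in both directions keeps the scheme consistent: the total charge on the grid equals the total particle charge.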

The memory controller 300 mediates access from the processing unit 200 to the particle data and the like stored in the memory 100. In response to receiving an access to particle data that specifies a cell number, the memory controller 300 identifies the storage location of the target particle data in the particle data memory 110 using the cell information associated with the specified cell number. For example, the memory controller 300 may identify the storage location of the target particle data by referring to the cell information in the cell information memory 120.

The memory controller 300 may also write to particle data in the particle data memory 110 in response to a request from the processing unit 200. For example, in response to receiving a write request from the processing unit 200, the memory controller 300 may write the write data to the particle data at the storage location in the particle data memory 110 indicated by the cell information associated with the specified cell number and by the specified particle number within the cell.

The memory controller 300 may also add particle data to the particle data memory 110 in response to a request from the processing unit 200. For example, in response to receiving an add request from the processing unit 200, the memory controller 300 may add the particle data of the particle at the storage location in the particle data memory 110 indicated by the cell information associated with the specified cell number.

The memory controller 300 may also allocate unused memory blocks of the particle data memory 110 to cells. For example, when allocating a memory block to a cell, the memory controller 300 may use the memory block information stored in the memory block information memory 130 to select an unused memory block from among the memory blocks available in the particle data memory 110. If the particle data memory 110 stores data other than particle data, the memory controller 300 may exclude the memory blocks holding that data from allocation to cells.

Furthermore, the memory controller 300 may handle conflicting operations on the same cell from a plurality of processing units. For example, when the memory controller 300 receives a plurality of conflicting add requests for the same cell from a plurality of processing units, it may process each of the add requests atomically.
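A software analogy of atomic add requests is sketched below. The source describes a hardware memory controller; the Python lock, the class name `CellStore`, and the use of the returned index as the particle number within the cell are assumptions made purely to illustrate the serialization of competing requests.

```python
import threading


# Hypothetical sketch: serialize competing add requests to the same cell.
class CellStore:
    def __init__(self):
        self._lock = threading.Lock()
        self.cells = {}  # cell number -> list of particle records

    def add_particle(self, cell_no, data):
        """Append a particle to a cell and return its number within the cell.

        The lock makes "reserve a slot and store the record" one indivisible
        step, so two requesters never receive the same particle number.
        """
        with self._lock:
            records = self.cells.setdefault(cell_no, [])
            records.append(data)
            return len(records) - 1
```

Because the serialization happens at the point of access, the requesters need no synchronization among themselves.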

The network interface 400 mediates communication between the processing device 10 and the outside (for example, other processing devices 10). This allows the processing device 10 to communicate with another processing device 10 that handles a cell outside its own responsibility, and to obtain the particle data and the like of the particles in that cell from the other processing device 10.

The convolution unit 600 calculates those forces acting between particles that are not calculated by the processing unit 200. For example, the convolution unit 600 cooperates with the long-range unit 230 to calculate the long-range Coulomb force acting between particles. For example, the convolution unit 600 calculates the potentials of the grid points from the charges of the grid points by a convolution operation and provides them to the long-range unit 230. To reduce the amount of computation and the time needed for the long-range Coulomb force, the convolution unit 600 may use a technique in which particle charges are assigned to grid points.

For example, the convolution unit 600 may calculate, by a convolution operation, the potentials of the grid points contained in the divided space handled by the processing device 10 from the charges of the particles assigned to those grid points, the grid points being among those provided in the space. The convolution units 600 of the plurality of processing devices 10 may communicate with one another and convolve the values corresponding to the charge of each grid point along each of a plurality of axes in turn. The convolution unit 600 may be realized in hardware or in software (that is, a program). In the latter case, the software (program) may be recorded on a recording medium (for example, volatile or non-volatile memory). The processing of the convolution unit 600 is described in detail later.
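Convolving along each axis in turn can be sketched on a small grid held in nested lists. The sketch assumes a kernel that factorizes into identical, symmetric one-dimensional kernels and a periodic grid; the source does not give the kernel, and the function names are hypothetical. In the multi-node system each axis pass would also involve communication between nodes, which is omitted here.

```python
# Hypothetical sketch: apply a 1-D kernel along each of three axes in turn.
def convolve_axis(grid, kernel, axis):
    """Periodic 1-D convolution of a nested-list 3-D grid along one axis.

    The kernel is assumed symmetric, so no index reversal is needed.
    """
    n = (len(grid), len(grid[0]), len(grid[0][0]))
    half = len(kernel) // 2
    out = [[[0.0] * n[2] for _ in range(n[1])] for _ in range(n[0])]
    for i in range(n[0]):
        for j in range(n[1]):
            for k in range(n[2]):
                acc = 0.0
                for t, w in enumerate(kernel):
                    src = [i, j, k]
                    src[axis] = (src[axis] + t - half) % n[axis]
                    acc += w * grid[src[0]][src[1]][src[2]]
                out[i][j][k] = acc
    return out


def potential_from_charge(charge, kernel):
    """Convolve the grid charges along axis 0, then axis 1, then axis 2."""
    result = charge
    for axis in range(3):
        result = convolve_axis(result, kernel, axis)
    return result
```

For a kernel of width K on an N × N × N grid, three axis passes cost on the order of 3·K·N³ multiplications, compared with K³·N³ for a direct three-dimensional convolution.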

In this way, the plurality of processing devices 10 communicate with one another while computing the forces acting on particles, particle positions, and the like using dedicated hardware such as the pipelines 210 and the cores 220, and store the results at predetermined storage locations in the memory 100. Because the storage location of each particle is managed through the cell information memory 120 and the like, it is also possible to complete processing without rewriting the particle data memory 110 even when a particle moves between cells or a particle is added.

According to this embodiment, the memory controller 300 manages access by the processing unit 200 to the memory 100, so a plurality of particles can be managed on a per-cell basis without software synchronization among the pipelines 210 and the cores 220. In addition, because particle data can be read by cell number, the same pipeline commands and network commands can be reused even when particles move between cells.

FIG. 4 shows an example of the contents stored in the memory 100 in this embodiment. FIG. 4 shows cell A and cell D, which store up to 64 particles. The particle data memory 110 has a large number of memory blocks (hereinafter also referred to as MBs), of which memory blocks 1 to 4 can store the particle data of the 64 particles of cell A. Each memory block may store particle data for up to 16 particles. The particle data memory 110 may also store particle data for up to 16 particles of cell D in memory block 16.

The cell information memory 120 stores the storage locations of particles in the particle data memory 110 in association with cell numbers. For example, the cell information memory 120 stores, as cell information associated with cell A, an address corresponding to memory blocks 1 to 4 (for example, the start address of memory block 1).

This indicates that, of the particles belonging to cell A, the 1st to 16th correspond to memory block 1 of the particle data memory 110, the 17th to 32nd to memory block 2, the 33rd to 48th to memory block 3, and the 49th to 64th to memory block 4.
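
The lookup described above can be sketched as follows. This is a minimal illustration, not the patented circuit: the table `cell_info`, the function `locate`, and the assumption that a cell's memory blocks are numbered consecutively from its first block are all hypothetical.

```python
PARTICLES_PER_BLOCK = 16  # from the text: each memory block holds up to 16 particles

# Hypothetical cell information table: cell number -> first memory block of the cell.
cell_info = {"A": 1, "D": 16}


def locate(cell, index):
    """Return (memory block, slot) for the index-th particle (1-based) of a cell.

    Assumes the blocks of a cell are consecutive, which the text only implies.
    """
    first_block = cell_info[cell]
    offset = index - 1
    return first_block + offset // PARTICLES_PER_BLOCK, offset % PARTICLES_PER_BLOCK


first = locate("A", 1)    # particle 1 of cell A lies in memory block 1
middle = locate("A", 17)  # particle 17 of cell A lies in memory block 2
last = locate("A", 64)    # particle 64 of cell A lies in memory block 4
```

Because the caller names only the cell and the index, the physical block assignment can change without the caller being modified.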

Here, the particle data memory 110 and the cell information memory 120 may store the particle data in duplicate. For example, the particle data memory 110 may include a first area that stores particle data for a plurality of particles at the current simulation time and a second area that stores particle data for those particles at the next time. When particles are reassigned to cells as they move, particle data whose storage location is specified in the first area may be reallocated to a storage location in the second area.

The cell information memory 120 may likewise have a first set and a second set corresponding to the first area and the second area of the particle data memory 110. For example, the cell information memory 120 may store a first set of cell information associated with the plurality of cells and a second set of cell information associated with the plurality of cells. In the process of reassigning particles to cells as they move within the space, the processing unit 200 may reallocate the particle data of each particle, whose storage location is specified by the first set of cell information, to a storage location specified by the second set of cell information.

In FIG. 4, the first set of the cell information memory 120 indicates that the current particle data belonging to cell A is stored in MB1 to MB4 of the particle data memory 110 and that the current particle data belonging to cell B is stored in MB5 to MB8. The second set of the cell information memory 120 indicates that the next-time particle data belonging to cell A is stored in MB1' to MB4' of the particle data memory 110 and that the next-time particle data belonging to cell B is stored in MB5' to MB8'. Here, MB1' to MB4' are blocks different from MB1 to MB4, and MB5' to MB8' are blocks different from MB5 to MB8.
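
The two-set arrangement can be illustrated with a short sketch. The function `rebin`, the record layout, and the cell-selection rule are hypothetical; the point shown is only that the first set is read while all writes go to a separate second set.

```python
def rebin(current, cell_of):
    """Build the next-time cell table (second set) from the current one (first set).

    current: dict mapping cell number -> list of particle records.
    cell_of: hypothetical function returning the cell a particle now belongs to.
    The first set is only read; nothing in it is overwritten.
    """
    next_set = {cell: [] for cell in current}
    for particles in current.values():
        for particle in particles:
            next_set[cell_of(particle)].append(particle)
    return next_set


current = {
    "A": [{"id": 1, "x": 0.2}, {"id": 2, "x": 1.3}],
    "B": [{"id": 3, "x": 1.7}],
}
# Illustrative rule: cell A covers x < 1.0, cell B covers the rest.
next_set = rebin(current, lambda p: "A" if p["x"] < 1.0 else "B")
```

After the step completes, the roles of the two sets can be swapped, so no particle record has to be moved in place.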

The cell information memory 120 may store the number of particles in each cell. For example, as illustrated, the cell information memory 120 records that cell A contains 11 particles and cell B contains 16 particles. The memory block information memory 130 indicates whether memory block 1, memory block 2, and so on are in use or unused.

FIGS. 5 to 7 show examples of memory processing in this embodiment.

FIG. 5 shows an example of atomic addition of particles to a cell by the memory controller 300. Assume that a cell contains only particle A. Core 1 (corresponding to one of the cores 220 shown in FIG. 3) sends a particle addition instruction (Append instruction) to the memory controller 300. In response, the memory controller 300 accesses the particle data memory 110 and adds a new particle B to the cell.

Core 2 (corresponding to another of the cores 220 shown in FIG. 3) also sends a particle addition instruction (Append instruction) to the memory controller 300. In response, the memory controller 300 accesses the particle data memory 110 and adds a new particle C to the cell. The memory controller 300 may perform each such memory access atomically, that is, as a single indivisible operation.

In this way, the memory controller 300 controls the addition of particles to the memory. If each core accessed the memory and added particles on its own without going through the memory controller 300, the operations could conflict; according to this embodiment, the memory controller 300 avoids such conflicts.
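
As a software analogy only, the serialization of Append requests can be sketched with a lock. The class `MemoryController` and its `append` method are hypothetical stand-ins; the embodiment performs this in the memory controller hardware, not with a software lock.

```python
import threading


class MemoryController:
    """Toy stand-in for the memory controller: serializes Append requests."""

    def __init__(self):
        self._lock = threading.Lock()
        self.cells = {}  # cell number -> list of particle records

    def append(self, cell, particle):
        # The lock makes "find the next free slot, then write" one indivisible step.
        with self._lock:
            slots = self.cells.setdefault(cell, [])
            slots.append(particle)
            return len(slots) - 1  # slot index assigned to the new particle


controller = MemoryController()
controller.append(0, "A")  # the cell initially contains only particle A

# Two "cores" issue Append requests concurrently.
workers = [
    threading.Thread(target=controller.append, args=(0, name))
    for name in ("B", "C")
]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()
```

Whichever request arrives first, both particles end up in distinct slots, which is the property the text attributes to the memory controller.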

FIG. 6 shows an example of atomic arithmetic processing by the memory controller 300. Assume that arithmetic processing (for example, summing several kinds of force) is performed on a cell (for example, on a specific particle). Core 1 of the processing unit (corresponding to one of the cores 220 shown in FIG. 3) sends the memory controller 300 an update request (for example, an ACCUM instruction that requests accumulation) instructing that predetermined data be updated by an operation. On receiving the update request, the memory controller 300 accesses the particle data memory 110, performs the operation (for example, A+B), and updates the data with the result.

Core 2 (corresponding to another of the cores 220 shown in FIG. 3) also sends an update request (for example, an ACCUM instruction) to the memory controller 300. In response, the memory controller 300 accesses the particle data memory 110 and performs the operation (A+B+C). Each such operation may be performed atomically.

In this way, the memory controller 300 controls arithmetic processing on the memory. If each core read from and wrote to the memory on its own to perform the operation without going through the memory controller 300, the operations could conflict; according to this embodiment, the memory controller 300 avoids such conflicts.

FIG. 7 shows another example of atomic accumulation by the memory controller 300. The memory 100 may store an attribute indicating whether the particle data of each particle has been updated at the current simulation time. For example, the attribute may be associated with each of the plurality of particles and indicate whether predetermined data within that particle's particle data has been updated at the current simulation time. The attribute may be held in at least one of the particle data of each particle stored in the particle data memory 110 and the cell information corresponding to each particle stored in the cell information memory 120.

When predetermined data within the particle data of a particle is accessed, the memory controller 300 may select, based on the attribute, whether to use the value recorded in the predetermined data or an initial value.

For example, in FIG. 7, the particle data memory 110 initially stores data A and attribute T-1 for the target particle. At the next time T, core 1 (corresponding to one of the cores 220 shown in FIG. 3) sends an update request (for example, an ACCUM instruction) to the memory controller 300.

On receiving the update request, the memory controller 300 applies the operation to the value recorded in the predetermined data and updates that data if the predetermined data has already been updated at the current simulation time. If the predetermined data has not been updated at the current simulation time, the memory controller 300 applies the operation to the initial value and updates the predetermined data with the result.

For example, the memory controller 300 compares the attribute T-1 of the target particle with the current time T and determines that they do not match. It therefore accesses the particle data memory 110, discards data A for the target particle, and updates the data with the initial value B contained in the ACCUM instruction. At the same time, the memory controller 300 updates the attribute of the target particle from T-1 to T.

Core 2 (corresponding to another of the cores 220 shown in FIG. 3) then sends an update request (for example, an ACCUM instruction) to the memory controller 300. The memory controller 300 compares the attribute T of the target particle with the current time T and determines that they match. It therefore uses the stored data B of the target particle and accesses the particle data memory 110 to perform the accumulation (B+C).

According to the embodiment shown in FIG. 7, the memory controller 300 resets the accumulated result to the initial value (for example, 0) depending on the value of the attribute. If each core 220 accessed the memory 100 directly to perform accumulation (for example, summing several kinds of force), the first core 220 to access the memory 100 would have to write the initial value 0 (also called zero clearing), and such processing takes time. According to this embodiment, by contrast, the memory controller 300 does not need to zero-clear the memory and can perform accumulation with the initial value set to 0 according to the value of the attribute. When the particle data memory 110 stores data other than particle data, the attribute may be applied to that data as well.
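
The attribute check of FIG. 7 can be sketched as follows. The class `AccumMemory`, its `accum` method, and the use of an integer time step as the attribute are assumptions made for illustration; the text does not specify the encoding of the attribute.

```python
class AccumMemory:
    """Toy model of ACCUM with a per-entry time-step attribute."""

    def __init__(self):
        self.value = {}      # key -> stored data
        self.attribute = {}  # key -> time step of the last update

    def accum(self, key, operand, now, initial=0.0):
        # Stale entry: ignore the stored value and start from the initial value.
        fresh = self.attribute.get(key) == now
        base = self.value[key] if fresh else initial
        self.value[key] = base + operand
        self.attribute[key] = now
        return self.value[key]


memory = AccumMemory()
memory.value["p"], memory.attribute["p"] = 9.0, 0  # data A left over from time T-1 = 0
after_core1 = memory.accum("p", 2.0, now=1)  # attribute differs: A is discarded, result is B
after_core2 = memory.accum("p", 3.0, now=1)  # attribute matches: result is B + C
```

No explicit zero-clearing pass is needed between time steps, because a stale attribute has the same effect as a cleared entry.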

FIGS. 8 to 12 show an example of a method for processing long-range forces (for example, the long-range Coulomb force) in this embodiment. For example, the processing device 10 (1) performs charge assignment, which approximately assigns the charge of each particle in the space to a relatively small number of grid points provided in the space, (2) performs potential calculation, which computes the potential at the grid points, and (3) then performs back interpolation, which computes the force on each particle from the potentials at the grid points. Such a technique is also known as Particle Mesh Ewald (PME). In this embodiment, the long-range unit 230 performs (1) and (3), and the convolution unit 600 performs (2).

Here, (2) the potential calculation is performed by solving Poisson's equation with the grid charges. Poisson's equation is solved by multiplying the charge at each grid point by a coefficient (kernel) that depends on the distance between grid points and summing, by convolution, the contributions from all grid points within a predetermined range. The convolution may be carried out by adding the result of a dense short-range convolution to that of a sparse long-range convolution. Such a technique is known as the Multilevel Summation Method (MSM; Skeel et al., 2002; Hardy et al., 2016).

FIG. 8 shows an example of a solution by MSM. For explanation, the following figures illustrate grid points arranged in two dimensions (along two axes), but in practice grid points arranged in three dimensions (along three axes) may be used.

As illustrated, the convolution unit 600 may perform the convolution at, for example, three resolution levels, level 1 to level 3. To calculate the potential at a target grid point (denoted T in the figure), the convolution unit 600 first calculates the contribution to the target grid point T from the neighboring grid points 66 at level 1, the densest level.

Next, the convolution unit 600 calculates the contribution to the target grid point T from the neighboring grid points 64 at level 2, the intermediate level. It further calculates the contribution to the target grid point T from the neighboring grid points 62 at level 3, the sparsest level. The convolution unit 600 may calculate the potential at the target grid point T by summing the contributions at levels 1 to 3.

Each grid point at any of levels 1 to 3 may correspond to a cell. For example, each level-3 grid point may correspond to the center point or a vertex of a cell. As another example, a predetermined number of level-3 grid points (for example, eight grid points) may be provided for each cell.

FIG. 9 shows an overview of the convolution at one level. In this example, each processing node (chip) is responsible for four grid points. A grid point may correspond to a cell; as an example, one grid point corresponds to one cell.

The convolution unit 600 performs the convolution taking into account the charges of the grid points within a cutoff range (for example, two adjacent grid points) in one direction from the target grid point T. In the example of FIG. 9, the potential at the target grid point T handled by chip 5 is determined based at least in part on a convolution of the charges of one adjacent grid point handled by chip 1, two adjacent grid points handled by each of chips 2 to 4 and chip 7, and four adjacent grid points (which may include the target grid point itself) handled by each of chips 5, 6, 8, and 9.

Chip 5 could communicate individually with chips 1 to 4 and chips 6 to 9 to obtain the charge of each adjacent grid point, but in that case the communication overhead, rather than the computation time, may become the bottleneck in processing time. In this embodiment, therefore, the grid is separated by axis and the convolution along each axis is performed separately, which reduces inter-chip communication time and the overall processing time.

For example, the convolution units 600 of the plurality of processing nodes may separate a convolution operation over a plurality of axes into convolutions along the individual axes by approximating the function that gives the coefficient according to the distance between grid points in the multi-axis space with a combination of kernel functions separated for each axis.
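
The benefit of a separated kernel can be checked numerically in two dimensions. The sketch below assumes the simplest case, a kernel that is exactly a product of one function per axis, with periodic boundaries and made-up coefficients; the text speaks more generally of approximating the kernel by a combination of such separated functions.

```python
def conv_axis(grid, taps, axis):
    """Periodic 1D convolution of a 2D grid along one axis; taps maps offset -> coefficient."""
    ny, nx = len(grid), len(grid[0])
    out = [[0.0] * nx for _ in range(ny)]
    for y in range(ny):
        for x in range(nx):
            for off, a in taps.items():
                sy, sx = ((y + off) % ny, x) if axis == 0 else (y, (x + off) % nx)
                out[y][x] += a * grid[sy][sx]
    return out


def conv_full(grid, taps_y, taps_x):
    """Direct 2D convolution with the product kernel K(dy, dx) = taps_y[dy] * taps_x[dx]."""
    ny, nx = len(grid), len(grid[0])
    out = [[0.0] * nx for _ in range(ny)]
    for y in range(ny):
        for x in range(nx):
            for dy, ay in taps_y.items():
                for dx, ax in taps_x.items():
                    out[y][x] += ay * ax * grid[(y + dy) % ny][(x + dx) % nx]
    return out


taps = {-1: 0.25, 0: 0.5, 1: 0.25}  # illustrative coefficients, not from the source
charge = [[0.0] * 4 for _ in range(4)]
charge[1][2] = 1.0

# Convolve along x first, then along y, and compare with the direct 2D result.
separable = conv_axis(conv_axis(charge, taps, axis=1), taps, axis=0)
direct = conv_full(charge, taps, taps)
```

With separation, each pass needs data only from neighbors along a single axis, which is what allows the per-axis communication pattern described in the surrounding text.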

For example, the convolution unit 600 may perform the convolution operation by computing, for each axis,

$$x'_n = a_{-2}x_{n-2} + a_{-1}x_{n-1} + a_{0}x_{n} + a_{1}x_{n+1} + a_{2}x_{n+2} \qquad \text{(Equation 1)}$$

where $x'_n$ is the potential at the target grid point T, $a_{-2}$ to $a_{2}$ are coefficients that depend on the distance between grid points, and $x_{n-2}$ to $x_{n+2}$ are the charges at the respective grid points.
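
Equation 1 is a five-tap weighted sum along one axis and can be written directly in code. The function name `convolve_axis`, the coefficient values, and the periodic wrap-around at the ends of the axis are assumptions of this sketch.

```python
def convolve_axis(x, a, n):
    """Evaluate Equation 1 at grid index n.

    x: list of grid-point charges along one axis.
    a: dict mapping offset k in -2..2 to the coefficient a_k.
    Periodic wrap-around at the ends is an assumption of this sketch.
    """
    return sum(a[k] * x[(n + k) % len(x)] for k in range(-2, 3))


a = {-2: 0.1, -1: 0.2, 0: 0.4, 1: 0.2, 2: 0.1}  # illustrative coefficients
x = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0]              # a single unit charge at index 2
potentials = [convolve_axis(x, a, n) for n in range(len(x))]
```

With a single unit charge, the resulting potentials simply reproduce the coefficients around the charged grid point, which makes the role of each $a_k$ easy to inspect.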

The convolution unit 600 may convolve, into the grid points it is responsible for, values that depend on the charges and the like received from other processing nodes (chips) responsible for grid points within a predetermined range in the first axis direction. For example, for each of the x direction shown in FIG. 8 and the y direction shown in FIG. 9, the convolution unit 600 may receive values such as charges (for example, $x_{n-2}$ to $x_{n+2}$) from other processing nodes (chips) and evaluate Equation 1 to obtain the x-direction contribution and the y-direction contribution for the target grid point.

The convolution unit 600 also transmits the charges of the grid points it is responsible for and/or its own computation results to the convolution units 600 of other chips. For example, the convolution unit 600 of each of the plurality of processing nodes (chips) transmits values such as the charges assigned to the grid points it is responsible for to other processing nodes responsible for grid points within a predetermined range in the first axis direction among the plurality of axes.

For example, the convolution unit 600 multicasts the charges and/or computation results to the convolution units 600 located within the required range along each axis. As an example, the convolution unit 600 may multicast charges and the like to the convolution units 600 that are one or more chips away in the y direction and/or the z direction.

The convolution unit 600 may determine that the convolution in the first axis direction is complete when the number of receptions from other processing units responsible for grid points within the predetermined range reaches a predetermined number. For example, the convolution unit 600 of chip 5 in FIG. 10 may determine that the convolution in the X direction is complete upon receiving charges once from chip 4 and twice from chip 6 (that is, three receptions in total). Similarly, the convolution unit 600 of chip 5 in FIG. 11 may determine that the convolution in the Y direction is complete upon receiving charges once from chip 2 and twice from chip 8 (that is, three receptions in total).
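
The count-based completion test can be sketched as a small accumulator. The class `AxisAccumulator` and the numeric coefficients and charges are hypothetical; only the expected count of three receptions is taken from the FIG. 10 example.

```python
class AxisAccumulator:
    """Accumulates one grid point's convolution along one axis.

    expected: number of messages that must arrive before the axis is complete.
    Contributions may arrive in any order because addition is commutative.
    """

    def __init__(self, own_term, expected):
        self.total = own_term
        self.expected = expected
        self.received = 0

    def receive(self, coefficient, charge):
        self.total += coefficient * charge
        self.received += 1
        return self.done()

    def done(self):
        return self.received >= self.expected


# Chip 5 in the X direction: one message from chip 4 and two from chip 6.
accumulator = AxisAccumulator(own_term=0.4 * 1.0, expected=3)
accumulator.receive(0.2, 2.0)             # from chip 4
accumulator.receive(0.2, 3.0)             # first message from chip 6
finished = accumulator.receive(0.1, 4.0)  # second message from chip 6
```

Because completion depends only on the count, no separate barrier or acknowledgment message is required before moving on to the next axis.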

The convolution unit 600 may then calculate the potential at each grid point based on the convolution results computed at the plurality of resolution levels. For example, using the technique described with reference to FIGS. 9 to 11, the convolution units 600 of the plurality of processing nodes may convolve the charge assigned to each first grid point, obtained by dividing the space in first units (for example, the grid points shown at level 1 in FIG. 8), within a first range from that first grid point.

By aggregating the charges of the first grid points, the convolution unit 600 may calculate the charge assigned to each second grid point, obtained by dividing the space in second units larger than the first units (for example, the grid points shown at level 2 in FIG. 8).

The convolution unit 600 may then calculate the potentials at the second grid points from the calculated charges and calculate the potential at each first grid point by convolving the potentials obtained at the second grid points within the first range. The convolution unit 600 may further calculate the charge assigned to each third grid point, obtained by dividing the space in third units larger than the second units (for example, the grid points shown at level 3 in FIG. 8), calculate the potential at each third grid point, and use these as well to calculate the potential at each first grid point.

The integration of the convolution results computed at the plurality of resolution levels may be implemented in hardware separate from the convolution unit 600. For example, the processing system or the processing device 10 may additionally include a dedicated circuit such as an FPGA, and that circuit may integrate the convolution results.

FIG. 12 shows an example of the potential computation performed by the convolution unit 600 over three levels. As illustrated, the charges on a coarser grid of 32 × 32 × 32 grid points (32^3 charges) are computed from the charges on 64 × 64 × 64 grid points (64^3 charges). The charges on a still coarser grid of 16 × 16 × 16 grid points (16^3 charges) are then computed from the 32^3 charges.

Next, the potentials at the same grid points (16^3 potentials) are computed from the 16^3 charges. The potentials on the 32 × 32 × 32 grid (32^3 potentials) are computed from the 16^3 potentials and the 32^3 charges. Finally, the potentials on the 64 × 64 × 64 grid (64^3 potentials) are computed from the 32^3 potentials and the 64^3 charges.
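
The order of operations in FIG. 12 can be traced with a one-dimensional sketch of the data flow. Everything numerical here is a placeholder: coarsening by summing pairs, refinement by copying, and the reuse of one kernel at every level are chosen for brevity and are not the operators of MSM or of the embodiment.

```python
def restrict(charge):
    """Coarsen charges by summing neighboring pairs (placeholder operator)."""
    return [charge[i] + charge[i + 1] for i in range(0, len(charge), 2)]


def prolong(potential):
    """Refine potentials by copying each coarse value to two fine points (placeholder)."""
    return [value for value in potential for _ in range(2)]


def local_potential(charge, taps):
    """Short-range periodic convolution on one level."""
    n = len(charge)
    return [sum(a * charge[(i + k) % n] for k, a in taps.items()) for i in range(n)]


def multilevel_potential(charge, taps, levels):
    if levels == 1:
        # Coarsest level: the potential is computed from the charges alone.
        return local_potential(charge, taps)
    coarse = multilevel_potential(restrict(charge), taps, levels - 1)
    fine = local_potential(charge, taps)
    # Finer level: combine the coarser potential with this level's charges.
    return [f + c for f, c in zip(fine, prolong(coarse))]


taps = {-1: 0.25, 0: 0.5, 1: 0.25}  # illustrative kernel, reused at every level
q64 = [0.0] * 64
q64[10] = 1.0
phi64 = multilevel_potential(q64, taps, levels=3)  # 64 -> 32 -> 16 charges, then back up
```

The recursion first descends through the charge grids and then ascends through the potential grids, matching the sequence 64^3, 32^3, 16^3 charges followed by 16^3, 32^3, 64^3 potentials.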

According to this embodiment, the convolution operations along each axis at each level can be executed in any order, and the completion of the computation for each grid point can be determined from the number of data items received. Moreover, because the convolution unit 600 need not start the next computation until the data for the preceding axis that the computation requires has arrived, it can transition between states automatically without synchronization processing.

Also according to this embodiment, the interpolation operations that connect adjacent grid levels, which MSM requires, are implemented by the same single piece of hardware (namely, the convolution unit 600). Compared with solving Poisson's equation by FFT, this embodiment reduces the amount of communication between processing nodes and thereby the overall processing time.

FIGS. 13 to 16B illustrate an example of the excluded-particle function in this embodiment.

FIG. 13 shows an example of a mask pattern used for the excluded-particle function. Each of the plurality of mask patterns stored in the mask pattern memory 140 specifies, for each other particle having a predetermined particle number relative to the particle number of the particle under calculation, whether that particle is to be subject to at least one of high exclusion and low exclusion. For example, FIG. 13 shows a mask pattern indicating whether, in the calculation of the short-range Coulomb force and the like, the particle with particle number i = 1 treats other particles (particle numbers j = 2, 3, 4, ...) as subject to high exclusion or low exclusion.

The mask pattern in FIG. 13 has the value "000110...". Here, "0" means not subject to exclusion and "1" means subject to exclusion. The mask pattern contains, for each other particle, designations for both high exclusion and low exclusion. For example, in this mask pattern, the particle with particle number 1 has the mask value "00" for the particle with particle number 2. This indicates that particle 1 subjects particle 2 to neither low exclusion nor high exclusion.

Likewise, in this mask pattern, the particle with particle number 1 has the mask value "01" for the particle with particle number 3, which indicates that particle 1 subjects particle 3 to low exclusion. Similarly, the particle with particle number 1 has the mask value "10" for the particle with particle number 4, which indicates that particle 1 subjects particle 4 to high exclusion.

Based on the values of this mask pattern, the pipeline 210 of the processing unit 200 may omit part of the calculation of the short-range Coulomb force and the like. For example, based on the mask pattern of FIG. 13, the pipeline 210 may calculate the short-range Coulomb force and the like between particle 1 and particle 2 without reduction, reduce the calculated short-range Coulomb force and the like between particle 1 and particle 3 by a predetermined low-exclusion ratio, and reduce the short-range Coulomb force and the like between particle 1 and particle 4 by a predetermined high-exclusion ratio. For example, if high exclusion means complete exclusion, the short-range Coulomb force and the like need not be calculated (or these forces may be set to 0).
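
The decoding of the FIG. 13 pattern can be sketched as follows. The two-bit layout per relative particle number follows the example values in the text; the function names, the scale factors 0.5 and 0.0, and the treatment of particles beyond the mask width as not excluded are assumptions for illustration.

```python
NONE, LOW, HIGH = "00", "01", "10"  # "11" is not defined in the text and is not handled


def exclusion(mask, i, j):
    """Return the 2-bit code that particle i applies to particle j (j > i)."""
    relative = j - i - 1  # 0 for the particle immediately after i
    code = mask[2 * relative: 2 * relative + 2]
    # Assumption: particles beyond the mask width are not excluded.
    return code if len(code) == 2 else NONE


def masked_force(raw_force, code, low_scale=0.5, high_scale=0.0):
    """Scale a pair force; the scale values are illustrative assumptions."""
    return raw_force * {NONE: 1.0, LOW: low_scale, HIGH: high_scale}[code]


mask = "000110"
codes = [exclusion(mask, 1, j) for j in (2, 3, 4, 5)]
reduced = masked_force(8.0, exclusion(mask, 1, 3))  # low exclusion applied to particle 3
```

Setting `high_scale` to 0.0 corresponds to the case in the text where high exclusion means complete exclusion, in which the pair force could be skipped altogether.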

Simulating the interactions between particles with high accuracy would require holding mask patterns for all particle pairs. However, holding mask patterns for all particle pairs may be difficult because of constraints on memory capacity and mask width. On the other hand, computing the forces pair by pair while accounting for high exclusion, low exclusion, or no exclusion, without relying on mask patterns, increases the calculation time.

The present inventors noted that large proteins and similar targets of molecular dynamics simulation contain many identical or similar partial structures, such as repetitions of amino acids with a given structure. Consequently, many of the mask patterns to be applied to the particles are also the same. In this embodiment, therefore, the mask patterns are restricted to a limited number of patterns and, in exchange, the mask width is increased, which achieves more efficient exclusion of the short-range Coulomb force and the like. In this embodiment, mask pattern identification information is used to designate one of the limited number of mask patterns.

FIG. 14 shows an example of the mask pattern identification information in this embodiment. For each of the plurality of particles, the mask pattern identification information identifies which of the plurality of mask patterns is used when calculating the force acting on that particle. The particle data memory 110 may store mask pattern identification information such as that shown in FIG. 14 as part of the particle data.

For example, as illustrated, the particle data of the particle with particle number 1 includes pattern number 1023 as its mask pattern identification information. This indicates that particle 1 uses the mask pattern designated by pattern number 1023. Similarly, the particle data of the particle with particle number 2 includes pattern number 442 as its mask pattern identification information. This indicates that particle 2 uses the mask pattern designated by pattern number 442.

FIG. 15 shows an example of a mask pattern used in this embodiment. Each of the plurality of mask patterns includes a mask bit string of a predetermined length and a setting value. The setting value determines how many bits of the mask bit string are allocated to designating other particles to be highly excluded, and how many to designating other particles to be low-excluded, in the calculation of the force acting on the particle being calculated.

For example, the mask pattern shown in FIG. 15 is 77 bits long and includes a 72-bit mask bit string and a 5-bit setting value. The 5-bit setting value can take a value from 0 to 31. This value divides the 72-bit mask bit string into a first mask of 8 to 40 bits and a second mask of 32 to 64 bits. For example, if the setting value is 10, the 72-bit mask bit string is divided into an 18-bit first mask and a 54-bit second mask.
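The split of the mask bit string can be sketched as follows. This is a minimal Python sketch under two assumptions that the text does not state explicitly: that the first mask length equals 8 plus the setting value (inferred from the single worked example, where a setting value of 10 gives 18 and 54 bits), and that the first mask occupies the leading bits of the bit string. The function name `split_mask` and the constants are hypothetical.

```python
MASK_BITS = 72           # length of the mask bit string
SETTING_BITS = 5         # width of the setting value
MIN_FIRST_MASK_BITS = 8  # assumed first-mask length when the setting is 0


def split_mask(mask_bits: str, setting: int) -> tuple[str, str]:
    """Split a 72-bit mask string into (first mask, second mask)."""
    if len(mask_bits) != MASK_BITS:
        raise ValueError("mask bit string must be 72 bits long")
    if not 0 <= setting < 2 ** SETTING_BITS:
        raise ValueError("setting value must fit in 5 bits")
    # Assumed linear rule inferred from the example in the text.
    first_len = MIN_FIRST_MASK_BITS + setting
    return mask_bits[:first_len], mask_bits[first_len:]


first_mask, second_mask = split_mask("0" * 72, 10)
print(len(first_mask), len(second_mask))
```

Under this assumed rule the two lengths always sum to 72, so a larger setting value trades low-exclusion reach for high-exclusion reach.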

The first mask is used to designate other particles to be highly excluded in the calculation of the force and the like acting on the particle being calculated. For example, when the first mask "0101..." is used for the particle with particle number 1, the short-range Coulomb force and the like between particle 1 and particle 3, and between particle 1 and particle 5, may be attenuated by the predetermined high-exclusion ratio. If high exclusion means total exclusion, the short-range Coulomb force and the like need not be calculated (or these forces may be set to 0).

The second mask is used to designate other particles to be low-excluded in the calculation of the force and the like acting on the particle being calculated. For example, when the second mask "0010..." is used for the particle with particle number 1, part (for example, 1/2 or 1/4) of the short-range Coulomb force and the like between particle 1 and particle 4 may be excluded.

The second mask, used for low exclusion, may have a larger mask width than the first mask, used for high exclusion. This allows low exclusion, which can act over longer distances than high exclusion, to be taken into account for atoms farther away than those covered by high exclusion.
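The two examples above ("0101..." selecting particles 3 and 5, and "0010..." selecting particle 4, both for particle 1) are consistent with bit k of a mask, counted from 0, referring to the particle whose number is k + 1 greater than that of the particle being calculated. The following Python sketch illustrates this inferred mapping; the function name `masked_particles` is hypothetical, and the mapping itself is an assumption drawn from those two examples only.

```python
def masked_particles(particle_number: int, mask: str) -> list[int]:
    """Return the particle numbers selected by the '1' bits of a mask.

    Assumes bit k (0-based) refers to particle_number + 1 + k.
    """
    return [
        particle_number + 1 + k
        for k, bit in enumerate(mask)
        if bit == "1"
    ]


# First mask (high exclusion) and second mask (low exclusion) for particle 1.
high_excluded = masked_particles(1, "0101")
low_excluded = masked_particles(1, "0010")
```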

The mask pattern shown in FIG. 15 has pattern number 442. Accordingly, this mask pattern is used when 442 is specified in the mask pattern identification information of the particle data. The mask pattern memory 140 may store a predetermined number (for example, 1024) of mask patterns.
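The indirection from a particle to its mask pattern can be sketched in Python as below. The classes `MaskPattern` and `MaskPatternMemory`, the dictionary `particle_pattern_id`, and the placeholder pattern contents are all hypothetical; only the pattern numbers 1023 and 442 and the capacity of 1024 come from the text. With 1024 patterns, a 10-bit identifier per particle is enough to select a 77-bit pattern.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MaskPattern:
    mask_bits: str  # 72-bit mask bit string
    setting: int    # 5-bit setting value


class MaskPatternMemory:
    """Holds a fixed number of shared mask patterns, indexed by pattern number."""

    def __init__(self, capacity: int = 1024) -> None:
        self.capacity = capacity
        self._patterns: dict[int, MaskPattern] = {}

    def store(self, pattern_number: int, pattern: MaskPattern) -> None:
        if not 0 <= pattern_number < self.capacity:
            raise IndexError("pattern number out of range")
        self._patterns[pattern_number] = pattern

    def lookup(self, pattern_number: int) -> MaskPattern:
        return self._patterns[pattern_number]


# Mask pattern identification information stored with the particle data.
particle_pattern_id = {1: 1023, 2: 442}

memory = MaskPatternMemory()
memory.store(442, MaskPattern(mask_bits="0" * 72, setting=10))  # placeholder contents
pattern_for_particle_2 = memory.lookup(particle_pattern_id[2])

# Bits needed to address every pattern in the memory.
id_bits = (memory.capacity - 1).bit_length()
```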

Furthermore, the processing device 10 may use mask tables to account for high/low exclusion relationships between particles that cannot be covered by the predetermined number of mask patterns stored in the mask pattern memory 140. For example, a mask table can specify the pattern of high exclusion and low exclusion between groups of particles that are spatially close but far apart in particle number, such as adjacent atoms in a cyclic molecule.

FIGS. 16A and 16B show an example of the mask tables in this embodiment. A mask table is stored in the mask table memory 150 and may specify at least one of high exclusion and low exclusion of the short-range Coulomb force and the like acting between two groups of particles that are far apart in particle number and are not covered by the mask patterns. FIG. 16A shows a high-exclusion mask table, and FIG. 16B shows a low-exclusion mask table. The number of particle-group combinations that the mask tables in the mask table memory 150 can represent may equal the number of mask tables stored (for example, 16 combinations).

For example, in FIG. 16A the particle with particle number 101 has the high-exclusion pattern "100" for the particles with particle numbers 221, 222, and 223. In FIG. 16B it has the low-exclusion pattern "010". This indicates that particle 221 is highly excluded (for example, totally excluded) with respect to particle 101, and that particle 222 is low-excluded (for example, partially excluded) with respect to particle 101. Similarly, the figures show that particle 222 is highly excluded (for example, totally excluded) with respect to particle 102, and that particle 223 is low-excluded (for example, partially excluded) with respect to particle 103.
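A lookup against a pair of mask tables might be sketched as follows in Python. The function name `table_exclusion` and the dictionary layout (one bit string per row particle, one bit per column particle) are assumptions for illustration; only the row for particle 101 is filled in, using the patterns "100" and "010" given in the text.

```python
def table_exclusion(
    high_rows: dict[int, str],
    low_rows: dict[int, str],
    columns: list[int],
    i: int,
    j: int,
) -> str:
    """Classify the pair (i, j) as 'high', 'low', or 'none' using mask tables."""
    if j not in columns:
        return "none"
    k = columns.index(j)
    no_bits = "0" * len(columns)  # rows missing from a table exclude nothing
    if high_rows.get(i, no_bits)[k] == "1":
        return "high"
    if low_rows.get(i, no_bits)[k] == "1":
        return "low"
    return "none"


columns = [221, 222, 223]   # particle group on the column side
high_rows = {101: "100"}    # row for particle 101, as in FIG. 16A
low_rows = {101: "010"}     # row for particle 101, as in FIG. 16B

result = [table_exclusion(high_rows, low_rows, columns, 101, j) for j in columns]
```

Checking the high-exclusion table first is a design choice of this sketch; the text does not say how a pair marked in both tables would be handled.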

Which pairing of particles in the two particle groups the high-exclusion and/or low-exclusion information stored in a mask table corresponds to may be specifiable by a predetermined correspondence. One example is the oblique mode described with reference to FIG. 17.

FIGS. 17A and 17B show an example of the oblique mode of the mask tables in this embodiment. With regard to the use of the mask tables, the processing device 10 of this embodiment may provide, for example, the oblique mode shown in FIGS. 17A and 17B. When the oblique mode is specified, the range of particles to which the high/low exclusion mask values of the mask table apply shifts by a predetermined number (for example, one) with each particle number. The example of FIG. 17A shows that particle 221 is highly excluded (for example, totally excluded) with respect to particle 101 and that particle 223 is highly excluded (for example, totally excluded) with respect to particle 102. The example of FIG. 17B shows that particle 222 is low-excluded (for example, partially excluded) with respect to particle 101 and that particle 225 is low-excluded (for example, partially excluded) with respect to particle 103.

As described above, the processing system, the processing device 10, and the like of this embodiment can execute molecular dynamics simulations faster and more efficiently than a conventional dedicated computer.

Although the present invention has been described above using embodiments, the technical scope of the present invention is not limited to the scope described in those embodiments. It is apparent to those skilled in the art that various modifications and improvements can be made to the above embodiments. It is clear from the claims that forms incorporating such modifications or improvements can also fall within the technical scope of the present invention.

It should be noted that the operations, procedures, steps, stages, and other processes of the devices, systems, programs, and methods shown in the claims, the specification, and the drawings can be carried out in any order, unless the order is explicitly indicated by terms such as "before" or "prior to", or unless the output of an earlier process is used in a later process. Even where the operation flows in the claims, the specification, and the drawings are described using "first", "next", and the like for convenience, this does not mean that they must be carried out in that order.

REFERENCE SIGNS LIST
10 processing device
62 neighboring grid point
64 neighboring grid point
66 neighboring grid point
100 memory
110 particle data memory
120 cell information memory
130 memory block information memory
140 mask pattern memory
150 mask table memory
200 processing unit
210 pipeline
220 core
230 long-range unit
300 memory controller
400 network interface
600 convolution unit

Claims

A processing device comprising:
a particle data memory that stores particle data of each of a plurality of particles arranged in a space;
a cell information memory that stores, in association with the cell number of each cell into which the space is divided, cell information indicating a memory block in the particle data memory allocated to that cell for storing the particle data of the particles in the cell;
a processing unit that accesses the particle data of particles contained in the cells into which the space is divided by specifying a cell number; and
a memory controller that, in response to receiving an access to particle data specifying a cell number, identifies the memory block of the particle data to be accessed in the particle data memory using the cell information associated with the specified cell number.

A processing device comprising:
a particle data memory that stores particle data of each of a plurality of particles arranged in a space;
a cell information memory that stores, in association with the cell number of each cell into which the space is divided, cell information indicating a storage location in the particle data memory allocated to that cell for storing the particle data of the particles in the cell;
a processing unit that accesses the particle data of particles contained in the cells into which the space is divided by specifying a cell number;
a memory controller that, in response to receiving an access to particle data specifying a cell number, identifies the storage location of the particle data to be accessed in the particle data memory using the cell information associated with the specified cell number; and
a memory that stores information for managing used storage locations or unused storage locations in the particle data memory,
wherein the memory controller, when allocating a storage location to a cell, selects an unused storage location using the information for managing the used storage locations or unused storage locations.

The processing device according to claim 1 or 2, wherein
the processing unit transmits to the memory controller an addition request instructing that a particle be added to the cell with a specified cell number, and
the memory controller, in response to receiving the addition request, adds the particle data of the particle to the storage location in the particle data memory indicated by the cell information associated with the specified cell number.

The processing device according to claim 3, wherein the memory controller, when it receives from a plurality of the processing units a plurality of the addition requests that conflict for the same cell, processes each of the plurality of addition requests atomically.

The processing device according to any one of claims 1 to 4, wherein
the processing unit transmits to the memory controller a write request that requests writing to particle data by specifying a cell number and a particle number within the cell, and
the memory controller, in response to receiving the write request, writes the write data to the particle data at the storage location in the particle data memory indicated by the cell information associated with the specified cell number and by the specified particle number within the cell.

The processing device according to any one of claims 1 to 5, wherein the cell information memory stores, in association with the cell number of each cell, the cell information indicating a plurality of memory blocks in the particle data memory.

The processing device according to any one of claims 1 and 3 to 6, further comprising a memory block information memory that stores memory block information for managing used memory blocks or unused memory blocks in the particle data memory,
wherein the memory controller, when allocating a memory block to a cell, selects an unused memory block using the memory block information.

The processing device according to any one of claims 1 to 7, further comprising a memory block information memory that stores memory block information including indicators showing whether memory blocks in the particle data memory are used or unused.

The processing device according to any one of claims 1 to 8, wherein
the cell information memory stores a first set of a plurality of pieces of the cell information associated with a plurality of cells and a second set of a plurality of pieces of the cell information associated with the plurality of cells, and
the processing unit, in a process of rearranging the plurality of particles into the cells as the plurality of particles move within the space, reassigns the particle data of each particle whose storage location is designated by the first set of cell information to a storage location designated by the second set of cell information.

The processing device according to any one of claims 1 to 9, wherein
at least one of the particle data of each particle and the cell information corresponding to each particle holds, in association with each of the plurality of particles, an attribute indicating whether predetermined data in the particle data of that particle has been updated at the current simulation time, and
the memory controller, in response to the predetermined data in the particle data of a particle being accessed, selects, based on the attribute, whether to use the value recorded in the predetermined data or to use an initial value.

The processing device according to claim 10, wherein
the processing unit transmits to the memory controller an update request instructing that the predetermined data be updated by an operation, and
the memory controller, in response to receiving the update request, updates the predetermined data by applying the operation to the value recorded in the predetermined data if the predetermined data has been updated at the current simulation time, and by applying the operation to the initial value if the predetermined data has not been updated at the current simulation time.

A processing method comprising:
a particle data memory storing particle data of each of a plurality of particles arranged in a space;
a cell information memory storing, in association with the cell number of each cell into which the space is divided, cell information indicating a memory block in the particle data memory allocated to that cell for storing the particle data of the particles in the cell;
a processing unit accessing the particle data of particles contained in the cells into which the space is divided by specifying a cell number; and
a memory controller, in response to receiving an access to particle data specifying a cell number, identifying the memory block of the particle data to be accessed in the particle data memory using the cell information associated with the specified cell number.

A processing method comprising:
a particle data memory storing particle data of each of a plurality of particles arranged in a space;
a cell information memory storing, in association with the cell number of each cell into which the space is divided, cell information indicating a storage location in the particle data memory allocated to that cell for storing the particle data of the particles in the cell;
a processing unit accessing the particle data of particles contained in the cells into which the space is divided by specifying a cell number;
a memory controller, in response to receiving an access to particle data specifying a cell number, identifying the storage location of the particle data to be accessed in the particle data memory using the cell information associated with the specified cell number;
a memory storing information for managing used storage locations or unused storage locations in the particle data memory; and
the memory controller, when allocating a storage location to a cell, selecting an unused storage location using the information for managing the used storage locations or unused storage locations.