CN106100961A - Direct-connect architecture computing cluster system based on InfiniBand, and construction method - Google Patents
- Publication number: CN106100961A
- Application number: CN201610580213.5A
- Authority: CN (China)
- Prior art keywords: module, computing unit, routing table, task, network
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L12/00—Data switching networks
- H04L12/28—Data switching networks characterised by path configuration, e.g. LAN [Local Area Networks] or WAN [Wide Area Networks]
- H04L12/46—Interconnection of networks
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/50—Queue scheduling
- H04L47/52—Queue scheduling by attributing bandwidth to queues
- H04L47/522—Dynamic queue service slot or variable bandwidth allocation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/50—Queue scheduling
- H04L47/52—Queue scheduling by attributing bandwidth to queues
- H04L47/527—Quantum based scheduling, e.g. credit or deficit based scheduling or token bank
Abstract
The invention provides a direct-connect architecture computing cluster system based on InfiniBand, comprising a main control unit, a topology construction module, and a computing resource pool. The computing resource pool includes at least two computing units, each of which includes an InfiniBand adaptation module and a route construction module. The computing units are connected to one another through an InfiniBand network, so they can communicate without passing through a switch and without loss of computing performance. Network latency is low, the cost of operating and maintaining the cluster is reduced, and the reliability of the cluster is improved. The system also scales well: the number of computing units can be expanded or reduced freely to match different computational workloads.
Description
Technical field
The present invention relates to high-performance computing cluster systems, and in particular to a direct-connect architecture computing cluster system based on InfiniBand and a method of constructing it.
Background
A computer cluster is a computer system in which a group of loosely integrated computers, linked by software and/or hardware, cooperate closely to carry out computing work. In a sense, the cluster can be regarded as a single computer. Each computer in the cluster is usually called a node, and the nodes are usually connected by a local area network.
A high-performance computing cluster is one kind of computer cluster. It increases computing capacity by distributing computing tasks across different compute nodes of the cluster, and it is used mainly in scientific and engineering computing. Such clusters usually run parallel applications, for example parallel programs developed against the MPI standard. These applications let multiple compute nodes execute computing tasks in parallel, and the nodes typically exchange data and pass messages frequently. A high-performance computing cluster is therefore usually configured with a dedicated compute network for this data exchange, and the performance of that network largely determines the computational efficiency of the parallel programs.
At present, most computing cluster systems use a fat-tree topology in which nodes are connected through switches (an indirect, switch-based network) and exchange data over copper or optical cables. When the cluster performs a cross-node computation, data travels over the cable to a switch using the TCP/IP protocol, and the switch forwards it to the correct node to complete the communication and hence the cross-node operation. As the number of compute nodes grows, however, the volume of network traffic between nodes rises sharply. To shorten point-to-point communication time and reduce latency, the system needs correspondingly more switches, which makes the overall network environment more complex and raises the cost of building, operating, and managing the system.
Besides the scheme above, there is another kind of computing cluster system that uses a fully direct-connected topology, which lets all compute nodes communicate without any switch. This structure is generally suitable only for small systems: a cluster with N compute nodes in a fully direct-connected topology needs N × (N − 1) network interfaces. For a large cluster, such a structure is difficult to build, scales poorly, and is inconvenient to manage.
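To make the growth of the fully direct-connected scheme concrete, the following minimal Python sketch evaluates the N × (N − 1) interface count stated above for a few cluster sizes. The function name and the sample sizes are illustrative assumptions, not part of the patent.

```python
def full_mesh_interfaces(n: int) -> int:
    """Interfaces needed when each of n nodes links directly to every other node."""
    if n < 1:
        raise ValueError("a cluster needs at least one node")
    return n * (n - 1)


# Sample cluster sizes (chosen for illustration only).
for n in (4, 16, 64, 256):
    print(f"{n:4d} nodes -> {full_mesh_interfaces(n):6d} interfaces")
```

The count grows quadratically, which is why the text treats this structure as practical only for small systems.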
Summary of the invention
The object of the invention is to overcome the shortcomings of the prior art by providing a direct-connect architecture computing cluster system based on InfiniBand and a method of constructing it. In this system no communication between computing units has to pass through a switch. The system is easy to build, scales well, and is suitable for large computing clusters. It uses InfiniBand communication technology to meet the cluster's bandwidth and communication-latency requirements.
To achieve the above object, the invention adopts the following technical solutions:
In one aspect, the invention provides a direct-connect architecture computing cluster system based on InfiniBand, comprising a topology construction module and a computing resource pool, the computing resource pool being connected to the topology construction module.
The computing resource pool includes at least two computing units, which are connected to one another through an InfiniBand network.
Each computing unit includes an InfiniBand adaptation module and a route construction module.
The topology construction module obtains the total number of computing units and the number of neighbors of each computing unit, derives the maximum neighbor count, computes the network dimension from the maximum neighbor count, generates at least one network topology graph from the total number of computing units and the network dimension, and sends all of the network topology graphs to the computing resource pool.
The InfiniBand adaptation module provides data transmission services based on the InfiniBand protocol so that the computing units can exchange data with one another.
The route construction module obtains all of the network topology graphs, computes from each graph every possible communication path between its own computing unit and the other computing units, and generates a full-path routing table. It also determines which routing paths in the full-path routing table are actually alive, that is, which can actually carry traffic, and generates a communication routing table from those live paths. The communication routing table is grouped by the destination IP address of each routing path, and the paths within each group are sorted in ascending order of hop count.
In an embodiment of the invention, the system further includes a main control unit connected to any one of the computing units. The main control unit obtains a task, splits it, and sends the pieces to the connected computing unit, which then distributes them to the other computing units. The main control unit also initializes the computing units.
In an embodiment of the invention, the main control unit includes a task acquisition module, a task allocation module, and an initialization module. The task acquisition module obtains tasks. The task allocation module divides a task into several subtasks, assigns a computing unit to each subtask, and sends the subtasks to the computing resource pool. The initialization module assigns IP addresses to the computing units and initializes the topology construction module and the route construction module.
In an embodiment of the invention, the main control unit further includes a state reading module and a feedback module. The state reading module reads the working state of the computing units and sends it to the feedback module, which reports the received working state to the user.
In an embodiment of the invention, the main control unit further includes a resource allocation module and a resource adjustment module.
The resource allocation module sets a resource acquisition permission for each obtained task and allocates its initial resources. The resource adjustment module adjusts the resources each task may occupy according to the resource acquisition permission of each task.
In an embodiment of the invention, the topology construction module obtains the total number of computing units and the maximum neighbor count by traversing the IP addresses of the computing units.
In an embodiment of the invention, the topology construction module is arranged in the main control unit.
In an embodiment of the invention, optionally, the main control unit also obtains the total number of computing units and the maximum neighbor count entered by the user and sends them to the topology construction module, which generates the network topology graph from the received total and maximum neighbor count.
In an embodiment of the invention, the main control unit may be any one of the computing units.
In another embodiment of the invention, the system provided in the first aspect further includes a total route construction module, which is connected to the computing resource pool and also to the topology construction module.
The total route construction module obtains the IP addresses of all computing units and all of the network topology graphs, generates from the graphs every possible communication path between all computing units, and generates at least one full-path routing table according to the IP address of the starting computing unit. It sends each full-path routing table to the corresponding computing unit. The route construction module in that computing unit determines from the received full-path routing table which routing paths are actually alive, that is, which can actually carry traffic, and generates a communication routing table from those live paths. The communication routing table is grouped by the destination IP address of each routing path, and the paths within each group are sorted in ascending order of hop count.
In an embodiment of the invention, the computing unit further includes a processor, memory, a local storage device, and an expansion device interface.
In another aspect, the invention also provides a method of generating a network topology graph, comprising the following steps:
Obtain the total number N of nodes in the network and the number of neighbor nodes of each node, and take the maximum neighbor count M;
Compute the network dimension K, where K is the base-2 logarithm of M, rounded up;
Build at least one K-dimensional network topology graph in which every node is connected to $2^K$ neighbor nodes and the largest per-dimension node count does not exceed N − M + 2.
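The arithmetic in these steps can be checked with a minimal Python sketch. It assumes the dimension is the base-2 logarithm of M rounded up, as the text states, and computes it with integer operations to avoid floating-point rounding. The function names and the sample values N = 16, M = 4 are illustrative assumptions.

```python
def network_dimension(max_neighbors: int) -> int:
    """K = ceil(log2(M)), computed exactly with integer arithmetic."""
    if max_neighbors < 1:
        raise ValueError("M must be at least 1")
    return (max_neighbors - 1).bit_length()


def max_dimension_size(total_nodes: int, max_neighbors: int) -> int:
    """Upper bound N - M + 2 on the node count of any single dimension."""
    return total_nodes - max_neighbors + 2


N, M = 16, 4  # sample values, not taken from the patent
K = network_dimension(M)
print(K, 2 ** K, max_dimension_size(N, M))
```

For these sample values the dimension is 2, each node has 4 neighbors, and no dimension may hold more than 14 nodes.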
In an embodiment of the invention, the coordinates of the K-dimensional topology network satisfy:
$$0 \le x_i \le 2N_i - 1$$
$$x_j \bmod 2 = x_{j+1} \bmod 2$$
Each node $x_i$ is connected to $2^K$ neighbor nodes $y_i$, whose coordinates satisfy:
$$y_i = (x_i + 1) \bmod 2N_i \quad \text{or} \quad y_i = (x_i - 1 + 2N_i) \bmod 2N_i$$
Here mod denotes the modulo operation, the coordinate $x_i$ denotes any node in the i-th dimension, and $N_i$ denotes the number of nodes in the i-th dimension, where $K = \lceil \log_2 M \rceil$ (that is, $\log_2 M$ rounded up) and $\max_{1 \le i \le K} N_i \le N - M + 2$.
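The coordinate rules can be explored with a short Python sketch. It rests on one interpretive assumption: because all coordinates of a node must share the same parity, the ±1 step is applied in every dimension at once, which is what yields 2 to the power K neighbors. The function names and the sample sizes are hypothetical.

```python
from itertools import product


def nodes(sizes):
    """All coordinates with 0 <= x_i <= 2*N_i - 1 whose components share one parity."""
    ranges = [range(2 * n) for n in sizes]
    return [x for x in product(*ranges) if len({c % 2 for c in x}) == 1]


def neighbors(x, sizes):
    """Apply y_i = (x_i + 1) mod 2N_i or y_i = (x_i - 1 + 2N_i) mod 2N_i in every dimension."""
    found = set()
    for signs in product((1, -1), repeat=len(sizes)):
        y = tuple((c + s + 2 * n) % (2 * n) for c, s, n in zip(x, signs, sizes))
        found.add(y)
    return sorted(found)


sizes = (2, 2)  # N_1 = N_2 = 2, so K = 2 (sample values)
all_nodes = nodes(sizes)
print(len(all_nodes))            # number of valid coordinates
print(neighbors((0, 0), sizes))  # neighbors of the origin
```

With these sample sizes the sketch finds 8 valid coordinates, and the origin has the 4 neighbors (1, 1), (1, 3), (3, 1), and (3, 3), each of which again satisfies the parity rule.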
In another aspect, the invention also provides a method of generating a routing table, comprising the following steps:
Select a start node and a destination node, and obtain all of the network topology graphs;
From all of the network topology graphs obtained, compute every path from the start node to the destination node and generate a full-path routing table;
Confirm which paths in the full-path routing table are alive and generate a communication routing table;
Group the routing paths in the communication routing table by destination IP address, and sort the paths within each group in ascending order of hop count.
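The final grouping and sorting step can be sketched in Python as follows. The sketch assumes a path is a list of IP addresses from source to destination and counts hops as the number of relay nodes between them, following the hop definition given in the description. The function names and the addresses are hypothetical.

```python
from collections import defaultdict


def hop_count(path):
    """Number of relay nodes between the source and the destination."""
    return len(path) - 2


def build_comm_table(live_paths):
    """Group live paths by destination IP and sort each group by ascending hop count."""
    table = defaultdict(list)
    for path in live_paths:
        table[path[-1]].append(path)
    for dest in table:
        table[dest].sort(key=hop_count)
    return dict(table)


# Hypothetical live paths starting at 10.0.0.1.
paths = [
    ["10.0.0.1", "10.0.0.2", "10.0.0.3"],
    ["10.0.0.1", "10.0.0.3"],
    ["10.0.0.1", "10.0.0.2"],
]
comm_table = build_comm_table(paths)
print(comm_table["10.0.0.3"])
```

Keeping the shortest path first in each group means a sender can simply walk the group from the top.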
Beneficial effects of the invention: in the direct-connect architecture computing cluster system based on InfiniBand provided by the invention, all computing units can communicate without passing through a switch and without loss of computing performance. Network latency is low, the cost of operating and maintaining the cluster is reduced, and the reliability of the cluster is improved. The system also scales well: the number of computing units can be expanded or reduced freely to match different computational workloads.
Brief description of the drawings
Fig. 1 is a schematic diagram of the system structure in one embodiment of the invention;
Fig. 2 is a schematic diagram of the system structure in another embodiment of the invention;
Fig. 3 is a schematic diagram of the system structure in yet another embodiment of the invention;
Fig. 4 is a flow chart of the network topology graph generation method provided by the invention;
Fig. 5 is a flow chart of the routing table generation method provided by the invention.
Detailed description of the invention
The invention is further described below with reference to the drawings and specific embodiments. The illustrative embodiments and their descriptions serve only to explain the invention and do not limit it.
In the first embodiment of the invention, Fig. 1 shows the system structure: a direct-connect architecture computing cluster system based on InfiniBand, comprising a main control unit 100, a topology construction module 200, and a computing resource pool.
The computing resource pool includes at least two computing units 300, all of which are connected to one another through an InfiniBand network.
The topology construction module 200 obtains the total number of computing units 300 and the number of neighbors of each computing unit, derives the maximum neighbor count, computes the network dimension from the maximum neighbor count, generates at least one network topology graph from the total number of computing units 300 and the network dimension, and sends all of the network topology graphs to the computing resource pool.
Each computing unit 300 includes an InfiniBand adaptation module 310 and a route construction module 320. The InfiniBand adaptation module 310 provides data transmission services based on the InfiniBand protocol so that the computing units 300 can exchange data with one another.
The route construction module 320 obtains all of the network topology graphs, computes from each graph every possible routing path from its own computing unit 300 to the other computing units 300, and generates a full-path routing table. It also determines which paths in the full-path routing table are actually alive, that is, which can actually carry traffic, and generates a communication routing table from those live paths. The communication routing table is grouped by the destination IP address of each routing path, and the paths within each group are sorted in ascending order of hop count.
The system also includes the main control unit 100, which is connected to any one of the computing units 300. The main control unit 100 obtains a task, splits it, and sends the pieces to the connected computing unit 300, which distributes them to the other computing units 300. The main control unit 100 also initializes the computing units 300.
In the second embodiment of the invention, as shown in Fig. 2, the topology construction module 200 is arranged in the main control unit 100. The main control unit 100 also includes a task acquisition module 110, a task allocation module 120, an initialization module 130, a state reading module 140, and a feedback module 150.
The task acquisition module 110 obtains the tasks issued by the user. The task allocation module 120 divides each obtained task into at least one subtask, assigns each subtask to the computing unit 300 that will execute it, and sends all subtasks to the connected computing unit 300. That computing unit 300 keeps the subtask addressed to it and forwards the remaining subtasks to the other computing units 300.
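The split-then-forward flow described above can be sketched in Python. The patent does not say how subtasks are assigned to units, so the round-robin policy here is purely an assumption, as are the function names, the unit identifiers, and the dictionary layout of a subtask.

```python
def split_task(items, unit_ids):
    """Divide work items into one subtask per unit, round-robin (assumed policy)."""
    count = len(unit_ids)
    return [{"unit": unit, "items": items[i::count]} for i, unit in enumerate(unit_ids)]


def receive(unit_id, subtasks):
    """Keep the subtasks addressed to this unit; return the rest for forwarding."""
    own = [t for t in subtasks if t["unit"] == unit_id]
    forward = [t for t in subtasks if t["unit"] != unit_id]
    return own, forward


subtasks = split_task(list(range(6)), ["u1", "u2", "u3"])
own, forward = receive("u1", subtasks)
print(own)      # the subtask kept by the directly connected unit
print(forward)  # the subtasks passed on to the other units
```

Each unit applies the same `receive` step to whatever it is handed, so only the first unit needs a link to the main control unit.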
The initialization module 130 assigns IP addresses to the computing units 300 and sends initialization instructions to the topology construction module 200 and the route construction module 320. Specifically, initialization consists of the topology construction module 200 building the network topology graph and the route construction module 320 building the routing table.
The state reading module 140 reads the working state of each computing unit 300, such as memory usage, CPU usage, and remaining disk space, and the feedback module 150 reports that state to the user so that the user can check the working condition of the computing resource pool.
In an implementation of the first or second embodiment, depending on user requirements, the computing unit 300 may also include a processor, memory, a local storage device, an expansion device interface, and so on. On first run, the main control unit 100 sends an initialization instruction, assigns IP addresses to all computing units 300, instructs the topology construction module 200 to build the network topology graph, and instructs the route construction module 320 to build the routing table.
In an implementation of the first or second embodiment, the topology construction module 200 sends communication packets to the computing units 300 connected to it and traverses the IP addresses of all computing units 300. From the traversal result it obtains the total number N of computing units 300 and the neighbor count of each computing unit 300, and takes the maximum neighbor count M. It takes the base-2 logarithm of M and rounds up to obtain the network dimension K, generates at least one network topology graph from N and K, and sends all of the network topology graphs to the computing resource pool. Distance in the network is measured in hops: each relay node traversed during data communication counts as one hop, and two computing units 300 whose distance is zero hops are neighbors of each other.
Specifically, after obtaining the total number N of computing units 300 and the maximum neighbor count M, the topology construction module 200 builds a Cartesian coordinate system in which the coordinate $x_i$ denotes any node in the i-th dimension and $N_i$ denotes the number of nodes in the i-th dimension, where $K = \lceil \log_2 M \rceil$ (that is, $\log_2 M$ rounded up) and $\max_{1 \le i \le K} N_i \le N - M + 2$.
The coordinates $x_i$ satisfy:
$$0 \le x_i \le 2N_i - 1$$
$$x_j \bmod 2 = x_{j+1} \bmod 2$$
Each node $x_i$ is connected to $2^K$ neighbor nodes $y_i$, whose coordinates satisfy:
$$y_i = (x_i + 1) \bmod 2N_i \quad \text{or} \quad y_i = (x_i - 1 + 2N_i) \bmod 2N_i$$
According to these formulas, the topology construction module 200 can build at least one K-dimensional network topology ($N_1 \times N_2 \times \dots \times N_K$) in which every node is connected to $2^K$ neighbor nodes, the largest per-dimension node count does not exceed N − M + 2, and the total number of nodes is not less than N.
The route construction module 320 obtains all of the network topology graphs and, from each graph, computes every possible routing path from its own computing unit, taken as the starting unit, to the other computing units, and writes these paths into the full-path routing table. It then sends communication confirmation packets along the routing paths recorded in the full-path routing table to find the paths that are actually alive, that is, the paths that can actually carry traffic, and generates a communication routing table from those live paths. The communication routing table is grouped by the destination IP address of each routing path, and the paths within each group are sorted in ascending order of hop count.
When communication is needed, the computing unit 300 selects routing paths for the destination IP address from top to bottom. If the selected path has failed and cannot carry traffic, it selects the next routing path, which keeps data flowing between the computing units.
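The top-to-bottom path selection with fallback can be sketched in Python. The sketch assumes the communication routing table is a dictionary from destination IP to an ordered list of paths, and that `transmit` is a caller-supplied function that returns True on success. Both the table layout and all names are hypothetical.

```python
def send_with_failover(table, dest_ip, payload, transmit):
    """Try each path for dest_ip in table order; return the first path that works."""
    for path in table.get(dest_ip, []):
        if transmit(path, payload):
            return path
    return None  # no usable path to this destination


# Hypothetical table for one destination, shortest path first.
table = {
    "10.0.0.3": [
        ["10.0.0.1", "10.0.0.3"],
        ["10.0.0.1", "10.0.0.2", "10.0.0.3"],
    ]
}


def flaky_transmit(path, payload):
    """Stand-in transport in which the direct link has failed."""
    return len(path) != 2


print(send_with_failover(table, "10.0.0.3", b"data", flaky_transmit))
```

Because the table is already sorted by hop count, the fallback is always the shortest path that still works.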
In an embodiment of the invention, optionally, the user enters the total number N of computing units 300 and the maximum neighbor count M through the main control unit 100, which sends them to the topology construction module 200.
In an embodiment of the invention, the main control unit further includes a resource allocation module and a resource adjustment module.
The resource allocation module sets a resource acquisition permission for each obtained task and allocates its initial resources, which consist of minimum resources and elastic resources.
The resource adjustment module adjusts the elastic resources each task may actually occupy according to the resource acquisition permission of each task.
In one specific application scenario of the invention, the system runs a school's teaching management system. It includes 16 computing units, each with 2 processing cores, 4 GB of memory, and a 250 GB solid-state drive, so the computing resource pool has 32 processing cores, 64 GB of memory, and 4 TB of storage in total. The tasks are student records management, course selection management, teaching schedule management, school personnel management, and examination management. Through the resource allocation module of the main control unit 100, the user divides resources among the tasks and sets their permissions. For example, the course selection, examination, and teaching schedule management tasks are each allocated 8 processing cores, 16 GB of memory, and 500 GB of storage and are given the general resource acquisition permission. The school personnel and student records management tasks are each allocated 4 processing cores, 8 GB of memory, and 1250 GB of storage and are given the highest resource acquisition permission. The minimum resource configuration of every task is set to 2 processing cores, 4 GB of memory, and 250 GB of storage, and the remainder is elastic.
When demand for a task rises suddenly, the resource adjustment module adjusts each task's elastic resource configuration according to that task's resource acquisition permission. For example, at the start of a term the demand of the course selection task surges. The resource adjustment module determines that the course selection task has the general permission, so it reassigns to it the elastic resources of the examination and teaching schedule tasks, which also have the general permission, and leaves untouched the resources held by the school personnel and student records tasks, which have the highest permission. When new students enroll, the demand of the student records task rises. The resource adjustment module determines that this task has the highest permission, so it first reassigns to it the elastic resources of the course selection, examination, and teaching schedule tasks, which have the general permission. If that still does not meet the demand, it then reassigns the elastic resources of the school personnel task.
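The adjustment rule in this scenario can be sketched in Python. The sketch models processing cores only and reads the rule as: a task may draw elastic resources from other tasks whose permission is no higher than its own, taking from general-permission tasks first. The class, the function, and the task names are hypothetical; the core counts follow the scenario above.

```python
from dataclasses import dataclass

GENERAL, HIGHEST = 0, 1


@dataclass
class Task:
    name: str
    permission: int
    minimum: int  # cores that are never reassigned
    elastic: int  # cores that may be lent to other tasks


def rebalance(tasks, requester_name, needed):
    """Move elastic cores to the requester; return the demand left unmet."""
    requester = next(t for t in tasks if t.name == requester_name)
    donors = sorted(
        (t for t in tasks if t is not requester and t.permission <= requester.permission),
        key=lambda t: t.permission,  # general-permission donors first
    )
    for donor in donors:
        if needed == 0:
            break
        moved = min(donor.elastic, needed)
        donor.elastic -= moved
        requester.elastic += moved
        needed -= moved
    return needed


def scenario():
    """Initial core split from the scenario: minimum 2 cores each, rest elastic."""
    return [
        Task("course selection", GENERAL, 2, 6),
        Task("examination", GENERAL, 2, 6),
        Task("teaching schedule", GENERAL, 2, 6),
        Task("school personnel", HIGHEST, 2, 2),
        Task("student records", HIGHEST, 2, 2),
    ]


term_start = scenario()
unmet = rebalance(term_start, "course selection", 10)
print(unmet, [(t.name, t.elastic) for t in term_start])
```

In the term-start case the course selection task receives 10 cores from the two other general-permission tasks, and the highest-permission tasks keep their elastic cores.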
In an embodiment of the invention, the main control unit 100 may be arranged in one of the computing units 300.
In the third embodiment of the invention, as shown in Fig. 3, the system provided in the first or second embodiment further includes a total route construction module 400, which is connected to both the topology construction module 200 and the computing resource pool. The topology construction module 200 sends all of the generated network topology graphs to the total route construction module 400. The total route construction module 400 sends traversal packets to the computing units 300 to obtain the IP addresses of all computing units 300, computes from each network topology graph the possible routing paths between every pair of computing units 300, and generates at least one full-path routing table according to the IP address of the starting computing unit. It then sends all of the full-path routing tables to the computing resource pool. Each computing unit 300 in the pool keeps the full-path routing table whose starting address is its own local IP address and forwards the remaining tables to the other computing units 300. The route construction module 320 sends communication confirmation packets along the routing paths recorded in the full-path routing table it obtained to find the paths that are actually alive, that is, the paths that can actually carry traffic, and generates a communication routing table from those live paths. The communication routing table is grouped by the destination IP address of each routing path, and the paths within each group are sorted in ascending order of hop count. When communication is needed, the computing unit 300 selects routing paths for the destination IP address from top to bottom. If the selected path has failed and cannot carry traffic, it selects the next routing path, which keeps data flowing between the computing units.
In an implementation of the third embodiment, depending on user requirements, the computing unit 300 may also include a processor, memory, a local storage device, an expansion device interface, and so on. On first run, the main control unit 100 sends an initialization instruction, assigns IP addresses to all computing units 300, instructs the topology construction module 200 to build the network topology graph, instructs the total route construction module 400 to build the full-path routing tables, and instructs the route construction module 320 to build the communication routing table.
As shown in Fig. 4, the invention also provides a method of generating a network topology graph, comprising the following steps:
S110: obtain the total number N of nodes in the network and the number of neighbor nodes of each node, and take the maximum neighbor count M;
S120: compute the network dimension K, where K is the base-2 logarithm of M, rounded up;
S130: build at least one K-dimensional topology network in which every node is connected to $2^K$ neighbor nodes and the largest per-dimension node count does not exceed N − M + 2.
In an embodiment of the invention, the above steps are carried out by the system provided by the invention, specifically by the topology construction module 200.
Specifically, the topology construction module 200 sends communication packets to the computing units 300 connected to it and traverses the IP addresses of all computing units 300. From the traversal result it obtains the total number N of computing units 300 and the neighbor count of each computing unit 300, and takes the maximum neighbor count M. Two computing units 300 whose distance is one hop are neighbors of each other. The module takes the base-2 logarithm of M and rounds up to obtain the network dimension K, generates at least one network topology graph from N and K, and sends all of the network topology graphs to the computing resource pool.
Specifically, the topology construction module 200 constructs a K-dimensional network topology ($N_1 \times N_2 \times \dots \times N_K$) as follows:
A Cartesian coordinate system is established in which the coordinate $x_i$ denotes the position of a node in the i-th dimension and $N_i$ denotes the number of nodes in the i-th dimension, where $K = \lceil \log_2 M \rceil$ and $\max_{1 \le i \le K} N_i \le N - M + 2$.
The coordinates $x_i$ satisfy:
$0 \le x_i \le 2N_i - 1$
$x_j \bmod 2 = x_{j+1} \bmod 2$
Each node $x_i$ is connected to $2^K$ neighbor nodes $y_i$, whose coordinates satisfy:
$y_i = (x_i + 1) \bmod 2N_i$ or $y_i = (x_i - 1 + 2N_i) \bmod 2N_i$
In every K-dimensional network topology ($N_1 \times N_2 \times \dots \times N_K$) constructed according to the above formulas, each node is connected to $2^K$ neighbor nodes, the largest per-dimension node count is no greater than N-M+2, and the total number of nodes is no less than N.
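The construction can be sketched in Python under one possible reading of the formulas: a node is a coordinate tuple whose entries all share the same parity, and a neighbor is reached by shifting every coordinate by +1 or -1 modulo twice that dimension's node count. Under this reading each node has two choices per dimension, which gives 2 to the power K neighbors whenever every dimension has at least two nodes. The function name `build_topology` and the tuple representation are assumptions made for illustration.

```python
from itertools import product


def build_topology(dims):
    """Build a K-dimensional topology for dims = (N_1, ..., N_K).

    Returns a dict mapping each node (a coordinate tuple) to the sorted
    list of its neighbor nodes.
    """
    k = len(dims)
    # Coordinates range over 0 .. 2*N_i - 1 and must all share one parity.
    nodes = [
        x for x in product(*(range(2 * n) for n in dims))
        if len({c % 2 for c in x}) == 1
    ]
    topology = {}
    for x in nodes:
        neighbors = set()
        # Each neighbor shifts every coordinate by +1 or -1 (mod 2*N_i).
        # Python's % is non-negative, so (c - 1) % m equals (c - 1 + m) % m.
        for signs in product((1, -1), repeat=k):
            y = tuple((c + s) % (2 * n) for c, s, n in zip(x, signs, dims))
            neighbors.add(y)
        topology[x] = sorted(neighbors)
    return topology
```

For example, `build_topology((2, 3))` produces a two-dimensional topology in which every node has four neighbors.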
In one embodiment of the present invention, the user enters the total number N of computing units 300 and the maximum neighbor count M through the main control unit 100, and the main control unit 100 sends N and M to the topology construction module 200.
As shown in Fig. 5, the present invention also provides a method for generating a routing table, comprising the following steps:
S210: select a start node and a destination node, and obtain all network topology graphs;
S220: from all of the network topology graphs obtained, calculate every path from the start node to the destination node and generate a full-path routing table;
S230: confirm which paths in the full-path routing table are alive, and generate a communication routing table;
S240: group the routing paths in the communication routing table by destination IP address, and sort the routing paths within each group in ascending order of hop count.
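Steps S210 to S240 can be sketched in Python as follows. The sketch makes several assumptions that the patent does not state: a topology graph is a dictionary from IP address to neighboring IP addresses, "all paths" is taken to mean all loop-free paths, and the callback `is_alive` is a hypothetical stand-in for the result of sending a communication acknowledgement packet along a path.

```python
def all_paths(topology, start, goal):
    """Yield every loop-free path from start to goal (depth-first search)."""
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()
        if node == goal:
            yield path
            continue
        for nxt in topology[node]:
            if nxt not in path:  # never revisit a node on the same path
                stack.append((nxt, path + [nxt]))


def communication_table(topology, start, is_alive):
    """Build the communication routing table for one start node.

    Returns a dict mapping destination IP -> list of alive paths,
    sorted in ascending order of hop count.
    """
    table = {}
    for dest in topology:
        if dest == start:
            continue
        full_paths = list(all_paths(topology, start, dest))  # S220
        alive = [p for p in full_paths if is_alive(p)]       # S230
        alive.sort(key=lambda p: len(p) - 1)                 # S240
        table[dest] = alive
    return table
```

Keeping every alive path, rather than only the shortest one, means the next entry in a group can serve as a fallback when the first path fails.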
In one embodiment of the present invention, the above steps are carried out with the system of the first or second embodiment of the invention, specifically by the route construction module 320 of each computing unit 300.
Specifically, the route construction module 320 obtains at least one network topology graph and, from each network topology graph, calculates all possible routing paths that start at the IP address of its own computing unit and lead to each of the other computing units, and writes them into the full-path routing table. The route construction module 320 then sends communication acknowledgement packets along the paths recorded in the full-path routing table to determine which routing paths are actually alive, that is, which can actually carry communication, and generates the communication routing table from those paths. The communication routing table is grouped by the destination IP address of the routing paths, and the routing paths within each group are sorted in ascending order of hop count.
In another embodiment of the present invention, the above steps are carried out with the system of the third embodiment of the invention, specifically by the total route construction module 400 and the route construction module 320 together: steps S210 and S220 are completed in the total route construction module 400, and steps S230 and S240 are completed by the route construction module 320.
Specifically, the total route construction module 400 obtains all network topology graphs and sends traversal communication packets to the computing units 300 to obtain the IP addresses of all computing units 300. From each network topology graph it calculates the possible routing paths between the computing units 300 and generates at least one full-path routing table keyed by the IP address of the starting computing unit. The total route construction module 400 sends all full-path routing tables to the computing resource pool. The route construction module 320 of each computing unit 300 intercepts the full-path routing table whose start address is its local IP address and forwards the remaining full-path routing tables to the other computing units 300. The route construction module 320 then sends communication acknowledgement packets along the paths recorded in its full-path routing table to determine which routing paths are actually alive, that is, which can actually carry communication, and generates the communication routing table from those paths. The communication routing table is grouped by the destination IP address of the routing paths, and the routing paths within each group are sorted in ascending order of hop count.
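The intercept-and-forward step of this embodiment can be sketched as follows. The data layout (a dictionary from start IP address to that unit's list of paths) and the function name `intercept_and_forward` are assumptions for illustration; the patent does not specify how the tables are encoded or transmitted.

```python
def intercept_and_forward(local_ip, full_path_tables):
    """Split incoming full-path routing tables at one computing unit.

    `full_path_tables` maps a start IP address to the list of paths that
    begin at that address. The unit keeps the table whose start address is
    its own IP and returns the rest for forwarding to other units.
    """
    own_table = full_path_tables.get(local_ip, [])
    to_forward = {
        src: paths
        for src, paths in full_path_tables.items()
        if src != local_ip
    }
    return own_table, to_forward
```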
Obviously, the above embodiments are merely examples given to explain the technical solution of the present invention clearly and do not limit its embodiments. Those of ordinary skill in the art may make other changes or variations in different forms on the basis of the above description; provided they do not depart from the inventive concept, all of these fall within the protection scope of the present invention. The protection scope of this patent shall therefore be determined by the appended claims.
Claims (10)
1. A direct-connect architecture computing cluster system based on InfiniBand, characterized by comprising a topology construction module and a computing resource pool; the computing resource pool is connected to the main control unit and to the topology construction module, respectively;
wherein the computing resource pool comprises at least two computing units, and the computing units are interconnected through an InfiniBand network;
each computing unit comprises an InfiniBand adaptation module and a route construction module;
the topology construction module is configured to obtain the total number of computing units and the neighbor count of each computing unit, derive the maximum neighbor count, calculate the network dimension from the maximum neighbor count, generate at least one network topology graph from the total number of computing units and the network dimension, and send all of the network topology graphs to the computing resource pool;
the InfiniBand adaptation module is configured to provide a data transmission service based on the InfiniBand protocol, so as to enable data communication between the computing units;
the route construction module is configured to obtain all of the network topology graphs, calculate from each network topology graph all possible communication paths between its own computing unit and the other computing units, and generate a full-path routing table; the route construction module is further configured to determine the routing paths in the full-path routing table that are actually alive and to generate a communication routing table from those paths, the communication routing table being grouped by the destination IP address of the routing paths, with the routing paths within each group sorted in ascending order of hop count.
2. The direct-connect architecture computing cluster system based on InfiniBand according to claim 1, characterized by further comprising a main control unit, the main control unit being connected to any one of the computing units;
wherein the main control unit is configured to obtain a task, split the task, and send it to the connected computing unit, which then distributes it to the other computing units; the main control unit is further configured to initialize the computing units.
3. The direct-connect architecture computing cluster system based on InfiniBand according to claim 2, characterized in that the main control unit comprises a task acquisition module, a task allocation module and an initialization module;
wherein the task acquisition module is configured to obtain a task; the task allocation module is configured to divide the task into several subtasks and assign computing units to the subtasks, and is further configured to send the subtasks to the computing resource pool; the initialization module is configured to assign IP addresses to the computing units and is further configured to initialize the topology construction module and the route construction module.
4. The direct-connect architecture computing cluster system based on InfiniBand according to claim 1, characterized in that the main control unit further comprises a state reading module and a feedback module; the state reading module is configured to read the working state of the computing units and send it to the feedback module, and the feedback module is configured to report the received working state of the computing units to the user.
5. The direct-connect architecture computing cluster system based on InfiniBand according to claim 1, characterized in that, in one embodiment of the present invention, the main control unit further comprises a resource allocation module and a resource adjustment module;
the resource allocation module is configured to set a resource acquisition permission for each obtained task and to allocate initial resources to it; the resource adjustment module is configured to adjust the resources each task may occupy according to that task's resource acquisition permission.
6. The direct-connect architecture computing cluster system based on InfiniBand according to claim 1, characterized in that the main control unit may be arranged in any one of the computing units.
7. A direct-connect architecture computing cluster system based on InfiniBand, characterized by comprising the direct-connect architecture computing cluster system based on InfiniBand according to any one of claims 1 to 6 and further comprising a total route construction module, the total route construction module being connected to the computing resource pool and also to the topology construction module;
the total route construction module is configured to obtain the IP addresses of all of the computing units; it is further configured to obtain all network topology graphs, calculate the communication paths between the computing units from the network topology graphs, and generate at least one full-path routing table keyed by the IP address of the starting computing unit; the total route construction module is further configured to send the full-path routing tables to the computing resource pool.
8. A method for generating a network topology graph, characterized by comprising:
obtaining the total number of nodes N in the network and the number of neighbor nodes of each node, and taking the maximum neighbor count M;
calculating the network dimension K, where K is the base-2 logarithm of M, rounded up, i.e. $K = \lceil \log_2 M \rceil$;
constructing at least one K-dimensional network topology graph in which every node is connected to $2^K$ neighbor nodes and the largest per-dimension node count is no greater than N-M+2.
9. The method for generating a network topology graph according to claim 8, characterized in that each node $x_i$ of the K-dimensional network topology graph is connected to $2^K$ neighbor nodes $y_i$;
wherein the coordinates $x_i$ satisfy:
$0 \le x_i \le 2N_i - 1$
and the coordinates $y_i$ satisfy:
$y_i = (x_i + 1) \bmod 2N_i$ or $y_i = (x_i - 1 + 2N_i) \bmod 2N_i$
where mod denotes the modulo operation, the coordinate $x_i$ denotes the position of a node in the i-th dimension, $N_i$ denotes the number of nodes in the i-th dimension, $K = \lceil \log_2 M \rceil$, and $\max_{1 \le i \le K} N_i \le N - M + 2$.
10. A method for generating a routing table, characterized by comprising:
selecting a start node and a destination node, and obtaining all network topology graphs;
calculating, from all of the network topology graphs obtained, every path from the start node to the destination node, and generating a full-path routing table;
confirming which paths in the full-path routing table are alive, and generating a communication routing table;
grouping the routing paths in the communication routing table by destination IP address, and sorting the routing paths within each group in ascending order of hop count.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610580213.5A CN106100961A (en) | 2016-07-21 | 2016-07-21 | A kind of Direct Connect Architecture computing cluster system based on infinite bandwidth and construction method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610580213.5A CN106100961A (en) | 2016-07-21 | 2016-07-21 | A kind of Direct Connect Architecture computing cluster system based on infinite bandwidth and construction method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106100961A (en) | 2016-11-09 |
Family
ID=57449158
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610580213.5A Pending CN106100961A (en) | 2016-07-21 | 2016-07-21 | A kind of Direct Connect Architecture computing cluster system based on infinite bandwidth and construction method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106100961A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2018119830A1 (en) * | 2016-12-29 | 2018-07-05 | 中国科学院计算技术研究所 | Method and system for constructing task processing path |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1921437A (en) * | 2006-08-04 | 2007-02-28 | 上海红神信息技术有限公司 | Inside and outside connecting network topology framework and parallel computing system for self-consistent expanding the same |
CN101309201A (en) * | 2007-05-14 | 2008-11-19 | 华为技术有限公司 | Route processing method, routing processor and router |
CN101727512A (en) * | 2008-10-17 | 2010-06-09 | 中国科学院过程工程研究所 | General algorithm based on variation multiscale method and parallel calculation system |
CN102790698A (en) * | 2012-08-14 | 2012-11-21 | 南京邮电大学 | Large-scale computing cluster task scheduling method based on energy-saving tree |
CN103152397A (en) * | 2013-02-06 | 2013-06-12 | 浪潮电子信息产业股份有限公司 | Method for designing multi-control storage system |
CN104243320A (en) * | 2014-09-10 | 2014-12-24 | 珠海市君天电子科技有限公司 | Method and device for optimizing network access paths |
CN104283789A (en) * | 2014-09-19 | 2015-01-14 | 深圳市腾讯计算机***有限公司 | Routing convergence method and system |
CN206100022U (en) * | 2016-07-21 | 2017-04-12 | 广州高能计算机科技有限公司 | It calculates cluster system directly to link framework based on infinite bandwidth |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108566659B (en) | 5G network slice online mapping method based on reliability | |
CN106101262A (en) | A kind of Direct Connect Architecture computing cluster system based on Ethernet and construction method | |
CN104375882B (en) | The multistage nested data being matched with high-performance computer structure drives method of calculation | |
Prisacari et al. | Bandwidth-optimal all-to-all exchanges in fat tree networks | |
CN107836001A (en) | Convolutional neural networks on hardware accelerator | |
CN102486739A (en) | Method and system for distributing data in high-performance computer cluster | |
Zhao et al. | Joint VM placement and topology optimization for traffic scalability in dynamic datacenter networks | |
Gong et al. | Revenue-driven virtual network embedding based on global resource information | |
CN105049353A (en) | Method for configuring routing path of business and controller | |
CN108111335A (en) | A kind of method and system dispatched and link virtual network function | |
Chen et al. | Tology-aware optimal data placement algorithm for network traffic optimization | |
JP6809360B2 (en) | Information processing equipment, information processing methods and programs | |
CN105391651A (en) | Virtual optical network multilayer resource convergence method and system | |
Pearce et al. | One quadrillion triangles queried on one million processors | |
Wolfe et al. | Preliminary performance analysis of multi-rail fat-tree networks | |
Navaridas et al. | Reducing complexity in tree-like computer interconnection networks | |
Filelis-Papadopoulos et al. | Towards simulation and optimization of cache placement on large virtual content distribution networks | |
El-Zoghdy | A hierarchical load balancing policy for grid computing environment | |
Pascual et al. | Optimization-based mapping framework for parallel applications | |
CN206100022U (en) | It calculates cluster system directly to link framework based on infinite bandwidth | |
CN102404409A (en) | Equivalent cloud network system based on optical packet switch | |
CN106100961A (en) | A kind of Direct Connect Architecture computing cluster system based on infinite bandwidth and construction method | |
Marinakis et al. | A hybrid discrete artificial bee colony algorithm for the multicast routing problem | |
Gaffour et al. | A new congestion-aware routing algorithm in network-on-chip: 2D and 3D comparison | |
US20230094933A1 (en) | Connecting processors using twisted torus configurations |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20161109 |