CN103152281B - Two-level switch-based load balanced scheduling method - Google Patents

Two-level switch-based load balanced scheduling method

Info

Publication number
CN103152281B
CN103152281B
Authority
CN
China
Prior art keywords
input port
voq
region
queue
cell
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201310069391.8A
Other languages
Chinese (zh)
Other versions
CN103152281A (en)
Inventor
戴艺
肖立权
伍楠
曹继军
高蕾
张鹤颖
童元满
董德尊
王绍刚
沈胜宇
刘路
肖灿文
张磊
王永庆
齐星云
陆平静
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National University of Defense Technology
Original Assignee
National University of Defense Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National University of Defense Technology filed Critical National University of Defense Technology
Priority to CN201310069391.8A priority Critical patent/CN103152281B/en
Publication of CN103152281A publication Critical patent/CN103152281A/en
Application granted granted Critical
Publication of CN103152281B publication Critical patent/CN103152281B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a two-level switch-based load balanced scheduling method, which comprises the following steps: first-level input ports buffer arriving cells in virtual output queues (VOQs) according to their destination ports; a scheduler switches packets to the second-level input ports through the first-level switching network, where k cells from the same flow in a VOQ are called a unit frame; each first-level input port executes minimum-length assignment according to a traffic distribution matrix and, in k consecutive external time slots, transmits a unit frame of a flow through the first-level switching network to that flow's fixed mapping region; the N second-level input ports are divided sequentially into N/k groups, each group of k consecutive second-level input ports forming a region; each region buffers cells in output queues (OQs) according to their destination ports and switches them to the destination output ports through the second-level switching network. The method has the advantages of a simple scheduling process requiring no computation or communication, easy hardware implementation, guaranteed 100 percent throughput, and preserved packet order.

Description

Load-balanced scheduling method based on two-stage switching
Technical field
The present invention relates to the field of packet scheduling in parallel switching fabrics, and in particular to a packet scheduling method that achieves load balancing and packet order preservation in a parallel switching fabric.
Background art
Researchers have studied packet reordering extensively using both active and passive measurements. J. Bennett measured packet reordering at the MAE-East ISP switching center and found that reordering is severe under heavy load and a high degree of parallelism in the network equipment, with more than 90% of connections experiencing out-of-order delivery. J. Bennett attributed this high reordering rate mainly to local parallelism inside the network, including parallel switching equipment and parallel transmission links, and pointed out that packet reordering is not a pathological behavior of the network. Packet reordering occurs in most high-performance parallel switching fabrics; the load-balanced switch, the parallel packet switch and the multi-level switch, among others, all suffer from it. Deeply parallel designs thus bring two major problems to a switching fabric: load balancing and packet reordering. Load balancing is the key to delay and throughput guarantees, but it may cause packet reordering, and out-of-order packets degrade the condition of the Internet: the widely used TCP transport protocol wrongly interprets out-of-order packets as a sign of congestion-induced packet loss, triggering unnecessary retransmissions and TCP timeouts. These retransmissions and timeouts reduce TCP throughput and increase packet delay, so preserving packet order while load-balancing the input traffic is essential.
Methods for preventing packet reordering fall into two classes: 1) limit the amount of reordering by placing a resequencing buffer of finite capacity at the output, used to reorder out-of-order packets; 2) guarantee that packets leave the output port in their arrival order, so that reordering never occurs. Because the buffer capacity is finite, the first method can only handle reordering within a bounded range; if the resequencing buffer is enlarged to O(N^2), where N is the number of ports, the reordering problem can be solved completely, but packet delay then grows quadratically as well. Limiting the amount of reordering therefore cannot solve the reordering problem effectively and scales poorly to the high port densities required of routers. The Full Ordered Frame First (FOFF) algorithm proposed by Stanford University is representative of the first class: it allows a certain amount of reordering inside the router and provides N x N buffer queues at the output to resequence out-of-order packets. It can be proved that, to guarantee in-order departure of cells, FOFF needs a resequencing buffer of at most N^2 cells, and it provides load balancing and achieves 100% throughput. In recent years more researchers have favored the second class of methods, which preserve packet order directly, eliminating the resequencing operation and buffer overhead at the output and thereby improving delay performance. The common trait of these scheduling methods is that they obtain the state of all arriving packets through some message-passing mechanism and perform centralized scheduling based on this global state. For example, the mailbox switch uses a symmetric connection pattern to create a feedback path for announcing packet departure times, and the scheduler schedules packets according to those departure times; this strategy guarantees that the packets of every flow leave the switching system in arrival order, but it cannot provide the load balancing needed to achieve 100% throughput. The alternate-matching scheduling method adopts centralized scheduling: it assumes the traffic characteristics are known and fixed, solves the centralized scheduling problem offline by matrix decomposition, and then performs distributed online scheduling with service guarantees; however, when the traffic becomes unpredictable and dynamic, centralized scheduling is hard to realize at large switch sizes. The parallel-matching scheduling method preserves packet order by passing request and grant tokens between the first-stage and second-stage input line cards, but the frequent transmission of tokens between line cards multiplies the scheduling cycle in a real hardware implementation.
Summary of the invention
The technical problem to be solved by the present invention is as follows: in view of the problems of the prior art, the invention provides a load-balanced scheduling method based on two-stage switching whose scheduling process is simple, requires no computation or communication, is easy to implement in hardware, achieves 100% throughput and guarantees packet order.
To solve the above technical problems, the present invention adopts the following technical solution:
A load-balanced scheduling method based on two-stage switching: each first-stage input port buffers arriving cells in VOQ queues according to their destination ports, and the scheduler switches packets to the second-stage input ports through the first-stage switching network; k cells from the same flow in a VOQ queue are called a unit frame, and the unit frame is the minimum scheduling unit. Each first-stage input port performs minimum-length assignment according to the traffic distribution matrix and, in k consecutive external time slots, sends a unit frame of a flow through the first-stage switching network to that flow's fixed mapping region. The N second-stage input ports are divided sequentially into N/k groups, each group of k consecutive second-stage input ports forming a region. Each region buffers cells in OQ queues according to their destination ports; because the flow-to-region mapping is fixed, the k cells of a unit frame arrive at the head of the OQ queue in sequence and are switched to the destination output port through the second-stage switching network.
As a further improvement of the present invention, a dual-rotation scheme is used to build the flow-to-region mapping:
(1.1) construct, in a round-robin fashion, the mapping from the N/k flow branches of a first-stage input port to the N/k regions of second-stage input ports, so that every region can receive all flows;
(1.2) further adjust, in a round-robin fashion, the association between different input ports and the N/k mapping patterns, so that every region receives all flows evenly distributed over the input ports.
As a further improvement of the present invention, when scheduling, a first-stage input port dispatches cells according to the flow-to-region mapping and serves its N/k flow branches in a round-robin fashion; the operation of the first-stage input port in each external time slot is as follows:
(2.1) if a VOQ queue of flow branch f contains a full frame, the full frame is scheduled with priority: its N/k unit frames are given the highest dispatching priority, the traffic distribution matrix L is looked up, and the first unit frame cut from the full frame is sent to the region R_g with the currently smallest L_{g,j}; steps (2.1.1)-(2.1.k) are then performed; if no full frame exists, go to step (2.2);
(2.1.1) if the internal link from input port i to second-stage input port S_{g,1} is idle, send the head-of-line cell of VOQ_{i,j} to second-stage input port S_{g,1} of region R_g; otherwise set f = (f+1) mod N/k and go to step (2.1);
(2.1.2) send the head-of-line cell of VOQ_{i,j} to second-stage input port S_{g,2} of region R_g;
(2.1.3) send the head-of-line cell of VOQ_{i,j} to second-stage input port S_{g,3} of region R_g;
(2.1.k) and so on, until the head-of-line cell of VOQ_{i,j} is sent to second-stage input port S_{g,k} of region R_g; then set g = (g+1) mod N/k and go to step (2.1);
(2.2) if a VOQ queue VOQ_{i,j} (kf <= j < kf+k) of flow branch f contains a unit frame with the highest dispatching priority, look up the traffic distribution matrix L and send this unit frame to the region R_g with the currently smallest L_{g,j}; perform steps (2.1.1)-(2.1.k); otherwise go to step (2.3);
(2.3) if the VOQ queues of flow branch f contain one or more unit frames, look up the traffic distribution matrix L, select the unit frame of the VOQ queue with the smallest equalization coefficient according to the lookup result, and send it to the fixed mapping region R_g of flow branch f, then perform steps (2.1.1)-(2.1.k); if flow branch f contains only one unit frame, the lookup of the traffic distribution matrix can be skipped and the unique unit frame is sent directly to its fixed mapping region R_g; otherwise flow branch f contains no unit frame, so set g = (g+1) mod N/k and go to step (2.1).
Compared with the prior art, the present invention has the following advantages:
1. The scheduling method of the present invention can be executed independently and in a distributed manner at each input port, dispatching cells according to local VOQ queue information only; it requires no communication overhead, runs with O(1) time complexity, achieves 100% throughput and guarantees packet order.
2. The present invention achieves packet order preservation and load balancing without any communication overhead between the schedulers of the input ports. By constructing a fixed flow-to-region mapping, packet reordering is avoided and the resequencing overhead is eliminated. To avoid traffic concentrating on a few regions, a dual-rotation scheme is used to construct the flow-to-region mappings of different input ports, and each input port maintains a traffic distribution matrix reflecting a globally consistent view and dispatches unit frames according to it. It can be proved that, for any output port j, the OQ_j queues within the same region have identical length and the OQ_j queues of different regions differ in length by at most 1, so that a load balancing degree of 100% is achieved.
3. The present invention only needs a suitable choice of the aggregation granularity k to obtain the theoretically lowest delay. The delay performance of the scheduling method under different aggregation granularities k has been verified by simulation and compared with mainstream load-balanced scheduling algorithms. The simulation results show that, with aggregation granularity k=2, the present invention has the best delay performance among existing scheduling algorithms that guarantee packet order, and under bursty traffic models it performs comparably to algorithms that do not preserve packet order.
4. The present invention schedules packets according to the fixed flow-to-region mapping; the scheduling process is simple, requires no computation or communication, and is easy to implement in hardware.
Brief description of the drawings
Fig. 1 is an example of a two-stage switching architecture to which the scheduling method of the present invention is applicable.
Fig. 2 is a schematic diagram of the flow-to-region mapping result obtained with the dual-rotation mapping scheme constructed by the present invention in a concrete application example, for port count N=32 and aggregation granularity k=8.
Fig. 3 is a schematic flow chart of the load-balanced scheduling method executed by the present invention in a concrete application example.
Fig. 4 is a schematic diagram of the distribution of cells in the second-stage input port OQ buffers under bursty traffic after the minimum-length assignment of the present invention is adopted, in a concrete application example.
Embodiment
The present invention is described in further detail below with reference to the accompanying drawings and specific embodiments.
In the load-balanced scheduling method based on two-stage switching of the present invention, each first-stage input port first buffers arriving cells in VOQ queues according to their destination ports, and the scheduler switches packets to the second-stage input ports through the first-stage switching network (a mesh network, as shown in the figure). k cells from the same flow in a VOQ queue are called a unit frame; the unit frame is the minimum scheduling unit of the present invention. Each first-stage input port independently executes the scheduling method of the present invention: it performs minimum-length assignment according to the traffic distribution matrix and, in k consecutive external time slots, sends a unit frame of a flow through the first-stage switching network (mesh network) to that flow's fixed mapping region. The N second-stage input ports are divided sequentially into N/k groups, each group of k consecutive second-stage input ports forming a region. Each region buffers cells in OQ queues according to their destination ports; because the flow-to-region mapping is fixed, the k cells of a unit frame arrive at the head of the OQ queue in sequence and are switched to the destination output port through the second-stage switching network (mesh network, as shown in the figure). In this process, each input port independently executes the flow-mapping-based cell scheduling algorithm: a unit frame of a flow is sent through the first-stage mesh network to that flow's fixed mapping region; each region buffers cells in OQ queues according to their destination ports, and because the flow-to-region mapping is fixed, the k cells of a unit frame arrive at the head of the OQ queue in sequence and wait for the second-stage mesh network to become idle to reach the output port in turn. If the first-stage buffers are implemented on the first-stage line cards and the second-stage buffers on the second-stage line cards, the above two-stage switching fabric becomes a typical load-balanced switching fabric, so the present invention is particularly suitable as a packet scheduling method for load-balanced routers. For ease of presentation, the invention is described in terms of mesh networks; the mesh network should be regarded as one technical means of switching packets to the second-stage input ports and the destination output ports, and other switching technologies may be used instead.
As can be seen from the above, the core of the present invention is to group every k consecutive input ports into a region, while the inputs use a flow-mapping-based load-sharing algorithm that dispatches the k cells of a flow to its fixed mapping region in a fine-grained manner. It can be proved theoretically that this scheduling strategy achieves 100% throughput and guarantees packet order. Here k is the aggregation granularity, which determines the number of cells of the same flow scheduled each time. To avoid traffic concentrating on a few regions, the present invention further uses a dual-rotation scheme to construct the flow-to-region mappings of different input ports. To achieve an even distribution of the load over the second-stage input ports, the present invention further maintains at each input port a traffic distribution matrix reflecting a globally consistent view and dispatches unit frames according to it, thereby achieving a load balancing degree of 100%.
Fig. 1 is an example of a two-stage switching architecture to which the scheduling method of the present invention is applicable. In the figure, VOQ_{i,j} denotes VOQ j of first-stage input port i; the output queue for destination j at second-stage input port l is its OQ_j; F(i,j) denotes the flow from first-stage input port i to output port j; k is the aggregation granularity, i.e. the number of cells of the same flow scheduled consecutively (k is a factor of the port count N). Every k cells of queue VOQ_{i,j} form a unit frame, the unit frames of the N/k different VOQ queues of first-stage input port i (N cells in total) form an aggregate frame, and N cells of queue VOQ_{i,j} form a full frame. The second-stage input ports 1, 2, ..., N are divided sequentially into N/k groups, each group containing k consecutive second-stage input ports; the k second-stage input ports of group g form a region, denoted R_g, and S_{r,z} denotes the z-th second-stage input port of region r. The N flows of each input port are divided into N/k groups corresponding to the N/k regions, each group containing k flows; the k flows of group f of input port i form a flow branch (referred to below as flow branch f of input port i).
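As a reading aid for the terminology above (not part of the patent), the following short Python snippet lists, for the stated N and k, which second-stage input ports (0-indexed here for simplicity) make up each region and which destination ports j belong to each flow branch:

# Grouping implied by the definitions above: second-stage input ports are
# grouped k at a time into regions, and the N destinations of a first-stage
# input port are grouped k at a time into flow branches.
N, k = 32, 8                       # example sizes; any k dividing N works
regions = [list(range(g * k, g * k + k)) for g in range(N // k)]
flow_branches = [list(range(f * k, f * k + k)) for f in range(N // k)]
print(regions[0])        # second-stage input ports of region R_0
print(flow_branches[1])  # destination ports j served by flow branch 1
# A unit frame is k cells of one VOQ_{i,j}; an aggregate frame is N/k unit
# frames (one per flow branch, N cells); a full frame is N cells of one VOQ.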
To reduce the memory bandwidth requirement of the packet buffers, the mesh networks normally operate at rate R/N (internal link speedup of 1). This leads to the following definitions:
Definition 1. The time taken by a link of rate R to send or receive one cell is an external time slot.
Definition 2. The time taken by a link of rate R/N to send or receive one unit frame is a time slot; a time slot is N times an external time slot.
In general, in each time slot, i.e. every N external time slots, the UFFS-k algorithm can aggregate N/k unit frames from the N/k flow branches of an input port into an aggregate frame and dispatch it to the second-stage input ports. The VOQ equalization coefficient reflects how evenly the traffic is distributed over the OQ queues of the second-stage input ports; the present invention dispatches unit frames according to the OQ queue lengths of each region, thereby achieving load balance across regions. The working principle of the equalization coefficient is elaborated next. By induction over time slots it can be proved theoretically that the above flow-mapping-based scheduling method guarantees that, for any region R_g and any output port j (0 <= j < N), the k OQ queues for destination j at the k second-stage input ports l in R_g have identical length (ignoring the transmission delay of a unit frame). This leads to the following definitions:
Definition 3. Since for any region R_g the OQ queue lengths for destination j are identical for all l in R_g, the equalization coefficient of queue VOQ_{i,j} is defined as the length of output queue j in its mapping region R_g, denoted L_{g,j}.
Definition 4. If queue VOQ_{i,j} contains a unit frame and its equalization coefficient is the minimum, the unit frame of VOQ_{i,j} is sent to region R_g in k consecutive external time slots; this is minimum-length assignment.
It can be proved theoretically that the minimum-length assignment strategy guarantees that, after any time slot T ends, for any two regions R_{g1} and R_{g2} the OQ queue lengths L_{g1,j} and L_{g2,j} differ by at most 1, so that 100% throughput and a load balancing degree of 100% can be achieved. To realize minimum-length assignment, the scheduler of each first-stage input port must maintain a traffic distribution matrix L = [L_{g,j}] reflecting a globally consistent view. To keep the traffic distribution matrix consistent across the views of all input ports, write operations on the matrix by different ports must be mutually exclusive. The present invention uses a lock mechanism to realize mutually exclusive writes to L_{g,j}: if g and j satisfy the assignment condition and L_{g,j} is in the unlocked state, first-stage input port i locks L_{g,j}, sends the unit frame of queue VOQ_{i,j} to its mapping region R_g, increments L_{g,j} by 1 and then unlocks it. A VOQ_{i,j} whose equalization coefficient L_{g,j} is in the locked state is simply skipped within its flow branch. It is not hard to see that only input ports with identical flow-to-region mappings that schedule VOQ_{i,j} queues with the same destination port at the same time can cause a write conflict on the same L_{g,j}; mutually exclusive writes to L_{g,j} prevent these input ports from sending multiple unit frames to output queue j of the same region simultaneously, which would unbalance the traffic distribution.
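The following Python sketch models the traffic distribution matrix and its mutually exclusive writes as described above. It is a software illustration under assumed interfaces (one lock object per entry; the method names try_lock and finish_assignment are illustrative); the patent itself targets a hardware lock mechanism.

import threading
from collections import defaultdict

class TrafficDistributionMatrix:
    """Globally shared matrix L = [L_{g,j}] with per-entry mutual exclusion."""

    def __init__(self, num_regions: int, num_ports: int):
        self.length = defaultdict(int)                       # L[g, j]
        self.locks = {(g, j): threading.Lock()
                      for g in range(num_regions) for j in range(num_ports)}

    def try_lock(self, g: int, j: int) -> bool:
        # Non-blocking lock of entry (g, j); if it is already locked by
        # another port, the corresponding VOQ is skipped within its branch.
        return self.locks[(g, j)].acquire(blocking=False)

    def finish_assignment(self, g: int, j: int) -> None:
        # Called once the unit frame has been sent to region R_g:
        # increment L[g, j] by 1, then unlock the entry.
        self.length[(g, j)] += 1
        self.locks[(g, j)].release()

In the scheduling steps below, a port would call try_lock(g, j) before dispatching the unit frame of VOQ_{i,j} and finish_assignment(g, j) once the frame has been sent.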
The flow-mapping-based packet scheduling algorithm schedules cells according to local VOQ queue information and can therefore be executed independently and in a distributed manner at each first-stage input port. Step 2 below describes the procedure each first-stage input port follows to execute the load-balanced scheduling method of the present invention.
In the present invention, the first step uses a dual-rotation scheme to build the flow-to-region mapping. The flow-to-region mapping method determines the utilization of the second-stage storage resources and of the two-stage mesh networks; to avoid throughput loss, the mapping algorithm should distribute the input load evenly over the regions. The present invention proposes a dual-rotation mapping algorithm that takes both load balancing and packet order preservation into account. Its design stems from an analysis of the root cause of packet reordering: when the OQ queue lengths of the second-stage buffers holding cells of the same flow differ, the cells become reordered, and the number of reordered cells grows with the difference in OQ queue lengths. If the k cells of a flow are dispatched in a fine-grained manner to a predefined mapping region (k consecutive second-stage input ports), then, because the flow-to-region mapping is fixed, the OQ queue lengths of the mapping region holding the cells of any given flow are identical, and the cells are delivered in order.
1.1 Since any given region can receive only the fixed k flows of a given input port, to distribute the load evenly over the regions the mapping from the N/k flow branches of a first-stage input port to the N/k regions of second-stage input ports is constructed in a round-robin fashion. Step 1.1 corresponds to the first for-loop of the dual-rotation mapping algorithm pseudo-code and guarantees that every region can receive all flows;
1.2 The association between different input ports and the N/k mapping patterns is further adjusted in a round-robin fashion. Step 1.2 corresponds to the second for-loop of the dual-rotation mapping algorithm pseudo-code and guarantees that every region receives all flows evenly distributed over the input ports.
The dual-rotation mapping algorithm establishes the flow-to-region mapping through simple modulo operations and is therefore easy to implement in hardware; its pseudo-code is described below:
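The patent's pseudo-code listing does not appear in this text. As an illustration only, the following Python sketch builds one possible dual-rotation mapping from modulo operations, assuming flow branch f of input port i maps to region (f + i) mod (N/k); the exact formula of the patent's pseudo-code may differ, but this choice reproduces the pattern grouping described for Fig. 2 (ports 0, 4, ..., 28 share one pattern, and so on).

def build_flow_to_region_mapping(N: int, k: int) -> dict:
    """Return a dict mapping (input_port i, flow_branch f) -> region index g."""
    assert N % k == 0, "k must be a factor of the port count N"
    num_regions = N // k
    mapping = {}
    for i in range(N):                      # second rotation: over input ports
        for f in range(num_regions):        # first rotation: over flow branches
            mapping[(i, f)] = (f + i) % num_regions
    return mapping

if __name__ == "__main__":
    # Reproduce the shape of the Fig. 2 example: N = 32, k = 8 -> 4 regions.
    m = build_flow_to_region_mapping(32, 8)
    for i in (0, 1, 2, 3, 4):
        print(i, [m[(i, f)] for f in range(4)])
    # Ports 0, 4, 8, ... share one pattern, ports 1, 5, 9, ... the next, etc.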
In the present invention, the second step is that the input port scheduler dispatches cells according to the flow-to-region mapping, serving the N/k flow branches in round-robin order with the unit frame as the minimum scheduling unit: in k consecutive external time slots it sends one unit frame from a fixed VOQ queue of the current flow branch (a simplified sketch of this per-slot decision is given after the numbered steps below). The operation of the scheduler of first-stage input port i in each external time slot when executing the scheduling method of the present invention is as follows:
2.1 If a VOQ queue VOQ_{i,j} (kf <= j < kf+k) of flow branch f (0 <= f < N/k, f initialized to 0) contains a full frame, the full frame is scheduled with priority: its N/k unit frames are given the highest dispatching priority, the traffic distribution matrix L is looked up, and the first unit frame cut from the full frame is sent to the region R_g with the currently smallest L_{g,j}; steps 2.1.1-2.1.k are then performed; if no full frame exists, go to step 2.2.
2.1.1 If the internal link from input port i to second-stage input port S_{g,1} is idle, send the head-of-line cell of VOQ_{i,j} to second-stage input port S_{g,1} of region R_g; otherwise set f = (f+1) mod N/k and go to step 2.1;
2.1.2 Send the head-of-line cell of VOQ_{i,j} to second-stage input port S_{g,2} of region R_g;
2.1.3 Send the head-of-line cell of VOQ_{i,j} to second-stage input port S_{g,3} of region R_g;
......
2.1.k And so on, until the head-of-line cell of VOQ_{i,j} is sent to second-stage input port S_{g,k} of region R_g; then set g = (g+1) mod N/k and go to step 2.1.
2.2 If a VOQ queue VOQ_{i,j} (kf <= j < kf+k) of flow branch f contains a unit frame with the highest dispatching priority, look up the traffic distribution matrix L and send this unit frame to the region R_g with the currently smallest L_{g,j}; perform steps 2.1.1-2.1.k; otherwise go to step 2.3.
2.3 If the VOQ queues VOQ_{i,j} (kf <= j < kf+k) of flow branch f contain one or more unit frames, look up the traffic distribution matrix L, select the unit frame of the VOQ queue with the smallest equalization coefficient according to the lookup result, and send it to the fixed mapping region R_g of flow branch f, then perform steps 2.1.1-2.1.k; if flow branch f contains only one unit frame, the lookup of the traffic distribution matrix can be skipped and the unique unit frame is sent directly to its fixed mapping region R_g; otherwise flow branch f contains no unit frame, so set g = (g+1) mod N/k and go to step 2.1.
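The following Python sketch condenses steps 2.1-2.3 into a single per-external-time-slot decision for one input port. It is a simplified model under assumed data structures (VOQs as lists of cells keyed by (i, j), the matrix L as a dict keyed by (g, j), region_of(i, f) giving the fixed mapping region of flow branch f, e.g. built as in the mapping sketch above); the lock mechanism, the internal-link idle check of step 2.1.1 and the tracking of the highest dispatching priority in step 2.2 are omitted.

def pick_unit_frame(i, f, voq, L, N, k, region_of):
    """Decide which unit frame first-stage port i sends next from flow branch f.

    Returns (j, g): the destination j whose head unit frame is sent, and the
    target region g.  Returns None if branch f has no complete unit frame.
    """
    num_regions = N // k
    branch = range(k * f, k * f + k)          # destinations served by branch f
    g_fixed = region_of(i, f)                 # fixed mapping region of branch f

    # Step 2.1: a full frame (N buffered cells) is scheduled with priority and
    # its first unit frame goes to the region with the smallest L[g, j].
    full = [j for j in branch if len(voq[(i, j)]) >= N]
    if full:
        j = full[0]
        g = min(range(num_regions), key=lambda r: L[(r, j)])
        return j, g

    # Steps 2.2/2.3: ordinary unit frames (at least k buffered cells).
    framed = [j for j in branch if len(voq[(i, j)]) >= k]
    if not framed:
        return None                           # caller advances to next branch
    if len(framed) == 1:
        return framed[0], g_fixed             # skip the matrix lookup
    # Minimum-length assignment: pick the VOQ with the smallest equalization
    # coefficient L[g_fixed, j]; the frame goes to the branch's fixed region.
    j = min(framed, key=lambda d: L[(g_fixed, d)])
    return j, g_fixed

A caller would invoke this once per flow-branch visit; on None it moves on to the next flow branch, mirroring the f = (f+1) mod N/k and g = (g+1) mod N/k updates in the steps above.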
The present invention uses the load balancing degree to measure how evenly the load is distributed over the second-stage OQ buffers. The load balancing degree is defined as follows:
Definition 5. Load balancing degree: suppose that during the time interval [t_r, t_v] switching module l has forwarded S_l[t_r, t_v] cells. The load balancing degree over this interval is the ratio of the minimum to the maximum number of cells forwarded by the different switching modules, that is:
E[t_r, t_v] = min_{l=0,...,K-1} S_l[t_r, t_v] / max_{l=0,...,K-1} S_l[t_r, t_v]   (K is the number of second-stage switching modules)
Obviously the load balancing degree E[t_r, t_v] <= 1. When E[t_r, t_v] approaches 1, the numbers of cells processed by the switching modules are essentially identical and the load is distributed evenly over them; the smaller E[t_r, t_v], the poorer the load balance across the switching modules. It can be proved that the scheduling method of the present invention achieves a load balancing degree of 100%.
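As a small numerical illustration of Definition 5 (with made-up forwarded-cell counts), the load balancing degree is simply the ratio of the smallest to the largest per-module count:

def load_balancing_degree(forwarded_cells: list) -> float:
    """forwarded_cells[l] = S_l[t_r, t_v] for second-stage switching module l."""
    return min(forwarded_cells) / max(forwarded_cells)

print(load_balancing_degree([1000, 1000, 999, 1000]))   # 0.999: well balanced
print(load_balancing_degree([1500, 500, 1000, 1000]))   # 0.333: poorly balanced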
As shown in Fig. 2, in a concrete application example, the flow-to-region mapping method designed in the first step of the present invention yields, for port count N=32 and aggregation granularity k=8, the flow-to-region mapping result obtained with the dual-rotation mapping method (VOQ_{i,j} represents the flow from input port i to output port j; the arrow denotes the mapping relation). The relatively large aggregation granularity k=8 was chosen to make the flow-to-region mapping result easy to inspect. The flow-to-region mapping method builds the fixed flow-to-region mapping; it determines the utilization of the second-stage input storage resources and of the two-stage switching network (e.g. the mesh networks), and to avoid throughput loss the mapping algorithm should distribute the input load evenly over the regions.
The present invention uses the dual-rotation mapping scheme to adjust the mapping of flow branches to regions. As shown in Fig. 2, each input port i (0 <= i <= 31) contains N/k = 4 flow branches, and the second-stage input ports are divided into N/k = 4 regions {R_0, R_1, R_2, R_3}. To maintain the utilization of the first-stage switching resources, the flow branches of the same input port should be mapped to different regions, which yields 4 mapping patterns. To guarantee that every region can receive all flows, the second-layer rotation builds the mapping of the flow branches of different input ports to regions in a round-robin fashion, which guarantees the feasibility of load balancing in theory. According to the second-layer rotation, input ports i = {0, 4, ..., 28} use the first mapping pattern, input ports i = {1, 5, ..., 29} the second, input ports i = {2, 6, ..., 30} the third, and input ports i = {3, 7, ..., 31} the fourth. This dual-rotation mapping method takes both load balancing and packet order preservation into account; the present invention dispatches unit frames according to the traffic distribution matrix of the second-stage input ports, which further guarantees that every flow is balanced across the regions.
Fig. 3 is a schematic flow chart of the load-balanced scheduling method executed by the present invention in a concrete application example, corresponding to the second step of the present invention described above.
As shown in Fig. 4, under bursty traffic and after the minimum-length assignment of the present invention is adopted, the distribution of cells in the second-stage input port OQ buffers shows that the present invention achieves an even distribution of the load while guaranteeing packet order. Because the mapping of flow branches to regions is fixed, some regions could overflow their buffers under excessive load while other regions stay relatively idle. For example, the VOQ queue of one flow branch of an input may hold N cells and keep growing while the other flow branches have no unit frame to schedule. The internal links from such a heavily loaded flow branch to its mapping region would then become a performance bottleneck, and at the same time the heavily loaded flow branch wastes internal bandwidth because the other idle links of the input line card cannot be used. To overcome this problem, in the case of bursty traffic the present invention allows a heavily loaded flow branch to preempt link resources: by giving the full frame the highest priority, a bursty flow is distributed evenly over the regions. To avoid the cell reordering that scheduling a full frame could cause, and to follow the load balancing principle, the present invention dispatches the N/k unit frames read from VOQ_{i,j}, i.e. the full frame, one after another to the region R_g with the currently smallest L_{g,j}. The assignment order obtained with this strategy is, for example: the first and second unit frames read from VOQ_{i,j} are dispatched in turn to regions 2 and 3, the third and fourth unit frames in turn to region 1 or region 4, and these four unit frames then arrive at output port j in the order in which they were read. Because the lengths of output queue j in different regions differ by at most 1, sending the N/k unit frames of the full frame one after another to the region with the currently smallest L_{g,j} cannot cause cell reordering. In essence, both unit-frame scheduling and full-frame scheduling use the minimum-length assignment strategy; the difference is that a unit frame can only be assigned to its fixed mapping region, whereas a full frame is split into N/k unit frames distributed evenly over the N/k regions.
The above are only preferred embodiments of the present invention, and the scope of protection of the present invention is not limited to the above embodiments; all technical solutions that fall under the idea of the present invention belong to the scope of protection of the present invention. It should be pointed out that, for those skilled in the art, several improvements and modifications made without departing from the principles of the present invention should also be regarded as falling within the scope of protection of the present invention.

Claims (2)

1. A load-balanced scheduling method based on two-stage switching, characterized in that:
each first-stage input port buffers arriving cells in VOQ queues according to their destination ports, and the scheduler switches packets to the second-stage input ports through the first-stage switching network; k cells from the same flow in a VOQ queue are called a unit frame, and the unit frame is the minimum scheduling unit; each first-stage input port performs minimum-length assignment according to the traffic distribution matrix and, in k consecutive external time slots, sends a unit frame of a flow through the first-stage switching network to that flow's fixed mapping region;
the N second-stage input ports are divided sequentially into N/k groups, each group of k consecutive second-stage input ports forming a region; each region buffers cells in OQ queues according to their destination ports, and because the flow-to-region mapping is fixed, the k cells of a unit frame arrive at the head of the OQ queue in sequence and are switched to the destination output port through the second-stage switching network;
when scheduling, a first-stage input port dispatches cells according to the flow-to-region mapping and serves its N/k flow branches in a round-robin fashion; the operation of the first-stage input port in each external time slot is as follows:
(2.1) if a VOQ queue VOQ_{i,j} of flow branch f contains a full frame, the full frame is scheduled with priority, where 0 <= f < N/k, f is initialized to 0, and kf <= j < kf+k; the N/k unit frames of this full frame are given the highest dispatching priority, the traffic distribution matrix L is looked up, and the first unit frame cut from the full frame is sent to the region R_g with the currently smallest L_{g,j}; steps (2.1.1)-(2.1.k) are performed; if no full frame exists, go to step (2.2); the equalization coefficient of queue VOQ_{i,j} equals the length of output queue j in its mapping region R_g and is denoted L_{g,j};
(2.1.1) if the internal link from first-stage input port i to second-stage input port S_{g,1} is idle, send the head-of-line cell of VOQ_{i,j} to second-stage input port S_{g,1} of region R_g; otherwise set f = (f+1) mod N/k and go to step (2.1);
(2.1.2) send the head-of-line cell of VOQ_{i,j} to second-stage input port S_{g,2} of region R_g;
(2.1.3) send the head-of-line cell of VOQ_{i,j} to second-stage input port S_{g,3} of region R_g;
(2.1.k) and so on, until the head-of-line cell of VOQ_{i,j} is sent to second-stage input port S_{g,k} of region R_g; then set g = (g+1) mod N/k and go to step (2.1);
(2.2) if a VOQ queue VOQ_{i,j} of flow branch f, where kf <= j < kf+k, contains a unit frame with the highest dispatching priority, look up the traffic distribution matrix L and send this unit frame to the region R_g with the currently smallest L_{g,j}; perform steps (2.1.1)-(2.1.k); otherwise go to step (2.3);
(2.3) if the VOQ queues of flow branch f contain one or more unit frames, look up the traffic distribution matrix L, select the unit frame of the VOQ queue with the smallest equalization coefficient according to the lookup result, and send it to the fixed mapping region R_g of flow branch f, then perform steps (2.1.1)-(2.1.k); if flow branch f contains only one unit frame, the lookup of the traffic distribution matrix can be skipped and the unique unit frame is sent directly to its fixed mapping region R_g; otherwise flow branch f contains no unit frame, so set g = (g+1) mod N/k and go to step (2.1);
wherein VOQ_{i,j} denotes VOQ j of first-stage input port i; the output queue for destination j at second-stage input port l is its OQ_j; F(i,j) denotes the flow from first-stage input port i to output port j; k is the aggregation granularity, i.e. the number of cells of the same flow scheduled consecutively, and k is a factor of the port count N; every k cells of queue VOQ_{i,j} form a unit frame, and the unit frames of the N/k different VOQ queues of first-stage input port i form an aggregate frame, amounting to N cells in total; N cells of queue VOQ_{i,j} form a full frame; the second-stage input ports 1, 2, ..., N are divided sequentially into N/k groups, each group containing k consecutive second-stage input ports; the k second-stage input ports of group g form a region, denoted R_g, and S_{r,z} denotes the z-th second-stage input port of region r; the N flows of each first-stage input port are divided into N/k groups corresponding to the N/k regions, each group containing k flows, and the k flows of group f of first-stage input port i form a flow branch;
the external time slot is the time taken by a link of rate R to send or receive one cell.
2. The load-balanced scheduling method based on two-stage switching according to claim 1, characterized in that a dual-rotation scheme is used to build the flow-to-region mapping:
(1.1) construct, in a round-robin fashion, the mapping from the N/k flow branches of a first-stage input port to the N/k regions of second-stage input ports, so that every region can receive all flows;
(1.2) further adjust, in a round-robin fashion, the association between different first-stage input ports and the N/k mapping patterns, so that every region receives all flows evenly distributed over the input ports.
CN201310069391.8A 2013-03-05 2013-03-05 Two-level switch-based load balanced scheduling method Active CN103152281B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310069391.8A CN103152281B (en) 2013-03-05 2013-03-05 Two-level switch-based load balanced scheduling method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310069391.8A CN103152281B (en) 2013-03-05 2013-03-05 Two-level switch-based load balanced scheduling method

Publications (2)

Publication Number Publication Date
CN103152281A CN103152281A (en) 2013-06-12
CN103152281B true CN103152281B (en) 2014-09-17

Family

ID=48550152

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310069391.8A Active CN103152281B (en) 2013-03-05 2013-03-05 Two-level switch-based load balanced scheduling method

Country Status (1)

Country Link
CN (1) CN103152281B (en)

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103825845A (en) * 2014-03-17 2014-05-28 北京航空航天大学 Matrix decomposition-based packet scheduling algorithm of reconfigurable VOQ (virtual output queuing) structure switch
CN108243113B (en) * 2016-12-26 2020-06-16 深圳市中兴微电子技术有限公司 Random load balancing method and device
CN108632143A (en) * 2017-03-16 2018-10-09 华为数字技术(苏州)有限公司 A kind of method and apparatus of transmission data
CN109391556B (en) * 2017-08-10 2022-02-18 深圳市中兴微电子技术有限公司 Message scheduling method, device and storage medium
CN107770093B (en) * 2017-09-29 2020-10-23 内蒙古农业大学 Working method of preposed continuous feedback type two-stage exchange structure
CN108259382B (en) * 2017-12-06 2021-10-15 中国航空工业集团公司西安航空计算技术研究所 3x256 priority scheduling circuit
CN108540398A (en) * 2018-03-29 2018-09-14 江汉大学 Feedback-type load balancing alternate buffer dispatching algorithm
CN112653623B (en) * 2020-12-21 2023-03-14 国家电网有限公司信息通信分公司 Relay protection service-oriented route distribution method and device
CN114697275B (en) * 2020-12-30 2023-05-12 深圳云天励飞技术股份有限公司 Data processing method and device
CN113179226B (en) * 2021-03-31 2022-03-29 新华三信息安全技术有限公司 Queue scheduling method and device
CN114448899A (en) * 2022-01-20 2022-05-06 天津大学 Method for balancing network load of data center
CN114500581B (en) * 2022-01-24 2024-01-19 芯河半导体科技(无锡)有限公司 Method for realizing equal-delay distributed cache Ethernet MAC architecture
CN114415969B (en) * 2022-02-09 2023-09-29 杭州云合智网技术有限公司 Method for dynamically storing messages of exchange chip


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7362733B2 (en) * 2001-10-31 2008-04-22 Samsung Electronics Co., Ltd. Transmitting/receiving apparatus and method for packet retransmission in a mobile communication system
CN101404616A (en) * 2008-11-04 2009-04-08 北京大学深圳研究生院 Load balance grouping and switching structure and its construction method
WO2011050541A1 (en) * 2009-10-31 2011-05-05 北京大学深圳研究生院 Load balancing packet switching structure with the minimum buffer complexity and construction method thereof
CN102123087A (en) * 2011-02-18 2011-07-13 天津博宇铭基信息科技有限公司 Method for quickly calibrating multi-level forwarding load balance and multi-level forwarding network system

Also Published As

Publication number Publication date
CN103152281A (en) 2013-06-12


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant