CN103152281B - Two-level switch-based load balanced scheduling method - Google Patents

Two-level switch-based load balanced scheduling method

Info

Publication number
CN103152281B
CN103152281B
Authority
CN
China
Prior art keywords
input port
voq
region
queue
cell
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201310069391.8A
Other languages
Chinese (zh)
Other versions
CN103152281A (en)
Inventor
戴艺
肖立权
伍楠
曹继军
高蕾
张鹤颖
童元满
董德尊
王绍刚
沈胜宇
刘路
肖灿文
张磊
王永庆
齐星云
陆平静
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National University of Defense Technology
Original Assignee
National University of Defense Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National University of Defense Technology filed Critical National University of Defense Technology
Priority to CN201310069391.8A priority Critical patent/CN103152281B/en
Publication of CN103152281A publication Critical patent/CN103152281A/en
Application granted granted Critical
Publication of CN103152281B publication Critical patent/CN103152281B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a two-level switch-based load balanced scheduling method, which comprises the following steps: first-level input ports buffer arriving cells in virtual output queues (VOQs) according to their destination ports; a scheduler switches packets to the second-level input ports through the first-level switching network, where k cells from the same flow in a VOQ are called a unit frame; each first-level input port executes minimum-length assignment according to a traffic distribution matrix and, in k consecutive external time slots, transmits a unit frame of a flow through the first-level switching network to that flow's fixed mapping region; the N second-level input ports are divided sequentially into N/k groups, each group of k consecutive second-level input ports forming a region; each region buffers cells in output queues (OQs) according to their destination ports and switches them to the destination output ports through the second-level switching network. The method has the advantages of a simple scheduling process requiring no computation or communication, easy hardware implementation, guaranteed 100 percent throughput, and preserved packet order.

Description

Load-balanced scheduling method based on two-stage switching
Technical field
The present invention relates to the field of packet scheduling in parallel switching fabrics, and in particular to a packet scheduling method that achieves load balancing and packet order preservation in a parallel switching fabric.
Background art
Researchers have studied packet reordering extensively using both active and passive measurements. J. Bennett measured packet reordering at the MAE-East ISP switching center and found that reordering is severe under heavy load and a high degree of parallelism in the network equipment, with more than 90% of connections experiencing out-of-order delivery. J. Bennett attributed this high reordering rate mainly to local parallelism inside the network, including parallel switching equipment and parallel transmission links, and pointed out that packet reordering is not a pathological behavior of the network. Packet reordering occurs in most high-performance parallel switching fabrics; the load-balanced switch, the parallel packet switch and the multi-level switch, among others, all suffer from it. Deeply parallel designs thus bring two major problems to a switching fabric: load balancing and packet reordering. Load balancing is the key to delay and throughput guarantees, but it may cause packet reordering, and out-of-order packets degrade the condition of the Internet: the widely used TCP transport protocol wrongly interprets out-of-order packets as a sign of congestion-induced packet loss, triggering unnecessary retransmissions and TCP timeouts. These retransmissions and timeouts reduce TCP throughput and increase packet delay, so preserving packet order while load-balancing the input traffic is essential.
Methods for preventing packet reordering fall into two classes: 1) limit the amount of reordering by placing a resequencing buffer of finite capacity at the output, used to reorder out-of-order packets; 2) guarantee that packets leave the output port in their arrival order, so that reordering never occurs. Because the buffer capacity is finite, the first method can only handle reordering within a bounded range; if the resequencing buffer is enlarged to O(N^2), where N is the number of ports, the reordering problem can be solved completely, but packet delay then grows quadratically as well. Limiting the amount of reordering therefore cannot solve the reordering problem effectively and scales poorly to the high port densities required of routers. The Full Ordered Frame First (FOFF) algorithm proposed by Stanford University is representative of the first class: it allows a certain amount of reordering inside the router and provides N x N buffer queues at the output to resequence out-of-order packets. It can be proved that, to guarantee in-order departure of cells, FOFF needs a resequencing buffer of at most N^2 cells, and it provides load balancing and achieves 100% throughput. In recent years more researchers have favored the second class of methods, which preserve packet order directly, eliminating the resequencing operation and buffer overhead at the output and thereby improving delay performance. The common trait of these scheduling methods is that they obtain the state of all arriving packets through some message-passing mechanism and perform centralized scheduling based on this global state. For example, the mailbox switch uses a symmetric connection pattern to create a feedback path for announcing packet departure times, and the scheduler schedules packets according to those departure times; this strategy guarantees that the packets of every flow leave the switching system in arrival order, but it cannot provide the load balancing needed to achieve 100% throughput. The alternate-matching scheduling method adopts centralized scheduling: it assumes the traffic characteristics are known and fixed, solves the centralized scheduling problem offline by matrix decomposition, and then performs distributed online scheduling with service guarantees; however, when the traffic becomes unpredictable and dynamic, centralized scheduling is hard to realize at large switch sizes. The parallel-matching scheduling method preserves packet order by passing request and grant tokens between the first-stage and second-stage input line cards, but the frequent transmission of tokens between line cards multiplies the scheduling cycle in a real hardware implementation.
Summary of the invention
The technical problem to be solved by the present invention is as follows: in view of the problems of the prior art, the invention provides a load-balanced scheduling method based on two-stage switching whose scheduling process is simple, requires no computation or communication, is easy to implement in hardware, achieves 100% throughput and guarantees packet order.
To solve the above technical problems, the present invention adopts the following technical solution:
A load-balanced scheduling method based on two-stage switching: each first-stage input port buffers arriving cells in VOQ queues according to their destination ports, and the scheduler switches packets to the second-stage input ports through the first-stage switching network; k cells from the same flow in a VOQ queue are called a unit frame, and the unit frame is the minimum scheduling unit. Each first-stage input port performs minimum-length assignment according to the traffic distribution matrix and, in k consecutive external time slots, sends a unit frame of a flow through the first-stage switching network to that flow's fixed mapping region. The N second-stage input ports are divided sequentially into N/k groups, each group of k consecutive second-stage input ports forming a region. Each region buffers cells in OQ queues according to their destination ports; because the flow-to-region mapping is fixed, the k cells of a unit frame arrive at the head of the OQ queue in sequence and are switched to the destination output port through the second-stage switching network.
As a further improvement of the present invention, a dual-rotation scheme is used to build the flow-to-region mapping:
(1.1) construct, in a round-robin fashion, the mapping from the N/k flow branches of a first-stage input port to the N/k regions of second-stage input ports, so that every region can receive all flows;
(1.2) further adjust, in a round-robin fashion, the association between different input ports and the N/k mapping patterns, so that every region receives all flows evenly distributed over the input ports.
As a further improvement of the present invention, when scheduling, a first-stage input port dispatches cells according to the flow-to-region mapping and serves its N/k flow branches in a round-robin fashion; the operation of the first-stage input port in each external time slot is as follows:
(2.1) if a VOQ queue of flow branch f contains a full frame, the full frame is scheduled with priority: its N/k unit frames are given the highest dispatching priority, the traffic distribution matrix L is looked up, and the first unit frame cut from the full frame is sent to the region R_g with the currently smallest L_{g,j}; steps (2.1.1)-(2.1.k) are then performed; if no full frame exists, go to step (2.2);
(2.1.1) if the internal link from input port i to second-stage input port S_{g,1} is idle, send the head-of-line cell of VOQ_{i,j} to second-stage input port S_{g,1} of region R_g; otherwise set f = (f+1) mod N/k and go to step (2.1);
(2.1.2) send the head-of-line cell of VOQ_{i,j} to second-stage input port S_{g,2} of region R_g;
(2.1.3) send the head-of-line cell of VOQ_{i,j} to second-stage input port S_{g,3} of region R_g;
(2.1.k) and so on, until the head-of-line cell of VOQ_{i,j} is sent to second-stage input port S_{g,k} of region R_g; then set g = (g+1) mod N/k and go to step (2.1);
(2.2) if a VOQ queue VOQ_{i,j} (kf <= j < kf+k) of flow branch f contains a unit frame with the highest dispatching priority, look up the traffic distribution matrix L and send this unit frame to the region R_g with the currently smallest L_{g,j}; perform steps (2.1.1)-(2.1.k); otherwise go to step (2.3);
(2.3) if the VOQ queues of flow branch f contain one or more unit frames, look up the traffic distribution matrix L, select the unit frame of the VOQ queue with the smallest equalization coefficient according to the lookup result, and send it to the fixed mapping region R_g of flow branch f, then perform steps (2.1.1)-(2.1.k); if flow branch f contains only one unit frame, the lookup of the traffic distribution matrix can be skipped and the unique unit frame is sent directly to its fixed mapping region R_g; otherwise flow branch f contains no unit frame, so set g = (g+1) mod N/k and go to step (2.1).
Compared with the prior art, the present invention has the following advantages:
1. The scheduling method of the present invention can be executed independently and in a distributed manner at each input port, dispatching cells according to local VOQ queue information only; it requires no communication overhead, runs with O(1) time complexity, achieves 100% throughput and guarantees packet order.
2. The present invention achieves packet order preservation and load balancing without any communication overhead between the schedulers of the input ports. By constructing a fixed flow-to-region mapping, packet reordering is avoided and the resequencing overhead is eliminated. To avoid traffic concentrating on a few regions, a dual-rotation scheme is used to construct the flow-to-region mappings of different input ports, and each input port maintains a traffic distribution matrix reflecting a globally consistent view and dispatches unit frames according to it. It can be proved that, for any output port j, the OQ_j queues within the same region have identical length and the OQ_j queues of different regions differ in length by at most 1, so that a load balancing degree of 100% is achieved.
3. The present invention only needs a suitable choice of the aggregation granularity k to obtain the theoretically lowest delay. The delay performance of the scheduling method under different aggregation granularities k has been verified by simulation and compared with mainstream load-balanced scheduling algorithms. The simulation results show that, with aggregation granularity k=2, the present invention has the best delay performance among existing scheduling algorithms that guarantee packet order, and under bursty traffic models it performs comparably to algorithms that do not preserve packet order.
4. The present invention schedules packets according to the fixed flow-to-region mapping; the scheduling process is simple, requires no computation or communication, and is easy to implement in hardware.
Brief description of the drawings
Fig. 1 is an example of a two-stage switching architecture to which the scheduling method of the present invention is applicable.
Fig. 2 is a schematic diagram of the flow-to-region mapping result obtained with the dual-rotation mapping scheme constructed by the present invention in a concrete application example, for port count N=32 and aggregation granularity k=8.
Fig. 3 is a schematic flow chart of the load-balanced scheduling method executed by the present invention in a concrete application example.
Fig. 4 is a schematic diagram of the distribution of cells in the second-stage input port OQ buffers under bursty traffic after the minimum-length assignment of the present invention is adopted, in a concrete application example.
Embodiment
The present invention is described in further detail below with reference to the accompanying drawings and specific embodiments.
In the load-balanced scheduling method based on two-stage switching of the present invention, each first-stage input port first buffers arriving cells in VOQ queues according to their destination ports, and the scheduler switches packets to the second-stage input ports through the first-stage switching network (a mesh network, as shown in the figure). k cells from the same flow in a VOQ queue are called a unit frame; the unit frame is the minimum scheduling unit of the present invention. Each first-stage input port independently executes the scheduling method of the present invention: it performs minimum-length assignment according to the traffic distribution matrix and, in k consecutive external time slots, sends a unit frame of a flow through the first-stage switching network (mesh network) to that flow's fixed mapping region. The N second-stage input ports are divided sequentially into N/k groups, each group of k consecutive second-stage input ports forming a region. Each region buffers cells in OQ queues according to their destination ports; because the flow-to-region mapping is fixed, the k cells of a unit frame arrive at the head of the OQ queue in sequence and are switched to the destination output port through the second-stage switching network (mesh network, as shown in the figure). In this process, each input port independently executes the flow-mapping-based cell scheduling algorithm: a unit frame of a flow is sent through the first-stage mesh network to that flow's fixed mapping region; each region buffers cells in OQ queues according to their destination ports, and because the flow-to-region mapping is fixed, the k cells of a unit frame arrive at the head of the OQ queue in sequence and wait for the second-stage mesh network to become idle to reach the output port in turn. If the first-stage buffers are implemented on the first-stage line cards and the second-stage buffers on the second-stage line cards, the above two-stage switching fabric becomes a typical load-balanced switching fabric, so the present invention is particularly suitable as a packet scheduling method for load-balanced routers. For ease of presentation, the invention is described in terms of mesh networks; the mesh network should be regarded as one technical means of switching packets to the second-stage input ports and the destination output ports, and other switching technologies may be used instead.
As can be seen from the above, the core of the present invention is to group every k consecutive input ports into a region, while the inputs use a flow-mapping-based load-sharing algorithm that dispatches the k cells of a flow to its fixed mapping region in a fine-grained manner. It can be proved theoretically that this scheduling strategy achieves 100% throughput and guarantees packet order. Here k is the aggregation granularity, which determines the number of cells of the same flow scheduled each time. To avoid traffic concentrating on a few regions, the present invention further uses a dual-rotation scheme to construct the flow-to-region mappings of different input ports. To achieve an even distribution of the load over the second-stage input ports, the present invention further maintains at each input port a traffic distribution matrix reflecting a globally consistent view and dispatches unit frames according to it, thereby achieving a load balancing degree of 100%.
Fig. 1 is an example of a two-stage switching architecture to which the scheduling method of the present invention is applicable. In the figure, VOQ_{i,j} denotes VOQ j of first-stage input port i; the output queue for destination j at second-stage input port l is its OQ_j; F(i,j) denotes the flow from first-stage input port i to output port j; k is the aggregation granularity, i.e. the number of cells of the same flow scheduled consecutively (k is a factor of the port count N). Every k cells of queue VOQ_{i,j} form a unit frame, the unit frames of the N/k different VOQ queues of first-stage input port i (N cells in total) form an aggregate frame, and N cells of queue VOQ_{i,j} form a full frame. The second-stage input ports 1, 2, ..., N are divided sequentially into N/k groups, each group containing k consecutive second-stage input ports; the k second-stage input ports of group g form a region, denoted R_g, and S_{r,z} denotes the z-th second-stage input port of region r. The N flows of each input port are divided into N/k groups corresponding to the N/k regions, each group containing k flows; the k flows of group f of input port i form a flow branch (referred to below as flow branch f of input port i).
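As a reading aid for the terminology above (not part of the patent), the following short Python snippet lists, for the stated N and k, which second-stage input ports (0-indexed here for simplicity) make up each region and which destination ports j belong to each flow branch:

# Grouping implied by the definitions above: second-stage input ports are
# grouped k at a time into regions, and the N destinations of a first-stage
# input port are grouped k at a time into flow branches.
N, k = 32, 8                       # example sizes; any k dividing N works
regions = [list(range(g * k, g * k + k)) for g in range(N // k)]
flow_branches = [list(range(f * k, f * k + k)) for f in range(N // k)]
print(regions[0])        # second-stage input ports of region R_0
print(flow_branches[1])  # destination ports j served by flow branch 1
# A unit frame is k cells of one VOQ_{i,j}; an aggregate frame is N/k unit
# frames (one per flow branch, N cells); a full frame is N cells of one VOQ.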
To reduce the memory bandwidth requirement of the packet buffers, the mesh networks normally operate at rate R/N (internal link speedup of 1). This leads to the following definitions:
Definition 1. The time taken by a link of rate R to send or receive one cell is an external time slot.
Definition 2. The time taken by a link of rate R/N to send or receive one unit frame is a time slot; a time slot is N times an external time slot.
In general, in each time slot, i.e. every N external time slots, the UFFS-k algorithm can aggregate N/k unit frames from the N/k flow branches of an input port into an aggregate frame and dispatch it to the second-stage input ports. The VOQ equalization coefficient reflects how evenly the traffic is distributed over the OQ queues of the second-stage input ports; the present invention dispatches unit frames according to the OQ queue lengths of each region, thereby achieving load balance across regions. The working principle of the equalization coefficient is elaborated next. By induction over time slots it can be proved theoretically that the above flow-mapping-based scheduling method guarantees that, for any region R_g and any output port j (0 <= j < N), the k OQ queues for destination j at the k second-stage input ports l in R_g have identical length (ignoring the transmission delay of a unit frame). This leads to the following definitions:
Definition 3. Since for any region R_g the OQ queue lengths for destination j are identical for all l in R_g, the equalization coefficient of queue VOQ_{i,j} is defined as the length of output queue j in its mapping region R_g, denoted L_{g,j}.
Definition 4. If queue VOQ_{i,j} contains a unit frame and its equalization coefficient is the minimum, the unit frame of VOQ_{i,j} is sent to region R_g in k consecutive external time slots; this is minimum-length assignment.
It can be proved theoretically that the minimum-length assignment strategy guarantees that, after any time slot T ends, for any two regions R_{g1} and R_{g2} the OQ queue lengths L_{g1,j} and L_{g2,j} differ by at most 1, so that 100% throughput and a load balancing degree of 100% can be achieved. To realize minimum-length assignment, the scheduler of each first-stage input port must maintain a traffic distribution matrix L = [L_{g,j}] reflecting a globally consistent view. To keep the traffic distribution matrix consistent across the views of all input ports, write operations on the matrix by different ports must be mutually exclusive. The present invention uses a lock mechanism to realize mutually exclusive writes to L_{g,j}: if g and j satisfy the assignment condition and L_{g,j} is in the unlocked state, first-stage input port i locks L_{g,j}, sends the unit frame of queue VOQ_{i,j} to its mapping region R_g, increments L_{g,j} by 1 and then unlocks it. A VOQ_{i,j} whose equalization coefficient L_{g,j} is in the locked state is simply skipped within its flow branch. It is not hard to see that only input ports with identical flow-to-region mappings that schedule VOQ_{i,j} queues with the same destination port at the same time can cause a write conflict on the same L_{g,j}; mutually exclusive writes to L_{g,j} prevent these input ports from sending multiple unit frames to output queue j of the same region simultaneously, which would unbalance the traffic distribution.
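The following Python sketch models the traffic distribution matrix and its mutually exclusive writes as described above. It is a software illustration under assumed interfaces (one lock object per entry; the method names try_lock and finish_assignment are illustrative); the patent itself targets a hardware lock mechanism.

import threading
from collections import defaultdict

class TrafficDistributionMatrix:
    """Globally shared matrix L = [L_{g,j}] with per-entry mutual exclusion."""

    def __init__(self, num_regions: int, num_ports: int):
        self.length = defaultdict(int)                       # L[g, j]
        self.locks = {(g, j): threading.Lock()
                      for g in range(num_regions) for j in range(num_ports)}

    def try_lock(self, g: int, j: int) -> bool:
        # Non-blocking lock of entry (g, j); if it is already locked by
        # another port, the corresponding VOQ is skipped within its branch.
        return self.locks[(g, j)].acquire(blocking=False)

    def finish_assignment(self, g: int, j: int) -> None:
        # Called once the unit frame has been sent to region R_g:
        # increment L[g, j] by 1, then unlock the entry.
        self.length[(g, j)] += 1
        self.locks[(g, j)].release()

In the scheduling steps below, a port would call try_lock(g, j) before dispatching the unit frame of VOQ_{i,j} and finish_assignment(g, j) once the frame has been sent.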
The flow-mapping-based packet scheduling algorithm schedules cells according to local VOQ queue information and can therefore be executed independently and in a distributed manner at each first-stage input port. Step 2 below describes the procedure each first-stage input port follows to execute the load-balanced scheduling method of the present invention.
In the present invention, the first step uses a dual-rotation scheme to build the flow-to-region mapping. The flow-to-region mapping method determines the utilization of the second-stage storage resources and of the two-stage mesh networks; to avoid throughput loss, the mapping algorithm should distribute the input load evenly over the regions. The present invention proposes a dual-rotation mapping algorithm that takes both load balancing and packet order preservation into account. Its design stems from an analysis of the root cause of packet reordering: when the OQ queue lengths of the second-stage buffers holding cells of the same flow differ, the cells become reordered, and the number of reordered cells grows with the difference in OQ queue lengths. If the k cells of a flow are dispatched in a fine-grained manner to a predefined mapping region (k consecutive second-stage input ports), then, because the flow-to-region mapping is fixed, the OQ queue lengths of the mapping region holding the cells of any given flow are identical, and the cells are delivered in order.
1.1 Since any given region can receive only the fixed k flows of a given input port, to distribute the load evenly over the regions the mapping from the N/k flow branches of a first-stage input port to the N/k regions of second-stage input ports is constructed in a round-robin fashion. Step 1.1 corresponds to the first for-loop of the dual-rotation mapping algorithm pseudo-code and guarantees that every region can receive all flows;
1.2 The association between different input ports and the N/k mapping patterns is further adjusted in a round-robin fashion. Step 1.2 corresponds to the second for-loop of the dual-rotation mapping algorithm pseudo-code and guarantees that every region receives all flows evenly distributed over the input ports.
The dual-rotation mapping algorithm establishes the flow-to-region mapping through simple modulo operations and is therefore easy to implement in hardware; its pseudo-code is described below:
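The patent's pseudo-code listing does not appear in this text. As an illustration only, the following Python sketch builds one possible dual-rotation mapping from modulo operations, assuming flow branch f of input port i maps to region (f + i) mod (N/k); the exact formula of the patent's pseudo-code may differ, but this choice reproduces the pattern grouping described for Fig. 2 (ports 0, 4, ..., 28 share one pattern, and so on).

def build_flow_to_region_mapping(N: int, k: int) -> dict:
    """Return a dict mapping (input_port i, flow_branch f) -> region index g."""
    assert N % k == 0, "k must be a factor of the port count N"
    num_regions = N // k
    mapping = {}
    for i in range(N):                      # second rotation: over input ports
        for f in range(num_regions):        # first rotation: over flow branches
            mapping[(i, f)] = (f + i) % num_regions
    return mapping

if __name__ == "__main__":
    # Reproduce the shape of the Fig. 2 example: N = 32, k = 8 -> 4 regions.
    m = build_flow_to_region_mapping(32, 8)
    for i in (0, 1, 2, 3, 4):
        print(i, [m[(i, f)] for f in range(4)])
    # Ports 0, 4, 8, ... share one pattern, ports 1, 5, 9, ... the next, etc.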
In the present invention, the second step is that the input port scheduler dispatches cells according to the flow-to-region mapping, serving the N/k flow branches in round-robin order with the unit frame as the minimum scheduling unit: in k consecutive external time slots it sends one unit frame from a fixed VOQ queue of the current flow branch (a simplified sketch of this per-slot decision is given after the numbered steps below). The operation of the scheduler of first-stage input port i in each external time slot when executing the scheduling method of the present invention is as follows:
2.1 If a VOQ queue VOQ_{i,j} (kf <= j < kf+k) of flow branch f (0 <= f < N/k, f initialized to 0) contains a full frame, the full frame is scheduled with priority: its N/k unit frames are given the highest dispatching priority, the traffic distribution matrix L is looked up, and the first unit frame cut from the full frame is sent to the region R_g with the currently smallest L_{g,j}; steps 2.1.1-2.1.k are then performed; if no full frame exists, go to step 2.2.
2.1.1 If the internal link from input port i to second-stage input port S_{g,1} is idle, send the head-of-line cell of VOQ_{i,j} to second-stage input port S_{g,1} of region R_g; otherwise set f = (f+1) mod N/k and go to step 2.1;
2.1.2 Send the head-of-line cell of VOQ_{i,j} to second-stage input port S_{g,2} of region R_g;
2.1.3 Send the head-of-line cell of VOQ_{i,j} to second-stage input port S_{g,3} of region R_g;
......
2.1.k And so on, until the head-of-line cell of VOQ_{i,j} is sent to second-stage input port S_{g,k} of region R_g; then set g = (g+1) mod N/k and go to step 2.1.
2.2 If a VOQ queue VOQ_{i,j} (kf <= j < kf+k) of flow branch f contains a unit frame with the highest dispatching priority, look up the traffic distribution matrix L and send this unit frame to the region R_g with the currently smallest L_{g,j}; perform steps 2.1.1-2.1.k; otherwise go to step 2.3.
2.3 If the VOQ queues VOQ_{i,j} (kf <= j < kf+k) of flow branch f contain one or more unit frames, look up the traffic distribution matrix L, select the unit frame of the VOQ queue with the smallest equalization coefficient according to the lookup result, and send it to the fixed mapping region R_g of flow branch f, then perform steps 2.1.1-2.1.k; if flow branch f contains only one unit frame, the lookup of the traffic distribution matrix can be skipped and the unique unit frame is sent directly to its fixed mapping region R_g; otherwise flow branch f contains no unit frame, so set g = (g+1) mod N/k and go to step 2.1.
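The following Python sketch condenses steps 2.1-2.3 into a single per-external-time-slot decision for one input port. It is a simplified model under assumed data structures (VOQs as lists of cells keyed by (i, j), the matrix L as a dict keyed by (g, j), region_of(i, f) giving the fixed mapping region of flow branch f, e.g. built as in the mapping sketch above); the lock mechanism, the internal-link idle check of step 2.1.1 and the tracking of the highest dispatching priority in step 2.2 are omitted.

def pick_unit_frame(i, f, voq, L, N, k, region_of):
    """Decide which unit frame first-stage port i sends next from flow branch f.

    Returns (j, g): the destination j whose head unit frame is sent, and the
    target region g.  Returns None if branch f has no complete unit frame.
    """
    num_regions = N // k
    branch = range(k * f, k * f + k)          # destinations served by branch f
    g_fixed = region_of(i, f)                 # fixed mapping region of branch f

    # Step 2.1: a full frame (N buffered cells) is scheduled with priority and
    # its first unit frame goes to the region with the smallest L[g, j].
    full = [j for j in branch if len(voq[(i, j)]) >= N]
    if full:
        j = full[0]
        g = min(range(num_regions), key=lambda r: L[(r, j)])
        return j, g

    # Steps 2.2/2.3: ordinary unit frames (at least k buffered cells).
    framed = [j for j in branch if len(voq[(i, j)]) >= k]
    if not framed:
        return None                           # caller advances to next branch
    if len(framed) == 1:
        return framed[0], g_fixed             # skip the matrix lookup
    # Minimum-length assignment: pick the VOQ with the smallest equalization
    # coefficient L[g_fixed, j]; the frame goes to the branch's fixed region.
    j = min(framed, key=lambda d: L[(g_fixed, d)])
    return j, g_fixed

A caller would invoke this once per flow-branch visit; on None it moves on to the next flow branch, mirroring the f = (f+1) mod N/k and g = (g+1) mod N/k updates in the steps above.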
The present invention uses the load balancing degree to measure how evenly the load is distributed over the second-stage OQ buffers. The load balancing degree is defined as follows:
Definition 5. Load balancing degree: suppose that during the time interval [t_r, t_v] switching module l has forwarded S_l[t_r, t_v] cells. The load balancing degree over this interval is the ratio of the minimum to the maximum number of cells forwarded by the different switching modules, that is:
E[t_r, t_v] = min_{l=0,...,K-1} S_l[t_r, t_v] / max_{l=0,...,K-1} S_l[t_r, t_v]   (K is the number of second-stage switching modules)
Obviously the load balancing degree E[t_r, t_v] <= 1. When E[t_r, t_v] approaches 1, the numbers of cells processed by the switching modules are essentially identical and the load is distributed evenly over them; the smaller E[t_r, t_v], the poorer the load balance across the switching modules. It can be proved that the scheduling method of the present invention achieves a load balancing degree of 100%.
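As a small numerical illustration of Definition 5 (with made-up forwarded-cell counts), the load balancing degree is simply the ratio of the smallest to the largest per-module count:

def load_balancing_degree(forwarded_cells: list) -> float:
    """forwarded_cells[l] = S_l[t_r, t_v] for second-stage switching module l."""
    return min(forwarded_cells) / max(forwarded_cells)

print(load_balancing_degree([1000, 1000, 999, 1000]))   # 0.999: well balanced
print(load_balancing_degree([1500, 500, 1000, 1000]))   # 0.333: poorly balanced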
As shown in Fig. 2, in a concrete application example, the flow-to-region mapping method designed in the first step of the present invention yields, for port count N=32 and aggregation granularity k=8, the flow-to-region mapping result obtained with the dual-rotation mapping method (VOQ_{i,j} represents the flow from input port i to output port j; the arrow denotes the mapping relation). The relatively large aggregation granularity k=8 was chosen to make the flow-to-region mapping result easy to inspect. The flow-to-region mapping method builds the fixed flow-to-region mapping; it determines the utilization of the second-stage input storage resources and of the two-stage switching network (e.g. the mesh networks), and to avoid throughput loss the mapping algorithm should distribute the input load evenly over the regions.
The present invention uses the dual-rotation mapping scheme to adjust the mapping of flow branches to regions. As shown in Fig. 2, each input port i (0 <= i <= 31) contains N/k = 4 flow branches, and the second-stage input ports are divided into N/k = 4 regions {R_0, R_1, R_2, R_3}. To maintain the utilization of the first-stage switching resources, the flow branches of the same input port should be mapped to different regions, which yields 4 mapping patterns. To guarantee that every region can receive all flows, the second-layer rotation builds the mapping of the flow branches of different input ports to regions in a round-robin fashion, which guarantees the feasibility of load balancing in theory. According to the second-layer rotation, input ports i = {0, 4, ..., 28} use the first mapping pattern, input ports i = {1, 5, ..., 29} the second, input ports i = {2, 6, ..., 30} the third, and input ports i = {3, 7, ..., 31} the fourth. This dual-rotation mapping method takes both load balancing and packet order preservation into account; the present invention dispatches unit frames according to the traffic distribution matrix of the second-stage input ports, which further guarantees that every flow is balanced across the regions.
Fig. 3 is a schematic flow chart of the load-balanced scheduling method executed by the present invention in a concrete application example, corresponding to the second step of the present invention described above.
As shown in Fig. 4, under bursty traffic and after the minimum-length assignment of the present invention is adopted, the distribution of cells in the second-stage input port OQ buffers shows that the present invention achieves an even distribution of the load while guaranteeing packet order. Because the mapping of flow branches to regions is fixed, some regions could overflow their buffers under excessive load while other regions stay relatively idle. For example, the VOQ queue of one flow branch of an input may hold N cells and keep growing while the other flow branches have no unit frame to schedule. The internal links from such a heavily loaded flow branch to its mapping region would then become a performance bottleneck, and at the same time the heavily loaded flow branch wastes internal bandwidth because the other idle links of the input line card cannot be used. To overcome this problem, in the case of bursty traffic the present invention allows a heavily loaded flow branch to preempt link resources: by giving the full frame the highest priority, a bursty flow is distributed evenly over the regions. To avoid the cell reordering that scheduling a full frame could cause, and to follow the load balancing principle, the present invention dispatches the N/k unit frames read from VOQ_{i,j}, i.e. the full frame, one after another to the region R_g with the currently smallest L_{g,j}. The assignment order obtained with this strategy is, for example: the first and second unit frames read from VOQ_{i,j} are dispatched in turn to regions 2 and 3, the third and fourth unit frames in turn to region 1 or region 4, and these four unit frames then arrive at output port j in the order in which they were read. Because the lengths of output queue j in different regions differ by at most 1, sending the N/k unit frames of the full frame one after another to the region with the currently smallest L_{g,j} cannot cause cell reordering. In essence, both unit-frame scheduling and full-frame scheduling use the minimum-length assignment strategy; the difference is that a unit frame can only be assigned to its fixed mapping region, whereas a full frame is split into N/k unit frames distributed evenly over the N/k regions.
The above are only preferred embodiments of the present invention, and the scope of protection of the present invention is not limited to the above embodiments; all technical solutions that fall under the idea of the present invention belong to the scope of protection of the present invention. It should be pointed out that, for those skilled in the art, several improvements and modifications made without departing from the principles of the present invention should also be regarded as falling within the scope of protection of the present invention.

Claims (2)

1. A load-balanced scheduling method based on two-stage switching, characterized in that:
each first-stage input port buffers arriving cells in VOQ queues according to their destination ports, and the scheduler switches packets to the second-stage input ports through the first-stage switching network; k cells from the same flow in a VOQ queue are called a unit frame, and the unit frame is the minimum scheduling unit; each first-stage input port performs minimum-length assignment according to the traffic distribution matrix and, in k consecutive external time slots, sends a unit frame of a flow through the first-stage switching network to that flow's fixed mapping region;
the N second-stage input ports are divided sequentially into N/k groups, each group of k consecutive second-stage input ports forming a region; each region buffers cells in OQ queues according to their destination ports, and because the flow-to-region mapping is fixed, the k cells of a unit frame arrive at the head of the OQ queue in sequence and are switched to the destination output port through the second-stage switching network;
when scheduling, a first-stage input port dispatches cells according to the flow-to-region mapping and serves its N/k flow branches in a round-robin fashion; the operation of the first-stage input port in each external time slot is as follows:
(2.1) if a VOQ queue VOQ_{i,j} of flow branch f contains a full frame, the full frame is scheduled with priority, where 0 <= f < N/k, f is initialized to 0, and kf <= j < kf+k; the N/k unit frames of this full frame are given the highest dispatching priority, the traffic distribution matrix L is looked up, and the first unit frame cut from the full frame is sent to the region R_g with the currently smallest L_{g,j}; steps (2.1.1)-(2.1.k) are performed; if no full frame exists, go to step (2.2); the equalization coefficient of queue VOQ_{i,j} equals the length of output queue j in its mapping region R_g and is denoted L_{g,j};
(2.1.1) if the internal link from first-stage input port i to second-stage input port S_{g,1} is idle, send the head-of-line cell of VOQ_{i,j} to second-stage input port S_{g,1} of region R_g; otherwise set f = (f+1) mod N/k and go to step (2.1);
(2.1.2) send the head-of-line cell of VOQ_{i,j} to second-stage input port S_{g,2} of region R_g;
(2.1.3) send the head-of-line cell of VOQ_{i,j} to second-stage input port S_{g,3} of region R_g;
(2.1.k) and so on, until the head-of-line cell of VOQ_{i,j} is sent to second-stage input port S_{g,k} of region R_g; then set g = (g+1) mod N/k and go to step (2.1);
(2.2) if a VOQ queue VOQ_{i,j} of flow branch f, where kf <= j < kf+k, contains a unit frame with the highest dispatching priority, look up the traffic distribution matrix L and send this unit frame to the region R_g with the currently smallest L_{g,j}; perform steps (2.1.1)-(2.1.k); otherwise go to step (2.3);
(2.3) if the VOQ queues of flow branch f contain one or more unit frames, look up the traffic distribution matrix L, select the unit frame of the VOQ queue with the smallest equalization coefficient according to the lookup result, and send it to the fixed mapping region R_g of flow branch f, then perform steps (2.1.1)-(2.1.k); if flow branch f contains only one unit frame, the lookup of the traffic distribution matrix can be skipped and the unique unit frame is sent directly to its fixed mapping region R_g; otherwise flow branch f contains no unit frame, so set g = (g+1) mod N/k and go to step (2.1);
wherein VOQ_{i,j} denotes VOQ j of first-stage input port i; the output queue for destination j at second-stage input port l is its OQ_j; F(i,j) denotes the flow from first-stage input port i to output port j; k is the aggregation granularity, i.e. the number of cells of the same flow scheduled consecutively, and k is a factor of the port count N; every k cells of queue VOQ_{i,j} form a unit frame, and the unit frames of the N/k different VOQ queues of first-stage input port i form an aggregate frame, amounting to N cells in total; N cells of queue VOQ_{i,j} form a full frame; the second-stage input ports 1, 2, ..., N are divided sequentially into N/k groups, each group containing k consecutive second-stage input ports; the k second-stage input ports of group g form a region, denoted R_g, and S_{r,z} denotes the z-th second-stage input port of region r; the N flows of each first-stage input port are divided into N/k groups corresponding to the N/k regions, each group containing k flows, and the k flows of group f of first-stage input port i form a flow branch;
the external time slot is the time taken by a link of rate R to send or receive one cell.
2. The load-balanced scheduling method based on two-stage switching according to claim 1, characterized in that a dual-rotation scheme is used to build the flow-to-region mapping:
(1.1) construct, in a round-robin fashion, the mapping from the N/k flow branches of a first-stage input port to the N/k regions of second-stage input ports, so that every region can receive all flows;
(1.2) further adjust, in a round-robin fashion, the association between different first-stage input ports and the N/k mapping patterns, so that every region receives all flows evenly distributed over the input ports.
CN201310069391.8A 2013-03-05 2013-03-05 Two-level switch-based load balanced scheduling method Active CN103152281B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310069391.8A CN103152281B (en) 2013-03-05 2013-03-05 Two-level switch-based load balanced scheduling method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310069391.8A CN103152281B (en) 2013-03-05 2013-03-05 Two-level switch-based load balanced scheduling method

Publications (2)

Publication Number Publication Date
CN103152281A CN103152281A (en) 2013-06-12
CN103152281B true CN103152281B (en) 2014-09-17

Family

ID=48550152

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310069391.8A Active CN103152281B (en) 2013-03-05 2013-03-05 Two-level switch-based load balanced scheduling method

Country Status (1)

Country Link
CN (1) CN103152281B (en)

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103825845A (en) * 2014-03-17 2014-05-28 北京航空航天大学 Matrix decomposition-based packet scheduling algorithm of reconfigurable VOQ (virtual output queuing) structure switch
CN108243113B (en) * 2016-12-26 2020-06-16 深圳市中兴微电子技术有限公司 Random load balancing method and device
CN108632143A (en) * 2017-03-16 2018-10-09 华为数字技术(苏州)有限公司 A kind of method and apparatus of transmission data
CN109391556B (en) * 2017-08-10 2022-02-18 深圳市中兴微电子技术有限公司 Message scheduling method, device and storage medium
CN107770093B (en) * 2017-09-29 2020-10-23 内蒙古农业大学 Working method of preposed continuous feedback type two-stage exchange structure
CN108259382B (en) * 2017-12-06 2021-10-15 中国航空工业集团公司西安航空计算技术研究所 3x256 priority scheduling circuit
CN108540398A (en) * 2018-03-29 2018-09-14 江汉大学 Feedback-type load balancing alternate buffer dispatching algorithm
CN112653623B (en) * 2020-12-21 2023-03-14 国家电网有限公司信息通信分公司 Relay protection service-oriented route distribution method and device
CN114697275B (en) * 2020-12-30 2023-05-12 深圳云天励飞技术股份有限公司 Data processing method and device
CN113179226B (en) * 2021-03-31 2022-03-29 新华三信息安全技术有限公司 Queue scheduling method and device
CN114448899A (en) * 2022-01-20 2022-05-06 天津大学 Method for balancing network load of data center
CN114500581B (en) * 2022-01-24 2024-01-19 芯河半导体科技(无锡)有限公司 Method for realizing equal-delay distributed cache Ethernet MAC architecture
CN114415969B (en) * 2022-02-09 2023-09-29 杭州云合智网技术有限公司 Method for dynamically storing messages of exchange chip


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7362733B2 (en) * 2001-10-31 2008-04-22 Samsung Electronics Co., Ltd. Transmitting/receiving apparatus and method for packet retransmission in a mobile communication system
CN101404616A (en) * 2008-11-04 2009-04-08 北京大学深圳研究生院 Load balance grouping and switching structure and its construction method
WO2011050541A1 (en) * 2009-10-31 2011-05-05 北京大学深圳研究生院 Load balancing packet switching structure with the minimum buffer complexity and construction method thereof
CN102123087A (en) * 2011-02-18 2011-07-13 天津博宇铭基信息科技有限公司 Method for quickly calibrating multi-level forwarding load balance and multi-level forwarding network system

Also Published As

Publication number Publication date
CN103152281A (en) 2013-06-12


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant