CN109194583A - Network congested link diagnosis method and system based on deep reinforcement learning - Google Patents

Network congested link diagnosis method and system based on deep reinforcement learning

Info

Publication number
CN109194583A
Authority
CN
China
Prior art keywords
network
state
congestion
link
diagnosis
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810890267.0A
Other languages
Chinese (zh)
Other versions
CN109194583B (en)
Inventor
潘胜利
曾德泽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China University of Geosciences
Original Assignee
China University of Geosciences
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China University of Geosciences
Priority to CN201810890267.0A
Publication of CN109194583A
Application granted
Publication of CN109194583B
Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/10 Flow control; Congestion control
    • H04L47/12 Avoiding congestion; Recovering from congestion
    • H04L47/127 Avoiding congestion; Recovering from congestion by using congestion prediction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14 Network analysis or design
    • H04L41/145 Network analysis or design involving simulating, designing, planning or modelling of a network

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a network congested link diagnosis method and system based on deep learning. Reinforcement learning is combined with deep learning through DQN, exploiting its advantage in handling high-dimensional states, and congested links are diagnosed with a Q-Learning scheme that constructs labels from a state-action-reward strategy. In the reinforcement learning part of DQN, a state is defined as a two-tuple consisting of a link and the set of congestion states of all paths passing through that link; an action is a guess, based on the path congestion state set, of whether the link is congested; the reward is positive for a correct guess and negative for a wrong one. The deep learning part of DQN uses a deep neural network such as a deep convolutional neural network. In this way, through continual iteration, DQN autonomously learns the association between congested paths and congested links in the network and thereby diagnoses congested links accurately; the diagnostic performance of the invention is excellent.

Description

Network congested link diagnosis method and system based on deep reinforcement learning
Technical field
The present invention relates to the field of network congested link diagnosis, and more specifically to a network congested link diagnosis method and system based on deep reinforcement learning.
Background
External measurement techniques based on router cooperation initiate the measurement at the network edge and obtain the parameters of interest from the feedback of internal nodes on probe data. Common tools include ping, which diagnoses network connectivity; traceroute, which obtains the network topology; and pathchar, which measures performance parameters such as link bandwidth and delay. When internal nodes do not cooperate, for example for network security reasons, such methods fail. In addition, most of these methods use ICMP (Internet Control Message Protocol) messages as probe data, and ICMP messages have low priority in real networks, so the measured performance parameters may not accurately reflect the actual state of the network. End-to-end measurement obtains the end-to-end performance parameters of the network by sending and receiving data between nodes at the network edge. It needs only the basic store-and-forward function of routers and therefore depends minimally on the network itself. Network tomography (NT) is a method that infers internal network parameters, such as link performance parameters and topology, from end-to-end measurement data. Because it can obtain internal performance parameters without the cooperation of internal nodes, it fits well with the non-cooperative, heterogeneous, edge-controlled character of the current Internet. The present invention studies the tomography of network link performance parameters and solves congested link diagnosis by deep learning, obtaining the operating state of links in the network more accurately and quickly.
Summary of the invention
To solve the above technical problems, the present application proposes a network congested link diagnosis method and system based on deep reinforcement learning: after the link states are measured end to end with the network tomography method, congested links are located by a deep neural network combined with reinforcement learning.
The network congested link diagnosis method based on deep learning that the present invention adopts to solve its technical problem comprises the following steps:
S1. Collect M sets of physical-link congestion state data from the network to be diagnosed, obtaining M link-congestion-state network directed acyclic graphs that serve as the sample pool; M is an integer greater than 1;
S2. Perform decision-state modeling on each link-congestion-state network directed acyclic graph, and together generate the state set S;
S3. Using the neural network training method DQN, take the state set S and the corresponding decision set A as the training data set and train the neural network; in each group of training data, the state of one directed acyclic graph is the input and the corresponding decision is the output;
S4. Using the same method as in step S2, perform decision-state modeling on the network directed acyclic graph whose congested links are to be diagnosed, generating the initial task state $s_0$; feed $s_0$ into the neural network trained in step S3 to predict the link congestion states.
Further, in the network congested link diagnosis method based on deep learning of the present invention, the objective function of DQN in step S3 is constructed as follows:
A1. Use a deep neural network with parameter ω as the Q-value network; by updating ω, make the Q function approximate the optimal Q value: $Q(s, a, \omega) \approx Q^{\pi}(s, a)$, where s denotes the state and a denotes the decision;
A2. Use the mean squared error to define the objective function on the Q value:
$L(\omega) = E\left[\left(r + \gamma \max_{a'} Q(s', a', \omega) - Q(s, a, \omega)\right)^{2}\right]$;
where s′ denotes the next state, a′ denotes the next decision, E denotes the expectation, r denotes the reward, and γ denotes the discount factor;
A3. Compute the gradient of the objective function with respect to the parameter ω;
A4. Use SGD to optimize the objective end to end.
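For reference, the objective in A2 and the gradient named in A3 can be written out as follows. The gradient expression itself is not reproduced in this text; the form below is the usual semi-gradient used with DQN, in which the target term $r + \gamma \max_{a'} Q(s', a', \omega)$ is treated as a constant when differentiating. It is offered as an assumption, not as the patent's own formula.

```latex
L(\omega) = \mathbb{E}\left[\left(r + \gamma \max_{a'} Q(s', a', \omega) - Q(s, a, \omega)\right)^{2}\right]

\frac{\partial L(\omega)}{\partial \omega}
  = -2\,\mathbb{E}\left[\left(r + \gamma \max_{a'} Q(s', a', \omega) - Q(s, a, \omega)\right)
    \frac{\partial Q(s, a, \omega)}{\partial \omega}\right]
```

SGD in A4 then moves ω against this gradient, so a positive temporal-difference error increases $Q(s, a, \omega)$ and a negative one decreases it.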
Further, in the network congested link diagnosis method based on deep learning of the present invention, the main steps of DQN training include:
B1. Initialize the experience pool D with capacity N, used to store training samples;
B2. Initialize the neural network of the action-value function Q, with random weight parameters θ;
B3. Initialize the neural network of the target action-value function, with the same structure as Q and weight parameters $\theta^{-} = \theta$;
B4. Set the total number of episodes M;
B5. Initialize the network input state $s_0$ and compute the network output;
B6. Taking the state set $S_{next} = \{s_0\}$ as the input set, recursively update the network parameters.
Further, in the network congested link diagnosis method based on deep learning of the present invention, the recursive update of the network in step B6 comprises:
B61. For each state in the input set, make an action guess and perform a network update, obtaining the next state at the same time; if the next state is not an absorbing state, add it to the next-state set;
B62. If the next-state set is non-empty, use it as the input of the recursive network update and continue the recursion; otherwise, end.
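The control flow of B61 and B62 can be sketched as below. This is a minimal illustration written as a loop over successive next-state sets rather than literal recursion; `recursive_update`, `guess_and_update` and `is_absorbing` are hypothetical names, and the callback stands in for the action guess and network update, which are not implemented here.

```python
from typing import Callable, Hashable, Iterable, List

State = Hashable


def recursive_update(
    initial_states: Iterable[State],
    guess_and_update: Callable[[State], List[State]],
    is_absorbing: Callable[[State], bool],
) -> int:
    """Process successive next-state sets until one is empty.

    guess_and_update(state) is a placeholder for step B61: it is assumed to
    perform one action guess plus one network update and to return the
    resulting next states. Returns the number of states processed.
    """
    frontier = list(initial_states)
    processed = 0
    while frontier:  # B62: continue while the next-state set is non-empty
        next_frontier: List[State] = []
        for state in frontier:  # B61: handle every state in the input set
            processed += 1
            for nxt in guess_and_update(state):
                if not is_absorbing(nxt):  # keep only non-absorbing states
                    next_frontier.append(nxt)
        frontier = next_frontier
    return processed


# Toy usage: link names stand in for states, "END" for the absorbing state.
toy_children = {"l1": ["l2", "l5"], "l2": ["END"], "l5": ["l6"], "l6": ["END"]}
count = recursive_update(["l1"], lambda s: toy_children[s], lambda s: s == "END")
```

Because one state may produce several next states, the set handed to the next round can grow or shrink; the loop ends once every branch has reached an absorbing state.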
Further, in the network congested link diagnosis method based on deep learning of the present invention, the step in B61 of making an action guess for each state, performing the network update, and obtaining the next-state set comprises:
C1. Select an action with the ε-greedy strategy: with probability ε, randomly choose an action from the action set A as $a_t$; otherwise, feed the current state into the current network (a CNN), compute the Q value of each action, and choose the action with the largest Q value as $a_t$;
C2. Execute $a_t$ and obtain the feedback $r_t$ and the next state $s_{t+1}$;
C3. Store the four parameters $(s_t, a_t, r_t, s_{t+1})$ together in D as the state at this moment; D holds the states of N moments;
C4. Randomly draw a minibatch of state parameter tuples $(s_j, a_j, r_j, s_{j+1})$ from D;
C5. Compute the target value of each state, that is, use the reward obtained after executing the action to update the Q value as the target value: if the next state is an absorbing state, $y_j = r_j$; otherwise $y_j = r_j + \gamma \max_{a'} Q(s_{j+1}, a'; \theta^{-})$;
C6. Update the parameters θ by SGD;
C7. After every C iterations, set the parameters $\theta^{-}$ of the target action-value function network to the parameters θ of the current action-value function network Q; C is a positive integer greater than 1.
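Steps C1 and C5 can be illustrated with two small functions. This is a sketch only: the Q values are passed in as plain lists instead of coming from a CNN, the function names are hypothetical, and the non-absorbing branch of the target is assumed to follow the standard DQN form $r_j + \gamma \max_{a'} Q$ evaluated with the target network.

```python
import random
from typing import Sequence


def epsilon_greedy(q_values: Sequence[float], epsilon: float, rng: random.Random) -> int:
    """C1: with probability epsilon pick a random action, else the argmax of Q."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def td_target(reward: float, next_q_values: Sequence[float], gamma: float,
              absorbing: bool) -> float:
    """C5: y_j = r_j if the next state is absorbing,
    otherwise r_j + gamma * max over the target network's Q values."""
    if absorbing:
        return reward
    return reward + gamma * max(next_q_values)


# Usage: action 0 = guess "normal", action 1 = guess "congested".
rng = random.Random(0)
greedy_action = epsilon_greedy([0.2, 0.9], epsilon=0.0, rng=rng)
target = td_target(1.0, [0.5, 2.0], gamma=0.5, absorbing=False)
```

With `epsilon=0.0` the choice is purely greedy; a small positive ε keeps some exploration during training.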
Further, in the network congested link diagnosis method based on deep learning of the present invention, the link congestion state network is represented by a directed acyclic graph G = (V, E), where V = {0, 1, 2, ..., k, ..., m} is the set of network nodes and $E = \{l_1, l_2, ..., l_k, ..., l_m\}$ is the set of links, link $l_k$ being the link whose end node is k. The set of all paths in the network is defined as $P = \{p_1, p_2, ..., p_i, ..., p_n\}$, and the corresponding set of path congestion state observations is defined as $Y = \{y_1, y_2, ..., y_i, ..., y_n\}$, where $y_i$ is the congestion state of the i-th path $p_i$: $y_i = 1$ means that path $p_i$ is congested, and $y_i = 0$ means that path $p_i$ is normal. $\varphi_k$ denotes the set of paths passing through link $l_k$; $Y_k$ is the set of congestion state observations of the paths in $\varphi_k$; the state set in the network is $X = \{x_1, x_2, ..., x_k, ..., x_m\}$;
A state s is defined as a two-tuple of a link and the set of congestion states of the paths passing through that link, i.e. $s = s_k = (l_k, Y_k)$, and the state set is $S = \{s_1, s_2, s_3, ..., s_k, ..., s_m\}$. In state $s = s_k$, the action set that can be taken is A = {a}, where a = 0 means guessing that link $l_k$ is a normal link and a = 1 means guessing that $l_k$ is a congested link. When the guessed link congestion state is the same as the true link congestion state, a reward is given; otherwise a penalty is given.
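The state definition above can be made concrete with a short sketch. It assumes the routing is available as a mapping from path identifiers to the links each path traverses; `build_states` and the sample identifiers are hypothetical and serve only to illustrate how $\varphi_k$ and $Y_k$ are assembled.

```python
from typing import Dict, List, Tuple

State = Tuple[str, Tuple[int, ...]]  # s_k = (l_k, Y_k)


def build_states(paths: Dict[str, List[str]],
                 path_congested: Dict[str, int]) -> Dict[str, State]:
    """Build s_k = (l_k, Y_k) for every link that appears on some path.

    paths maps a path id to the list of links it traverses (assumed input);
    path_congested maps a path id to its observation y_i (1 = congested).
    """
    links = sorted({link for route in paths.values() for link in route})
    states: Dict[str, State] = {}
    for link in links:
        # phi_k: the paths that pass through this link, in a fixed order
        through = [p for p in sorted(paths) if link in paths[p]]
        # Y_k: the congestion observations of those paths
        y_k = tuple(path_congested[p] for p in through)
        states[link] = (link, y_k)
    return states


# Usage: two paths that share link l1.
example = build_states(
    {"p1": ["l1", "l2"], "p2": ["l1", "l3"]},
    {"p1": 1, "p2": 0},
)
```

Tuples are used so that each state is hashable and can be stored directly in a replay pool or used as a dictionary key.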
Further, in the network congested link diagnosis method based on deep learning of the present invention, the method as a whole has a state set S and an action set A; the policy π selects the action at the next moment according to the current state, a = π(s); each state s in the state set has a corresponding return value R(s); a discount factor γ is set for the successive states in a state sequence, and for each policy π the corresponding value function is $V^{\pi}(s_0) = E[R(s_0) + \gamma R(s_1) + \gamma^{2} R(s_2) + \dots \mid s_0 = s, \pi] = E[R(s_0) + \gamma V^{\pi}(s_1)]$.
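The discounted sum inside the value function can be checked numerically for a single, deterministic state sequence. The sketch below is an illustration only; `discounted_return` is a hypothetical helper, and the reward sequence is made up.

```python
from typing import Sequence


def discounted_return(rewards: Sequence[float], gamma: float) -> float:
    """Compute R(s_0) + gamma*R(s_1) + gamma^2*R(s_2) + ... for one trajectory."""
    total = 0.0
    for r in reversed(rewards):
        # Bellman-style recursion: value = reward now + gamma * value of the rest
        total = r + gamma * total
    return total


# Usage: two correct guesses followed by one wrong guess, gamma = 0.5.
rewards = [1.0, 1.0, -2.0]
value = discounted_return(rewards, 0.5)
```

Accumulating from the end of the sequence mirrors the recursive form $V^{\pi}(s_0) = E[R(s_0) + \gamma V^{\pi}(s_1)]$, with the expectation dropped because only one trajectory is considered.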
To solve its technical problem, the present invention also provides a network congested link diagnosis system based on deep learning, which performs congested link diagnosis using any of the deep-learning-based network congested link diagnosis methods described above.
Beneficial effects of the present invention: the present invention studies the tomography of network link performance parameters and combines deep learning with reinforcement learning through Deep Q-Learning, obtaining the operating state of links in the network more accurately and quickly. The network is trained with DQN, and a recursive update scheme is proposed to handle the multiple next states that can arise in a state transition in this problem. Experiments show that the scheme has higher inference accuracy and robustness than the SCFS algorithm.
Brief description of the drawings
The present invention is further described below with reference to the drawings and embodiments. In the drawings:
Fig. 1 is a flow chart of one embodiment of the network congested link diagnosis method based on deep learning of the present invention;
Fig. 2 is a schematic diagram of the neural network training of the present invention;
Fig. 3 is a flow chart of the DQN training of the present invention;
Fig. 4 is a flow chart of the recursive update in the DQN training of the present invention;
Fig. 5 is a flow chart of the action guess and network update in the DQN training of the present invention;
Fig. 6 is a schematic diagram of the state transition when 0 is guessed;
Fig. 7 is a schematic diagram of the state transition when 1 is guessed;
Fig. 8 shows the relationship between the number of training cycles and the degree of difference between the value network and the target network;
Fig. 9 compares the DR of the present invention and the SCFS algorithm;
Fig. 10 compares the FPR of the present invention and the SCFS algorithm.
Specific embodiment
For a clearer understanding of the technical features, objects and effects of the present invention, specific embodiments of the invention are now described in detail with reference to the drawings.
The invention discloses a network congested link diagnosis method and system based on Deep Q-Learning (DQN). Its main content is that reinforcement learning is combined with deep learning through DQN, exploiting its advantage in handling high-dimensional states, and congested links are diagnosed with a Q-Learning scheme that constructs labels from a "state-action-reward" strategy. In the reinforcement learning part of DQN, the "state" is defined as a two-tuple consisting of a link and the set of congestion states of all paths passing through that link; the "action" is a guess, based on the path congestion state set, of whether the link is congested; the "reward" is positive for a correct guess and negative for a wrong one. The deep learning part of DQN uses a deep neural network such as a deep convolutional neural network. In this way, through continual iteration, DQN autonomously learns the association between congested paths and congested links in the network and diagnoses congested links accurately. Simulation results under a variety of network congestion scenarios show that the DQN method of the present invention diagnoses congested links better than the more traditional SCFS method.
With reference to Fig. 1, the network congested link diagnosis method based on deep learning used in this embodiment comprises the following steps:
S1. Collect M sets of physical-link congestion state data from the network to be diagnosed, obtaining M link-congestion-state network directed acyclic graphs that serve as the sample pool; M is a positive integer greater than 1;
S2. Perform decision-state modeling on each link-congestion-state network directed acyclic graph, and together generate the state set S;
S3. Using the neural network training method DQN, take the state set S generated from the M link-congestion-state network directed acyclic graphs and the corresponding decision set A as input and train the neural network; in each group of training data, the state of one directed acyclic graph is the input and the corresponding decision is the output;
S4. Using the same method as in step S2, perform decision-state modeling on the network directed acyclic graph whose congested links are to be diagnosed, generating the initial task state $s_0$; feed $s_0$ into the neural network obtained by DQN training to predict the link congestion states.
With reference to Fig. 2, the DQN objective function is constructed as follows:
A1. Use a deep neural network with parameter ω as the Q-value network; by updating ω, make the Q function approximate the optimal Q value: $Q(s, a, \omega) \approx Q^{\pi}(s, a)$, where s denotes the state and a denotes the decision (action);
A2. Use the mean-square error to define the objective function on the Q value, i.e. the loss function: $L(\omega) = E\left[\left(r + \gamma \max_{a'} Q(s', a', \omega) - Q(s, a, \omega)\right)^{2}\right]$;
where L(ω) is the objective function, s′ denotes the next state, a′ denotes the next decision, E denotes the expectation, r denotes the reward, and γ denotes the discount factor;
A3. Compute the gradient of the loss function with respect to the parameter ω;
A4. Use SGD to optimize the objective end to end.
With reference to Fig. 3, the main steps of DQN training include:
B1. Initialize the experience pool D with capacity N, used to store training samples;
B2. Initialize the neural network of the action-value function Q, with random weight parameters θ;
B3. Initialize the neural network of the target action-value function, with the same structure as Q and weight parameters $\theta^{-} = \theta$;
B4. Set the total number of episodes M;
B5. Initialize the network input state $s_0$ and compute the network output;
B6. Taking the state set $S_{next} = \{s_0\}$ as the input set, recursively update the network parameters.
The recursive update of the network in step B6 comprises:
B61. For each state in the input set, make an action guess and perform a network update, obtaining the next state at the same time; if the next state is not an absorbing state, add it to the next-state set;
B62. If the next-state set is non-empty, use it as the input of the recursive network update and continue the recursion; otherwise, end.
The more specific steps are shown in Fig. 4. The step in B61 of making an action guess for each state, performing the network update, and obtaining the next-state set is shown in Fig. 5 and comprises:
C1. Select an action with the ε-greedy strategy: with a (very small) probability ε, randomly choose an action from the action set A as $a_t$; otherwise, feed the current state into the current network (a CNN), compute the Q value of each action, and choose the action with the largest Q value (the optimal action) as $a_t$;
C2. Execute $a_t$ and obtain the feedback $r_t$ and the next state $s_{t+1}$;
C3. Store the four parameters $(s_t, a_t, r_t, s_{t+1})$ together in D as the state at this moment (D holds the states of N moments);
C4. Randomly draw a minibatch of state parameter tuples $(s_j, a_j, r_j, s_{j+1})$ from D;
C5. Compute the target value of each state (use the reward obtained after executing the action to update the Q value as the target value): if the next state is an absorbing state, $y_j = r_j$; otherwise $y_j = r_j + \gamma \max_{a'} Q(s_{j+1}, a'; \theta^{-})$;
C6. Update the parameters θ by SGD;
C7. After every C iterations, set the parameters $\theta^{-}$ of the target action-value function network to the parameters θ of the current action-value function network Q.
The network congested link diagnosis method based on deep reinforcement learning of the present invention mainly comprises the Deep Q Network (DQN) and congested link diagnosis. Deep Q Learning comprises reinforcement learning and a deep neural network; congested link diagnosis comprises obtaining the end-to-end link congestion states by the network tomography method.
Variable definitions for the network:
The network is represented by a directed acyclic graph G = (V, E), where V = {0, 1, 2, ..., k, ..., m} is the set of nodes and $E = \{l_1, l_2, ..., l_k, ..., l_m\}$ is the set of links, link $l_k$ being the link whose end node is k. The set of all paths in the network is defined as $P = \{p_1, p_2, ..., p_i, ..., p_n\}$, and the corresponding set of path congestion state observations is defined as $Y = \{y_1, y_2, ..., y_i, ..., y_n\}$, where $y_i$ is the congestion state of the i-th path $p_i$: $y_i = 1$ means that path $p_i$ is congested, and $y_i = 0$ means that path $p_i$ is normal. $\varphi_k$ denotes the set of paths passing through link $l_k$; $Y_k$ is the set of congestion state observations of the paths in $\varphi_k$. The state set in the network is $X = \{x_1, x_2, ..., x_k, ..., x_m\}$.
Variable definitions used in DQN:
A state s is defined as a two-tuple of a link and the set of congestion states of the paths passing through that link, i.e. $s = s_k = (l_k, Y_k)$; the state set is $S = \{s_1, s_2, s_3, ..., s_k, ..., s_m\}$. In state $s = s_k$, the action set that can be taken is A = {a}, where a = 0 means guessing that link $l_k$ is a normal link and a = 1 means guessing that $l_k$ is a congested link. When the guessed link congestion state is the same as the true link congestion state, the reward R(s, a) = 1 is obtained; otherwise the penalty R(s, a) = -2 is obtained.
Illustration of DQN diagnosis:
With reference to Fig. 6, the initial state set is $s_1 = (l_1, [1,1,1,1,1])$; when 0 is guessed, the state transitions;
States $s_2 = (l_2, [1,1]), (l_5, [1,1,1])$: the two states are learned in parallel and transition to the next states;
States $s_3 = (l_6, [1,1]), (l_9, [1])$: when 1 is guessed, they transition to the absorbing state and the process ends.
With reference to Fig. 7, when 1 is guessed in $s_1 = (l_1, [1,1,1,1,1])$, the state transitions directly to the absorbing state E and the process ends.
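The transitions described for Fig. 6 and Fig. 7 can be sketched as a single step function. This is an illustration under stated assumptions: the `children` map (which links lie immediately downstream of a link) and the `true_congested` labels are hypothetical inputs, the reward values 1 and -2 are taken from the variable definitions above, and treating a guess of 0 on a link with no downstream links as absorbing is an assumption the text does not spell out.

```python
from typing import Dict, List, Tuple, Union

State = Tuple[str, Tuple[int, ...]]  # s_k = (l_k, Y_k)
ABSORBING = "E"                      # absorbing state, labelled E in Fig. 7


def step(state: State, action: int,
         children: Dict[str, List[str]],
         states: Dict[str, State],
         true_congested: Dict[str, int]) -> Tuple[int, List[Union[State, str]]]:
    """Apply one guess to a state and return (reward, next states)."""
    link, _observations = state
    reward = 1 if action == true_congested[link] else -2
    downstream = children.get(link, [])
    if action == 1 or not downstream:
        # Guessing "congested" ends this branch (Fig. 7); a guess of 0 with no
        # downstream links is also treated as terminal (assumption).
        return reward, [ABSORBING]
    # Guessing "normal" moves on to the downstream links' states (Fig. 6).
    return reward, [states[c] for c in downstream]


# Usage with the first transition of Fig. 6; the truth labels are made up.
fig6_states = {
    "l1": ("l1", (1, 1, 1, 1, 1)),
    "l2": ("l2", (1, 1)),
    "l5": ("l5", (1, 1, 1)),
}
fig6_children = {"l1": ["l2", "l5"]}
truth = {"l1": 0, "l2": 1, "l5": 1}
r0, next0 = step(fig6_states["l1"], 0, fig6_children, fig6_states, truth)
r1, next1 = step(fig6_states["l1"], 1, fig6_children, fig6_states, truth)
```

A guess of 0 can therefore produce several next states at once, which is why the training procedure processes a set of next states rather than a single successor.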
In the network congested link diagnosis method based on deep learning of the present invention, the policy is continually optimized starting from the current state. The whole system has a state set S and an action set A; the policy π selects the action at the next moment according to the current state, a = π(s); each state s in the state set has a corresponding return value R(s). A discount factor γ is set for the successive states in a state sequence, and for each policy π the corresponding value function is $V^{\pi}(s_0) = E[R(s_0) + \gamma R(s_1) + \gamma^{2} R(s_2) + \dots \mid s_0 = s, \pi]$. This function satisfies the Bellman equation and can be written as $V^{\pi}(s_0) = E[R(s_0) + \gamma V^{\pi}(s_1)]$.
A deep neural network (DNN) is a neural network built by stacking multiple layers, each of which consists of nodes. Computation takes place in the nodes, whose mode of operation roughly resembles that of a human neuron: a node is activated and emits a signal when it receives enough stimulus. A node combines the input data with a set of coefficients (weights) that amplify or suppress each input, thereby assigning its importance for the task the algorithm is learning. The sum of the products of inputs and weights is passed to the node's activation function, which determines whether and how far the signal propagates through the network, and thus how the signal affects the final result of the network, such as a classification. A deep learning network differs from the more common single-hidden-layer neural network in its depth, that is, the number of node layers the data passes through in a multi-step pattern recognition process. A system with three or more layers (including the input and output layers) can be called "deep" learning, so depth is a strictly defined term meaning more than one hidden layer.
In a deep learning network, each layer of nodes learns to recognize a specific set of features on the basis of the output of the previous layer. As the depth of the network increases, the features the nodes can recognize become increasingly complex, because each layer integrates and recombines the features of the previous layer. The form and size of deep neural networks vary with the application, and popular forms and sizes are evolving rapidly to improve model accuracy and efficiency. Networks that process their input come in two main forms: feedforward and recurrent. In a feedforward network, such as a CNN, every computation is a sequence of operations on the output of the previous layer. A recurrent network, such as an LSTM, has internal memory, which allows long-term dependencies to influence the output.
According to the Q-Learning update formula $Q^{*}(s, a) = Q(s, a) + \alpha\left(r + \gamma \max_{a'} Q(s', a') - Q(s, a)\right)$, the loss function of DQN is $L(\theta) = E[(\mathrm{TargetQ} - Q(s, a; \theta))^{2}]$, where θ is the network parameter and the target is $\mathrm{TargetQ} = r + \gamma \max_{a'} Q(s', a'; \theta)$. The experience pool (experience replay) mainly addresses the problems of sample correlation and non-stationary distribution. Specifically, the transition sample $(s_t, a_t, r_t, s_{t+1})$ obtained from the interaction between the agent and the environment at each time step is stored in the replay memory, and a random minibatch is drawn from it for training. The target network generates the TargetQ value: $Q(s, a; \theta_i)$ denotes the output of the current network MainNet and is used to evaluate the value function of the current state-action pair; $Q(s, a; \theta_i^{-})$ denotes the output of TargetNet and is substituted into the TargetQ formula above to obtain the target Q value. The parameters of MainNet are updated according to the loss function above, and after every N rounds of iteration the parameters of MainNet are copied to TargetNet.
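The two mechanisms in this paragraph, the experience pool and the periodic copy from MainNet to TargetNet, can be sketched with the standard library alone. The class and function names are hypothetical, and network parameters are represented by a plain dictionary rather than real network weights.

```python
import copy
import random
from collections import deque
from typing import Any, Dict, List, Tuple

Transition = Tuple[Any, int, float, Any]  # (s_t, a_t, r_t, s_{t+1})


class ReplayPool:
    """Fixed-capacity experience pool; the oldest transitions are dropped first."""

    def __init__(self, capacity: int) -> None:
        self.buffer: deque = deque(maxlen=capacity)

    def store(self, s: Any, a: int, r: float, s_next: Any) -> None:
        self.buffer.append((s, a, r, s_next))

    def sample(self, batch_size: int, rng: random.Random) -> List[Transition]:
        # Uniform random minibatch, which breaks the correlation between
        # consecutive transitions.
        return rng.sample(list(self.buffer), min(batch_size, len(self.buffer)))

    def __len__(self) -> int:
        return len(self.buffer)


def sync_target(main_params: Dict[str, Any], target_params: Dict[str, Any],
                iteration: int, every: int) -> bool:
    """Copy MainNet parameters into TargetNet once every `every` iterations."""
    if iteration % every != 0:
        return False
    target_params.clear()
    target_params.update(copy.deepcopy(main_params))
    return True


# Usage: a pool of capacity 2 keeps only the two most recent transitions.
pool = ReplayPool(capacity=2)
for t in range(3):
    pool.store(("l1", (1,)), t % 2, 1.0, "E")
batch = pool.sample(5, random.Random(0))
target_net: Dict[str, Any] = {}
synced = sync_target({"w": 1.0}, target_net, iteration=4, every=2)
```

Keeping TargetNet fixed between copies holds the regression target steady for a while, which is the reason the paragraph gives for using a separate target network.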
Network tomography treats the target network as a black box. In general, all measurements are carried out at the end nodes of the network, a strategy referred to as end-to-end measurement. Besides passively listening to data packet transmissions between pairs of end nodes, end-to-end measurement more often actively sends probe packets between end nodes. Depending on the routing protocol adopted, end-to-end measurement is currently divided into two main types: multicast measurement and unicast measurement. For security and other reasons, routers in the current Internet support unicast better than multicast. Based on multi-slot measurement, Nguyen et al. demonstrated, under the framework of Boolean network tomography, that the prior congestion probabilities of links can be uniquely determined from the data of multiple measurement time slots, and proposed a method based on matrix inversion, CLINK (Congested LINK identification), to solve for them; combining the prior congestion probabilities of the links with the end-to-end data of a subsequent measurement time slot then yields a maximum a posteriori (MAP) estimate of the link states. Compared with the SCFS algorithm, this method is more accurate and, especially when the network contains many congested links, has a higher detection rate. Ghita et al. of the Swiss Federal Institute of Technology in Lausanne (EPFL) generalized the above method to more general scenarios, found and proved the necessary and sufficient condition under which the link prior congestion probabilities are identifiable when the states of certain links in the network are not mutually independent, and proposed a scheme that obtains the link prior congestion probabilities by measuring only a small number of end-to-end paths.
The experiment uses a simulated network with 15 paths; the prior congestion probabilities of the links are generated randomly, and the simulation is repeated 100 times in total. Link congestion diagnosis is carried out with the method of this patent, and the resulting relationship between the number of training cycles and the degree of difference between the value network and the target network is shown in Fig. 8. The horizontal axis is the number of training cycles and the vertical axis is the difference between the value network and the target network. It can be seen that the difference between the value network and the target network decreases as the number of training cycles increases. The SCFS algorithm and the algorithm of this patent are then compared, giving line charts of the detection rate (DR) and the false positive rate (FPR) against the congestion probability ρ, shown in Fig. 9 and Fig. 10 respectively. Here the detection rate is the proportion of all positive samples that are detected, and the false positive rate is the proportion of samples detected as positive that are actually negative. As Fig. 9 shows, the detection rate of the SCFS algorithm decreases as the congestion probability increases, whereas the detection rate of the present method is little affected by the congestion probability, stays at a high level throughout, and is clearly better than that of the SCFS algorithm. As Fig. 10 shows, when the congestion probability is less than or equal to 0.7, the false positive rate of the present method is only slightly higher than that of the SCFS algorithm. Overall, the present method performs better.
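The two metrics can be computed from the set of truly congested links and the set of links the diagnosis reports as congested. The sketch below follows the definitions given in this paragraph, so the FPR here is the share of reported links that are actually normal; the function names and the sample link sets are hypothetical, and returning 0.0 for an empty denominator is a convention chosen for the sketch.

```python
from typing import Iterable


def detection_rate(true_congested: Iterable[str], detected: Iterable[str]) -> float:
    """DR: detected positive samples divided by all positive samples."""
    truth, found = set(true_congested), set(detected)
    return len(truth & found) / len(truth) if truth else 0.0


def false_positive_rate(true_congested: Iterable[str], detected: Iterable[str]) -> float:
    """FPR as defined in the text: among the samples detected as positive,
    the proportion that are actually negative."""
    truth, found = set(true_congested), set(detected)
    return len(found - truth) / len(found) if found else 0.0


# Usage: four links are truly congested and three are reported.
truly_congested = {"l1", "l2", "l3", "l4"}
reported = {"l1", "l2", "l5"}
dr = detection_rate(truly_congested, reported)
fpr = false_positive_rate(truly_congested, reported)
```

In this example two of the four congested links are found, and one of the three reported links is actually normal.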
The embodiments of the present invention have been described above with reference to the accompanying drawings, but the present invention is not limited to the specific embodiments described above. The above embodiments are merely illustrative rather than restrictive. Under the inspiration of the present invention, those of ordinary skill in the art can devise many other forms without departing from the purpose of the present invention and the scope protected by the claims, all of which fall within the protection of the present invention.

Claims (8)

1. A network congestion link diagnosis method based on deep learning, characterized by comprising the following steps:
S1. For the network to be diagnosed, collect M copies of congestion state data of the physical links to obtain M link congestion state network directed acyclic graphs, which serve as the sample pool; M is an integer greater than 1;
S2. Perform decision state modeling on each link congestion state network directed acyclic graph separately, and together generate the state set S;
S3. According to the neural network training method DQN, use the state set S and the corresponding decision set A as the training data set to train the neural network; during training, each group of training data takes the state of one directed acyclic graph as input and the corresponding decision as output;
S4. Using the same method as in step S2, perform decision state modeling on the network directed acyclic graph of the network whose congested links are to be diagnosed, generate the initial task state s_0, and feed the state s_0 into the neural network trained in step S3 to predict the link congestion states.
2. The network congestion link diagnosis method based on deep learning according to claim 1, characterized in that the objective function of the DQN in step S3 is constructed as follows:
A1. Use a deep neural network with parameters ω as the Q-value network, and make the Q function approximate the optimal value by updating ω: Q(s, a, ω) ≈ Q^π(s, a), where s denotes the state, a denotes the decision, and π is the policy;
A2. Define the objective function on the Q value using the mean squared error:
L(ω) = E[(r + γ max_a′ Q(s′, a′, ω) − Q(s, a, ω))^2];
where s′ denotes the next state, a′ denotes the next decision, E denotes the expectation, r denotes the reward, and γ denotes the discount factor;
A3. Compute the gradient of the objective function with respect to the parameters ω:
A4. Use SGD to optimize the objective end to end.
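A worked form of the gradient in step A3 can be sketched as follows. It assumes, as is common in DQN but not stated in the claim, that the target term r + γ max Q(s′, a′, ω) is treated as a constant when differentiating the squared error L(ω) = E[(r + γ max Q(s′, a′, ω) − Q(s, a, ω))^2]; the constant factor 2 is often absorbed into the learning rate.

```latex
\nabla_{\omega} L(\omega)
  = -2\,\mathbb{E}\!\left[
      \Bigl(r + \gamma \max_{a'} Q(s', a', \omega) - Q(s, a, \omega)\Bigr)\,
      \nabla_{\omega} Q(s, a, \omega)
    \right]
```

An SGD step in A4 would then move ω against this gradient, estimated on sampled transitions.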
3. The network congestion link diagnosis method based on deep learning according to claim 2, characterized in that, in DQN training, the main steps include:
B1. Initialize the experience pool D with capacity N, used to store training samples;
B2. Initialize the action-value function Q neural network, whose weight parameters θ are random values;
B3. Initialize the target action-value function Q̂ neural network, whose structure is identical to that of Q and whose weight parameters are θ⁻ = θ;
B4. Set the total number of episodes M;
B5. Initialize the network input state s_0 and compute the network output;
B6. With the state set S_next = {s_0} as the input set, recursively update the network parameters.
4. The network congestion link diagnosis method based on deep learning according to claim 3, characterized in that the recursive update of the network in step B6 comprises:
B61. For each state in the input set, make an action guess and perform a network update, obtaining the next state at the same time; if the next state is not an absorbing state, add it to the next-state set;
B62. If the next-state set is non-empty, use it as the input of the recursive network update and continue the recursion; otherwise, terminate.
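The control flow of steps B61 and B62 can be sketched in Python as below. This is only an illustration of the recursion over state sets: the callback `step` is a hypothetical stand-in for "guess an action, update the network, and return the next state", and returning `None` is an assumed way of marking an absorbing next state. The recursion is written as a loop, which visits the same states in the same order.

```python
from typing import Callable, Hashable, Iterable, List, Optional

State = Hashable


def recursive_update(initial_states: Iterable[State],
                     step: Callable[[State], Optional[State]]) -> int:
    """Process state sets level by level until the next-state set is empty.

    `step(s)` is assumed to perform the action guess and network update for
    state s and to return the next state, or None if that state is absorbing.
    Returns the number of states processed.
    """
    processed = 0
    current: List[State] = list(initial_states)
    while current:                       # B62: continue while the set is non-empty
        next_states: List[State] = []
        for s in current:                # B61: handle every state in the input set
            nxt = step(s)
            processed += 1
            if nxt is not None:          # keep only non-absorbing next states
                next_states.append(nxt)
        current = next_states
    return processed


if __name__ == "__main__":
    # Toy chain 0 -> 1 -> 2 -> absorbing; no learning happens here.
    toy_step = lambda s: s + 1 if s + 1 < 3 else None
    print(recursive_update([0], toy_step))
```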
5. The network congestion link diagnosis method based on deep learning according to claim 4, characterized in that, in step B61, making an action guess for each state, performing the network update, and obtaining the set of next states comprises:
C1. Select an action using the ε-greedy strategy: with probability ε, randomly select an action from the action set A as a_t; otherwise, input the current state into the current network, compute the Q value of each action with a CNN, and select the action with the largest Q value as a_t;
C2. Execute a_t, and obtain the feedback r_t after executing a_t and the next state s_{t+1};
C3. Store the four parameters (s_t, a_t, r_t, s_{t+1}) together in D as the state at this moment; D stores the states of N moments;
C4. Randomly take a minibatch of state parameter groups (s_j, a_j, r_j, s_{j+1}) from D;
C5. Compute the target value of each state; specifically, the reward after executing a_t is used as the target value to update the Q value: if the next state is an absorbing state, then y_j = r_j; otherwise y_j = r_j + γ max_a′ Q̂(s_{j+1}, a′; θ⁻);
C6. Update the parameters θ by SGD;
C7. After every C iterations, update the parameters θ⁻ of the target action-value function network Q̂ to the parameters θ of the current action-value function network Q, where C is a positive integer greater than 1.
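Steps C1, C3 to C5 can be illustrated with a small, framework-free Python sketch. It is not the patent's implementation: the names `epsilon_greedy`, `ExperiencePool` and `td_target` are hypothetical, an absorbing next state is encoded as `None` by assumption, and the non-absorbing target is assumed to be the standard DQN form r_j + γ max Q̂(s_{j+1}, a′; θ⁻), with `target_q` standing in for the target network.

```python
import random
from collections import deque
from typing import Callable, Deque, List, Optional, Sequence, Tuple

# (s_t, a_t, r_t, s_{t+1}); a next state of None marks an absorbing state (assumption).
Transition = Tuple[object, int, float, Optional[object]]


def epsilon_greedy(q_values: Sequence[float], epsilon: float,
                   rng: random.Random) -> int:
    """C1: random action with probability epsilon, else the arg-max action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


class ExperiencePool:
    """C3/C4: fixed-capacity pool D; the oldest transitions are dropped first."""

    def __init__(self, capacity: int) -> None:
        self._data: Deque[Transition] = deque(maxlen=capacity)

    def store(self, transition: Transition) -> None:
        self._data.append(transition)

    def sample(self, batch_size: int, rng: random.Random) -> List[Transition]:
        k = min(batch_size, len(self._data))
        return rng.sample(list(self._data), k)

    def __len__(self) -> int:
        return len(self._data)


def td_target(reward: float, next_state: Optional[object],
              target_q: Callable[[object], Sequence[float]],
              gamma: float) -> float:
    """C5: y_j = r_j for an absorbing next state, else r_j + gamma * max target Q."""
    if next_state is None:
        return reward
    return reward + gamma * max(target_q(next_state))


if __name__ == "__main__":
    rng = random.Random(0)
    pool = ExperiencePool(capacity=2)
    for t in range(3):
        pool.store((t, epsilon_greedy([0.1, 0.9], 0.0, rng), 1.0, None))
    print(len(pool), td_target(1.0, "s1", lambda s: [0.5, 2.0], 0.5))
```

Steps C6 and C7 (the SGD update and the periodic copy of θ into θ⁻) depend on the chosen neural network library and are left out of this sketch.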
6. The network congestion link diagnosis method based on deep learning according to claim 1, characterized in that the link congestion state network is represented by a directed acyclic graph G = (V, E), where V = {0, 1, 2, …, k, …, m} is the set of network nodes, E = {l_1, l_2, …, l_k, …, l_m} is the set of links, and link l_k denotes the link whose end node is k; the set of all paths in the network is defined as P = {p_1, p_2, …, p_i, …, p_n}, and the corresponding set of path congestion state observations is defined as Y = {y_1, y_2, …, y_i, …, y_n}, where y_i is the congestion state of the i-th path p_i: y_i = 1 indicates that path p_i is in the congested state, and y_i = 0 indicates that path p_i is in the normal state; φ_k denotes the set of paths passing through link l_k; Y_k is the set of congestion state observations of the paths in φ_k; and the set of link states in the network is X = {x_1, x_2, …, x_k, …, x_m};
The state s is defined as a 2-tuple of a link and the set of congestion states of the paths passing through that link, i.e. s = s_k = (l_k, Y_k), and the state set is S = {s_1, s_2, s_3, …, s_k, …, s_m}; in state s = s_k, the set of actions that can be taken is A = {a}, where a = 0 means guessing that link l_k is a normal link, and a = 1 means guessing that l_k is a congested link; when the true link congestion state is the same as the guessed link congestion state, a reward is given; otherwise, a penalty is given.
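The state definition s_k = (l_k, Y_k) can be sketched in Python as follows. The representation of a path as a list of link indices, the function names `build_states` and `reward`, and the reward magnitudes +1 and −1 are all assumptions made for illustration; the claim only says that a correct guess is rewarded and a wrong one is penalized.

```python
from typing import Dict, Sequence, Tuple

LinkState = Tuple[int, Tuple[int, ...]]  # (k, Y_k)


def build_states(paths: Sequence[Sequence[int]],
                 observations: Sequence[int],
                 num_links: int) -> Dict[int, LinkState]:
    """Build s_k = (l_k, Y_k) for every link k = 1..m.

    paths[i] lists the indices of the links traversed by path p_i, and
    observations[i] is y_i (1 = congested path, 0 = normal path).
    """
    if len(paths) != len(observations):
        raise ValueError("one observation is required per path")

    states: Dict[int, LinkState] = {}
    for k in range(1, num_links + 1):
        # phi_k: indices of the paths that pass through link l_k.
        phi_k = [i for i, path in enumerate(paths) if k in path]
        # Y_k: the observed congestion states of exactly those paths.
        y_k = tuple(observations[i] for i in phi_k)
        states[k] = (k, y_k)
    return states


def reward(true_link_state: int, action: int,
           bonus: float = 1.0, penalty: float = -1.0) -> float:
    """Reward a guess (action 0 or 1) that matches the true link state."""
    return bonus if action == true_link_state else penalty


if __name__ == "__main__":
    # Toy tree: path p_1 uses links 1 and 2, path p_2 uses links 1 and 3.
    print(build_states([[1, 2], [1, 3]], [1, 0], num_links=3))
```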
7. The network congestion link diagnosis method based on deep learning according to claim 1, characterized in that the whole method has a state set S, a policy set A, and a policy π; the action at the next moment is selected according to the current state, a = π(s); each state s in the state set has a corresponding return value R(s); a discount factor γ is set for each successive state in the state sequence; and for each policy π, a corresponding value function is set: V^π(s_0) = E[R(s_0) + γR(s_1) + γ^2 R(s_2) + … | s_0 = s, π] = E[R(s_0) + γV^π(s_1)].
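For a single sampled state sequence, the discounted sum inside the expectation can be computed with the recursive form G_t = R(s_t) + γ G_{t+1}, as in the Python sketch below. The function name `discounted_return` is illustrative, and the sketch evaluates one trajectory only; the value function itself is the expectation of this quantity over trajectories generated by π.

```python
from typing import Sequence


def discounted_return(rewards: Sequence[float], gamma: float) -> float:
    """Return R(s_0) + gamma*R(s_1) + gamma^2*R(s_2) + ... for one trajectory.

    The sum is accumulated backwards, which mirrors the recursive identity
    V(s_0) = R(s_0) + gamma * V(s_1) applied along the sampled sequence.
    """
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total


if __name__ == "__main__":
    print(discounted_return([1.0, 1.0, 1.0], gamma=0.5))  # 1 + 0.5 + 0.25
```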
8. A network congestion link diagnosis system based on deep learning, characterized in that network congestion link diagnosis is performed using the network congestion link diagnosis method based on deep learning according to any one of claims 1-7.
CN201810890267.0A 2018-08-07 2018-08-07 Network congestion link diagnosis method and system based on deep reinforcement learning Active CN109194583B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810890267.0A CN109194583B (en) 2018-08-07 2018-08-07 Network congestion link diagnosis method and system based on deep reinforcement learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810890267.0A CN109194583B (en) 2018-08-07 2018-08-07 Network congestion link diagnosis method and system based on deep reinforcement learning

Publications (2)

Publication Number Publication Date
CN109194583A true CN109194583A (en) 2019-01-11
CN109194583B CN109194583B (en) 2021-05-14

Family

ID=64920861

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810890267.0A Active CN109194583B (en) 2018-08-07 2018-08-07 Network congestion link diagnosis method and system based on deep reinforcement learning

Country Status (1)

Country Link
CN (1) CN109194583B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2012131424A1 (en) * 2011-02-25 2012-10-04 Telefonaktiebolaget L M Ericsson (Publ) Method for introducing network congestion predictions in policy decision
CN107396204A (en) * 2017-06-12 2017-11-24 江苏大学 A kind of P2P video request program node selecting methods based on linear programming and intensified learning

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
AKMAL YAQINI 等: ""An artificial neural network based fault detection and diagnosis for wireless mesh networks"", 《2018 WIRELESS DAYS (WD)》 *
罗颖 等: ""无线网络中基于强化学习的拥塞控制算法改进"", 《自动化仪表》 *
陈炜: ""基于神经网络预测算法的网络拥塞控制"", 《计算机信息工程》 *

Cited By (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113227973A (en) * 2019-02-26 2021-08-06 谷歌有限责任公司 Reinforcement learning techniques for selecting software policy networks and autonomously controlling corresponding software clients based on the selected policy networks
CN110135482A (en) * 2019-04-30 2019-08-16 中国地质大学(武汉) A kind of network topology estimating method and system based on convolutional neural networks
CN110213025A (en) * 2019-05-22 2019-09-06 浙江大学 Dedicated ad hoc network anti-interference method based on deeply study
US11463961B2 (en) 2019-06-03 2022-10-04 Nokia Solutions And Networks Oy Uplink power control using deep Q-learning
WO2020244906A1 (en) * 2019-06-03 2020-12-10 Nokia Solutions And Networks Oy Uplink power control using deep q-learning
CN110225019A (en) * 2019-06-04 2019-09-10 腾讯科技(深圳)有限公司 A kind of network security processing method and device
CN110225019B (en) * 2019-06-04 2021-08-31 腾讯科技(深圳)有限公司 Network security processing method and device
CN112242959A (en) * 2019-07-16 2021-01-19 ***通信集团浙江有限公司 Micro-service current-limiting control method, device, equipment and computer storage medium
CN112242959B (en) * 2019-07-16 2022-10-14 ***通信集团浙江有限公司 Micro-service current-limiting control method, device, equipment and computer storage medium
CN110458466A (en) * 2019-08-16 2019-11-15 内蒙古大学 Based on data mining and the associated patent estimation method of Heterogeneous Knowledge, valuation system
CN110458466B (en) * 2019-08-16 2023-09-26 内蒙古大学 Patent estimation method and system based on data mining and heterogeneous knowledge association
CN110581808A (en) * 2019-08-22 2019-12-17 武汉大学 Congestion control method and system based on deep reinforcement learning
CN110809274A (en) * 2019-10-28 2020-02-18 南京邮电大学 Narrowband Internet of things-oriented unmanned aerial vehicle base station enhanced network optimization method
CN110768906A (en) * 2019-11-05 2020-02-07 重庆邮电大学 SDN-oriented energy-saving routing method based on Q learning
CN110768906B (en) * 2019-11-05 2022-08-30 重庆邮电大学 SDN-oriented energy-saving routing method based on Q learning
CN111416774A (en) * 2020-03-17 2020-07-14 深圳市赛为智能股份有限公司 Network congestion control method and device, computer equipment and storage medium
CN112714074A (en) * 2020-12-29 2021-04-27 西安交通大学 Intelligent TCP congestion control method, system, equipment and storage medium
CN112714074B (en) * 2020-12-29 2023-03-31 西安交通大学 Intelligent TCP congestion control method, system, equipment and storage medium
CN114567597A (en) * 2022-02-21 2022-05-31 重庆邮电大学 Congestion control method and device based on deep reinforcement learning in Internet of things
CN114567597B (en) * 2022-02-21 2023-12-19 深圳市亦青藤电子科技有限公司 Congestion control method and device based on deep reinforcement learning in Internet of things
CN115208821A (en) * 2022-07-18 2022-10-18 广东电网有限责任公司 Cross-network route forwarding method and device based on BP neural network
CN115208821B (en) * 2022-07-18 2023-08-08 广东电网有限责任公司 Cross-network route forwarding method and device based on BP neural network

Also Published As

Publication number Publication date
CN109194583B (en) 2021-05-14

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant