CN114827021A

CN114827021A - Multimedia service flow acceleration system based on SDN and machine learning

Info

Publication number: CN114827021A
Application number: CN202210732754.0A
Authority: CN
Inventors: 郭永安; 吴庆鹏; 张啸; 余昊; 钱琪杰
Original assignee: Nanjing University of Posts and Telecommunications
Current assignee: Nanjing University of Posts and Telecommunications
Priority date: 2022-06-27
Filing date: 2022-06-27
Publication date: 2022-07-29
Anticipated expiration: 2042-06-27
Also published as: CN114827021B

Abstract

The invention discloses a multimedia service flow acceleration system based on SDN and machine learning, which comprises a flow classification module and a path selection module. The flow classification module trains a machine learning model according to multimedia service flow information in a network to generate a service flow classifier; the service flow classifier classifies incoming network services and identifies the flow requirement corresponding to each network service, wherein the parameters of the flow requirement comprise packet loss rate, time delay and bandwidth. The path selection module calculates a corresponding routing strategy according to the flow requirement of the network service, satisfying each parameter of the flow requirement on the basis of path reachability. The invention can optimally deploy service flow classification and scheduling strategies, realize rapid deployment and implementation, and improve the utilization rate of network resources.

Description

Multimedia service flow acceleration system based on SDN and machine learning

Technical Field

The invention belongs to the technical field of computer networks, and particularly relates to a multimedia service flow acceleration system based on SDN and machine learning.

Background

With the expansion of internet applications, multimedia services such as video, audio and gaming services are emerging endlessly, and these services place higher requirements on quality-of-service indexes such as time delay and bandwidth. Network congestion is therefore easily caused, which makes management difficult. The traditional network architecture adopts a best-effort forwarding strategy, and the Differentiated Services model (DiffServ) sets different code points and binds behavior sets for different network services, so that finer-grained differentiated services can be provided for multimedia service flows. However, when there are many nodes, it is difficult to deploy new functions on each routing node in time. Therefore, the traditional network framework faces the problems of difficult function deployment and simple scheduling strategies.

The invention with publication number CN113114573A provides a video stream classification and scheduling system in an SDN network, which includes a flow classification module and a path selection module. The flow classification module is used for analyzing the packet header of a data packet in the edge router, extracting the flow characteristics, training the classifier, predicting unknown data by using the classifier, forwarding the classified data packet to an output port of the edge router, and issuing a flow table containing the modified DSCP field and output port forwarding behaviors. The path selection module is used for acquiring a global network topology, monitoring state changes of the time delay, residual bandwidth and packet loss rate of each link, normalizing and scaling the link state values, calculating a list of feasible paths meeting the QoS (quality of service) constraints through the FPTAS-MMCP algorithm, sequentially distributing paths in the feasible path list according to the priorities of different video services, and issuing flow tables to the routing nodes on the paths to control data packet forwarding.

Generally speaking, in multimedia services, the priority of a video session is higher, and the priority of an elastic stream and a background stream is lower, however, in the real routing process, routing forwarding cannot be performed discretely according to the priority of the streams, and a routing path needs to be arranged by considering various factors. Therefore, a solution is needed to make a logic control policy more flexible for multimedia traffic flows and implement fast deployment and implementation of the policy.

Disclosure of Invention

The technical problem to be solved is as follows: based on the technical problems, the invention provides a multimedia service flow acceleration system based on SDN and machine learning, which can optimally deploy service flow classification and scheduling strategies, realize rapid deployment and implementation and improve the utilization rate of network resources.

The technical scheme is as follows:

a multimedia service flow acceleration system based on SDN and machine learning comprises a flow classification module and a path selection module which are deployed in an SDN control plane, wherein the SDN control plane issues a flow table to a switch network through an openflow protocol, so that an SDN data plane where the switch network is located executes data forwarding service according to the flow table;

the flow classification module trains a machine learning model according to multimedia service flow information in a network to generate a service flow classifier; the service flow classifier is used for classifying incoming network services and identifying the flow requirement corresponding to each network service, wherein the parameters of the flow requirement comprise packet loss rate, time delay and bandwidth; and the path selection module calculates a corresponding routing strategy according to the flow requirement of the network service, satisfying each parameter of the flow requirement on the basis of path reachability.

Further, the flow classification module comprises a packet information acquisition sub-module, an offline training sub-module and a flow classification sub-module;

the packet information acquisition sub-module comprises a packet information analysis component and a flow characteristic calculation component;

the packet information analysis component is used for acquiring service flow packet information from a data plane through a packet-in event processing function in an openflow protocol, wherein the service flow packet information comprises a source IP address, a destination IP address, a source port, a target port, effective length and arrival time of a data packet; the traffic characteristic calculation component calculates a plurality of characteristic vectors including packet average size, packet size variance, packet average arrival time interval, packet arrival time interval variance and packet size conversion count according to the traffic packet information, and sends the traffic packet information and the corresponding characteristic vectors to the offline training sub-module and the traffic classification sub-module;

the off-line training submodule is used for receiving different types of service traffic packet information and corresponding feature vectors sent by the packet information acquisition submodule, performing off-line learning on the service traffic packet information and the corresponding feature vectors through a machine learning algorithm, and generating a service traffic classifier according to a training result;

and the flow classification submodule loads a service flow classifier, classifies the flow according to the packet information of the network service sent by the packet information acquisition submodule, and identifies the flow type and the flow requirement of the network service.

Further, the offline training sub-module comprises a flow information collection component, a GCN classification component, a classification evaluation component and an evaluation optimization component;

the traffic information collection component is used for receiving different types of service traffic packet information and corresponding characteristic vectors sent by the packet information acquisition sub-module, arranging the service traffic packet information and the corresponding characteristic vectors into training samples, sending the training samples to a sample data set, and updating the sample data set; the GCN classification component adopts a machine learning algorithm to perform off-line learning on training samples in the sample data set, and generates a service flow classifier according to a training result; the classification evaluation component is used for evaluating the classification precision and accuracy of the generated service flow classifier, if the evaluation is qualified, the service flow classifier is output to the flow classification submodule, otherwise, the evaluation result is sent to the evaluation optimization component, and the evaluation optimization component optimizes the training samples in the sample data set according to the evaluation result of the classification evaluation component;

the evaluation process of the classification evaluation component comprises the following steps: randomly extracting 20% of data from the sample data set to perform classification test, and judging that the classifier is qualified if the classification accuracy reaches 90% or more;

the strategy of the evaluation optimization component for optimizing the training samples of the sample data set comprises: sorting the training samples by the timestamp at which they entered the sample data set, from oldest to newest; and periodically removing the oldest training samples and the training samples of any service type accounting for less than 5% of the sample data set, so that the amount of data in the sample data set is maintained within a preset range.
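The sample-set maintenance strategy above can be illustrated with a minimal Python sketch. The function name `prune_samples`, the tuple layout `(timestamp, service_type, features)` and the `max_size` parameter are assumptions introduced for illustration only; the source does not specify a data layout or the preset quantity.

```python
from collections import Counter


def prune_samples(samples, max_size, min_share=0.05):
    """Prune a sample data set (hypothetical layout, for illustration).

    samples: list of (timestamp, service_type, features) tuples.
    max_size: assumed upper bound on the number of samples to keep.
    min_share: service types below this share are removed (5% in the text).
    """
    if max_size <= 0:
        return []
    # Remove samples whose service type accounts for less than min_share.
    counts = Counter(s[1] for s in samples)
    total = len(samples)
    kept = [s for s in samples if counts[s[1]] / total >= min_share]
    # Sort from oldest to newest, then keep only the newest max_size samples.
    kept.sort(key=lambda s: s[0])
    return kept[-max_size:]


# 30 "video" samples plus one rare "game" sample (1/31 is below 5%).
samples = [(t, "video", [0.0]) for t in range(30)] + [(100, "game", [0.0])]
pruned = prune_samples(samples, max_size=10)
```

In this example the single "game" sample is dropped because its share is below 5%, and only the ten most recent "video" samples remain.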

Further, the GCN classification component performs offline learning on the training samples in the sample data set by using a machine learning algorithm, and the process of generating the traffic classifier according to the training result includes the following steps:

s1, acquiring the topology structure information of the whole network and generating a graph $G = (V, E)$, wherein $V$ is the set of nodes $v$ in the graph and $E$ is the set of edges $e$ of the graph; generating an adjacency matrix $A$ of the weighted graph, wherein in the adjacency matrix $A$ the weight between two adjacent nodes is set to 1 and all other entries are 0;

s2, generating a node degree matrix $D$ using the adjacency matrix $A$:

$$D_{ii} = \sum_{j} A_{ij}$$

the degree matrix $D$ is a diagonal matrix;

s3, taking the plurality of features output by the flow characteristic calculation component as the five-dimensional feature vector of each node, and constructing a feature matrix $X$:

$$X = \begin{bmatrix} x_{11} & \cdots & x_{1n} \\ \vdots & \ddots & \vdots \\ x_{m1} & \cdots & x_{mn} \end{bmatrix}$$

wherein m represents the number of nodes in the network and n represents the dimension of the feature;

s4, constructing a flow calculation model based on a GCN algorithm, wherein the input of the flow calculation model is an adjacency matrix A, a degree matrix D and a feature matrix X, the output of the flow calculation model is a flow demand feature matrix, and the flow demand feature matrix comprises full graph node information, link connection state information and data packet information;

the flow calculation model is composed of $K$ graph convolution layers; for the k-th graph convolution layer, $H^{(k-1)}$ represents the input of the k-th layer and $H^{(k)}$ represents the output node representation of the k-th layer; the feature matrix $X = H^{(0)}$ is the input to the first graph convolution layer; in the feature propagation process of the k-th GCN layer, the hidden feature representation $h_i^{(k)}$ of node $v_i$ is the average value of its local neighbors, and the update rule is as follows:

$$h_i^{(k)} \leftarrow \frac{1}{d_i + 1}\, h_i^{(k-1)} + \sum_{j=1}^{n} \frac{a_{ij}}{\sqrt{(d_i + 1)(d_j + 1)}}\, h_j^{(k-1)}$$

in the formula, [two auxiliary formulas not reproduced in the source text];

wherein $d_i$ and $d_j$ represent the degree of node i and the degree of node j respectively, $a_{ij}$ represents the value in row i and column j of the adjacency matrix A, and n represents the number of nodes;

s5, constructing three types of feature matrices according to the parameter value range of the flow demand: a first traffic information feature matrix T for limiting packet loss rate, a second traffic information feature matrix D for limiting time delay and a third traffic information feature matrix B for limiting bandwidth;

s6, respectively calculating the similarity between the flow demand characteristic matrix output by the flow calculation model and the three types of flow information characteristic matrices, and setting similarity weights for the flow demand characteristic matrix according to the similarity, such that:

[formula not reproduced in the source text]

in the formula, ≈ denotes approximation in the sense of the defined similarity; the process of calculating the similarity between the flow demand characteristic matrix and the three types of flow information characteristic matrices is as follows: using the 1-norm method, respectively solving the Manhattan distances between the flow demand characteristic matrix and the first flow information characteristic matrix T, the second flow information characteristic matrix D and the third flow information characteristic matrix B, and normalizing the three distances; the similarity weights are then obtained from the three normalized distances:

[formulas not reproduced in the source text].
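The distance computation in step s6 can be sketched in Python as follows. The Manhattan (1-norm) distance and the normalization follow the text; the final mapping from normalized distance to weight is NOT given in the extracted text, so the rule used here (a smaller distance yields a larger weight, with the weights summing to 1) is purely an assumption for illustration, as are the function names.

```python
def manhattan(P, Q):
    """1-norm (Manhattan) distance between two equally shaped matrices."""
    return sum(abs(p - q) for row_p, row_q in zip(P, Q)
               for p, q in zip(row_p, row_q))


def similarity_weights(H, T, D, B):
    """Return one weight per reference matrix (T, D, B).

    ASSUMPTION: weight = (1 - normalized distance) / 2, so that the three
    weights sum to 1 and a closer reference matrix gets a larger weight.
    The source formulas for the weights are not available.
    """
    dists = [manhattan(H, M) for M in (T, D, B)]
    total = sum(dists)
    if total == 0:
        return (1 / 3, 1 / 3, 1 / 3)
    norm = [d / total for d in dists]  # normalized distances, sum to 1
    return tuple((1 - d) / 2 for d in norm)


# Toy 1x2 matrices: H is identical to T, close to D, far from B.
H = [[1.0, 1.0]]
T = [[1.0, 1.0]]
D = [[2.0, 1.0]]
B = [[4.0, 1.0]]
weights = similarity_weights(H, T, D, B)
```

With these toy matrices the distances are 0, 1 and 3, so the flow is treated as most sensitive to packet loss (matrix T) and least sensitive to bandwidth (matrix B) under the assumed mapping.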

Further, the number of graph convolution layers $K$ is 3.
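As a hedged illustration of the feature propagation in step s4 with three layers, the sketch below assumes the standard self-loop-normalized GCN propagation, in which each node keeps a share 1/(d_i + 1) of its own representation and adds a_ij / sqrt((d_i + 1)(d_j + 1)) of each neighbor's representation. Learned weight matrices and non-linearities are deliberately omitted, and the function name `propagate` is hypothetical.

```python
import math


def propagate(A, X, num_layers=3):
    """Propagate node features over the graph for num_layers layers.

    A: adjacency matrix as a list of lists (0/1 entries).
    X: feature matrix, one row of features per node.
    ASSUMPTION: pure feature propagation, no trainable parameters.
    """
    n = len(A)
    deg = [sum(row) for row in A]
    H = [row[:] for row in X]
    for _ in range(num_layers):
        new_H = []
        for i in range(n):
            # Self term: h_i / (d_i + 1)
            row = [h / (deg[i] + 1) for h in H[i]]
            for j in range(n):
                if A[i][j]:
                    w = A[i][j] / math.sqrt((deg[i] + 1) * (deg[j] + 1))
                    row = [r + w * h for r, h in zip(row, H[j])]
            new_H.append(row)
        H = new_H
    return H


# Two switches joined by one link; only the first carries a non-zero feature.
A = [[0, 1], [1, 0]]
X = [[2.0], [0.0]]
H1 = propagate(A, X, num_layers=1)
H3 = propagate(A, X, num_layers=3)
```

After a single layer both nodes hold the neighborhood average, which illustrates how the representation of a node comes to reflect its position in the topology as well as its own traffic features.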

Further, the path selection module comprises a global view acquisition sub-module, a weight processing sub-module and a path calculation sub-module;

the global view acquisition submodule is used for acquiring current network link time delay, global topology information, switch port data rate, maximum data rate and configuration information through an LLDP data packet, an Echo message request and a switch port and flow table statistical information query request, and calculating to obtain the state information of the link of the whole network including the time delay of the whole network, the residual bandwidth and the packet loss rate;

the weight processing submodule is used for carrying out normalization and scaling on the link state information of the whole network;

and the path calculation submodule is used for carrying out route selection by utilizing a routing algorithm according to the link state information and the flow demand corresponding to the network service.

Further, the path computation submodule comprises a reachable path computation component, a path state acquisition component and an optimal path selection component;

the reachable path calculation component obtains all reachable paths corresponding to the imported network service through DFS algorithm query;

the path state acquisition component is used for acquiring the time delay d, the available bandwidth b and the packet loss rate t of all reachable paths;

the optimal path selection component weights the time delay, the available bandwidth and the packet loss rate of each reachable path by the similarity weights to obtain a corresponding evaluation index $y$, and selects the reachable path with the minimum evaluation index value as the optimal path:

[formula not reproduced in the source text].
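The reachable path calculation and path state acquisition described above can be sketched as follows. The iterative depth-first search follows the text; how link metrics are combined into path metrics is not stated in the source, so the sketch assumes additive delay, bottleneck (minimum) bandwidth and independent per-link loss. The names `all_simple_paths`, `path_state`, `adj` and `links` are hypothetical.

```python
def all_simple_paths(adj, src, dst):
    """Enumerate all loop-free paths from src to dst with an iterative DFS."""
    paths, stack = [], [(src, [src])]
    while stack:
        node, path = stack.pop()
        if node == dst:
            paths.append(path)
            continue
        for nxt in adj.get(node, []):
            if nxt not in path:  # avoid revisiting a switch
                stack.append((nxt, path + [nxt]))
    return paths


def path_state(path, links):
    """Return (delay d, available bandwidth b, loss rate t) of a path.

    links maps (u, v) to (delay, bandwidth, loss_rate) of that link.
    ASSUMPTION: delay adds up, bandwidth is the bottleneck, losses are
    independent across links.
    """
    delay, bandwidth, delivered = 0.0, float("inf"), 1.0
    for u, v in zip(path, path[1:]):
        d, b, t = links[(u, v)]
        delay += d
        bandwidth = min(bandwidth, b)
        delivered *= (1.0 - t)
    return delay, bandwidth, 1.0 - delivered


adj = {"s1": ["s2", "s3"], "s2": ["s4"], "s3": ["s4"], "s4": []}
links = {
    ("s1", "s2"): (2.0, 100.0, 0.1),
    ("s2", "s4"): (3.0, 50.0, 0.1),
    ("s1", "s3"): (1.0, 20.0, 0.0),
    ("s3", "s4"): (1.0, 20.0, 0.0),
}
paths = all_simple_paths(adj, "s1", "s4")
d, b, t = path_state(["s1", "s2", "s4"], links)
```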

Advantageous effects:

The multimedia service flow acceleration system based on SDN and machine learning combines the flexibility of SDN with the intelligence of machine learning: a machine learning model in the control domain is trained on complex multimedia network service flows to generate a classification model, and, using this classification model, a routing strategy is quickly calculated for the current network state and issued to the data plane through the OpenFlow protocol. In the flow classification process, the invention considers not only the priority of the traffic service but also the position of the traffic publisher in the network, and classifies the traffic in a continuous manner, thereby optimally deploying service flow classification and scheduling strategies, realizing rapid deployment and implementation, and improving the utilization rate of network resources.

Drawings

Fig. 1 is a schematic structural diagram of a multimedia service traffic acceleration system based on SDN and machine learning;

Fig. 2 is a schematic diagram of a traffic classification module;

Fig. 3 is a schematic structural diagram of a path selection module.

Detailed Description

The following examples are presented to enable one of ordinary skill in the art to more fully understand the present invention and are not intended to limit the invention in any way.

The embodiment provides a multimedia service traffic acceleration system based on SDN (Software Defined Network) and machine learning, which can optimally deploy service flow classification and scheduling policies, implement rapid deployment and implementation, and improve network resource utilization.

The multimedia service flow acceleration system is applied to the SDN environment. As shown in fig. 1, the multimedia service traffic acceleration system includes a traffic classification module and a path selection module deployed in an SDN control plane, where the SDN control plane (SDN controller) issues a flow table to a switch network through an openflow protocol, so that an SDN data plane where the switch network is located executes a data forwarding service according to the flow table.

The flow classification module trains a machine learning model according to multimedia service flow information in a network to generate a service flow classifier; when a service is generated in the network and needs to be classified, the service flow classifier classifies the incoming network service and identifies the flow requirement corresponding to the network service, wherein parameters of the flow requirement comprise packet loss rate, time delay and bandwidth. The path selection module calculates a corresponding routing strategy according to the flow requirement of the network service, satisfying each parameter of the flow requirement on the basis of path reachability.

(I) Flow classification module

As shown in fig. 2, the traffic classification module includes a packet information acquisition sub-module, an offline training sub-module, and a traffic classification sub-module. The packet information acquisition submodule is used for acquiring packet information of unknown flow through an openflow protocol, extracting packet information characteristics and sending the packet information characteristics to the offline training submodule and the flow classification submodule;

the offline training sub-module is used for receiving the packet information characteristics sent by the packet information acquisition sub-module and performing offline learning on them through a machine learning algorithm. A training set and a test set are supplied to the sub-module initially or input manually; as the module is deployed in a network, it can update the training set and the test set with new network information, so that the classifier can be extended in real time as the network changes. Meanwhile, the offline training sub-module also generates a service flow classifier according to the training result and loads it into the flow classification sub-module.

And the flow classification submodule loads a flow classifier, classifies the flow according to the packet information sent by the packet information acquisition submodule, and identifies the flow type and the flow requirement.

The flow classification module comprises a packet information acquisition submodule, an off-line training submodule and a flow classification submodule.

The packet information acquisition submodule comprises a packet information analysis component and a flow characteristic calculation component. The packet information analysis component is used for acquiring service traffic packet information from the data plane through a packet-in event processing function in the OpenFlow protocol, wherein the service traffic packet information comprises src (source IP address), dst (destination IP address), src_port (source port), dst_port (target port), and the effective length and arrival time of a data packet acquired through a len() function and a time() function. The traffic characteristic calculation component calculates a plurality of features including len_mean (packet average size), len_std (packet size variance), time_mean (packet average arrival time interval), time_std (packet arrival time interval variance) and count (packet size conversion count) according to the traffic packet information, and sends the traffic packet information and the corresponding feature vectors to the offline training submodule and the traffic classification submodule.
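The feature calculation above can be illustrated with a minimal Python sketch. The feature names follow the text; two points are assumptions, since the source does not define them precisely: `len_std` and `time_std` are computed as population standard deviations (the names suggest a standard deviation while the text says variance), and `count` is taken to be the number of times the packet size changes between consecutive packets. The function name `flow_features` is hypothetical.

```python
from statistics import mean, pstdev


def flow_features(lengths, arrival_times):
    """Compute the five per-flow features from packet sizes and timestamps."""
    if len(lengths) < 2 or len(lengths) != len(arrival_times):
        raise ValueError("need at least two packets with matching timestamps")
    # Inter-arrival gaps between consecutive packets.
    gaps = [b - a for a, b in zip(arrival_times, arrival_times[1:])]
    # ASSUMPTION: "packet size conversion count" = number of size changes.
    changes = sum(1 for a, b in zip(lengths, lengths[1:]) if a != b)
    return {
        "len_mean": mean(lengths),
        "len_std": pstdev(lengths),
        "time_mean": mean(gaps),
        "time_std": pstdev(gaps),
        "count": changes,
    }


feats = flow_features([100, 100, 300, 300], [0.0, 1.0, 2.0, 4.0])
```

The resulting dictionary corresponds to the five-dimensional feature vector that is later attached to each node of the graph.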

The offline training sub-module is used for receiving different types of service flow packet information and corresponding feature vectors sent by the packet information acquisition sub-module, performing offline learning on the service flow packet information and the corresponding feature vectors through a machine learning algorithm, and generating a service flow classifier according to the training result. In the initial training stage, a training set and a test set can be input manually; as the offline training sub-module is deployed in the network, it can update the training set and the test set with new network information, so that the classifier can be extended in real time as the network changes.

Illustratively, the offline training sub-module includes a traffic information collection component, a GCN classification component, a classification evaluation component, and an evaluation optimization component.

The traffic information collection component is used for receiving different types of service traffic packet information and corresponding characteristic vectors sent by the packet information acquisition sub-module, arranging the service traffic packet information and the corresponding characteristic vectors into training samples, sending the training samples to a sample data set and updating the sample data set; the GCN classification component adopts a machine learning algorithm to perform off-line learning on training samples in the sample data set, and generates a service flow classifier according to a training result; and the classification evaluation component is used for evaluating the classification precision and accuracy of the generated service flow classifier, outputting the service flow classifier to the flow classification submodule if the evaluation is qualified, and otherwise, sending the evaluation result to the evaluation optimization component so that the evaluation optimization component optimizes the training samples in the sample data set according to the evaluation result of the classification evaluation component.

The evaluation process of the classification evaluation component is as follows: 20% of the data is randomly extracted from the sample data set for a classification test, and the classifier is judged to be qualified if the classification accuracy reaches 90% or more. The evaluation optimization component optimizes the training samples of the sample data set according to the following schemes: 1. keeping the amount of data in the sample data set stable, that is, promptly clearing out old data; 2. removing the data of any service type whose proportion in the sample data set is lower than 5%.
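The qualification check described above can be sketched as follows. The 20% test share and 90% accuracy threshold come from the text; the function name `is_qualified`, the representation of the classifier as a plain callable and the `(features, label)` sample layout are assumptions for illustration.

```python
import random


def is_qualified(classifier, samples, test_share=0.2, threshold=0.9, seed=None):
    """Test a classifier on a random share of the sample data set.

    classifier: callable mapping a feature vector to a predicted label.
    samples: list of (features, label) pairs (hypothetical layout).
    """
    if not samples:
        raise ValueError("sample data set is empty")
    rng = random.Random(seed)
    k = max(1, int(len(samples) * test_share))
    test = rng.sample(samples, k)  # random 20% without replacement
    correct = sum(1 for features, label in test if classifier(features) == label)
    return correct / k >= threshold


samples = [([i], "video" if i % 2 == 0 else "audio") for i in range(50)]


def perfect(features):
    return "video" if features[0] % 2 == 0 else "audio"


def always_wrong(features):
    return "unknown"


ok = is_qualified(perfect, samples, seed=0)
bad = is_qualified(always_wrong, samples, seed=0)
```

A classifier that fails this check would not be loaded into the flow classification sub-module; instead the result would be passed to the evaluation optimization component.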

Illustratively, traffic classification using the GCN algorithm is described in detail below. Generally speaking, in multimedia services, the priority of a video session is higher, and the priority of an elastic stream and a background stream is lower; however, in the real routing process, routing forwarding cannot be performed discretely according to the priority of the streams alone, and routing paths need to be arranged by considering various factors. The process of traffic classification is therefore important: it takes into account not only the priority of the traffic but also the location of the traffic publisher in the network, classifying the traffic in a continuous manner.

The specific training process, in which the traffic classifier is generated according to the training result, comprises the following steps:

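S1, acquiring the topology structure information of the whole network, and generating a graph $G = (V, E)$, wherein $V$ is the set of nodes $v$ in the graph, and $E$ is the set of edges $e$ of the graph; an adjacency matrix $A$ of the weighted graph is generated, in which the weight between two adjacent nodes is set to 1, and all other entries are 0.

S2, generating a node degree matrix $D$ using the adjacency matrix $A$:

$$D_{ii} = \sum_{j} A_{ij}$$

The degree matrix $D$ is a diagonal matrix.

S3, taking the plurality of features output by the flow characteristic calculation component as the five-dimensional feature vector of each node, and constructing a feature matrix $X$:

$$X = \begin{bmatrix} x_{11} & \cdots & x_{1n} \\ \vdots & \ddots & \vdots \\ x_{m1} & \cdots & x_{mn} \end{bmatrix}$$

where m represents the number of nodes in the network and n represents the dimension of the feature.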

S4, constructing a flow calculation model based on the GCN algorithm, wherein the input of the flow calculation model is an adjacency matrix A, a degree matrix D and a characteristic matrix X, the output of the flow calculation model is a flow demand characteristic matrix, and the flow demand characteristic matrix comprises full graph node information, link connection state information and data packet information.

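The flow calculation model is composed of $K$ graph convolution layers. For the k-th graph convolution layer, $H^{(k-1)}$ represents the input of the k-th layer and $H^{(k)}$ represents the output node representation of the k-th layer. The feature matrix $X = H^{(0)}$ is the input to the first graph convolution layer. In the feature propagation process of the k-th GCN layer, the hidden feature representation $h_i^{(k)}$ of node $v_i$ is the average value of its local neighbors, and the update rule is as follows:

$$h_i^{(k)} \leftarrow \frac{1}{d_i + 1}\, h_i^{(k-1)} + \sum_{j=1}^{n} \frac{a_{ij}}{\sqrt{(d_i + 1)(d_j + 1)}}\, h_j^{(k-1)}$$

In the formula, [two auxiliary formulas not reproduced in the source text]; $d_i$ and $d_j$ represent the degree of node i and the degree of node j respectively, $a_{ij}$ represents the value in row i and column j of the adjacency matrix A, and n represents the number of nodes. Preferably, $K$ is 3.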

S5, constructing three types of feature matrices according to the parameter value range of the flow demand: a first traffic information feature matrix T for limiting packet loss rate, a second traffic information feature matrix D for limiting time delay and a third traffic information feature matrix B for limiting bandwidth.

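S6, respectively calculating the similarity between the flow demand characteristic matrix output by the flow calculation model and the three types of flow information characteristic matrices, and setting similarity weights for the flow demand characteristic matrix according to the similarity, such that:

[formula not reproduced in the source text]

In the formula, ≈ denotes approximation in the sense of the defined similarity. The process of calculating the similarity between the flow demand characteristic matrix and the three types of flow information characteristic matrices is as follows: using the 1-norm method, the Manhattan distances between the flow demand characteristic matrix and the first flow information characteristic matrix T, the second flow information characteristic matrix D and the third flow information characteristic matrix B are respectively solved, and the three distances are then normalized; the similarity weights are obtained from the three normalized distances:

[formulas not reproduced in the source text].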
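Returning to steps S1 to S3, the construction of the adjacency matrix, the degree matrix and the feature matrix can be sketched in plain Python. The function name `build_graph_matrices`, the integer node indexing and the example topology are assumptions for illustration; links are treated as undirected.

```python
def build_graph_matrices(num_nodes, edges, node_features):
    """Build A (adjacency), D (degree, diagonal) and X (features).

    edges: iterable of (i, j) node index pairs, treated as undirected links.
    node_features: one feature vector per node (five-dimensional in the text).
    """
    A = [[0] * num_nodes for _ in range(num_nodes)]
    for i, j in edges:
        A[i][j] = 1  # adjacent nodes get weight 1, everything else stays 0
        A[j][i] = 1
    # Degree matrix: row sums of A on the diagonal, zeros elsewhere.
    D = [[sum(A[i]) if i == j else 0 for j in range(num_nodes)]
         for i in range(num_nodes)]
    X = [[float(x) for x in row] for row in node_features]
    return A, D, X


# Three switches in a line: 0 - 1 - 2, each with a 5-dimensional feature vector.
features = [
    [200, 100, 1.3, 0.5, 1],
    [150, 80, 0.9, 0.2, 3],
    [60, 10, 0.1, 0.05, 0],
]
A, D, X = build_graph_matrices(3, [(0, 1), (1, 2)], features)
```

Here m = 3 nodes and n = 5 features, so X has shape 3 x 5 and the middle switch has degree 2.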

(II) Path selection module

As shown in fig. 3, the path selection module is configured to select an optimal routing policy based on a current network state according to the traffic type and the traffic demand sent by the traffic classification module. The path selection module comprises a global view acquisition sub-module, a weight processing sub-module and a path calculation sub-module.

The global view obtaining sub-module is configured to obtain current network Link delay, global topology information, switch port data rate, maximum data rate, configuration information, and the like through an LLDP (Link Layer Discovery Protocol) data packet, an Echo message request, and a switch port and flow table statistical information query request, so as to obtain information such as a full network delay, a remaining bandwidth, and a packet loss rate.

And the weight processing submodule is used for normalizing and scaling the link state information of the whole network.

And the path calculation submodule is used for selecting a route by using a routing algorithm according to the link state information and the service requirement.
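The normalization and scaling performed by the weight processing sub-module is not specified further in the source. As one possible reading, the sketch below applies min-max normalization to a link-state metric and then scales it by a constant factor; the function name `normalize`, the `scale` parameter and the choice of min-max normalization are all assumptions.

```python
def normalize(values, scale=1.0):
    """Min-max normalize a {link: value} mapping, then multiply by scale.

    ASSUMPTION: min-max normalization; the source only says that link state
    information is "normalized and scaled".
    """
    lo, hi = min(values.values()), max(values.values())
    if hi == lo:
        # All links are equal on this metric; map everything to 0.
        return {link: 0.0 for link in values}
    return {link: scale * (v - lo) / (hi - lo) for link, v in values.items()}


delays = {("s1", "s2"): 2.0, ("s2", "s3"): 6.0, ("s1", "s3"): 4.0}
scaled = normalize(delays, scale=10)
```

Bringing delay, residual bandwidth and packet loss rate onto a common range in this way makes them comparable when they are later combined into a single evaluation index.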

Illustratively, the specific workflow of the path selection module includes:

Step 1: the global view acquisition sub-module acquires global network information through the OpenFlow message mechanism. Specifically: by adding timestamp information to the Echo message, the round-trip time of data between the controller and a switch can be acquired, which facilitates the subsequent calculation of the link delay. In addition, the OFPFlowStatsRequest message can be used to obtain statistics such as the packet count, bit count and flow table lifetime for entries meeting the conditions of the flow table match field. The EventOFPPortStatsReply message is used for acquiring information such as the number of bits received/sent by a switch port and the survival time of the port. The EventOFPPortDescStatsReply message is used for acquiring the hardware attributes of the port, such as its current data rate, maximum data rate and configuration information. Meanwhile, the LLDP data packet encapsulating timestamp information is issued to each switch through an OFPActionOutput message. The LLDP packet is forwarded to the adjacent switch through the port and sent back to the controller through a Packet-In message, so that the adjacency matrix of the switches can be obtained by statistics; the global topology is maintained and updated by the messages triggered by data-layer events, such as a switch joining or leaving, a port being added, modified or deleted, and a link being added or deleted. Based on this basic information, the packet loss rate, the residual bandwidth and the link delay of every link in the network can be calculated.

Step 2: the weight processing sub-module performs normalization and scaling on the full-network state information obtained by the global view acquisition sub-module.

Step 3: the path calculation sub-module selects a path by using a routing algorithm according to the network state information and the service requirement.
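The source states that packet loss rate, residual bandwidth and link delay are calculated from the collected statistics but does not give the formulas. The sketch below uses formulas that are common in SDN monitoring and are assumptions here: the link delay is half of the two LLDP traversal times minus the two controller-switch Echo round-trip times, the used bandwidth is derived from the byte-counter difference between two polls, and the loss rate compares packets sent on one end of a link with packets received on the other. All function names are hypothetical.

```python
def link_delay(lldp_fwd, lldp_rev, echo_a, echo_b):
    """Estimate one-way delay of link A-B in seconds.

    lldp_fwd / lldp_rev: controller -> A -> B -> controller time and the
    reverse direction; echo_a / echo_b: controller <-> switch round trips.
    """
    return max(0.0, (lldp_fwd + lldp_rev - echo_a - echo_b) / 2.0)


def port_rate_bps(bytes_prev, bytes_now, interval_s):
    """Average port throughput in bit/s between two port-statistics polls."""
    return (bytes_now - bytes_prev) * 8 / interval_s


def residual_bandwidth(capacity_bps, used_bps):
    """Remaining bandwidth of a link, never negative."""
    return max(0.0, capacity_bps - used_bps)


def loss_rate(tx_packets, rx_packets):
    """Share of packets sent on one end that did not arrive on the other."""
    if tx_packets == 0:
        return 0.0
    return max(0.0, (tx_packets - rx_packets) / tx_packets)


delay = link_delay(0.012, 0.014, 0.004, 0.006)
used = port_rate_bps(1_000_000, 2_250_000, 10)
free = residual_bandwidth(10_000_000, used)
loss = loss_rate(1000, 990)
```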

Illustratively, the specific process of path selection includes:

For a given service flow, all reachable paths are found through the DFS algorithm; the time delay d, available bandwidth b and packet loss rate t of all paths are acquired; and the time delay, available bandwidth and packet loss rate of each path are combined according to the similarity weights into an evaluation index y, wherein the formula is as follows:

[formula not reproduced in the source text]

Finally, the path with the minimum y value is selected as the optimal path.
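Because the formula for y is not reproduced in the extracted text, the following sketch only illustrates the selection principle under an assumed form: y is a weighted sum of the normalized packet loss rate, the normalized delay and the normalized bandwidth shortfall (1 - b), so that a smaller y is better on every term. The weight order (loss, delay, bandwidth), the example weight values and the function names are assumptions.

```python
def evaluation_index(delay, bandwidth, loss, weights):
    """Assumed evaluation index y for one path (all metrics in [0, 1]).

    ASSUMPTION: y = w_loss * t + w_delay * d + w_bw * (1 - b); the source
    formula is not available.
    """
    w_loss, w_delay, w_bw = weights
    return w_loss * loss + w_delay * delay + w_bw * (1.0 - bandwidth)


def best_path(candidates, weights):
    """Pick the path with the minimum evaluation index.

    candidates: list of (path, (delay, bandwidth, loss)) pairs.
    """
    return min(candidates, key=lambda c: evaluation_index(*c[1], weights))[0]


weights = (0.5, 0.375, 0.125)  # hypothetical (loss, delay, bandwidth) weights
candidates = [
    (["s1", "s2", "s4"], (0.2, 0.9, 0.1)),
    (["s1", "s3", "s4"], (0.6, 0.5, 0.0)),
]
chosen = best_path(candidates, weights)
y_first = evaluation_index(0.2, 0.9, 0.1, weights)
```

With these illustrative numbers the first path wins despite its small loss rate, because its lower delay and higher available bandwidth outweigh it.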

Claims

1. The multimedia service flow acceleration system is characterized by comprising a flow classification module and a path selection module which are deployed in an SDN control plane, wherein the SDN control plane issues a flow table to a switch network through an openflow protocol, so that an SDN data plane where the switch network is located executes data forwarding service according to the flow table;

2. The SDN and machine learning based multimedia service traffic acceleration system of claim 1, wherein the traffic classification module comprises a packet information acquisition sub-module, an offline training sub-module, and a traffic classification sub-module;

the packet information analysis component is used for acquiring service flow packet information from a data plane through a packet-in event processing function in an openflow protocol, wherein the service flow packet information comprises a source IP address, a destination IP address, a source port, a target port, effective length and arrival time of a data packet; the traffic characteristic calculation component calculates a plurality of characteristic vectors including packet average size, packet size variance, packet average arrival time interval, packet arrival time interval variance and packet size conversion count according to the traffic packet information, and sends the traffic packet information and the corresponding characteristic vectors to an offline training submodule and a traffic classification submodule;

3. The SDN and machine learning based multimedia service traffic acceleration system of claim 2, wherein the offline training sub-module comprises a traffic information collection component, a GCN classification component, a classification evaluation component, and an evaluation optimization component;

4. The SDN and machine learning based multimedia service traffic acceleration system of claim 3, wherein the GCN classification component employs a machine learning algorithm to perform offline learning on training samples in the sample data set, and the process of generating the service traffic classifier according to the training result comprises the following steps:

s1, acquiring the topology structure information of the whole network, and generating a graph $G = (V, E)$, wherein $V$ is the set of nodes $v$ in the graph, and $E$ is the set of edges $e$ of the graph; generating an adjacency matrix $A$ of the weighted graph, wherein in the adjacency matrix $A$, the weight between two adjacent nodes is set to be 1, and the rest are 0;

s2, generating a node degree matrix $D$ using the adjacency matrix $A$:

$$D_{ii} = \sum_{j} A_{ij}$$

the degree matrix $D$ is a diagonal matrix;

the flow calculation model is composed of $K$ graph convolution layers; for the k-th graph convolution layer, $H^{(k-1)}$ represents the input to the k-th layer and $H^{(k)}$ represents the output node representation of the k-th layer; the feature matrix $X = H^{(0)}$ is the input to the first graph convolution layer; in the feature propagation process of the k-th GCN layer, the hidden feature representation $h_i^{(k)}$ of node $v_i$ is the average value of its local neighbors, and the update rule is as follows:

$$h_i^{(k)} \leftarrow \frac{1}{d_i + 1}\, h_i^{(k-1)} + \sum_{j=1}^{n} \frac{a_{ij}}{\sqrt{(d_i + 1)(d_j + 1)}}\, h_j^{(k-1)}$$

in the formula, [two auxiliary formulas not reproduced in the source text];

wherein $d_i$ and $d_j$ represent the degree of node i and the degree of node j respectively;

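setting similarity weights such that:

[formula not reproduced in the source text]

the similarity weights being obtained from the three normalized distances:

[formulas not reproduced in the source text].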

5. The SDN and machine learning based multimedia service traffic acceleration system of claim 4, wherein the number of graph convolution layers $K$ is 3.

6. The SDN and machine learning based multimedia service traffic acceleration system of claim 4, wherein the path selection module comprises a global view acquisition sub-module, a weight processing sub-module, and a path computation sub-module;

7. The SDN and machine learning based multimedia service traffic acceleration system of claim 6, wherein the path computation sub-module comprises a reachable path computation component, a path state acquisition component, and an optimal path selection component;

the optimal path selection component weights the time delay, the available bandwidth and the packet loss rate of each reachable path by the similarity weights to obtain a corresponding evaluation index $y$, and selects the reachable path with the minimum evaluation index value as the optimal path:

[formula not reproduced in the source text].