CN115174961A - Multi-platform video flow early identification method facing high-speed network

Info

Publication number: CN115174961A
Application number: CN202210796253.9A
Authority: CN
Inventors: 吴桦; 乐鑫; 程光
Original assignee: Southeast University
Current assignee: Southeast University
Priority date: 2022-07-07
Filing date: 2022-07-07
Publication date: 2022-10-11

Abstract

The invention discloses a high-speed-network-oriented method for early identification of multi-platform video traffic. First, video traffic from different platforms is collected, and video and non-video streams are labeled according to the handshake or request information of each stream. Then, a feature space for classifying video and non-video traffic is constructed based on a protocol-independent principle, and a data set is built by extracting feature vectors from the labeled traffic. Finally, a classification model is trained offline on the data set containing video and non-video traffic using a supervised machine learning approach. Combined with the proposed feature space, the classification model can accurately identify video traffic in a high-speed network even when only sampled data can be collected. The proposed feature space yields stable feature vectors from a small number of packets per stream, so video traffic can be identified at an early stage of stream transmission. The invention enables real-time identification of video traffic in massive high-speed traffic with limited memory and in reasonable time, and can be used for network traffic analysis and network management.

Description

Multi-platform video flow early identification method facing high-speed network

Technical Field

The invention relates to a high-speed network-oriented multi-platform video traffic early identification method, and belongs to the technical field of network security.

Background

With the development of the Internet, video traffic increasingly dominates global network traffic. By 2022, IP video traffic is projected to account for 82% of all IP traffic (business and consumer combined), up from 75% in 2017, a compound annual growth rate of 33%. Identifying video traffic in high-speed networks in a timely manner helps manage and allocate network resources, so traffic identification methods have long been a major concern for Internet Service Providers (ISPs).

However, as user demand for video streaming services grows, a large number of video platforms using different transport protocols have appeared, which makes video traffic harder to identify. In addition, because of the high network bandwidth, an ISP with limited resources can only obtain sampled data of the video traffic at the traffic collection node, which places new requirements on video traffic identification methods.

Researchers have proposed a series of video traffic identification methods, of which threshold-based and machine learning-based methods are widely used, but these methods still have some limitations.

(1) Identification method based on threshold value

The threshold-based method records some statistics of a stream, compares them with a set threshold, and judges whether the stream is a video stream according to whether the statistics exceed the threshold. Although this method can identify video traffic quickly and accurately, the threshold setting depends strongly on the protocol, so only the video traffic of certain specific applications can be identified. Given the diversity of video transmission protocols, the method cannot identify video traffic from all platforms on a high-speed network.

(2) Machine learning-based identification method

Machine learning-based video traffic identification builds a traffic classification model by extracting effective features from the content and patterns of traffic, and the identification performance of the model depends on how the feature space is constructed. Existing feature space construction methods fall mainly into two types. The first characterizes the transmission pattern of the video stream (such as timing characteristics) from the full traffic; however, it needs a long time to extract the features of a complete long stream and cannot identify video traffic in a high-speed network within a reasonable time. The second extracts features from critical packets of the stream (e.g., packets in the handshake phase), which reduces the time required for feature extraction and speeds up identification. The effectiveness of such methods depends on whether the critical packets can be captured at an early stage of stream set-up. In a high-speed network, however, limited resources mean that the ISP cannot obtain all the critical packets in the sampled traffic, so such methods perform poorly in a sampled high-speed network environment. In summary, none of the existing machine learning-based methods is suitable for identifying video traffic in a high-speed network.

Disclosure of Invention

The invention discloses a high-speed-network-oriented method for early identification of multi-platform video traffic, aiming to identify video traffic in a high-speed network with limited memory and in reasonable time. Specifically, the method first collects video traffic from different platforms, and then labels video streams and non-video streams according to the handshake or request information of the unknown streams. Then, a feature space for classifying video and non-video streams is constructed based on a protocol-independent principle, and a data set is built by extracting feature vectors from the labeled traffic. Finally, a supervised machine learning approach is used to train the classification model offline on the obtained data set. Combined with the proposed feature space, the classification model can accurately identify video traffic in a high-speed network even when only sampled data can be collected. The proposed feature space yields stable feature vectors from a small number of packets per stream, so that video traffic can be identified at an early stage of stream transmission.

To achieve the purpose of the invention, the specific technical steps of the scheme are as follows:

step (1) collecting video playback traffic of different platforms through data collection equipment;

step (2) preprocessing the collected traffic and labeling video and non-video streams;

step (3) extracting features from the traffic labeled in step (2), constructing a rule-based feature space, and then obtaining a labeled sample set;

step (4) taking the sample set obtained in step (3) as a training set, and then training with a supervised machine learning method to obtain a classification model capable of distinguishing video streams from non-video streams;

step (5) setting a sampling ratio, performing systematic packet sampling on the traffic in the high-speed network, then reassembling the sampled packets into flows and extracting features;

step (6) applying the classification model obtained in step (4) to predict unknown streams and identify video traffic.

Further, in step (1), collecting the video traffic specifically includes the following sub-steps:

(1.1) capturing traffic on a laboratory host and on an Android device respectively. On the host, traffic is captured directly with Wireshark; the Android device is connected to a hotspot on the host, and the traffic of its video playback process is captured with Wireshark. While video traffic is being captured, network access is disabled for other applications;

(1.2) selecting popular video websites at home and abroad, playing videos and capturing traffic according to the following strategy: the maximum capture time for each video is set to 5 minutes, after which the capture ends and is saved as a pcap file;

(1.3) writing an automation script to implement step (1.2) and capture video traffic in batches.

Further, in the step (2), the preprocessing and marking of the flow specifically includes the following sub-steps:

(2.1) recombining the data packets into bidirectional flows according to five-tuple (source IP, source port, destination IP, destination port and transport layer protocol) for the video flows of different platforms obtained in the step (1), and discarding the flows with the packet quantity less than N;

(2.2) judging a transmission protocol adopted by the bidirectional stream, and if the bidirectional stream is an unencrypted video stream, performing (2.3); otherwise, performing (2.4);

(2.3) extracting URL request information containing the transmitted file type from the bidirectional stream, judging whether the stream is a video stream according to the file type keyword, and marking;

(2.4) extracting an SNI field containing domain name information from handshake information in the bidirectional stream, judging whether the stream is a video stream according to keywords contained in the SNI, and marking;
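Sub-step (2.1) can be illustrated with a minimal Python sketch. The packet layout (a dict with `src_ip`, `src_port`, `dst_ip`, `dst_port`, `proto`) and the function names are assumptions made for illustration; the source does not specify a data structure.

```python
from collections import defaultdict


def flow_key(src_ip, src_port, dst_ip, dst_port, proto):
    """Direction-independent key: both directions of a connection map to one key."""
    a, b = (src_ip, src_port), (dst_ip, dst_port)
    return (min(a, b), max(a, b), proto)


def group_bidirectional(packets, min_packets):
    """Group packets into bidirectional flows and drop flows with fewer than
    min_packets packets (the threshold N of sub-step (2.1)).

    Each packet is a dict with the hypothetical keys
    src_ip, src_port, dst_ip, dst_port, proto.
    """
    flows = defaultdict(list)
    for p in packets:
        key = flow_key(p["src_ip"], p["src_port"],
                       p["dst_ip"], p["dst_port"], p["proto"])
        flows[key].append(p)
    return {k: v for k, v in flows.items() if len(v) >= min_packets}
```

Sorting the two endpoints inside the key is one simple way to make uplink and downlink packets land in the same flow; other normalizations would work equally well.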

further, in the step (3), constructing the labeled sample set specifically includes the following sub-steps:

(3.1) extracting the features shown in Table 1 for the tagged stream obtained in step (2);

TABLE 1 Statistics and their descriptions

Statistic	Description
f_pck	Number of data packets transmitted in the uplink direction
b_pck	Number of data packets transmitted in the downlink direction
f_len	Number of bytes transmitted in the uplink direction
b_len	Number of bytes transmitted in the downlink direction
f_d_p	Number of payload-carrying data packets transmitted in the uplink direction
b_d_p	Number of payload-carrying data packets transmitted in the downlink direction
f_d_l	Number of payload bytes transmitted in the uplink direction
b_d_l	Number of payload bytes transmitted in the downlink direction
p_len	Number of payload bytes carried by each packet in the bidirectional flow
tmGap	Effective transmission time of the bidirectional flow

(3.2) further processing the collected information, and eliminating the influence of data packet sampling on the feature stability through statistical calculation;

(3.3) when the characteristics are selected, the influence of the protocol on the characteristics is avoided as much as possible, and a characteristic space shown in a table 2 is constructed on the basis of three characteristics (asymmetry, high transmission rate and unique payload length distribution) of video traffic transmission per se;

TABLE 2 feature space and description thereof

And (3.4) extracting a feature vector from the collected flow to construct a sample set based on the constructed feature space.

Further, in the step (4), training the classification model specifically includes the following steps:

(4.1) dividing the sample set into a training set and a test set at a ratio of 3:1;

(4.2) training on the training set using a random forest algorithm, and using the test set to perform dimension reduction on the feature vectors and to determine the parameters of the algorithm;

(4.3) obtaining a classification model for video traffic identification.

Further, the step (5) of acquiring the high-speed network traffic and extracting the feature vector specifically includes the following steps:

(5.1) deploying traffic collection equipment in the high-speed network, and continuously capturing traffic with tcpdump;

(5.2) setting a sampling ratio, performing systematic sampling on the captured data, and reassembling flows according to the five-tuple;

(5.3) setting the number M of data packets required for feature extraction, and extracting feature vectors from the first M data packets of each sampled stream;

further, in the step (6), the feature vector of the high-speed network traffic extracted in the step (5) is input into the classification model obtained in the step (4), and the video traffic is identified from the feature vector and the result is output.

Compared with the prior art, the technical scheme of the invention has the following advantages:

(1) The invention provides a new feature space that uses protocol-independent features. A model trained on these features can identify multi-platform video traffic carried over different protocols, so the method is more practical in a high-speed network.

(2) The feature space provided by the invention can extract stable feature vectors from the first 500 data packets of each stream, so that the video can be quickly identified in the early stage of stream transmission, and the test result proves that the method can be used for real-time identification of video flow.

(3) The invention combines the sampling technology with the video stream identification method, reduces the resource consumption of flow processing in the high-speed network, and experiments prove that the invention can identify more than 98 percent of video flow in the 10Gbps high-speed network when the sampling rate is set to be 1/32.

Drawings

FIG. 1 is an overall architecture diagram of the present invention;

FIG. 2 is a packet payload length probability distribution for a video stream and other types of streams;

FIG. 3 shows the recognition performance of the present invention when different sampling rates are set in high-speed network traffic.

Detailed Description

The technical solutions provided by the present invention will be described in detail below with reference to specific examples, and it should be understood that the following specific embodiments are only illustrative of the present invention and are not intended to limit the scope of the present invention.

The specific embodiment is as follows: the invention provides a high-speed-network-oriented method for early identification of multi-platform video traffic, the general architecture of which is shown in FIG. 1, comprising the following steps:

step (1) collecting video playback traffic of different platforms through data collection equipment;

step (2) preprocessing the collected traffic and labeling video and non-video streams;

step (3) extracting features from the traffic labeled in step (2), constructing a rule-based feature space, and then obtaining a labeled sample set;

step (4) taking the sample set obtained in step (3) as a training set, and then training with a supervised machine learning method to obtain a classification model capable of distinguishing video streams from non-video streams;

step (5) setting a sampling ratio, performing systematic packet sampling on the traffic in the high-speed network, then reassembling the sampled packets into flows and extracting features;

step (6) applying the classification model obtained in step (4) to predict unknown streams and identify video traffic.

In an embodiment of the present invention, in the step (1), the specific steps of acquiring video traffic of different platforms are as follows:

(1.1) capturing traffic on a laboratory host and on an Android device respectively. On the host, traffic is captured directly with Wireshark; the Android device is connected to a hotspot on the host, and the traffic of a specific process on the Android device is captured with Wireshark. While video traffic is being captured, network access is disabled for other applications;

(1.2) selecting popular video websites at home and abroad, playing videos and capturing traffic according to the following strategy: the maximum capture time for each video is set to 5 minutes, after which the capture ends and is saved as a pcap file;

(1.3) writing an automation script that captures video traffic according to the strategy of step (1.2), with other networked devices disabled during capture;

(1.4) selecting some of the video platforms with the largest user bases at home and abroad, collecting their video playback traffic, and analyzing the transmission protocols used by the different platforms. The collected traffic is described in Table 1.

Acquisition platform	Volume of data collected	Transmission protocols
Facebook	378 MB	HTTP+TLS1.3
YouTube	13.85 GB	HTTP+TLS1.3; GQUIC
Twitter	70 MB	HTTP+TLS1.3
Bilibili	2.87 GB	HTTP+TCP; UDT
iQiyi	5.3 GB	HTTP+TCP; HTTP+TLS1.2
Youku	1.29 GB	HTTP+TCP; HTTP+TLS1.2
Kuaishou	3.07 GB	HTTP+TLS1.2; HTTP+TLS1.3
Renren Yingshi	1.18 GB	HTTP+TLS1.2
Sohu Video	1.01 GB	HTTP+TCP; HTTP+TLS1.3; GQUIC
Douyin	112 MB	HTTP+TCP; HTTP+TLS1.2
Huoshan Video	334 MB	HTTP+TCP
Other platforms	0.99 GB	HTTP+TCP; HTTP+TLS1.2

In one embodiment of the present invention, in step (2), the specific steps of preprocessing and marking the flow rate are as follows:

(2.1) for the captured video flow, recombining the data packets into bidirectional flow according to five-tuple (source IP, source port, destination IP, destination port and transport layer protocol), setting N as 100, and discarding the flow of which the number of the data packets is less than N;

(2.2) unpacking the stream by using a dpkt tool, and extracting key information containing the flow type according to a transmission protocol used by the stream, wherein the method specifically comprises the following steps: if the stream is encrypted by adopting TLS or QUIC protocol, finding out a data packet containing ClientHello information, then extracting an SNI field containing server domain name information from the data packet, and finally judging whether the stream is a video stream according to keywords contained in the SNI field; if the stream is transmitted by adopting an unencrypted HTTP protocol, obtaining a URL from a data packet containing the GET request, and judging whether the stream is a video stream according to a request data type keyword contained in the URL.

(2.3) writing a program to extract SNI fields and URLs in batches, matching them against regular expressions, and quickly labeling video streams and non-video streams.
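The keyword matching in steps (2.2) and (2.3) might look like the following sketch. The regular expressions and the function name `label_flow` are illustrative assumptions: the source does not list its keywords, and a real deployment would need a curated pattern list per platform. Extraction of the SNI or URL from packets (done with dpkt in this embodiment) is outside the sketch.

```python
import re

# Hypothetical keyword patterns, for illustration only.
VIDEO_SNI = re.compile(r"(video|vod|stream)", re.IGNORECASE)
VIDEO_URL = re.compile(r"\.(mp4|flv|m4s|ts|m3u8)(\?|$)", re.IGNORECASE)


def label_flow(sni=None, url=None):
    """Label a flow from its SNI (TLS/QUIC flows) or request URL (plain HTTP flows).

    Returns 'video', 'non-video', or 'unknown' when neither field is available.
    """
    if sni is not None:
        return "video" if VIDEO_SNI.search(sni) else "non-video"
    if url is not None:
        return "video" if VIDEO_URL.search(url) else "non-video"
    return "unknown"
```

For example, `label_flow(url="/path/clip.mp4?token=1")` yields `'video'` under these assumed patterns, while a flow with neither field stays `'unknown'` and can be left out of the labeled data set.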

In one embodiment of the present invention, in step (3), statistics as shown in table 2 are collected for the tagged bidirectional stream obtained in step (2), then the collected information is amplified by the reciprocal of the set sampling rate to eliminate the influence of sampling on the stability of the statistics, and then statistical features are specifically constructed from the following three directions according to the characteristics of video traffic transmission:

TABLE 2 Statistics and their descriptions

Statistic	Description
f_pck	Number of data packets transmitted in the uplink direction
b_pck	Number of data packets transmitted in the downlink direction
f_len	Number of bytes transmitted in the uplink direction
b_len	Number of bytes transmitted in the downlink direction
f_d_p	Number of payload-carrying data packets transmitted in the uplink direction
b_d_p	Number of payload-carrying data packets transmitted in the downlink direction
f_d_l	Number of payload bytes transmitted in the uplink direction
b_d_l	Number of payload bytes transmitted in the downlink direction
p_len	Number of payload bytes carried by each data packet in the bidirectional flow
tmGap	Effective transmission time of the bidirectional flow

(3.1) constructing four statistical features RAT = {r_b_pck, r_b_len, r_b_dp, r_b_dl} based on the asymmetry between uplink and downlink transmission of a video stream, where r_b_pck is the ratio of the number of data packets sent in the downlink direction to the number sent in the bidirectional flow, r_b_len is the ratio of the number of bytes sent in the downlink direction to the number sent in the bidirectional flow, r_b_dp is the ratio of the number of payload-carrying data packets sent in the downlink direction to the number sent in the bidirectional flow, and r_b_dl is the ratio of the number of payload bytes sent in the downlink direction to the number sent in the bidirectional flow. These four statistical features are calculated using equation (1):
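The equation itself does not appear in this text. A form consistent with the definitions above and the Table 2 statistics, offered as a reconstruction rather than the original notation, is:

```latex
\begin{equation}
\begin{aligned}
r\_b\_pck &= \frac{b\_pck}{f\_pck + b\_pck}, &
r\_b\_len &= \frac{b\_len}{f\_len + b\_len}, \\
r\_b\_dp  &= \frac{b\_d\_p}{f\_d\_p + b\_d\_p}, &
r\_b\_dl  &= \frac{b\_d\_l}{f\_d\_l + b\_d\_l}
\end{aligned}
\tag{1}
\end{equation}
```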

(3.2) constructing four statistical features SPD = {b_spd_pck, f_spd_pck, b_spd_len, f_spd_len} based on the high transmission rate of a video stream, where b_spd_pck and f_spd_pck are the packet transmission rates in the downlink and uplink directions, respectively, and b_spd_len and f_spd_len are the byte transmission rates in the downlink and uplink directions, respectively. These four statistical features are calculated using equation (2):
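The equation itself does not appear in this text. Assuming each rate is the corresponding Table 2 count divided by the effective transmission time tmGap (an assumption, not confirmed by the source), it would read:

```latex
\begin{equation}
\begin{aligned}
b\_spd\_pck &= \frac{b\_pck}{tmGap}, &
f\_spd\_pck &= \frac{f\_pck}{tmGap}, \\
b\_spd\_len &= \frac{b\_len}{tmGap}, &
f\_spd\_len &= \frac{f\_len}{tmGap}
\end{aligned}
\tag{2}
\end{equation}
```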

(3.3) FIG. 2 shows that the packet payload length probability distribution of video streams differs from that of other stream types, so payload lengths are divided into intervals based on the distinctive payload length distribution of video streams. Taking 1300 bytes as the common MTU on network links, the packet payload is divided into 13 intervals of 100 bytes each; adding a left boundary interval (payload length 0) and a right boundary interval (payload length greater than 1300 bytes) gives 15 intervals, and since a bidirectional stream has two directions, there are 30 intervals in total. These features are named PLD and calculated using equation (3):
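The equation itself does not appear in this text. One form consistent with the feature meanings given in Table 3 (a reconstruction under that assumption), applied separately to each direction, is:

```latex
\begin{equation}
PLD_i = \frac{Interval_i}{\sum_{j=1}^{15} Interval_j}, \qquad i = 1, \dots, 15
\tag{3}
\end{equation}
```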

where Interval_i is the number of data packets included in the i-th interval.

(3.4) combining the three feature types RAT, SPD and PLD to construct a feature space containing 38 features in total (4 RAT, 4 SPD and 30 PLD features), and extracting feature vectors from the labeled traffic obtained in step (2) to construct a data set.
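Steps (3.1) to (3.4) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: each packet is taken to be a tuple `(direction, total_len, payload_len, timestamp)` with direction `'f'` (uplink) or `'b'` (downlink), tmGap is approximated as last timestamp minus first, each rate is a sampling-scaled count divided by tmGap, and each PLD feature is an interval count divided by the packet count of its direction.

```python
def interval_index(payload_len):
    """Map a payload length to one of 15 intervals: 0, 1-100, ..., 1201-1300, >1300."""
    if payload_len == 0:
        return 0
    if payload_len > 1300:
        return 14
    return (payload_len - 1) // 100 + 1


def ratio(a, b):
    """Safe division that returns 0.0 for an empty denominator."""
    return a / b if b else 0.0


def extract_features(packets, sampling_rate=1.0):
    """Build the 38-dimensional feature dict (4 RAT + 4 SPD + 30 PLD)."""
    scale = 1.0 / sampling_rate  # counts are scaled by the reciprocal of the sampling rate
    stats = {d: {"pck": 0, "len": 0, "d_p": 0, "d_l": 0, "hist": [0] * 15}
             for d in "fb"}
    for direction, total_len, payload_len, _ in packets:
        s = stats[direction]
        s["pck"] += 1
        s["len"] += total_len
        if payload_len > 0:
            s["d_p"] += 1
            s["d_l"] += payload_len
        s["hist"][interval_index(payload_len)] += 1

    times = [p[3] for p in packets]
    tm_gap = max(times) - min(times) if times else 0.0  # assumed definition of tmGap
    f, b = stats["f"], stats["b"]
    feats = {}

    # RAT: downlink share of each bidirectional total (scaling cancels in a ratio).
    for name, key in (("r_b_pck", "pck"), ("r_b_len", "len"),
                      ("r_b_dp", "d_p"), ("r_b_dl", "d_l")):
        feats[name] = ratio(b[key], f[key] + b[key])

    # SPD: scaled packet and byte counts per unit of transmission time.
    for d in "bf":
        feats[f"{d}_spd_pck"] = ratio(stats[d]["pck"] * scale, tm_gap)
        feats[f"{d}_spd_len"] = ratio(stats[d]["len"] * scale, tm_gap)

    # PLD: share of packets falling in each payload-length interval, per direction.
    labels = ["0"] + [f"{lo}-{lo + 99}" for lo in range(1, 1300, 100)] + [">1300"]
    for d in "bf":
        for i, label in enumerate(labels):
            feats[f"per_{d}_({label})"] = ratio(stats[d]["hist"][i], stats[d]["pck"])
    return feats
```

Because RAT and PLD are ratios, the sampling scale factor cancels in them; only the SPD rates change when counts are scaled by the reciprocal of the sampling rate.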

In an embodiment of the present invention, in the step (4), the training of the classification model specifically includes the following steps:

(4.1) the data set obtained in step (3) is divided into a training set and a test set at a ratio of 3:1, where the training set contains 7899 samples and the test set contains 2633 samples;

(4.2) this example trains a model on the training set using a random forest algorithm and tests it on the test set. First, the features are ranked by importance based on mean decrease in impurity (MDI), and the 8 most important features are kept to reduce the dimension of the feature vector; the selected features are shown in Table 3. Then, the optimal parameters of the random forest algorithm are determined by grid search with ten-fold cross validation. Finally, a classification model for identifying video traffic is obtained.

TABLE 3 Traffic features and their meanings

Feature name	Meaning
per_b_(0)	Ratio of the number of packets with a payload length of 0 bytes to the total number of packets in the downlink direction
per_b_(1-100)	Ratio of the number of packets with a payload length of 1 to 100 bytes to the total number of packets in the downlink direction
per_f_(>1300)	Ratio of the number of packets with a payload length greater than 1300 bytes to the total number of packets in the uplink direction
r_f_dp	Ratio of the number of payload-carrying data packets transmitted in the uplink direction to that of the bidirectional flow
r_f_dl	Ratio of the number of payload bytes transmitted in the uplink direction to that of the bidirectional flow
r_f_pck	Ratio of the number of data packets transmitted in the uplink direction to that of the bidirectional flow
r_f_len	Ratio of the number of bytes transmitted in the uplink direction to that of the bidirectional flow
b_spd_len	Data transmission rate in the downlink direction
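The training procedure of step (4) can be sketched with scikit-learn. The library choice, the function name `train_video_classifier`, and the parameter grid are assumptions for illustration; the source names only the random forest algorithm, MDI ranking, and grid search with ten-fold cross validation.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split


def train_video_classifier(X, y, n_keep=8, seed=0):
    """Train a random forest on the n_keep most important features.

    Returns (best_model, kept_feature_indices, test_accuracy).
    """
    # 3:1 split into training and test sets.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed)

    # Rank features by mean decrease in impurity (feature_importances_).
    ranker = RandomForestClassifier(n_estimators=50, random_state=seed)
    ranker.fit(X_train, y_train)
    keep = np.argsort(ranker.feature_importances_)[::-1][:n_keep]

    # Grid search with ten-fold cross validation; the grid is hypothetical.
    grid = GridSearchCV(
        RandomForestClassifier(random_state=seed),
        {"n_estimators": [20, 50], "max_depth": [4, None]},
        cv=10)
    grid.fit(X_train[:, keep], y_train)
    return grid.best_estimator_, keep, grid.score(X_test[:, keep], y_test)
```

Here `X` would be an array with one 38-dimensional feature vector per flow and `y` the video/non-video labels from step (2).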

In one embodiment of the present invention, in the step (5), the specific steps of collecting the high-speed network traffic and extracting the feature vector are as follows:

(5.1) in this example, traffic was collected at a campus network port at 8 a.m. in November 2021; the collection lasted 400 s, the port bandwidth was 10 Gbps, and the resulting capture was 117 GB, comprising 171485 streams. The collected traffic includes video traffic from different platforms;

(5.2) setting the sampling rate to 1/32, performing systematic packet sampling on the captured data, and then reassembling data packets with the same five-tuple into the same bidirectional flow;

(5.3) according to the test results, setting the number M of data packets needed to extract a flow's feature vector to 500, extracting features from the sampled flows, and finally obtaining 30766 samples with features;
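Steps (5.2) and (5.3) can be illustrated with a short sketch. It assumes that systematic sampling means keeping every k-th packet in arrival order (k = 32 for a 1/32 rate); the function names and the `key_fn` parameter are hypothetical.

```python
from collections import defaultdict


def systematic_sample(packets, k, offset=0):
    """Keep every k-th packet starting at offset, i.e. a sampling rate of 1/k."""
    return packets[offset::k]


def first_m_per_flow(packets, key_fn, m):
    """Group sampled packets by flow key and keep only the first m packets of each flow."""
    flows = defaultdict(list)
    for p in packets:
        bucket = flows[key_fn(p)]
        if len(bucket) < m:
            bucket.append(p)
    return dict(flows)
```

With `k=32` and `m=500`, feature extraction for a flow can begin as soon as its 500th sampled packet arrives, without waiting for the stream to finish.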

in an embodiment of the present invention, in the step (6), identifying the high-speed network video traffic by using the video traffic identifier includes the following specific steps:

(6.1) this embodiment selects precision and recall as evaluation metrics. Because none of the high-speed traffic is labeled, precision and recall are estimated by the following two methods, respectively:

sampling verification method: applying the classification model to obtain labeled classification results, and manually checking a portion of the results to estimate the precision of the classification model;

marked-sample injection method: mixing M pre-labeled video stream samples into the samples to be predicted, applying the classification model to obtain labeled classification results, recording as m the number of those M samples that the model predicts to be video streams, and estimating the recall of the classification model as recall = m/M.

(6.2) the classification model is applied to the high-speed traffic data set, which contains video traffic from multiple platforms, to identify video traffic. The precision and recall of the model at different sampling rates are shown in FIG. 3, which shows that the method identifies more than 98% of multi-platform video traffic in the high-speed network;

(6.3) this example experimentally analyzes the shortest time required to identify video traffic in a high-speed network, to demonstrate the practicality of the invention. The time required to identify video streams in a high-speed network comprises the feature extraction time and the model prediction time. The time required to extract features from a stream is mainly affected by the bandwidth and the sampling rate; for high-speed network traffic with 10 Gbps bandwidth, at a sampling rate of 1/32 and neglecting other processing overhead, the invention needs only 2.24 milliseconds to extract features from the first 500 data packets of a bidirectional stream. For the 30766 samples used in this example, feature extraction can be completed in as little as 68915 ms and model prediction in 322 ms. In summary, for 400 seconds of data from a real 10 Gbps high-speed network, the method can identify the video traffic in as little as 69.237 seconds, which shows that it can be used for real-time identification of video traffic from different platforms in a high-speed network.
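The two estimation methods of (6.1) and the time budget of (6.3) can be written out as a small sketch. The function names and the `manual_check` callable (standing in for human proofreading) are hypothetical; the numbers in the usage note are those reported in this example.

```python
import random


def estimate_precision(predicted_video_ids, manual_check, sample_size, seed=0):
    """Sampling verification: manually check a random subset of flows predicted as video."""
    ids = list(predicted_video_ids)
    if not ids:
        return 0.0
    subset = random.Random(seed).sample(ids, min(sample_size, len(ids)))
    return sum(1 for fid in subset if manual_check(fid)) / len(subset)


def estimate_recall(injected_ids, predicted_video_ids):
    """Marked-sample injection: recall = m / M over the M pre-labeled video samples."""
    predicted = set(predicted_video_ids)
    m = sum(1 for fid in injected_ids if fid in predicted)
    return m / len(injected_ids)


def total_time_ms(n_samples, per_flow_ms, predict_ms):
    """Feature extraction time for all samples plus model prediction time."""
    return n_samples * per_flow_ms + predict_ms
```

With the reported figures, `total_time_ms(30766, 2.24, 322)` comes to about 69237.8 ms, in line with the 69.237 seconds stated above for 400 seconds of captured traffic.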

The technical means disclosed in the invention scheme are not limited to the technical means disclosed in the above embodiments, but also include the technical scheme formed by any combination of the above technical features. It should be noted that those skilled in the art can make various improvements and modifications without departing from the principle of the present invention, and such improvements and modifications are also considered to be within the scope of the present invention.

Claims

1. A multi-platform video flow early identification method facing a high-speed network is characterized by comprising the following steps:

step (1) collecting video playback traffic of different platforms through data collection equipment;

step (2) preprocessing the collected traffic and labeling video and non-video streams;

step (5) setting a sampling ratio, performing systematic packet sampling on the traffic in the high-speed network, then reassembling the sampled packets into flows and extracting features;

2. The method for early identifying the video traffic of the high-speed network-oriented multiple platforms according to claim 1, wherein in the step (1), the method for capturing the video traffic is as follows:

(1.1) respectively capturing flow on a laboratory host and android equipment, and directly capturing the flow at a host end by using Wireshark; the android device is connected with a hot spot on the host, the flow of the video playing process of the android device is captured through Wireshark, and the networking permission of other applications is forbidden when the video flow is captured;

3. The method according to claim 1, wherein the preprocessing and marking of the traffic in step (2) specifically comprises the following steps:

(2.1) for the video traffic of different platforms obtained in step (1), grouping data packets that share the same five-tuple, namely source IP, source port, destination IP, destination port and transport layer protocol, into the same bidirectional flow, and discarding flows with fewer than N packets;

(2.2) judging the transmission protocol adopted by the bidirectional stream, and if the bidirectional stream is the non-encrypted video stream, performing (2.3); otherwise, carrying out (2.4);

and (2.4) extracting an SNI field containing domain name information from handshake information in the bidirectional stream, judging whether the stream is a video stream according to keywords contained in the SNI, and marking.

4. The method for early recognition of multi-platform video traffic oriented to high-speed network according to claim 1, wherein in the step (3), the specific steps of constructing the tagged sample set are as follows:

(3.1) recording statistics as shown in table 1 for the bi-directional stream that has been marked;

TABLE 1 statistics and description

(3.3) avoiding the influence of the protocol on the characteristics as much as possible when the characteristics are selected, and constructing a characteristic space shown in a table 2 for the bidirectional flow from three characteristics of video flow transmission, namely asymmetry, high transmission rate and unique payload length distribution of uplink and downlink flow transmission;

TABLE 2 description of feature spaces and features contained therein

5. The method for early recognition of multi-platform video traffic oriented to high-speed network according to claim 1, wherein in the step (4), training the classification model specifically includes the following steps:

(4.1) dividing the sample set into a training set and a test set at a ratio of 3:1;

and (4.3) obtaining a classification model for video traffic identification.

6. The method according to claim 1, wherein the step (5) of acquiring the high-speed network traffic and extracting the feature vector comprises the following steps:

(5.2) setting a sampling ratio, performing systematic sampling on the captured data, and reassembling the streams according to the five-tuple;

and (5.3) setting the number M of data packets required for extracting the features, and extracting feature vectors from the first M data packets of the sampled stream.

7. The method for early identifying the video traffic of the high-speed network-oriented multiple platforms as claimed in claim 1, wherein in the step (6), the feature vector of the high-speed network traffic extracted in the step (5) is input into the classification model obtained in the step (4), and the video traffic is identified therefrom and the result is output.