CN113364699B - Cloud data flow management and control method and system based on multi-scale self-similar characteristic - Google Patents
- Publication number
- CN113364699B CN202110662050.6A
- Authority
- CN
- China
- Prior art keywords
- data
- self
- cloud data
- traffic
- scale
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/10—Flow control; Congestion control
- H04L47/20—Traffic policing
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/10—Flow control; Congestion control
- H04L47/24—Traffic characterised by specific attributes, e.g. priority or QoS
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/10—Flow control; Congestion control
- H04L47/27—Evaluation or update of window size, e.g. using information derived from acknowledged [ACK] packets
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1408—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
- H04L63/1425—Traffic logging, e.g. anomaly detection
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D30/00—Reducing energy consumption in communication networks
- Y02D30/50—Reducing energy consumption in communication networks in wire-line communication networks, e.g. low power modes or reduced link rate
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Computer Security & Cryptography (AREA)
- Computer Hardware Design (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
The invention provides a cloud data traffic control method and system based on multi-scale self-similarity characteristics. The method comprises the following steps: S1, preprocessing collected cloud data to obtain a data sequence, wherein the preprocessing comprises data packet filling of the cloud data; S2, selecting, in a sliding manner and using multi-scale calculation windows, data subsequences from the data sequence for determining the self-similarity characteristic; S3, calculating the Hurst parameter of each data subsequence to determine the traffic self-similarity characteristic of the cloud data; and S4, performing traffic control on the cloud data based on the traffic self-similarity characteristic so as to adaptively adjust the transmission traffic structure of the cloud data.
Description
Technical Field
The invention relates to the field of flow control, in particular to a cloud data flow control method and system based on multi-scale self-similarity characteristics.
Background
Increasingly diverse services and rapidly growing network traffic require the Internet to provide a higher quality of service. For services with very demanding requirements on bandwidth, network delay and the like, the best-effort service provided by the conventional Internet cannot meet those requirements. Although network speed and user experience have improved greatly with the development of network technology, the volume of data transmitted over the network has grown at the same time, so guaranteeing quality of service remains a bottleneck at the present stage.
Cloud data is a general term for the technologies and platforms for data integration, data analysis, data distribution and data early warning built on the cloud computing business model. Network traffic control is a method that uses software or hardware to control network traffic. The most important approach is to introduce the concept of QoS and to determine the forwarding priority of data packets by marking different types of network data packets. The traditional approach cannot control massive cloud data traffic rapidly and safely.
Disclosure of Invention
The cloud data traffic control scheme based on the multi-scale self-similarity characteristic is provided, and the problem that rapid safety control cannot be achieved for massive cloud data traffic in the prior art is solved.
A first aspect of the invention provides a cloud data traffic control method based on multi-scale self-similarity characteristics. The method comprises the following steps: S1, preprocessing collected cloud data to obtain a data sequence, wherein the preprocessing comprises data packet filling of the cloud data; S2, selecting, in a sliding manner and using multi-scale calculation windows, data subsequences from the data sequence for determining the self-similarity characteristic; S3, calculating the Hurst parameter of each data subsequence to determine the traffic self-similarity characteristic of the cloud data; and S4, performing traffic control on the cloud data based on the traffic self-similarity characteristic so as to adaptively adjust the transmission traffic structure of the cloud data.
According to the method provided by the first aspect of the present invention, in step S1, the cloud data includes a plurality of data records. When the duration of a data record is one second, no data packet filling is performed; when the duration of a data record exceeds one second, it is determined that data packets are missing from the data record, and data packet filling is performed on the data record to obtain the data sequence. The data packet filling specifically includes: when the amount of missing data is lower than a first threshold, filling the missing time with the average of the two data records immediately before and after it; and when the amount of missing data is not lower than the first threshold, filling the data at the missing time by cubic spline interpolation.
According to the method provided by the first aspect of the present invention, in step S2, in the calculation windows of the respective scales, the data subsequences corresponding to the current scale are sequentially selected from the data sequences in a sliding manner, and the data subsequences selected in the calculation windows of the same scale have the same length.
According to the method provided by the first aspect of the present invention, in step S3, rescaled range (R/S) analysis is used to calculate the Hurst parameters of the data subsequences under the calculation windows of the various scales, and a traffic self-similarity change curve is drawn from them as the self-similarity feature of the cloud data traffic.
According to the method provided by the first aspect of the present invention, in the step S4, the traffic self-similarity characteristic is compared with a second threshold to determine whether the cloud data belongs to an abnormal traffic, and if yes, the IP and the port of the cloud data are adaptively adjusted to control the transmission traffic structure, where the second threshold is a threshold for calculating a statistical value of a normal traffic.
The invention provides a cloud data flow management and control system based on a multi-scale self-similar characteristic. The system comprises: a preprocessing unit configured to preprocess the acquired cloud data to obtain a data sequence, the preprocessing including data packet stuffing of the cloud data; a subsequence selecting unit configured to select a data subsequence for determining the self-similarity characteristic from the data sequence in a sliding manner by using a multi-scale calculation window; a self-similarity characteristic determination unit configured to calculate Hurst parameters of the data subsequences to determine a flow self-similarity characteristic of the cloud data; and the traffic control unit is configured to perform traffic control on the cloud data based on the traffic self-similarity characteristic so as to adaptively adjust a transmission traffic structure of the cloud data.
According to the system provided by the second aspect of the present invention, the cloud data includes a plurality of data records, and the preprocessing unit is specifically configured to: perform no data packet filling when the duration of a data record is one second; and, when the duration of a data record exceeds one second, determine that data packets are missing from the data record and perform data packet filling on the data record to obtain the data sequence. The data packet filling specifically includes: when the amount of missing data is lower than a first threshold, filling the missing time with the average of the two data records immediately before and after it; and when the amount of missing data is not lower than the first threshold, filling the data at the missing time by cubic spline interpolation.
According to the system provided by the second aspect of the present invention, the subsequence selecting unit is specifically configured to, under the calculation windows of the respective scales, sequentially select, in a sliding manner, the data subsequence corresponding to the current scale from the data sequence, and each data subsequence selected under the calculation window of the same scale has the same length.
According to the system provided by the second aspect of the present invention, the self-similarity characteristic determination unit is specifically configured to use rescaled range (R/S) analysis to calculate the Hurst parameters of the data subsequences under the calculation windows of the various scales, and to draw from them a traffic self-similarity change curve as the self-similarity feature of the cloud data traffic.
According to the system provided by the second aspect of the present invention, the traffic management and control unit is specifically configured to compare the traffic self-similarity characteristic with a second threshold value to determine whether the cloud data belongs to an abnormal traffic, and if so, adaptively adjust the IP and the port of the cloud data to control the transmission traffic structure, where the second threshold value is a threshold value for calculating a statistical value of a normal traffic.
A third aspect of the present invention provides a non-transitory computer readable medium storing instructions which, when executed by a processor, perform the steps in a method for cloud data traffic management based on multi-scale self-similar characteristics according to the first aspect of the present invention.
In summary, in the technical scheme provided by the invention, the cloud data traffic is subjected to self-similarity calculation according to different time scales to obtain traffic characteristics under multiple time scales, and the threshold of the traffic self-similarity index is set according to the characteristics of different time scales, so that the self-similarity calculation of the data traffic under different application environments is adapted, and the problem that rapid safety control cannot be realized on the mass cloud data traffic is solved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the embodiments or the description in the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is a flowchart of a cloud data traffic control method based on multi-scale self-similar characteristics according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart according to a first embodiment of the present invention;
FIG. 3 is a schematic flow chart according to a second embodiment of the present invention; and
FIG. 4 is a structural diagram of a cloud data traffic management and control system based on multi-scale self-similar characteristics according to an embodiment of the present invention.
Detailed Description
The technical solutions of the present invention will be described clearly and completely with reference to the accompanying drawings, and it should be understood that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
A first aspect of the invention provides a cloud data traffic control method based on multi-scale self-similarity characteristics. FIG. 1 is a flowchart of a cloud data traffic control method based on multi-scale self-similar characteristics according to an embodiment of the present invention. As shown in FIG. 1, the method includes: S1, preprocessing collected cloud data to obtain a data sequence, wherein the preprocessing comprises data packet filling of the cloud data; S2, selecting, in a sliding manner and using multi-scale calculation windows, data subsequences from the data sequence for determining the self-similarity characteristic; S3, calculating the Hurst parameter of each data subsequence to determine the traffic self-similarity characteristic of the cloud data; and S4, performing traffic control on the cloud data based on the traffic self-similarity characteristic so as to adaptively adjust the transmission traffic structure of the cloud data.
In step S1, the collected cloud data is preprocessed to obtain a data sequence, where the preprocessing includes data packet stuffing of the cloud data.
In some embodiments, in step S1, the cloud data includes a plurality of data records. When the duration of a data record is one second, no data packet filling is performed; when the duration of a data record exceeds one second, it is determined that data packets are missing from the data record, and data packet filling is performed on the data record to obtain the data sequence. The data packet filling specifically includes: when the amount of missing data is lower than a first threshold, filling the missing time with the average of the two data records immediately before and after it; and when the amount of missing data is not lower than the first threshold, filling the data at the missing time by cubic spline interpolation.
Specifically, collected cloud data may have missing time points, and missing or duplicated data packet times need to be handled during preprocessing. If a certain second contains no packets, the statistics become confused. If a record spans more than one second, the intermediate seconds have no data, and the right-hand (end) time can be used as the unique identification time of the record. When little data is missing, which mainly refers to the case where the data before and after the missing time is complete, the mean of the data before and after the missing time is used directly as the data at the missing time; otherwise, cubic spline interpolation is used to complete the data. Cubic spline interpolation, or spline interpolation for short, strikes a reasonable compromise between flexibility and computation speed compared with other interpolation methods. It requires less computation and storage than higher-order splines and is more stable. Compared with a quadratic spline, a cubic spline is more flexible in modeling arbitrary shapes, and the interpolated data is more accurate.
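The mean-filling branch described above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the record layout `(second, value)`, the function name `fill_short_gaps` and the default `first_threshold=2` are assumptions introduced for the example, and longer gaps are only marked with `None` so that a separate interpolation step can complete them.

```python
def fill_short_gaps(records, first_threshold=2):
    """Expand per-second records and fill short gaps with the neighbour mean.

    records: non-empty list of (second, value) pairs sorted by second
             (hypothetical layout; the patent does not fix a record format).
    Returns a dict mapping every second in the covered range to a value,
    or to None where the gap is long and is left for spline interpolation.
    """
    if not records:
        raise ValueError("records must not be empty")
    filled = {}
    for (t0, v0), (t1, v1) in zip(records, records[1:]):
        filled[t0] = v0
        missing = t1 - t0 - 1  # number of seconds without any record
        for t in range(t0 + 1, t1):
            if missing < first_threshold:
                # Short gap: use the mean of the records before and after it.
                filled[t] = (v0 + v1) / 2
            else:
                # Long gap: leave a placeholder for cubic spline interpolation.
                filled[t] = None
    last_t, last_v = records[-1]
    filled[last_t] = last_v
    return filled


if __name__ == "__main__":
    demo = [(0, 10.0), (1, 12.0), (3, 16.0), (7, 20.0)]
    print(fill_short_gaps(demo))
```

In this sketch the threshold is applied to the number of consecutive missing seconds; the patent text leaves the exact measure of "how much is missing" open, so another measure could be substituted.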
For an interval [a, b], cubic spline interpolation divides the interval into n subintervals using n + 1 nodes:
$$a = x_0 < x_1 < \cdots < x_{n-1} < x_n = b$$
An unknown function f must be approximated from the n + 1 known points, and cubic spline interpolation does this piecewise. The piecewise function obtained by cubic spline interpolation must satisfy the following conditions, which are also the conditions used to solve for each spline segment:
(1) The function value of the simulated function at a known point is equal to the function value of f;
(2) The modeled piecewise function is second-order continuous, i.e. the first and second derivatives of adjacent segments are equal at the point where the segments meet;
(3) The second derivative at the end points a and b, or the behavior of the second derivative at the n + 1 points, must be known.
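Written out, the three conditions take the following form. The coefficient names $\alpha_i, \beta_i, \gamma_i, \delta_i$ are notation introduced here for illustration, and the natural boundary is shown only as one common choice for condition (3); the patent does not prescribe either.

```latex
\begin{align*}
S_i(x) &= \alpha_i + \beta_i (x - x_i) + \gamma_i (x - x_i)^2 + \delta_i (x - x_i)^3,
  && x \in [x_i, x_{i+1}],\ i = 0, \dots, n-1 \\
S_i(x_i) &= f(x_i), \quad S_i(x_{i+1}) = f(x_{i+1}),
  && i = 0, \dots, n-1 \\
S_i'(x_{i+1}) &= S_{i+1}'(x_{i+1}), \quad S_i''(x_{i+1}) = S_{i+1}''(x_{i+1}),
  && i = 0, \dots, n-2 \\
S_0''(x_0) &= 0, \quad S_{n-1}''(x_n) = 0
  && \text{(natural boundary, assumed)}
\end{align*}
```

The count is consistent: the n segments have 4n unknown coefficients, and the conditions supply 2n + 2(n - 1) + 2 = 4n equations.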
It is known that:
a. there are n + 1 data points [x_i, y_i], i = 0, 1, …, n;
b. each segment is a cubic polynomial curve;
c. the curve is second-order continuous at the nodes;
d. the behavior at the left and right end points is specified (natural boundary, clamped boundary, or not-a-knot boundary).
The coefficients of each spline segment are then solved from the given points to obtain the explicit expression of each curve.
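Solving for the segment coefficients can be illustrated with a short, dependency-free sketch. It assumes the natural boundary (zero second derivative at both end points), which is only one of the boundary types listed above, and it uses the standard tridiagonal elimination for that case; the function name `natural_cubic_spline` is hypothetical.

```python
from bisect import bisect_right


def natural_cubic_spline(xs, ys):
    """Return a function that evaluates the natural cubic spline through (xs, ys).

    xs must be a strictly increasing list with at least two nodes.
    Segment j is  ys[j] + b[j]*dx + c[j]*dx**2 + d[j]*dx**3  with dx = x - xs[j].
    """
    n = len(xs) - 1  # number of segments
    if n < 1 or len(ys) != len(xs):
        raise ValueError("need at least two nodes and matching lengths")
    h = [xs[i + 1] - xs[i] for i in range(n)]

    # Right-hand side of the tridiagonal system for the c coefficients.
    alpha = [0.0] * (n + 1)
    for i in range(1, n):
        alpha[i] = 3 * (ys[i + 1] - ys[i]) / h[i] - 3 * (ys[i] - ys[i - 1]) / h[i - 1]

    # Forward elimination; the natural boundary fixes c[0] = c[n] = 0.
    l = [1.0] * (n + 1)
    mu = [0.0] * (n + 1)
    z = [0.0] * (n + 1)
    for i in range(1, n):
        l[i] = 2 * (xs[i + 1] - xs[i - 1]) - h[i - 1] * mu[i - 1]
        mu[i] = h[i] / l[i]
        z[i] = (alpha[i] - h[i - 1] * z[i - 1]) / l[i]

    # Back substitution for the remaining coefficients.
    b = [0.0] * n
    c = [0.0] * (n + 1)
    d = [0.0] * n
    for j in range(n - 1, -1, -1):
        c[j] = z[j] - mu[j] * c[j + 1]
        b[j] = (ys[j + 1] - ys[j]) / h[j] - h[j] * (c[j + 1] + 2 * c[j]) / 3
        d[j] = (c[j + 1] - c[j]) / (3 * h[j])

    def evaluate(x):
        # Pick the segment containing x (clamped to the first/last segment).
        j = min(max(bisect_right(xs, x) - 1, 0), n - 1)
        dx = x - xs[j]
        return ys[j] + b[j] * dx + c[j] * dx ** 2 + d[j] * dx ** 3

    return evaluate


if __name__ == "__main__":
    spline = natural_cubic_spline([0, 1, 3, 7], [10.0, 12.0, 16.0, 20.0])
    # Estimate the values at the missing seconds 4, 5 and 6.
    print([round(spline(t), 3) for t in (4, 5, 6)])
```

In a preprocessing step, the known seconds would serve as the nodes and the spline would be evaluated at the missing seconds of a long gap.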
In step S2, a data subsequence for determining the self-similarity characteristic is selected from the data sequences in a sliding manner by using a multi-scale calculation window.
In some embodiments, in step S2, under the calculation windows of the respective scales, the data subsequences corresponding to the current scale are sequentially selected from the data sequences in a sliding manner, and the data subsequences selected under the calculation windows of the same scale have the same length.
Specifically, the traffic self-similarity calculation is based on a multi-scale approach: after preprocessing, the traffic data must be converted into a multi-scale representation. This is done mainly with the Hurst method, taking subsequences from the whole data sequence one after another with a sliding window. The subsequences taken by each slide of a window have the same length, which ensures the accuracy of each calculated H value.
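The sliding selection under several window scales can be sketched as below. The function names, the `step` parameter and the example scales are assumptions made for illustration; the patent only requires that all subsequences taken under one scale have the same length.

```python
def sliding_subsequences(sequence, window, step=1):
    """Return all length-`window` subsequences taken by sliding over `sequence`."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    return [sequence[i:i + window]
            for i in range(0, len(sequence) - window + 1, step)]


def multi_scale_subsequences(sequence, scales, step=1):
    """Map each window scale to the subsequences selected under that scale."""
    return {w: sliding_subsequences(sequence, w, step) for w in scales}


if __name__ == "__main__":
    data = list(range(6))
    for scale, subs in multi_scale_subsequences(data, scales=(3, 4)).items():
        print(scale, subs)
```

A window longer than the sequence simply yields no subsequences, so scales can be chosen without first checking the sequence length.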
The Hurst self-similarity parameter is the most widely used and most widely recognized mathematical parameter for characterizing a self-similar model. It reflects the result of a long series of interrelated events, and the self-similarity characteristics of different types of network traffic can be obtained by calculating the Hurst parameters of normal and abnormal traffic. Self-similarity means that a change of time scale does not change the statistical characteristics, so local statistical properties of the network traffic can be used to approximate its overall properties. The Hurst parameter H is the only important index for characterizing a self-similar process. For network traffic, when H lies in the interval (0.5, 1) the traffic is self-similar, and the closer H is to 1, the stronger the self-similarity. When H = 0.5, the traffic is random and has no self-similarity on the time scale. When H lies in the interval (0, 0.5), the traffic tends to have statistical behavior opposite to that of the preceding period.
In step S3, a Hurst parameter of each data subsequence is calculated to determine a traffic self-similarity characteristic of the cloud data.
In some embodiments, a flow self-similarity variation curve is drawn as a self-similarity feature of the cloud data flow based on Hurst parameters of the data subsequences under the calculation windows of the respective scales.
Specifically, the Hurst parameter H of each data subsequence is calculated by rescaled range (R/S) analysis, and a traffic self-similarity change curve can be drawn from the sequence of H values calculated in turn.
The Hurst parameter obtained by rescaled range (R/S) analysis is used as an index to judge whether time series data follows a random walk or a biased random walk. Systems described by Hurst statistics do not require the assumption of independent random events that is usual in probability theory; the parameter reflects the result of a long series of interrelated events.
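A common textbook form of R/S analysis can be sketched as follows. The patent does not give the exact procedure, so the choices here are assumptions: the series is split into non-overlapping chunks whose size doubles from a hypothetical `min_chunk`, the mean R/S value is taken per chunk size, and H is estimated as the least-squares slope of log(R/S) against log(chunk size).

```python
import math
import random


def rescaled_range(chunk):
    """Return R/S for one chunk, or None if the chunk is constant."""
    n = len(chunk)
    mean = sum(chunk) / n
    cum = lo = hi = 0.0
    for value in chunk:
        cum += value - mean          # cumulative deviation from the mean
        lo = min(lo, cum)
        hi = max(hi, cum)
    std = math.sqrt(sum((v - mean) ** 2 for v in chunk) / n)
    return (hi - lo) / std if std > 0 else None


def hurst_rs(series, min_chunk=8):
    """Estimate the Hurst parameter of `series` by rescaled range analysis."""
    points = []
    size = min_chunk
    while size <= len(series) // 2:
        values = []
        for start in range(0, len(series) - size + 1, size):
            rs = rescaled_range(series[start:start + size])
            if rs is not None:
                values.append(rs)
        if values:
            points.append((math.log(size), math.log(sum(values) / len(values))))
        size *= 2
    if len(points) < 2:
        raise ValueError("series too short for R/S analysis")
    # Least-squares slope of log(R/S) versus log(chunk size).
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    num = sum((x - mean_x) * (y - mean_y) for x, y in points)
    den = sum((x - mean_x) ** 2 for x, _ in points)
    return num / den


if __name__ == "__main__":
    rng = random.Random(0)
    noise = [rng.gauss(0.0, 1.0) for _ in range(4096)]
    print("H estimate for uncorrelated noise:", hurst_rs(noise))
```

Applying `hurst_rs` to each subsequence of a window scale, in order, yields the sequence of H values from which the self-similarity change curve is drawn. R/S estimates are known to be biased for short chunks, so values for short windows should be read with caution.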
The Hurst parameter has three forms:
(1) If H =0.5, it is indicated that the time sequence can be described by a random walk;
(2) If 0.5 < H < 1, the time series has long-term memory;
(3) If 0 ≤ H < 0.5, pink noise (anti-persistence) is indicated, i.e., a mean-reverting process.
That is, as long as H ≠ 0.5, the time series data can be described by biased Brownian motion (fractional Brownian motion).
The main methods for calculating the Hurst parameter are: the aggregated variance method, R/S analysis, the periodogram method, the absolute value method, the residual variance method, wavelet analysis and the Whittle method.
In step S4, traffic control is performed on the cloud data based on the traffic self-similarity characteristic, so as to adaptively adjust a transmission traffic structure of the cloud data.
In some embodiments, in the step S4, the traffic self-similar characteristic is compared with a second threshold to determine whether the cloud data belongs to an abnormal traffic, and if so, the IP and the port of the cloud data are adaptively adjusted to control the transmission traffic structure, where the second threshold is a threshold for calculating a statistical value of a normal traffic.
Specifically, since traffic data has the general characteristics of a signal, the traffic anomaly detection problem can be treated as a signal detection problem, and anomaly detection is implemented on that basis. The threshold used in the anomaly decision (the second threshold) is a statistical value calculated for normal traffic and is obtained by the Monte Carlo method. The final detection result is obtained by comparison with this threshold. The IP and port of the abnormal traffic are then limited to adjust the traffic structure of cloud data transmission, and the controlled data traffic is fed back into preprocessing so that the traffic is detected and controlled again.
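The comparison and limiting step might look like the sketch below. Several points are assumptions, since the patent does not specify them: flows are keyed by a hypothetical `(ip, port)` pair, a flow is treated as abnormal when any H value on its curve exceeds the threshold (the direction of the comparison is not stated in the text), and the "limit" is represented as a plain ratio rather than a call into any real traffic-shaping API.

```python
def flag_abnormal_flows(h_by_flow, threshold):
    """Return the flows whose self-similarity curve exceeds the threshold.

    h_by_flow: {(ip, port): [H values over successive windows]} (assumed layout).
    """
    return sorted(flow for flow, curve in h_by_flow.items()
                  if curve and max(curve) > threshold)


def plan_limits(abnormal_flows, limit_ratio=0.5):
    """Map each abnormal (ip, port) to the fraction of traffic still allowed."""
    if not 0.0 <= limit_ratio <= 1.0:
        raise ValueError("limit_ratio must lie in [0, 1]")
    return {flow: limit_ratio for flow in abnormal_flows}


if __name__ == "__main__":
    curves = {
        ("10.0.0.1", 443): [0.55, 0.60],
        ("10.0.0.2", 8080): [0.70, 0.93],
    }
    abnormal = flag_abnormal_flows(curves, threshold=0.9)
    print(plan_limits(abnormal))
```

After the limits are applied, the controlled traffic would be passed through preprocessing and detection again, as the paragraph above describes.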
The Monte Carlo method, also known as the statistical simulation method, uses random numbers (or, more commonly, pseudo-random numbers) to solve computational problems. The engineering problems it solves fall into two types: deterministic problems and stochastic problems. The steps for solving a problem are as follows:
(1) Construct a probability model or stochastic model for the problem so that the solution of the problem corresponds to certain characteristics (such as probability, mean, or variance) of the random variables in the model; the constructed model should agree with the actual problem or system in its main characteristic parameters;
(2) Generate random numbers on a computer according to the distribution of each random variable in the model, in sufficient quantity for one simulation run. Usually, uniformly distributed random numbers are generated first and then transformed into random numbers that follow the required distribution, after which the random simulation test can be carried out;
(3) According to the characteristics of the probability model and the distributions of the random variables, design and select a suitable sampling method and sample each random variable (direct sampling, stratified sampling, correlated sampling, importance sampling, and so on);
(4) Carry out the simulation test and calculation according to the established model to obtain a random solution of the problem;
(5) Statistically analyze the simulation results, and give the probabilistic solution of the problem together with an estimate of its precision.
The present invention uses the Monte Carlo method to determine the threshold. The Monte Carlo method can be used to calculate the probability distributions and numerical characteristics of complex random variables, to estimate the reliability of systems and components through random simulation, to simulate random processes, to search for optimal system parameters, and so on.
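One way to obtain a threshold by Monte Carlo simulation is sketched below. The patent does not describe its model of normal traffic or the statistic it thresholds, so everything specific here is an assumption: normal traffic is simulated as independent Gaussian per-second volumes, the statistic is the rescaled range R/S of the whole series, and the threshold is taken as an empirical quantile over the simulated runs.

```python
import math
import random


def rs_statistic(series):
    """Rescaled range R/S of a whole series (stand-in detection statistic)."""
    n = len(series)
    mean = sum(series) / n
    cum = lo = hi = 0.0
    for value in series:
        cum += value - mean
        lo = min(lo, cum)
        hi = max(hi, cum)
    std = math.sqrt(sum((v - mean) ** 2 for v in series) / n)
    return (hi - lo) / std


def simulate_normal_traffic(rng, length=256, mean=100.0, sd=10.0):
    """Hypothetical model of normal traffic: i.i.d. Gaussian volumes."""
    return [rng.gauss(mean, sd) for _ in range(length)]


def monte_carlo_threshold(simulate, statistic, trials=500, quantile=0.95, seed=0):
    """Estimate a threshold as an empirical quantile of the statistic.

    simulate:  callable taking a random.Random and returning one traffic series.
    statistic: callable mapping a series to a number.
    """
    rng = random.Random(seed)
    values = sorted(statistic(simulate(rng)) for _ in range(trials))
    index = min(len(values) - 1, int(quantile * len(values)))
    return values[index]


if __name__ == "__main__":
    threshold = monte_carlo_threshold(simulate_normal_traffic, rs_statistic)
    print("95% threshold for the R/S statistic:", threshold)
```

The quantile controls the false-alarm rate under the assumed model: with `quantile=0.95`, roughly 5% of simulated normal series exceed the threshold. A Hurst estimate could be passed as `statistic` instead of R/S without changing the rest of the sketch.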
FIG. 2 is a schematic flow chart according to a first embodiment of the present invention. As shown in FIG. 2, the cloud data is first preprocessed; a traffic self-similarity index is then calculated using calculation windows of different time lengths, and the traffic characteristics of the data are extracted from the self-similarity. Whether the traffic self-similarity is normal is then judged against a threshold T. If it is, the traffic is normal; if not, the traffic is abnormal, and a control measure is taken that reduces the amount of data received and forwarded for the abnormal IP and port. After the control measure, the cloud data is preprocessed again to determine whether abnormal traffic is still present.
FIG. 3 is a schematic flow chart according to a second embodiment of the present invention. As shown in FIG. 3, this embodiment performs traffic anomaly detection on cloud data in order to manage and control abnormal traffic. The specific flow is: cloud data sampling -> data preprocessing -> extraction of traffic sequences of various time lengths -> traffic self-similarity calculation -> anomaly detection -> traffic control. The abnormal traffic of the cloud data is controlled and limited, anomaly detection is then performed on the traffic again, and the detection threshold and the limiting ratio are adjusted to obtain the best traffic control effect.
The invention provides a cloud data flow management and control system based on a multi-scale self-similar characteristic. Fig. 4 is a structural diagram of a cloud data traffic management system based on multi-scale self-similar characteristics according to an embodiment of the present invention, and as shown in fig. 4, the system 400 includes: a preprocessing unit 401 configured to preprocess the acquired cloud data to obtain a data sequence, where the preprocessing includes performing data packet stuffing on the cloud data; a subsequence selecting unit 402 configured to select a data subsequence for determining the self-similarity characteristic from the data sequences in a sliding manner by using a multi-scale calculation window; a self-similarity characteristic determination unit 403, configured to calculate Hurst parameters of the data subsequences to determine a traffic self-similarity characteristic of the cloud data; a traffic control unit 404 configured to perform traffic control on the cloud data based on the traffic self-similar characteristic to adaptively adjust a transmission traffic structure of the cloud data.
According to the system provided by the second aspect of the present invention, the cloud data includes a plurality of data records, and the preprocessing unit 401 is specifically configured to: perform no data packet filling when the duration of a data record is one second; and, when the duration of a data record exceeds one second, determine that data packets are missing from the data record and perform data packet filling on the data record to obtain the data sequence. The data packet filling specifically includes: when the amount of missing data is lower than a first threshold, filling the missing time with the average of the two data records immediately before and after it; and when the amount of missing data is not lower than the first threshold, filling the data at the missing time by cubic spline interpolation.
According to the system provided by the second aspect of the present invention, the subsequence selecting unit 402 is specifically configured to, under the calculation windows of the respective scales, sequentially select, in a sliding manner, the data subsequence corresponding to the current scale from the data sequence, where each data subsequence selected under the calculation window of the same scale has the same length.
According to the system provided by the second aspect of the present invention, the self-similarity characteristic determination unit 403 is specifically configured to use rescaled range (R/S) analysis to calculate the Hurst parameters of the data subsequences under the calculation windows of the various scales, and to draw from them a traffic self-similarity change curve as the self-similarity feature of the cloud data traffic.
According to the system provided by the second aspect of the present invention, the traffic management and control unit 404 is specifically configured to compare the traffic self-similarity characteristic with a second threshold to determine whether the cloud data belongs to abnormal traffic, and if so, adaptively adjust the IP and the port of the cloud data to control the transmission traffic structure, where the second threshold is obtained by calculating statistical values of normal traffic.
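The threshold comparison can be illustrated with the short sketch below. The source does not state which statistic of normal traffic defines the second threshold or in which direction the comparison runs, so the mean-plus-k-standard-deviations rule, the parameter `k`, the "exceeds the threshold" test, and the function names are assumptions. The subsequent IP and port adjustment is not specified in enough detail to sketch and is left out.

```python
from statistics import mean, pstdev


def second_threshold(normal_hurst_values, k=3.0):
    """Derive a threshold from Hurst values measured on known-normal traffic."""
    return mean(normal_hurst_values) + k * pstdev(normal_hurst_values)


def is_abnormal(hurst_curve_values, threshold):
    """Flag traffic when any point of its Hurst curve exceeds the threshold."""
    return any(h > threshold for h in hurst_curve_values)


threshold = second_threshold([0.70, 0.72, 0.68, 0.71, 0.69])
normal_flag = is_abnormal([0.71, 0.73], threshold)
abnormal_flag = is_abnormal([0.71, 0.80], threshold)
```

Because the document sets thresholds according to the characteristics of each time scale, a deployment following this sketch would compute one threshold per scale rather than a single global value.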
A third aspect of the present invention provides a non-transitory computer readable medium storing instructions that, when executed by a processor, perform the steps of a method for cloud data traffic management based on multi-scale self-similar characteristics according to the first aspect of the present invention.
In summary, in the technical scheme provided by the present invention, the cloud data traffic is subjected to self-similarity calculation according to different time scales to obtain traffic characteristics under multiple time scales, and the threshold of the traffic self-similarity index is set according to the characteristics of different time scales, so that the method is suitable for the self-similarity calculation of data traffic under different application environments, and the problem that rapid security control cannot be implemented on massive cloud data traffic is solved.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and these modifications or substitutions do not depart from the spirit of the corresponding technical solutions of the embodiments of the present invention.
Claims (6)
1. A cloud data flow control method based on multi-scale self-similar characteristics is characterized by comprising the following steps:
S1, preprocessing acquired cloud data to obtain a data sequence, wherein the preprocessing comprises data packet padding of the cloud data;
S2, selecting data subsequences for determining the self-similarity characteristic from the data sequence in a sliding manner by using a multi-scale calculation window;
S3, calculating the Hurst parameters of the data subsequences to determine the flow self-similarity characteristic of the cloud data;
S4, carrying out flow control on the cloud data based on the flow self-similarity characteristic so as to adaptively adjust a transmission flow structure of the cloud data;
in step S1, the cloud data includes a plurality of data records; when the duration of a data record is one second, the data packet padding is not performed, and when the duration of a data record exceeds one second, it is determined that data packets are missing from the data record, and the data packet padding is performed on the data record to obtain the data sequence, where the data packet padding specifically includes:
when the extent of the missing data packets is lower than a first threshold, filling the missing moment with the average of the two data records immediately before and after it;
when the extent of the missing data packets is not lower than the first threshold, filling the data at the missing moment by cubic spline interpolation;
in step S4, the flow self-similarity characteristic is compared with a second threshold to determine whether the cloud data belongs to abnormal traffic, and if so, the IP and the port of the cloud data are adaptively adjusted to control the transmission flow structure, where the second threshold is obtained by calculating statistical values of normal traffic.
2. The method for managing and controlling cloud data flow based on multi-scale self-similar characteristics according to claim 1, wherein in step S2, under the calculation windows of each scale, a data subsequence corresponding to a current scale is sequentially selected from the data sequence in a sliding manner, and each data subsequence selected under the calculation window of the same scale has the same length.
3. The method for managing and controlling the cloud data flow based on the multi-scale self-similar characteristics as claimed in claim 1, wherein in step S3, rescaled range (R/S) analysis is used to obtain the Hurst parameters of the data subsequences under the calculation window of each scale, and a flow self-similarity change curve is drawn from these Hurst parameters as the self-similarity characteristic of the cloud data flow.
4. A cloud data flow management and control system based on multi-scale self-similar characteristics is characterized by comprising:
a preprocessing unit configured to preprocess the acquired cloud data to obtain a data sequence, the preprocessing including data packet padding of the cloud data;
a subsequence selecting unit configured to select data subsequences for determining the self-similarity characteristic from the data sequence in a sliding manner by using a multi-scale calculation window;
a self-similarity characteristic determination unit configured to calculate Hurst parameters of the data subsequences to determine a flow self-similarity characteristic of the cloud data;
a traffic control unit configured to perform traffic control on the cloud data based on the flow self-similarity characteristic so as to adaptively adjust a transmission traffic structure of the cloud data;
the cloud data comprises a plurality of data records, and the preprocessing unit is specifically configured to:
when the duration of a data record is one second, not perform the data packet padding, and when the duration of a data record exceeds one second, determine that data packets are missing from the data record, and perform the data packet padding on the data record to obtain the data sequence, where the data packet padding specifically includes:
when the extent of the missing data packets is lower than a first threshold, filling the missing moment with the average of the two data records immediately before and after it;
when the extent of the missing data packets is not lower than the first threshold, filling the data at the missing moment by cubic spline interpolation;
the traffic control unit is specifically configured to compare the flow self-similarity characteristic with a second threshold to determine whether the cloud data belongs to abnormal traffic, and if so, adaptively adjust the IP and the port of the cloud data to control the transmission traffic structure, where the second threshold is obtained by calculating statistical values of normal traffic.
5. The cloud data flow management and control system based on the multi-scale self-similar characteristic according to claim 4, wherein:
the subsequence selection unit is specifically configured to sequentially select a data subsequence corresponding to the current scale from the data sequence in a sliding manner under the calculation windows of all scales, wherein all the data subsequences selected under the calculation windows of the same scale have the same length;
the self-similarity characteristic determination unit is specifically configured to use rescaled range (R/S) analysis to obtain the Hurst parameters of the data subsequences under the calculation window of each scale, and to draw a flow self-similarity change curve from these Hurst parameters as the self-similarity characteristic of the cloud data flow.
6. A non-transitory computer readable medium storing instructions, wherein the instructions, when executed by a processor, perform the steps of the cloud data flow management and control method based on multi-scale self-similar characteristics according to any one of claims 1-3.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110662050.6A CN113364699B (en) | 2021-06-15 | 2021-06-15 | Cloud data flow management and control method and system based on multi-scale self-similar characteristic |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113364699A (en) | 2021-09-07 |
CN113364699B (en) | 2023-04-07 |
Family
ID=77534263
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110662050.6A Active CN113364699B (en) | 2021-06-15 | 2021-06-15 | Cloud data flow management and control method and system based on multi-scale self-similar characteristic |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113364699B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114049453B (en) * | 2021-11-18 | 2024-04-30 | 中国石油天然气股份有限公司 | Slit plate and modeling method of slit plate model |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102404164A (en) * | 2011-08-09 | 2012-04-04 | 江苏欣网视讯科技有限公司 | Flow analysis method based on ARMA (Autoregressive Moving Average) model and chaotic time sequence model |
WO2015149302A1 (en) * | 2014-04-02 | 2015-10-08 | 中国科学院自动化研究所 | Method for rebuilding tree model on the basis of point cloud and data driving |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7321555B2 (en) * | 2003-04-16 | 2008-01-22 | International Business Machines Corporation | Multilevel analysis of self-similar network traffic |
CN105577473B (en) * | 2015-12-21 | 2019-06-04 | 重庆大学 | A kind of multi-service traffic generating system based on Model of network traffic |
CN109685334B (en) * | 2018-12-10 | 2020-07-10 | 浙江大学 | Novel hydrological model simulation evaluation method based on multi-scale theory |
CN111586075B (en) * | 2020-05-26 | 2022-06-14 | 国家计算机网络与信息安全管理中心 | Hidden channel detection method based on multi-scale stream analysis technology |
Also Published As
Publication number | Publication date |
---|---|
CN113364699A (en) | 2021-09-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
TWI769754B (en) | Method and device for determining target business model based on privacy protection | |
CN112037930B (en) | Infectious disease prediction equipment, method, device and storage medium | |
CN111126622A (en) | Data anomaly detection method and device | |
CN110149237B (en) | Hadoop platform computing node load prediction method | |
CN111526119B (en) | Abnormal flow detection method and device, electronic equipment and computer readable medium | |
CN110874744B (en) | Data anomaly detection method and device | |
CN112468326A (en) | Access flow prediction method based on time convolution neural network | |
CN110460458A (en) | Based on multistage markovian Traffic anomaly detection method | |
CN111541626A (en) | Network bandwidth updating method and device, electronic equipment and storage medium | |
CN113364699B (en) | Cloud data flow management and control method and system based on multi-scale self-similar characteristic | |
CN111651421B (en) | Improved Rsync method, device and information synchronization system | |
CN109242250A (en) | A kind of user's behavior confidence level detection method based on Based on Entropy method and cloud model | |
CN110059894A (en) | Equipment state assessment method, apparatus, system and storage medium | |
CN112907128A (en) | Data analysis method, device, equipment and medium based on AB test result | |
CN114528190B (en) | Single index abnormality detection method and device, electronic equipment and readable storage medium | |
CN109065176B (en) | Blood glucose prediction method, device, terminal and storage medium | |
CN108399415B (en) | Self-adaptive data acquisition method based on life cycle stage of equipment | |
CN117009903A (en) | Data anomaly detection method, device, equipment and storage medium | |
CN116938683A (en) | Network path analysis system and method based on network security anomaly detection | |
CN107688862A (en) | Insulator equivalent salt density accumulation rate Forecasting Methodology based on BA GRNN | |
CN112988527A (en) | GPU management platform anomaly detection method and device and storage medium | |
Weber et al. | Multicomponent reaction-diffusion processes on complex networks | |
Drieieva et al. | Method of Fractal Traffic Generation by a Model of Generator on the Graph. | |
CN114124725A (en) | Quantum communication network reliability comprehensive evaluation method based on complex network model | |
Wang et al. | Community detection with self-adapting switching based on affinity |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |
| | TR01 | Transfer of patent right | Effective date of registration: 20231113; Address after: 106, 1st Floor, Block B, Building 16, Enji Xiyuan Industrial Park, No.1 Liangjiadian, Fuwai, Haidian District, Beijing, 100142; Patentee after: BEIJING WONDERSOFT TECHNOLOGY Corp.,Ltd.; Patentee after: Beijing Mingchao Xin'an Technology Co.,Ltd.; Address before: 100142 block B, building 16, Enji West Industrial Park, No.1, liangjiadian, Fuwai, Haidian District, Beijing; Patentee before: BEIJING WONDERSOFT TECHNOLOGY Corp.,Ltd. |