CN113364699B - Cloud data flow management and control method and system based on multi-scale self-similar characteristic

Info

Publication number
CN113364699B
Authority
CN
China
Prior art keywords
data
self
cloud data
traffic
scale
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110662050.6A
Other languages
Chinese (zh)
Other versions
CN113364699A (en)
Inventor
吴阳
喻波
王志海
董爱华
安鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Mingchao Xin'an Technology Co.,Ltd.
Beijing Wondersoft Technology Co Ltd
Original Assignee
Beijing Wondersoft Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Wondersoft Technology Co Ltd
Priority to CN202110662050.6A
Publication of CN113364699A
Application granted
Publication of CN113364699B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/10 Flow control; Congestion control
    • H04L47/20 Traffic policing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/10 Flow control; Congestion control
    • H04L47/24 Traffic characterised by specific attributes, e.g. priority or QoS
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/10 Flow control; Congestion control
    • H04L47/27 Evaluation or update of window size, e.g. using information derived from acknowledged [ACK] packets
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1425 Traffic logging, e.g. anomaly detection
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00 Reducing energy consumption in communication networks
    • Y02D30/50 Reducing energy consumption in communication networks in wire-line communication networks, e.g. low power modes or reduced link rate

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention provides a cloud data flow control method and system based on multi-scale self-similarity characteristics. The method comprises the following steps: S1, preprocessing acquired cloud data to obtain a data sequence, wherein the preprocessing comprises data packet filling of the cloud data; S2, selecting, in a sliding manner and using multi-scale calculation windows, data subsequences from the data sequence for determining the self-similarity characteristic; S3, calculating the Hurst parameter of each data subsequence to determine the traffic self-similarity characteristic of the cloud data; and S4, performing flow control on the cloud data based on the traffic self-similarity characteristic so as to adaptively adjust the transmission traffic structure of the cloud data.

Description

Cloud data flow management and control method and system based on multi-scale self-similar characteristic
Technical Field
The invention relates to the field of flow control, in particular to a cloud data flow control method and system based on multi-scale self-similarity characteristics.
Background
Increasingly diverse services and rapidly growing network traffic require the Internet to provide higher quality of service. For services with very demanding requirements on bandwidth, network delay, and similar metrics, the best-effort service provided by the conventional Internet cannot meet service requirements. Although network speed and user experience have improved greatly as network technology has developed, the volume of data transmitted over the network has grown at the same time, so guaranteeing quality of service remains a bottleneck at the present stage.
Cloud data is a general term for the technologies and platforms for data integration, data analysis, data distribution, and data early warning built on the cloud computing business model. Network traffic control is a method of controlling network traffic using software or hardware. The most important approach is to introduce the concept of QoS (quality of service), marking different types of network data packets to determine the priority with which they pass. These traditional approaches cannot control massive cloud data traffic rapidly and securely.
Disclosure of Invention
The invention provides a cloud data traffic control scheme based on multi-scale self-similarity characteristics, which solves the problem in the prior art that massive cloud data traffic cannot be controlled rapidly and securely.
A first aspect of the invention provides a cloud data flow control method based on multi-scale self-similarity characteristics. The method comprises the following steps: S1, preprocessing acquired cloud data to obtain a data sequence, wherein the preprocessing comprises data packet filling of the cloud data; S2, selecting, in a sliding manner and using multi-scale calculation windows, data subsequences from the data sequence for determining the self-similarity characteristic; S3, calculating the Hurst parameter of each data subsequence to determine the traffic self-similarity characteristic of the cloud data; and S4, performing flow control on the cloud data based on the traffic self-similarity characteristic so as to adaptively adjust the transmission traffic structure of the cloud data.
According to the method provided by the first aspect of the present invention, in step S1, the cloud data includes a plurality of data records. When a data record spans one second, no data packet filling is performed; when a data record spans more than one second, it is determined that data packets are missing from the record, and data packet filling is performed on the record to obtain the data sequence. The data packet filling specifically includes: when the amount of missing data is below a first threshold, filling the missing time with the mean of the two data records immediately before and after it; and when the amount of missing data is not below the first threshold, filling the missing time using cubic spline interpolation.
According to the method provided by the first aspect of the present invention, in step S2, in the calculation windows of the respective scales, the data subsequences corresponding to the current scale are sequentially selected from the data sequences in a sliding manner, and the data subsequences selected in the calculation windows of the same scale have the same length.
According to the method provided by the first aspect of the present invention, in step S3, rescaled range (R/S) analysis is used to calculate the Hurst parameters of the data subsequences under the calculation windows of the various scales, and a traffic self-similarity change curve is drawn from them as the self-similarity feature of the cloud data traffic.
According to the method provided by the first aspect of the present invention, in step S4, the traffic self-similarity characteristic is compared with a second threshold to determine whether the cloud data is abnormal traffic; if so, the IP and the port of the cloud data are adaptively adjusted to control the transmission traffic structure, where the second threshold is a statistical value calculated from normal traffic.
A second aspect of the invention provides a cloud data flow management and control system based on multi-scale self-similar characteristics. The system comprises: a preprocessing unit configured to preprocess the acquired cloud data to obtain a data sequence, the preprocessing including data packet filling of the cloud data; a subsequence selecting unit configured to select, in a sliding manner and using multi-scale calculation windows, data subsequences from the data sequence for determining the self-similarity characteristic; a self-similarity characteristic determination unit configured to calculate the Hurst parameter of each data subsequence to determine the traffic self-similarity characteristic of the cloud data; and a traffic control unit configured to perform traffic control on the cloud data based on the traffic self-similarity characteristic so as to adaptively adjust the transmission traffic structure of the cloud data.
According to the system provided by the second aspect of the present invention, the cloud data includes a plurality of data records, and the preprocessing unit is specifically configured as follows. When a data record spans one second, no data packet filling is performed; when a data record spans more than one second, it is determined that data packets are missing from the record, and data packet filling is performed on the record to obtain the data sequence. The data packet filling specifically includes: when the amount of missing data is below a first threshold, filling the missing time with the mean of the two data records immediately before and after it; and when the amount of missing data is not below the first threshold, filling the missing time using cubic spline interpolation.
According to the system provided by the second aspect of the present invention, the subsequence selecting unit is specifically configured to, under the calculation windows of the respective scales, sequentially select, in a sliding manner, the data subsequence corresponding to the current scale from the data sequence, and each data subsequence selected under the calculation window of the same scale has the same length.
According to the system provided by the second aspect of the present invention, the self-similarity characteristic determination unit is specifically configured to use rescaled range (R/S) analysis to calculate the Hurst parameters of the data subsequences under the calculation windows of the various scales, and to draw a traffic self-similarity change curve from them as the self-similarity feature of the cloud data traffic.
According to the system provided by the second aspect of the present invention, the traffic management and control unit is specifically configured to compare the traffic self-similarity characteristic with a second threshold to determine whether the cloud data is abnormal traffic and, if so, to adaptively adjust the IP and the port of the cloud data to control the transmission traffic structure, where the second threshold is a statistical value calculated from normal traffic.
A third aspect of the present invention provides a non-transitory computer readable medium storing instructions which, when executed by a processor, perform the steps in a method for cloud data traffic management based on multi-scale self-similar characteristics according to the first aspect of the present invention.
In summary, in the technical scheme provided by the invention, self-similarity is calculated for the cloud data traffic at different time scales to obtain traffic characteristics under multiple time scales, and the threshold of the traffic self-similarity index is set according to the characteristics of each time scale. The scheme is therefore suited to self-similarity calculation of data traffic in different application environments, and it solves the problem that massive cloud data traffic cannot be controlled rapidly and securely.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the embodiments or the description in the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
Fig. 1 is a flowchart of a cloud data traffic control method based on multi-scale self-similar characteristics according to an embodiment of the present invention;
Fig. 2 is a schematic flow chart according to a first embodiment of the present invention;
Fig. 3 is a schematic flow chart according to a second embodiment of the present invention; and
Fig. 4 is a structural diagram of a cloud data traffic management and control system based on multi-scale self-similar characteristics according to an embodiment of the present invention.
Detailed Description
The technical solutions of the present invention will be described clearly and completely with reference to the accompanying drawings, and it should be understood that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
A first aspect of the invention provides a cloud data flow control method based on multi-scale self-similarity characteristics. Fig. 1 is a flowchart of a cloud data traffic control method based on multi-scale self-similar characteristics according to an embodiment of the present invention. As shown in Fig. 1, the method includes: S1, preprocessing acquired cloud data to obtain a data sequence, wherein the preprocessing comprises data packet filling of the cloud data; S2, selecting, in a sliding manner and using multi-scale calculation windows, data subsequences from the data sequence for determining the self-similarity characteristic; S3, calculating the Hurst parameter of each data subsequence to determine the traffic self-similarity characteristic of the cloud data; and S4, performing flow control on the cloud data based on the traffic self-similarity characteristic so as to adaptively adjust the transmission traffic structure of the cloud data.
In step S1, the collected cloud data is preprocessed to obtain a data sequence, where the preprocessing includes data packet stuffing of the cloud data.
In some embodiments, in step S1, the cloud data includes a plurality of data records. When a data record spans one second, no data packet filling is performed; when a data record spans more than one second, it is determined that data packets are missing from the record, and data packet filling is performed on the record to obtain the data sequence. The data packet filling specifically includes: when the amount of missing data is below a first threshold, filling the missing time with the mean of the two data records immediately before and after it; and when the amount of missing data is not below the first threshold, filling the missing time using cubic spline interpolation.
Specifically, collected cloud data may contain missing time points, and missing or duplicated packet timestamps must be repaired during preprocessing. If no packets arrive during a given second, the statistics become distorted. If a record spans more than one second, the intermediate seconds contain no data, and the record's right-hand (end) time can be used as its unique identifying time. When little data is missing, which mainly means that the data immediately before and after the missing time are complete, the mean of those neighbouring values is used directly as the value at the missing time; otherwise, cubic spline interpolation is used to complete the data. Cubic spline interpolation, often called simply spline interpolation, offers a reasonable compromise between flexibility and computational speed compared with other interpolation methods. It requires less computation and storage than higher-order splines and is more stable. Compared with quadratic splines, cubic splines model arbitrary shapes more flexibly and yield more accurate interpolated values.
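The short-gap rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: records are assumed to be keyed by integer second, the name `fill_short_gaps` is hypothetical, and the quantity compared with the first threshold is assumed to be the number of consecutive missing seconds (the text does not define it).

```python
def fill_short_gaps(records, first_threshold=2):
    """Fill short runs of missing seconds with the mean of their neighbours.

    records: dict mapping integer second -> traffic value (assumed layout).
    first_threshold: assumed to be a gap length in seconds; gaps at or above
    it are left untouched here so that spline interpolation can handle them.
    """
    seconds = sorted(records)
    filled = dict(records)
    for left, right in zip(seconds, seconds[1:]):
        gap = right - left - 1  # number of seconds with no record
        if 0 < gap < first_threshold:
            mean = (records[left] + records[right]) / 2.0
            for t in range(left + 1, right):
                filled[t] = mean
    return filled


# Second 2 is a short gap and is filled; seconds 4 to 6 form a long gap.
series = fill_short_gaps({0: 10.0, 1: 20.0, 3: 40.0, 7: 80.0})
```

Leaving long gaps untouched keeps the two filling strategies separate, so the spline step only ever sees the gaps it is meant to handle.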
For an interval [a, b], cubic spline interpolation divides the interval into n subintervals:
$a = x_0 < x_1 < \cdots < x_{n-1} < x_n = b$
An unknown function f is approximated from the n+1 known points, and cubic spline interpolation does so piecewise. The resulting piecewise function must satisfy the following conditions, which are also the conditions used to solve for each spline segment:
(1) The interpolating function equals f at every known point;
(2) The interpolating piecewise function is second-order continuous, i.e. the first and second derivatives of adjacent segments are equal where the segments meet;
(3) The second derivative at the end points a and b, or the behaviour of the second derivative at the n+1 points, must be known.
The following are known:
a. n+1 data points (x_i, y_i), i = 0, 1, …, n;
b. each segment is a cubic polynomial curve;
c. the curve is second-order continuous at the nodes;
d. the boundary behaviour at the left and right end points (natural boundary, clamped boundary, or not-a-knot boundary).
The coefficients of each spline segment are then solved from the given points and these conditions to obtain the explicit expression of each curve.
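As an illustration of solving the segment coefficients, the sketch below builds a spline for the natural boundary case (second derivative zero at both end points) using the standard tridiagonal elimination. The choice of the natural boundary and the name `natural_cubic_spline` are assumptions made for this example; the text does not say which boundary condition is used.

```python
import bisect


def natural_cubic_spline(xs, ys):
    """Return a function evaluating the natural cubic spline through (xs, ys).

    xs must be strictly increasing and contain at least two points.
    Segment j is y_j + b_j*dx + c_j*dx^2 + d_j*dx^3 with dx = x - x_j.
    """
    n = len(xs) - 1
    h = [xs[i + 1] - xs[i] for i in range(n)]
    # Right-hand side of the tridiagonal system for the c coefficients.
    alpha = [0.0] * (n + 1)
    for i in range(1, n):
        alpha[i] = 3.0 * (ys[i + 1] - ys[i]) / h[i] - 3.0 * (ys[i] - ys[i - 1]) / h[i - 1]
    # Forward elimination; mu[0] = z[0] = 0 encodes the natural boundary.
    mu = [0.0] * (n + 1)
    z = [0.0] * (n + 1)
    for i in range(1, n):
        pivot = 2.0 * (xs[i + 1] - xs[i - 1]) - h[i - 1] * mu[i - 1]
        mu[i] = h[i] / pivot
        z[i] = (alpha[i] - h[i - 1] * z[i - 1]) / pivot
    # Back substitution for the remaining coefficients of each segment.
    b = [0.0] * n
    c = [0.0] * (n + 1)
    d = [0.0] * n
    for j in range(n - 1, -1, -1):
        c[j] = z[j] - mu[j] * c[j + 1]
        b[j] = (ys[j + 1] - ys[j]) / h[j] - h[j] * (c[j + 1] + 2.0 * c[j]) / 3.0
        d[j] = (c[j + 1] - c[j]) / (3.0 * h[j])

    def evaluate(x):
        # Pick the segment containing x, clamping to the first or last one.
        j = max(0, min(n - 1, bisect.bisect_right(xs, x) - 1))
        dx = x - xs[j]
        return ys[j] + b[j] * dx + c[j] * dx ** 2 + d[j] * dx ** 3

    return evaluate


# Estimate the value at second 3 from the surrounding known seconds.
spline = natural_cubic_spline([0, 1, 2, 4, 5], [0.0, 2.0, 4.0, 8.0, 10.0])
estimate = spline(3)
```

A clamped or not-a-knot boundary would change only the first and last rows of the linear system; the per-segment evaluation stays the same.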
In step S2, a data subsequence for determining the self-similarity characteristic is selected from the data sequences in a sliding manner by using a multi-scale calculation window.
In some embodiments, in step S2, under the calculation windows of the respective scales, the data subsequences corresponding to the current scale are sequentially selected from the data sequences in a sliding manner, and the data subsequences selected under the calculation windows of the same scale have the same length.
Specifically, the traffic self-similarity calculation is based on a multi-scale approach: after preprocessing, the traffic data must be converted into a multi-scale representation. This is done mainly in support of the Hurst method, with subsequences taken from the whole data sequence one after another using a sliding window. Every subsequence taken by a given sliding window has the same length, ensuring that every H value is computed from the same number of samples and is therefore comparably accurate.
The Hurst parameter is the most widely used and most widely recognized mathematical parameter for characterizing self-similar models. It reflects the outcome of a long series of interrelated events, and the self-similarity characteristics of different types of network traffic can be obtained by calculating the Hurst parameters of normal and abnormal traffic. Self-similarity means that changing the time scale does not change the statistical characteristics, so local statistical properties of network traffic can be used to approximate its overall properties. The Hurst parameter H is the single index used to characterize a self-similar process. For network traffic, when H lies in the interval (0.5, 1) the traffic exhibits self-similarity, and the closer H is to 1, the stronger the self-similarity. When H = 0.5, the traffic is random, with no self-similarity across time scales. When H lies in the interval (0, 0.5), the traffic tends to have statistical behaviour opposite to that of the preceding period.
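The sliding selection of equal-length subsequences at several scales can be sketched as follows. The function names, the `step` parameter, and the example scales are illustrative assumptions; the text does not specify window sizes or stride.

```python
def sliding_subsequences(sequence, window, step=1):
    """Yield consecutive subsequences of exactly `window` samples."""
    for start in range(0, len(sequence) - window + 1, step):
        yield sequence[start:start + window]


def multiscale_subsequences(sequence, scales, step=1):
    """Map each window length (scale) to its list of sliding subsequences."""
    return {w: list(sliding_subsequences(sequence, w, step)) for w in scales}


# Two scales over a ten-sample sequence, advancing two samples at a time.
windows = multiscale_subsequences(list(range(10)), scales=[4, 8], step=2)
```

Within one scale every subsequence has the same length, so the H values computed from them are directly comparable.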
In step S3, a Hurst parameter of each data subsequence is calculated to determine a traffic self-similarity characteristic of the cloud data.
In some embodiments, a flow self-similarity variation curve is drawn as a self-similarity feature of the cloud data flow based on Hurst parameters of the data subsequences under the calculation windows of the respective scales.
Specifically, the Hurst parameter H of each data subsequence is calculated using rescaled range (R/S) analysis, and a traffic self-similarity change curve can be drawn from the sequence of H values calculated in turn.
The Hurst parameter based on rescaled range (R/S) analysis serves as an index for judging whether time-series data follow a random walk or a biased random walk. Systems described by Hurst statistics do not require the assumption of independent random events that probability statistics usually makes; the Hurst parameter reflects the outcome of a long series of interrelated events.
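A minimal sketch of a rescaled range (R/S) estimate of H follows. It uses the common textbook procedure: split the series into non-overlapping chunks of doubling size, average R/S per chunk size, and take the slope of log(R/S) against log(size). The chunking scheme, the `min_chunk` parameter, and the function names are assumptions for illustration; the text does not give these details, and R/S estimates are known to be biased for short series.

```python
import math


def rescaled_range(chunk):
    """Return R/S for one chunk, or None if the chunk is constant."""
    n = len(chunk)
    mean = sum(chunk) / n
    cumulative = lowest = highest = 0.0
    for value in chunk:
        cumulative += value - mean  # running sum of deviations from the mean
        lowest = min(lowest, cumulative)
        highest = max(highest, cumulative)
    std = math.sqrt(sum((v - mean) ** 2 for v in chunk) / n)
    return (highest - lowest) / std if std > 0 else None


def hurst_rs(series, min_chunk=8):
    """Estimate H as the least-squares slope of log(R/S) versus log(size)."""
    n = len(series)
    log_sizes, log_rs = [], []
    size = min_chunk
    while size <= n // 2:
        values = []
        for start in range(0, n - size + 1, size):
            rs = rescaled_range(series[start:start + size])
            if rs is not None:
                values.append(rs)
        if values:
            log_sizes.append(math.log(size))
            log_rs.append(math.log(sum(values) / len(values)))
        size *= 2
    if len(log_sizes) < 2:
        raise ValueError("series too short for an R/S estimate")
    mx = sum(log_sizes) / len(log_sizes)
    my = sum(log_rs) / len(log_rs)
    num = sum((x - mx) * (y - my) for x, y in zip(log_sizes, log_rs))
    den = sum((x - mx) ** 2 for x in log_sizes)
    return num / den
```

Applying `hurst_rs` to each subsequence taken by a sliding window, in order, would give the sequence of H values from which a self-similarity change curve could be drawn.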
The Hurst parameter falls into three cases:
(1) If H = 0.5, the time series can be described by a random walk;
(2) If 0.5 < H < 1, the time series has long-term memory (persistence);
(3) If 0 ≤ H < 0.5, the series shows pink noise (anti-persistence), i.e., a mean-reverting process.
That is, whenever H ≠ 0.5, the time-series data can be described by biased Brownian motion (fractional Brownian motion).
The main methods for calculating the Hurst parameter are: the aggregated variance method, the R/S analysis method, the periodogram method, the absolute value method, the variance of residuals method, the wavelet analysis method, and the Whittle method.
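The three cases can be expressed as a small helper. Because an estimated H is never exactly 0.5, the sketch uses a tolerance band around 0.5; the `tol` value and the label strings are assumptions for illustration only.

```python
def classify_hurst(h, tol=0.02):
    """Map an estimated Hurst parameter to one of the three regimes.

    tol is an assumed tolerance for treating an estimate as H = 0.5.
    """
    if abs(h - 0.5) <= tol:
        return "random walk"
    if h > 0.5:
        return "persistent (long-term memory)"
    return "anti-persistent (mean-reverting)"


labels = [classify_hurst(h) for h in (0.2, 0.5, 0.8)]
```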
In step S4, traffic control is performed on the cloud data based on the traffic self-similarity characteristic, so as to adaptively adjust a transmission traffic structure of the cloud data.
In some embodiments, in step S4, the traffic self-similarity characteristic is compared with a second threshold to determine whether the cloud data is abnormal traffic; if so, the IP and the port of the cloud data are adaptively adjusted to control the transmission traffic structure, where the second threshold is a statistical value calculated from normal traffic.
Specifically, because traffic data has the general characteristics of a signal, traffic anomaly detection can be treated as a signal detection problem, and anomaly detection is implemented on that basis. The threshold used for the anomaly decision (the second threshold) is a statistical value calculated from normal traffic and is obtained using the Monte Carlo method. The final detection result is obtained by comparison with this threshold. The IP and port of abnormal traffic are restricted to adjust the traffic structure of cloud data transmission; the controlled data traffic is then fed back into preprocessing, and detection and control are performed again.
The Monte Carlo method, also known as the statistical simulation method, uses random numbers (or, more commonly, pseudo-random numbers) to solve computational problems. The engineering problems it addresses fall into two types: deterministic problems and stochastic problems. The solution steps are as follows:
(1) Construct a probabilistic or stochastic model for the problem such that the solution corresponds to certain characteristics (such as probability, mean, or variance) of random variables in the model; the constructed model should agree with the actual problem or system in its main characteristic parameters;
(2) Generate random numbers on a computer according to the distribution of each random variable in the model, producing as many random numbers as one simulation run requires. Typically, uniformly distributed random numbers are generated first and then transformed into random numbers following the required distribution, after which the random simulation experiment can be carried out;
(3) According to the characteristics of the probability model and the distributions of the random variables, design and select a suitable sampling method and sample each random variable (methods include direct sampling, stratified sampling, correlated sampling, importance sampling, and so on);
(4) Run the simulation experiment and calculation according to the established model to obtain a random solution of the problem;
(5) Statistically analyze the simulation results, and give a probabilistic solution of the problem together with an estimate of its precision.
The present invention uses the Monte Carlo method to determine the threshold. The Monte Carlo method can compute probability distributions and numerical characteristics of complex random variables, estimate the reliability of systems and components through random simulation, simulate random processes, search for optimal system parameters, and so on.
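One way a threshold could be derived from normal traffic by random simulation is sketched below: draw random windows from a recorded normal-traffic series, compute a statistic on each, and take an upper quantile of the results. The resampling scheme, the 0.95 quantile, and every name here are assumptions for illustration; the text states only that the threshold is a statistical value of normal traffic obtained with the Monte Carlo method. In practice `statistic` would be a Hurst estimator; a simple mean stands in for it in the usage line.

```python
import random


def monte_carlo_threshold(normal_series, statistic, window,
                          trials=200, quantile=0.95, seed=0):
    """Estimate a threshold as a quantile of a statistic over random windows.

    statistic: callable taking a list of samples and returning a float.
    """
    max_start = len(normal_series) - window
    if max_start < 0:
        raise ValueError("window is longer than the series")
    rng = random.Random(seed)  # seeded so that repeated runs agree
    samples = []
    for _ in range(trials):
        start = rng.randint(0, max_start)  # inclusive on both ends
        samples.append(statistic(normal_series[start:start + window]))
    samples.sort()
    index = min(len(samples) - 1, int(quantile * len(samples)))
    return samples[index]


# Stand-in statistic: the window mean of a constant "normal" series.
threshold = monte_carlo_threshold([1.0] * 50, lambda w: sum(w) / len(w), window=10)
```

Raising `trials` narrows the sampling error of the quantile at the cost of more statistic evaluations.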
Fig. 2 is a schematic flow chart according to a first embodiment of the present invention. As shown in Fig. 2, the cloud data is first preprocessed; the traffic self-similarity index is then calculated over calculation windows of different time lengths, and traffic characteristics are extracted from the self-similarity. Whether the traffic self-similarity is normal is then judged against a threshold T. If so, the traffic is normal; if not, the traffic is abnormal, and a control measure is taken that reduces the amount of data received from and forwarded to the abnormal IP and port. After control, the cloud data is preprocessed again to determine whether abnormal traffic is still present.
Fig. 3 is a schematic flow chart according to a second embodiment of the present invention. As shown in Fig. 3, this embodiment performs traffic anomaly detection on cloud data in order to control abnormal traffic. The flow is: cloud data sampling -> data preprocessing -> extracting traffic sequences of various time lengths -> traffic self-similarity calculation -> anomaly detection -> traffic control. After the abnormal cloud data traffic has been controlled and limited, anomaly detection is performed on the traffic again, and the detection threshold and the limiting ratio are adjusted to optimize the traffic control effect.
A second aspect of the invention provides a cloud data flow management and control system based on multi-scale self-similar characteristics. Fig. 4 is a structural diagram of a cloud data traffic management system based on multi-scale self-similar characteristics according to an embodiment of the present invention. As shown in Fig. 4, the system 400 includes: a preprocessing unit 401 configured to preprocess the acquired cloud data to obtain a data sequence, where the preprocessing includes data packet filling of the cloud data; a subsequence selecting unit 402 configured to select, in a sliding manner and using multi-scale calculation windows, data subsequences from the data sequence for determining the self-similarity characteristic; a self-similarity characteristic determination unit 403 configured to calculate the Hurst parameter of each data subsequence to determine the traffic self-similarity characteristic of the cloud data; and a traffic control unit 404 configured to perform traffic control on the cloud data based on the traffic self-similarity characteristic so as to adaptively adjust the transmission traffic structure of the cloud data.
According to the system provided by the second aspect of the present invention, the cloud data includes a plurality of data records, and the preprocessing unit 401 is specifically configured as follows. When a data record spans one second, no data packet filling is performed; when a data record spans more than one second, it is determined that data packets are missing from the record, and data packet filling is performed on the record to obtain the data sequence. The data packet filling specifically includes: when the amount of missing data is below a first threshold, filling the missing time with the mean of the two data records immediately before and after it; and when the amount of missing data is not below the first threshold, filling the missing time using cubic spline interpolation.
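The decision step of Fig. 2 can be sketched as one control round over per-endpoint H values. Several things here are assumptions: the text does not say on which side of T traffic counts as abnormal (this sketch treats H above the threshold as abnormal), the limiting ratio of 0.5 is arbitrary, and `control_round`, the `(ip, port)` keys, and the `limits` mapping are hypothetical names.

```python
def control_round(hurst_by_endpoint, threshold, limits, reduction=0.5):
    """Tighten the allowed share for endpoints judged abnormal.

    hurst_by_endpoint: dict mapping (ip, port) -> estimated H.
    limits: dict mapping (ip, port) -> allowed fraction of traffic,
            updated in place; endpoints not present start at 1.0.
    Returns the list of endpoints judged abnormal in this round.
    """
    abnormal = []
    for endpoint, h in hurst_by_endpoint.items():
        if h > threshold:  # assumed direction of the comparison
            limits[endpoint] = limits.get(endpoint, 1.0) * reduction
            abnormal.append(endpoint)
    return abnormal


limits = {}
observed = {("10.0.0.1", 443): 0.9, ("10.0.0.2", 80): 0.6}
flagged = control_round(observed, threshold=0.8, limits=limits)
```

Calling `control_round` again after re-measuring corresponds to the loop back to preprocessing: an endpoint that is still abnormal has its allowed share reduced further.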
According to the system provided by the second aspect of the present invention, the subsequence selecting unit 402 is specifically configured to, under the calculation windows of the respective scales, sequentially select, in a sliding manner, the data subsequence corresponding to the current scale from the data sequence, where each data subsequence selected under the calculation window of the same scale has the same length.
According to the system provided by the second aspect of the present invention, the self-similarity characteristic determining unit 403 is specifically configured to use rescaled range (R/S) analysis to calculate the Hurst parameters of the data subsequences under the calculation windows of the various scales, and to draw a traffic self-similarity change curve from them as the self-similarity feature of the cloud data traffic.
According to the system provided by the second aspect of the present invention, the traffic management and control unit 404 is specifically configured to compare the traffic self-similarity characteristic with a second threshold to determine whether the cloud data is abnormal traffic and, if so, to adaptively adjust the IP and the port of the cloud data to control the transmission traffic structure, where the second threshold is a statistical value calculated from normal traffic.
A third aspect of the present invention provides a non-transitory computer readable medium storing instructions that, when executed by a processor, perform the steps of a method for cloud data traffic management based on multi-scale self-similar characteristics according to the first aspect of the present invention.
In summary, in the technical scheme provided by the present invention, the cloud data traffic is subjected to self-similarity calculation according to different time scales to obtain traffic characteristics under multiple time scales, and the threshold of the traffic self-similarity index is set according to the characteristics of different time scales, so that the method is suitable for the self-similarity calculation of data traffic under different application environments, and the problem that rapid security control cannot be implemented on massive cloud data traffic is solved.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and these modifications or substitutions do not depart from the spirit of the corresponding technical solutions of the embodiments of the present invention.

Claims (6)

1. A cloud data flow control method based on multi-scale self-similar characteristics is characterized by comprising the following steps:
S1, preprocessing acquired cloud data to obtain a data sequence, wherein the preprocessing comprises data packet padding of the cloud data;
S2, selecting a data subsequence for determining the self-similarity characteristic from the data sequence in a sliding manner by using a multi-scale calculation window;
S3, calculating the Hurst parameters of the data subsequences to determine the flow self-similarity characteristic of the cloud data;
S4, carrying out flow control on the cloud data based on the flow self-similarity characteristic so as to adaptively adjust a transmission flow structure of the cloud data;
in step S1, the cloud data includes a plurality of data records; when the duration of a data record is one second, the data packet padding is not performed, and when the duration of the data record exceeds one second, it is determined that the data record has missing data packets, and the data packet padding is performed on the data record to obtain the data sequence, where the data packet padding specifically includes:
when the data packet loss is lower than a first threshold, filling the missing moment with the average value of the two data records before and after the missing moment;
when the data packet loss is not lower than the first threshold, filling the data at the missing moment by using a cubic spline interpolation method;
in step S4, the traffic self-similarity characteristic is compared with a second threshold to determine whether the cloud data is abnormal traffic, and if so, the IP and the port of the cloud data are adaptively adjusted to control the transmission traffic structure, where the second threshold is a threshold computed from statistical values of normal traffic.
2. The method for managing and controlling cloud data flow based on multi-scale self-similar characteristics according to claim 1, wherein in step S2, under the calculation windows of each scale, a data subsequence corresponding to a current scale is sequentially selected from the data sequence in a sliding manner, and each data subsequence selected under the calculation window of the same scale has the same length.
3. The method for managing and controlling the cloud data flow based on the multi-scale self-similar characteristics as claimed in claim 1, wherein in step S3, a flow self-similarity change curve is drawn as the self-similarity characteristic of the cloud data flow, based on the Hurst parameters of the data subsequences under the calculation windows of each scale, by using a rescaled range (R/S) analysis method.
4. A cloud data flow management and control system based on multi-scale self-similar characteristics is characterized by comprising:
a preprocessing unit configured to preprocess the acquired cloud data to obtain a data sequence, the preprocessing including data packet padding of the cloud data;
a subsequence selecting unit configured to select a data subsequence for determining the self-similarity characteristic from the data sequence in a sliding manner by using a multi-scale calculation window;
a self-similarity characteristic determination unit configured to calculate Hurst parameters of the data subsequences to determine a flow self-similarity characteristic of the cloud data;
a traffic control unit configured to perform traffic control on the cloud data based on the traffic self-similarity characteristic so as to adaptively adjust a transmission traffic structure of the cloud data;
the cloud data comprises a plurality of data records, and the preprocessing unit is specifically configured to:
when the duration of a data record is one second, not perform the data packet padding, and when the duration of the data record exceeds one second, determine that the data record has missing data packets and perform the data packet padding on the data record to obtain the data sequence, where the data packet padding specifically includes:
when the data packet loss is lower than a first threshold, filling the missing moment with the average value of the two data records before and after the missing moment;
when the data packet loss is not lower than the first threshold, filling the data at the missing moment by using a cubic spline interpolation method;
the traffic control unit is specifically configured to compare the traffic self-similarity characteristic with a second threshold to determine whether the cloud data is abnormal traffic, and if so, adaptively adjust the IP and the port of the cloud data to control the transmission traffic structure, where the second threshold is a threshold computed from statistical values of normal traffic.
5. The cloud data flow management and control system based on the multi-scale self-similar characteristic according to claim 4, wherein:
the subsequence selection unit is specifically configured to sequentially select a data subsequence corresponding to the current scale from the data sequence in a sliding manner under the calculation windows of all scales, wherein all the data subsequences selected under the calculation windows of the same scale have the same length;
the self-similarity characteristic determination unit is specifically configured to draw a flow self-similarity change curve as the self-similarity characteristic of the cloud data flow, based on the Hurst parameters of the data subsequences under the calculation window of each scale, by using a rescaled range (R/S) analysis method.
6. A non-transitory computer readable medium storing instructions, wherein the instructions, when executed by a processor, perform the steps of the cloud data flow management and control method based on multi-scale self-similar characteristics according to any one of claims 1-3.
CN202110662050.6A 2021-06-15 2021-06-15 Cloud data flow management and control method and system based on multi-scale self-similar characteristic Active CN113364699B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110662050.6A CN113364699B (en) 2021-06-15 2021-06-15 Cloud data flow management and control method and system based on multi-scale self-similar characteristic

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110662050.6A CN113364699B (en) 2021-06-15 2021-06-15 Cloud data flow management and control method and system based on multi-scale self-similar characteristic

Publications (2)

Publication Number Publication Date
CN113364699A CN113364699A (en) 2021-09-07
CN113364699B true CN113364699B (en) 2023-04-07

Family

ID=77534263

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110662050.6A Active CN113364699B (en) 2021-06-15 2021-06-15 Cloud data flow management and control method and system based on multi-scale self-similar characteristic

Country Status (1)

Country Link
CN (1) CN113364699B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114049453B (en) * 2021-11-18 2024-04-30 中国石油天然气股份有限公司 Slit plate and modeling method of slit plate model

Citations (2)

Publication number Priority date Publication date Assignee Title
CN102404164A (en) * 2011-08-09 2012-04-04 江苏欣网视讯科技有限公司 Flow analysis method based on ARMA (Autoregressive Moving Average) model and chaotic time sequence model
WO2015149302A1 (en) * 2014-04-02 2015-10-08 中国科学院自动化研究所 Method for rebuilding tree model on the basis of point cloud and data driving

Family Cites Families (4)

Publication number Priority date Publication date Assignee Title
US7321555B2 (en) * 2003-04-16 2008-01-22 International Business Machines Corporation Multilevel analysis of self-similar network traffic
CN105577473B (en) * 2015-12-21 2019-06-04 重庆大学 A kind of multi-service traffic generating system based on Model of network traffic
CN109685334B (en) * 2018-12-10 2020-07-10 浙江大学 Novel hydrological model simulation evaluation method based on multi-scale theory
CN111586075B (en) * 2020-05-26 2022-06-14 国家计算机网络与信息安全管理中心 Hidden channel detection method based on multi-scale stream analysis technology

Also Published As

Publication number Publication date
CN113364699A (en) 2021-09-07

Similar Documents

Publication Publication Date Title
TWI769754B (en) Method and device for determining target business model based on privacy protection
CN112037930B (en) Infectious disease prediction equipment, method, device and storage medium
CN111126622A (en) Data anomaly detection method and device
CN110149237B (en) Hadoop platform computing node load prediction method
CN111526119B (en) Abnormal flow detection method and device, electronic equipment and computer readable medium
CN110874744B (en) Data anomaly detection method and device
CN112468326A (en) Access flow prediction method based on time convolution neural network
CN110460458A (en) Based on multistage markovian Traffic anomaly detection method
CN111541626A (en) Network bandwidth updating method and device, electronic equipment and storage medium
CN113364699B (en) Cloud data flow management and control method and system based on multi-scale self-similar characteristic
CN111651421B (en) Improved Rsync method, device and information synchronization system
CN109242250A (en) A kind of user&#39;s behavior confidence level detection method based on Based on Entropy method and cloud model
CN110059894A (en) Equipment state assessment method, apparatus, system and storage medium
CN112907128A (en) Data analysis method, device, equipment and medium based on AB test result
CN114528190B (en) Single index abnormality detection method and device, electronic equipment and readable storage medium
CN109065176B (en) Blood glucose prediction method, device, terminal and storage medium
CN108399415B (en) Self-adaptive data acquisition method based on life cycle stage of equipment
CN117009903A (en) Data anomaly detection method, device, equipment and storage medium
CN116938683A (en) Network path analysis system and method based on network security anomaly detection
CN107688862A (en) Insulator equivalent salt density accumulation rate Forecasting Methodology based on BA GRNN
CN112988527A (en) GPU management platform anomaly detection method and device and storage medium
Weber et al. Multicomponent reaction-diffusion processes on complex networks
Drieieva et al. Method of Fractal Traffic Generation by a Model of Generator on the Graph.
CN114124725A (en) Quantum communication network reliability comprehensive evaluation method based on complex network model
Wang et al. Community detection with self-adapting switching based on affinity

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20231113

Address after: 106, 1st Floor, Block B, Building 16, Enji Xiyuan Industrial Park, No.1 Liangjiadian, Fuwai, Haidian District, Beijing, 100142

Patentee after: BEIJING WONDERSOFT TECHNOLOGY Corp.,Ltd.

Patentee after: Beijing Mingchao Xin'an Technology Co.,Ltd.

Address before: 100142 block B, building 16, Enji West Industrial Park, No.1, liangjiadian, Fuwai, Haidian District, Beijing

Patentee before: BEIJING WONDERSOFT TECHNOLOGY Corp.,Ltd.
