CN118132402A - Multi-dimensional time sequence prediction method, electronic equipment and storage medium - Google Patents


Info

Publication number: CN118132402A (application number CN202410540607.2A)
Granted publication: CN118132402B
Authority: CN (China)
Prior art keywords: monitoring, module, convolution, time, current
Legal status: Granted; Active
Other languages: Chinese (zh)
Inventors: 王静, 王济昂, 丁建立, 李永华
Assignee (original and current): Civil Aviation University of China


Abstract

The present invention relates to the field of computer technology application, and in particular to a multidimensional time series prediction method, an electronic device, and a storage medium. The method first maps an input data matrix into a high-dimensional space, then performs convolution operations over both the time dimension and the variable dimension on the mapped matrix, and finally applies a linear mapping to the output to obtain the corresponding prediction result. The method processes the multi-dimensional time series with a model of linear structure, models cross-variable and cross-time dependencies in the variable dimension and the time dimension respectively to obtain a mixed latent representation of variables and time, and obtains the prediction result through a final linear mapping, so that real-time prediction of streaming data can be satisfied while the computational requirements are minimized.

Description

Multi-dimensional time sequence prediction method, electronic equipment and storage medium
Technical Field
The present invention relates to the field of computer technology application, and in particular, to a multidimensional time series prediction method, an electronic device, and a storage medium.
Background
In some application scenarios, for example the operation monitoring of the weak-current information systems of a civil transport airport, operation and maintenance personnel monitor the running state of a system in real time by collecting operation index data from its related infrastructure, hardware platforms, basic software, and application software. When certain monitoring indexes become abnormal (e.g., a sudden rise, a sudden drop, or jitter), the related applications or services may have problems, and the system should raise an alarm in time so that the relevant personnel can intervene and a system fault is avoided. During system operation it is therefore necessary to closely follow the changes in the system operation monitoring index data and to predict their development trend quickly and accurately, so that possible abnormalities of the system are discovered as early as possible, a decision basis is provided for early intervention by operation and maintenance personnel, and the normal operation of the airport's weak-current information system is effectively guaranteed.
Disclosure of Invention
In view of the above technical problems, the invention adopts the following technical scheme:
According to a first aspect of the present invention, there is provided a multi-dimensional time series prediction method. The method is implemented on the basis of a current data processing model comprising an input module, a double-convolution network structure, and an output module connected in sequence, where the double-convolution network structure comprises p double-convolution modules connected in sequence, and each double-convolution module comprises a first capture module, a time convolution layer, a variable convolution layer, a first feature fusion module, a second capture module, and a second feature fusion module. The method comprises the following steps:
S100, acquire the data set D for prediction corresponding to the monitored server at the current moment, and construct an initial input matrix RS based on D, where D = {D_1, D_2, …, D_i, …, D_m}; D_i is the monitoring data set corresponding to the i-th monitoring index of the monitored server, i ranges from 1 to m, and m is the number of monitoring indexes of the monitored server; D_i = {d_i1, d_i2, …, d_ij, …, d_in}, where d_ij is the monitoring value of the i-th monitoring index at the j-th sampling monitoring moment in D, j ranges from 1 to n, and n is the number of sampling monitoring moments in D; RS is an m×n matrix. A monitoring index is a parameter characterizing the performance of the monitored server. In D, the duration between two adjacent sampling monitoring moments is Δt = h×Δa, where Δa is the duration corresponding to the sampling frequency of the monitoring index, h is a preset integer, and h ≥ 1.
S200, map RS into a high-dimensional matrix RH using the input module, where RH is a Q×n matrix and Q > m.
S300, a counter c=1 is set.
S400, if c ≤ p, take the output feature matrix R_out^(c-1) of the (c-1)-th double-convolution module as the input feature matrix R_in^c of the c-th double-convolution module and execute S500; otherwise, take R_out^(c-1) as the target output feature matrix and execute S900.
S500, filter R_in^c using the first capture module of the c-th double-convolution module to obtain the filtered input feature matrix R_in-f^c.
S600, perform a convolution operation on each row of R_in-f^c using the time convolution layer of the c-th double-convolution module to obtain the corresponding output time feature matrix RT_out^c, and perform a convolution operation on each column of R_in-f^c using the variable convolution layer of the c-th double-convolution module to obtain the corresponding output variable feature matrix RV_out^c.
S700, perform feature fusion on RT_out^c and RV_out^c using the first feature fusion module of the c-th double-convolution module to obtain the corresponding output fusion feature matrix RM^c.
S800, filter RM^c using the second capture module of the c-th double-convolution module to obtain the filtered output fusion feature matrix RM_f^c; perform feature fusion on RM_f^c and R_in^c using the second feature fusion module of the c-th double-convolution module to obtain the output feature matrix R_out^c of the c-th double-convolution module; set c = c+1; execute S400.
S900, map the target output feature matrix into an m×d matrix using the output module and take it as the predicted monitoring data of the monitored server at the current moment; execute S100; d is the number of monitoring moments to be predicted after the n-th sampling monitoring moment.
According to a second aspect of the present invention, there is provided an electronic device comprising: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor, the instructions being arranged to perform the method according to the first aspect of the invention.
According to a third aspect of the present invention there is provided a computer readable storage medium storing computer executable instructions for performing the method of the first aspect of the present invention.
The invention has at least the following beneficial effects:
In the multi-dimensional time series prediction method provided by the embodiments of the invention, the multi-dimensional time series is processed by a model of linear structure. Instead of modeling the sequence as a whole, the model focuses on each batch of input sequences: it performs feature mapping on the variable dimension to extract hidden features among the variables, uses a convolutional neural network as the feature extraction component, models cross-variable and cross-time dependencies in the time dimension and the variable dimension respectively to obtain a mixed latent representation of variables and time, and finally obtains the prediction result through a linear mapping. The method can therefore serve real-time prediction of streaming data while minimizing the computational requirements.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the invention or to delineate the scope of the invention. Other features of the present invention will become apparent from the description that follows.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a block diagram of a data processing model according to an embodiment of the present invention;
fig. 2 is a flowchart of a multi-dimensional time series prediction method according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to fall within the scope of the invention.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used herein in the description of the invention is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. The term "and/or" as used herein includes any and all combinations of one or more of the associated listed items.
It should be noted that some exemplary embodiments are described as a process or a method depicted as a flowchart. Although a flowchart depicts steps as a sequential process, many of the steps may be implemented in parallel, concurrently, or with other steps. Furthermore, the order of the steps may be rearranged. The process may be terminated when its operations are completed, but may have additional steps not included in the figures. The processes may correspond to methods, functions, procedures, subroutines, and the like.
An embodiment of the present invention provides a multi-dimensional time series prediction method, where the method is implemented based on a current data processing model, as shown in fig. 1, where the data processing model may include: the system comprises an input module 1, a double-convolution network structure 2 and an output module 3 which are sequentially connected, wherein the double-convolution network structure 2 comprises p double-convolution modules which are sequentially connected, and each double-convolution module comprises a first capturing module, a time convolution layer, a variable convolution layer, a first feature fusion module, a second capturing module and a second feature fusion module.
In an embodiment of the present invention, p may be an empirical value; in an exemplary embodiment, p = 6. The inventors found through experiments that six double-convolution modules give the model good prediction performance.
The input module 1 is configured to map the monitoring-index dimension of the received time-series data into a high-dimensional space; the resulting high-dimensional data serves as the input of the first double-convolution module, specifically of the first capture module connected to the input module 1. The output of the first capture module serves as the input of the corresponding time convolution layer and variable convolution layer; the outputs of the time convolution layer and the variable convolution layer are combined by the corresponding first feature fusion module and serve as the input of the second capture module; and the output of the second capture module is combined with the input of the corresponding first capture module by the corresponding second feature fusion module and serves as the input of the next double-convolution module.
High-dimensional features have more degrees of freedom and can better describe the nonlinear relations and interaction patterns in the data. To better extract the feature representation among the variables, the input module maps the variable dimension into a high-dimensional space, which helps the model capture more complex and richer feature representations and yields a latent feature-vector matrix. The calculation can be expressed as H0 = Linear(X), where Linear() is a fully connected layer applied to the variable dimension that maps it from the original input dimension C to Q dimensions, giving the initial hidden-state matrix H0, and X is the input multi-dimensional time series. For example, if the input X is a 90×90 matrix, the mapped matrix may be 320×90. As is known to those skilled in the art, the specific mapping of the input module 1 can be implemented in existing ways.
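As a minimal NumPy sketch of this mapping (the weight matrix W and bias b are hypothetical stand-ins for the learned parameters of the fully connected layer), the variable dimension of an m×n input is lifted to Q dimensions:

```python
import numpy as np

def input_module(X, W, b):
    """Map the variable dimension of X from m to Q: H0 = Linear(X).

    X: (m, n) input matrix (m monitoring indexes, n sampling moments).
    W: (Q, m) learned weights; b: (Q,) learned bias.
    Returns the initial hidden-state matrix H0 of shape (Q, n).
    """
    return W @ X + b[:, None]

rng = np.random.default_rng(0)
X = rng.standard_normal((90, 90))          # e.g. a 90 x 90 input window
W = 0.01 * rng.standard_normal((320, 90))  # lift 90 variables to Q = 320
b = np.zeros(320)
H0 = input_module(X, W, b)
print(H0.shape)                            # (320, 90)
```

In a trained model W and b would be learned jointly with the rest of the network; here they are random placeholders that only demonstrate the shape transformation.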
In the embodiment of the invention, the first capture module and the second capture module may be modules with the same structure and function, used to introduce a nonlinear transformation and enhance the expressive power of the network. In one exemplary embodiment, the capture module filters the input data using the following formula: d_in-f = k1 × d_in × (1 + tanh((2/π)^(1/2) × (d_in + k2 × d_in³))); where d_in is the data input into the capture module, d_in-f is the filtered data, k1 is a first preset coefficient, and k2 is a second preset coefficient. In one exemplary embodiment, k1 = 0.5 and k2 = 0.044715. The capture module defined by the above formula has a smoother curve, which helps alleviate the vanishing-gradient problem.
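In code, the capture-module filter is a tanh-style smooth gate (a sketch; note that with k1 = 0.5 and k2 = 0.044715 this coincides with the standard tanh approximation of the GELU activation, which the coefficients in the text appear to correspond to):

```python
import numpy as np

def capture_filter(d_in, k1=0.5, k2=0.044715):
    """Smooth nonlinear gate used by both capture modules:
    d_in_f = k1 * d_in * (1 + tanh(sqrt(2/pi) * (d_in + k2 * d_in**3)))."""
    d_in = np.asarray(d_in, dtype=float)
    return k1 * d_in * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (d_in + k2 * d_in ** 3)))

print(capture_filter(0.0))    # 0.0 -- the gate passes nothing at zero
```

For large positive inputs the gate approaches the identity, and for large negative inputs it approaches zero, giving the smooth curve the text credits with alleviating vanishing gradients.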
Further, in the embodiment of the present invention, the time convolution layer extracts features in the time dimension, and the variable convolution layer extracts features in the variable dimension. In contrast to recurrent neural networks and attention mechanisms, the weights of a convolutional neural network are fixed across the time steps of the input sequence, and the convolution operation can effectively capture local patterns in the data. Recurrent or attention architectures may have higher representational capacity but overfit easily, which increases the accumulated error on streaming data whose distribution changes frequently. In view of this, the model in the embodiment of the present invention does not model the time series as a whole: it uses a time convolution layer and a variable convolution layer, applying one-dimensional convolution in the time dimension and in the variable dimension respectively (both layers are one-dimensional convolution layers), so as to capture local features of the input sequence and better model short-term changes.
The calculation of the time convolution layer can be written as: Y[t,v]_time = b[v] + Σ_{a1=0}^{g-1} H[t+a1, v] × W[a1, v], where Y[t,v]_time is the convolution result in the time dimension and g is the size of the convolution kernel, which can be set based on actual needs; in one illustrative embodiment, g = 3. W[a1, v] is the learnable weight of the convolution kernel at position a1 for variable v. H[t+a1, v] is the value of the input sequence for variable v at time step t+a1. b[v] is the bias term in the variable dimension.
The calculation of the variable convolution layer can be written as: Y[t,v]_var = b[t] + Σ_{a2=0}^{g-1} H[t, v+a2] × W[t, a2], where Y[t,v]_var is the convolution result in the variable dimension. W[t, a2] is the learnable weight of the convolution kernel at time t and kernel position a2. H[t, v+a2] is the value of the input sequence at variable v+a2 and time step t. b[t] is the bias term in the time dimension.
Further, in the embodiment of the present invention, the first feature fusion module performs a weighted fusion of the received convolution results in the time dimension and the variable dimension, where the weight of each convolution result may be learnable. The second feature fusion module directly adds the feature matrix filtered by the second capture module to the feature matrix input into the first capture module. The two feature fusion modules help prevent vanishing and exploding gradients during network training.
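Putting the two fusion steps together (a sketch; `capture` stands for the second capture module's filter, and all shapes are assumed equal, i.e. 'same'-padded convolutions):

```python
import numpy as np

def double_conv_fusion(Y_time, Y_var, H_in, w_t, w_v, capture):
    """First fusion: learnable weighted sum RM = w_t*Y_time + w_v*Y_var.
    Second fusion: residual connection R_out = capture(RM) + H_in."""
    RM = w_t * Y_time + w_v * Y_var
    return capture(RM) + H_in

H_in = np.ones((2, 2))
out = double_conv_fusion(np.ones((2, 2)), np.ones((2, 2)), H_in,
                         0.5, 0.5, capture=lambda x: x)  # identity capture for the demo
print(out[0, 0])   # 0.5*1 + 0.5*1 + 1 = 2.0
```

The residual addition in the second fusion is the mechanism that keeps gradients flowing through a stack of p such modules.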
Further, in the embodiment of the invention, the output module uses a fully connected layer (Linear) to convert the latent feature representation learned by the model into the predicted values of the future time steps, which avoids redundant computation, reduces the space complexity to linear in the sequence length, and meets the requirement of real-time computation. In addition, the module introduces a Dropout layer, which improves the generalization ability of the model by randomly discarding a portion of the neurons to reduce overfitting. The calculation can be expressed as Pred = Linear(Dropout(H_up)), where Pred is the prediction result and H_up is the hidden-state matrix output by the previous layer; the fully connected layer maps the time dimension to the output sequence length and maps the variable dimension from the latent size back to the initial size.
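An illustrative NumPy version of this output head (the two weight matrices and the inverted-dropout implementation are assumptions made for the sketch, not the patent's exact parameterization):

```python
import numpy as np

def output_module(H, W_var, W_time, drop_rate=0.1, rng=None):
    """Map the hidden state H (Q, n) to the prediction Pred (m, d).

    W_var: (m, Q), maps latent variables back to the m original indexes.
    W_time: (n, d), maps the input window length n to the d future steps.
    Dropout is applied only when an rng is given (i.e. during training).
    """
    if rng is not None:
        keep = rng.random(H.shape) >= drop_rate
        H = H * keep / (1.0 - drop_rate)   # inverted dropout
    return W_var @ H @ W_time              # (m, Q) @ (Q, n) @ (n, d) -> (m, d)

H = np.ones((320, 60))
Pred = output_module(H, np.ones((90, 320)), np.ones((60, 24)))
print(Pred.shape)   # (90, 24): 90 indexes, 24 predicted moments
```

At inference time the rng is omitted so dropout is disabled, matching the usual train/eval distinction.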
Further, in an embodiment of the present invention, the initial structure of the current data processing model may be a trained initial data processing model, trained on historical monitoring data sets of the monitored server. The monitored server may be any server that actually needs to be monitored; in an exemplary embodiment, for an airport operation and maintenance monitoring scenario, it may be a server of an airport flight information management system, an airport passenger information service system, or the like.
Further, in an embodiment of the present invention, the trained initial data processing model may be obtained by:
S10, acquire a historical monitoring data set DH = {DH_1, DH_2, …, DH_i, …, DH_m} of the monitored server, where DH_i is the monitoring data set corresponding to the i-th monitoring index of the monitored server, i ranges from 1 to m, and m is the number of monitoring indexes of the monitored server; DH_i = {dH_i1, dH_i2, …, dH_iw, …, dH_iZ}, where dH_iw is the monitoring value of the i-th monitoring index at the w-th monitoring moment, w ranges from 1 to Z, and Z is the number of monitoring moments in DH.
In DH, the duration between two adjacent monitoring moments may be equal to Δa, the duration corresponding to the sampling frequency of the monitoring index, for example 1 minute.
In the embodiment of the invention, a monitoring index of the monitored server may be a parameter characterizing its performance; the monitoring indexes give insight into what happens inside the monitored server, and together they represent its state. The monitoring indexes may be determined based on the actual situation; for example, they may include the free disk space utilization of the /app/oracle, /home, /opt, /usr, and /var directory mounts (Free disk space on /app/oracle, /home, /opt, /usr, /var) and the free inode utilization of the /opt, /usr, and /var directories (Free inodes on /opt, /usr, /var), which the present invention does not particularly limit. The monitoring indexes may be obtained in existing ways, for example by a sampler built into the monitored server.
S11, slide a window of length n over DH in steps of one time step and generate H batches of training sample data based on DH, where each batch of training sample data is an m×n matrix.
S12, inputting training sample data of a current batch into a current initial data processing model for training to obtain a corresponding prediction result;
S13, acquiring a current loss function value of a current initial data processing model based on a predicted result of a current batch and a corresponding real result, judging whether the current loss function value accords with a preset model training ending condition, if so, executing a step S15, otherwise, executing a step S14;
S14, update the parameters of the current initial data processing model based on the current loss function value, take the sample data of the next batch as the training sample data of the current batch, and execute S12;
S15, take the current initial data processing model as the trained initial data processing model.
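Step S11's sliding-window batching can be sketched as follows (the target window length d is an assumption added so that each sample carries a supervision signal for steps S12–S13):

```python
import numpy as np

def sliding_windows(DH, n, d):
    """Cut the m x Z history matrix DH into (input, target) sample pairs:
    each input is the m x n window starting at s, the target the m x d block after it."""
    m, Z = DH.shape
    return [(DH[:, s:s + n], DH[:, s + n:s + n + d])
            for s in range(Z - n - d + 1)]

DH = np.arange(20.0).reshape(2, 10)     # m = 2 indexes, Z = 10 moments
pairs = sliding_windows(DH, n=4, d=1)
print(len(pairs))                       # 6 sample pairs
```

Each pair is one training sample; grouping consecutive pairs into batches reproduces the H batches described in S11.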
In an embodiment of the present invention, the loss function may be the mean squared error. For the model parameters, an Adam optimizer is used, the learning rate is set to 0.0001, the number of iterations is 15, and the history input window is 60 time steps long, i.e., n = 60. To reduce overfitting and improve generalization, the invention uses early stopping: when the loss function value on the training set no longer decreases, the model training ending condition is met and training terminates. This reduces the number of model iterations and improves the prediction performance of the model on the test set.
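The early-stopping logic can be sketched independently of the model (here `run_epoch` is a hypothetical callback that performs the Adam updates for one pass over the batches and returns the mean training loss; patience = 1 matches "stop as soon as the loss no longer decreases"):

```python
def train_with_early_stop(run_epoch, max_epochs=15, patience=1):
    """Run up to max_epochs training passes; stop once the training loss
    has failed to improve for `patience` consecutive epochs."""
    best, waited = float("inf"), 0
    for epoch in range(max_epochs):
        loss = run_epoch()
        if loss < best:
            best, waited = loss, 0
        else:
            waited += 1
            if waited >= patience:
                break
    return best

losses = iter([3.0, 2.0, 2.5, 1.0])                  # loss rises at epoch 3
print(train_with_early_stop(lambda: next(losses)))   # 2.0 -- stops before the rebound
```

With a larger patience the loop would tolerate a few flat epochs before stopping, a common variation of the same scheme.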
In the embodiment of the present invention, the window size of the prediction result may be set based on actual needs; for example, it may cover 1, 24, or 32 monitoring moments.
Further, in an embodiment of the present invention, the training sample data may be data after missing-value filling and smoothing. Missing-value filling may be performed by linear interpolation. The smoothing process may include:
S20, for DH_i, obtain the corresponding stability coefficient STH_i; if STH_i < f0, execute S22, otherwise execute S21. Here STH_i = (Σ_{t=k+1}^{Z} (dH_it − Avg(DH_i)) × (dH_i(t−k) − Avg(DH_i))) / (Σ_{o=1}^{Z} (dH_io − Avg(DH_i))²), where k is a preset number of time steps, dH_it is the t-th monitoring value in DH_i, dH_i(t−k) is the (t−k)-th monitoring value in DH_i, and Avg(DH_i) is the mean of DH_i; f0 is a preset stability-coefficient threshold, which may be an empirical value, for example f0 = 0.7.
S21, set i = i+1; if i ≤ m, execute S20; otherwise, exit the current control program;
S22, downsample DH_i according to a set downsampling frequency to obtain a downsampled historical monitoring data set; the duration corresponding to the downsampling frequency may be equal to L×Δa, where L is a preset integer, for example L = 5 or 10.
As is known to those skilled in the art, if DH contains a non-stationary time series, the H batches of training sample data can be generated based on the downsampled historical monitoring data set. In addition, all data in the training sample data are normalized.
The inventors realized that a multi-dimensional time series may contain non-stationary component series, which can affect the prediction result. In the embodiment of the present invention, a data set containing non-stationary time series is therefore downsampled as a whole, which reduces this influence as much as possible and makes the prediction more accurate.
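The stability test and downsampling of S20–S22 can be sketched as follows (the coefficient is the lag-k autocorrelation given in the text; a series whose coefficient falls below f0 is downsampled by keeping every L-th sample):

```python
import numpy as np

def stability_coefficient(x, k=1):
    """Lag-k autocorrelation STH of a 1-D monitoring series."""
    x = np.asarray(x, dtype=float)
    mu = x.mean()
    return np.sum((x[k:] - mu) * (x[:-k] - mu)) / np.sum((x - mu) ** 2)

def smooth_series(x, k=1, f0=0.7, L=5):
    """Downsample x by the factor L when STH < f0, otherwise keep it as is."""
    x = np.asarray(x, dtype=float)
    return x[::L] if stability_coefficient(x, k) < f0 else x

trend = np.arange(100.0)             # smooth trend: lag-1 autocorrelation near 1
noise = np.tile([1.0, -1.0], 50)     # alternating series: strongly negative
print(len(smooth_series(trend)), len(smooth_series(noise)))   # 100 20
```

The smooth trend passes the threshold and is left untouched, while the rapidly alternating series is reduced from 100 to 20 samples, matching the downsampling factor L = 5.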
Further, the multi-dimensional time series prediction method provided by the embodiment of the invention may include the following steps shown in fig. 2:
S100, acquire the data set D for prediction corresponding to the monitored server at the current moment, and construct an initial input matrix RS based on D, where D = {D_1, D_2, …, D_i, …, D_m}; D_i is the monitoring data set corresponding to the i-th monitoring index of the monitored server, i ranges from 1 to m, and m is the number of monitoring indexes of the monitored server; D_i = {d_i1, d_i2, …, d_ij, …, d_in}, where d_ij is the monitoring value of the i-th monitoring index at the j-th sampling monitoring moment in D, j ranges from 1 to n, and n is the number of sampling monitoring moments in D; RS is an m×n matrix.
In S100, the duration between two adjacent sampling monitoring moments in D is Δt = h×Δa, where h is a preset integer and h ≥ 1.
S200, map RS into a high-dimensional matrix RH using the input module, where RH is a Q×n matrix and Q > m.
S300, a counter c=1 is set.
S400, if c ≤ p, take the output feature matrix R_out^(c-1) of the (c-1)-th double-convolution module as the input feature matrix R_in^c of the c-th double-convolution module and execute S500; otherwise, take R_out^(c-1) as the target output feature matrix and execute S900.
S500, filter R_in^c using the first capture module of the c-th double-convolution module to obtain the filtered input feature matrix R_in-f^c.
Further, S500 may specifically include:
Perform a filtering operation on the data d_in^c-uv in the u-th row and v-th column of R_in^c using the first capture module of the c-th double-convolution module to obtain the filtered data d_in-f^c-uv, where d_in-f^c-uv satisfies: d_in-f^c-uv = k1 × d_in^c-uv × (1 + tanh((2/π)^(1/2) × (d_in^c-uv + k2 × (d_in^c-uv)³))); k1 is the first preset coefficient and k2 is the second preset coefficient; u ranges from 1 to Q and v ranges from 1 to n.
S600, perform a convolution operation on each row of R_in-f^c, i.e., a convolution in the time dimension, using the time convolution layer of the c-th double-convolution module to obtain the corresponding output time feature matrix RT_out^c, and perform a convolution operation on each column of R_in-f^c, i.e., a convolution in the variable dimension, using the variable convolution layer of the c-th double-convolution module to obtain the corresponding output variable feature matrix RV_out^c.
S700, perform feature fusion on RT_out^c and RV_out^c using the first feature fusion module of the c-th double-convolution module to obtain the corresponding output fusion feature matrix RM^c.
Further, RM^c satisfies the following condition:
RM^c = WT_out^c × RT_out^c + WV_out^c × RV_out^c, where WT_out^c is the weight corresponding to RT_out^c and WV_out^c is the weight corresponding to RV_out^c.
S800, filter RM^c using the second capture module of the c-th double-convolution module to obtain the filtered output fusion feature matrix RM_f^c; perform feature fusion on RM_f^c and R_in^c using the second feature fusion module of the c-th double-convolution module to obtain the output feature matrix R_out^c of the c-th double-convolution module; set c = c+1; execute S400.
Further, in S800, filtering RM^c using the second capture module of the c-th double-convolution module to obtain the filtered output fusion feature matrix RM_f^c specifically includes:
performing a filtering operation on the data dM_in^c-uv in the u-th row and v-th column of RM^c using the second capture module of the c-th double-convolution module to obtain the filtered data dM_f^c-uv, where dM_f^c-uv satisfies: dM_f^c-uv = k1 × dM_in^c-uv × (1 + tanh((2/π)^(1/2) × (dM_in^c-uv + k2 × (dM_in^c-uv)³))).
Further, R_out^c satisfies the following condition:
R_out^c = RM_f^c + R_in^c.
S900, map the target output feature matrix into an m×d matrix using the output module and take it as the predicted monitoring data of the monitored server at the current moment; execute S100; d is the number of monitoring moments to be predicted after the n-th sampling monitoring moment, i.e., the window size of the prediction result.
Further, the method provided by the embodiment of the invention further comprises the following steps:
S1000, acquire the actual monitoring data corresponding to the d monitoring moments after the current moment.
S1100, obtain the current loss function value of the current data processing model based on the predicted monitoring data and the actual monitoring data; if the current loss function value meets a preset condition, do not update the current data processing model; otherwise, update the parameters of the current data processing model based on the current loss function value and take the updated data processing model as the current data processing model.
The technical effect of S1000 and S1100 is that the data processing model of the present invention can be an online learning model: it first learns reference parameters on a small data set and, after this initial training, continuously updates its model parameters as the data stream arrives. In real scenarios, collecting and organizing a large amount of historical data is costly, which makes training on historical data difficult; with online learning, the model can output prediction results in time from the current real-time input data stream and continuously adjust its performance over time to adapt to changes in the data distribution.
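S1000–S1100 can be sketched as a conditional online update. The linear model, MSE loss, gradient step and threshold below are illustrative assumptions; the patent does not fix the loss function, the preset condition, or the optimizer:

```python
import numpy as np

def online_update(W, X, y_true, loss_threshold=0.01, lr=1e-3):
    """S1000-S1100 sketch: predict, compare with the actual monitoring data,
    and update the parameters only when the loss exceeds a preset threshold.
    W, X and y_true are illustrative: a linear model y_pred = W @ X."""
    y_pred = W @ X
    residual = y_pred - y_true
    loss = np.mean(residual ** 2)                  # MSE between predicted and actual data
    if loss <= loss_threshold:                     # preset condition met: keep current model
        return W, loss
    grad = 2.0 * residual @ X.T / residual.size    # gradient of the MSE w.r.t. W
    return W - lr * grad, loss                     # updated model becomes the current model
```

In a deployment the update would run once per prediction window, so the model keeps tracking shifts in the data distribution.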
Further, in the embodiment of the present invention, D may be obtained by:
S11, acquiring the monitoring data set DC = {DC 1, DC 2, ……, DC i, ……, DC m} corresponding to the L×n monitoring moments before the current monitoring moment of the monitored server; wherein DC i is the i-th monitoring data set in DC; DC i = {dC i1, dC i2, ……, dC io, ……, dC i(L×n)}, dC io is the o-th monitoring data in DC i, and the value of o is 1 to L×n; L is a preset integer;
S12, performing missing value filling on the DC to obtain a monitoring data set DC f subjected to missing value filling;
And S13, carrying out stabilization treatment on the DC f to obtain the D.
Further, S12 may specifically include:
S101, traversing DC; for DC i, if at least two continuous missing values exist in DC i, adding DC i to the current first missing value record table RC1; otherwise, adding DC i to the current second missing value record table RC2; the initial values of RC1 and RC2 are null.
S102, if RC1 is null, executing S103; otherwise, S104 is performed.
And S103, filling missing values corresponding to the monitoring data set in RC2 by using a linear interpolation method, and exiting the current control program.
S104, acquiring the monitoring time range set TC1 = {TC1 1, TC1 2, ……, TC1 g, ……, TC1 E} corresponding to RC1; wherein TC1 g is the monitoring time range of the missing values corresponding to the g-th monitoring data set in RC1, TC1 g = {[TC1 s1 g, TC1 e1 g], [TC1 s2 g, TC1 e2 g], ……, [TC1 sx g, TC1 ex g], ……, [TC1 sz(g) g, TC1 ez(g) g]}; TC1 sx g is the starting monitoring time of the x-th missing sequence corresponding to the g-th monitoring data set in RC1, TC1 ex g is the ending monitoring time of the x-th missing sequence corresponding to the g-th monitoring data set in RC1, the value of x is 1 to z(g), and z(g) is the number of missing sequences corresponding to the g-th monitoring data set in RC1; the value of g is 1 to E, and E is the number of monitoring data sets in RC1; wherein TC1 s1 g < TC1 s2 g < …… < TC1 sx g < …… < TC1 sz(g) g.
S105, acquiring the minimum starting monitoring time Tmin = min(TC1 s1 1, TC1 s1 2, ……, TC1 s1 g, ……, TC1 s1 E) and the maximum ending monitoring time Tmax = max(TC1 ez(1) 1, TC1 ez(2) 2, ……, TC1 ez(g) g, ……, TC1 ez(E) E); where min() represents taking the minimum value and max() represents taking the maximum value.
S106, acquiring monitoring data sets corresponding to n monitoring moments before the current initial monitoring moment T c min in the historical monitoring data sets corresponding to the monitored server as current reference monitoring data sets; the initial value of T c min is T min.
S107, inputting the current reference monitoring data set into the current data processing model to obtain a corresponding prediction result.
S108, if (T max-Tc min +1) is less than or equal to d, filling the missing sequences in the monitoring data set in TC1 within the monitoring time range [ T c min,Tmax ] based on the prediction result to obtain DC f, and exiting the current control program; if (T max-Tc min +1) > d, filling the missing sequences in the monitoring dataset in TC1 within the monitoring time range [ T c min,Tc min +d ] based on the prediction result; and S109 is performed.
S109, setting T c min=Tc min +d+1, and executing S106.
In the embodiment of the invention, the scattered data missing in the data set is filled by adopting linear interpolation, and the continuous missing data is filled by adopting the data prediction model, so that the prediction result is more accurate.
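The two filling strategies of S101–S109 can be sketched as follows. The flat-list representation (with `NaN`/`None` marking missing values) and the `predict` callable, which stands in for the current data processing model, are assumptions for illustration:

```python
import numpy as np

def fill_isolated_gaps(series):
    """Linear interpolation for series in RC2 (no run of two or more
    consecutive missing values); NaN marks a missing value."""
    s = np.asarray(series, dtype=float)
    idx = np.arange(s.size)
    missing = np.isnan(s)
    s[missing] = np.interp(idx[missing], idx[~missing], s[~missing])
    return s

def fill_long_gaps(series, t_min, t_max, d, n, predict):
    """Model-based filling for series in RC1 (S106-S109): feed the n points
    before the current start time to the model and fill up to d missing
    values per pass, advancing until t_max is covered. The index arithmetic
    is simplified relative to the patent's T_c_min bookkeeping."""
    t_cur = t_min
    while t_cur <= t_max:
        reference = series[t_cur - n:t_cur]   # n monitoring points before t_cur
        forecast = predict(reference)         # d predicted values
        span = min(d, t_max - t_cur + 1)      # final pass stops at t_max
        for k in range(span):
            if series[t_cur + k] is None:     # only fill missing entries
                series[t_cur + k] = forecast[k]
        t_cur += d                            # advance the fill window
    return series
```

As the surrounding text notes, scattered single gaps are cheap to interpolate linearly, while long runs of missing values need the prediction model itself to extrapolate plausibly.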
Further, S13 may specifically include:
S111, for the i-th monitoring data set DC f i in DC f, acquiring the stability coefficient STC f i corresponding to DC f i; if STC f i < f0, executing S113; otherwise, executing S112; wherein STC f i = (∑n t=k+1 (dC f it − Avg(DC f i)) × (dC f i(t−k) − Avg(DC f i))) / (∑n o=1 (dC f io − Avg(DC f i)) 2), k is a preset number of monitoring moments, dC f it is the t-th monitoring data in DC f i, dC f i(t−k) is the (t−k)-th monitoring data in DC f i, Avg(DC f i) is the average value corresponding to DC f i, and f0 is a preset stability coefficient threshold.
S112, setting i=i+1, if i is less than or equal to m, executing S111; otherwise, taking the monitoring dataset corresponding to n monitoring moments before the current moment in DC f as D, and exiting the current control program.
S113, downsampling DC f i according to a set downsampling frequency to obtain D; wherein the duration corresponding to the set downsampling frequency is equal to L×Δa.
As known to those skilled in the art, h = 1 if there is no non-stationary time series data in DC f; otherwise, h = L.
The technical effect of S111 to S113 is that the monitoring data used for prediction are made as stationary a time series as possible, thereby improving the prediction result.
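A hedged sketch of S111–S113: the stability coefficient is the lag-k autocorrelation of the series, and the downsampling step and threshold f0 = 0.5 below are stand-ins (the patent only defines f0 as a preset threshold and ties the downsampling duration to L×Δa):

```python
import numpy as np

def stability_coefficient(x, k=1):
    """Lag-k autocorrelation used as the stability coefficient STC in S111."""
    x = np.asarray(x, dtype=float)
    dev = x - x.mean()
    return float(np.sum(dev[k:] * dev[:-k]) / np.sum(dev ** 2))

def stabilize(x, k=1, f0=0.5, step=2):
    """S111-S113 sketch: if STC < f0 the series is downsampled by `step`
    (a stand-in for the duration L x delta-a); otherwise it is kept as-is,
    following the branch condition stated in the patent."""
    x = np.asarray(x, dtype=float)
    if stability_coefficient(x, k) < f0:
        return x[::step]
    return x
```

The coefficient is 1 for a perfectly persistent series and near 0 (or negative) for rapidly fluctuating data, so it is a cheap proxy for how smooth the monitoring series is.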
In order to objectively evaluate the prediction performance of the data processing model used in the embodiment of the present invention, six mainstream multidimensional time series prediction algorithms were selected for comparison experiments on the same data set; the experimental results are shown in Table 1 below. The six algorithms are ER, DER++, FSNet, Time-TCN, DLinear and PatchTST. ER enhances the performance of deep reinforcement learning algorithms by storing collected samples in an experience replay buffer and randomly drawing samples from it for training. DER++ is a variant of ER that adds a knowledge distillation strategy to the standard ER algorithm. FSNet is an online learning algorithm based on a TCN backbone network. Time-TCN is a time series prediction algorithm based on a TCN backbone network that applies CNNs to model the time dimension. DLinear implements multidimensional time series prediction with a simple linear network, and PatchTST is an algorithm based on the Transformer architecture that emphasizes variable independence. The HCFA model of the present invention, by contrast, fuses multiple techniques: it uses temporal convolution to extract local features in both the time dimension and the variable dimension and models the association between time series and variables, while retaining the computational efficiency of a linear structure.
TABLE 1
Experimental results show that the model provided by the embodiment of the present invention, which is based on a linear structure and uses temporal convolution to focus on cross-time and cross-variable dependence, achieves a clear improvement in prediction accuracy. PatchTST, which emphasizes variable independence, has a smaller error when predicting 1 time step, but because it does not incorporate information from other variables, its error gradually grows as the prediction length increases. Time-TCN only performs convolution operations in the time dimension. FSNet only considers cross-variable dependence and is prone to overfitting when processing data sets with many variables, resulting in large errors.
In summary, the data processing model provided by the embodiment of the present invention focuses on local changes in the data, uses temporal convolution to extract local features in the time dimension and the variable dimension respectively, and models the association between time series and variables, so that the model can better handle complex time series changes. Meanwhile, compared with recurrent neural networks and the Transformer architecture, the introduced linear structure model has higher computational efficiency and meets the real-time requirements of online learning tasks. These strategies make the model better adapt to changes in the data pattern in online learning tasks and remain robust, thereby improving prediction accuracy.
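For concreteness, the overall forward pass (S200–S900) can be sketched end to end. The convolution kernel, the depth p, the width Q, the prediction window d, and all random weights below are illustrative stand-ins, not values fixed by the patent:

```python
import numpy as np

rng = np.random.default_rng(0)

def gelu_like(x, k1=0.5, k2=0.044715):
    """Capture-module filter (tanh approximation of GELU); k1, k2 illustrative."""
    return k1 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + k2 * x ** 3)))

def conv1d_rows(X, kernel):
    """'Same'-padded 1-D convolution applied independently to each row."""
    return np.stack([np.convolve(row, kernel, mode="same") for row in X])

def forward(RS, p=2, Q=8, d=3):
    """End-to-end sketch: input mapping m x n -> Q x n (S200), p dual-conv
    modules (S400-S800), output mapping to m x d (S900)."""
    m, n = RS.shape
    W_in = rng.standard_normal((Q, m)) * 0.1
    R = W_in @ RS                               # S200: high-dimensional mapping RH
    kernel = np.array([0.25, 0.5, 0.25])        # shared stand-in conv kernel
    for _ in range(p):
        R_f = gelu_like(R)                      # S500: first capture module
        RT = conv1d_rows(R_f, kernel)           # S600: convolution over the time dim (rows)
        RV = conv1d_rows(R_f.T, kernel).T       # S600: convolution over the variable dim (cols)
        RM = 0.5 * RT + 0.5 * RV                # S700: weighted feature fusion
        R = gelu_like(RM) + R                   # S800: second filter + residual fusion
    W_var = rng.standard_normal((m, Q)) * 0.1   # S900: one possible linear output mapping
    W_time = rng.standard_normal((n, d)) * 0.1
    return W_var @ R @ W_time                   # m x d predicted monitoring data
```

Here the output module is realized as two linear maps, collapsing the Q channels back to the m monitoring indexes and the n time steps to the d predicted moments; the patent only states that the target output feature matrix is mapped to an m×d matrix, so this factorization is an assumption.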
The embodiment of the invention also provides electronic equipment, which comprises: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor, the instructions being configured to perform the methods of embodiments of the present invention.
The embodiment of the invention also provides a non-transitory computer readable storage medium, which stores computer executable instructions for executing the method according to the embodiment of the invention.
It should be understood that the flows shown above may be used with steps reordered, added or deleted. For example, the steps described in the present invention may be performed in parallel, sequentially or in a different order, as long as the desired result of the technical solution disclosed in the present invention can be achieved; no limitation is imposed herein.
The above embodiments do not limit the scope of the present invention. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention should be included in the scope of the present invention.

Claims (10)

1. A method of multidimensional time series prediction, the method being implemented based on a current data processing model, the data processing model comprising: the system comprises an input module, a double-convolution network structure and an output module which are connected in sequence, wherein the double-convolution network structure comprises p double-convolution modules which are connected in sequence, and each double-convolution module comprises a first capture module, a time convolution layer, a variable convolution layer, a first feature fusion module, a second capture module and a second feature fusion module; the method comprises the following steps:
S100, acquiring a data set D for prediction corresponding to a monitored server at the current moment, and constructing an initial input matrix RS based on D; wherein D = {D 1, D 2, ……, D i, ……, D m}; D i is the monitoring data set corresponding to the i-th monitoring index of the monitored server, the value of i is 1 to m, and m is the number of monitoring indexes corresponding to the monitored server; D i = {d i1, d i2, ……, d ij, ……, d in}, d ij is the monitoring value of the i-th monitoring index at the j-th sampling monitoring moment in D, the value of j is 1 to n, and n is the number of sampling monitoring moments in D; RS is an m×n matrix; a monitoring index is a parameter representing the performance of the monitored server; the time length Δt between two adjacent sampling monitoring moments in D satisfies Δt = h×Δa, wherein Δa is the time length corresponding to the sampling frequency of the monitoring index, h is a preset integer, and h ≥ 1;
S200, mapping RS into a high-dimensional matrix RH by using the input module, wherein RH is a matrix Q multiplied by n, and Q is more than m;
s300, setting a counter c=1;
S400, if c is less than or equal to p, taking the output feature matrix R out c-1 of the (c-1)-th double convolution module as the input feature matrix R in c of the c-th double convolution module, and executing S500; otherwise, taking R out c-1 as the target output feature matrix, and executing S900;
S500, filtering the R in c by using a first capturing module of the c-th double convolution module to obtain a filtered input feature matrix R in-f c;
S600, performing convolution operation on each row of data of R in-f c by using a time convolution layer of the c-th double convolution module to obtain a corresponding output time feature matrix RT out c, and performing convolution operation on each row of data of R in-f c by using a variable convolution layer of the c-th double convolution module to obtain a corresponding output variable feature matrix RV out c;
S700, performing feature fusion on RT out c and RV out c by using a first feature fusion module of a c-th double convolution module to obtain a corresponding output fusion feature matrix RM c;
S800, filtering the RM c by using a second capturing module of the c-th double convolution module to obtain a filtered output fusion feature matrix RM f c; and performing feature fusion on the RM f c and the R in c by using a second feature fusion module of the c-th double-convolution module to obtain an output feature matrix R out c of the c-th double-convolution module; setting c=c+1; s400 is executed;
s900, mapping the target output characteristic matrix into an m multiplied by d matrix by using the output module, and taking the m multiplied by d matrix as prediction monitoring data of a monitored server at the current moment; s100 is executed; d is the number of monitoring instants to be predicted after the nth sampling monitoring instant.
2. The method as recited in claim 1, further comprising:
S1000, acquiring the actual monitoring data corresponding to the target moments, namely the d monitoring moments after the current moment;
S1100, acquiring a current loss function value of the current data processing model based on the predicted monitoring data and the actual monitoring data, if the current loss function value meets a preset condition, not updating the current data processing model, if not, updating parameters of the current data processing model based on the current loss function value, and taking the updated data processing model as the current data processing model.
3. The method according to claim 1, wherein S500 specifically comprises:
performing a filtering operation on the data d in c-uv in the u-th row and v-th column of R in c by using the first capturing module of the c-th double convolution module to obtain the filtered data d in-f c-uv; wherein d in-f c-uv satisfies the following condition: d in-f c-uv = k1 × d in c-uv × (1 + tanh((2/π) 1/2 × (d in c-uv + k2 × (d in c-uv) 3))); where k1 is a first preset coefficient and k2 is a second preset coefficient; the value of u is 1 to Q, and the value of v is 1 to n;
the filtering processing is performed on the RM c by using the second capturing module of the c-th double convolution module, so as to obtain a filtered output fusion feature matrix RM f c, which specifically includes:
performing a filtering operation on the data dM in c-uv in the u-th row and v-th column of RM c by using the second capturing module of the c-th double convolution module to obtain the filtered data dM f c-uv; wherein dM f c-uv satisfies the following condition: dM f c-uv = k1 × dM in c-uv × (1 + tanh((2/π) 1/2 × (dM in c-uv + k2 × (dM in c-uv) 3))).
4. The method according to claim 1, wherein RM c satisfies the following condition:
RM c = WT out c × RT out c + WV out c × RV out c; wherein WT out c is the weight corresponding to RT out c, and WV out c is the weight corresponding to RV out c;
R out c satisfies the following conditions:
Rout c=RMf c+Rin c
5. the method of claim 4, wherein D is obtained by:
S11, acquiring the monitoring data set DC = {DC 1, DC 2, ……, DC i, ……, DC m} corresponding to the L×n monitoring moments before the current monitoring moment of the monitored server; wherein DC i is the i-th monitoring data set in DC; DC i = {dC i1, dC i2, ……, dC io, ……, dC i(L×n)}, dC io is the o-th monitoring data in DC i, and the value of o is 1 to L×n; L is a preset integer;
S12, performing missing value filling on the DC to obtain a monitoring data set DC f subjected to missing value filling;
And S13, carrying out stabilization treatment on the DC f to obtain the D.
6. The method according to claim 5, wherein S12 specifically comprises:
S101, traversing the DC, for the DC i, adding the DC i into a current first missing value record table RC1 if at least two continuous missing values exist in the DC i, otherwise adding the DC i into a current second missing value record table RC 2; the initial values of RC1 and RC2 are null values;
S102, if RC1 is null, executing S103; otherwise, executing S104;
s103, filling missing values corresponding to the monitoring data set in RC2 by using a linear interpolation method, and exiting the current control program;
S104, acquiring the monitoring time range set TC1 = {TC1 1, TC1 2, ……, TC1 g, ……, TC1 E} corresponding to RC1; wherein TC1 g is the monitoring time range of the missing values corresponding to the g-th monitoring data set in RC1, TC1 g = {[TC1 s1 g, TC1 e1 g], [TC1 s2 g, TC1 e2 g], ……, [TC1 sx g, TC1 ex g], ……, [TC1 sz(g) g, TC1 ez(g) g]}; TC1 sx g is the starting monitoring time of the x-th missing sequence corresponding to the g-th monitoring data set in RC1, TC1 ex g is the ending monitoring time of the x-th missing sequence corresponding to the g-th monitoring data set in RC1, the value of x is 1 to z(g), and z(g) is the number of missing sequences corresponding to the g-th monitoring data set in RC1; the value of g is 1 to E, and E is the number of monitoring data sets in RC1; wherein TC1 s1 g < TC1 s2 g < …… < TC1 sx g < …… < TC1 sz(g) g;
S105, acquiring the minimum starting monitoring time Tmin = min(TC1 s1 1, TC1 s1 2, ……, TC1 s1 g, ……, TC1 s1 E) and the maximum ending monitoring time Tmax = max(TC1 ez(1) 1, TC1 ez(2) 2, ……, TC1 ez(g) g, ……, TC1 ez(E) E); wherein min() represents taking the minimum value, and max() represents taking the maximum value;
S106, acquiring monitoring data sets corresponding to n monitoring moments before the current initial monitoring moment T c min in the historical monitoring data sets corresponding to the monitored server as current reference monitoring data sets; the initial value of T c min is T min;
s107, inputting the current reference monitoring data set into the current data processing model to obtain a corresponding prediction result;
S108, if (T max-Tc min +1) is less than or equal to d, filling the missing sequences in the monitoring data set in TC1 within the monitoring time range [ T c min,Tmax ] based on the prediction result to obtain DC f, and exiting the current control program; if (T max-Tc min +1) > d, filling the missing sequences in the monitoring dataset in TC1 within the monitoring time range [ T c min,Tc min +d ] based on the prediction result; and performs S109;
S109, setting T c min=Tc min +d+1, and executing S106.
7. The method according to claim 5, wherein S13 specifically comprises:
S111, for the i-th monitoring data set DC f i in DC f, acquiring the stability coefficient STC f i corresponding to DC f i; if STC f i < f0, executing S113; otherwise, executing S112; wherein STC f i = (∑n t=k+1 (dC f it − Avg(DC f i)) × (dC f i(t−k) − Avg(DC f i))) / (∑n o=1 (dC f io − Avg(DC f i)) 2), k is a preset number of monitoring moments, dC f it is the t-th monitoring data in DC f i, dC f i(t−k) is the (t−k)-th monitoring data in DC f i, Avg(DC f i) is the average value corresponding to DC f i, and f0 is a preset stability coefficient threshold;
S112, setting i=i+1, if i is less than or equal to m, executing S111; otherwise, taking the monitoring data sets corresponding to n monitoring moments before the current moment in DC f as D, and exiting the current control program;
S113, downsampling DC f i according to a set downsampling frequency to obtain D; wherein the duration corresponding to the set downsampling frequency is equal to L×Δa.
8. The method of claim 1, wherein the temporal convolution layer and the variable convolution layer are one-dimensional convolution layers.
9. An electronic device comprising a processor and a memory;
The processor is adapted to perform the steps of the method according to any of claims 1 to 8 by invoking a program or instruction stored in the memory.
10. A non-transitory computer-readable storage medium storing a program or instructions that cause a computer to perform the steps of the method of any one of claims 1 to 8.
CN202410540607.2A 2024-04-30 2024-04-30 Multi-dimensional time sequence prediction method, electronic equipment and storage medium Active CN118132402B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410540607.2A CN118132402B (en) 2024-04-30 2024-04-30 Multi-dimensional time sequence prediction method, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN118132402A true CN118132402A (en) 2024-06-04
CN118132402B CN118132402B (en) 2024-07-12

Family

ID=91246094


Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220103444A1 (en) * 2020-09-30 2022-03-31 Mastercard International Incorporated Methods and systems for predicting time of server failure using server logs and time-series data
CN114559951A (en) * 2022-03-16 2022-05-31 北京三快在线科技有限公司 Obstacle trajectory prediction method, obstacle trajectory prediction device, storage medium, and electronic device
CN116796186A (en) * 2023-04-28 2023-09-22 辽宁工业大学 Multidimensional time sequence data prediction method and system integrating double-attention mechanism
CN117786602A (en) * 2024-01-09 2024-03-29 华东师范大学 Long-period multi-element time sequence prediction method based on multi-element information interaction

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
LEI HUANG ET AL.: "Spatial-Temporal Convolutional Transformer Network for Multivariate Time Series Forecasting", SENSORS, 22 January 2022 (2022-01-22), pages 1 - 18 *
YI-FAN ZHANG ET AL.: "OneNet: Enhancing Time Series Forecasting Models under Concept Drift by Online Ensembling", ARXIV:2309.12659V1, 22 September 2023 (2023-09-22), pages 1 - 32 *
HAN Lu et al.: "Multivariate Time Series Prediction Based on Multi-Scale Feature Fusion and Dual Attention Mechanism", Computer Engineering, vol. 49, no. 9, 30 September 2023 (2023-09-30), pages 99 - 107 *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant