CN117420615A

CN117420615A - Coastal site wind speed prediction method based on space-time attention combined gating network

Info

Publication number: CN117420615A
Application number: CN202311491911.4A
Authority: CN
Inventors: 戴强晟; 毛莺池; 霍雪松; 荣毅; 汤剑松; 朱天昊
Original assignee: State Grid Jiangsu Electric Power Co Ltd; Hohai University HHU
Current assignee: State Grid Jiangsu Electric Power Co Ltd; Hohai University HHU
Priority date: 2023-11-10
Filing date: 2023-11-10
Publication date: 2024-01-19

Abstract

The invention discloses a coastal site wind speed prediction method based on a space-time attention combined gating network, which comprises the following steps: step 1: inputting the wind speed data collected by each site in the coastal area, the site position information and time stamps, and the auxiliary characteristic data collected by the sites; step 2: constructing a wind speed prediction model; step 3: training the wind speed prediction model with the acquired data; step 4: calculating the prediction accuracy of the wind speed prediction model and, if the accuracy exceeds a preset threshold value, executing step 5, otherwise returning to step 3; step 5: inputting the wind speed data, the static space-time data and the auxiliary characteristic data of each site in the coastal area into the trained wind speed prediction model to obtain wind speed predicted values for a plurality of sites in the coastal area. Compared with the prior art, the method has the advantages of good prediction effect and good practicability.

Description

Coastal site wind speed prediction method based on space-time attention combined gating network

Technical Field

The invention relates to a coastal site wind speed prediction method based on a space-time attention combined gating network, belongs to the technical field of weather prediction, and particularly relates to coastal site wind speed prediction.

Background

Accurate prediction of coastal wind speeds is significant for extreme weather event identification and early warning purposes. However, wind speed prediction is challenging due to the complexity of the external features and space-time correlation.

In the task of wind speed prediction with space-time correlation, existing methods fall into four main categories:

1) Physical methods: these solve the wind speed prediction problem by modeling the complete set of hydrodynamic and thermodynamic equations with environmental and geographic parameters (including temperature, humidity, surface roughness, etc.). Although physical methods describe the nature of atmospheric motion well, their application is still constrained by a heavy computational burden, strong dependence on parameters, limited spatial-temporal resolution, an inability to take local terrain into account, etc.

2) Classical statistical methods: these are low-cost, easy-to-apply techniques for capturing mathematical relationships in time series data. They include the AR (Autoregressive) and ARIMA (Autoregressive Integrated Moving Average) models. However, the limited ability of these methods to model nonlinear relationships restricts their prediction accuracy.

3) Shallow machine learning methods: to overcome the drawbacks of classical statistical methods, some studies began to predict wind speed with shallow machine learning models such as SVR (Support Vector Regression). Compared with classical statistical methods, these methods improve nonlinear modeling capability to some extent, but the models generalize weakly, cannot extract complex and dynamic space-time correlation from space-time data, and have limited capability to process massive data.

4) Deep learning methods: with the rapid development of deep learning, the potential of these methods to model dynamic spatiotemporal correlations in big data has attracted the interest of researchers, and some researchers began modeling wind speed big data with deep learning methods. Specifically, the RNN (Recurrent Neural Network) was first applied to extract temporal dependence, and models based on the CNN (Convolutional Neural Network) were used to mine dynamic spatial correlation in image-type wind speed data in Euclidean space. However, the wind speed sites within an area are distributed in a non-Euclidean space. To address this limitation, some researchers began modeling the spatio-temporal correlation of non-Euclidean wind speed data using the GCN (Graph Convolutional Network) combined with an RNN architecture. However, such methods ignore dynamic dependencies in wind speed data. Unlike the GCN+RNN architecture, temporal and spatial attention mechanisms based on self-attention calculate the weights of sites and time steps dynamically and thereby overcome the defects of the GCN+RNN methods. However, these methods still suffer from two major drawbacks: 1. they do not effectively fuse auxiliary feature data related to wind speed prediction, such as temperature and atmospheric pressure; 2. in current spatio-temporal networks, dynamic spatio-temporal features are insufficiently mined. Most networks model space-time characteristics with a pipeline structure that first extracts spatial features and then mines temporal features, ignoring the respective degrees of influence of the temporal and spatial features on the prediction.

Disclosure of Invention

The invention aims to: aiming at the problems and the defects existing in the prior art, the invention provides the coastal site wind speed prediction method based on the space-time attention combined gating network, which has good prediction effect and strong practicability.

The technical scheme is as follows: a coastal site wind speed prediction method based on a space-time attention joint gating network comprises the following steps:

step 1: generating input data;

step 2: constructing a wind speed prediction model;

step 3: training a wind speed prediction model constructed in the step 2;

step 4: calculating the accuracy of the wind speed prediction model prediction, if the accuracy exceeds a preset threshold value, executing the step 5, otherwise, returning to the step 3;

step 5: inputting the generated data into the trained wind speed prediction model to obtain wind speed predicted values of all sites in the coastal region.

Preferably, the step 1 specifically includes:

Wind speed data, site position information, time stamps and auxiliary characteristic data of all sites in the coastal area are collected; missing-value filling and abnormal-value detection and deletion are then applied to the data as preprocessing, finally generating three types of input data. The wind speed data comprise wind direction, wind speed and maximum wind speed within an hour. The position information and the time stamps constitute the static spatiotemporal features. The auxiliary characteristic data comprise 72 auxiliary features, such as sea surface temperature (°C) and surface air pressure (hPa).
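The preprocessing described above can be illustrated with a minimal Python sketch. The source does not say how abnormal values are detected or how missing values are filled, so the z-score rule, the threshold `z_thresh` and the linear interpolation below are assumptions chosen only for illustration, and `clean_series` is a hypothetical helper name.

```python
import numpy as np

def clean_series(values, z_thresh=3.0):
    """Delete outliers by a z-score rule (assumed), then fill gaps linearly (assumed)."""
    x = np.asarray(values, dtype=float).copy()
    mean, std = np.nanmean(x), np.nanstd(x)
    if std > 0:
        # Comparisons against NaN are simply False; silence the related warning.
        with np.errstate(invalid="ignore"):
            x[np.abs(x - mean) > z_thresh * std] = np.nan   # abnormal value deletion
    idx = np.arange(x.size)
    valid = ~np.isnan(x)
    # Missing value filling: interpolate between the nearest valid neighbours.
    x[~valid] = np.interp(idx[~valid], idx[valid], x[valid])
    return x

# Hourly wind speed of one site with one missing sample and one spike.
raw = np.full(20, 3.0)
raw[5] = np.nan
raw[10] = 50.0
cleaned = clean_series(raw)
```

In practice the rule would be applied per site and per feature; a robust statistic such as the median may be preferable when spikes are frequent.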

More preferably, the three types of input data include: an undirected graph structure generated from the data of all sites in the coastal area; static spatial and temporal features generated by encoding the site position information with Node2vec and the time stamps with one-hot encoding, respectively; and auxiliary feature data processed by FCs (Fully Connected Neural Networks).

Preferably, the step 2 specifically includes:

A gating mechanism and a space-time module are constructed based on deep learning principles to establish the wind speed prediction model, wherein the gating mechanism is used for fusing the static spatial, static temporal and auxiliary features, and the space-time module is used for extracting dynamic space-time correlation.

More preferably, the gating mechanism specifically comprises the following steps:

The inputs to the gating mechanism module are the static spatial, static temporal and auxiliary features. First, the static spatial and temporal information is combined to form ST (Static Spatial-Temporal feature); then, ST is integrated with the external (auxiliary) features to constitute STE (Spatial-Temporal-External feature).

More preferably, the space-time module uses ST-Encoding (space-time encoder), similar attention (similar attention module) and ST-Decoding (space-time decoder), specifically:

The input of ST-Encoding is the undirected graph structure, and the architecture of ST-Encoding contains spatial attention, temporal attention and ST Fusion structures. The spatial attention is composed of a multi-head GAT (multi-head Graph Attention Network) and is used for mining dynamic spatial correlation in the historical data; the temporal attention consists of a multi-head self-attention mechanism and is used for mining dynamic temporal correlation in the historical data; ST Fusion (space-time fusion) adaptively fuses the dynamic spatial and temporal features according to their importance.

The input of similar attention is the output representation of ST-Encoding. It is built on a self-attention mechanism, primarily to select the most relevant historical spatio-temporal hidden states and thereby reduce error propagation in prediction.

ST-Decoding has the same structure as ST-Encoding; its input is the output of similar attention, and it outputs the dynamic spatio-temporal correlation of the future state.

More preferably, the step 3 specifically includes:

The wind speed prediction model is trained with MSE (Mean Square Error) as the loss function, and the model is fine-tuned by the back propagation algorithm according to the value of the loss function.

More preferably, the loss function MSE is specifically:

$$MSE = \frac{1}{Q \times N} \sum_{i=1}^{Q} \sum_{j=1}^{N} \left( y_{i,j} - \hat{y}_{i,j} \right)^2$$

where $Q$ is the length of the predicted sequence, $N$ is the total number of target sites, $y_{i,j}$ represents the actual wind speed at site $j$ at predicted time step $i$, and $\hat{y}_{i,j}$ represents the predicted wind speed at site $j$ at predicted time step $i$.
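As a small worked example of the loss described here, the following Python sketch averages the squared error over Q predicted time steps and N sites. The function name `mse_loss` and the array layout (rows are time steps, columns are sites) are assumptions made for illustration.

```python
import numpy as np

def mse_loss(y_true, y_pred):
    """Mean square error over Q predicted time steps (rows) and N sites (columns)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    q_steps, n_sites = y_true.shape
    return float(np.sum((y_true - y_pred) ** 2) / (q_steps * n_sites))

# Q = 2 time steps, N = 2 sites; a single prediction is off by 2 m/s.
actual = [[1.0, 2.0], [3.0, 4.0]]
predicted = [[1.0, 2.0], [3.0, 6.0]]
loss = mse_loss(actual, predicted)   # (2 ** 2) / (2 * 2) = 1.0
```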

A coastal site wind speed prediction device based on a space-time attention joint gating network comprises the following modules:

module one: used for generating input data;

module two: used for constructing a wind speed prediction model;

module three: used for training the wind speed prediction model constructed by module two;

module four: used for calculating the prediction accuracy of the wind speed prediction model; if the accuracy exceeds a preset threshold value, module five is executed, otherwise the process returns to module three;

module five: used for inputting the generated data into the trained wind speed prediction model to obtain wind speed predicted values of all sites in the coastal region.

A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing a coastal site wind speed prediction method based on a spatio-temporal attention-combining gating network as described above when executing the computer program.

A computer readable storage medium storing a computer program for performing a coastal site wind speed prediction method based on a spatiotemporal attention joint gating network as described above.

Compared with the prior art, the invention has the following beneficial effects:

1. Auxiliary features are effectively fused: the invention provides a gating mechanism that adaptively weighs the influence of the auxiliary features and the static space-time features on the prediction, fuses them accordingly, and feeds the result into the space-time module as an important reference for wind speed prediction.

2. Dynamic space-time correlation is fully mined: a new space-time module is provided for mining dynamic space-time correlation. Specifically, a network architecture consisting of ST-Encoding, similar attention and ST-Decoding is designed. First, the spatial and temporal correlations in the historical data are extracted by the spatial attention and temporal attention mechanisms in ST-Encoding, respectively, and ST Fusion combines them adaptively according to their importance; similar attention is then introduced to select the most relevant historical spatiotemporal hidden states, reducing error propagation; finally, the dynamic space-time correlation of the future state is output by ST-Decoding, which has the same structure as ST-Encoding.

Drawings

FIG. 1 is a flowchart of a wind speed prediction method according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of a gating mechanism of a wind speed prediction model according to an embodiment of the present invention;

FIG. 3 is an overall block diagram of a wind speed prediction model according to an embodiment of the present invention.

Detailed Description

The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present invention without making any inventive effort, shall fall within the scope of the present invention.

The invention establishes a multi-site wind speed prediction model in coastal areas by utilizing the learning capacity of deep learning on big data. A coastal site wind speed prediction method based on a space-time attention combined gating network is shown in fig. 1, and comprises the following steps:

step 1: generating input data;

The multi-site undirected graph of the coastal region is defined as $G = (V, E)$, where $V$ represents the set of sites in the coastal region and $E$ represents the set of edges between sites. The wind speed of the whole graph at time step $t$ is expressed as $X_t = (x_t^{v_1}, x_t^{v_2}, \ldots, x_t^{v_N}) \in \mathbb{R}^{N \times C}$, where $x_t^{v_N} \in \mathbb{R}^{C}$ represents the wind speed of site $N$ at time step $t$ and $C$ represents the dimension of the target feature (e.g., 2-minute average wind direction (°), 2-minute average wind speed (m/s), or maximum wind speed (m/s)); the sampling frequency is extended to hours by averaging. Given $P = 48$ historical time steps at hourly intervals, the input undirected graph sequence can be expressed as $X = (X_{t_1}, X_{t_2}, \ldots, X_{t_P}) \in \mathbb{R}^{P \times N \times C}$, where $C = 1$ represents the current prediction target and $N = 14$ represents the total number of coastal sites. A two-layer FCs is introduced to convert the dimension of $X$ from $C$ to $d_{model}$, giving a representation in $\mathbb{R}^{P \times N \times d_{model}}$, i.e. the input hidden state of the wind speed of the $N$ sites over the $P$ time steps.
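To make the tensor shapes concrete, the following Python sketch slices an hourly multi-site series into input windows of P = 48 historical steps and targets of Q = 1 future step. The sliding-window construction and the helper name `make_windows` are assumptions for illustration; the source only states the shapes of the inputs.

```python
import numpy as np

def make_windows(series, p_steps=48, q_steps=1):
    """series: (T, N, C) hourly data -> X: (S, P, N, C) inputs, Y: (S, Q, N, C) targets."""
    series = np.asarray(series)
    n_samples = series.shape[0] - p_steps - q_steps + 1
    if n_samples <= 0:
        raise ValueError("series is shorter than P + Q time steps")
    x = np.stack([series[s:s + p_steps] for s in range(n_samples)])
    y = np.stack([series[s + p_steps:s + p_steps + q_steps] for s in range(n_samples)])
    return x, y

# 60 hours of data for N = 14 sites and C = 1 target feature.
hourly = np.arange(60 * 14, dtype=float).reshape(60, 14, 1)
X, Y = make_windows(hourly)
```

Each sample pairs 48 consecutive hours with the hour that immediately follows them, so 60 hours of data yield 12 samples.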

According to the site location information, site representations are learned with Node2vec as the static spatial information, and the dimension is then unified to $d_{model}$ by two layers of FCs, giving $S \in \mathbb{R}^{N \times d_{model}}$, i.e. the static spatial feature representation. Since $S$ is static, it is the same for every one of the $P$ historical time steps and the $Q = 1$ known prediction time step (1 hour in the future), so $S$ is expanded to $\mathbb{R}^{(P+Q) \times N \times d_{model}}$. From the time stamps, static temporal information for the $P$ historical time steps and the $Q$ known prediction time steps is generated using one-hot encoding and two-layer FCs, giving a representation in $\mathbb{R}^{(P+Q) \times d_{model}}$, i.e. the static temporal feature representation. Since the time information is the same for every site, the static temporal feature is copied $N$ times, in the same way as the spatial feature, expanding it to $\mathbb{R}^{(P+Q) \times N \times d_{model}}$.
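A minimal Python sketch of the time-stamp encoding and the copy over sites follows. The source does not state which calendar fields are one-hot encoded, so the hour of day (24 slots) plus day of week (7 slots) used here is an assumption, and the two-layer FCs that maps the result to d_model is omitted. The helper names are hypothetical.

```python
import numpy as np
from datetime import datetime, timedelta

def one_hot_time(timestamps):
    """Encode each time stamp as hour-of-day (24) + day-of-week (7) one-hot vectors (assumed)."""
    out = np.zeros((len(timestamps), 24 + 7))
    for row, ts in enumerate(timestamps):
        out[row, ts.hour] = 1.0
        out[row, 24 + ts.weekday()] = 1.0
    return out

def tile_over_sites(time_feat, n_sites):
    """Copy a (P+Q, D) time feature N times to obtain a (P+Q, N, D) tensor."""
    return np.repeat(time_feat[:, None, :], n_sites, axis=1)

start = datetime(2023, 11, 10, 0)                          # a Friday, weekday() == 4
stamps = [start + timedelta(hours=h) for h in range(49)]   # P + Q = 48 + 1 steps
time_feat = one_hot_time(stamps)
time_feat_all_sites = tile_over_sites(time_feat, n_sites=14)
```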

The auxiliary features include:

100u: east-west wind component at a height of 100 meters above the ground (m/s);
100v: north-south wind component at a height of 100 meters above the ground (m/s);
10u: east-west wind component at a height of 10 meters above the ground (m/s);
10v: north-south wind component at a height of 10 meters above the ground (m/s);
2d: dew point temperature at 2 meters above the ground (°C);
2t: temperature at 2 meters above the ground (°C);
cape: convective available potential energy (J/kg);
capes: convective available potential energy shear (J/kg);
cp: convective precipitation, cumulative amount (mm);
deg0l: height of the 0 °C level (m);
lcc: low cloud cover (0-1);
msl: mean sea level pressure (hPa);
skt: surface temperature (°C);
sp: surface air pressure (hPa);
sst: sea surface temperature (°C);
tcc: total cloud cover (0-1);
isobaric-level data: 8 levels in total, at 1000, 950, 925, 900, 850, 700, 500 and 200 hPa. The 1000 hPa level is described below, and the other levels are analogous: d_L1000: divergence (1/s); q_L1000: specific humidity (g/kg); r_L1000: relative humidity (%); t_L1000: temperature (°C); u_L1000: east-west wind component (m/s); v_L1000: north-south wind component (m/s); w_L1000: vertical velocity (Pa/s).

These make up 72 auxiliary features in total (16 surface features plus 7 features on each of the 8 isobaric levels). The sampling frequency of each feature is extended to hours by averaging; in addition, the values of the auxiliary features at the $Q$ known prediction time steps can be obtained from a weather forecast system. Thus, the auxiliary features can be expressed as a tensor in $\mathbb{R}^{(P+Q) \times N \times 72}$, and a 2-layer FCs unifies the dimension of the auxiliary features from 72 to $d_{model}$, giving $E \in \mathbb{R}^{(P+Q) \times N \times d_{model}}$.

Step 2: constructing a wind speed prediction model;

the specific structure of the model is shown in fig. 3. The model mainly comprises two parts, a gating mechanism and a space-time module.

The gating mechanism is shown in fig. 2. First, the static spatial and temporal features are fused to obtain the static space-time feature ST; this fusion uses two learnable weights and a learnable bias.

Then, with the sigmoid function used as the activation function $\sigma$, the fusion weight $z$ of ST and E is obtained, again through two learnable weights and a learnable bias. Here $z$ acts as a gate that adaptively controls the importance of ST and E while emphasizing the important impact of the static spatiotemporal information.

Finally, ST and E are fused according to their impact weights to obtain the spatiotemporal external feature STE based on the following formula:

$STE = ST \odot z + E \odot (1 - z)$

where $\odot$ denotes the element-wise product.
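The gated fusion can be sketched in Python as follows. Only the final formula, STE = ST ⊙ z + E ⊙ (1 - z), is taken from the text; the linear form used to build ST from the static spatial and temporal features and the form of the gate z (a sigmoid over a weighted sum of ST and E) are assumptions consistent with "two learnable weights and a learnable bias". All names, the value of d_model and the random weights are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gate_fuse(st, e, w_st, w_e, b_z):
    """Return STE = ST * z + E * (1 - z) together with the gate z."""
    z = sigmoid(st @ w_st + e @ w_e + b_z)    # assumed gate: sigma(ST W1 + E W2 + b)
    return st * z + e * (1.0 - z), z

rng = np.random.default_rng(0)
steps, n_sites, d_model = 49, 14, 8           # P + Q = 49, N = 14; d_model is hypothetical

def rand_weight():
    return 0.1 * rng.normal(size=(d_model, d_model))

S = rng.normal(size=(steps, n_sites, d_model))   # static spatial features
T = rng.normal(size=(steps, n_sites, d_model))   # static temporal features
E = rng.normal(size=(steps, n_sites, d_model))   # auxiliary (external) features

ST = S @ rand_weight() + T @ rand_weight() + np.zeros(d_model)   # assumed linear fusion
STE, z = gate_fuse(ST, E, rand_weight(), rand_weight(), np.zeros(d_model))
```

Because z lies strictly between 0 and 1, every entry of STE is a convex combination of the corresponding entries of ST and E, which is what lets the gate trade one source off against the other.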

The space-time module includes three parts: ST-Encoding, similar attention and ST-Decoding. It is a space-time feature mining network constructed on a self-attention mechanism and a graph attention mechanism, and is used for extracting the space-time associated features in the historical wind speed data acquired by the sites.

Since a nonlinear transformation function is used frequently in the space-time module, it is defined first:

$f(x) = \mathrm{ReLU}(Wx + b)$

where $W$ and $b$ are trainable parameters, ReLU represents the activation function, and $x$ is the input of the nonlinear transformation function.

ST-Encoding is composed of three parts: spatial attention, temporal attention and ST Fusion.

Spatial attention is built on Multi-GAT and is a multi-layer architecture. Its inputs are the input hidden state of the wind speed and the spatiotemporal external features $STE_P$ of the past $P$ time steps, and its final output is $HS^{Enc}$. Each layer $l$ outputs, for every site $v_i$ and time step $t_j$, a dynamic spatial correlation. For site $v_i$ at time step $t_j$, the attention coefficient between a site $v$ and $v_i$ is computed from the influence between $v$ and $v_i$, taken over the set $V$ of all sites.

The influence is obtained as the inner product of the key vector of site $v$ and the query vector of site $v_i$. The key and the query of the $m$-th attention head are each produced by the nonlinear transformation; $[*, *]$ and $\langle *, * \rangle$ denote concatenation and inner product, respectively, and the concatenation involves the spatiotemporal external feature of site $v_i$ at time step $t_j$.

After the attention coefficients are obtained, the dynamic spatial correlation is formed from the attention coefficients and the value vectors, where the value vector of the $m$-th head is also generated by a nonlinear transformation, and $\|$ and $M$ denote concatenation and the total number of heads, respectively. Through multi-layer iteration, the final dynamic spatial correlation of site $v_i$ at time step $t_j$ is obtained; $W_s$ denotes a learnable weight matrix.
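One layer of the spatial attention described here might look like the following Python sketch. It is not the source's exact formulation: the softmax normalisation over all sites, the scaling by the square root of the head width, and the use of the concatenated input for the value vectors are assumptions in the style of standard multi-head attention. Parameter names and sizes are hypothetical.

```python
import numpy as np

def relu_fc(x, w, b):
    """Nonlinear transformation f(x) = ReLU(Wx + b), applied on the last axis."""
    return np.maximum(x @ w + b, 0.0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def spatial_attention(h, ste, params, n_heads):
    """h, ste: (P, N, D). Returns the (P, N, D) output and coefficients (P, M, N, N)."""
    x = np.concatenate([h, ste], axis=-1)                 # [h, ste]: (P, N, 2D)
    q = relu_fc(x, *params["q"])                          # query vectors
    k = relu_fc(x, *params["k"])                          # key vectors
    v = relu_fc(x, *params["v"])                          # value vectors
    p, n, d = q.shape
    width = d // n_heads

    def split(t):                                         # (P, N, D) -> (P, M, N, D/M)
        return t.reshape(p, n, n_heads, width).transpose(0, 2, 1, 3)

    q, k, v = split(q), split(k), split(v)
    scores = q @ k.transpose(0, 1, 3, 2) / np.sqrt(width)  # <query, key> per site pair
    alpha = softmax(scores, axis=-1)                       # normalised over all sites V
    heads = (alpha @ v).transpose(0, 2, 1, 3).reshape(p, n, d)   # concatenate M heads
    return relu_fc(heads, *params["o"]), alpha

rng = np.random.default_rng(0)
P, N, D, M = 4, 14, 8, 2                                  # small hypothetical sizes
params = {name: (0.1 * rng.normal(size=(2 * D, D)), np.zeros(D)) for name in "qkv"}
params["o"] = (0.1 * rng.normal(size=(D, D)), np.zeros(D))
HS, alpha = spatial_attention(rng.normal(size=(P, N, D)), rng.normal(size=(P, N, D)), params, M)
```

For every time step and head, each row of `alpha` holds the weights one site assigns to all sites, so the rows sum to one.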

The output of temporal attention is $HT^{Enc}$. Each layer $r$ outputs, for every site $v_i$ and time step $t_j$, a dynamic temporal correlation. For site $v_i$ at time step $t_j$, the attention coefficient of a time step $t$ with respect to $t_j$ in the $m$-th head is computed from the correlation between $t_j$ and $t$, taken over the set of historical time steps.

The correlation is obtained as the inner product of the query vector of time step $t_j$ and the key vector of time step $t$.

Finally, the temporal correlation of the $r$-th layer is formed from the attention coefficients and the value vectors, where the value vector of the $m$-th head is obtained through the nonlinear transformation function, BN denotes normalization, and the expression involves a preset constant.

Through multi-layer iteration, the temporal correlation of site $v_i$ at time step $t_j$ is obtained.
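A matching Python sketch for one temporal attention layer is given below. As with the spatial case, the softmax over historical time steps and the square-root scaling are assumptions borrowed from standard multi-head self-attention; the normalization step BN mentioned in the text is omitted, and bias terms are dropped for brevity. All names and sizes are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def temporal_attention(h, ste, w_q, w_k, w_v, n_heads):
    """h, ste: (P, N, D). Attends over the P time steps of each site separately."""
    x = np.concatenate([h, ste], axis=-1).transpose(1, 0, 2)   # (N, P, 2D)
    q = np.maximum(x @ w_q, 0.0)                               # ReLU(Wx), bias omitted
    k = np.maximum(x @ w_k, 0.0)
    v = np.maximum(x @ w_v, 0.0)
    n, p, d = q.shape
    width = d // n_heads

    def split(t):                                              # (N, P, D) -> (N, M, P, D/M)
        return t.reshape(n, p, n_heads, width).transpose(0, 2, 1, 3)

    q, k, v = split(q), split(k), split(v)
    scores = q @ k.transpose(0, 1, 3, 2) / np.sqrt(width)      # <query t_j, key t>
    beta = softmax(scores, axis=-1)                            # normalised over time steps
    heads = (beta @ v).transpose(0, 2, 1, 3).reshape(n, p, d)  # concatenate M heads
    return heads.transpose(1, 0, 2), beta                      # back to (P, N, D)

rng = np.random.default_rng(1)
P, N, D, M = 48, 14, 8, 2
w_q, w_k, w_v = (0.1 * rng.normal(size=(2 * D, D)) for _ in range(3))
HT, beta = temporal_attention(rng.normal(size=(P, N, D)), rng.normal(size=(P, N, D)),
                              w_q, w_k, w_v, M)
```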

ST Fusion consists of a gating mechanism that adaptively fuses the dynamic temporal and spatial features according to their importance. The ST Fusion working procedure is:

$HST^{Enc} = HS^{Enc} \odot z + HT^{Enc} \odot (1 - z)$

where the gate $z$ is computed with the sigmoid function $\sigma$ from learnable parameters, $\odot$ denotes the element-wise product, and $HST^{Enc}$ is the output of ST-Encoding.

Similar attention is also a module based on the self-attention mechanism. Its inputs are the output $HST^{Enc}$ of ST-Encoding, which contains the historical space-time hidden states up to the $t_P$-th time step, and the spatiotemporal external features of the historical $P$ and future $Q$ time steps, $STE_P = (ste_1, ste_2, \ldots, ste_P)$ and $STE_Q = (ste_{P+1}, ste_{P+2}, \ldots, ste_{P+Q})$. Its output contains the space-time hidden states of the future time steps up to the $Q$-th. First, the correlation between a future time step $t_a$ and a historical time step $t_b$ is calculated, where $P < t_a \le P + Q$ and $t_b \le P$; it is evaluated separately in each head $m$.

From the correlation between time steps $t_a$ and $t_b$, the attention coefficient of $t_a$ for $t_b$ in the $m$-th head is obtained, and the attention coefficients yield the space-time hidden state of the future time step $t_a$ at layer $l$; $W_t$ is a learnable weight.
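The following single-head Python sketch shows one way the similar attention step could map P historical hidden states to Q future ones. Taking the queries from STE_Q, the keys from STE_P and the values from the encoder output is an assumption based on the inputs listed in the text, as are the softmax and the square-root scaling; multiple heads and layers are left out. Names and sizes are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def similar_attention(h_enc, ste_p, ste_q, w_q, w_k, w_v):
    """h_enc, ste_p: (P, N, D); ste_q: (Q, N, D). Returns (Q, N, D) and weights (N, Q, P)."""
    q = np.maximum(ste_q @ w_q, 0.0).transpose(1, 0, 2)   # future queries:   (N, Q, D)
    k = np.maximum(ste_p @ w_k, 0.0).transpose(1, 0, 2)   # historical keys:  (N, P, D)
    v = np.maximum(h_enc @ w_v, 0.0).transpose(1, 0, 2)   # historical values (N, P, D)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])   # correlation of t_a and t_b
    lam = softmax(scores, axis=-1)                        # weights over historical steps
    return (lam @ v).transpose(1, 0, 2), lam

rng = np.random.default_rng(2)
P, Q, N, D = 48, 1, 14, 8
w_q, w_k, w_v = (0.1 * rng.normal(size=(D, D)) for _ in range(3))
H_future, lam = similar_attention(rng.normal(size=(P, N, D)), rng.normal(size=(P, N, D)),
                                  rng.normal(size=(Q, N, D)), w_q, w_k, w_v)
```

Because every future step attends directly to all historical steps, the prediction for one future step does not depend on the prediction for another, which is one way to read the stated goal of reducing error propagation.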

ST-Decoding adopts the same structure as ST-Encoding and outputs the dynamic space-time correlation of the future state. Finally, the wind speed predicted values of the $N$ sites in the coastal area for the future $Q$ time steps are obtained through two layers of FCs.

Step 3: training the prediction model constructed in the step 2;

For the constructed model, a training set, a validation set and a test set are built. The model is initialized and trained with the training set, its parameters are tuned using the validation set and the evaluation indexes, and its performance is tested with the test set. A model is trained separately for each target in the target wind speed set (wind direction, wind speed and maximum wind speed within an hour) to perform prediction.

In this example, the training, validation and test data sets account for 70%, 10% and 20% of the data, respectively.
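The 70% / 10% / 20% split can be sketched in Python as below. The source does not say whether the split is chronological or random; a chronological split is assumed here because it is common for time series, and `split_indices` is a hypothetical helper.

```python
import numpy as np

def split_indices(n_samples, train_pct=70, val_pct=10):
    """Return index arrays for a chronological train / validation / test split (assumed)."""
    n_train = n_samples * train_pct // 100
    n_val = n_samples * val_pct // 100
    idx = np.arange(n_samples)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

train_idx, val_idx, test_idx = split_indices(1000)
```

Keeping the test samples strictly later than the training samples avoids evaluating the model on windows that overlap what it was trained on.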

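Training is carried out with the MSE loss function and the model is fine-tuned until training is complete; finally, the wind speed predicted values of all sites in the coastal area for the future Q time steps are output.

The loss function MSE is the mean square error, specifically:

$$MSE = \frac{1}{Q \times N} \sum_{i=1}^{Q} \sum_{j=1}^{N} \left( y_{i,j} - \hat{y}_{i,j} \right)^2$$

where $Q$ is the length of the predicted sequence, $N$ is the total number of target sites, and $y_{i,j}$ and $\hat{y}_{i,j}$ are the actual and predicted wind speed at site $j$ at predicted time step $i$.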

Step 4: calculating the accuracy of model prediction, if the accuracy exceeds a preset threshold, executing the step 5, otherwise, returning to the step 3;

Step 5: one of the prediction targets is selected in advance, the multi-site data of the coastal region are input into the trained prediction model, and the wind speed predicted values of all sites for the future Q time steps are obtained.
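The control flow of steps 3 to 5 (train, check the accuracy against a threshold, repeat or proceed) can be sketched in Python as follows. The source does not define how accuracy is measured for a regression target, so `evaluate` is left as a caller-supplied function; the `max_rounds` guard and all names are hypothetical additions for illustration.

```python
def train_until_accurate(train_round, evaluate, threshold, max_rounds=100):
    """Repeat step 3 until the accuracy from step 4 exceeds the threshold."""
    for round_idx in range(1, max_rounds + 1):
        train_round()                    # step 3: one round of training
        accuracy = evaluate()            # step 4: compute the prediction accuracy
        if accuracy > threshold:
            return round_idx, accuracy   # step 5 may now run on the trained model
    raise RuntimeError("accuracy threshold not reached within max_rounds")

# Toy stand-ins for a real model: each round raises the accuracy by 0.25.
state = {"accuracy": 0.0}

def fake_train_round():
    state["accuracy"] += 0.25

def fake_evaluate():
    return state["accuracy"]

rounds, final_accuracy = train_until_accurate(fake_train_round, fake_evaluate, threshold=0.7)
```

The guard on the number of rounds is not in the source; without one, a model that never reaches the threshold would loop between steps 3 and 4 indefinitely.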


The implementation process of each module of the coastal site wind speed prediction device based on the space-time attention combined gating network is the same as the implementation process of the corresponding steps of the method, and is not repeated.

The present embodiment also relates to a storage medium in which any one of the above wind speed prediction methods is stored.

It will be apparent to those skilled in the art that the steps of the coastal site wind speed prediction method based on a spatio-temporal attention-combining gating network of the embodiments described above may be implemented by general-purpose computing devices. They may be concentrated on a single computing device or distributed over a network of computing devices. Alternatively, they may be implemented as program code executable by computing devices, so that they can be stored in a storage device and executed by a computing device; in some cases, the steps shown or described may be executed in a different order than described herein. They may also be fabricated separately as individual integrated circuit modules, or several of the modules or steps may be fabricated as a single integrated circuit module. Thus, embodiments of the invention are not limited to any specific combination of hardware and software.

While the invention has been described with reference to certain preferred embodiments, it will be understood by those skilled in the art that various changes and equivalent substitutions may be made without departing from the scope of the invention. Therefore, the protection scope of the invention is subject to the protection scope of the claims.

Claims

1. The coastal site wind speed prediction method based on the space-time attention combined gating network is characterized by comprising the following steps of:

step 1: generating input data;

step 2: constructing a wind speed prediction model;

step 3: training a wind speed prediction model constructed in the step 2;

2. The coastal site wind speed prediction method based on the spatio-temporal attention combination gating network according to claim 1, wherein the step 1 is specifically:

collecting wind speed data of all sites in coastal areas, position information of the sites, time stamps and auxiliary characteristic data, preprocessing the data, and generating three types of input data; the wind speed data comprise wind direction, wind speed and maximum wind speed information in an hour; the time stamp refers to static spatiotemporal data.

3. The coastal site wind speed prediction method based on the spatio-temporal attention combination gating network of claim 2, wherein the three types of input data include: an undirected graph structure generated from the data of all sites in the coastal area; static spatial and temporal information generated by encoding the site position information with Node2vec and the time stamps with one-hot encoding, respectively; and auxiliary characteristic data processed by a fully-connected network.

4. The coastal site wind speed prediction method based on the spatio-temporal attention combination gating network according to claim 1, wherein the step 2 is specifically:

establishing a gating mechanism and a space-time module based on a deep learning principle to establish a wind speed prediction model, wherein the gating mechanism is used for fusing static space, static time and auxiliary characteristics;

the construction of the gating mechanism comprises the following contents:

firstly, fusing the static spatial and temporal features to obtain the fusion result ST, using two trainable weight matrices and a trainable bias;

then, using the sigmoid function as the activation function $\sigma$, calculating the fusion weight $z$ of ST and E, using two trainable weight matrices and a learnable bias;

finally, fusing ST and E according to their impact weights to obtain the spatiotemporal external feature STE based on the following formula:

$STE = ST \odot z + E \odot (1 - z)$;

wherein $\odot$ denotes the element-wise product;

the space-time module is a space-time feature mining network constructed based on a self-attention mechanism and a graph attention mechanism and is used for extracting dynamic space-time correlation in the historical wind speed data acquired by the sites.

5. The coastal site wind speed prediction method based on the spatio-temporal attention combination gating network of claim 4, wherein the gating mechanism is:

the inputs to the gating mechanism are the encoded site location information, the time stamps and the auxiliary features; firstly, the site position information and the time stamps are fused to form the static space-time feature; the static space-time feature is then integrated with the external features to form the spatiotemporal external feature.

6. The method for predicting wind speed at a coastal site based on a joint spatio-temporal attention gating network of claim 4, wherein said spatio-temporal modules comprise ST-Encoding, similar attention and ST-Decoding;

in the space-time module, a nonlinear transformation function is defined as:

$f(x) = \mathrm{ReLU}(Wx + b)$

wherein $W$ and $b$ are trainable parameters, ReLU represents the activation function, and $x$ is the input of the nonlinear transformation function;

the inputs of ST-Encoding are the undirected graph and STE, and the architecture of ST-Encoding contains spatial attention, temporal attention and ST Fusion structures; spatial attention is used to mine the dynamic spatial correlation in the historical data; temporal attention is used to mine the dynamic temporal correlation in the historical data; ST Fusion is used to adaptively fuse the dynamic spatiotemporal features;

similar attention is used for selecting the most relevant historical spatiotemporal hidden states;

spatial attention is built based on Multi-GAT and is a multi-layer architecture; its inputs are the input hidden state of the wind speed and the spatiotemporal external features $STE_P$ of the past $P$ time steps, and its final output is $HS^{Enc}$; each layer $l$ outputs, for every site $v_i$ and time step $t_j$, a dynamic spatial correlation; for site $v_i$ at time step $t_j$, the attention coefficient between a site $v$ and $v_i$ is computed from the influence between $v$ and $v_i$, taken over the set $V$ of all sites;

the key and the query of the $m$-th attention head are each generated by the nonlinear transformation; $[*, *]$ and $\langle *, * \rangle$ denote concatenation and inner product, respectively, and the concatenation involves the spatiotemporal external feature of site $v_i$ at time step $t_j$;

after the attention coefficients are obtained, the dynamic spatial correlation is formed from the attention coefficients and the value vectors, wherein the value vector of the $m$-th head is generated by a nonlinear transformation, and $\|$ and $M$ denote concatenation and the total number of heads, respectively; through multi-layer iteration, the final dynamic spatial correlation of site $v_i$ at time step $t_j$ is obtained; $W_s$ denotes a learnable weight matrix;

the output of temporal attention is $HT^{Enc}$; each layer $r$ outputs, for every site $v_i$ and time step $t_j$, a dynamic temporal correlation; for site $v_i$ at time step $t_j$, the attention coefficient of a time step $t$ with respect to $t_j$ in the $m$-th head is computed from the correlation between $t_j$ and $t$, taken over the set of historical time steps;

finally, the temporal correlation of the $r$-th layer is formed from the attention coefficients and the value vectors, wherein the value vector of the $m$-th head is obtained through the nonlinear transformation function, BN denotes normalization, and the expression involves a preset constant;

through multi-layer iteration, the temporal correlation of site $v_i$ at time step $t_j$ is obtained;

ST Fusion is composed of a gating mechanism and adaptively fuses the dynamic temporal and spatial features according to their importance; the ST Fusion working procedure is:

$HST^{Enc} = HS^{Enc} \odot z + HT^{Enc} \odot (1 - z)$

wherein the gate $z$ is computed with the sigmoid function $\sigma$ from learnable parameters, $\odot$ denotes the element-wise product, and $HST^{Enc}$ is the output of ST-Encoding;

similar attention is also a module based on the self-attention mechanism; its inputs are the output $HST^{Enc}$ of ST-Encoding, which contains the historical space-time hidden states up to the $t_P$-th time step, and the spatiotemporal external features of the historical $P$ and future $Q$ time steps, $STE_P = (ste_1, ste_2, \ldots, ste_P)$ and $STE_Q = (ste_{P+1}, ste_{P+2}, \ldots, ste_{P+Q})$; its output contains the space-time hidden states of the future time steps up to the $Q$-th; first, the correlation between a future time step $t_a$ and a historical time step $t_b$ is calculated, wherein $P < t_a \le P + Q$ and $t_b \le P$, and it is evaluated separately in each head $m$;

from the correlation between time steps $t_a$ and $t_b$, the attention coefficient of $t_a$ for $t_b$ in the $m$-th head is obtained, and the attention coefficients yield the space-time hidden state of the future time step $t_a$ at layer $l$; $W_t$ is a learnable weight;

ST-Decoding adopts the same structure as ST-Encoding, its input is the output of similar attention, and it outputs the dynamic space-time correlation of the future state; finally, the wind speed predicted values of the $N$ sites in the coastal area for the future $Q$ time steps are obtained through two layers of FCs.

7. The coastal site wind speed prediction method based on the spatio-temporal attention combination gating network according to claim 1, wherein the step 3 is specifically:

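training the wind speed prediction model, wherein the loss function adopted for training is MSE, and the model is fine-tuned with the back propagation algorithm according to the value of the loss function;

the loss function MSE is specifically:

$$MSE = \frac{1}{Q \times N} \sum_{i=1}^{Q} \sum_{j=1}^{N} \left( y_{i,j} - \hat{y}_{i,j} \right)^2$$

wherein $Q$ is the length of the predicted sequence, $N$ is the total number of target sites, and $y_{i,j}$ and $\hat{y}_{i,j}$ are the actual and predicted wind speed at site $j$ at predicted time step $i$.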

8. Coastal site wind speed prediction device based on space-time attention combined gating network is characterized by comprising the following modules:

module one: for generating input data;

9. A computer device, characterized by: the computer device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the coastal site wind speed prediction method based on a spatio-temporal attention-combining gating network as claimed in any one of claims 1 to 7 when executing the computer program.

10. A computer-readable storage medium, characterized by: the computer readable storage medium stores a computer program for performing the coastal site wind speed prediction method based on the spatiotemporal attention joint gating network of any one of claims 1 to 7.