CN117688453B

CN117688453B - Traffic flow prediction method based on space-time embedded attention network

Info

Publication number: CN117688453B
Application number: CN202410147357.6A
Authority: CN
Inventors: 曾庆田; 赵志华; 原桂远; 李超; 段华; 宋戈; 周长红; 郭文艳; 程成
Original assignee: Shandong University of Science and Technology
Current assignee: Shandong University of Science and Technology
Priority date: 2024-02-02
Filing date: 2024-02-02
Publication date: 2024-04-30
Anticipated expiration: 2044-02-02
Also published as: CN117688453A

Abstract

The invention discloses a traffic flow prediction method based on a space-time embedded attention network, which belongs to the field of traffic flow prediction and comprises the following steps: step 1, acquiring an existing traffic flow data set from a public website, and sampling it with a sliding window to obtain historical traffic flow data, time information, space information and future traffic flow label data for training; step 2, constructing position coding matrices of time and space; step 3, calculating the cosine similarity among the sensor nodes by using the spatial position coding matrix to obtain a spatial mask matrix; step 4, constructing a traffic flow prediction model based on the space-time embedded attention network, and training the traffic flow prediction model; step 5, collecting traffic flow data of the previous time period, inputting it into the trained traffic flow prediction model, and predicting the traffic flow data of the next time period. The invention realizes accurate prediction of traffic flow.

Description

Traffic flow prediction method based on space-time embedded attention network

Technical Field

The invention belongs to the field of traffic flow prediction, and particularly relates to a traffic flow prediction method based on a space-time embedded attention network.

Background

With the development of smart cities, intelligent transportation systems have begun to manage, analyze and improve urban traffic conditions. Traffic flow prediction is widely studied as a core technology of intelligent transportation systems. Traffic flow sequences are derived from human activity data and exhibit significant spatiotemporal distributions and periodic patterns, which makes them typical spatiotemporal sequences. Existing studies mainly face the following two problems.

The periodic pattern of traffic flow is difficult to model. Because human activity is strongly periodic, traffic data also changes periodically. Early spatiotemporal neural networks used time information as a feature of the traffic data; however, adding time features only distinguishes traffic data at different moments and does not capture period information. Subsequently, the attention-based spatial-temporal graph convolutional network (ASTGCN) and the propagation delay-aware dynamic long-range Transformer (PDFormer) improved prediction accuracy by introducing more period information, using multi-period historical data and clustering of historical traffic sequences. However, these methods are limited by a narrow sliding window, which makes it difficult to model the overall periodic pattern. The spatial-temporal identity network (STID) then used embedding vectors to position-encode the time information, but ignored the differences between the periodic patterns of different days. For example, two days may both be working days and behave similarly in the morning, yet Friday afternoon may differ considerably from the afternoon of another working day. Therefore, mining finer and more accurate periodic patterns from traffic data, and making them interpretable, remains a great challenge.

The spatial relationship between sensor nodes is difficult to define. Early work based on ConvLSTM processed traffic data into grid data and modeled the spatial correlation between sensor nodes with two-dimensional convolution. However, sensor nodes are distributed along the topology of the road network, so convolution also models nodes that are spatially adjacent but uncorrelated. Because graph neural networks model topology well, the diffusion convolutional recurrent neural network (DCRNN) and Graph WaveNet represent the spatial distribution of traffic data as a topological graph and use graph neural networks to model the spatial correlation of sensor nodes. However, the topological relationships in traffic sequences tend to be ambiguous and to change dynamically, so building accurate, dynamic topological structures has become the focus of research. The self-attention mechanism is essentially a dynamic fully connected graph whose fusion weights are determined by its input, which makes it a preferred method for modeling dynamic spatial correlation. ASTGCN and the graph multi-attention network (GMAN) began to apply self-attention to the dynamic modeling of spatiotemporal features. These approaches are still based on a predefined adjacency matrix, and the quality of this prior knowledge may determine the upper limit of the model's ability to model spatial correlation. Therefore, avoiding the limitation of prior knowledge and mining the spatial correlation between nodes from the data itself is key to prediction accuracy.

Disclosure of Invention

In order to solve the above problems, the invention provides a traffic flow prediction method based on a space-time embedded attention network (STEAN). STEAN consists of a time trend layer, a space-time position encoder and a spatial mask attention layer. In the time trend layer, one-dimensional convolution extracts the time trend of the traffic sequence. In the space-time position encoder, the position of a time point within a period and the index of a sensor are used to perform space-time position coding on the time trend, and embedding vectors implicitly learn the periodic pattern in the historical sequence and the spatial correlation among nodes. A mask matrix is constructed from the learnable spatial position codes, and the attention layer models the spatial correlation among the nodes. The method thereby realizes accurate prediction of traffic flow.

The technical scheme of the invention is as follows:

A traffic flow prediction method based on a space-time embedded attention network comprises the following steps:

Step 1, acquiring an existing traffic flow data set from a public website, and sampling the existing traffic flow data set by a sliding window to obtain historical traffic flow data, time information, space information and future traffic flow label data for training;

Step 2, constructing a time position coding matrix and a spatial position coding matrix according to the set period length and the sensor nodes, respectively;

Step 3, calculating cosine similarity among the sensor nodes by using the spatial position coding matrix to obtain a spatial mask matrix;

Step 4, constructing a traffic flow prediction model based on a space-time embedded attention network by using a position coding matrix and a space mask matrix of time and space, and training the traffic flow prediction model;

Step 5, collecting traffic flow data of the previous time period, inputting it into the trained traffic flow prediction model, and predicting the traffic flow data of the next time period.

Further, in the step 1, the historical traffic flow data is regarded as a space-time sequence $X \in \mathbb{R}^{N \times T_h}$:

$X = \begin{bmatrix} x_{1,1} & \cdots & x_{1,T_h} \\ \vdots & \ddots & \vdots \\ x_{N,1} & \cdots & x_{N,T_h} \end{bmatrix}$

Wherein $N$ represents the number of sensor nodes; $T_h$ represents the total number of time points of the historical data, corresponding to the input length of the historical traffic flow; $x_{1,1}$ represents the traffic flow of the historical data of the first sensor node at the first time point; $x_{1,T_h}$ represents the traffic flow of the historical data of the first sensor node at the $T_h$-th time point; $x_{N,1}$ represents the traffic flow of the historical data of the $N$-th sensor node at the first time point; $x_{N,T_h}$ represents the traffic flow of the historical data of the $N$-th sensor node at the $T_h$-th time point;

Defining time information as $X_t \in (1, T)$, which is composed of the day of the week $X_{DiW}$, the time point $X_{TiD}$, the holiday flag $X_{ifH}$ and the number of time slices per day $T_{times}$; wherein $T = 8 \times T_{times}$, and $T$ represents the total number of time points in the set period; the specific formula of the time information is as follows:

$X_t = X_{TiD} + (X_{DiW}(1 - X_{ifH}) + 7 \times X_{ifH}) \times T_{times}$ (1);

Defining spatial information as $X_s \in (1, N)$;

Defining the traffic flow prediction problem as a space-time sequence prediction problem: using the historical traffic flow data $X$, the time information $X_t$ and the spatial information $X_s$, a mapping function $f(\cdot)$ is learned to obtain the traffic flow data at future moments $Y \in \mathbb{R}^{N \times T_p}$, wherein $T_p$ represents the output length of the predicted traffic flow; the specific formula is as follows:

$Y = f(X, X_t, X_s)$ (2).

Further, the specific process of the step 2 is as follows:

Step 2.1, constructing a time embedding matrix $E_t \in \mathbb{R}^{T \times F}$ in time, wherein $F$ represents the dimension of the embedding vector; using the time information $X_t$, the corresponding time position vector is taken out of the time embedding matrix to perform time position coding on the trend information extracted from the sequence, thereby obtaining the time position coding matrix; the time position coding process is described by the following formulas:

(3)；

(4)；

Wherein each row of the time embedding matrix $E_t$ is the time embedding vector of one time point in the set period;

Step 2.2, constructing a space embedding matrix $E_s \in \mathbb{R}^{N \times F}$ in space; using the spatial information $X_s$, the corresponding spatial position vector is taken out of the space embedding matrix to perform spatial position coding on the trend information extracted from the sequence, thereby obtaining the spatial position coding matrix; the spatial position coding process is described by the following formulas:

(5)；

(6)；

Wherein each row of the space embedding matrix $E_s$ is the spatial embedding vector of one sensor node.

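Further, in the step 3, the construction process of the spatial mask matrix $S_{mask}$ is expressed as follows:

$S_{cos}(i,j) = \dfrac{E_s^{(i)} \cdot E_s^{(j)}}{\lVert E_s^{(i)} \rVert \, \lVert E_s^{(j)} \rVert}$ (7);

$S_{mask}(i,j) = \begin{cases} 0, & S_{cos}(i,j) \ge threshold \\ -inf, & S_{cos}(i,j) < threshold \end{cases}$ (8);

Wherein $S_{cos}$ is the cosine similarity matrix between the spatial embedding vectors, with $E_s^{(i)}$ denoting the spatial embedding vector of the $i$-th sensor node; $-inf$ is minus infinity; $threshold$ is the cosine similarity threshold of the spatial mask matrix; positions of the cosine similarity matrix whose value is greater than or equal to the threshold are assigned a mask value of 0, and positions whose value is smaller than the threshold are assigned minus infinity.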

Further, in the step 4, the constructed traffic flow prediction model includes a plurality of time trend layers, a space-time position encoder and a spatial mask attention layer; each time trend layer comprises two one-dimensional temporal convolution layers and a residual connection; a plurality of time trend layers are stacked, the time trend layers at the bottom extracting local trends and the time trend layers at the top summarizing global trends; the traffic flow prediction model is trained by using the historical traffic flow data and the future traffic flow label data obtained through processing.

Further, in the step 4, the working process of the traffic flow prediction model is as follows:

Step 4.1, the historical traffic flow data $X$ is input into the first time trend layer; features of the historical traffic flow data are extracted through two parallel one-dimensional convolution layers, the information flow is controlled through a gating unit, and finally a residual connection is used to avoid vanishing gradients; the whole process is described by the following formula:

$X_i = \tanh(W_1 * X_{i-1} + b_1) \odot \sigma(W_2 * X_{i-1} + b_2) + X_{i-1}$ (8);

Wherein $X_i$ represents the time trend information output by the $i$-th time trend layer, with $X_0 = X$; $*$ denotes one-dimensional convolution; $\odot$ represents the Hadamard product; $\tanh(\cdot)$ represents the Tanh activation function; $\sigma(\cdot)$ represents the Sigmoid activation function; $W_1$ and $W_2$ are the weights of the two temporal convolution layers respectively; $b_1$ and $b_2$ are the biases of the two temporal convolution layers respectively;

Step 4.2, fusing time trend information of different scales by using skip connections to obtain the final time trend information, wherein the whole process is described by the following formula:

(9)；

Wherein $L$ represents the number of time trend layers used, and $X_{trend}$ represents the final time trend feature;

Step 4.3, stacking a plurality of time trend layers to obtain time trend information of a plurality of scales;

Step 4.4, the space-time position encoder concatenates the time position coding matrix, the spatial position coding matrix and the final time trend information $X_{trend}$, performing space-time position coding on the time trend information to obtain the space-time position coded time trend information; the space-time position coding process is described by the following formula:

(10)；

Wherein the concatenation operation splices $X_{trend}$, the time position coding matrix and the spatial position coding matrix;

Step 4.5, the space-time position coded time trend information is input into the spatial mask attention layer for fusion to obtain the fused space-time feature $X_{st}$; the fusion process is expressed as:

(11)；

Wherein $SoftMax(\cdot)$ is the softmax function; $W_q$, $W_k$ and $W_v$ are the parameter matrices of the query, key and value respectively, and $d_k$ is the feature dimension of $W_q$;

Step 4.6, $X_{st}$ undergoes feature dimension transformation through a multi-layer perceptron to obtain the final prediction result, namely the traffic flow data $Y$ at future moments; the calculation formula is as follows:

$Y = MLP(X_{st})$ (12);

Wherein $MLP(\cdot)$ is a multi-layer perceptron.

The beneficial technical effects brought by the invention are as follows.

The invention takes the periodic variation of traffic flow into account and provides a space-time position encoder, which uses the position of the current time point within one period (week, month or year) and the sensor index to perform space-time position coding on the variation trend of historical traffic flow data, thereby effectively modeling the periodic pattern of traffic flow and improving the prediction accuracy of traffic flow;

The invention addresses the problem that the spatial correlation among sensor nodes is difficult to predefine and proposes a spatial mask attention layer, which creates a spatial mask matrix from the learned spatial position coding matrix and models the dynamic spatial correlation among related sensor nodes, breaking through the limitation of a predefined adjacency matrix and improving the prediction accuracy of traffic flow;

The traffic flow prediction method based on the space-time embedded attention network uses space-time position coding and spatial mask attention to model the periodicity and dynamic spatial correlation of traffic flow, and solves the problems that traditional statistical models and existing deep learning prediction methods find it difficult to model periodicity and to predefine a spatial adjacency matrix.

Drawings

FIG. 1 is a flow chart of a traffic flow prediction method based on a spatio-temporal embedded attention network of the present invention.

FIG. 2 is an overall framework diagram of a traffic flow prediction model based on a spatio-temporal embedded attention network of the present invention.

Detailed Description

The invention is described in further detail below with reference to the attached drawings and detailed description:

The invention takes traffic flow data as its research object and improved prediction accuracy as its core goal, and solves two key technical problems: modeling the periodic variation pattern of traffic flow, and modeling the dynamic spatial correlation among sensor nodes. By solving these two key technical problems, accurate prediction of road traffic flow can be realized.

Because traffic flow sequences originate from human activity, they are clearly periodic. To address the difficulty of modeling the periodic pattern of traffic flow, the invention first constructs a time trend layer to extract the variation trend of the traffic flow sequence, and stacks a plurality of time trend layers to extract time trends at multiple scales. For example, the bottom layers focus on the trend within a local window, while the top layers attend to information across the whole input and are therefore more global. The invention uses time information to perform time position coding on the extracted multi-scale time trends, and mines periodic variation patterns from historical traffic flow data through learnable embedded representations.

There is spatial correlation between the traffic flow sequences of different sensor nodes, but this spatial correlation changes dynamically. For example, the correlation between different sensor nodes differs between the peak hours of going to work and the peak hours of leaving work. To address the problem that the spatial structure and the dynamic spatial correlation among sensor nodes are difficult to predefine, a spatial mask matrix is constructed by using learnable spatial position codes, and the correlation among the sensor nodes is discovered from historical traffic flow. Finally, a spatial mask attention layer simulates the dynamic flow of spatial information among the sensor nodes to realize dynamic spatial correlation modeling.

The invention combines embedded representation, temporal convolution and attention mechanisms to make traffic flow predictions.

The invention extracts the variation trend of traffic flow over a past time period at multiple scales, uses temporal and spatial information to perform space-time position coding on the multi-scale time trends, learns the periodic traffic flow pattern of each sensor node, constructs a spatial mask matrix from the spatial position codes, and combines it with a multi-head attention mechanism to mine the dynamic spatial correlation among related sensor nodes.

As shown in fig. 1, the invention provides a traffic flow prediction method based on a space-time embedded attention network, which can improve the prediction precision of traffic flow prediction, and mainly comprises the following steps:

Step 1, acquiring an existing traffic flow data set from a public website, and sampling the existing traffic flow data set by a sliding window to obtain historical traffic flow data, time information, space information and future traffic flow label data for training.

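The historical traffic flow data is regarded as a space-time sequence $X \in \mathbb{R}^{N \times T_h}$:

$X = \begin{bmatrix} x_{1,1} & \cdots & x_{1,T_h} \\ \vdots & \ddots & \vdots \\ x_{N,1} & \cdots & x_{N,T_h} \end{bmatrix}$

Wherein $N$ represents the number of sensor nodes; $T_h$ represents the total number of time points, corresponding to the input length of the historical traffic flow; $x_{1,1}$ represents the traffic flow of the historical data of the first sensor node at the first time point; $x_{1,T_h}$ represents the traffic flow of the historical data of the first sensor node at the $T_h$-th time point; $x_{N,1}$ represents the traffic flow of the historical data of the $N$-th sensor node at the first time point; $x_{N,T_h}$ represents the traffic flow of the historical data of the $N$-th sensor node at the $T_h$-th time point.

Defining time information as $X_t \in (1, T)$, where $X_t$ represents the time index of the space-time sequence; the time information $X_t$ is composed of the day of the week $X_{DiW}$, the time point $X_{TiD}$, the holiday flag $X_{ifH}$ and the number of time slices per day $T_{times}$; wherein $T = 8 \times T_{times}$, and $T$ represents the total number of time points within the set period. If the current date is a holiday, the last block of time slice numbers is used; otherwise, the time slice numbers of the corresponding day of the week are used. The specific formula of the time information is as follows:

$X_t = X_{TiD} + (X_{DiW}(1 - X_{ifH}) + 7 \times X_{ifH}) \times T_{times}$ (1);

Defining spatial information as $X_s \in (1, N)$, where $X_s$ represents the index of the sensor node in space.

Using $X$, $X_t$ and $X_s$, a mapping function $f(\cdot)$ is learned to obtain the traffic flow data at future moments $Y \in \mathbb{R}^{N \times T_p}$, wherein $T_p$ represents the output length of the predicted traffic flow:

$Y = f(X, X_t, X_s)$ (2);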
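As a minimal sketch of step 1 and formula (1), the following Python code shows one way the sliding-window sampling and the time index could be computed. The function names `time_index` and `sliding_windows`, the 0-based numbering of days and time slices, and the example value of 288 time slices per day are assumptions made for illustration; the invention does not specify them.

```python
def time_index(x_tid, x_diw, x_ifh, t_times):
    """Formula (1): position of a time point within the set period.

    x_tid   -- time slice number within the day (assumed 0-based)
    x_diw   -- day of the week (assumed 0..6)
    x_ifh   -- 1 if the date is a holiday, otherwise 0
    t_times -- number of time slices per day
    """
    # A holiday selects the eighth (last) block of slices; otherwise the
    # block of the corresponding day of the week is selected.
    return x_tid + (x_diw * (1 - x_ifh) + 7 * x_ifh) * t_times


def sliding_windows(series, t_h, t_p):
    """Split one node's series into (history, label) pairs.

    Each pair holds t_h historical points followed by t_p future points.
    """
    samples = []
    for start in range(len(series) - t_h - t_p + 1):
        history = series[start:start + t_h]
        label = series[start + t_h:start + t_h + t_p]
        samples.append((history, label))
    return samples


# Hypothetical usage: 288 slices per day, slice 5 on day 2 of the week.
ordinary_day = time_index(5, 2, 0, 288)   # 5 + 2 * 288
holiday = time_index(5, 2, 1, 288)        # 5 + 7 * 288
windows = sliding_windows(list(range(10)), t_h=4, t_p=2)
```

With $T = 8 \times T_{times}$, the eighth block of indices is reserved for holidays, so a holiday never shares an index with an ordinary day of the week.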

Step 2, constructing a time position coding matrix and a spatial position coding matrix according to the set period length and the sensor nodes, respectively. The specific process is as follows:

Step 2.1, constructing a time embedding matrix $E_t \in \mathbb{R}^{T \times F}$ in time, wherein $F$ represents the dimension of the embedding vector. Using the time information $X_t$, the corresponding time position vector is taken out of the time embedding matrix to perform time position coding on the trend information extracted from the sequence, thereby obtaining the time position coding matrix; the time position coding process can be described by the following formulas:

(3)；

(4)；

Wherein each row of the time embedding matrix $E_t$ is the time embedding vector of one time point in the set period;

In the time position coding process, the holiday information used comes from the statutory holiday information published by the state, which avoids information leakage.

Step 2.2, constructing a space embedding matrix $E_s \in \mathbb{R}^{N \times F}$ in space; using the spatial information $X_s$, the corresponding spatial position vector is taken out of the space embedding matrix to perform spatial position coding on the trend information extracted from the sequence, thereby obtaining the spatial position coding matrix; the spatial position coding process can be described by the following formulas:

(5)；

(6)；

Wherein each row of the space embedding matrix $E_s$ is the spatial embedding vector of one sensor node;

Compared with searching for the topology between nodes by various methods, distinguishing nodes with learnable embedding vectors allows the correlation between nodes to be mined from the data itself. Unlike approaches based on the Chebyshev matrix or Node2Vec, the spatial correlation between nodes is mined from the data itself rather than relying on a predefined adjacency matrix.
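The embedding lookup of step 2 can be sketched as follows. This is an illustrative sketch only: the helper names `make_embedding` and `lookup`, the random initialization, and the small sizes are assumptions. In the method the embedding matrices are learnable parameters updated during training, whereas here they are fixed random values.

```python
import random


def make_embedding(rows, dim, seed=0):
    """Create a rows x dim embedding matrix with small random values."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(rows)]


def lookup(embedding, indices):
    """Take the embedding vector (row) for every index."""
    return [embedding[i] for i in indices]


# Hypothetical sizes: 4 time slices per day, embedding dimension 3, 5 sensors.
T_TIMES, F, N = 4, 3, 5
E_t = make_embedding(8 * T_TIMES, F, seed=1)   # T = 8 * T_times rows
E_s = make_embedding(N, F, seed=2)             # one row per sensor node

time_code = lookup(E_t, [0, 1, 2])             # codes for three time indices
space_code = lookup(E_s, list(range(N)))       # codes for all sensor nodes
```

Because every time index within the period and every sensor owns its own row, two samples that fall on the same position in the period, or on the same sensor, share the same vector.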

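Step 3, calculating the cosine similarity among the sensor nodes by using the spatial position coding matrix to obtain a spatial mask matrix.

The invention constructs a mask matrix by using the cosine similarity among the spatial embedding vectors, so that information flows only among related nodes. The construction process of the spatial mask matrix $S_{mask}$ can be expressed as:

$S_{cos}(i,j) = \dfrac{E_s^{(i)} \cdot E_s^{(j)}}{\lVert E_s^{(i)} \rVert \, \lVert E_s^{(j)} \rVert}$ (7);

$S_{mask}(i,j) = \begin{cases} 0, & S_{cos}(i,j) \ge threshold \\ -inf, & S_{cos}(i,j) < threshold \end{cases}$ (8);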
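The mask construction of step 3 can be sketched in a few lines of Python. The function names and the example vectors are hypothetical, and the sketch assumes that no embedding vector is all zeros, since the cosine similarity would otherwise divide by zero.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def spatial_mask(embeddings, threshold):
    """Return 0 where similarity >= threshold and -inf elsewhere."""
    n = len(embeddings)
    return [
        [
            0.0
            if cosine_similarity(embeddings[i], embeddings[j]) >= threshold
            else -math.inf
            for j in range(n)
        ]
        for i in range(n)
    ]


# Hypothetical embeddings: nodes 0 and 1 point the same way, node 2 does not.
example = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0]]
mask = spatial_mask(example, threshold=0.5)
```

Adding this mask to attention scores before the softmax gives masked node pairs a weight of exactly zero, while each node always keeps itself because its self-similarity is 1.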

Step 4, constructing a traffic flow prediction model based on the space-time embedded attention network by using the position coding matrices of time and space and the spatial mask matrix, and training the traffic flow prediction model.

The constructed traffic flow prediction model comprises a plurality of time trend layers, a space-time position encoder and a spatial mask attention layer, and is trained by using the historical traffic flow data and the future traffic flow label data obtained through processing. Each time trend layer comprises two one-dimensional temporal convolution layers and a residual connection; a plurality of time trend layers are stacked, the bottom layers extracting local trends and the top layers summarizing global trends. In practice, people infer future trends from how traffic flow changed over the preceding period; therefore, the invention uses one-dimensional convolutions with larger kernels to extract the trend of traffic flow. The space-time position encoder concatenates the temporal and spatial position information with the extracted time trend information, performing space-time position coding on the time trend information. Because traffic sequences are spatially dynamic, that is, the relationships between nodes change over time, a spatial mask attention layer is designed to model the time-varying spatial correlation between trend features.

The working process of the traffic flow prediction model is as follows:

Step 4.1, the historical traffic flow data $X$ is input into the first time trend layer; the historical traffic flow data first passes through two parallel one-dimensional convolution layers to extract features, then passes through a gating unit that controls the information flow, and finally a residual connection is used to avoid vanishing gradients; the whole process can be described by the following formula:

$X_i = \tanh(W_1 * X_{i-1} + b_1) \odot \sigma(W_2 * X_{i-1} + b_2) + X_{i-1}$ (8);
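The time trend layer described in step 4.1 (two parallel one-dimensional convolutions, a tanh branch gated by a sigmoid branch, and a residual connection) can be sketched as below. The sketch assumes a single feature channel, zero padding that keeps the sequence length unchanged so that the residual can be added, and hypothetical function names; the kernel sizes and channel counts of the actual model are not given in the text.

```python
import math


def conv1d_same(x, kernel, bias):
    """1-D convolution with zero padding so the output keeps len(x)."""
    k = len(kernel)
    pad = k // 2
    padded = [0.0] * pad + list(x) + [0.0] * (k - 1 - pad)
    return [
        sum(kernel[j] * padded[t + j] for j in range(k)) + bias
        for t in range(len(x))
    ]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def time_trend_layer(x, w1, b1, w2, b2):
    """Gated temporal convolution followed by a residual connection."""
    filt = [math.tanh(v) for v in conv1d_same(x, w1, b1)]   # feature branch
    gate = [sigmoid(v) for v in conv1d_same(x, w2, b2)]     # gating branch
    # Hadamard product of the two branches, plus the layer input (residual).
    return [f * g + r for f, g, r in zip(filt, gate, x)]


# Hypothetical input and kernels of size 3.
out = time_trend_layer([1.0, 2.0, 3.0], [0.0, 1.0, 0.0], 0.0, [0.0, 0.0, 0.0], 0.0)
```

The sigmoid branch produces values between 0 and 1 that scale how much of the tanh branch passes through, and the residual term lets the input bypass the convolutions entirely.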

Step 4.2, fusing time trend information of different scales by using skip connections to obtain the final time trend information; the whole process can be described by the following formula:

(9)；

Wherein $L$ represents the number of time trend layers used, and $X_{trend}$ represents the final time trend feature.

Step 4.3, stacking a plurality of time trend layers to obtain time trend information of a plurality of scales.
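Steps 4.2 and 4.3 can be illustrated with the following sketch, which stacks several layers and fuses their outputs through skip connections. The text does not state how the outputs of the layers are combined; element-wise summation is assumed here, as is common for skip connections, and the function name `stack_and_fuse` is hypothetical.

```python
def stack_and_fuse(x, layers):
    """Apply layers in sequence and sum the output of every layer.

    layers -- callables mapping a sequence to a sequence of equal length.
    The summation is an assumed fusion rule, not one stated by the source.
    """
    fused = [0.0] * len(x)
    for layer in layers:
        x = layer(x)                                  # output of this layer
        fused = [f + v for f, v in zip(fused, x)]     # skip connection
    return fused


def double(seq):
    """Toy stand-in for a time trend layer."""
    return [2.0 * v for v in seq]


trend = stack_and_fuse([1.0, 2.0], [double, double])
```

Lower layers contribute trends seen through a small receptive field and higher layers contribute trends seen through a larger one, so the fused result mixes several scales.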

Step 4.4, the space-time position encoder concatenates the time position coding matrix, the spatial position coding matrix and the final time trend information $X_{trend}$, performing space-time position coding on the time trend information to obtain the space-time position coded time trend information; the space-time position coding process can be described by the following formula:

(10)；

Wherein the concatenation operation splices $X_{trend}$, the time position coding matrix and the spatial position coding matrix;
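The concatenation of step 4.4 can be sketched as below. The shapes are assumptions made for illustration: one trend feature vector per sensor node, one time code shared by all nodes of a sample, and one spatial code per node. The function name `st_position_encode` is hypothetical.

```python
def st_position_encode(trend, time_code, space_codes):
    """Concatenate each node's trend features with time and space codes.

    trend       -- N rows of trend features, one row per sensor node
    time_code   -- one time embedding vector shared by all nodes (assumed)
    space_codes -- N spatial embedding vectors, one per sensor node
    """
    return [
        list(row) + list(time_code) + list(code)
        for row, code in zip(trend, space_codes)
    ]


encoded = st_position_encode(
    trend=[[1.0, 2.0], [3.0, 4.0]],
    time_code=[0.5],
    space_codes=[[9.0], [8.0]],
)
```

After concatenation every node carries its trend together with where it sits in the period and which sensor it is, which is the information the attention layer then fuses.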

Step 4.5, the space-time position coded time trend information is input into the spatial mask attention layer for fusion, the spatial mask matrix being used to control the calculation of the correlation among nodes during fusion, to obtain the fused space-time feature $X_{st}$; the fusion process can be expressed as:

(11)；

Wherein $SoftMax(\cdot)$ is the softmax function; $W_q$, $W_k$ and $W_v$ are the parameter matrices of the query, key and value respectively, and $d_k$ is the feature dimension of $W_q$.
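A masked attention layer of the kind described in step 4.5 can be sketched as follows. The sketch assumes single-head scaled dot-product attention with the spatial mask added to the scores before the softmax, and takes $d_k$ to be the output dimension of the query projection; these details, and all function names, are assumptions rather than the exact formulation of the invention. It also assumes every row of the mask keeps at least one 0, which holds when each node is similar to itself.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax; exp(-inf) evaluates to 0."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def spatial_mask_attention(x, w_q, w_k, w_v, mask):
    """Fuse node features; masked node pairs receive zero attention weight."""
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_k = len(w_q[0])
    k_t = [list(col) for col in zip(*k)]              # transpose of the keys
    scores = matmul(q, k_t)
    masked = [
        [s / math.sqrt(d_k) + m for s, m in zip(s_row, m_row)]
        for s_row, m_row in zip(scores, mask)
    ]
    weights = [softmax(row) for row in masked]
    return matmul(weights, v)


# Hypothetical example: two nodes that are masked from each other.
identity = [[1.0, 0.0], [0.0, 1.0]]
isolated = [[0.0, -math.inf], [-math.inf, 0.0]]
fused = spatial_mask_attention(identity, identity, identity, identity, isolated)
```

Because a masked pair receives a weight of exactly zero, each node in this example only attends to itself and its features pass through unchanged.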

Step 4.6, $X_{st}$ undergoes feature dimension transformation through a multi-layer perceptron to obtain the final prediction result, namely the traffic flow data $Y$ at future moments; the calculation formula is as follows:

$Y = MLP(X_{st})$ (12);

Wherein $MLP(\cdot)$ is a multi-layer perceptron.

According to the invention, a time trend layer, a space-time position encoder and a space mask attention layer are used for constructing a space-time embedded attention network model for traffic flow prediction, and the space-time position encoding is carried out on historical traffic flow data by extracting the change trend of the historical traffic flow data, so that the problem that the traffic flow data is difficult to carry out periodic modeling and dynamic space correlation modeling is solved, the limitation of a predefined space adjacency matrix is broken through, and the accuracy of traffic flow prediction is improved.

In order to demonstrate the feasibility and superiority of the present invention, the following comparative experiments were performed.

The invention is compared experimentally on three traffic flow datasets, Los Angeles Bay 04, 07 and 08, using three evaluation indices: mean absolute error (MAE), root mean square error (RMSE) and mean absolute percentage error (MAPE). The space-time embedded attention network (STEAN) model is compared with the diffusion convolutional recurrent neural network (DCRNN), Graph WaveNet, the graph multi-attention network (GMAN), the attention-based spatial-temporal graph convolutional network (ASTGCN), the spatial-temporal identity network (STID) and the propagation delay-aware dynamic long-range Transformer (PDFormer); the comparison results are shown in Tables 1-3.

From the data in Tables 1-3, it can be seen that the STEAN model provided by the invention outperforms the other six prediction models on the MAE, RMSE and MAPE indices, obtaining the smallest error and the best prediction effect at the 15-minute, 30-minute and 60-minute horizons and on average. Therefore, the space-time embedded attention network model can be used as an effective traffic flow prediction model, and the invention provides technical support for the prediction and analysis of traffic flow.
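For reference, the three evaluation indices can be computed as in the following sketch. The function names are hypothetical, and skipping time points whose true value is zero when computing MAPE is an assumption made here to avoid division by zero; the text does not state how such points were handled in the experiments.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mape(y_true, y_pred):
    """Mean absolute percentage error in percent, ignoring zero true values."""
    pairs = [(t, p) for t, p in zip(y_true, y_pred) if t != 0]
    return 100.0 * sum(abs(t - p) / abs(t) for t, p in pairs) / len(pairs)


# Hypothetical values, not results from the experiments.
truth, forecast = [100.0, 200.0], [110.0, 180.0]
scores = (mae(truth, forecast), rmse(truth, forecast), mape(truth, forecast))
```

MAE and RMSE are expressed in the unit of the traffic flow itself, with RMSE penalizing large errors more heavily, while MAPE is relative to the true value.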

Table 1. Results of comparative experiments of the invention and the other six models on the Los Angeles Bay 04 traffic flow dataset.

Table 2. Results of comparative experiments of the invention and the other six models on the Los Angeles Bay 07 traffic flow dataset.

Table 3. Results of comparative experiments of the invention and the other six models on the Los Angeles Bay 08 traffic flow dataset.

It should be understood that the above description is not intended to limit the invention, and that the invention is not limited to the examples disclosed above; modifications, adaptations, additions and alternatives made by those skilled in the art within the spirit and scope of the invention also fall within the scope of protection of the invention.

Claims

1. A traffic flow prediction method based on a space-time embedded attention network is characterized by comprising the following steps:

step 5, collecting traffic flow data of the previous time period, inputting a traffic flow prediction model after training, and predicting the traffic flow data of the next time period;

In the step 1, the historical traffic flow data is regarded as a space-time sequence $X \in \mathbb{R}^{N \times T_h}$, wherein $N$ represents the number of sensor nodes; $T_h$ represents the total number of time points of the historical data, corresponding to the input length of the historical traffic flow; $x_{1,1}$ represents the traffic flow of the historical data of the first sensor node at the first time point; $x_{1,T_h}$ represents the traffic flow of the historical data of the first sensor node at the $T_h$-th time point; $x_{N,1}$ represents the traffic flow of the historical data of the $N$-th sensor node at the first time point; $x_{N,T_h}$ represents the traffic flow of the historical data of the $N$-th sensor node at the $T_h$-th time point;

Defining time information as $X_t \in (1, T)$, which consists of the day of the week $X_{DiW}$, the time point $X_{TiD}$, the holiday flag $X_{ifH}$ and the number of time slices per day $T_{times}$; wherein $T = 8 \times T_{times}$, and $T$ represents the total number of time points within the set period; the specific formula of the time information is as follows:

$X_t = X_{TiD} + (X_{DiW}(1 - X_{ifH}) + 7 \times X_{ifH}) \times T_{times}$ (1);

Defining the spatial information as $X_s \in (1, N)$;

Defining the traffic flow prediction problem as a space-time sequence prediction problem, and learning a mapping function $f(\cdot)$ by using the historical traffic flow data $X$, the time information $X_t$ and the spatial information $X_s$ to obtain the traffic flow data at future moments $Y \in \mathbb{R}^{N \times T_p}$, wherein $T_p$ represents the output length of the predicted traffic flow; the specific formula is as follows:

$Y = f(X, X_t, X_s)$ (2);

the specific process of the step 2 is as follows:

Step 2.1, constructing a time embedding matrix $E_t \in \mathbb{R}^{T \times F}$ in time, wherein $F$ represents the dimension of the embedding vector; using the time information $X_t$ to take the corresponding time position vector out of the time embedding matrix and perform time position coding on the trend information extracted from the sequence, thereby obtaining the time position coding matrix; the time position coding process is described by the following formula:

Wherein each row of the time embedding matrix $E_t$ is the time embedding vector of one time point in the set period;

Step 2.2, constructing a space embedding matrix $E_s \in \mathbb{R}^{N \times F}$ in space, and using the spatial information $X_s$ to take the corresponding spatial position vector out of the space embedding matrix and perform spatial position coding on the trend information extracted from the sequence, thereby obtaining the spatial position coding matrix; the spatial position coding process is described by the following formula:

Wherein each row of the space embedding matrix $E_s$ is the spatial embedding vector of one sensor node;

In the step 3, the construction process of the spatial mask matrix $S_{mask}$ is expressed as follows:

$S_{cos}(i,j) = \dfrac{E_s^{(i)} \cdot E_s^{(j)}}{\lVert E_s^{(i)} \rVert \, \lVert E_s^{(j)} \rVert}$ (7);

$S_{mask}(i,j) = \begin{cases} 0, & S_{cos}(i,j) \ge threshold \\ -inf, & S_{cos}(i,j) < threshold \end{cases}$ (8);

Wherein $S_{cos}$ is the cosine similarity matrix between the spatial embedding vectors, with $E_s^{(i)}$ denoting the spatial embedding vector of the $i$-th sensor node; $-inf$ is minus infinity; $threshold$ is the cosine similarity threshold of the spatial mask matrix; positions of the cosine similarity matrix whose value is greater than or equal to the threshold are assigned a mask value of 0, and positions whose value is smaller than the threshold are assigned minus infinity;

In the step 4, the constructed traffic flow prediction model comprises a plurality of time trend layers, a space-time position encoder and a spatial mask attention layer; each time trend layer comprises two one-dimensional temporal convolution layers and a residual connection; a plurality of time trend layers are stacked, the time trend layers at the bottom extracting local trends and the time trend layers at the top summarizing global trends; the traffic flow prediction model is trained by using the historical traffic flow data and the future traffic flow label data obtained by processing;

in the step 4, the working process of the traffic flow prediction model is as follows:

Step 4.1, inputting the historical traffic flow data $X$ into the first time trend layer, extracting features of the historical traffic flow data through two parallel one-dimensional convolution layers, controlling the information flow through a gating unit, and finally using a residual connection to avoid vanishing gradients, wherein the whole process is described by the following formula:

$X_i = \tanh(W_1 * X_{i-1} + b_1) \odot \sigma(W_2 * X_{i-1} + b_2) + X_{i-1}$ (8);

Wherein $X_i$ represents the time trend information output by the $i$-th time trend layer, with $X_0 = X$; $*$ denotes one-dimensional convolution; $\odot$ represents the Hadamard product; $\tanh(\cdot)$ represents the Tanh activation function; $\sigma(\cdot)$ represents the Sigmoid activation function; $W_1$ and $W_2$ are the weights of the two temporal convolution layers respectively; $b_1$ and $b_2$ are the biases of the two temporal convolution layers respectively;

Wherein $L$ represents the number of time trend layers used, and $X_{trend}$ represents the final time trend feature;

Step 4.4, the space-time position encoder concatenates the time position coding matrix, the spatial position coding matrix and the final time trend information $X_{trend}$, performing space-time position coding on the time trend information to obtain the space-time position coded time trend information; the space-time position coding process is described by the following formula:

Wherein the concatenation operation splices $X_{trend}$, the time position coding matrix and the spatial position coding matrix;

Step 4.5, the space-time position coded time trend information is input into the spatial mask attention layer for fusion to obtain the fused space-time feature $X_{st}$, and the fusion process is expressed as follows:

Wherein $SoftMax(\cdot)$ is the softmax function; $W_q$, $W_k$ and $W_v$ are the parameter matrices of the query, key and value respectively, and $d_k$ is the feature dimension of $W_q$;

Step 4.6, carrying out feature dimension transformation on $X_{st}$ through a multi-layer perceptron to obtain the final prediction result, namely the traffic flow data $Y$ at future moments, wherein the calculation formula is as follows:

$Y = MLP(X_{st})$ (12);

Wherein $MLP(\cdot)$ is a multi-layer perceptron.