CN115578851B - MGCN-based traffic prediction method - Google Patents
- Publication number
- CN115578851B (CN202210832846.6A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G08—SIGNALLING
- G08G—TRAFFIC CONTROL SYSTEMS
- G08G1/00—Traffic control systems for road vehicles
- G08G1/01—Detecting movement of traffic to be counted or controlled
- G08G1/0104—Measuring and analyzing of parameters relative to traffic conditions
- G08G1/0125—Traffic data processing
- G08G1/0129—Traffic data processing for creating historical data or processing based on historical data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/049—Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
Abstract
The invention provides a traffic prediction method based on dynamic spatio-temporal multi-graph convolution. Reasonable prediction of future traffic conditions helps avoid traffic congestion and gives travelers time to plan their trips. However, the complexity of the traffic network and nonlinear temporal dependence make traffic prediction very challenging, and existing methods lack the ability to model the dynamic spatio-temporal correlation of traffic data, so their prediction results are unsatisfactory. The invention therefore proposes a dynamic spatio-temporal multi-graph convolutional neural network (MGCN) based on graph convolution and attention mechanisms. The model mainly comprises a spatio-temporal graph convolution module that captures the comprehensive and dynamic spatio-temporal correlation of the traffic network. Two decoders then decode future traffic conditions along the spatial and temporal dimensions respectively, aggregating multidimensional information. Extensive experiments on two real large-scale data sets show the effectiveness and superiority of the designed model, which reaches a good prediction level.
Description
Technical Field
The invention relates to a traffic prediction method with important application prospects in the field of smart city construction.
Background
With the continuous advance of urbanization, traffic problems such as congestion have become important issues that smart city road construction must face, so predicting future traffic conditions is both necessary and meaningful. The purpose of traffic prediction is to use the historical traffic data of each road segment to predict future traffic data. Because traffic flow changes are unstable and the structure of the road network is complex, reliable traffic prediction is a new challenge for intelligent transportation systems. Accurate traffic flow prediction can improve traffic conditions, shorten travel time, and reduce carbon emissions.
Extensive research shows that early conventional methods and their extensions have been successful in traffic prediction but also have limitations. Examples include the autoregressive integrated moving average (ARIMA) model, vector autoregression (VAR), and, among machine learning methods, support vector regression (SVR). These methods all rest on idealized assumptions that often do not hold for complex spatio-temporal traffic data. They also depend on manually designed features, which cannot represent the characteristics of the data accurately. With the rise of deep learning, convolutional neural networks (CNN), graph convolutional networks (GCN), and similar models are used to mine the complex spatial features of traffic data, while nonlinear temporal features are mainly handled with long short-term memory (LSTM) networks, gated recurrent units (GRU), and similar methods. Deep learning has been widely applied in recent traffic flow prediction studies. Unlike conventional models, deep neural network models have large capacity and strong learning ability, and therefore leave considerable room for improvement. Models built on deep learning have markedly improved prediction performance but still have limitations. To address the above challenges, the invention provides MGCN, a deep-learning-based spatio-temporal traffic flow prediction network.
Disclosure of Invention
To overcome the problems of earlier research, a traffic prediction method based on dynamic spatio-temporal multi-graph convolution is provided. The method is a new spatio-temporal fusion approach: in the spatial dimension it considers complex spatial dependence dynamically, and in the temporal dimension it uses a new autocorrelation fusion module to better aggregate historical traffic information. To improve the modeling of complex spatial relationships in the road network structure, the invention combines static graph convolution and dynamic graph convolution to capture the complex local and global spatial dependencies of the traffic network. The overall model adopts a new encoder-decoder architecture that considers not only the association between spatially adjacent sensors but also an adjacency matrix designed in the time dimension. Finally, extensive experiments on two real data sets verify the feasibility and validity of the designed framework.
The invention mainly comprises five parts: (1) determining the input and output of the model; (2) preprocessing the data set; (3) constructing a dynamic spatio-temporal multi-graph convolution network model based on dynamic graph convolution; (4) constructing the overall MGCN model; and (5) verifying the validity of the method.
The five parts are described below:
1. Determining the input and output of the model. A speed data set is used as the input of the method; it comprises speed values, longitude and latitude, start time, end time, and other attributes. Given the historical traffic data of N sensors in the traffic network over T h time steps, the purpose of the model is to learn a function f that predicts the traffic conditions of all sensors over the next T h time steps.
2. Data set preprocessing. Preprocessing mainly comprises normalization and related processes. The speed data are extracted from individual sensors distributed across the city. The resulting data typically contain outliers and some noise; normalization centers the data and thereby indirectly limits the influence of outliers and extreme values.
3. Constructing the dynamic spatio-temporal graph convolution network. To capture dynamic graph information in the spatial and temporal dimensions simultaneously, a GCN-based dynamic spatio-temporal multi-graph convolution network (DSTGCN) is designed. GCN generalizes the traditional convolution operation from structured data to non-Euclidean graph structures and can capture latent graph relations. On this basis, the traditional GCN method is extended to spatio-temporal graphs to capture dynamic graph information. Specifically, DSTGCN aggregates neighbor node information, multi-order neighbor information, and historical information through static graph convolution, dynamic graph convolution, and autocorrelation graph convolution respectively. Aggregating information from these three perspectives allows the features of the graph nodes to be learned effectively.
4. Constructing the dynamic spatio-temporal multi-graph convolutional neural network (MGCN). The three graph convolutions represent spatio-temporal correlation from different angles; to enhance the feature expression capability of the nodes, a gated fusion mechanism fuses them pairwise. To reduce error propagation between different predicted time steps over a long time horizon, the invention then approximates future traffic conditions by establishing a nonlinear relationship between future and past time steps and uses the converted result as the decoder input. The MGCN framework mainly deploys two decoders that analyze the trend of traffic flow along the spatial and temporal dimensions respectively. Decoding in multiple directions helps to understand traffic conditions comprehensively, so that future traffic flow can be predicted effectively.
5. Verifying the validity of the method. Extensive experiments on two real traffic data sets show that the method outperforms the other state-of-the-art comparison methods in both short-horizon and long-horizon prediction.
To achieve the above purposes, the invention adopts the following detailed implementation steps:
Step 1: Define the traffic road network representation. Traffic flow prediction is a typical spatio-temporal prediction problem whose purpose is to predict future traffic conditions from the historical traffic data observed by sensors in a traffic network. The invention defines the traffic road network under study as a weighted undirected graph $G = (V, E, A)$, where $V$ is the vertex set with $|V| = N$, and $N$ is the number of sensors deployed in the road network. $E$ is the set of edges, which represent connectivity between pairs of nodes. $A \in \mathbb{R}^{N \times N}$ is the weighted adjacency matrix, where $A_{ij}$ indicates the proximity of nodes $v_i$ and $v_j$ (road distance between sensors). $X_t \in \mathbb{R}^{N \times D}$ is the graph signal acquired by all sensors on the traffic network at time step $t$, where $D$ is the number of features of each node (traffic flow, traffic speed, etc.).
Step 2: Determine the input and output of the model. Given the historical traffic data $X = (X_{t-T_h+1}, \dots, X_t) \in \mathbb{R}^{T_h \times N \times D}$ of $N$ sensors in the traffic network over $T_h$ time steps, the purpose is to learn a function $f$ that predicts the traffic conditions of all sensors over the next $T_h$ time steps. This can be expressed as follows:

$$[X_{t-T_h+1}, \dots, X_t] \xrightarrow{\;f\;} [\hat{X}_{t+1}, \dots, \hat{X}_{t+T_h}]$$
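As a concrete illustration of this input/output setup, the sketch below slices a sensor time series into pairs of historical windows and future targets. It is a minimal example only; the function name `make_windows`, the array layout (time, sensor, feature), and the toy data are assumptions, not part of the described method.

```python
import numpy as np

def make_windows(series, t_in, t_out):
    """Slice a (T, N, D) series into (input, target) window pairs."""
    total = series.shape[0]
    n_samples = total - t_in - t_out + 1
    if n_samples <= 0:
        raise ValueError("series is too short for the requested window lengths")
    # Each input covers t_in consecutive steps; its target covers the next t_out steps.
    inputs = np.stack([series[i:i + t_in] for i in range(n_samples)])
    targets = np.stack([series[i + t_in:i + t_in + t_out] for i in range(n_samples)])
    return inputs, targets

# Hypothetical toy data: 20 time steps, 3 sensors, 1 feature (speed).
series = np.arange(20 * 3, dtype=float).reshape(20, 3, 1)
x, y = make_windows(series, t_in=4, t_out=4)
print(x.shape, y.shape)
```

Each sample pairs four past steps with the four steps that immediately follow, which is the relationship the function f is meant to learn.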
Step 3: Divide the data set in a suitable proportion and preprocess the data. Following the usual partitioning convention, 70% of the data is used for training, 10% for validation, and the remaining 20% for testing, and the whole data set is Z-score normalized.
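The split and normalization of Step 3 might look like the following sketch. Two points are assumptions rather than statements of the text: the split is taken chronologically, and the mean and standard deviation are computed on the training portion only (a common precaution against leaking test information). The function name and the synthetic speeds are hypothetical.

```python
import numpy as np

def split_and_normalize(data, train_frac=0.7, val_frac=0.1):
    """Chronological 70/10/20 split followed by Z-score normalization."""
    n = data.shape[0]
    n_train = int(round(n * train_frac))
    n_val = int(round(n * val_frac))
    train = data[:n_train]
    val = data[n_train:n_train + n_val]
    test = data[n_train + n_val:]
    # Assumption: statistics come from the training split only.
    mean, std = train.mean(), train.std()

    def scale(part):
        return (part - mean) / std

    return scale(train), scale(val), scale(test), (mean, std)

# Hypothetical speed readings: 1000 time steps from 5 sensors.
rng = np.random.default_rng(0)
speeds = rng.normal(60.0, 10.0, size=(1000, 5))
train, val, test, (mean, std) = split_and_normalize(speeds)
print(len(train), len(val), len(test))
```

Predictions made in normalized units can be mapped back to speeds with `pred * std + mean`.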
Step 4: Construct the spatio-temporal embedding module of the dynamic spatio-temporal graph convolution network (DSTGCN). To capture dynamic graph information in the spatial and temporal dimensions simultaneously, the invention designs a dynamic spatio-temporal multi-graph convolution network (DSTGCN) based on graph convolution. Graph convolution generalizes the traditional convolution operation from structured data to non-Euclidean graph structures and can capture latent graph relations. On this basis, the traditional graph convolution method is extended to spatio-temporal graphs to capture dynamic graph information. Specifically, DSTGCN aggregates neighbor node information, multi-order neighbor information, and historical information through static graph convolution, dynamic graph convolution, and autocorrelation graph convolution respectively. Aggregating information from these three perspectives allows the features of the graph nodes to be learned effectively. The three types of convolution are described in more detail below.
Step 4.1: Construct the static graph convolution module. For the traffic network defined by geographic proximity, the similarity of node pairs is measured by a thresholded Gaussian kernel of their distance. This focuses on nodes within a certain distance and ignores most distant nodes, so that valuable information can be extracted effectively. This aggregation relationship is expressed as:

$$H_t^{(l)} = \sigma\left(\hat{A}\, H_t^{(l-1)} W^{(l)} + b^{(l)}\right)$$

where $H_t^{(l)}$ is the output of layer $l$ at time $t$, $H_t^{(l-1)}$ is the output of layer $l-1$ and also the input of layer $l$, and $W^{(l)}$ and $b^{(l)}$ are learnable parameters. $\sigma$ is the ReLU(·) activation function, and $\hat{A}$ is the adjacency matrix with self-loops, which is defined as follows:
In this definition, the distance term is the distance from sensor $v_i$ to sensor $v_j$, $\delta$ is the standard deviation, and $\epsilon$ is the threshold controlling the sparsity of the adjacency matrix $A$, set to 0.1. The redefined adjacency matrix includes $I_N$, the $N$-dimensional identity matrix, to enhance the importance of each node itself, and the degree matrix is computed from the redefined adjacency matrix.
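The static graph convolution of Step 4.1 can be sketched in NumPy as follows. This is a minimal illustration under assumptions the text does not state: the kernel is taken as exp(-d²/δ²), δ is estimated as the standard deviation of all pairwise distances, and the self-loop adjacency is normalized symmetrically as D^{-1/2}(A + I_N)D^{-1/2}. The function names and the toy distance matrix are hypothetical.

```python
import numpy as np

def gaussian_kernel_adjacency(dist, eps=0.1):
    """Thresholded Gaussian kernel: weights below eps are set to zero."""
    delta = dist.std()  # assumed estimate of the standard deviation
    weights = np.exp(-(dist ** 2) / (delta ** 2))
    weights[weights < eps] = 0.0
    np.fill_diagonal(weights, 0.0)  # self-loops are added separately below
    return weights

def normalize_with_self_loops(adj):
    """Assumed form: D^{-1/2} (A + I_N) D^{-1/2}."""
    a_tilde = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_tilde.sum(axis=1))
    return a_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def static_graph_conv(h, a_hat, weight, bias):
    """One layer for a single time step: ReLU(A_hat @ H @ W + b)."""
    return np.maximum(a_hat @ h @ weight + bias, 0.0)

# Hypothetical road distances between three sensors.
dist = np.array([[0.0, 1.0, 5.0],
                 [1.0, 0.0, 2.0],
                 [5.0, 2.0, 0.0]])
adj = gaussian_kernel_adjacency(dist)
a_hat = normalize_with_self_loops(adj)

rng = np.random.default_rng(0)
h_in = rng.normal(size=(3, 2))                 # N=3 nodes, 2 input features
out = static_graph_conv(h_in, a_hat, rng.normal(size=(2, 4)), np.zeros(4))
print(adj.round(3))
```

In this toy network the two sensors that are five units apart fall below the 0.1 threshold and are not connected, which is the sparsifying effect the threshold is meant to have.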
Step 4.2: Construct the dynamic graph convolution module. Static graph convolution is a fixed operation because the predefined adjacency matrix does not change. In a real traffic network, however, node relations differ over time. For example, traffic conditions near a school differ considerably between eight and ten in the morning, and are therefore affected differently by road segments near apartments. Simply applying static graph convolution to a traffic network cannot capture such dynamic changes. The invention therefore provides a dynamic graph convolution method that adaptively adjusts the degree of correlation among nodes and thereby attends to more important node information. Specifically, the pairwise relationship between arbitrary road nodes is modeled as:
Here the input is the output of layer $l-1$ at time $t$, the result is the dynamic adjacency matrix of layer $l$ at time $t$, and each of its entries is the relevance score of nodes $i$ and $j$ at layer $l$ at time $t$. The graph information is then aggregated through the dynamic adjacency matrix:
Here the two graph signals are the output and input of layer $l$ at time $t$, and the remaining terms are learnable parameters. Next, the graph signals over multiple time steps are concatenated:
The result is the graph signal output over $T_h$ time steps.
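One plausible way to realize a dynamic adjacency matrix of this kind is a scaled dot-product score between projected node features, normalized with a softmax. The sketch below uses that form as an assumption; the source does not specify the score function, and the projection matrices `w_q` and `w_k`, the ReLU on the output, and all shapes are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def dynamic_graph_conv(h, w_q, w_k, weight, bias):
    """h: (T, N, C). Returns per-step adjacency (T, N, N) and output (T, N, C_out)."""
    q = h @ w_q                                    # (T, N, d)
    k = h @ w_k                                    # (T, N, d)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])
    adj_t = softmax(scores, axis=-1)               # one adjacency matrix per time step
    out = np.maximum(adj_t @ h @ weight + bias, 0.0)
    return adj_t, out                              # the leading axis stacks the T steps

rng = np.random.default_rng(0)
t_steps, n_nodes, c_in, d_att, c_out = 3, 4, 5, 6, 5
h = rng.normal(size=(t_steps, n_nodes, c_in))
adj_t, out = dynamic_graph_conv(
    h,
    rng.normal(size=(c_in, d_att)),
    rng.normal(size=(c_in, d_att)),
    rng.normal(size=(c_in, c_out)),
    np.zeros(c_out),
)
print(adj_t.shape, out.shape)
```

Because the scores are computed from the features at each time step, the adjacency matrix changes from step to step, unlike the fixed matrix of the static module.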
Step 4.3: Construct the autocorrelation graph convolution module. Both static and dynamic graph convolution are based on the traffic network structure: they aggregate node information by assigning different weights to neighbor nodes, which represents node features effectively. However, the traffic condition at a location is also closely and nonlinearly related to its traffic conditions at previous time steps. To model this property, the invention designs an autocorrelation graph convolution module, which models the nonlinear correlation between different time steps of a given node. Specifically, a time attention matrix is constructed:
Here the input is the traffic representation of node $v_i$ over $T_h$ time steps, and the result is a matrix representing the correlation of node $v_i$ between different time steps at layer $l$. After the time attention matrix is obtained, the state of node $v_i$ over $T_h$ time steps is updated by considering only information from time steps earlier than the target time step:
Here the two graph signals are the output and input of node $v_i$ at layer $l$, and the remaining terms are learnable parameters.
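The idea of letting each node attend only to its own earlier time steps can be sketched with a causally masked attention matrix. The form below is an assumption: it uses scaled dot-product scores, and it lets a step attend to itself as well as to earlier steps so that every softmax row is well defined. All names and shapes are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def autocorrelation_conv(h, w_q, w_k, weight, bias):
    """h: (N, T, C). Each node attends over its own current and earlier time steps."""
    q, k = h @ w_q, h @ w_k
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])   # (N, T, T)
    t_steps = h.shape[1]
    future = np.triu(np.ones((t_steps, t_steps), dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)                 # hide later steps
    attn = softmax(scores, axis=-1)
    out = np.maximum(attn @ h @ weight + bias, 0.0)
    return attn, out

rng = np.random.default_rng(0)
n_nodes, t_steps, c = 4, 5, 3
h = rng.normal(size=(n_nodes, t_steps, c))
attn, out = autocorrelation_conv(
    h,
    rng.normal(size=(c, c)),
    rng.normal(size=(c, c)),
    rng.normal(size=(c, c)),
    np.zeros(c),
)
print(attn[0].round(2))
```

The printed matrix is lower triangular: the first time step can only see itself, while the last one distributes its weight over the whole history.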
Step 5: Construct the overall dynamic spatio-temporal multi-graph convolution (MGCN) model. The three graph convolutions designed above represent spatio-temporal correlation from different angles. To further enhance the feature expression capability of the nodes, the invention fuses them pairwise with a gated fusion mechanism. Then, to reduce error propagation between different predicted time steps over a long time horizon, future traffic conditions are approximated by establishing a nonlinear relationship between future and past time steps, and the converted result is used as the decoder input. The MGCN framework mainly deploys two decoders that analyze the trend of traffic flow along the spatial and temporal dimensions respectively. Decoding in multiple directions helps to understand traffic conditions comprehensively, so that future traffic flow can be predicted effectively.
Step 5.1: Construct the gated fusion module. The invention uses a gated fusion mechanism to fuse the three convolution modules pairwise. Specifically, a gate is defined to control the relative importance of the two inputs:

$$z = \phi(X W_z + H U_z + b_z)$$

$$Y = z \odot X + (1 - z) \odot H$$

where $W_z$, $U_z$, and $b_z$ are learnable parameters, $\phi$ is the sigmoid activation function, $z$ is the gate controlling the importance of the two inputs $X$ and $H$, and $Y$ is the fused feature representation.
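The two gating equations translate directly into code. The sketch below implements them as written; the shapes, the random inputs, and the order in which the three module outputs are fused pairwise are assumptions for illustration only.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fusion(x, h, w_z, u_z, b_z):
    """z = sigmoid(X W_z + H U_z + b_z);  Y = z * X + (1 - z) * H."""
    z = sigmoid(x @ w_z + h @ u_z + b_z)
    return z * x + (1.0 - z) * h

rng = np.random.default_rng(0)
n_nodes, c = 4, 3
static_out = rng.normal(size=(n_nodes, c))     # stand-ins for the three module outputs
dynamic_out = rng.normal(size=(n_nodes, c))
autocorr_out = rng.normal(size=(n_nodes, c))
w_z, u_z, b_z = rng.normal(size=(c, c)), rng.normal(size=(c, c)), np.zeros(c)

# Assumed order: fuse the two spatial views first, then add the temporal view.
spatial = gated_fusion(static_out, dynamic_out, w_z, u_z, b_z)
fused = gated_fusion(spatial, autocorr_out, w_z, u_z, b_z)

# With all-zero parameters the gate is 0.5 everywhere, so the result is a plain average.
zeros = np.zeros((c, c))
average = gated_fusion(static_out, dynamic_out, zeros, zeros, np.zeros(c))
print(fused.shape)
```

Since the gate lies between 0 and 1, every fused value is a convex combination of the two inputs at the same position.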
Step 5.2: Construct the transition attention layer. To reduce error propagation between different predicted time steps over a long time horizon, the design approximates future traffic conditions by establishing a nonlinear relationship between future and past time steps and uses the converted result as the decoder input. The nodes of the traffic network are converted into an embedding matrix by the node2vec algorithm, which retains the structural information of the road network. Past and future times are then one-hot encoded as a time matrix. The spatio-temporal embedding is a spatio-temporal representation of the nodes in the traffic network, and the conversion relationship is expressed as follows:
Here the embedding term is the spatio-temporal representation of node $v_i$ at time step $t_i$, the weight terms are learnable parameters, and the attention score relates future time step $t_i$ to past time step $t_j$. The encoded traffic features are then passed to the decoder by adaptively selecting the relevant features from all $T_h$ historical time steps:
Here the result is the feature representation of node $v_i$ at future time step $t_i$, the input is the feature representation of node $v_i$ at historical time step $t_j$, and the remaining term is a learnable parameter.
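A conversion of this kind can be sketched as attention in which future spatio-temporal embeddings act as queries and historical ones as keys. Everything specific below is an assumption: the node embedding is random and merely stands in for a node2vec result, the time and node embeddings are combined by addition, 288 five-minute slots per day are assumed, and the score is a scaled dot product.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def spatio_temporal_embedding(node_emb, time_index, n_slots, w_time):
    """node_emb: (N, d); time_index: (T,) slot ids. Returns (T, N, d)."""
    one_hot = np.eye(n_slots)[time_index]          # (T, n_slots)
    time_emb = one_hot @ w_time                    # (T, d)
    return time_emb[:, None, :] + node_emb[None, :, :]

def transform_attention(h_hist, ste_hist, ste_fut, w_q, w_k, w_v):
    """Map encoded history (T_h, N, d) to future-step features (T_f, N, d)."""
    q = (ste_fut @ w_q).transpose(1, 0, 2)         # (N, T_f, d)
    k = (ste_hist @ w_k).transpose(1, 0, 2)        # (N, T_h, d)
    v = (h_hist @ w_v).transpose(1, 0, 2)          # (N, T_h, d)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])
    attn = softmax(scores, axis=-1)                # future step -> past steps
    return attn, (attn @ v).transpose(1, 0, 2)

rng = np.random.default_rng(0)
n_nodes, d, n_slots = 4, 8, 288
node_emb = rng.normal(size=(n_nodes, d))           # placeholder for node2vec output
w_time = rng.normal(size=(n_slots, d))
ste_hist = spatio_temporal_embedding(node_emb, np.array([10, 11, 12]), n_slots, w_time)
ste_fut = spatio_temporal_embedding(node_emb, np.array([13, 14, 15]), n_slots, w_time)
h_hist = rng.normal(size=(3, n_nodes, d))
w_q, w_k, w_v = (rng.normal(size=(d, d)) for _ in range(3))
attn, h_fut = transform_attention(h_hist, ste_hist, ste_fut, w_q, w_k, w_v)
print(attn.shape, h_fut.shape)
```

Each future step receives its own weighted mixture of all historical steps, so later predictions do not have to be built from earlier predictions.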
Step 6: Establish the spatio-temporal attention mechanism. The model deploys two decoders that analyze the trend of traffic flow along the spatial and temporal dimensions respectively. Decoding in multiple directions helps to understand traffic conditions comprehensively, so that future traffic flow can be predicted effectively. Specifically, the invention captures these characteristics through spatial attention and temporal attention.
Step 6.1: Construct the spatial attention module. In the spatial dimension, traffic conditions at different locations affect each other and change dynamically over time. To model these characteristics, the invention designs a spatial attention mechanism that adaptively captures the dynamic correlation among nodes. The mechanism is expressed as:
Here the attention score relates nodes $v_i$ and $v_j$ at time step $t_i$, the input is the traffic representation of node $v_i$ at time step $t_i$, and the remaining terms are learnable parameters. By dynamically assigning different weights to different nodes at different time steps, new node state representations are learned:
Here the result is the feature representation of node $v_i$ at time step $t_i$.
Step 6.2: Construct the temporal attention module. In the time dimension, different time steps are correlated, and the correlation varies nonlinearly over time. To model these characteristics, the invention designs a temporal attention mechanism that adaptively models the correlation between time steps. The mechanism is expressed as:
Here the attention score relates time steps $t_i$ and $t_j$ of node $v_i$, the input is the traffic representation of node $v_i$ at time step $t_i$, and the remaining terms are learnable parameters. After the attention score is obtained, the state of node $v_i$ at time step $t_i$ is updated as:
Here the result is the feature representation of node $v_i$ at time step $t_i$.
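The two decoder attentions of Steps 6.1 and 6.2 differ mainly in the axis along which attention is computed. The sketch below makes that explicit with one helper applied along the node axis or the time axis. The scaled dot-product form, the shared projection matrices, and the function name `attention_along` are assumptions; the text does not state how the two decoder outputs are combined, so the sketch leaves them separate.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_along(h, w_q, w_k, w_v, axis):
    """h: (T, N, C). axis='space' mixes nodes per time step; axis='time' mixes steps per node."""
    x = h if axis == "space" else h.transpose(1, 0, 2)
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    attn = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1]), axis=-1)
    out = attn @ v
    return out if axis == "space" else out.transpose(1, 0, 2)

rng = np.random.default_rng(0)
t_steps, n_nodes, c = 3, 4, 6
h = rng.normal(size=(t_steps, n_nodes, c))
w_q, w_k, w_v = (rng.normal(size=(c, c)) for _ in range(3))

spatial_out = attention_along(h, w_q, w_k, w_v, axis="space")
temporal_out = attention_along(h, w_q, w_k, w_v, axis="time")

# Changing another time step leaves the spatial result for step 0 untouched,
# and changing another node leaves the temporal result for node 0 untouched.
h_other_step = h.copy()
h_other_step[1] += 1.0
h_other_node = h.copy()
h_other_node[:, 1] += 1.0
spatial_check = attention_along(h_other_step, w_q, w_k, w_v, axis="space")
temporal_check = attention_along(h_other_node, w_q, w_k, w_v, axis="time")
print(spatial_out.shape, temporal_out.shape)
```

The two checks at the end show the separation of concerns: spatial attention only exchanges information between nodes within a time step, and temporal attention only between time steps of the same node.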
Drawings
FIG. 1 is a diagram of the overall framework of the MGCN model of the invention.
FIG. 2 is a graph of the ablation experiment results of the model under the MAE evaluation index.
FIG. 3 is a graph of the ablation experiment results of the model under the RMSE evaluation index.
FIG. 4 is a graph of the ablation experiment results of the model under the MAPE evaluation index.
FIG. 5 is a visual analysis chart of the model at a prediction time step of 15 min.
FIG. 6 is a visual analysis chart of the model at a prediction time step of 30 min.
FIG. 7 is a visual analysis chart of the model at a prediction time step of 45 min.
FIG. 8 is a visual analysis chart of the model at a prediction time step of 1 hour.
Detailed Description
The invention is further described below with reference to the drawings and examples.
The method acquires a large amount of traffic data from sensors in the urban road network and cleans the data, obtaining attributes such as speed value, longitude and latitude, start time, and predicted end time. Being based on a dynamic spatio-temporal multi-graph convolutional neural network, the method is applicable to a wide range of time series prediction fields and handles complex time series data effectively. FIG. 1 shows the overall framework of the model, which is based on an encoder-decoder architecture. The spatio-temporal graph convolution module simultaneously captures traffic network structure information, dynamic neighbor node information, and traffic variation information in the time dimension, and fuses them effectively to represent the comprehensive and dynamic spatio-temporal correlation. The historical time series is then converted into a future time series representation, and two decoders decode future traffic conditions along the temporal and spatial dimensions respectively, aggregating multidimensional information. Extensive experiments on two real large-scale data sets show the effectiveness and superiority of the model, which outperforms the other baseline methods. The specific implementation is as follows:
Step 1: Define the traffic road network representation. Traffic flow prediction is a typical spatio-temporal prediction problem whose purpose is to predict future traffic conditions from the historical traffic data observed by sensors in a traffic network. The invention defines the traffic road network under study as a weighted undirected graph $G = (V, E, A)$, where $V$ is the vertex set with $|V| = N$, and $N$ is the number of sensors deployed in the road network. $E$ is the set of edges, which represent connectivity between pairs of nodes. $A \in \mathbb{R}^{N \times N}$ is the weighted adjacency matrix, where $A_{ij}$ indicates the proximity of nodes $v_i$ and $v_j$ (road distance between sensors). $X_t \in \mathbb{R}^{N \times D}$ is the graph signal collected by all sensors on the traffic network, where $D$ is the number of features of each node.
Step 2: Determine the input and output of the model. Given the historical traffic data $X = (X_{t-T_h+1}, \dots, X_t) \in \mathbb{R}^{T_h \times N \times D}$ of $N$ sensors in the traffic network over $T_h$ time steps, the purpose is to learn a function $f$ that predicts the traffic conditions of all sensors over the next $T_h$ time steps. This can be expressed as follows:

$$[X_{t-T_h+1}, \dots, X_t] \xrightarrow{\;f\;} [\hat{X}_{t+1}, \dots, \hat{X}_{t+T_h}]$$

Step 3: Divide the data set in a suitable proportion and preprocess the data. Following the usual partitioning convention, 70% of the data is used for training, 10% for validation, and the remaining 20% for testing, and the whole data set is Z-score normalized.
Step 4: Construct the spatio-temporal embedding module of the dynamic spatio-temporal graph convolution network (DSTGCN). To capture dynamic graph information in the spatial and temporal dimensions simultaneously, the invention designs a dynamic spatio-temporal multi-graph convolution network (DSTGCN) based on graph convolution. Graph convolution generalizes the traditional convolution operation from structured data to non-Euclidean graph structures and can capture latent graph relations. On this basis, the traditional graph convolution method is extended to spatio-temporal graphs to capture dynamic graph information. DSTGCN aggregates neighbor node information, multi-order neighbor information, and historical information through static graph convolution, dynamic graph convolution, and autocorrelation graph convolution respectively. Aggregating information from these three perspectives allows the features of the graph nodes to be learned effectively.
Step 4.1: Construct the static graph convolution module. For the traffic network defined by geographic proximity, the similarity of node pairs is measured by a thresholded Gaussian kernel of their distance, and the aggregation is expressed as follows:

$$H_t^{(l)} = \sigma\left(\hat{A}\, H_t^{(l-1)} W^{(l)} + b^{(l)}\right)$$

where $H_t^{(l)}$ is the output of layer $l$ at time $t$, $H_t^{(l-1)}$ is the output of layer $l-1$ and also the input of layer $l$, and $W^{(l)}$ and $b^{(l)}$ are learnable parameters. $\sigma$ is the ReLU(·) activation function, and $\hat{A}$ is the adjacency matrix with self-loops, which is defined as follows:
In this definition, the distance term is the distance from sensor $v_i$ to sensor $v_j$, $\delta$ is the standard deviation, and $\epsilon$ is the threshold controlling the sparsity of the adjacency matrix $A$, set to 0.1. The redefined adjacency matrix includes $I_N$, the $N$-dimensional identity matrix, to enhance the importance of each node itself, and the degree matrix is computed from the redefined adjacency matrix.
Step 4.2: Construct the dynamic graph convolution module. Dynamic graph convolution adaptively adjusts the degree of correlation between nodes so that more important node information receives more attention. Specifically, the pairwise relationship between any two road nodes is modeled as:
Here the input is the output of layer (l-1) at time t, the result is the dynamic adjacency matrix of layer l at time t, and each of its entries is the relevance score between nodes i and j at layer l and time t. Graph information is then aggregated through the dynamic adjacency matrix:
Here the two graph-signal terms are the output and the input of layer l at time t, respectively, and the two remaining terms are learnable parameters. Next, the graph signals over multiple time steps are concatenated:
Here the result denotes the graph signal output over T_h time steps.
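As a rough illustration of Step 4.2, the sketch below computes a data-dependent adjacency matrix at each time step, aggregates node features through it, and collects the results over all time steps. The scaled dot-product score with a row-wise softmax, the ReLU output, and every name in the code are assumptions; the text above does not preserve the actual scoring function.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Plain matrix product of two nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def dynamic_adjacency(h_t):
    """Relevance scores between all node pairs at one time step (assumed form)."""
    d = len(h_t[0])
    scores = [[sum(x * y for x, y in zip(h_i, h_j)) / math.sqrt(d)
               for h_j in h_t] for h_i in h_t]
    return [softmax(row) for row in scores]  # each row sums to 1


def dynamic_graph_conv(h_t, weight, bias):
    """Aggregate node features through the dynamic adjacency, then ReLU."""
    a_t = dynamic_adjacency(h_t)
    mixed = matmul(matmul(a_t, h_t), weight)
    return [[max(0.0, v + bias[j]) for j, v in enumerate(row)] for row in mixed]


def over_time_steps(h_seq, weight, bias):
    """Apply the layer at every time step and collect the outputs."""
    return [dynamic_graph_conv(h_t, weight, bias) for h_t in h_seq]


# Two time steps, three nodes, two features (illustrative values only).
h_seq = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    [[0.5, 0.5], [1.0, 0.0], [0.0, 2.0]],
]
identity = [[1.0, 0.0], [0.0, 1.0]]
out = over_time_steps(h_seq, identity, [0.0, 0.0])
```

Because the adjacency is recomputed from the features at each time step, the weight given to a neighbor can change from one step to the next, unlike the fixed distance-based matrix of the static module.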
Step 4.3: an autocorrelation graph convolution module is constructed. The autocorrelation graph convolution module is mainly used for simulating nonlinear correlation between different time steps of a certain node. First, a time attention matrix is constructed:
Wherein the method comprises the steps of Is a traffic representation of node v i over T h time steps,/>Is a matrix representation of the interrelation of the nodes v i between the different time steps of the first layer. By considering the information of the time step earlier than the target time step, after the time attention matrix is obtained, the state of the node v i over T h time steps is updated:
Wherein the method comprises the steps of And/>A graph signal representing the output and input of node v i at the first layer,And/>Is a learnable parameter.
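The idea in Step 4.3 of attending only to earlier time steps can be sketched as a masked (causal) self-attention over one node's history. The scaled dot-product score, the choice to let each step also attend to itself, and the omission of learnable projections are simplifying assumptions made for this illustration.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def causal_temporal_attention(x):
    """Update each time step of one node from itself and earlier steps only.

    x is a list of T feature vectors for a single node.
    """
    d = len(x[0])
    updated = []
    for i in range(len(x)):
        # Scores against steps 0..i only; later steps are masked out.
        scores = [sum(a * b for a, b in zip(x[i], x[j])) / math.sqrt(d)
                  for j in range(i + 1)]
        weights = softmax(scores)
        updated.append([sum(weights[j] * x[j][k] for j in range(i + 1))
                        for k in range(d)])
    return updated


# Three time steps of a single node with two features (illustrative values).
series = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
states = causal_temporal_attention(series)
```

A consequence of the mask is that changing a later observation leaves the updated states of all earlier steps untouched.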
Step 5: a dynamic spatiotemporal multi-graph convolution (MGCN) ensemble model is constructed. The three different graph convolution of the previous design represent the space-time correlation from different angles, but in order to further enhance the characteristic expression capability of the node, the invention uses a gating fusion mechanism to fuse the nodes pairwise. Then, in order to reduce propagation errors between different predicted time steps over a long time frame, future traffic conditions are approximately described by establishing a nonlinear relationship of future time steps to past time steps, and the converted result is used as input of a decoder. The MGCN framework is mainly provided with two decoders for analyzing the change trend of traffic flow from the space dimension and the time dimension respectively. The multi-directional decoding is beneficial to comprehensively recognizing traffic conditions, so that future traffic flow can be effectively predicted.
Step 5.1: and constructing a gating fusion module. First a gate is defined to control the importance of both:
z=φ(XWz+HUz+bz)
Y=z⊙X+(1-z)⊙H
Wherein the method comprises the steps of And/>Is a learnable parameter, phi is a sigmoid activation function, z is a gate controlling the importance of both,/>Is a fused representation of the features.
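The two gating equations of Step 5.1 translate almost directly into code. The sketch below assumes X and H are matrices of the same shape (rows are nodes, columns are features) and that W_z and U_z are square, so that z has the same shape as X; these shape choices and the function names are assumptions.

```python
import math


def sigmoid(v):
    """Logistic function, playing the role of phi in the gate equation."""
    return 1.0 / (1.0 + math.exp(-v))


def matmul(a, b):
    """Plain matrix product of two nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def gated_fusion(x, h, w_z, u_z, b_z):
    """Compute z = sigmoid(X W_z + H U_z + b_z) and Y = z*X + (1 - z)*H."""
    xw = matmul(x, w_z)
    hu = matmul(h, u_z)
    fused = []
    for i in range(len(x)):
        row = []
        for j in range(len(x[0])):
            z = sigmoid(xw[i][j] + hu[i][j] + b_z[j])
            row.append(z * x[i][j] + (1.0 - z) * h[i][j])
        fused.append(row)
    return fused


# With zero weights and bias the gate is 0.5 everywhere, so Y is the
# element-wise average of X and H.
zeros = [[0.0, 0.0], [0.0, 0.0]]
y = gated_fusion([[2.0, 4.0]], [[0.0, 0.0]], zeros, zeros, [0.0, 0.0])
```

Since z lies strictly between 0 and 1, every entry of Y is a convex combination of the corresponding entries of X and H, so the gate interpolates between the two branches rather than amplifying either one.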
Step 5.2: a transition attention layer is constructed. Firstly, converting nodes in a traffic network into a matrix through a node2Vec algorithmIt retains the structural information of the road network. The past and future times are then encoded by one-hot as a time matrix/>Space-time embedding/>Is a space-time representation of nodes in a traffic network, and the conversion relationship is represented as follows:
Wherein the method comprises the steps of Is the spatiotemporal representation of node v i at time step t i,/>And/>Is a learnable parameter,/>Is the attention score of the future time step t i to the past time step t j. The encoded traffic characteristics are converted into a decoder by adaptively selecting the relevant characteristics of all historical time steps T h:
Wherein the method comprises the steps of Is a characteristic representation of node v i at future time step t i,/>Is a characteristic representation of node v i at historic time step t j,/>Is a learnable parameter.
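One way to picture Step 5.2 is as cross-attention for a single node: the spatio-temporal embedding of each future step acts as a query, the embeddings of the past steps act as keys, and the encoded historical features act as values. The sketch below follows that reading; the scaled dot-product score, the absence of learnable projections, and all names are assumptions for illustration.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def transform_attention(ste_future, ste_past, h_past):
    """Map encoded past features to future steps for one node.

    ste_future: embeddings of the future steps (queries)
    ste_past:   embeddings of the T_h past steps (keys)
    h_past:     encoded features of the T_h past steps (values)
    """
    d = len(ste_past[0])
    out = []
    for q in ste_future:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d)
                  for k in ste_past]
        weights = softmax(scores)  # attention of this future step to each past step
        out.append([sum(weights[j] * h_past[j][c] for j in range(len(h_past)))
                    for c in range(len(h_past[0]))])
    return out


# Two past steps with identical embeddings receive equal attention, so the
# future feature is the average of the past features (illustrative values).
future = transform_attention([[1.0, 0.0]], [[0.0, 1.0], [0.0, 1.0]],
                             [[1.0], [3.0]])
```

Because every future step reads directly from the encoded history, no predicted step is fed back as input to another, which is the property the text relies on to limit error propagation over long horizons.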
Step 6: a spatiotemporal attention mechanism is established. The model deploys two decoders to analyze the variation trend of traffic flow from the space and time dimensions respectively. The multi-directional decoding is beneficial to comprehensively recognizing traffic conditions, so that future traffic flow can be effectively predicted. Next, a spatial attention module and a temporal attention module are sequentially introduced.
Step 6.1: a spatial attention module is constructed. The invention designs a dynamic correlation between adaptive capture nodes of a spatial attention mechanism, which is expressed as:
Wherein, Is the attention score of nodes v i and v j at time step t i,/>Is a flow representation of the time step of t i for node v i. /(I)And/>Is a learnable parameter. By dynamically assigning different weights to different nodes at different time steps, new node state representations are learned:
Wherein the method comprises the steps of Is a characteristic representation of the time step of node v i at t i.
Step 6.2: a time attention module is constructed. The correlation between time steps is modeled adaptively by the time attention mechanism, expressed as:
Wherein, Is the attention fraction between t i and t j time steps of node v i,/>Is a flow representation of the time step of t i for node v i. /(I)And/>Is a parameter that can be learned, and after the attention score is obtained, the state of the node v i at time step t i is updated as follows:
Wherein the method comprises the steps of Is a characteristic representation of the time step of node v i at t i.
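The two attention modules of Step 6 differ mainly in the axis they attend over: spatial attention mixes the nodes within one time step, while temporal attention mixes the time steps of one node. The sketch below makes that contrast explicit on a nested list indexed as h[t][n][d]; the shared scaled dot-product scoring and the lack of learnable projections are assumptions.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def attend(vectors):
    """Scaled dot-product self-attention over a list of feature vectors."""
    d = len(vectors[0])
    out = []
    for q in vectors:
        w = softmax([sum(a * b for a, b in zip(q, k)) / math.sqrt(d)
                     for k in vectors])
        out.append([sum(w[j] * vectors[j][c] for j in range(len(vectors)))
                    for c in range(d)])
    return out


def spatial_attention(h):
    """Attend across nodes, separately at each time step."""
    return [attend(h_t) for h_t in h]


def temporal_attention(h):
    """Attend across time steps, separately for each node."""
    t_steps, n_nodes = len(h), len(h[0])
    per_node = [attend([h[t][n] for t in range(t_steps)])
                for n in range(n_nodes)]
    # Re-index from [node][time] back to [time][node].
    return [[per_node[n][t] for n in range(n_nodes)] for t in range(t_steps)]


# Two time steps, three nodes, two features (illustrative values only).
h = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    [[0.5, 0.5], [1.0, 0.0], [0.0, 2.0]],
]
h_spatial = spatial_attention(h)
h_temporal = temporal_attention(h)
```

Both functions preserve the input shape, so their outputs can be combined or passed on to a decoder without reshaping.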
Step 7: Experimental verification. To evaluate the contribution of the key components of the MGCN model proposed by the invention, ablation studies were performed on the NE-BJ dataset. The ablation results under three evaluation metrics are shown in Figs. 2, 3 and 4. The MGCN variants are named as follows:
MGCN-ND: MGCN without dynamic graph convolution; the dynamic graph convolution module is removed from MGCN.
MGCN-NS: MGCN without autocorrelation graph convolution; the autocorrelation graph convolution module is removed from MGCN.
MGCN-NTA: MGCN without temporal attention; the temporal attention module is removed from MGCN.
MGCN-NSA: MGCN without spatial attention; the spatial attention module is removed from MGCN.
The experimental results show that all four modules, namely the dynamic graph convolution, the autocorrelation graph convolution, the temporal attention module and the spatial attention module, are critical to the performance of MGCN. The fusion of the dynamic graph convolution module with the static graph convolution module confirms the effectiveness of combining dynamic and static graphs, and the fusion of the autocorrelation graph convolution, the temporal attention module and the spatial attention module confirms the effectiveness of spatio-temporal fusion. The autocorrelation graph convolution not only captures temporal dependence more deeply but also helps the overall model achieve good results.
To further verify the feasibility and effectiveness of the model, the invention visually compares the actual traffic flow with the traffic flow predicted by the model. Figs. 5, 6, 7 and 8 show the visual analysis results on the NE-BJ dataset for prediction horizons of 15 min, 30 min, 45 min and 1 hour, respectively.
The results in the figures show that overall traffic flow exhibits a certain periodicity in the time dimension. This periodicity explains, to some extent, the benefit of constructing the temporal graph convolution module and the autocorrelation graph convolution module, both of which capture more of the hidden temporal correlation. In addition, fusing the dynamic and static graphs in the model captures the characteristics of the traffic network in different time periods more effectively, so that abnormal peaks in traffic flow can be predicted accurately.
Claims (1)
1. An MGCN-based traffic prediction method, characterized by comprising the following steps:
definition: MGCN stands for Multi-Graph Convolutional Neural Network; it is a traffic prediction method based on dynamic spatio-temporal sequences whose core goal is to predict traffic data for a future time period from collected historical traffic data; the prediction model is broadly applicable to time-series prediction fields and handles complex time-series data effectively; the overall model is based on an encoder-decoder structure, in which a spatio-temporal convolution module simultaneously captures traffic network structure information, dynamic neighbor-node information and flow-change information in the time dimension and fuses them effectively to represent comprehensive, dynamic spatio-temporal correlation; the historical time sequence is then transformed into a representation of the future time sequence, and two decoders analyze future traffic conditions in the time dimension and the space dimension, respectively, and aggregate the multidimensional information; extensive experiments on two real large-scale datasets demonstrate the effectiveness and superiority of the model, which outperforms the other baseline methods; the specific steps are as follows:
Step 1: define the traffic road network representation; traffic flow prediction is a typical spatio-temporal prediction problem whose purpose is to predict future traffic conditions from the historical traffic data of the sensors observed in the traffic network; the traffic network under study is defined as a weighted undirected graph G = (V, E, A), where V is the set of vertices with N = |V|, N being the number of nodes deployed in the traffic network; E is the set of edges, representing the connectivity between node pairs; A is the weighted adjacency matrix, whose entries represent the proximity of nodes v_i and v_j; and the graph signal is the data collected by all sensors on the traffic network, where D is the number of features of each node;
Step 2: determine the input and output of the model; given the historical traffic data of the N nodes of the traffic network over T_h time steps, the goal is to learn a function f that predicts the traffic conditions of all nodes over the next T_h time steps, which can be expressed as:
Step 3: select a suitable split ratio for the dataset and preprocess the data; following the common split convention, 70% of the data is used for training, 10% for validation and the remaining 20% for testing, and the whole dataset is Z-score normalized;
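Steps 2 and 3 can be illustrated with a small preprocessing sketch: Z-score normalization of the whole dataset, a chronological 70/10/20 split, and sliding windows that pair T_h past steps with the following T_h steps. The chronological ordering of the split, the window stride of one step, and all function names are assumptions; computing the normalization statistics on the training portion only is a common alternative to the whole-dataset normalization described here.

```python
import math


def z_score_normalize(data):
    """Normalize every value with the mean and std of the whole dataset."""
    flat = [v for row in data for v in row]
    mean = sum(flat) / len(flat)
    std = math.sqrt(sum((v - mean) ** 2 for v in flat) / len(flat))
    return [[(v - mean) / std for v in row] for row in data]


def split_70_10_20(data):
    """Chronological split into training, validation and test portions."""
    n = len(data)
    n_train = n * 7 // 10  # integer arithmetic avoids float rounding
    n_val = n // 10
    return (data[:n_train],
            data[n_train:n_train + n_val],
            data[n_train + n_val:])


def sliding_windows(data, t_h):
    """Pair each block of t_h past steps with the next t_h steps."""
    return [(data[s:s + t_h], data[s + t_h:s + 2 * t_h])
            for s in range(len(data) - 2 * t_h + 1)]


# 20 time steps for 2 nodes with one feature each (illustrative values only).
raw = [[float(t), float(2 * t)] for t in range(20)]
normalized = z_score_normalize(raw)
train, val, test = split_70_10_20(normalized)
samples = sliding_windows(train, t_h=3)
```

Each element of `samples` is an (input, target) pair of the kind the function f in Step 2 is meant to map between.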
Step 4: constructing a dynamic space-time diagram convolution network (DSTGCN) space-time embedding module; in order to capture dynamic graph information in space and time dimensions simultaneously, a dynamic space-time multi-graph convolution network (DSTGCN) based on graph convolution is designed, the graph convolution is used for promoting traditional convolution operation from structured data to a non-Euclidean graph structure, potential graph relations can be captured, a traditional graph convolution method is extended to a space-time graph on the basis, dynamic graph information is captured, and the DSTGCN is used for respectively aggregating neighbor node information, multi-order neighbor information and historical information through static graph convolution, dynamic graph convolution and autocorrelation graph volumes, so that the characteristics of graph nodes can be effectively learned through the information aggregation method under three different view angles;
Step 4.1: constructing a static graph rolling module; the similarity of node pairs is measured by a threshold Gaussian kernel function distance of a traffic network defined based on geographic proximity, and is expressed as follows:
Wherein the method comprises the steps of Is the output of the L layer at time t-Is the output of layer (L-1) and is also the input of layer L,/>And/>Is a learnable parameter, σ is a ReLU (·) activation function,/>Is a self-circulating adjacency matrix, which is defined specifically as follows:
Wherein the method comprises the steps of Is the distance from node v i to node v j, delta is the standard deviation, epsilon is the threshold value controlling the sparsity of the adjacency matrix A, designated as 0.1,/>Is a redefined adjacency matrix, I N is an N-dimensional diagonal matrix, with the aim of enhancing the importance of the node itself,/>Is a degree matrix, wherein/>
Step 4.2: constructing a dynamic graph rolling module; the dynamic graph rolling method can adaptively adjust the correlation degree among nodes, so that more important node information is focused, and in particular, the paired relation among any road nodes is modeled as follows:
Wherein the method comprises the steps of Is the output of layer (L-1) at time t,/>Is the dynamic adjacency matrix of the layer L at time t,The relevance score of the L layer of the node i and the node j at the moment t is represented, and further, the graph information is aggregated through a dynamic adjacency matrix:
Wherein the method comprises the steps of And/>Output and input of the L-layer picture signal at the t moment,/>, respectivelyAndIs a parameter that can be learned, then, connect the graph signals over multiple time steps:
Wherein the method comprises the steps of A graph signal output representing T h time steps;
step 4.3: constructing an autocorrelation graph convolution module; the autocorrelation graph convolution module is mainly used for simulating nonlinear correlation between different time steps of a certain node, and firstly, a time attention matrix is constructed:
Wherein the method comprises the steps of Is a traffic representation of node v i over T h time steps,/>Is a matrix representation of the interrelationship of node v i between different time steps at the first level, by considering the information of the time steps earlier than the target time step, after obtaining the time attention matrix, updating the state of node v i over T h time steps:
Wherein the method comprises the steps of And/>Graph signal representing output and input of node v i at layer i,/>AndIs a learnable parameter;
Step 5: constructing a dynamic space-time multi-graph convolution (MGCN) integral model; three different graph convolution of the previous design represent space-time correlation from different angles, but in order to further enhance the characteristic expression capability of the nodes, a gating fusion mechanism is used for fusing the nodes in pairs, then in order to reduce propagation errors between different prediction time steps in a long time range, future traffic conditions are approximately described by establishing a nonlinear relation of the future time steps to the past time steps, a result after conversion is used as input of a decoder, two decoders are mainly deployed in MGCN framework, the change trend of traffic flow is analyzed from space dimension and time dimension respectively, and multi-direction decoding is beneficial to comprehensive cognitive traffic conditions, so that the future traffic flow is effectively predicted;
Step 5.1: constructing a gating fusion module; first a gate is defined to control the importance of both:
z=φ(XWz+HUz+bz)
Y=z⊙X+(1-z)⊙H
Wherein the method comprises the steps of And/>Is a learnable parameter, phi is a sigmoid activation function, z is a gate controlling the importance of both,/>Is a fused feature representation;
step 5.2: constructing a conversion attention layer; firstly, converting nodes in a traffic network into a matrix through a node2Vec algorithm It retains the structural information of the road network and then encodes the past and future times into a time matrix/>, by one-hotSpace-time embedding/>Is a space-time representation of nodes in a traffic network, and the conversion relationship is represented as follows:
Wherein the method comprises the steps of Is the spatiotemporal representation of node v i at time step t i,/>And/>Is a learnable parameter,/>Is the attention score of the future time step T i to the past time step T j, and the encoded traffic characteristics are converted into a decoder by adaptively selecting the relevant characteristics of all the historical time steps T h:
Wherein the method comprises the steps of Is a characteristic representation of node v i at future time step t i,/>Is a characteristic representation of node v i at historic time step t j,/>Is a learnable parameter;
Step 6: establishing a space-time attention mechanism; the model is provided with two decoders, the change trend of traffic flow is analyzed from space dimension and time dimension respectively, multi-direction decoding is beneficial to comprehensive cognition of traffic conditions, so that future traffic flow is effectively predicted, and then a space attention module and a time attention module are sequentially introduced;
Step 6.1: constructing a spatial attention module; a spatial attention mechanism is designed to adapt to the dynamic correlation between capture nodes, and the mechanism is expressed as:
Wherein, Is the attention score of nodes v i and v j at time step t i,/>Is a flow representation of node v i at time step t i,/>And/>Is a learnable parameter that learns new node state representations by dynamically assigning different weights to different nodes at different time steps:
Wherein the method comprises the steps of Is a characteristic representation of node v i at time step t i;
Step 6.2: constructing a time attention module; the correlation between time steps is modeled adaptively by the time attention mechanism, expressed as:
Wherein, Is the attention fraction between t i and t j time steps of node v i,/>Is a flow representation of node v i at time step t i,/>And/>Is a parameter that can be learned, and after the attention score is obtained, the state of the node v i at time step t i is updated as follows:
Wherein the method comprises the steps of Is a characteristic representation of node v i at time step t i;
step 7: experiment verification; to assess and understand the effects and performance of key components in the proposed MGCN model, ablation studies on the NE-BJ dataset were performed, and variants of MGCN were named as follows:
MGCN-ND: MGCN without dynamic graph convolution, the dynamic graph convolution module is removed from MGCN;
MGCN-NS: MGCN, without the autocorrelation graph convolution module, removes the autocorrelation graph convolution module from MGCN;
MGCN-NTA: MGCN without a temporal attention module, the temporal attention module is removed from MGCN;
MGCN-NSA: MGCN without a spatial attention module, the spatial attention module is removed from MGCN;
Experimental results show that the four modules, namely the dynamic graph convolution, the autocorrelation graph convolution, the time attention module and the spatial attention module, are crucial to the performance of MGCN, the fusion of the dynamic graph convolution module and the static graph module proves the effectiveness of dynamic and static combination, the fusion of the autocorrelation graph convolution, the time attention module and the spatial attention module proves the effectiveness of space-time fusion, and the design of the autocorrelation graph convolution not only can capture the time dependence more deeply, but also helps the overall model to achieve good results;
In order to further verify the feasibility and effectiveness of the model, visual analysis is carried out on the actual traffic flow and the traffic flow predicted by the model, and experimental results show that the overall traffic flow has a certain periodicity in the time dimension, the existence of periodicity is used for explaining the advantages of constructing a time graph rolling module and an autocorrelation graph rolling module to a certain extent, and the two modules can capture hidden time correlation in the time dimension.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210832846.6A CN115578851B (en) | 2022-07-14 | 2022-07-14 | MGCN-based traffic prediction method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115578851A (en) | 2023-01-06 |
CN115578851B (en) | 2024-06-07 |
Family
ID=84578722
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210832846.6A Active CN115578851B (en) | 2022-07-14 | 2022-07-14 | MGCN-based traffic prediction method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115578851B (en) |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115578852B (en) * | 2022-07-14 | 2024-06-14 | 西北师范大学 | DSTGCN-based traffic prediction method |
CN116050640B (en) * | 2023-02-01 | 2023-10-13 | 北京交通大学 | Short-time passenger flow prediction method of multi-mode traffic system based on self-adaptive multi-graph convolution |
CN116153089B (en) * | 2023-04-24 | 2023-06-27 | 云南大学 | Traffic flow prediction system and method based on space-time convolution and dynamic diagram |
CN116543554B (en) * | 2023-05-01 | 2024-05-14 | 兰州理工大学 | Space-time converter traffic flow prediction method based on dynamic correlation |
CN116205383B (en) * | 2023-05-05 | 2023-07-18 | 杭州半云科技有限公司 | Static dynamic collaborative graph convolution traffic prediction method based on meta learning |
CN116361662B (en) * | 2023-05-31 | 2023-08-15 | 中诚华隆计算机技术有限公司 | Training method of machine learning model and performance prediction method of quantum network equipment |
CN116960991B (en) * | 2023-09-21 | 2023-12-29 | 杭州半云科技有限公司 | Probability-oriented power load prediction method based on graph convolution network model |
CN117456736B (en) * | 2023-12-22 | 2024-03-12 | 湘江实验室 | Traffic flow prediction method based on multi-scale space-time dynamic interaction network |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2020024319A1 (en) * | 2018-08-01 | 2020-02-06 | 苏州大学张家港工业技术研究院 | Convolutional neural network based multi-point regression forecasting model for traffic flow forecasting |
CN112801404A (en) * | 2021-02-14 | 2021-05-14 | 北京工业大学 | Traffic prediction method based on self-adaptive spatial self-attention-seeking convolution |
CN113450568A (en) * | 2021-06-30 | 2021-09-28 | 兰州理工大学 | Convolutional network traffic flow prediction method based on space-time attention mechanism |
CN115240425A (en) * | 2022-07-26 | 2022-10-25 | 西北师范大学 | Traffic prediction method based on multi-scale space-time fusion graph network |
CN115578852A (en) * | 2022-07-14 | 2023-01-06 | 西北师范大学 | Traffic prediction method based on DSTGCN |
Non-Patent Citations (1)
Title |
---|
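Traffic station flow prediction based on a spatio-temporal multi-graph convolutional network; 荣斌; 武志昊; 刘晓辉; 赵苡积; 林友芳; 景一真; Computer Engineering (计算机工程); 2020-12-31 (No. 05); pp. 32-39 * |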
Also Published As
Publication number | Publication date |
---|---|
CN115578851A (en) | 2023-01-06 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant |