CN115620524B - Traffic jam prediction method, system, equipment and storage medium - Google Patents

Info

Publication number
CN115620524B
Authority
CN
China
Prior art keywords
data
traffic
congestion
event data
time
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211612325.6A
Other languages
Chinese (zh)
Other versions
CN115620524A (en)
Inventor
肖飞
张永敏
王艺锋
段思婧
何骁豪
王姗姗
孟陈莹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Central South University
Original Assignee
Central South University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Central South University
Priority to CN202211612325.6A
Publication of CN115620524A
Application granted
Publication of CN115620524B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G08: SIGNALLING
    • G08G: TRAFFIC CONTROL SYSTEMS
    • G08G1/00: Traffic control systems for road vehicles
    • G08G1/01: Detecting movement of traffic to be counted or controlled
    • G08G1/0104: Measuring and analyzing of parameters relative to traffic conditions
    • G08G1/0125: Traffic data processing
    • G08G1/0133: Traffic data processing for classifying traffic situation
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00: Administration; Management
    • G06Q10/04: Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00: Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10: Services
    • G06Q50/26: Government or public services

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • Tourism & Hospitality (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Development Economics (AREA)
  • Theoretical Computer Science (AREA)
  • Analytical Chemistry (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Game Theory and Decision Science (AREA)
  • Chemical & Material Sciences (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Educational Administration (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Traffic Control Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a traffic congestion prediction method, system, device and storage medium. The method acquires a multi-source heterogeneous data set and obtains traffic event data, congestion event data and station traffic data from it; extracts, by means of a convolutional neural network, spatial correlation features among the congestion event data, the traffic event data and the traffic congestion to be predicted; extracts, by means of a feature extraction network, time-dependency features between the station traffic data and congestion; obtains multi-factor data other than the traffic event data, congestion event data and station traffic data, and applies multi-classification processing and one-hot encoding to the multi-factor data to obtain multi-factor features; and splices the spatial correlation features, the time-dependency features and the multi-factor features into spatio-temporal joint features, which are input into a multilayer perceptron model to obtain a prediction result for the traffic congestion to be predicted. The invention can improve the accuracy of traffic congestion prediction.

Description

Traffic jam prediction method, system, equipment and storage medium
Technical Field
The present invention relates to the field of traffic congestion prediction technologies, and in particular, to a traffic congestion prediction method, system, device, and storage medium.
Background
Traffic congestion has always been a major concern for travelers, and traffic congestion prediction is an important research field in intelligent transportation systems. Drawing on big data, a congestion prediction model can forecast future traffic conditions from road conditions, station traffic and historical congestion data, and thereby guide travelers in planning trips, taking detours and avoiding peak hours. Existing research methods fall mainly into statistics-based methods, traditional machine learning methods and deep-learning-based methods. Statistics-based methods are designed mainly for small data sets, are ill-suited to complex, dynamic data, and cannot capture the relationships among features; traditional machine learning methods require manual feature extraction and cannot extract complex spatio-temporal features.
In China, an expressway section whose average traffic speed is below 30 km/h is regarded as a congested section. According to the data source provided by an APP, congestion data are uploaded to the cloud when the driving speed is below 30 km/h, and no congestion data are recorded when the driving speed is above 30 km/h. Because the data are discontinuous, congestion lengths are difficult to calculate, and congestion events are difficult to distinguish, congestion prediction is harder than the prediction of other traffic quantities (such as expressway station traffic or expressway section travel speed).
Disclosure of Invention
The present invention is directed to solving at least one of the problems of the prior art. Therefore, the invention provides a traffic jam prediction method, a system, equipment and a storage medium, which can improve the accuracy of traffic jam prediction.
In a first aspect, an embodiment of the present invention provides a traffic congestion prediction method, where the traffic congestion prediction method includes:
acquiring a multi-source heterogeneous data set, and obtaining traffic event data, congestion event data and station traffic data from the multi-source heterogeneous data set;
extracting, by a convolutional neural network, spatial correlation features among the congestion event data, the traffic event data and the traffic congestion to be predicted;
extracting, by a feature extraction network, time-dependency features between the station traffic data and congestion, the feature extraction network being formed by fusing a gated recurrent unit (GRU) and an attention mechanism;
obtaining, from the multi-source heterogeneous data set, multi-factor data other than the traffic event data, the congestion event data and the station traffic data, and performing multi-classification processing and one-hot encoding on the multi-factor data to obtain multi-factor features;
and splicing the spatial correlation features, the time-dependency features and the multi-factor features to obtain spatio-temporal joint features, and inputting the spatio-temporal joint features into a multilayer perceptron model to obtain a prediction result for the traffic congestion to be predicted.
Compared with the prior art, the first aspect of the invention has the following beneficial effects:
By acquiring a multi-source heterogeneous data set and obtaining traffic event data, congestion event data and station traffic data from it, the method draws on several kinds of data and considers traffic congestion from multiple aspects, which improves the accuracy of traffic congestion prediction. Spatial correlation features among the congestion event data, the traffic event data and the traffic congestion to be predicted are extracted by a convolutional neural network, and time-dependency features between the station traffic data and congestion are extracted by a feature extraction network formed by fusing a gated recurrent unit (GRU) and an attention mechanism. Multi-factor data other than the traffic event data, congestion event data and station traffic data are obtained from the multi-source heterogeneous data set and subjected to multi-classification processing and one-hot encoding to obtain multi-factor features. Because different data are processed by different feature extraction methods, the relationships among features can be captured and feature extraction becomes more effective. The spatial correlation features, the time-dependency features and the multi-factor features are spliced into spatio-temporal joint features, which are input into the multilayer perceptron model to obtain a prediction result for the traffic congestion to be predicted; predicting congestion from several types of features in this way can improve prediction accuracy.
According to some embodiments of the invention, the obtaining traffic event data, congestion event data and station traffic data from the multi-source heterogeneous data set comprises:
performing one-hot encoding on the traffic events and congestion events in the multi-source heterogeneous data set to obtain the traffic event data and the congestion event data, and normalizing the station traffic in the multi-source heterogeneous data set to obtain the station traffic data.
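As an illustration of this preprocessing, the following Python sketch one-hot encodes an event type and normalizes station traffic counts. The event categories and the choice of min-max normalization are assumptions made for the example; the text above does not specify either.

```python
def one_hot(value, categories):
    """Return a 0/1 list with a single 1 at the position of `value`."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if value == c else 0 for c in categories]


def min_max_normalize(values):
    """Scale values linearly into [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical event types; the patent text does not enumerate them.
EVENT_TYPES = ["accident", "construction", "weather"]

encoded_event = one_hot("construction", EVENT_TYPES)
normalized_flow = min_max_normalize([120, 300, 210, 480])
```

Here `encoded_event` holds the encoded traffic event and `normalized_flow` holds the station traffic rescaled to the unit interval.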
According to some embodiments of the invention, the extracting the spatial correlation characteristics between the congestion event data, the traffic event data and the traffic congestion to be predicted by using a convolutional neural network comprises:
presetting a first historical time step, and acquiring, within the preset first historical time step, historical congestion event data and historical traffic event data for a first quantity of geographically adjacent locations;
splicing the historical traffic event data and the historical congestion event data to obtain a spliced data sequence;
and inputting the spliced data sequence into the convolutional neural network to obtain the spatial correlation features among the congestion event data, the traffic event data and the traffic congestion to be predicted.
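The splicing step can be sketched as follows, assuming (this is not stated above) that the histories are stored as nested lists indexed by time step and then by location, and that splicing means concatenating the two feature vectors of each location. The function name `build_spliced_sequence` is hypothetical.

```python
def build_spliced_sequence(congestion_hist, event_hist):
    """Concatenate congestion and traffic-event vectors per time step and location.

    Both inputs are indexed as [time_step][location] -> list of features.
    """
    if len(congestion_hist) != len(event_hist):
        raise ValueError("histories must cover the same time steps")
    spliced = []
    for cong_step, event_step in zip(congestion_hist, event_hist):
        if len(cong_step) != len(event_step):
            raise ValueError("each time step must cover the same locations")
        spliced.append([c + e for c, e in zip(cong_step, event_step)])
    return spliced


# Two time steps, two locations; congestion vectors have 2 entries, event vectors 3.
congestion_hist = [[[1, 0], [0, 1]], [[0, 1], [0, 1]]]
event_hist = [[[0, 1, 0], [0, 0, 1]], [[1, 0, 0], [0, 0, 1]]]
spliced = build_spliced_sequence(congestion_hist, event_hist)
```

The result keeps the time and location axes and widens only the feature axis, which is the layout a convolutional layer would then scan.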
According to some embodiments of the invention, inputting the spliced data sequence into the convolutional neural network to obtain the spatial correlation features among the congestion event data, the traffic event data and the traffic congestion to be predicted comprises:
inputting the spliced data sequence into the convolutional neural network, and processing the spliced data sequence through convolutional layers and pooling layers:
$$C_1 = \mathrm{ReLU}(W_1 * E + b_1), \qquad P_1 = \max(C_1) + b_2$$
$$C_2 = \mathrm{ReLU}(W_2 * P_1 + b_3), \qquad P_2 = \max(C_2) + b_4$$
wherein $C_1$ and $C_2$ represent the outputs of the convolutional layers, $E$ represents the spliced data sequence, $W_1$ and $W_2$ represent weight matrices, $b_1$, $b_2$, $b_3$ and $b_4$ represent bias matrices, ReLU represents the activation function, $\max$ represents the maximum function, $P_1$ and $P_2$ represent the outputs of the pooling layers, and $*$ represents the convolution operation;
after the convolutional layers and the pooling layers have processed the spliced data sequence, inputting the output $P_2$ of the last pooling layer into a fully connected layer to obtain the spatial correlation features, which are expressed as:
$$S_t = \mathrm{ReLU}(W_S P_2 + b_S)$$
wherein $S_t$ represents the spatial correlation features at time t among the congestion event data, the traffic event data and the traffic congestion to be predicted, $W_S$ represents a weight matrix, and $b_S$ represents a bias matrix.
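A minimal sketch of the forward pass described in this embodiment (convolution with ReLU, max pooling, a second convolution and pooling stage, then a fully connected layer) is given below in plain Python. It is illustrative only: the input is simplified to a one-dimensional sequence, and the kernel sizes, pooling window of 2, parameter values, and the ReLU applied after the fully connected layer are all assumptions rather than details taken from the patent.

```python
def relu(xs):
    return [max(0.0, x) for x in xs]


def conv1d(xs, kernel, bias):
    """Valid 1D cross-correlation (the 'convolution' used in CNNs) plus a bias."""
    k = len(kernel)
    return [sum(xs[i + j] * kernel[j] for j in range(k)) + bias
            for i in range(len(xs) - k + 1)]


def max_pool1d(xs, size, bias=0.0):
    """Non-overlapping max pooling plus a bias; a trailing partial window is dropped."""
    return [max(xs[i:i + size]) + bias
            for i in range(0, len(xs) - size + 1, size)]


def dense(xs, weights, biases):
    """Fully connected layer: one weight row and one bias per output unit."""
    return [sum(w * x for w, x in zip(row, xs)) + b
            for row, b in zip(weights, biases)]


def spatial_features(e, k1, b1, k2, b2, w_fc, b_fc, pool_b1=0.0, pool_b2=0.0):
    c1 = relu(conv1d(e, k1, b1))        # first convolutional layer
    p1 = max_pool1d(c1, 2, pool_b1)     # first pooling layer
    c2 = relu(conv1d(p1, k2, b2))       # second convolutional layer
    p2 = max_pool1d(c2, 2, pool_b2)     # second pooling layer
    return relu(dense(p2, w_fc, b_fc))  # fully connected layer (ReLU assumed)


features = spatial_features(
    [1, 2, 3, 4, 5, 6, 7, 8],
    k1=[1, 0], b1=0.0, k2=[1, 1], b2=-5.0,
    w_fc=[[2.0], [-1.0]], b_fc=[0.5, 0.0],
)
```

In practice the spliced data sequence is multi-dimensional and the parameters are learned; the sketch only shows how the layers compose.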
According to some embodiments of the invention, extracting the time-dependency features between the station traffic data and congestion through the feature extraction network comprises:
presetting a second historical time step, and acquiring, within the second historical time step, inbound station traffic data and outbound station traffic data for a second quantity of stations ranked highest by geographical position;
splicing the inbound station traffic data and the outbound station traffic data to obtain spliced station traffic data;
inputting the spliced station traffic data into the gated recurrent unit (GRU) to obtain a first vector, the first vector $h_t$ output at step t of the gated recurrent unit being expressed as:
$$h_t = \mathrm{GRU}(x_{t-1}, x_t)$$
wherein $x_{t-1}$ represents the spliced station traffic data of step t-1, $x_t$ represents the spliced station traffic data of step t, and GRU represents the gated recurrent unit;
inputting the first vector into the attention mechanism to obtain a second vector, the attention mechanism being calculated as:
$$e_t = u \tanh(w h_t + b), \qquad \alpha_t = \frac{\exp(e_t)}{\sum_{j=1}^{i} \exp(e_j)}$$
wherein $e_t$ represents the attention distribution value of the vector $h_t$ output by the gated recurrent unit at time t, $u$ and $w$ represent weight coefficients, $b$ represents the bias coefficient, $e_j$ represents the attention distribution value of the vector output by the gated recurrent unit at time j, $\alpha_t$ represents the attention weight, and $i$ represents the total time;
passing the vector output by the attention mechanism through a fully connected layer to obtain the time-dependency features, the fully connected layer being calculated as:
$$T_t = \mathrm{ReLU}(W_T v_t + b_T)$$
wherein $T_t$ represents the time-dependency features at time t, $v_t$ denotes the vector output by the attention mechanism, $W_T$ represents a weight matrix, $b_T$ represents the bias vector, and ReLU represents the activation function.
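The temporal branch described in this embodiment can be sketched in plain Python as a recurrent unit followed by attention and a fully connected layer. Several parts are assumptions: the patent does not give the internal gate equations, so the sketch uses the textbook GRU recurrence (which carries a hidden state from step to step); the attention output is taken to be the attention-weighted sum of the GRU outputs; and all parameter values and the helper names are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def gru_step(x, h_prev, params):
    """One step of a textbook GRU; params[gate] lists (w_x, w_h, b) per hidden unit."""
    z = [sigmoid(dot(wx, x) + dot(wh, h_prev) + b) for wx, wh, b in params["update"]]
    r = [sigmoid(dot(wx, x) + dot(wh, h_prev) + b) for wx, wh, b in params["reset"]]
    gated = [ri * hi for ri, hi in zip(r, h_prev)]
    cand = [math.tanh(dot(wx, x) + dot(wh, gated) + b)
            for wx, wh, b in params["candidate"]]
    return [(1.0 - zi) * hp + zi * ci for zi, hp, ci in zip(z, h_prev, cand)]


def attention_weights(states, w, b, u):
    """Score each state as u . tanh(W h + b), then apply a softmax over time."""
    scores = []
    for h in states:
        hidden = [math.tanh(dot(row, h) + bi) for row, bi in zip(w, b)]
        scores.append(dot(u, hidden))
    top = max(scores)                      # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_features(sequence, hidden_size, gru_params, att, w_fc, b_fc):
    h = [0.0] * hidden_size
    states = []
    for x in sequence:                     # run the GRU over the history window
        h = gru_step(x, h, gru_params)
        states.append(h)
    alphas = attention_weights(states, att["w"], att["b"], att["u"])
    context = [sum(a * s[k] for a, s in zip(alphas, states))
               for k in range(hidden_size)]
    feats = [max(0.0, dot(row, context) + bi) for row, bi in zip(w_fc, b_fc)]
    return feats, alphas


gate = [([0.5, -0.5], [0.1], 0.0)]
gru_params = {"update": gate, "reset": gate, "candidate": gate}
attention = {"w": [[1.0]], "b": [0.0], "u": [1.0]}
flows = [[0.1, 0.3], [0.4, 0.2], [0.9, 0.5]]  # normalized inbound/outbound traffic per step
features, weights = temporal_features(flows, 1, gru_params, attention, [[1.0]], [0.0])
```

The attention weights sum to one, so the weighted sum stays on the same scale as the individual GRU outputs regardless of the window length.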
According to some embodiments of the invention, performing the multi-classification processing and the one-hot encoding on the multi-factor data to obtain the multi-factor features comprises:
if the multi-factor data in the multi-source heterogeneous data set are binary categorical variables, representing the multi-factor data as 0-1 variables through the multi-classification processing to obtain binary factor data, and mapping the binary factor data into the multi-factor features through one-hot encoding;
and if the multi-factor data in the multi-source heterogeneous data set are multi-class categorical variables, mapping the multi-factor data into the multi-factor features through one-hot encoding.
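The two branches above can be sketched in Python as follows. The factor names (`is_holiday`, `weather`), their categories, and the `schema` layout are hypothetical and chosen only for illustration; the passage above does not list the factors.

```python
def one_hot(value, categories):
    index = categories.index(value)  # raises ValueError for an unknown category
    return [1 if i == index else 0 for i in range(len(categories))]


def encode_factors(record, schema):
    """Encode each factor and splice the results into one feature vector.

    `schema` is a list of (name, kind, categories); kind is "binary" or "multi".
    """
    features = []
    for name, kind, categories in schema:
        if kind == "binary":
            flag = 1 if record[name] else 0          # represent as a 0-1 variable
            features.extend(one_hot(flag, [0, 1]))   # then one-hot encode it
        elif kind == "multi":
            features.extend(one_hot(record[name], categories))
        else:
            raise ValueError(f"unknown factor kind: {kind!r}")
    return features


schema = [
    ("is_holiday", "binary", None),
    ("weather", "multi", ["clear", "rain", "snow"]),
]
factor_features = encode_factors({"is_holiday": True, "weather": "rain"}, schema)
```

Each factor contributes a fixed-width slice, so the resulting vector has the same length for every record.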
According to some embodiments of the invention, splicing the spatial correlation features, the time-dependency features and the multi-factor features to obtain the spatio-temporal joint features, and inputting the spatio-temporal joint features into the multilayer perceptron model to obtain the prediction result for the traffic congestion to be predicted, comprises:
inputting the spatio-temporal joint features into the multilayer perceptron model, and calculating through a hidden layer and an output layer to obtain the traffic congestion prediction result, wherein the calculation of the hidden layer comprises:
$$X_t = \mathrm{concat}(S_t, T_t, M_t), \qquad H_t = \mathrm{ReLU}(W_H * X_t + b_H)$$
wherein $S_t$ represents the spatial correlation features at time t, $T_t$ represents the time-dependency features at time t, $M_t$ represents the multi-factor features at time t, concat represents the splicing function, $X_t$ represents the spatio-temporal joint features, $H_t$ represents the feature vector output by the hidden layer, $W_H$ represents a weight matrix, $b_H$ represents a bias matrix, ReLU represents the activation function, and $*$ represents the convolution operation;
inputting the feature vector output by the hidden layer into the output layer, the calculation of the output layer comprising:
$$\hat{y}_{t+1} = f(W_O H_t + b_O)$$
wherein $\hat{y}_{t+1}$ represents the prediction result for the traffic congestion to be predicted at time t+1, $H_t$ denotes the feature vector output by the hidden layer, $W_O$ represents a weight matrix, $b_O$ represents a bias matrix, and $f$ represents the activation function.
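The fusion and prediction step can be sketched as below. The sketch treats the hidden-layer operator as an ordinary matrix product and uses a sigmoid as the output activation; both are assumptions, since the text above names only "the activation function" for the output layer. All weights are made-up values.

```python
import math


def dense(xs, weights, biases):
    """Fully connected layer: one weight row and one bias per output unit."""
    return [sum(w * x for w, x in zip(row, xs)) + b
            for row, b in zip(weights, biases)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def predict_congestion(spatial, temporal, factors, w_h, b_h, w_o, b_o, out_act=sigmoid):
    joint = list(spatial) + list(temporal) + list(factors)  # splice the feature groups
    hidden = [max(0.0, v) for v in dense(joint, w_h, b_h)]   # hidden layer with ReLU
    return [out_act(v) for v in dense(hidden, w_o, b_o)]     # output layer


prediction = predict_congestion(
    spatial=[1.0], temporal=[2.0], factors=[0, 1],
    w_h=[[1.0, 1.0, 1.0, 1.0], [-1.0, -1.0, -1.0, -1.0]], b_h=[0.0, 0.0],
    w_o=[[0.5, 1.0]], b_o=[-2.0],
)
```

Passing a different `out_act` (for example the identity for a regression target) changes only the last line of the function.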
In a second aspect, an embodiment of the present invention further provides a traffic congestion prediction system, where the traffic congestion prediction system includes:
the data acquisition unit is used for acquiring a multi-source heterogeneous data set and acquiring traffic event data, congestion event data and station flow data from the multi-source heterogeneous data set;
the first feature extraction unit is used for extracting the spatial correlation features among the congestion event data, the traffic event data and the traffic congestion to be predicted by adopting a convolutional neural network;
the second feature extraction unit is used for extracting the time-dependency features between the station traffic data and congestion through a feature extraction network, the feature extraction network being formed by fusing a gated recurrent unit (GRU) and an attention mechanism;
the third feature extraction unit is used for obtaining, from the multi-source heterogeneous data set, multi-factor data other than the traffic event data, the congestion event data and the station traffic data, and performing multi-classification processing and one-hot encoding on the multi-factor data to obtain multi-factor features;
and the prediction result acquisition unit is used for splicing the spatial correlation characteristic, the time dependency relationship characteristic and the multiple factor characteristics to obtain a space-time combined characteristic, and inputting the space-time combined characteristic into the multilayer perceptron model to obtain the prediction result of the traffic jam to be predicted.
In a third aspect, an embodiment of the present invention further provides a traffic congestion prediction apparatus, including at least one control processor and a memory, which is communicatively connected to the at least one control processor; the memory stores instructions executable by the at least one control processor to enable the at least one control processor to perform a method of traffic congestion prediction as described above.
In a fourth aspect, the present invention also provides a computer-readable storage medium storing computer-executable instructions for causing a computer to execute a traffic congestion prediction method as described above.
It is to be understood that the advantageous effects of the second aspect to the fourth aspect compared to the related art are the same as the advantageous effects of the first aspect compared to the related art, and reference may be made to the related description of the first aspect, which is not repeated herein.
Drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
FIG. 1 is a flow chart of a traffic congestion prediction method according to an embodiment of the present invention;
FIG. 2 is a block diagram of a traffic congestion prediction system according to an embodiment of the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the accompanying drawings are illustrative only for the purpose of explaining the present invention, and are not to be construed as limiting the present invention.
In the description of the present invention, terms such as "first" and "second" are used only to distinguish technical features; they are not to be understood as indicating or implying relative importance, the number of the indicated technical features, or the order of the indicated technical features.
In the description of the present invention, it should be understood that the orientation or positional relationship referred to, for example, the upper, lower, etc., is indicated based on the orientation or positional relationship shown in the drawings, and is only for convenience of description and simplification of description, but does not indicate or imply that the device or element referred to must have a specific orientation, be constructed in a specific orientation, and be operated, and thus should not be construed as limiting the present invention.
In the description of the present invention, it should be noted that unless otherwise explicitly defined, terms such as arrangement, installation, connection and the like should be broadly understood, and those skilled in the art can reasonably determine the specific meanings of the above terms in the present invention in combination with the specific contents of the technical solutions.
First, several terms used in the present application are explained:
Decision tree: a non-parametric supervised learning algorithm with a hierarchical tree structure consisting of a root node, branches, internal nodes and leaf nodes.
Extreme gradient boosting tree (XGBoost): an ensemble learning algorithm based on decision trees.
Random forest: a classifier that contains multiple decision trees and whose output class is the mode of the classes output by the individual trees.
K-nearest neighbor algorithm: a non-parametric statistical method for classification and regression.
Long short-term memory network (LSTM): a variant of the recurrent neural network; a deep learning model widely used in time series prediction.
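Two of the definitions above rest on the same idea of taking the mode of several votes. The following Python sketch shows a majority vote and a small K-nearest-neighbor classifier built on it; the sample points and the labels "free" and "jam" are made up for illustration.

```python
from collections import Counter
import math


def majority_vote(labels):
    """Return the most common label (the mode); ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


def knn_classify(query, samples, k):
    """Classify `query` by the mode of the labels of its k nearest samples."""
    nearest = sorted(samples, key=lambda s: math.dist(query, s[0]))[:k]
    return majority_vote([label for _, label in nearest])


samples = [
    ((0.0, 0.0), "free"), ((0.0, 1.0), "free"),
    ((5.0, 5.0), "jam"), ((6.0, 5.0), "jam"), ((5.0, 6.0), "jam"),
]
label_near_origin = knn_classify((0.2, 0.1), samples, k=3)
label_near_cluster = knn_classify((5.0, 5.5), samples, k=3)
```

A random forest applies the same `majority_vote` to the classes returned by its individual trees rather than to the labels of neighboring samples.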
Traffic congestion has always been a major concern for travelers, and traffic congestion prediction is an important research field in intelligent transportation systems. Drawing on big data, a congestion prediction model can forecast future traffic conditions from road conditions, station traffic and historical congestion data, and thereby guide travelers in planning trips, taking detours and avoiding peak hours. Existing research methods fall mainly into statistics-based methods, traditional machine learning methods and deep-learning-based methods. Statistics-based methods are designed mainly for small data sets, are ill-suited to complex, dynamic data, and cannot capture the relationships among features; traditional machine learning methods require manual feature extraction and cannot extract complex spatio-temporal features.
In China, an expressway section whose average traffic speed is below 30 km/h is regarded as a congested section. According to the data source provided by an APP, congestion data are uploaded to the cloud when the driving speed is below 30 km/h, and no congestion data are recorded when the driving speed is above 30 km/h. Because the data are discontinuous, congestion lengths are difficult to calculate, and congestion events are difficult to distinguish, congestion prediction is harder than the prediction of other traffic quantities (such as expressway station traffic or expressway section travel speed).
To solve the above problems, the present application acquires a multi-source heterogeneous data set and obtains traffic event data, congestion event data and station traffic data from it, so that several kinds of data are used and traffic congestion is considered from multiple aspects, which improves the accuracy of traffic congestion prediction. Spatial correlation features among the congestion event data, the traffic event data and the traffic congestion to be predicted are extracted by a convolutional neural network, and time-dependency features between the station traffic data and congestion are extracted by a feature extraction network formed by fusing a gated recurrent unit (GRU) and an attention mechanism. Multi-factor data other than the traffic event data, congestion event data and station traffic data are obtained from the multi-source heterogeneous data set and subjected to multi-classification processing and one-hot encoding to obtain multi-factor features. Because different data are processed by different feature extraction methods, the relationships among features can be captured and feature extraction becomes more effective. The spatial correlation features, the time-dependency features and the multi-factor features are spliced into spatio-temporal joint features, which are input into the multilayer perceptron model to obtain a prediction result for the traffic congestion to be predicted; predicting congestion from several types of features in this way can improve prediction accuracy.
Referring to fig. 1, an embodiment of the present invention provides a traffic congestion prediction method, which includes, but is not limited to, steps S100 to S500:
Step S100: acquiring a multi-source heterogeneous data set, and obtaining traffic event data, congestion event data and station traffic data from the multi-source heterogeneous data set;
Step S200: extracting, by a convolutional neural network, spatial correlation features among the congestion event data, the traffic event data and the traffic congestion to be predicted;
Step S300: extracting, by a feature extraction network, time-dependency features between the station traffic data and congestion, the feature extraction network being formed by fusing a gated recurrent unit (GRU) and an attention mechanism;
Step S400: obtaining, from the multi-source heterogeneous data set, multi-factor data other than the traffic event data, the congestion event data and the station traffic data, and performing multi-classification processing and one-hot encoding on the multi-factor data to obtain multi-factor features;
Step S500: splicing the spatial correlation features, the time-dependency features and the multi-factor features to obtain spatio-temporal joint features, and inputting the spatio-temporal joint features into the multilayer perceptron model to obtain a prediction result for the traffic congestion to be predicted.
In steps S100 to S500 of some embodiments, a multi-source heterogeneous data set is acquired, and traffic event data, congestion event data and station traffic data are obtained from it, so that traffic congestion is considered from multiple aspects and the accuracy of traffic congestion prediction is improved. To capture the relationships among features and make feature extraction more effective, spatial correlation features among the congestion event data, the traffic event data and the traffic congestion to be predicted are extracted by a convolutional neural network, and time-dependency features between the station traffic data and congestion are extracted by a feature extraction network formed by fusing a gated recurrent unit (GRU) and an attention mechanism; multi-factor data other than the traffic event data, congestion event data and station traffic data are obtained from the multi-source heterogeneous data set and subjected to multi-classification processing and one-hot encoding to obtain multi-factor features. To improve the accuracy of traffic congestion prediction, the spatial correlation features, the time-dependency features and the multi-factor features are spliced into spatio-temporal joint features, which are input into the multilayer perceptron model to obtain a prediction result for the traffic congestion to be predicted.
In some embodiments, obtaining traffic event data, congestion event data, and site traffic data from a multi-source heterogeneous dataset comprises:
the traffic event and the congestion event in the multi-source heterogeneous data set are subjected to one-hot coding processing to obtain traffic event data and congestion event data, and the station flow in the multi-source heterogeneous data set is subjected to normalization processing to obtain station flow data.
In the embodiment, a different extraction method is applied to each type of data in the multi-source heterogeneous data set, so that every type of data can be extracted effectively.
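The two preprocessing operations named above can be sketched as follows. This is a minimal illustration only: the event categories are hypothetical, and min-max scaling is an assumption, since the text says only that the station flow is "normalized".

```python
def one_hot(value, categories):
    """Return a 0-1 list with a single 1 at the position of `value`."""
    return [1 if value == c else 0 for c in categories]


def min_max_normalize(values):
    """Scale a list of numbers linearly into [0, 1] (assumed normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant series: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical event categories and station counts, for illustration only.
EVENT_TYPES = ["accident", "construction", "control", "heavy_flow"]
print(one_hot("construction", EVENT_TYPES))  # [0, 1, 0, 0]
print(min_max_normalize([120, 300, 480]))    # [0.0, 0.5, 1.0]
```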
In some embodiments, extracting spatial correlation features between congestion event data, traffic event data, and traffic congestion to be predicted using a convolutional neural network comprises:
presetting a first historical time step, and acquiring, within the preset first historical time step, historical congestion event data and historical traffic event data for a first quantity of geographically adjacent road sections;
splicing the historical data of the traffic event data and the historical data of the congestion event data to obtain a spliced data sequence;
and inputting the spliced data sequence into a convolutional neural network to obtain the spatial correlation characteristics among the congestion event data, the traffic event data and the traffic congestion to be predicted.
It should be noted that the first historical time step and the first quantity of this embodiment may be changed according to an actual situation, and this embodiment is not limited specifically.
In the embodiment, the convolutional neural network is adopted, so that the spatial correlation characteristics between the congestion event data and the traffic event data can be effectively extracted, the relation between the characteristics can be captured, and the feature extraction effectiveness is improved.
In some embodiments, inputting the stitched data sequence into a convolutional neural network to obtain spatial correlation characteristics between congestion event data, traffic event data, and traffic congestion to be predicted, includes:
inputting the spliced data sequence into the convolutional neural network and processing it through the convolutional layers and the pooling layers:
[formula image: convolutional-layer and pooling-layer computation]
wherein the symbols in the formula denote, respectively: the outputs of the convolutional layers; E, the spliced data sequence; the weight matrices; the bias matrices; ReLU, the activation function; the maximum function; the outputs of the pooling layers; and the convolution operation;
after the spliced data sequence has been processed by the convolutional layers and the pooling layers, inputting the pooling-layer output to a fully connected layer to obtain the spatial correlation feature, which is expressed as:
[formula image: fully connected layer computation]
wherein the symbols in the formula denote, respectively: the spatial correlation feature at time t among the congestion event data, the traffic event data and the traffic congestion to be predicted; a weight matrix; and a bias matrix.
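The computation described above can be sketched in equation form. The symbol names below (C_1, P_1, W_1, b_1, S_t and so on) are assumptions introduced here for illustration and are not the patent's own notation; the sketch assumes two convolution-pooling stages followed by a fully connected layer with ReLU, which is one plausible reading of the textual definitions.

```latex
\begin{aligned}
C_1 &= \mathrm{ReLU}(W_1 * E + b_1),   & P_1 &= \max(C_1),\\
C_2 &= \mathrm{ReLU}(W_2 * P_1 + b_2), & P_2 &= \max(C_2),\\
S_t &= \mathrm{ReLU}(W_s P_2 + b_s).
\end{aligned}
```

Here * stands for the convolution operation, max for max pooling, and S_t for the spatial correlation feature at time t.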
In some embodiments, extracting the time dependency characteristics of the station traffic data and the congestion through the characteristic extraction network comprises:
presetting a second historical time step, and acquiring, within the second historical time step, inbound station traffic data and outbound station traffic data for a second quantity of stations ranked nearest by geographic position;
splicing the inbound site traffic data and the outbound site traffic data to obtain spliced site traffic data;
inputting the spliced station traffic data into the gated recurrent unit to obtain a first vector, the first vector output at step t of the gated recurrent unit being expressed as:
[formula image: gated recurrent unit computation]
wherein the symbols in the formula denote, respectively: the spliced station traffic data at step t-1; the spliced station traffic data at step t; and GRU, the gated recurrent unit;
inputting the first vector into the attention mechanism to obtain a second vector, the attention mechanism being calculated by the formula:
[formula image: attention mechanism computation]
wherein the symbols in the formula denote, respectively: the attention distribution value of the vector output by the gated recurrent unit at time t; the weight coefficients; the bias coefficient; the attention distribution value of the vector output by the gated recurrent unit at time j; the attention weight; and i, the total time;
passing the vector output by the attention mechanism through a fully connected layer to obtain the time dependency feature, the fully connected layer being calculated by the formula:
[formula image: fully connected layer computation]
wherein the symbols in the formula denote, respectively: the time dependency feature at time t; a weight matrix; a bias vector; and ReLU, the activation function.
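The attention step can be illustrated with a small pure-Python sketch. The scoring form (a tanh layer followed by a dot product) is an assumption, since the text states only that the score uses weight coefficients and a bias coefficient; the names `attention_pool`, `W`, `b` and `u` are hypothetical. The normalization of scores into attention weights is a standard softmax over the time steps.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def attention_pool(hidden_states, W, b, u):
    """Weight a sequence of GRU outputs by attention and sum them.

    Assumed score: e_t = u . tanh(W h_t + b); weights: softmax over t.
    Returns (attention weights, weighted sum of hidden states).
    """
    scores = []
    for h in hidden_states:
        z = [math.tanh(a + bias) for a, bias in zip(matvec(W, h), b)]
        scores.append(sum(ui * zi for ui, zi in zip(u, z)))
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(hidden_states[0])
    context = [
        sum(wt * h[k] for wt, h in zip(weights, hidden_states))
        for k in range(dim)
    ]
    return weights, context


# Toy example: two time steps, two hidden units, identity weight matrix.
identity = [[1.0, 0.0], [0.0, 1.0]]
weights, context = attention_pool(
    [[0.0, 0.0], [2.0, 0.0]], identity, [0.0, 0.0], [1.0, 1.0]
)
print(weights[1] > weights[0])  # True: the larger score gets more weight
```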
It should be noted that the second historical time step and the second quantity in this embodiment may be changed according to actual situations, and this embodiment is not particularly limited.
In the embodiment, the gated recurrent unit and the attention mechanism effectively extract the time dependency features between the station traffic data and congestion and capture the relationships among the features, thereby improving the effectiveness of feature extraction.
In some embodiments, the multi-classification process and the one-hot encoding process are performed on the multi-factor data to obtain the multi-factor characteristics, including:
if the multi-factor data in the multi-source heterogeneous data set are binary categorical variables, the multi-factor data are expressed as binary 0-1 variables through multi-classification processing to obtain binary factor data, and the binary factor data are mapped into multi-factor features by one-hot encoding;
and if the multi-factor data in the multi-source heterogeneous data set are multi-class categorical variables, the multi-factor data are mapped into multi-factor features by one-hot encoding.
In the embodiment, the two-classification processing and the one-hot coding processing can effectively extract multiple factor features and capture the relationship between the features, thereby improving the effectiveness of feature extraction.
In some embodiments, the obtaining a space-time combined feature by splicing the spatial correlation feature, the temporal dependency relationship feature and the multiple factor features, and inputting the space-time combined feature into the multilayer perceptron model to obtain a prediction result of traffic congestion to be predicted includes:
inputting the space-time joint feature into the multilayer perceptron model, and calculating through a hidden layer and an output layer to obtain the traffic congestion prediction result, wherein the hidden layer is calculated as:
[formula image: hidden layer computation]
wherein the symbols in the formula denote, respectively: the spatial correlation feature at time t; the time dependency feature at time t; the multi-factor features at time t; the splicing function; the space-time joint feature; the feature vector output by the hidden layer; a weight matrix; a bias matrix; ReLU, the activation function; and the convolution operation;
inputting the feature vector output by the hidden layer into the output layer, wherein the output layer is calculated as:
[formula image: output layer computation]
wherein the symbols in the formula denote, respectively: the prediction result for the traffic congestion to be predicted at time t+1; a weight matrix; a bias matrix; and an activation function.
In the embodiment, after the spatial correlation characteristic, the time dependency relationship characteristic and the multiple factor characteristics are spliced, the traffic jam condition is comprehensively predicted through the multiple types of characteristics, and the accuracy of traffic jam prediction can be improved.
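The splice-then-predict step can be sketched as a tiny forward pass. The names and shapes below are hypothetical, and the sigmoid on the output layer is an assumption: the text says only that the output layer applies "an activation function".

```python
import math


def relu(vector):
    return [max(0.0, v) for v in vector]


def sigmoid(value):
    return 1.0 / (1.0 + math.exp(-value))


def affine(W, x, b):
    """Compute W x + b for a matrix given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def predict_congestion(spatial, temporal, factors, W_h, b_h, w_o, b_o):
    """Splice the three feature vectors and run hidden + output layers."""
    joint = spatial + temporal + factors  # list concatenation = splicing
    hidden = relu(affine(W_h, joint, b_h))
    logit = sum(w * h for w, h in zip(w_o, hidden)) + b_o
    return sigmoid(logit)  # assumed output activation


# Toy weights: one hidden unit that sums its four inputs.
p = predict_congestion([1.0], [2.0], [0.0, 1.0],
                       W_h=[[1.0, 1.0, 1.0, 1.0]], b_h=[0.0],
                       w_o=[1.0], b_o=-4.0)
print(p)  # 0.5
```

In the toy call the hidden unit outputs 4.0 and the output bias is -4.0, so the logit is 0 and the predicted probability is 0.5.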
To facilitate understanding by those skilled in the art, the following sets of preferred embodiments are provided:
1. Processing the multi-source heterogeneous data set.
According to the characteristics of multi-source heterogeneous data, the data are divided into two types, categorical data and continuous data, and a different processing method is adopted for each type of data in the multi-source heterogeneous data set, for example:
In the embodiment, the congestion event data returned by the high-grade map (vehicle running speed lower than 30 km/h) are combined with stake numbers, so that each congestion event is accurately located on the expressway. If road section n is congested in the time period [t, t + t1] (t1 is 0.5 h, i.e. 30 min), the congestion event data are recorded as 1; if road section n has no congestion event information in that period, the congestion event data are recorded as 0. After this processing, the congestion events become congestion event data in the form of a 0-1 categorical variable.
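The 0-1 labelling rule above can be sketched as follows. The record layout (timestamp, speed) and the function name are hypothetical; the 30 km/h threshold and the 30-minute window come from the text. The sketch assigns a record that falls exactly on a window boundary to the later window, which is an assumption.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=30)  # t1 = 0.5 h in the text
SPEED_LIMIT_KMH = 30.0          # below this speed a record counts as congestion


def label_windows(records, start, n_windows):
    """Label each 30-minute window of one road section as 1 or 0.

    records: list of (timestamp, speed_kmh) tuples for the road section.
    Returns a list of n_windows labels, 1 if any record in the window
    reports a speed below the threshold, else 0.
    """
    labels = [0] * n_windows
    for ts, speed in records:
        idx = (ts - start) // WINDOW  # floor division gives the window index
        if 0 <= idx < n_windows and speed < SPEED_LIMIT_KMH:
            labels[idx] = 1
    return labels


start = datetime(2019, 1, 1)
records = [
    (datetime(2019, 1, 1, 0, 10), 25.0),  # congested, window 0
    (datetime(2019, 1, 1, 0, 40), 80.0),  # free flow, window 1
    (datetime(2019, 1, 1, 1, 5), 29.9),   # congested, window 2
]
print(label_windows(records, start, 3))  # [1, 0, 1]
```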
The data set of the embodiment comprises the entry and exit records of the toll stations of a provincial highway network system. After cleaning and preprocessing, the data set is aggregated to obtain the inbound and outbound traffic of each toll station at every time. In the embodiment, the traffic of station m at time t is represented as [formula image], where M represents the total number of stations, which yields inbound and outbound station traffic data for 268 toll stations. The statistical interval of the station traffic is 0.5 hour; the traffic data of each station cover the inbound and outbound traffic of that toll station from January 1, 2019 to December 31, 2019, and the unit of traffic is vehicles per hour. The embodiment processes the station traffic data with a normalization layer. The purpose of the normalization layer is that, when the optimization problem is solved by gradient descent, normalization speeds up the gradient descent solution, that is, it speeds up the convergence of the model. After the station traffic is normalized, each sample set contains 17520 pieces of data and is divided into a training set, a verification set and a test set of the model according to the following ratio of 6. The traffic data are a continuous variable, namely the number of vehicles in a given time step, and become floating-point numbers between 0 and 1 after normalization.
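The figure of 17520 samples per station is consistent with a half-hour statistical interval over the whole of 2019, as the short check below shows. The helper name `count_slots` is hypothetical.

```python
from datetime import datetime, timedelta


def count_slots(start, end, step):
    """Number of whole `step`-long intervals between start and end."""
    return (end - start) // step


slots = count_slots(datetime(2019, 1, 1), datetime(2020, 1, 1),
                    timedelta(minutes=30))
print(slots)  # 17520 = 365 days x 48 half-hour intervals
```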
2. Spatial correlation feature extraction module.
The factors that influence congestion on the highway network are complex: a congestion event on the highway network is affected not only by upstream road sections but also by downstream road sections. The structure of the highway network is intricate, and its spatial influence is difficult to extract. The embodiment therefore captures the spatial correlation features of highway network congestion through a convolutional neural network. The specific steps are as follows:
The first historical time step adopted by the embodiment is T. The historical traffic event data of the N adjacent road sections nearest in geographic position are constructed as C [formula image], each element of which represents the traffic event condition of adjacent road section i at the j-th historical time step. The historical congestion event data of the N adjacent road sections nearest in geographic position are constructed as D [formula image], each element of which represents the congestion condition of adjacent road section i at the j-th historical time step. Splicing the historical traffic event data C and the historical congestion event data D gives the sequence E [formula image] of traffic events and congestion events for the adjacent road sections of the highway network, and E is input into the convolutional neural network. The traffic event data cover all traffic event information for every urban area and every expressway in the province in 2019, including event type (such as heavy traffic flow, traffic control, sudden traffic accidents and road construction), duration, starting expressway stake number, ending expressway stake number and time of occurrence. In this embodiment, the first historical time step T is 6, and the number N of nearest road sections is 8.
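The construction of the network input can be sketched with nested lists. T = 6 and N = 8 come from the text; stacking the congestion rows under the traffic event rows, which gives a 16 x 6 array, is an assumption, because the text does not state the axis along which the two histories are spliced. The function name is hypothetical.

```python
T = 6  # first historical time step (from the text)
N = 8  # number of nearest adjacent road sections (from the text)


def build_input(traffic_events, congestion_events):
    """Splice two N x T histories into one (2N) x T input (assumed layout)."""
    if len(traffic_events) != len(congestion_events):
        raise ValueError("both histories must cover the same road sections")
    for row in traffic_events + congestion_events:
        if len(row) != T:
            raise ValueError("every road section needs T historical steps")
    return [list(r) for r in traffic_events] + [list(r) for r in congestion_events]


C = [[0] * T for _ in range(N)]  # traffic event history, all zeros here
D = [[0] * T for _ in range(N)]  # congestion history
D[2][5] = 1                      # section 2 congested at the latest step
E = build_input(C, D)
print(len(E), len(E[0]))  # 16 6
```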
In this embodiment, a convolutional neural network framework composed of 2 convolutional layers and 2 pooling layers is constructed, according to the characteristics of a highway network, the 2 convolutional layers are all designed as two-dimensional convolutions, the pooling mode is selected as maximum pooling, and the ReLU activation function is selected as the activation function of the convolutional layers. The processing of the convolutional and pooling layers can be expressed as follows:
[formula image: convolutional-layer and pooling-layer computation]
After processing by the convolutional layers and the pooling layers, the historical traffic events and congestion events are mapped into the hidden-layer feature space, and the pooling-layer output is then input to the fully connected layer, which uses the ReLU activation function, to obtain the spatial correlation feature. The spatial correlation feature output by the convolutional neural network at time t can be expressed as:
[formula image: fully connected layer computation]
Through the convolutional neural network, both the spatial correlation between congestion events on adjacent road sections and the traffic congestion to be predicted and the spatial correlation between traffic events on adjacent road sections and the traffic congestion to be predicted can be captured, which yields the spatial correlation features among the congestion event data, the traffic event data and the traffic congestion to be predicted.
3. Time dependency feature extraction module.
Highway station traffic is strongly periodic and time dependent, and the traffic of neighboring highway stations has a strong nonlinear temporal correlation with congestion on the highway road section. The embodiment therefore uses the gated recurrent unit and the attention mechanism to capture the temporal periodicity of the station traffic data and the nonlinear temporal correlation between congestion and the station traffic data. The specific steps are as follows:
In this embodiment, the second historical time step is T. The inbound traffic data of the M nearest, top-ranked highway stations are constructed as [formula image], where each element is the historical inbound traffic data of length T for station m. The outbound traffic data of the M nearest, top-ranked highway stations are constructed as [formula image], where each element is the historical outbound traffic data of length T for station m. Splicing the historical inbound station traffic data and the historical outbound station traffic data gives the spliced station traffic data [formula image], which are input into the gated recurrent unit so that the features to be extracted are fully learned and the time dependency is captured. In this embodiment, the second historical time step T is 6, and the number M of nearest highway stations is 10. The output of the gated recurrent unit is a first vector, and the first vector output at step t can be expressed as:
[formula image: gated recurrent unit computation]
The first vector produced by the gated recurrent unit is then input into the attention mechanism for aggregation: the weights corresponding to the different feature vectors are computed through weight distribution, and a better parameter matrix is obtained through continued iteration. The attention mechanism can be expressed as:
[formula image: attention mechanism computation]
The output of the attention mechanism is passed through a fully connected layer whose activation function is ReLU, giving the time dependency feature output at time t:
[formula image: fully connected layer computation]
4. Multi-factor feature extraction module.
The embodiment designs an embedding layer and a fully connected layer to extract the multi-factor data that affect highway network congestion, such as time, holidays and weather. For example:
For the traffic of each toll station at each time, categorical features with time characteristics are extracted, such as the hour of the day, the day of the week, whether the day is a weekend, whether the day is a holiday, whether the day is the day before a holiday and whether the day is the day after a holiday. The time features of the corresponding time are converted by one-hot encoding into an embedding vector Other, which includes the hour of the day (24 features), the day of the week (7 features), whether the day is a holiday (2 features), whether the day is the day before a holiday (2 features), whether the day is the day after a holiday (2 features), historical traffic event data of the current road section (2 features), and the like. In this embodiment, data that are binary categorical variables (for example, whether the day is a holiday) are expressed as binary 0-1 variables through multi-classification processing to obtain binary factor data, which are then mapped into multi-factor features by one-hot encoding. Data that are multi-class categorical variables are mapped by one-hot encoding into several 0-1 binary features (i.e., multi-factor features), which ensures that the distances between different categories are equal and makes it easier to extract the relationships between features. The embodiment sets the historical time step to 6 (the past 3 hours); that is, the road congestion condition of a single future time step is predicted from the 6 historical time steps (3 hours).
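The feature counts listed above (24 + 7 + 2 + 2 + 2 + 2) add up to a 39-dimensional vector, which the following sketch reproduces. The function and argument names are hypothetical, and only the six factors for which the text gives a feature count are encoded.

```python
from datetime import datetime


def one_hot(index, size):
    """Return a 0-1 list of length `size` with a 1 at `index`."""
    vec = [0] * size
    vec[index] = 1
    return vec


def encode_time_factors(ts, is_holiday, before_holiday, after_holiday, has_event):
    """Encode the time-related factors of one time step as 0-1 features."""
    features = []
    features += one_hot(ts.hour, 24)      # hour of the day
    features += one_hot(ts.weekday(), 7)  # day of the week, Monday = 0
    for flag in (is_holiday, before_holiday, after_holiday, has_event):
        features += one_hot(int(flag), 2)  # each binary factor takes 2 slots
    return features


vec = encode_time_factors(datetime(2019, 10, 1, 8, 30),
                          is_holiday=True, before_holiday=False,
                          after_holiday=False, has_event=False)
print(len(vec))  # 39
```

Encoding a binary factor in two slots mirrors the "(2 features)" counts in the text; a single 0-1 value would carry the same information.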
The processing for the weather factors is: firstly, carrying out one-hot coding processing, then inputting the one-hot coding and other time factors into the embedding layer for embedding operation, and then inputting the output of the embedding layer into the full-connection layer to obtain the characteristics of multiple factors. The weather data in this embodiment includes weather data of 2019, which is refined to each urban area and each half hour, and includes data of weather conditions (such as cloudy, sunny, cloudy, rainy, and snowy), temperature, wind power, wind direction, and the like.
5. Predicting the traffic congestion condition.
The space-time joint feature is input into the multilayer perceptron model, and the prediction result for the traffic congestion to be predicted is computed through a hidden layer and an output layer, where the hidden layer is computed as:
[formula image: hidden layer computation]
wherein [symbol image] represents the multi-factor features at time t.
The feature vector output by the hidden layer is input into the output layer, which is computed as:
[formula image: output layer computation]
For better illustration, the following experiments were performed in this embodiment:
1. Performance index
The embodiment uses the F1 score (F1), the precision on positive samples (P) and the recall on positive samples (R) as the evaluation indexes of prediction model performance; these three indexes are widely used in classification settings where the sample distribution is extremely unbalanced. TP (True Positive), FP (False Positive) and FN (False Negative) denote the numbers of true positives, false positives and false negatives, respectively. The performance indexes are calculated as follows:
$$P = \frac{TP}{TP + FP}, \qquad R = \frac{TP}{TP + FN}, \qquad F1 = \frac{2 \cdot P \cdot R}{P + R}$$
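The three indexes can be computed from 0-1 labels with the standard definitions of precision, recall and F1. The sketch below is illustrative; the label lists are made-up toy data, not experimental results.

```python
def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall and F1 for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy labels: 3 congested windows, 2 of them predicted, 1 false alarm.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```

With TP = 2, FP = 1 and FN = 1, all three indexes equal 2/3 in this toy case.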
2. Comparative experiments.
To evaluate the performance of the model, the processed multi-source heterogeneous data set is divided into a training set, a verification set and a test set in a certain proportion; the training set is used for model training, the verification set and the test set are used for model evaluation, and the effectiveness of the model is assessed by comparison with other baseline models. In the embodiment, the overall performance of the technical scheme of the embodiment is compared with that of baseline prediction methods including a decision tree, an extreme gradient boosting tree, a random forest, the K-nearest neighbor algorithm and a long short-term memory network.
Comparative evaluation shows that the technical scheme of the embodiment has the best prediction effect. The results are shown in Table 1, from which three conclusions can be drawn:
(1) The technical scheme of the embodiment is clearly superior to the other methods on all indexes, namely the F1 score, precision and recall, which is of considerable guiding value for citizens' travel;
(2) Because a deep learning network can fully learn the nonlinear relationships among features, the model has a greater advantage when modeling data related to the highway network;
(3) The baseline models cannot learn the spatial correlation of the highway network, whereas the technical scheme of the embodiment learns it through the convolutional neural network, and therefore its prediction performance is better.
TABLE 1
Figure 386584DEST_PATH_IMAGE061
3. Ablation experiment
To evaluate the effectiveness of the individual components of the model, the embodiment is validated by ablation experiments: the performance of the model is re-evaluated after removing a given design component, which reflects how much that component contributes.
The results of the ablation experiments are shown in Table 2, which reports three indexes: F1 score, precision and recall. The results show that deleting any component degrades the performance of the technical scheme of the embodiment. When the spatial correlation feature extraction module is deleted, the F1 score, precision and recall fall from 0.683, 0.685 and 0.683 to 0.488, 0.502 and 0.475, respectively; when the time dependency feature extraction module is deleted, they fall from 0.683, 0.685 and 0.683 to 0.605, 0.623 and 0.589; when the multi-factor feature extraction module is deleted, they fall from 0.683, 0.685 and 0.683 to 0.637, 0.633 and 0.642.
This illustrates that: 1) congestion events on the highway network are clearly influenced by the road network structure and by the traffic conditions of adjacent road sections, so such events should be taken seriously and handled promptly to avoid causing congestion on adjacent road sections; 2) the influence of the spatial correlation of highway network congestion is greater than the influence of station traffic; 3) because the spatial correlation of the highway network and the highway station traffic influence congestion more directly than time factors (such as the day of the week or whether today is a holiday), precision and recall fall further when either of the first two components is ablated.
TABLE 2
Figure 560077DEST_PATH_IMAGE062
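The size of each drop can be tabulated directly from the figures quoted in the text for the ablation experiments. The sketch below uses only those reported values; the dictionary keys and function name are hypothetical.

```python
# Scores of the full model and of each ablated variant, as quoted in the text.
FULL = {"f1": 0.683, "precision": 0.685, "recall": 0.683}
ABLATED = {
    "spatial correlation": {"f1": 0.488, "precision": 0.502, "recall": 0.475},
    "time dependency": {"f1": 0.605, "precision": 0.623, "recall": 0.589},
    "multi-factor": {"f1": 0.637, "precision": 0.633, "recall": 0.642},
}


def metric_drops(full, ablated):
    """Drop in each index when a module is removed, rounded to 3 decimals."""
    return {
        name: {k: round(full[k] - v, 3) for k, v in scores.items()}
        for name, scores in ablated.items()
    }


drops = metric_drops(FULL, ABLATED)
# Rank the modules by how much the F1 score falls when they are removed.
ranked = sorted(drops, key=lambda name: drops[name]["f1"], reverse=True)
print(ranked)  # ['spatial correlation', 'time dependency', 'multi-factor']
```

The F1 score falls by about 0.195, 0.078 and 0.046 respectively, which matches the ordering of importance stated in the conclusions.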
Referring to fig. 2, an embodiment of the present invention further provides a traffic congestion prediction system, which includes a data obtaining unit 100, a first feature extracting unit 200, a second feature extracting unit 300, a third feature extracting unit 400, and a prediction result obtaining unit 500, where:
the data acquisition unit 100 is configured to acquire a multi-source heterogeneous data set, and acquire traffic event data, congestion event data, and station traffic data from the multi-source heterogeneous data set;
a first feature extraction unit 200, configured to extract spatial correlation features between congestion event data, traffic event data, and traffic congestion to be predicted by using a convolutional neural network;
a second feature extraction unit 300, configured to extract, through a feature extraction network, a time dependency feature between the station traffic data and congestion; the feature extraction network is formed by fusing a gated recurrent unit and an attention mechanism;
a third feature extraction unit 400, configured to obtain multi-factor data, excluding the traffic event data, congestion event data and station traffic data, from the multi-source heterogeneous data set, and to perform multi-classification processing and one-hot encoding on the multi-factor data to obtain multi-factor features;
the prediction result obtaining unit 500 is configured to splice the spatial correlation feature, the temporal dependency relationship feature, and the multiple factor features to obtain a spatiotemporal union feature, and input the spatiotemporal union feature into the multilayer perceptron model to obtain a prediction result of the traffic congestion to be predicted.
It should be noted that, since the traffic congestion prediction system of the present embodiment and the traffic congestion prediction method described above are based on the same inventive concept, the corresponding contents of the method embodiments also apply to the system embodiment and are not described in detail here.
An embodiment of the present invention further provides a traffic congestion prediction apparatus, including: at least one control processor and a memory for communicative connection with the at least one control processor.
The memory, which is a non-transitory computer readable storage medium, may be used to store non-transitory software programs as well as non-transitory computer executable programs. Further, the memory may include high speed random access memory, and may also include non-transitory memory, such as at least one disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory optionally includes memory located remotely from the processor, and these remote memories may be connected to the processor through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
Non-transitory software programs and instructions required to implement a traffic congestion prediction method of the above-described embodiments are stored in a memory, and when executed by a processor, perform a traffic congestion prediction method of the above-described embodiments, for example, performing the method steps S100 to S500 in fig. 1 described above.
The above described system embodiments are merely illustrative, wherein the units illustrated as separate components may or may not be physically separate, i.e. may be located in one place, or may also be distributed over a plurality of network elements. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
Embodiments of the present invention also provide a computer-readable storage medium storing computer-executable instructions, which are executed by one or more control processors, and may cause the one or more control processors to execute a traffic congestion prediction method in the above method embodiments, for example, to execute the functions of the above method steps S100 to S500 in fig. 1.
One of ordinary skill in the art will appreciate that all or some of the steps, systems, and methods disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application specific integrated circuit. Such software may be distributed on computer readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). The term computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data, as is well known to those of ordinary skill in the art. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. In addition, communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media as is well known to those skilled in the art.
The embodiments of the present invention have been described in detail with reference to the accompanying drawings, but the present invention is not limited to the above embodiments, and various changes can be made within the knowledge of those skilled in the art without departing from the gist of the present invention.

Claims (10)

1. A traffic congestion prediction method, characterized by comprising:
acquiring a multi-source heterogeneous data set, and obtaining traffic event data, congestion event data and station flow data from the multi-source heterogeneous data set;
extracting, by a convolutional neural network, spatial correlation features among the congestion event data, the traffic event data and the traffic congestion to be predicted;
extracting, through a feature extraction network, time dependency features between the station flow data and the congestion, wherein the feature extraction network is formed by fusing a gated recurrent unit (GRU) and an attention mechanism;
acquiring, from the multi-source heterogeneous data set, multi-factor data other than the traffic event data, the congestion event data and the station flow data, and performing multi-classification processing and one-hot encoding on the multi-factor data to obtain multi-factor features, wherein the multi-factor data includes time, holidays and weather;
and concatenating the spatial correlation features, the time dependency features and the multi-factor features to obtain a spatiotemporal joint feature, and inputting the spatiotemporal joint feature into a multilayer perceptron model to obtain a prediction result of the traffic congestion to be predicted.
2. The traffic congestion prediction method according to claim 1, wherein the obtaining traffic event data, congestion event data and station flow data from the multi-source heterogeneous data set comprises:
performing one-hot encoding on the traffic events and the congestion events in the multi-source heterogeneous data set to obtain the traffic event data and the congestion event data, and normalizing the station flow in the multi-source heterogeneous data set to obtain the station flow data.
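As an illustrative, non-limiting sketch of the preprocessing recited in claim 2, the following Python code one-hot encodes event labels and rescales station flow values. The event categories, the sample values, the helper names `one_hot` and `min_max_normalize`, and the choice of min-max scaling are assumptions made for illustration; the claim itself only requires one-hot encoding and normalization.

```python
def one_hot(labels, categories):
    """Map each label to a 0/1 vector with a single 1 at the label's index."""
    index = {c: i for i, c in enumerate(categories)}
    return [[1 if index[label] == i else 0 for i in range(len(categories))]
            for label in labels]


def min_max_normalize(values):
    """Rescale values linearly to [0, 1]; a constant series maps to zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical event categories and station flow counts (not from the patent).
EVENT_TYPES = ["none", "accident", "roadwork"]
events = one_hot(["accident", "none", "roadwork"], EVENT_TYPES)
flows = min_max_normalize([120, 300, 210])
```

Min-max scaling is used here only because it keeps the flow values on the same 0 to 1 scale as the one-hot vectors; other normalization schemes would fit the claim language equally well.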
3. The traffic congestion prediction method according to claim 1, wherein the extracting, by a convolutional neural network, spatial correlation features among the congestion event data, the traffic event data and the traffic congestion to be predicted comprises:
presetting a first historical time step, and acquiring, within the preset first historical time step, historical data of a first quantity of geographically adjacent congestion event data and historical data of the traffic event data;
concatenating the historical data of the traffic event data and the historical data of the congestion event data to obtain a concatenated data sequence;
and inputting the concatenated data sequence into the convolutional neural network to obtain the spatial correlation features among the congestion event data, the traffic event data and the traffic congestion to be predicted.
4. The traffic congestion prediction method according to claim 3, wherein the inputting the concatenated data sequence into the convolutional neural network to obtain the spatial correlation features among the congestion event data, the traffic event data and the traffic congestion to be predicted comprises:
inputting the concatenated data sequence into the convolutional neural network, and processing the concatenated data sequence through convolutional layers and pooling layers:
[Figure QLYQS_1]
wherein [Figure QLYQS_3] and [Figure QLYQS_6] represent the outputs of the convolutional layers, E represents the concatenated data sequence, [Figure QLYQS_11] and [Figure QLYQS_2] represent weight matrices, [Figure QLYQS_8], [Figure QLYQS_9], [Figure QLYQS_13] and [Figure QLYQS_4] represent bias matrices, ReLU represents the activation function, [Figure QLYQS_7] represents the maximum function, [Figure QLYQS_10] and [Figure QLYQS_12] represent the outputs of the pooling layers, and [Figure QLYQS_5] represents the convolution operation;
after the convolutional layers and the pooling layers have processed the concatenated data sequence, inputting [Figure QLYQS_14] into a fully connected layer to obtain the spatial correlation feature, which is expressed as:
[Figure QLYQS_15]
wherein [Figure QLYQS_16] represents the spatial correlation feature among the congestion event data, the traffic event data and the traffic congestion to be predicted at time t, [Figure QLYQS_17] represents a weight matrix, and [Figure QLYQS_18] represents a bias matrix.
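The processing order recited in claim 4 (convolution with ReLU, max pooling, a second convolution and pooling, then a fully connected layer) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented network: the formulas of the claim are present only as images, so the one-dimensional convolution over time, the kernel sizes, the weights, the sample sequence `E`, and the helper names are all hypothetical, and the pooling step is a plain maximum without the additional bias terms the claim mentions.

```python
def conv1d_relu(seq, kernels, bias):
    """Valid 1-D convolution over time followed by ReLU.
    seq: T x C_in rows; kernels: C_out x K x C_in; bias: C_out values."""
    k = len(kernels[0])
    out = []
    for t in range(len(seq) - k + 1):
        row = []
        for kern, b in zip(kernels, bias):
            s = b + sum(kern[i][c] * seq[t + i][c]
                        for i in range(k) for c in range(len(seq[0])))
            row.append(max(0.0, s))
        out.append(row)
    return out


def max_pool1d(seq, size):
    """Non-overlapping max pooling over the time axis."""
    return [[max(seq[t + i][c] for i in range(size)) for c in range(len(seq[0]))]
            for t in range(0, len(seq) - size + 1, size)]


def dense(vec, weights, bias):
    """Fully connected layer without activation: W v + b."""
    return [b + sum(w * v for w, v in zip(row, vec))
            for row, b in zip(weights, bias)]


# Hypothetical concatenated sequence E: 7 time steps x 2 channels
# (traffic event flag, congestion event flag).
E = [[1, 0], [0, 1], [1, 1], [0, 0], [1, 0], [0, 1], [1, 1]]

conv1 = conv1d_relu(E, [[[1.0, 0.5], [-1.0, 0.5]]], [0.0])
pooled1 = max_pool1d(conv1, 2)
conv2 = conv1d_relu(pooled1, [[[1.0], [1.0]]], [-1.0])
pooled2 = max_pool1d(conv2, 2)

flat = [v for row in pooled2 for v in row]
spatial_feature = dense(flat, [[0.5], [-1.0]], [0.0, 0.0])
```

In a trained model the kernels and the fully connected weights would be learned; fixed values are used here so that each intermediate result can be checked by hand.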
5. The traffic congestion prediction method according to claim 1, wherein the extracting, through a feature extraction network, time dependency features between the station flow data and the congestion comprises:
presetting a second historical time step, and acquiring, within the second historical time step, inbound station flow data and outbound station flow data of a second quantity of stations ranked highest by geographical position;
concatenating the inbound station flow data and the outbound station flow data to obtain concatenated station flow data;
inputting the concatenated station flow data into the gated recurrent unit (GRU) to obtain a first vector, wherein the output [Figure QLYQS_19] of the gated recurrent unit at step t is expressed as:
[Figure QLYQS_20]
wherein [Figure QLYQS_21] represents the concatenated station flow data of step t-1, [Figure QLYQS_22] represents the concatenated station flow data of step t, and GRU represents the gated recurrent unit;
inputting the first vector into the attention mechanism to obtain a second vector, wherein the attention mechanism is calculated as:
[Figure QLYQS_23]
wherein [Figure QLYQS_25] represents the attention distribution value determined by the output vector [Figure QLYQS_27] of the gated recurrent unit at time t, [Figure QLYQS_30] and [Figure QLYQS_24] represent weight coefficients, [Figure QLYQS_28] represents a bias coefficient, [Figure QLYQS_29] represents the attention distribution value determined by the output vector [Figure QLYQS_31] of the gated recurrent unit at time j, [Figure QLYQS_26] represents the attention weight, and i represents the total time;
processing the vector output by the attention mechanism through a fully connected layer to obtain the time dependency feature, wherein the fully connected layer is calculated as:
[Figure QLYQS_32]
wherein [Figure QLYQS_33] represents the time dependency feature at time t, [Figure QLYQS_34] represents a weight matrix, [Figure QLYQS_35] represents a bias vector, and ReLU represents the activation function.
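The following sketch illustrates the sequence of operations in claim 5: concatenating inbound and outbound flows, running a recurrent unit over the time steps, weighting the outputs with attention, and applying a fully connected layer with ReLU. Because the claim's formulas are available only as images, the standard GRU gate equations, the additive attention score `u . tanh(W h + b)` with softmax weights, the layer widths, the random untrained weights, and the sample flows are all assumptions for illustration rather than the claimed formulas.

```python
import math
import random


def matvec(m, v):
    """Matrix-vector product for lists of lists."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gru_step(p, h, x):
    """One standard GRU update: returns h_t from h_{t-1} (h) and input x_t (x)."""
    z = [sigmoid(a + b + c) for a, b, c in
         zip(matvec(p["Wz"], x), matvec(p["Uz"], h), p["bz"])]   # update gate
    r = [sigmoid(a + b + c) for a, b, c in
         zip(matvec(p["Wr"], x), matvec(p["Ur"], h), p["br"])]   # reset gate
    rh = [ri * hi for ri, hi in zip(r, h)]
    n = [math.tanh(a + b + c) for a, b, c in
         zip(matvec(p["Wn"], x), matvec(p["Un"], rh), p["bn"])]  # candidate state
    return [(1.0 - zi) * hi + zi * ni for zi, hi, ni in zip(z, h, n)]


def attention(hs, w, b, u):
    """Score each h_t as u . tanh(W h_t + b), softmax over time, weighted sum."""
    scores = [sum(ui * math.tanh(s + bi) for ui, s, bi in zip(u, matvec(w, h), b))
              for h in hs]
    top = max(scores)                        # subtract max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]
    context = [sum(a * h[k] for a, h in zip(alphas, hs))
               for k in range(len(hs[0]))]
    return alphas, context


rng = random.Random(0)                       # fixed seed; weights are untrained


def rand(rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


D, H = 4, 3                                  # input and hidden widths (assumed)
params = {k: rand(H, D) for k in ("Wz", "Wr", "Wn")}
params.update({k: rand(H, H) for k in ("Uz", "Ur", "Un")})
params.update({k: [0.0] * H for k in ("bz", "br", "bn")})

# Hypothetical normalized inbound/outbound flows for 2 stations over 4 steps.
inbound = [[0.1, 0.4], [0.3, 0.5], [0.8, 0.6], [0.9, 0.7]]
outbound = [[0.2, 0.1], [0.2, 0.3], [0.4, 0.5], [0.7, 0.9]]
inputs = [i + o for i, o in zip(inbound, outbound)]   # concatenate per step

h, hidden_states = [0.0] * H, []
for x in inputs:
    h = gru_step(params, h, x)
    hidden_states.append(h)

alphas, context = attention(hidden_states, rand(H, H), [0.0] * H, [1.0] * H)
w_fc, b_fc = rand(2, H), [0.0, 0.0]
temporal_feature = [max(0.0, v + b) for v, b in zip(matvec(w_fc, context), b_fc)]
```

The attention weights `alphas` sum to one over the time steps, so `context` is a weighted average of the recurrent outputs in which the more relevant time steps contribute more.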
6. The traffic congestion prediction method according to claim 1, wherein the performing multi-classification processing and one-hot encoding on the multi-factor data to obtain multi-factor features comprises:
if the multi-factor data in the multi-source heterogeneous data set are categorical variables, representing the multi-factor data as 0-1 categorical variables through the multi-classification processing to obtain binary factor data, and mapping the binary factor data into the multi-factor features by one-hot encoding;
and if the multi-factor data in the multi-source heterogeneous data set are multi-class variables, mapping the multi-factor data into the multi-factor features by one-hot encoding.
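By way of a hedged example of the two branches in claim 6, the sketch below first reduces a two-valued factor (holiday or not) to a 0-1 variable and then one-hot encodes it, while multi-class factors (hour of day, weather) are one-hot encoded directly. The weather categories, the 24-hour time encoding, and the function names are assumptions for illustration only.

```python
WEATHER_TYPES = ["sunny", "rainy", "snowy"]   # hypothetical category list


def one_hot(index, size):
    """Return a 0/1 vector of the given size with a single 1 at `index`."""
    return [1 if i == index else 0 for i in range(size)]


def encode_factors(hour, is_holiday, weather):
    """Encode hour of day, a holiday flag and a weather label as one vector."""
    holiday_bit = 1 if is_holiday else 0                # binary factor -> 0/1
    return (one_hot(hour, 24)                           # multi-class: 24 hours
            + one_hot(holiday_bit, 2)                   # 0/1 value, then one-hot
            + one_hot(WEATHER_TYPES.index(weather), len(WEATHER_TYPES)))


factors = encode_factors(hour=8, is_holiday=True, weather="rainy")
```

The resulting vector has 24 + 2 + 3 = 29 entries, exactly three of which are 1, one per factor.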
7. The traffic congestion prediction method according to claim 1, wherein the concatenating the spatial correlation features, the time dependency features and the multi-factor features to obtain a spatiotemporal joint feature, and inputting the spatiotemporal joint feature into a multilayer perceptron model to obtain a prediction result of the traffic congestion to be predicted comprises:
inputting the spatiotemporal joint feature into the multilayer perceptron model, and computing through a hidden layer and an output layer to obtain the traffic congestion prediction result, wherein the computation of the hidden layer comprises:
[Figure QLYQS_36]
wherein [Figure QLYQS_38] represents the spatial correlation feature at time t, [Figure QLYQS_42] represents the time dependency feature at time t, [Figure QLYQS_44] represents the multi-factor features at time t, [Figure QLYQS_39] represents the concatenation function, [Figure QLYQS_40] represents the spatiotemporal joint feature, [Figure QLYQS_43] represents the feature vector output by the hidden layer, [Figure QLYQS_45] represents a weight matrix, [Figure QLYQS_37] represents a bias matrix, ReLU represents the activation function, and [Figure QLYQS_41] represents the convolution operation;
inputting the feature vector output by the hidden layer into the output layer, wherein the computation of the output layer comprises:
[Figure QLYQS_46]
wherein [Figure QLYQS_47] represents the prediction result of the traffic congestion to be predicted at time t+1, [Figure QLYQS_48] represents a weight matrix, [Figure QLYQS_49] represents a bias matrix, and [Figure QLYQS_50] represents the activation function.
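The fusion step of claim 7 can be illustrated with the short sketch below, which concatenates the three feature vectors, applies a ReLU hidden layer, and maps the result to a single score. The claim names only "an activation function" for the output layer, so the sigmoid used here, together with the layer sizes, the fixed weights, the sample feature values, and the names `dense` and `predict_congestion`, are assumptions for illustration.

```python
import math


def dense(vec, weights, bias):
    """Fully connected layer without activation: W v + b."""
    return [b + sum(w * v for w, v in zip(row, vec))
            for row, b in zip(weights, bias)]


def predict_congestion(spatial, temporal, factors, p):
    """Concatenate the three feature vectors and run a two-layer perceptron."""
    joint = spatial + temporal + factors                 # spatiotemporal joint feature
    hidden = [max(0.0, v) for v in dense(joint, p["W1"], p["b1"])]   # ReLU
    logit = dense(hidden, p["W2"], p["b2"])[0]
    return 1.0 / (1.0 + math.exp(-logit))                # assumed sigmoid output


# Hypothetical fixed weights; a real model would learn them from data.
params = {
    "W1": [[1.0, 0.0, 0.0, 0.0], [0.0, -1.0, 0.0, 1.0]],
    "b1": [0.0, 0.0],
    "W2": [[1.0, 1.0]],
    "b2": [-1.0],
}
score = predict_congestion([1.0], [2.0], [0, 1], params)
```

With these weights the hidden layer yields [1.0, 0.0], the output logit is 0.0, and the score is 0.5, which under the sigmoid assumption would be read as an undecided congestion probability for time t+1.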
8. A traffic congestion prediction system, characterized by comprising:
a data acquisition unit, configured to acquire a multi-source heterogeneous data set and obtain traffic event data, congestion event data and station flow data from the multi-source heterogeneous data set;
a first feature extraction unit, configured to extract, by a convolutional neural network, spatial correlation features among the congestion event data, the traffic event data and the traffic congestion to be predicted;
a second feature extraction unit, configured to extract, through a feature extraction network, time dependency features between the station flow data and the congestion, wherein the feature extraction network is formed by fusing a gated recurrent unit (GRU) and an attention mechanism;
a third feature extraction unit, configured to acquire, from the multi-source heterogeneous data set, multi-factor data other than the traffic event data, the congestion event data and the station flow data, and to perform multi-classification processing and one-hot encoding on the multi-factor data to obtain multi-factor features, wherein the multi-factor data includes time, holidays and weather;
and a prediction result acquisition unit, configured to concatenate the spatial correlation features, the time dependency features and the multi-factor features to obtain a spatiotemporal joint feature, and to input the spatiotemporal joint feature into a multilayer perceptron model to obtain a prediction result of the traffic congestion to be predicted.
9. A traffic congestion prediction apparatus, comprising at least one control processor and a memory communicatively connected to the at least one control processor; wherein the memory stores instructions executable by the at least one control processor to enable the at least one control processor to perform the traffic congestion prediction method according to any one of claims 1 to 7.
10. A computer-readable storage medium having stored thereon computer-executable instructions for causing a computer to perform the traffic congestion prediction method according to any one of claims 1 to 7.
CN202211612325.6A 2022-12-15 2022-12-15 Traffic jam prediction method, system, equipment and storage medium Active CN115620524B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211612325.6A CN115620524B (en) 2022-12-15 2022-12-15 Traffic jam prediction method, system, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN115620524A CN115620524A (en) 2023-01-17
CN115620524B CN115620524B (en) 2023-03-28

Family

ID=84879919

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211612325.6A Active CN115620524B (en) 2022-12-15 2022-12-15 Traffic jam prediction method, system, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115620524B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant