CN113744526B - Highway risk prediction method based on LSTM and BF - Google Patents


Info

Publication number
CN113744526B
Authority
CN
China
Prior art keywords
time
risk
lstm
accident
traffic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110979482.XA
Other languages
Chinese (zh)
Other versions
CN113744526A (en
Inventor
熊晓夏
刘擎超
沈钰杰
蔡英凤
陈龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guizhou Qiantong Zhilian Technology Co ltd
Original Assignee
Guizhou Qiantong Zhilian Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guizhou Qiantong Zhilian Technology Co ltd filed Critical Guizhou Qiantong Zhilian Technology Co ltd
Priority to CN202110979482.XA priority Critical patent/CN113744526B/en
Publication of CN113744526A publication Critical patent/CN113744526A/en
Application granted granted Critical
Publication of CN113744526B publication Critical patent/CN113744526B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G08SIGNALLING
    • G08GTRAFFIC CONTROL SYSTEMS
    • G08G1/00Traffic control systems for road vehicles
    • G08G1/01Detecting movement of traffic to be counted or controlled
    • G08G1/0104Measuring and analyzing of parameters relative to traffic conditions
    • G08G1/0125Traffic data processing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • G06F18/232Non-hierarchical techniques
    • G06F18/2321Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • GPHYSICS
    • G08SIGNALLING
    • G08GTRAFFIC CONTROL SYSTEMS
    • G08G1/00Traffic control systems for road vehicles
    • G08G1/01Detecting movement of traffic to be counted or controlled
    • G08G1/0104Measuring and analyzing of parameters relative to traffic conditions
    • G08G1/0125Traffic data processing
    • G08G1/0129Traffic data processing for creating historical data or processing based on historical data
    • GPHYSICS
    • G08SIGNALLING
    • G08GTRAFFIC CONTROL SYSTEMS
    • G08G1/00Traffic control systems for road vehicles
    • G08G1/01Detecting movement of traffic to be counted or controlled
    • G08G1/0104Measuring and analyzing of parameters relative to traffic conditions
    • G08G1/0137Measuring and analyzing of parameters relative to traffic conditions for specific applications
    • GPHYSICS
    • G08SIGNALLING
    • G08GTRAFFIC CONTROL SYSTEMS
    • G08G1/00Traffic control systems for road vehicles
    • G08G1/01Detecting movement of traffic to be counted or controlled
    • G08G1/017Detecting movement of traffic to be counted or controlled identifying vehicles

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Analytical Chemistry (AREA)
  • Data Mining & Analysis (AREA)
  • Chemical & Material Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Probability & Statistics with Applications (AREA)
  • Traffic Control Systems (AREA)

Abstract

The invention provides a highway risk prediction method based on LSTM and BF, comprising offline training of a risk early-warning prediction model and online real-time prediction with that model. The offline training comprises constructing an LSTM instantaneous risk discrimination model that distinguishes accident states from safe states, and establishing a BF sequence risk prediction model based on a decreasing coefficient, a prior probability and a threshold. During online real-time prediction, the output sequence of the LSTM instantaneous risk discrimination model at each observation point in a specified period is used as the input vector and the observation frequency ratio of the matched mode is used as the prior probability; both are fed to the BF sequence risk prediction model to obtain the predicted risk state of the road section in the future. The invention improves the accuracy of real-time risk prediction on expressways.

Description

Expressway risk prediction method based on LSTM and BF
Technical Field
The invention relates to the technical field of traffic safety evaluation and intelligent traffic, in particular to a highway risk prediction method based on LSTM (Long Short-Term Memory) and BF (Bayesian Filtering).
Background
Among the six countries with the largest vehicle fleets, the traffic-accident fatality rate in China far exceeds that of the United States, Germany, Japan and others. Casualties and property losses caused by major traffic accidents have remained at a high level, and traffic safety has become a prominent problem affecting China's social and economic development. Research on methods for predicting the risk state of highway traffic accidents provides a reliable basis for early warning of highway traffic safety, and helps to eliminate potential safety hazards and to prevent and reduce highway traffic accidents.
Highway traffic accident risk prediction is generally based on large volumes of traffic flow data: traffic accident risk features are first mined and extracted, and a prediction model is then built on the extracted features to predict the accident risk of a given road section or area over a future period. Traditional traffic accident risk prediction generally uses parametric models, such as multinomial logistic regression models and Bayesian network models. In recent years, some studies have introduced nonparametric machine learning models, such as the K-nearest-neighbor method, support vector machines and random forests. With the continued development of deep learning, many researchers have also applied deep network models to risk assessment, such as recurrent neural networks and graph convolutional networks. However, most current models require detailed parameter information such as traffic flow, occupancy and vehicle speed. This limits their practical application, particularly for highway checkpoint systems that generally provide only vehicle license plate recognition data.
Disclosure of Invention
To address the shortcomings of the prior art, the invention provides a highway risk prediction method based on LSTM and BF. The LSTM instantaneous risk discrimination model learns the temporal dependencies present in historical risk data, and the BF sequence risk prediction model fuses the prediction results of the LSTM instantaneous risk discrimination model to improve real-time risk prediction. The method thereby addresses the problem of efficiently predicting the accident risk state from highway checkpoint data.
The present invention achieves the above-described object by the following technical means.
A highway risk prediction method based on LSTM and BF includes the following steps:
S1. Training of the offline risk early-warning prediction model
Based on historical expressway traffic accident data from past years, a library of K accident traffic space-time modes is obtained and the occurrence frequency of each mode is counted. Based on the space-time range between an upstream checkpoint and a downstream checkpoint within a specified time before an accident occurs, multi-step high-frequency time-varying variables are extracted from historical checkpoint license plate recognition data, and an LSTM instantaneous risk discrimination model is constructed to distinguish accident states from safe states. The sequence of discrimination results of the LSTM instantaneous risk discrimination model at each observation point in a specified period is taken as the input vector, and a BF sequence risk prediction model based on prior probability is established using the traffic space-time mode matched from the low-frequency time-varying variables and the constant variables.
S2. Real-time prediction with the online risk early-warning model
The low-frequency time-varying variables at the prediction time and the constant variables of the predicted road section are acquired in real time, and the traffic space-time mode of the predicted space-time range is matched in real time. The license plate, vehicle type, passing time and lane of each vehicle passing a checkpoint are acquired in real time, the multi-step high-frequency time-varying variables between the upstream and downstream checkpoints within the time window are extracted and fed to the LSTM instantaneous risk discrimination model, and the discrimination result of the LSTM instantaneous risk discrimination model at time t is obtained. The output sequence of the LSTM instantaneous risk discrimination model at each observation point in a specified period is taken as the input vector and the observation frequency ratio of the matched traffic space-time mode is taken as the prior probability; both are fed to the BF sequence risk prediction model to obtain the predicted risk state of the road section in the future.
Further, the construction process of the LSTM instantaneous risk discrimination model specifically includes:
1) Based on historical traffic accidents, the space-time range between the upstream and downstream checkpoints $M_1$ and $M_2$ within the duration $t_w$ before the prediction time $t_0$ is taken as the input time window, and the T-step input high-frequency time-varying feature vector $X = \{x_1, x_2, \ldots, x_t, \ldots, x_T\}$ is extracted; the output is that an accident occurs within the duration $t_p$ following the prediction time, which gives an accident sample;
2) For each accident sample, the accident-free space-time range in the same time period and on the same road section is randomly downsampled; the space-time range between the upstream and downstream checkpoints $M_1$ and $M_2$ within the duration $t_w$ before the prediction time $t'_0$ is taken as the input time window, and the T-step input high-frequency time-varying feature vector $X' = \{x'_1, x'_2, \ldots, x'_t, \ldots, x'_T\}$ is extracted; the output is that no accident occurs within the duration $t_p$ following the prediction time, which gives a non-accident sample;
3) The accident samples and the non-accident samples are merged, and the LSTM instantaneous risk discrimination model is trained to minimize the loss function, giving the finally calibrated LSTM instantaneous risk discrimination model.
Further, the prediction time $t_0$ is the time $t_p$ before the accident occurrence time $t_c$, where $t_p$ is the prediction duration.
Further, each output value of the LSTM instantaneous risk discrimination model is regarded as a binary random variable $y \in \{0, 1\}$ whose value is governed by a parameter $\theta$, and the expected value of $\theta$ is:
$$E(\theta \mid y) = \frac{m + a}{m + a + q + b}$$
where y is the output sequence of the LSTM instantaneous risk discrimination model, m and q are the numbers of accident-class and non-accident-class values in the LSTM output sequence, respectively, and a and b are hyperparameters of the beta distribution representing the prior probabilities of the accident class and the non-accident class, respectively.
Furthermore, the values of a and b vary with the traffic space-time mode of each sample. The value of a is the ratio of the observation frequency $f_c$ of accident records in the sample's traffic space-time mode $TM_c$ to the total frequency of all modes, i.e.
$$a = \frac{f_c}{\sum_{k=1}^{K} f_k}$$
The corresponding parameter is
$$b = 1 - a$$
where $f_k$ is the observation frequency of accident records in the k-th traffic space-time mode.
Further, the traffic space-time mode of each sample is determined by the following method:
1) Based on historical expressway traffic accident data, the low-frequency time-varying variables and the road-section constant variables of the accident samples are used as characteristic variables $[z_1, z_2, \ldots, z_s, \ldots, z_S]$; K accident modes are obtained by the K-modes clustering method, and the cluster center of each accident mode is $TM_k = [z_{1k}, z_{2k}, \ldots, z_{sk}, \ldots, z_{Sk}]$, $k = 1, 2, \ldots, K$;
2) The low-frequency time-varying variables and the road-section constant variables of the sample to be determined are taken as its traffic space-time mode characteristic variable $TM_t = [z_{1t}, z_{2t}, \ldots, z_{st}, \ldots, z_{St}]$; the Hamming distance between $TM_t$ and each cluster center $TM_k$ is computed, and the mode $TM_c$ with the minimum Hamming distance to $TM_t$ is the traffic space-time mode of the sample.
Furthermore, the predicted risk state of the future road section is determined from $E(\theta \mid y)$ and two thresholds: when $E(\theta \mid y) < \tau_1$, the road section is predicted to be in a low-risk state; when $\tau_1 \le E(\theta \mid y) < \tau_2$, it is predicted to be in a medium-risk state; when $\tau_2 \le E(\theta \mid y)$, it is predicted to be in a high-risk state, where $\tau_1$ and $\tau_2$ are the thresholds for the medium-risk and high-risk states, respectively.
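The three-level decision rule can be illustrated with a minimal Python sketch. The function name `risk_level` and the example threshold values are illustrative assumptions and are not taken from the patent text.

```python
def risk_level(expected_theta: float, tau1: float, tau2: float) -> str:
    """Map the BF posterior expectation E(theta|y) to a three-level risk state."""
    if not 0.0 <= tau1 <= tau2 <= 1.0:
        raise ValueError("thresholds must satisfy 0 <= tau1 <= tau2 <= 1")
    if expected_theta < tau1:
        return "low"
    if expected_theta < tau2:
        return "medium"
    return "high"


# tau1 = 0.3 and tau2 = 0.6 are illustrative values only.
print(risk_level(0.45, tau1=0.3, tau2=0.6))
```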
Further, a Dropout (random deactivation) layer is placed between the hidden layer and the Dense fully connected output layer of the LSTM instantaneous risk discrimination model, and is used together with L2 regularization of the input connection weights and learning-rate decay.
Further, the high-frequency time-varying variables comprise the upstream and downstream checkpoint flows, the inter-lane flow difference, the lane average flow, the road-section flow density and the large/small vehicle flow ratio.
Further, the low-frequency time-varying variables comprise the season, the day of the week, the time period of the day and the weather type.
Further, the constant variables comprise the number of lanes of the road section, the road alignment, the spacing between entrance and exit ramps of the road section and the number of entrance and exit ramps of the road section; the continuous ramp-spacing variable is discretized by value interval.
The invention has the beneficial effects that:
(1) The invention does not require sensors to acquire detailed parameter information such as traffic flow, occupancy and vehicle speed. It is suitable for highway checkpoint systems that generally provide only vehicle license plate recognition data, and is therefore more practical.
(2) The invention constructs an LSTM-BF prediction framework in which the LSTM instantaneous risk discrimination model learns the temporal dependencies present in historical risk data, and the BF sequence risk prediction model fuses the prediction results of the LSTM instantaneous risk discrimination model over a specified period, improving the accuracy of real-time expressway risk prediction.
(3) The invention distinguishes three types of variables: high-frequency time-varying variables, low-frequency time-varying variables and road constant variables. The multi-step feature input of the LSTM instantaneous risk discrimination model is constructed from the high-frequency time-varying variables, and the risk prior probability of the BF sequence risk prediction model is obtained from the traffic space-time mode matched by the low-frequency time-varying variables and the road constant variables. The different influence of each type of variable on risk prediction is thus taken into account, which improves the accuracy of the real-time expressway risk prediction model.
Drawings
FIG. 1 is a flow chart of the LSTM and BF based highway risk prediction method of the present invention;
FIG. 2 is a conceptual diagram of the input spatio-temporal extent of the LSTM model of the present invention;
FIG. 3 is a conceptual diagram of a multi-step feature input (step number T = 4) of the LSTM model of the present invention;
FIG. 4 is a conceptual diagram of an input sequence of the BF model according to the present invention.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in detail below.
As shown in fig. 1, a highway risk early warning method based on LSTM and BF specifically includes the following steps:
Step one: training the offline risk early-warning prediction model
Historical expressway traffic accident data from past years are obtained from a traffic accident information base. The low-frequency time-varying variables and the road-section constant variables of the accident samples are extracted from these data, a library of K accident modes is obtained by cluster analysis, and the occurrence frequency of each accident mode is counted. Using license plate recognition data collected by the checkpoint real-time information collection system, multi-step high-frequency time-varying characteristic variables are extracted in moving-time-window form over the space-time range between an upstream checkpoint and a downstream checkpoint within a specified time before an accident occurs, and an LSTM instantaneous risk discrimination model (a deep-learning neural network) is constructed to distinguish accident states from safe states. A BF sequence risk prediction model based on prior probability is then established, taking the sequence of discrimination results of the LSTM instantaneous risk discrimination model at each observation point in a specified period as the input vector.
The first step is specifically as follows:
Step (1): The expressway main line is divided into L road sections according to the positions of the checkpoints along the expressway, and the association between the upstream and downstream checkpoints of each road section is determined from the traffic flow direction. The road-section constant variables are acquired, comprising the number of lanes, the road alignment, the spacing between entrance and exit ramps and the number of entrance and exit ramps of the road section; the continuous ramp-spacing variable is discretized by value interval.
Step (2): Based on the historical expressway traffic accident data, the historical accident records of each road section divided in step (1) are obtained. The accident road section, the occurrence time (low-frequency time-varying variables, specifically the season, the day of the week and the time period of the day), the weather type (also a low-frequency time-varying variable) and the road-section constant variables from step (1) are extracted as characteristic variables $[z_1, z_2, \ldots, z_s, \ldots, z_S]$. Based on these characteristic variables, K accident modes are obtained by the K-modes clustering method; the cluster center of each accident mode is $TM_k = [z_{1k}, z_{2k}, \ldots, z_{sk}, \ldots, z_{Sk}]$, $k = 1, 2, \ldots, K$, and the observation frequencies $f_1, f_2, \ldots, f_K$ of accident records in each accident mode are counted.
Step (3): Using license plate recognition data collected by the checkpoint real-time information collection system, multi-step high-frequency time-varying characteristic variables are extracted in moving-time-window form over the space-time range between an upstream checkpoint and a downstream checkpoint within a specified time before an accident occurs, and an LSTM instantaneous risk discrimination model is constructed to distinguish accident states from safe states. Step (3) is implemented through the following sub-steps:
Step 3-1): Based on the license plate, passing time, lane and vehicle type (small vehicle/large vehicle) of each vehicle passing a checkpoint, as historically collected by the checkpoint real-time information collection system, the total flow, the per-lane flow and the per-vehicle-type flow passing each checkpoint per unit time are counted.
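A minimal sketch of the counting in step 3-1) is given below, assuming each checkpoint record has already been parsed into a small tuple. The `Passage` record layout, the field names and the use of integer minutes for the passing time are illustrative assumptions; the patent does not specify a data format.

```python
from collections import Counter
from typing import NamedTuple


class Passage(NamedTuple):
    """One vehicle passing one checkpoint (hypothetical record layout)."""
    plate: str
    checkpoint: str
    lane: int
    vehicle_type: str  # "small" or "large"
    minute: int        # passing time, in minutes since the start of the record


def count_flows(passages, checkpoint, start_minute, length=1):
    """Count total, per-lane and per-type flow at one checkpoint
    in the half-open interval [start_minute, start_minute + length)."""
    total = 0
    by_lane = Counter()
    by_type = Counter()
    for p in passages:
        if p.checkpoint == checkpoint and start_minute <= p.minute < start_minute + length:
            total += 1
            by_lane[p.lane] += 1
            by_type[p.vehicle_type] += 1
    return total, by_lane, by_type


records = [
    Passage("A1", "M1", 1, "small", 0),
    Passage("B2", "M1", 2, "large", 0),
    Passage("C3", "M1", 1, "small", 1),
    Passage("D4", "M2", 1, "small", 0),
]
total, by_lane, by_type = count_flows(records, "M1", start_minute=0, length=1)
print(total, dict(by_lane), dict(by_type))
```

For large record volumes a single pass that buckets every record by checkpoint and minute would be more efficient; the per-query scan here is kept only for readability.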
Step 3-2): The accident-sample inputs and outputs of the LSTM instantaneous risk discrimination model are constructed in moving-time-window form, through the following sub-steps:
Step 3-2-1): For each of the D historical expressway traffic accident records, the upstream checkpoint $M_1$ and the downstream checkpoint $M_2$ of the road section where the accident occurred are determined from the upstream-downstream checkpoint association given by the traffic flow direction, together with the occurrence time $t_c$.
Step 3-2-2): As shown in FIG. 2, the time $t_p$ before the accident occurrence time $t_c$ is taken as the prediction time $t_0$, i.e. $t_0 = t_c - t_p$, where $t_p$ is the prediction duration. The space-time range between the upstream and downstream checkpoints $M_1$ and $M_2$ within the $t_w$ minutes before $t_0$ is the input time window. With 5 min as the unit step length, the input feature vector $X = \{x_1, x_2, \ldots, x_t, \ldots, x_T\}$ with $T = t_w / 5$ steps is extracted. The high-frequency time-varying variables in each step input $x_t$ are: the upstream and downstream checkpoint flows (total flow), the inter-lane flow difference (difference between per-lane flows), the lane average flow (mean of per-lane flows), the road-section flow density (total flow divided by the distance between the upstream and downstream checkpoints) and the large/small vehicle flow ratio (ratio of per-vehicle-type flows). Each time-varying variable is standardized by the z-score method. The input time window moves with a step of 1 min. The output label is 1, meaning that an accident occurs within the duration $t_p$ following the prediction time, as shown in FIG. 3.
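The windowing and standardization in step 3-2-2) can be sketched as follows. This is a simplified illustration for a single flow series; the function names, the dictionary representation of per-minute flow and the use of the population standard deviation in the z-score are assumptions rather than details given in the text.

```python
from statistics import mean, pstdev


def window_steps(t0, t_w, step=5):
    """Split the t_w minutes before prediction time t0 into T = t_w / step
    half-open intervals [start, end), in minutes."""
    if t_w <= 0 or t_w % step != 0:
        raise ValueError("t_w must be a positive multiple of the step length")
    start = t0 - t_w
    return [(start + i * step, start + (i + 1) * step) for i in range(t_w // step)]


def aggregate(per_minute_flow, intervals):
    """Sum a per-minute flow series (dict: minute -> count) over each interval."""
    return [sum(per_minute_flow.get(m, 0) for m in range(lo, hi)) for lo, hi in intervals]


def zscore(values):
    """Standardize values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0] * len(values)
    return [(v - mu) / sigma for v in values]


# Toy per-minute total flow at one checkpoint: 3 vehicles every minute for an hour.
flow = {minute: 3 for minute in range(60)}
steps = window_steps(t0=60, t_w=20)   # T = 4 steps of 5 min each
step_flows = aggregate(flow, steps)   # one raw feature value per step
print(steps, step_flows)
```

Moving the window by 1 min corresponds to calling `window_steps` with `t0`, `t0 + 1`, and so on. In a training pipeline the z-score mean and standard deviation would normally be estimated on the training set and reused for validation, test and online data.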
Step 3-2-3): For all D recorded accidents, the inputs and outputs of the positive samples of the LSTM instantaneous risk discrimination model are constructed by the method of step 3-2-2), giving D positive samples. With a downsampling ratio of positive samples (accident samples) to negative samples (non-accident samples) of 1:3, the accident-free space-time range in the same time period and on the same road section as each accident sample is randomly downsampled. The negative-sample inputs are constructed by the method of step 3-2-2) and the output label is 0, meaning that no accident occurs within the duration $t_p$ following the prediction time. This gives 3D negative samples of the LSTM instantaneous risk discrimination model.
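A minimal sketch of the 1:3 negative sampling for one accident sample follows. It assumes the candidate prediction times have already been restricted to the same road section and the same time period as the accident; the function name, the fixed seed and the exact accident-free test (no accident in the $t_p$ minutes after the candidate time) are illustrative assumptions.

```python
import random


def sample_negatives(candidates, accident_times, t_p, ratio=3, seed=0):
    """Draw `ratio` accident-free prediction times for one accident sample.

    A candidate time t qualifies when no accident occurs in (t, t + t_p].
    All times are in minutes on a common clock.
    """
    rng = random.Random(seed)
    safe = [t for t in candidates
            if not any(t < tc <= t + t_p for tc in accident_times)]
    if len(safe) < ratio:
        raise ValueError("not enough accident-free candidates")
    return rng.sample(safe, ratio)


candidates = [100, 200, 300, 400, 500]
negatives = sample_negatives(candidates, accident_times=[210], t_p=15)
print(negatives)
```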
Step 3-2-4): To avoid overfitting caused by excessive model depth, the number of hidden layers of the LSTM instantaneous risk discrimination model is set to 1, and the number of hidden-layer nodes is determined experimentally. To further prevent overfitting, a Dropout (random deactivation) layer is added between the hidden layer and the Dense fully connected output layer, and L2 regularization of the input connection weights and learning-rate decay are also applied.
Step 3-2-5): The positive and negative samples from step 3-2-3) are merged into the total sample set (D + 3D = 4D samples in total), which is randomly divided into a training set, a validation set and a test set in proportions of 60%, 20% and 20%. With binary cross-entropy as the loss function and the Adam optimizer, the LSTM instantaneous risk discrimination model of step 3-2-4) is trained to minimize the loss function, giving the finally calibrated LSTM instantaneous risk discrimination model.
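The data split and the loss function of step 3-2-5) can be sketched in plain Python. The network itself and the Adam optimizer are not reproduced here; the function names, the fixed seed and the clipping constant `eps` are illustrative assumptions.

```python
import math
import random


def split_samples(samples, fractions=(0.6, 0.2, 0.2), seed=0):
    """Randomly split samples into training, validation and test sets."""
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * fractions[0])
    n_val = round(len(shuffled) * fractions[1])
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])


def binary_cross_entropy(labels, probs, eps=1e-12):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(labels)


train, val, test = split_samples(range(20))
print(len(train), len(val), len(test))
print(binary_cross_entropy([1, 0], [0.5, 0.5]))
```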
Step (4): A BF sequence risk prediction model based on prior probability is established, taking the sequence of discrimination results of the LSTM instantaneous risk discrimination model in a specified period as the input vector. Step (4) is implemented through the following sub-steps:
Step 4-1): For each of the 4D total samples obtained in step (3), the N-1 observation points at 1 min intervals preceding the prediction time $t_0$, together with $t_0$ itself, are arranged in chronological order as the observation period of the sample (i.e. $t_0 - N + 1, t_0 - N + 2, \ldots, t_0$, N observation points in total).
Step 4-2): As shown in FIG. 4, each observation point in the observation period obtained in step 4-1) is taken in turn as the prediction time $t_0$, and N groups of input features of the LSTM instantaneous risk discrimination model, corresponding to the N observation points, are obtained by the method of step 3-2-2). These are fed separately into the LSTM instantaneous risk discrimination model obtained in step 3-2-5), giving the sequence $y = [y_1, \ldots, y_N]$ of discrimination results at the N observation points in the observation period, which is the input vector of the BF model. The BF output label follows the label of the sample: it is 1 if the original sample is a positive sample and 0 otherwise.
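Steps 4-1) and 4-2) amount to evaluating the trained LSTM model at N consecutive one-minute observation points. The sketch below illustrates this; `discriminate` is a hypothetical callable standing in for the trained LSTM instantaneous risk discrimination model (feature extraction included), and the stub used in the example is purely illustrative.

```python
def observation_points(t0, n_points):
    """Return the N observation times t0-N+1, ..., t0 at 1 min intervals."""
    return [t0 - n_points + 1 + i for i in range(n_points)]


def lstm_sequence(t0, n_points, discriminate):
    """Build the BF input vector y = [y_1, ..., y_N].

    `discriminate(t)` is assumed to return the 0/1 LSTM discrimination
    result when t is used as the prediction time.
    """
    return [discriminate(t) for t in observation_points(t0, n_points)]


# Stub in place of a trained model: flags an accident state from minute 58 on.
def stub(t):
    return 1 if t >= 58 else 0


y = lstm_sequence(t0=60, n_points=5, discriminate=stub)
print(observation_points(60, 5), y)
```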
Step 4-3): The output of the LSTM instantaneous risk discrimination model is regarded as an observation of a binary random variable $y \in \{0, 1\}$, and a BF sequence risk prediction model based on a decreasing coefficient, a prior probability and a threshold is established through the following sub-steps:
Step 4-3-1): The value of the random variable $y \in \{0, 1\}$ is governed by a parameter $\theta$:
$$p(y = 1 \mid \theta) = \theta$$
where the parameter $\theta$ is the probability that $y = 1$ (accident class). A beta distribution is adopted as the prior distribution of $\theta$:
$$\mathrm{beta}(\theta \mid a, b) = \frac{\Gamma(a + b)}{\Gamma(a)\,\Gamma(b)}\,\theta^{a-1}(1 - \theta)^{b-1}$$
where Γ (x) is a gamma function, and a and b are hyper-parameters of the beta distribution, representing a priori knowledge of the two classes, respectively, which may be represented by the initial observation frequencies of the two classes.
Step 4-3-2): Given the LSTM output sequence $y = [y_1, \ldots, y_N]$, by Bayes' theorem the posterior probability $p(\theta \mid y)$ of the parameter $\theta$ is the normalized product of the prior $\mathrm{beta}(\theta \mid a, b)$ and the binomial likelihood $\mathrm{bin}(m \mid N, \theta)$, that is:
$$p(\theta \mid y) = \frac{\Gamma(m + a + q + b)}{\Gamma(m + a)\,\Gamma(q + b)}\,\theta^{m+a-1}(1 - \theta)^{q+b-1}$$
where m and q are the numbers of $y = 1$ (accident class) and $y = 0$ (non-accident class) values in the LSTM output sequence, respectively, and satisfy $m + q = N$. The expected value of the parameter $\theta$ is then simply:
$$E(\theta \mid y) = \frac{m + a}{m + a + q + b}$$
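The posterior expectation can be checked numerically with a short sketch. It uses the standard beta-binomial result that the posterior is a beta distribution with parameters m + a and q + b; the function name and the example prior values are illustrative assumptions.

```python
def posterior_mean(y, a, b):
    """Posterior mean of theta under a beta(a, b) prior, given 0/1 outputs y."""
    if a <= 0 or b <= 0:
        raise ValueError("beta hyperparameters must be positive")
    if any(v not in (0, 1) for v in y):
        raise ValueError("y must contain only 0 and 1")
    m = sum(y)       # number of accident-class outputs
    q = len(y) - m   # number of non-accident-class outputs
    return (m + a) / (m + a + q + b)


# Five LSTM outputs, three of them accident class; a = 0.2, b = 0.8 are toy priors.
print(posterior_mean([0, 0, 1, 1, 1], a=0.2, b=0.8))
```

With an empty sequence the function returns the prior mean a / (a + b), which matches the role of a and b as prior knowledge about the two classes.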
Step 4-3-3): Because outputs of the LSTM instantaneous risk discrimination model that are closer to the current observation time t should carry more weight, a decreasing function $r_n$ that varies with the observation time is introduced to improve the accuracy of the final prediction result:
[Equation image not recovered: definition of the decreasing function $r_n$]
[Equation image not recovered: expected value $E(\theta \mid y)$ weighted by $r_n$]
where $n = 1, 2, \ldots, N$ is the sequence number of the n-th output value of the LSTM sequence (N being the sequence number of the last output value); $C \in (0, 1]$ is the decreasing constant, whose size significantly affects the prediction performance of BF, and when C is 1 every output in the LSTM sequence has the same weight; $m_n$ and $q_n$ are the label values of the n-th output of the LSTM sequence and satisfy $m_n + q_n = 1$: if the output is accident class ($y_n = 1$), then $m_n = 1$ and $q_n = 0$; otherwise, for non-accident class ($y_n = 0$), $m_n = 0$ and $q_n = 1$.
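The exact form of $r_n$ is not reproduced in this text, so the sketch below is an assumption rather than the patented formula. It assumes a geometric decay $r_n = C^{N-n}$, which gives the most recent output weight 1 and reduces to equal weights when C = 1, and it assumes the weighted counts replace m and q in the posterior mean.

```python
def weighted_posterior_mean(y, a, b, C):
    """Posterior mean of theta with time-decayed counts.

    ASSUMPTION: r_n = C ** (N - n) for n = 1..N; the patent's own definition
    of r_n is not available in this text.
    """
    if not 0.0 < C <= 1.0:
        raise ValueError("C must lie in (0, 1]")
    N = len(y)
    weights = [C ** (N - n) for n in range(1, N + 1)]
    m_w = sum(r * yn for r, yn in zip(weights, y))        # weighted accident count
    q_w = sum(r * (1 - yn) for r, yn in zip(weights, y))  # weighted non-accident count
    return (m_w + a) / (m_w + a + q_w + b)


y = [0, 0, 1, 1, 1]
print(weighted_posterior_mean(y, a=0.2, b=0.8, C=1.0))  # equal weights
print(weighted_posterior_mean(y, a=0.2, b=0.8, C=0.5))  # recent outputs dominate
```

Under this assumed form, a smaller C makes the estimate react faster to the latest LSTM outputs, at the cost of using less of the observation history.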
Step 4-3-4): The BF sequence risk prediction model determines the final predicted class with a threshold $\tau$: if $E(\theta \mid y) > \tau$ the class is accident; otherwise it is non-accident (safe).
Step 4-4): The BF sequence risk prediction model of step 4-3) is trained on the BF samples to obtain the finally calibrated BF sequence risk prediction model. Step 4-4) is implemented through the following sub-steps:
Step 4-4-1): For all samples in step 4-1), the inputs and outputs of the BF sequence risk prediction model samples are constructed by the method of step 4-2), giving 4D BF sequence risk prediction model samples.
Step 4-4-2): The low-frequency time-varying variables corresponding to each sample (season, day of the week, time period of the day and weather type) are extracted from the historical expressway traffic accident data. These low-frequency time-varying variables and the road-section constant variables of the sample obtained in step (1) are taken as the traffic space-time mode characteristic variable of the sample, $TM_t = [z_{1t}, z_{2t}, \ldots, z_{st}, \ldots, z_{St}]$, and pattern matching is performed by Hamming distance against the K accident modes $TM_k = [z_{1k}, z_{2k}, \ldots, z_{sk}, \ldots, z_{Sk}]$, $k = 1, 2, \ldots, K$, obtained in step (2). The Hamming distance is computed as follows: the distance is initialized to 0; for each characteristic variable $z_s$, if $z_{st} = z_{sk}$ the distance is unchanged, otherwise it is increased by 1; this is repeated until all characteristic variables $[z_1, z_2, \ldots, z_s, \ldots, z_S]$ have been traversed. The mode $TM_c$ with the minimum Hamming distance to $TM_t$ is selected as the traffic space-time mode of the sample.
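The Hamming-distance matching can be sketched directly. The categorical values in the example and the tie-breaking rule (lowest mode index wins) are illustrative assumptions; the text does not say how ties are resolved.

```python
def hamming_distance(u, v):
    """Number of positions at which two equal-length feature vectors differ."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(1 for x, y in zip(u, v) if x != y)


def match_mode(sample, centers):
    """Return the index c of the cluster center closest to `sample`.

    Ties are broken by the lowest index (an assumption).
    """
    distances = [hamming_distance(sample, center) for center in centers]
    return distances.index(min(distances))


# Toy centers: (season, day type, time period, weather, number of lanes).
centers = [
    ("summer", "weekday", "morning", "sunny", 2),
    ("winter", "weekend", "night", "snow", 3),
    ("summer", "weekend", "evening", "rain", 2),
]
sample = ("summer", "weekend", "evening", "sunny", 2)
print(match_mode(sample, centers))
```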
Step 4-4-3): The values of the parameters a and b in the BF sequence risk prediction model of step 4-3) are set to vary with the traffic space-time mode of each sample. The value of a is the ratio of the observation frequency $f_c$ of accident records in the sample's mode $TM_c$ to the total frequency of all modes, i.e.
$$a = \frac{f_c}{\sum_{k=1}^{K} f_k}$$
The corresponding parameter is
$$b = 1 - a$$
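A minimal sketch of the mode-dependent prior is given below. The value of a follows the frequency ratio stated in the text; taking b as the complement 1 - a is an assumption based on a and b representing the prior probabilities of the two classes. The function name and the toy frequencies are illustrative.

```python
def mode_priors(frequencies, c):
    """Return (a, b) for a sample matched to mode index c.

    a = f_c / sum_k f_k (from the text); b = 1 - a (assumed complement).
    """
    total = sum(frequencies)
    if total <= 0:
        raise ValueError("at least one accident record is required")
    a = frequencies[c] / total
    return a, 1.0 - a


# Toy accident counts for K = 3 modes; the sample is matched to mode 1.
print(mode_priors([30, 50, 20], c=1))
```

Because the beta distribution requires a > 0 and b > 0, a mode with zero recorded accidents, or a single mode holding all records, would need special handling in practice.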
Step 4-4-4): The samples obtained in step 4-4-1) are randomly divided into a training set and a test set in proportions of 75% and 25%. With the predicted F1 score as the objective function, the optimal values $C^*$, $N^*$ and $\tau^*$ of the remaining parameters C, N and $\tau$ of the BF sequence risk prediction model are obtained by grid search over the ranges $C \in [0.5, 1]$, $N \in [5, 15]$ and $\tau \in [0.1, 1]$. The BF sequence risk prediction model with the optimal parameter combination $\{C = C^*, N = N^*, \tau = \tau^*\}$ is finally obtained.
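The grid search can be sketched as below. Several details are assumptions and not taken from the text: the geometric weight $C^{N-n}$ inside `bf_predict`, the grid step sizes, the toy samples, and the rule that the first best-scoring combination is kept.

```python
from itertools import product


def f1_score(labels, preds):
    """F1 score for binary 0/1 labels and predictions."""
    tp = sum(1 for l, p in zip(labels, preds) if l == 1 and p == 1)
    fp = sum(1 for l, p in zip(labels, preds) if l == 0 and p == 1)
    fn = sum(1 for l, p in zip(labels, preds) if l == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def bf_predict(y, a, b, C, tau):
    """Classify one LSTM output sequence; the weight C ** (N - n) is an ASSUMED form."""
    N = len(y)
    m_w = sum(C ** (N - n) * yn for n, yn in enumerate(y, start=1))
    q_w = sum(C ** (N - n) * (1 - yn) for n, yn in enumerate(y, start=1))
    return 1 if (m_w + a) / (m_w + a + q_w + b) > tau else 0


def grid_search(samples, C_grid, N_grid, tau_grid):
    """samples: list of (y_full, a, b, label); only the last N outputs are used.
    Returns (best_f1, C, N, tau), keeping the first best combination found."""
    labels = [label for _, _, _, label in samples]
    best = None
    for C, N, tau in product(C_grid, N_grid, tau_grid):
        preds = [bf_predict(y[-N:], a, b, C, tau) for y, a, b, _ in samples]
        score = f1_score(labels, preds)
        if best is None or score > best[0]:
            best = (score, C, N, tau)
    return best


# Toy training samples (15 LSTM outputs each) with a fixed prior a = 0.3, b = 0.7.
samples = [
    ([0] * 10 + [1] * 5, 0.3, 0.7, 1),
    ([0] * 15, 0.3, 0.7, 0),
    ([1] * 5 + [0] * 10, 0.3, 0.7, 0),
    ([0] * 5 + [1] * 10, 0.3, 0.7, 1),
]
C_grid, N_grid, tau_grid = [0.5, 0.75, 1.0], [5, 10, 15], [0.1, 0.3, 0.5, 0.7, 0.9]
best = grid_search(samples, C_grid, N_grid, tau_grid)
print(best)
```

In the procedure described above, the search would run on the 75% training split and the selected combination would then be evaluated on the 25% test split.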
Step two: real-time prediction with the online risk early-warning model
The checkpoint real-time information acquisition system collects, in real time, the license plate, vehicle type, passing time and lane of each vehicle passing through a checkpoint. The multi-step high-frequency time-varying characteristic variables between the upstream and downstream checkpoints within the time window are extracted in real time using the feature extraction method of step one and input into the LSTM instantaneous risk discrimination model to obtain its discrimination result at time t. The low-frequency time-varying variables and the road section constant variables of step one are acquired in real time through the checkpoint real-time information acquisition system and the traffic geographic information base, respectively. The current traffic space-time mode is then matched in real time, based on Hamming distance, against the accident mode library obtained in step one, and the proportion of the matched mode's frequency to the total frequency of all modes is obtained. Taking the sequence of LSTM instantaneous risk discrimination results at each observation point within the specified period as the input vector and the mode frequency proportion as the risk prior probability, the BF sequence risk prediction model is applied to obtain the predicted risk state of the road section in the future.
The second step is specifically as follows:
Step (1): At the prediction time t, the low-frequency time-varying variables of step one are acquired in real time through the checkpoint real-time information acquisition system, and the constant variables of the predicted road section are acquired in real time through the traffic geographic information base. Taking the low-frequency time-varying variables and the constant variables of the predicted road section as the characteristic variables $TM_t = [z_{1t}, z_{2t}, \ldots, z_{st}, \ldots, z_{St}]$, real-time pattern matching based on Hamming distance is performed against the K classes of accident modes obtained in step one, $TM_k = [z_{1k}, z_{2k}, \ldots, z_{sk}, \ldots, z_{Sk}]$, $k = 1, 2, \ldots, K$. The Hamming distance is calculated as described in step one. The mode $TM_c$ with the minimum Hamming distance to $TM_t$ is selected as the traffic space-time mode of the predicted road section at the prediction time.
Step (2): For the predicted road section l, the upstream and downstream checkpoint information systems of the section photograph each vehicle passing the upstream and downstream checkpoints and automatically recognize its license plate. The vehicle information collected comprises the license plate, the passing time, the lane, and the vehicle type (small/large). Based on this information, the observation period of the prediction time t (N observation points in total: t-N+1, t-N+2, ..., t) is determined in real time according to the BF sequence risk prediction model input vector construction method of step one. N groups of LSTM instantaneous risk discrimination model input features, one for each observation point, are obtained according to the LSTM model multi-step high-frequency time-varying characteristic variable construction method of step one and are input into the LSTM instantaneous risk discrimination model trained in step one, giving the sequence $y = [y_1, \ldots, y_N]$ of LSTM discrimination results at the N observation points in the period.
Step (3): The proportion of the observation frequency $f_c$ of accident records in the mode $TM_c$ obtained in step (1) to the total frequency $\sum_k f_k$ of all K modes is taken as the value of the parameter a in the BF sequence risk prediction model of step one, i.e.
$a = \dfrac{f_c}{\sum_{k=1}^{K} f_k}$
The corresponding parameter b is
Figure RE-GDA0003305055030000092
Taking the LSTM discrimination result sequence $y = [y_1, \ldots, y_N]$ obtained in step (2) as the input vector, the BF sequence risk prediction model obtained in step one with the parameter combination
Figure RE-GDA0003305055030000093
yields the accident risk probability $E(\theta \mid y)$ of the predicted road section at the prediction time.
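For orientation, the expected value in step (3) can be related to the standard Beta-binomial result. The derivation below is a sketch of the undiscounted case only, where m and q = N - m denote the numbers of accident-class and non-accident-class values in y. It is textbook conjugacy rather than a reproduction of the patent's figures; in the model described here, m and q are additionally weighted by the decreasing function $r_n$, whose exact form is given only in a figure of the source.

```latex
\begin{aligned}
p(\theta \mid y) &\propto \mathrm{Beta}(\theta \mid a, b)\,\mathrm{Bin}(m \mid N, \theta)
  \propto \theta^{\,a+m-1}(1-\theta)^{\,b+q-1}, \qquad q = N - m, \\
\theta \mid y &\sim \mathrm{Beta}(a + m,\; b + q), \\
E(\theta \mid y) &= \frac{m + a}{m + a + q + b}.
\end{aligned}
```

As a hypothetical numerical check, N = 5 outputs with m = 3 accident-class values and priors a = 0.3, b = 0.7 give E(θ | y) = 3.3 / 6.0 = 0.55.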
Step (4): According to the accident risk probability obtained in step (3), the corresponding highway risk early-warning strategy is formulated. When the risk probability output by the BF sequence risk prediction model in step (3) satisfies $E(\theta \mid y) < \tau_1$, the predicted road section is in a low-risk state and the highway management department need not take any measures. When $\tau_1 \le E(\theta \mid y) < \tau_2$, the predicted road section is in a medium-risk state, and the highway management department needs to strengthen monitoring of the section and prepare for early warning. When $\tau_2 \le E(\theta \mid y)$, the predicted road section is in a high-risk state, and the highway management department should take measures to smooth the traffic flow entering and leaving the section, issue early warnings to drivers, and prepare for accident rescue. Here $\tau_1, \tau_2 \in [0, 1]$ with $\tau_1 < \tau_2$ are the thresholds for the medium-risk and high-risk states, respectively.
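The three-level rule of step (4) can be sketched as below. The function name, the level labels and the example thresholds 0.3 and 0.6 are hypothetical; the source only requires that both thresholds lie in [0, 1] and that the first is smaller than the second.

```python
def risk_level(expected_risk: float, tau1: float, tau2: float) -> str:
    """Map the BF model output E(theta | y) to a warning level."""
    if not (0.0 <= tau1 < tau2 <= 1.0):
        raise ValueError("thresholds must satisfy 0 <= tau1 < tau2 <= 1")
    if expected_risk < tau1:
        return "low"      # no action required
    if expected_risk < tau2:
        return "medium"   # strengthen monitoring, prepare for early warning
    return "high"         # smooth traffic, warn drivers, prepare rescue


# Hypothetical thresholds for illustration only
TAU1, TAU2 = 0.3, 0.6
levels = [risk_level(p, TAU1, TAU2) for p in (0.1, 0.3, 0.59, 0.6, 0.9)]
```

Note that each boundary value belongs to the higher level, matching the non-strict inequalities in the text.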
Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention, and that variations, modifications, substitutions and alterations can be made in the above embodiments by those of ordinary skill in the art without departing from the principle and spirit of the present invention.

Claims (6)

1. A highway risk prediction method based on LSTM and BF is characterized by comprising the following steps:
S1, training the offline risk early-warning prediction model
Obtaining a library of K classes of accident traffic space-time modes based on historical expressway traffic accident data from past years, and counting the occurrence frequency of each accident traffic space-time mode; extracting multi-step high-frequency time-varying variables from historical checkpoint license plate recognition data, based on the space-time range between the upstream and downstream checkpoints within a specified time before an accident occurs, and constructing an LSTM instantaneous risk discrimination model for distinguishing accident states from safe states; taking the sequence of LSTM instantaneous risk discrimination model discrimination results at each observation point within a specified period as the input vector, and establishing a BF sequence risk prediction model whose prior probability is based on the traffic space-time mode matched by the low-frequency time-varying variables and the constant variables;
S2, real-time prediction with the online risk early-warning model
Acquiring the low-frequency time-varying variables at the prediction time and the constant variables of the predicted road section in real time, and matching the traffic space-time mode of the predicted space-time in real time; acquiring the license plate, vehicle type, passing time and lane of each vehicle passing through a checkpoint in real time, extracting the multi-step high-frequency time-varying variables between the upstream and downstream checkpoints within the time window, and inputting them into the LSTM instantaneous risk discrimination model to obtain its discrimination result at time t; taking the output sequence of the LSTM instantaneous risk discrimination model at each observation point within the specified period as the input vector and the observation frequency proportion of the matched traffic space-time mode as the prior probability, inputting them into the BF sequence risk prediction model, and finally obtaining the predicted risk state of the road section in the future;
the construction process of the LSTM instantaneous risk discrimination model specifically comprises the following steps:
1) Based on historical traffic accidents, taking the space-time range between the upstream and downstream checkpoints $M_1$ and $M_2$ within the duration $t_w$ before the prediction time $t_0$ as the input time window, and extracting the T-step input feature vectors $X = \{x_1, x_2, \ldots, x_t, \ldots, x_T\}$, the output being that an accident occurs within the future duration $t_p$ from the prediction time, thereby obtaining accident samples; the prediction time $t_0$ is the time $t_p$ before the accident occurrence time $t_c$, where $t_p$ is the prediction duration;
2) For each accident sample, randomly down-sampling the accident-free space-time ranges of the same time period and the same road section, taking the space-time range between the upstream and downstream checkpoints $M_1$ and $M_2$ within the duration $t_w$ before the prediction time $t_0'$ as the input time window, and extracting the T-step input feature vectors $X' = \{x_1', x_2', \ldots, x_t', \ldots, x_T'\}$, the output being that no accident occurs within the future duration $t_p$ from the prediction time, thereby obtaining non-accident samples;
3) Fusing accident samples and non-accident samples, training an LSTM instantaneous risk discrimination model to minimize a loss function, and obtaining a finally calibrated LSTM instantaneous risk discrimination model;
each output value of the LSTM instantaneous risk discrimination model is regarded as a binary random variable $y \in \{0, 1\}$ whose value is governed by a parameter θ, and the expected value of the parameter θ is expressed as:
Figure FDA0003784033980000011
wherein: y is the output sequence of the LSTM instantaneous risk discrimination model, m and q respectively represent the numbers of accident-class and non-accident-class values in the LSTM output sequence, and a and b are the hyper-parameters of the beta distribution, respectively representing the prior probabilities of the accident class and the non-accident class;
taking the outputs of the LSTM instantaneous risk discrimination model as observation samples of the binary random variable $y \in \{0, 1\}$, and establishing a BF sequence risk prediction model based on a decreasing coefficient, a prior probability and a threshold:
when given an LSTM output sequence $y = [y_1, \ldots, y_N]$, according to the Bayes probability formula, the posterior probability $p(\theta \mid y)$ of the parameter θ is the normalized product of the prior probability $\mathrm{beta}(\theta \mid a, b)$ and the binomial distribution probability $\mathrm{bin}(m \mid N, \theta)$:
Figure FDA0003784033980000021
adding a decreasing function $r_n$ that varies with the observation time n, the expected value of the parameter θ is expressed as:
Figure FDA0003784033980000022
wherein: $n = 1, 2, \ldots, N$ denotes the sequence number of the nth output value of the LSTM sequence, N is the sequence number of the last output value of the sequence, $C \in (0, 1]$ is the decreasing constant, and $m_n$ and $q_n$ represent the corresponding label values of the nth output of the LSTM sequence;
the BF sequence risk prediction model determines the final predicted class by setting a threshold τ;
the traffic space-time mode of each sample is determined by the following method:
1) Based on historical traffic accident data of the highway, taking the low-frequency time-varying variables and road section constant variables of the accident samples as characteristic variables $[z_1, z_2, \ldots, z_s, \ldots, z_S]$, and obtaining K accident modes by the K-modes clustering method, the clustering center of each accident mode being $TM_k = [z_{1k}, z_{2k}, \ldots, z_{sk}, \ldots, z_{Sk}]$, $k = 1, 2, \ldots, K$;
2) Taking the low-frequency time-varying variables and road section constant variables of the sample to be determined as the traffic space-time mode characteristic variables of the sample, $TM_t = [z_{1t}, z_{2t}, \ldots, z_{st}, \ldots, z_{St}]$, and selecting, among $TM_k = [z_{1k}, z_{2k}, \ldots, z_{sk}, \ldots, z_{Sk}]$ and based on Hamming distance, the mode $TM_c$ with the minimum Hamming distance to $TM_t$ as the traffic space-time mode of the sample;
the risk state of the road section in the future is determined from $E(\theta \mid y)$ and the thresholds: when $E(\theta \mid y) < \tau_1$, the road section is predicted to be in a low-risk state; when $\tau_1 \le E(\theta \mid y) < \tau_2$, the road section is predicted to be in a medium-risk state; when $\tau_2 \le E(\theta \mid y)$, the road section is predicted to be in a high-risk state; wherein $\tau_1$ and $\tau_2$ represent the thresholds for the medium-risk and high-risk states, respectively, and $E(\theta \mid y)$ is the expected value of the parameter θ.
2. The LSTM and BF-based highway risk prediction method according to claim 1, wherein the values of a and b vary with the traffic space-time mode of each sample, and the value of the parameter a is the proportion of the observation frequency $f_c$ of accident records in the sample's traffic space-time mode $TM_c$ to the total frequency of all modes, i.e.
$a = \dfrac{f_c}{\sum_{k=1}^{K} f_k}$
and the corresponding parameter b is
Figure FDA0003784033980000032
Wherein f is k And recording the observation frequency of the accidents in each traffic space-time mode.
3. The LSTM and BF-based highway risk prediction method according to claim 1, wherein a random deactivation Dropout is provided between the hidden layer and the sense fully connected output layer of the LSTM transient risk discrimination model, and the LSTM transient risk discrimination model is regularized using L2 input connection weight and learning rate decay.
4. The LSTM and BF-based highway risk prediction method according to claim 1, wherein said high frequency time varying variables comprise upstream and downstream bayonet traffic, inter-lane traffic difference, lane average traffic, road section traffic density and large/small car traffic ratio.
5. The LSTM and BF-based highway risk prediction method according to claim 1, wherein said low frequency time varying variables comprise season, days of the week, time of day and weather type.
6. The LSTM and BF based highway risk prediction method according to claim 1, wherein said constant variables comprise road section lane number, road alignment, road section entrance and exit ramp distance and road section entrance and exit ramp number, and said road section entrance and exit ramp distance continuous variables are discretized according to value intervals.
CN202110979482.XA 2021-08-25 2021-08-25 Highway risk prediction method based on LSTM and BF Active CN113744526B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110979482.XA CN113744526B (en) 2021-08-25 2021-08-25 Highway risk prediction method based on LSTM and BF

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110979482.XA CN113744526B (en) 2021-08-25 2021-08-25 Highway risk prediction method based on LSTM and BF

Publications (2)

Publication Number Publication Date
CN113744526A CN113744526A (en) 2021-12-03
CN113744526B true CN113744526B (en) 2022-12-23

Family

ID=78732681

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110979482.XA Active CN113744526B (en) 2021-08-25 2021-08-25 Highway risk prediction method based on LSTM and BF

Country Status (1)

Country Link
CN (1) CN113744526B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114582131B (en) * 2022-03-17 2023-08-29 中远海运科技股份有限公司 Monitoring method and system based on ramp intelligent flow control algorithm
CN114944055B (en) * 2022-03-29 2023-04-18 浙江省交通投资集团有限公司智慧交通研究分公司 Expressway collision risk dynamic prediction method based on electronic toll gate frame
CN114863680B (en) * 2022-04-27 2023-04-18 腾讯科技(深圳)有限公司 Prediction processing method, prediction processing device, computer equipment and storage medium
CN115376308B (en) * 2022-05-26 2024-06-04 南京工程学院 Prediction method for automobile running time
CN115565373B (en) * 2022-09-22 2024-04-05 中南大学 Expressway tunnel accident real-time risk prediction method, device, equipment and medium
CN116030627B (en) * 2022-12-31 2024-04-30 东南大学 Road traffic accident analysis method integrating predicted traffic risk variables

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103646534A (en) * 2013-11-22 2014-03-19 江苏大学 A road real time traffic accident risk control method
CN106355883A (en) * 2016-10-20 2017-01-25 同济大学 Risk evaluation model-based traffic accident happening probability acquiring method and system
CN107742193A (en) * 2017-11-28 2018-02-27 江苏大学 A kind of driving Risk Forecast Method based on time-varying state transition probability Markov chain
WO2018103313A1 (en) * 2016-12-06 2018-06-14 杭州海康威视数字技术股份有限公司 Traffic accident occurrence risk prediction method, device and system
CN108198415A (en) * 2017-12-28 2018-06-22 同济大学 A kind of city expressway accident forecast method based on deep learning
CN112949999A (en) * 2021-02-04 2021-06-11 浙江工业大学 High-speed traffic accident risk early warning method based on Bayesian deep learning
CN112990545A (en) * 2021-02-08 2021-06-18 东南大学 Traffic safety state prediction method for expressway intersection area

Also Published As

Publication number Publication date
CN113744526A (en) 2021-12-03

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right
Effective date of registration: 20221208
Address after: 557100 Yunguan Office Building, Yongle Road, Danjiang Town, Leishan County, Qiandongnan Miao and Dong Autonomous Prefecture, Guizhou Province
Applicant after: Guizhou Qiantong Zhilian Technology Co.,Ltd.
Address before: Zhenjiang City, Jiangsu Province, 212013 Jingkou District Road No. 301
Applicant before: JIANGSU University
GR01 Patent grant