CN111967712B - Traffic risk prediction method based on complex network theory - Google Patents

Publication number
CN111967712B
Authority
CN
China
Prior art keywords
traffic
network
grid
model
road
Prior art date
Legal status
Active
Application number
CN202010649490.3A
Other languages
Chinese (zh)
Other versions
CN111967712A (en)
Inventor
李大庆
郑参
Current Assignee
Beihang University
Original Assignee
Beihang University
Application filed by Beihang University
Priority to CN202010649490.3A
Publication of CN111967712A
Application granted
Publication of CN111967712B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0635 Risk analysis of enterprise or organisation activities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/10 Geometric CAD
    • G06F30/18 Network design, e.g. design based on topological or interconnect aspects of utility systems, piping, heating ventilation air conditioning [HVAC] or cabling
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/20 Design optimisation, verification or simulation
    • G06F30/27 Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/40 Business processes related to the transportation industry
    • G PHYSICS
    • G08 SIGNALLING
    • G08G TRAFFIC CONTROL SYSTEMS
    • G08G1/00 Traffic control systems for road vehicles
    • G08G1/01 Detecting movement of traffic to be counted or controlled
    • G08G1/0104 Measuring and analyzing of parameters relative to traffic conditions
    • G08G1/0125 Traffic data processing
    • G08G1/0133 Traffic data processing for classifying traffic situation
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • Evolutionary Computation (AREA)
  • Economics (AREA)
  • Geometry (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Marketing (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Educational Administration (AREA)
  • Artificial Intelligence (AREA)
  • Chemical & Material Sciences (AREA)
  • Software Systems (AREA)
  • Game Theory and Decision Science (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Development Economics (AREA)
  • Analytical Chemistry (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Traffic Control Systems (AREA)

Abstract

The invention provides a traffic risk prediction method based on complex network theory, comprising the following steps: step A: dividing a grid based on empirical data and constructing a double-layer traffic network model; step B: extracting and screening features based on complex network theory; step C: predicting risk based on ensemble learning theory; step D: evaluating and verifying the model. Through these steps, both the functional and the structural dimensions of the traffic system are considered, providing technical and theoretical support for identifying traffic risks, and supporting risk diagnosis of the traffic system, the formulation of targeted management and control measures, and the improvement of traffic operation reliability. The method is systematic, portable and easy to operate, and addresses the problem that risks in a complex traffic system are difficult to identify and predict.

Description

Traffic risk prediction method based on complex network theory
Technical Field
The invention provides a traffic risk prediction method based on a complex network theory, and relates to the technical fields of risk analysis, network science and the like.
Background
Risk refers to the possibility that an event occurs which, if it does occur, can impede the development of a system or even cause it to fail; it is also defined as the uncertainty about whether an event will occur. Risk exists objectively in a system: precautionary measures can prevent or reduce the loss it causes, but the risk itself cannot be eliminated. In a complex system, risks tend to arise suddenly, spread widely and be highly destructive, which makes them difficult to identify, predict and prevent, and poses new challenges for research on risk management, control and prevention. Because the losses caused by system risks can seriously affect people's lives and even the operation of society, it is necessary to predict risks in complex systems accurately with a scientific and reasonable method. The traffic system plays an important role in travel and urban operation. In recent years, with the rapid development of mobile internet and vehicle-mounted technology, the traffic system has become highly complex in both structure and function. In a complex and changeable environment and under changing demand, the traffic system faces man-made and natural risk events such as traffic accidents, construction closures, rainstorms and snow disasters. Such events often cause traffic congestion. Because the traffic system evolves in space and time, a risk event can spread through the system after it occurs, which adds substantial extra cost to residents' travel and wastes considerable social resources.
In current research on risk identification and prediction for traffic systems, the main approaches are model-based analysis methods, including qualitative and quantitative analysis. In particular, the structure and function of the system are described with a process flow diagram (PFD) and grey correlation analysis, and risk is identified and predicted by analyzing and quantifying how system deviations arise and how strongly the influencing factors are correlated. In addition, with the arrival of the big data era and the development of its technology, knowledge-based analysis methods have emerged, mainly causal relationship models, machine learning models and deep learning models. These methods rely on empirical data generated by the traffic system, such as traffic flow and vehicle speed; by building a historical data set and applying a model to it, they discover unknown relationships and patterns in the data and thereby identify and predict the risk state of the traffic system. However, these methods predict traffic risk only from the state of the traffic system using known models and data. They do not dynamically consider, at the network level, the correlations among risks in the traffic system and the way those risks evolve, and they have difficulty explaining the internal mechanism by which traffic risk forms.
Therefore, for traffic systems with high structural and functional complexity, the invention combines complex network theory with machine learning to identify and predict traffic system risk. It provides a new perspective and a new method for research on risk identification, prediction, management and control in traffic systems, enriches the understanding of risk in traffic systems, and is of great significance for ensuring their healthy and stable operation.
Disclosure of Invention
(I) Objects of the invention
The invention mainly addresses risk identification and prediction in the context of complex systems and network structures. Conventional methods analyze traffic system risk mainly from the system's function and cannot identify and predict that risk well. In view of the high complexity and spatio-temporal evolution of the traffic system, the invention takes the perspective of complex networks, considers both the functional and the structural dimensions of the traffic system, and provides a traffic risk prediction method based on complex network theory. The method can effectively identify and predict traffic system risk, and supports risk diagnosis of the traffic system, the formulation of targeted management and control measures, and the improvement of traffic operation reliability.
(II) technical scheme
In order to achieve the purpose, the method adopts the technical scheme that: a traffic risk prediction method based on a complex network theory is provided.
The invention relates to a traffic risk prediction method based on a complex network theory, which comprises the following steps:
step A: dividing a grid based on empirical data and constructing a double-layer traffic network model;
step B: extracting and screening features based on complex network theory;
step C: predicting risk based on ensemble learning theory;
step D: evaluating and verifying the model.
Through these steps, risk prediction for the traffic system can be achieved. The method is systematic, portable and easy to operate, and addresses the problem that risks in a complex traffic system are difficult to identify and predict.
The step A of "dividing a grid based on empirical data and constructing a double-layer traffic network model" is performed as follows. First, basic road information for the research area is acquired, which mainly comprises two parts: road information of the traffic network and longitude and latitude information of road intersections. According to the area and extent of the research area and the longitude and latitude of the road sections and intersections, the area is divided into N × M grid areas, and the grid areas are labeled. Second, at the microscopic level, a grid traffic congestion network model is constructed for each grid area from actual traffic data using complex network theory and methods, with the intersections in the grid as nodes, the road sections as edges and the relative speed of each road section as the edge weight. At the macroscopic level, each grid area is taken as a node, the existence of congested roads between two grids is the criterion for connecting them with an edge, and the number of congested roads between the grids is the edge weight; a grid node traffic network model is constructed in this way using complex network theory and methods. The specific steps are as follows:
step A1: dividing grid areas based on geographic information;
step A2: preprocessing speed data to obtain the relative speed matrix $V_{ratio}$;
step A3: constructing the grid traffic congestion network model $G_1(N_1, L_1)$;
step A4: constructing the grid node traffic network model $G_2(N_2, L_2)$;
The step A1 of "dividing the grid area based on the geographic information" is performed as follows. First, the traffic network model and the road information needed to divide the grid areas are extracted from geographic information system (MapInfo) files using Python. The extracted information mainly includes the vehicle speed on each road at each moment, the longitude and latitude of the intersections, and the network topology of the traffic system under study. To obtain the longitude and latitude of the intersections, the invention uses Python to call the Baidu Maps application programming interface (API) and traverses the intersections sequentially, matching the topology of the road network with the intersection names. Roads for which the coordinates could not be obtained, because the intersection names differ between Baidu Maps and MapInfo, are then processed separately, so that an accurate and standardized longitude and latitude data set for the road network is obtained. Second, the area S of the research area and its ranges of longitude and latitude are calculated from the road information and the intersection coordinates, and the number of grids N × M is chosen according to the actual conditions of the research area, so that each grid has area S/(N × M). Finally, for the divided grid areas, the intersections lying in each grid are determined from the longitude and latitude of each intersection in the traffic network and recorded;
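The grid assignment at the end of step A1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `assign_to_grid` and the sample coordinates are hypothetical, and the cells are equal in longitude and latitude extent rather than exactly equal in area S/(N × M).

```python
from collections import defaultdict

def assign_to_grid(intersections, n_rows, n_cols):
    """Map each intersection to a grid label (row, col) from its coordinates.

    intersections: dict name -> (lon, lat). Hypothetical input format.
    """
    lons = [lon for lon, _ in intersections.values()]
    lats = [lat for _, lat in intersections.values()]
    lon_min, lat_min = min(lons), min(lats)
    # Cell size in degrees; fall back to 1.0 if the bounding box is degenerate.
    cell_w = (max(lons) - lon_min) / n_cols or 1.0
    cell_h = (max(lats) - lat_min) / n_rows or 1.0

    grid = defaultdict(list)
    for name, (lon, lat) in intersections.items():
        # Clamp so that points on the upper boundary fall in the last cell.
        col = min(int((lon - lon_min) / cell_w), n_cols - 1)
        row = min(int((lat - lat_min) / cell_h), n_rows - 1)
        grid[(row, col)].append(name)
    return dict(grid)

# Made-up coordinates, for illustration only.
intersections = {
    "A": (116.30, 39.90),
    "B": (116.35, 39.92),
    "C": (116.45, 39.98),
    "D": (116.50, 40.00),
}
grid = assign_to_grid(intersections, n_rows=2, n_cols=2)
```

Here `grid` maps each labeled cell to the intersections it contains, which is the record that the later steps consult when building the per-grid networks.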
wherein, the "preprocessing speed data to obtain the relative speed matrix $V_{ratio}$" in step A2 is performed as follows. First, actual traffic operation data from vehicle-mounted Global Positioning System (GPS) devices are acquired. At an arbitrary time $t_i$, the speeds of all $R$ roads are written, in road order, as a vector $V_i = (v_1, v_2, \dots, v_R)$. This is repeated for all $T$ moments, and the speed vectors $V_i$ of all moments are assembled into the initial speed matrix $V$ of size $T \times R$.
Second, when the speed information of the traffic system is collected with floating car technology, the influence of network communication and of human and natural factors means that the speed of every area at every moment cannot be guaranteed to be collected completely. The original speed matrix $V$ therefore contains missing values (recorded as 0), and the invention compensates them: the matrix is searched for missing speeds, i.e. elements equal to 0, and each one is filled in. For a road $R_j$ whose speed is missing at time $t_i$, the set $\Gamma_j$ of neighbor roads of $R_j$ is first found in the road network $G(N, L)$, and the roads in the set are checked for speed records at that time. If at least one road in the set has a speed record, the mean of the recorded speeds is taken:
$$\hat{v}_j(t_i) = \frac{1}{J} \sum_{R_k \in \Gamma_j,\; v_k(t_i) \neq 0} v_k(t_i)$$
In the above formula, $\hat{v}_j(t_i)$ is the speed compensation value at time $t_i$ of the road $R_j$ whose speed is missing, the sum runs over the non-zero speeds in the neighbor road set $\Gamma_j$ of $R_j$, and $J$ is the number of non-zero elements in $\Gamma_j$;
if none of the neighbor roads of $R_j$ has a speed record, the speed of $R_j$ is compensated to 0. After each compensation pass the original speed matrix $V$ is updated to the compensated matrix, and the process is repeated at each moment until all 0 values in the speed matrix have been compensated, giving the completed speed matrix $\hat{V}$.
After the road speeds in the original absolute speed matrix have been compensated, the compensated matrix is normalized to relative speeds, because roads of different grades have different speed levels and a uniform judgment standard is needed. For any road $R_j$, the speed vector of the road at all moments is taken from $\hat{V}$, the maximum speed limit $v_j^{\max}$ of the road is extracted, and each speed is divided by $v_j^{\max}$ to obtain the normalized speed; this gives the normalized speed matrix $V_{ratio}$, where $\hat{v}_j(t_i)$ now denotes the entry of $\hat{V}$ for road $R_j$ at time $t_i$:
$$v_{j\_ratio}(t_i) = \frac{\hat{v}_j(t_i)}{v_j^{\max}}, \qquad V_{ratio} = \big[v_{j\_ratio}(t_i)\big]_{T \times R}$$
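The speed compensation and normalization of step A2 can be sketched for a single time step as follows. This is a hedged illustration: the function names, the dictionary-based data layout and the sample values are assumptions, and a missing speed is represented by 0 as in the description.

```python
def compensate(speeds, neighbors):
    """One compensation pass at a single time step.

    speeds: dict road -> speed (0 means missing).
    neighbors: dict road -> list of adjacent roads in the road network.
    """
    filled = dict(speeds)
    for road, v in speeds.items():
        if v != 0:
            continue
        recorded = [speeds[k] for k in neighbors.get(road, []) if speeds.get(k, 0) != 0]
        # Mean of the J non-zero neighbour speeds; stays 0 if none is recorded.
        filled[road] = sum(recorded) / len(recorded) if recorded else 0.0
    return filled

def compensate_until_stable(speeds, neighbors):
    """Repeat passes until a pass changes nothing (no fillable zero is left)."""
    current = dict(speeds)
    while True:
        updated = compensate(current, neighbors)
        if updated == current:
            return current
        current = updated

def normalize(speeds, speed_limits):
    """Relative speed: compensated speed divided by the road's maximum speed limit."""
    return {road: v / speed_limits[road] for road, v in speeds.items()}

# Made-up values: r2 and r4 are missing; r4 can only be filled after r2.
speeds = {"r1": 40.0, "r2": 0.0, "r3": 60.0, "r4": 0.0}
neighbors = {"r2": ["r1", "r3"], "r4": ["r2"]}
limits = {"r1": 80.0, "r2": 100.0, "r3": 60.0, "r4": 50.0}

completed = compensate_until_stable(speeds, neighbors)
relative = normalize(completed, limits)
```

Roads that remain at 0 after the loop are those with no recorded speed anywhere in their reachable neighborhood; how to treat them further is not specified above and is left open here.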
wherein, the "constructing the grid traffic congestion network model $G_1(N_1, L_1)$" in step A3 is performed as follows: for each grid area divided in step A1, first, the structural information among roads and the road intersection information contained in the grid area are extracted from the actual map data of that grid area using software tools such as Python and MapInfo. Second, a suitable geographic coverage of the traffic network is selected according to the needs of the actual research, for example the Beijing Fifth Ring Road traffic network. Then, following the complex network approach, each road intersection in a grid area is abstracted as a node, each road in the grid area is abstracted as an edge between nodes, and the relative speed of each road is taken as the weight of its edge, which establishes a grid traffic congestion network in each grid area. Because most roads of the traffic network carry traffic in both directions and are therefore directional, the grid traffic congestion network constructed by the invention is a directed weighted network;
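A directed weighted network for one grid area, as described in step A3, can be sketched with a plain adjacency dictionary. The function name, the `(road_id, from_node, to_node)` tuple layout and the sample data are assumptions made for the example; each direction of a two-way road is listed as its own entry.

```python
def build_grid_network(roads, rel_speed, cell_of, cell):
    """Directed weighted network of one grid cell.

    roads: list of (road_id, from_node, to_node), one entry per direction.
    rel_speed: dict road_id -> relative speed at the chosen time.
    cell_of: dict node -> grid label of the intersection.
    Returns adj[u][v] = relative speed of the road from u to v.
    """
    adj = {}
    for road_id, u, v in roads:
        # Keep only roads whose two end intersections lie inside this cell.
        if cell_of[u] == cell and cell_of[v] == cell:
            adj.setdefault(u, {})[v] = rel_speed[road_id]
            adj.setdefault(v, {})  # make sure the end node exists as a node
    return adj

# Made-up example: A and B lie in cell (0, 0), C lies in cell (0, 1).
roads = [("r1", "A", "B"), ("r2", "B", "A"), ("r3", "B", "C")]
rel_speed = {"r1": 0.3, "r2": 0.8, "r3": 0.5}
cell_of = {"A": (0, 0), "B": (0, 0), "C": (0, 1)}

g1 = build_grid_network(roads, rel_speed, cell_of, cell=(0, 0))
```

Storing the two directions separately lets the same intersection pair carry different weights, which is what makes the network directed as well as weighted.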
wherein, the "constructing the grid node traffic network model $G_2(N_2, L_2)$" in step A4 is performed as follows: first, an inter-grid intersection traffic network model is constructed from the intersection information contained in each grid area and the road topology of the traffic network of the whole research area (the whole network), that is, the road topology contained inside the grid areas is deleted from the whole network. Second, the number of congested roads between grid areas is counted and recorded. Finally, based on this information and using complex network theory and methods, each grid area is abstracted as a node, the existence of congested roads between two grids determines whether they are connected by an edge, and the number of congested roads between the grids is taken as the edge weight, which establishes the grid node traffic network model.
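The macroscopic network of step A4 can be sketched by counting the congested roads that cross between two grid areas. Several points here are assumptions: the function name and data layout are hypothetical, a road counts as congested when its relative speed is at or below a threshold `q`, and the inter-grid edges are treated as undirected because the text does not state a direction.

```python
from collections import Counter

def build_grid_node_network(roads, rel_speed, cell_of, q):
    """Edges between grid cells, weighted by the number of congested crossing roads."""
    weights = Counter()
    for road_id, u, v in roads:
        cu, cv = cell_of[u], cell_of[v]
        # Roads inside one cell belong to the microscopic layer and are skipped.
        if cu != cv and rel_speed[road_id] <= q:
            weights[tuple(sorted((cu, cv)))] += 1
    return dict(weights)

# Made-up example data.
roads = [("r1", "A", "C"), ("r2", "C", "A"), ("r3", "B", "D"), ("r4", "A", "B")]
rel_speed = {"r1": 0.2, "r2": 0.3, "r3": 0.9, "r4": 0.1}
cell_of = {"A": (0, 0), "B": (0, 0), "C": (0, 1), "D": (1, 0)}

g2 = build_grid_node_network(roads, rel_speed, cell_of, q=0.4)
```

Cell pairs with no congested crossing road simply do not appear in the result, which matches the rule that an edge exists only when a congested road exists between the grids.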
Wherein, the "extracting and screening features based on complex network theory" in step B is performed as follows. First, for the grid traffic congestion network and the grid node traffic network (together referred to as the double-layer traffic network) at each time $t_i$, percolation (seepage) analysis is carried out and the percolation threshold q(t) is determined. Second, for each grid traffic congestion network and each node (grid) of the grid node traffic network under the percolation threshold q(t) at each moment, features of each grid area are extracted using complex network theory and methods. These structural and functional features include the maximum congestion sub-cluster, the mean node betweenness, the mean node degree, the average speed of the grid congestion network, and the number of congested roads among first-order neighbors. On this basis, the extracted features are screened with a machine learning method: the features that contribute most to traffic risk identification and prediction are selected to build a high-quality sample feature set, improving the effect and efficiency of risk identification and prediction as far as possible. Meanwhile, each grid area at time t is labeled according to the proportion of congested roads in that grid area at time t + Δt. The specific steps are as follows:
step B1: percolation analysis of the traffic network;
step B2: extracting risk features based on complex networks;
step B3: screening risk features based on machine learning;
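The labeling rule stated in step B, where a grid area at time t is labeled from the proportion of congested roads it contains at time t + Δt, might look like the following sketch. The cut-off `ratio_threshold` is a hypothetical parameter, since the text does not give the proportion that separates high risk from low risk, and congestion is again taken to mean a relative speed at or below `q`.

```python
def label_grid(rel_speeds_future, q, ratio_threshold=0.5):
    """Return 1 (high risk) if the share of congested roads at t + dt exceeds the cut-off.

    rel_speeds_future: relative speeds of the roads of one grid area at time t + dt.
    ratio_threshold: hypothetical cut-off, not specified in the description.
    """
    if not rel_speeds_future:
        return 0
    congested = sum(1 for v in rel_speeds_future if v <= q)
    return int(congested / len(rel_speeds_future) > ratio_threshold)

# Made-up examples: two of three roads congested, then one of four.
label_a = label_grid([0.2, 0.3, 0.9], q=0.4)
label_b = label_grid([0.2, 0.8, 0.9, 0.7], q=0.4)
```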
wherein, the "percolation analysis of the traffic network" in step B1 is performed as follows: percolation (seepage) theory is applied to the double-layer traffic network. First, a control variable, the percolation threshold q(t), is assigned to the traffic network at each moment, so that each road in the traffic network is in one of two states: unblocked ($v_{i\_ratio}(t) > q(t)$) or congested ($v_{i\_ratio}(t) \le q(t)$). The unblocked edges are deleted from the original network and the congested edges are kept; the remaining network is the traffic network in the congested state at time t, referred to as the congestion network. Each value of q(t) thus corresponds to one congestion network at each moment. As q(t) decreases, the criterion for congestion becomes stricter, more edges are removed, and the congestion network becomes sparser. A suitable percolation threshold q(t) is therefore selected, namely the one at which the urban traffic network carries the richest congestion information, and the traffic congestion risk at the current moment is identified and predicted at that threshold;
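The thresholding in step B1 can be sketched as follows: for a given q, keep the congested roads and measure the largest connected cluster of the resulting congestion network. The helper names and the sample data are assumptions, cluster size is counted in nodes of the weakly connected component, and the sketch does not decide which q carries the "richest congestion information", because that criterion is not spelled out above.

```python
def congested_edges(rel_speed, q):
    """Roads whose relative speed is at or below the threshold q."""
    return {road for road, v in rel_speed.items() if v <= q}

def largest_cluster_size(edges):
    """Number of nodes in the largest weakly connected component (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for u, v in edges:
        parent[find(u)] = find(v)
    sizes = {}
    for node in list(parent):
        root = find(node)
        sizes[root] = sizes.get(root, 0) + 1
    return max(sizes.values(), default=0)

# Made-up road network: road id -> (from intersection, to intersection).
roads = {"r1": ("A", "B"), "r2": ("B", "C"), "r3": ("D", "E"), "r4": ("C", "D")}
rel_speed = {"r1": 0.2, "r2": 0.3, "r3": 0.35, "r4": 0.8}

sizes = {
    q: largest_cluster_size([roads[r] for r in congested_edges(rel_speed, q)])
    for q in (0.25, 0.4, 0.9)
}
```

Sweeping q in this way shows how the congestion network thins out as the threshold is lowered, which is the behaviour the threshold selection relies on.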
wherein, the "risk feature extraction based on complex networks" in step B2 is performed as follows: in this step, the grid traffic congestion network and the grid node traffic network are constructed at each moment under the percolation threshold q(t). From the viewpoint of statistical physics, complex network theory and methods are used to preliminarily extract the microscopic and macroscopic features of the grid areas of the double-layer traffic network at each moment, from the two perspectives of structure and function. First, at the microscopic level, each grid traffic congestion network is taken as the research object, and the microscopic features of each grid area are calculated at the critical percolation threshold at each moment. The grid traffic congestion network has different features at different moments, and the congestion network within a grid area changes in space as time evolves, so it has spatio-temporal characteristics. Second, at the macroscopic level, the nodes (grid areas) of the constructed grid node traffic network model are taken as the research objects, and the macroscopic features of each grid area (node) are calculated at each moment, as shown in fig. 2. The microscopic features include the maximum congestion sub-cluster, the mean node betweenness, the mean node degree, the mean clustering coefficient, the average speed of the grid traffic congestion network and the growth rate of the congestion network. The macroscopic features include the average path length, node strength, node betweenness, node degree and growth rate of the nodes of the grid node traffic network;
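A few of the microscopic features named in step B2 can be computed directly from the congested edges of one grid area. The sketch below covers only the number of congested roads, the mean node degree and the mean relative speed; the function name, the edge tuple layout and the sample values are assumptions, and features such as betweenness or clustering coefficient are omitted for brevity.

```python
def grid_features(edges):
    """Simple structural and functional features of one grid's congestion network.

    edges: list of (u, v, rel_speed) for the congested roads of the grid area.
    """
    degree = {}
    for u, v, _ in edges:
        degree[u] = degree.get(u, 0) + 1  # outgoing edge at u
        degree[v] = degree.get(v, 0) + 1  # incoming edge at v
    n_nodes = len(degree)
    return {
        "n_congested_roads": len(edges),
        "mean_degree": sum(degree.values()) / n_nodes if n_nodes else 0.0,
        "mean_rel_speed": sum(w for _, _, w in edges) / len(edges) if edges else 0.0,
    }

# Made-up congestion network of one grid area.
features = grid_features([("A", "B", 0.25), ("B", "C", 0.5), ("C", "A", 0.75)])
```

One such feature dictionary per grid area and per moment gives the rows of the initial sample feature set.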
The invention provides a method for extracting features from the perspective of complex networks, and the grid features above are given as examples. Depending on the actual background and conditions of the traffic system under study, features can be preliminarily extracted in a targeted manner from the two aspects of structure and function, so as to build a sample feature set and construct the initial feature matrix $M_f$.
The step B3 of "risk feature screening based on machine learning" is performed as follows: in step B2, the functional and structural features of the grid areas at each moment are extracted based on complex network theory, and the initial feature matrix $M_f$ is constructed. In order to improve the accuracy and precision of risk identification and prediction in the traffic system, this step applies machine learning methods to perform feature selection on the preliminarily constructed sample feature set, so as to screen out a high-quality sample feature set and improve the effect of risk identification and prediction as far as possible. Screening the structural and functional features of the traffic system keeps the important features and removes the irrelevant ones, which alleviates the curse of dimensionality, reduces the difficulty of the learning task, reduces overfitting and enhances the generalization capability of the machine learning model. In view of the highly complex spatio-temporal evolution of the traffic system, and in order to optimize a given learner, the invention selects features with the classical wrapper method LVW (Las Vegas Wrapper), as shown in fig. 3. The specific steps are as follows:
(1) Set the initial optimal error E to infinity, set the current optimal feature subset to the complete attribute set A, and set the repetition count t = 0;
(2) Randomly generate a feature subset A', and calculate the error E' of the classifier when this feature subset is used;
(3) If E' is smaller than E, let A = A' and E = E', and repeat steps (2) and (3); otherwise, let t = t + 1, and exit the loop when t is greater than or equal to the stop control parameter T;
During the computation, the LVW method takes the performance of the learner that will finally be used as the evaluation criterion for a feature subset and selects, for the given learner, the tailored feature subset most beneficial to its performance. In this way the high-quality sample feature set is screened out and the screened feature matrix is constructed.
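Steps (1) to (3) of the LVW procedure can be sketched as follows. The function name `lvw`, the feature names and the error function `toy_error` are hypothetical; in practice the error would come from evaluating the chosen learner (for example by cross-validation) on the candidate subset. The sketch follows the steps as listed, so the counter is not reset after an improvement; the textbook form of LVW resets it and also accepts a subset of equal error but smaller size.

```python
import random

def lvw(features, error_of, max_stall, rng):
    """Las Vegas Wrapper feature selection following steps (1)-(3).

    error_of: callable(subset) -> classifier error (hypothetical stand-in).
    max_stall: stop control parameter T.
    """
    best_subset, best_error = list(features), float("inf")  # step (1)
    stall = 0
    while stall < max_stall:
        k = rng.randint(1, len(features))
        candidate = sorted(rng.sample(features, k))         # step (2)
        err = error_of(candidate)
        if err < best_error:                                 # step (3)
            best_subset, best_error = candidate, err
        else:
            stall += 1
    return best_subset, best_error

# Toy error: penalise missing useful features strongly and extra features weakly.
useful = {"largest_cluster", "mean_degree"}

def toy_error(subset):
    chosen = set(subset)
    return len(useful - chosen) + 0.1 * len(chosen - useful)

features = ["largest_cluster", "mean_betweenness", "mean_degree", "mean_speed"]
subset, err = lvw(features, toy_error, max_stall=50, rng=random.Random(0))
```

Because the search is random, the returned subset depends on the seed and on the stop parameter; a larger `max_stall` trades running time for a better chance of finding the best subset.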
Wherein, the "risk identification and prediction based on ensemble learning theory" in step C is performed as follows. In order to accurately identify and predict congestion risk in the traffic system and control it effectively, an ensemble learning model is first constructed using machine learning and the relevant mathematics. Second, in order to eliminate the influence of non-uniform dimensions among the feature vectors on the model, a feature scaling method is used to standardize the screened feature matrix, giving a standard sample feature matrix. Finally, so that the model learns as much as possible about the characteristics of risk in the traffic system, the standard sample feature matrix is divided into a training set and a test set in a certain proportion (a : b); the ensemble learning model is trained on the training set, and the trained model is then used to identify and predict the risk of each grid area of the traffic system at the current moment. The specific steps are as follows:
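The feature scaling and the a : b split described for step C can be sketched as follows. The choice of z-score standardization is an assumption, since the text only says "feature scaling"; the function names and the sample data are hypothetical, and scaling is applied before the split because that is the order given above.

```python
import random
import statistics

def standardize(rows):
    """Z-score each feature column (zero mean, unit population standard deviation)."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    stds = [statistics.pstdev(c) or 1.0 for c in cols]  # guard constant columns
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]

def split(rows, labels, train_share, rng):
    """Shuffle the samples and split them into training and test sets."""
    idx = list(range(len(rows)))
    rng.shuffle(idx)
    cut = int(len(idx) * train_share)
    train, test = idx[:cut], idx[cut:]
    return ([rows[i] for i in train], [labels[i] for i in train],
            [rows[i] for i in test], [labels[i] for i in test])

# Made-up feature rows and labels; a : b = 3 : 1 gives train_share = 0.75.
rows = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]]
labels = [0, 0, 1, 1]
Z = standardize(rows)
X_train, y_train, X_test, y_test = split(Z, labels, train_share=0.75, rng=random.Random(0))
```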
step C1: constructing an ensemble learning model;
step C2: identifying and predicting risk with the ensemble learning model;
wherein, the "constructing the ensemble learning model" in step C1 is performed as follows: the invention aims to learn a more stable and better-performing model from historical risk data of the traffic system. Compared with a single classifier, an ensemble learning model learns better; to make up for the shortcomings of a single classifier, the invention introduces ensemble learning theory and constructs an ensemble learning model to identify and predict traffic system risk. Ensemble learning combines several weakly supervised models to obtain a better and more comprehensive strongly supervised model. Its underlying idea is that even if one weak classifier makes a wrong prediction, the other weak classifiers can correct the error. The current mainstream ensemble learning frameworks are Bagging, Boosting and Stacking. The invention uses the Bagging framework and the associated theory and methods of ensemble learning to construct a random forest model that identifies and predicts traffic system risk, as shown in fig. 4. The implementation steps are as follows:
(1) Suppose there is a data set $D = \{x_{i1}, x_{i2}, \ldots, x_{in}, y_i\}$ ($i \in [1, m]$) with $n$ features; sampling with replacement generates a sampling space of size $(m \times n)^{m \times n}$;
(2) Building the base learners (decision trees): for each sample set $d_j = \{x_{i1}, x_{i2}, \ldots, x_{ik}, y_i\}$ ($i \in [1, m]$) (where K < M), a decision tree is generated, and the result $h_j(x)$ of each decision tree is recorded;
(3) Training is repeated T times to obtain
Figure BDA0002574365680000092
where φ(x) is a combination strategy, for example absolute majority voting, relative majority (plurality) voting, or weighted voting;
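The combination strategies named in step (3) can be sketched as follows. This is a minimal illustration in plain Python, not the patent's implementation: the function names are hypothetical, and `votes` is assumed to hold the outputs $h_j(x)$ of the T decision trees for one sample.

```python
from collections import Counter


def plurality_vote(votes):
    """Relative majority voting: the label with the most votes wins."""
    return Counter(votes).most_common(1)[0][0]


def absolute_majority_vote(votes, reject=None):
    """Absolute majority voting: a label wins only with more than half of the votes."""
    label, count = Counter(votes).most_common(1)[0]
    return label if count > len(votes) / 2 else reject


def weighted_vote(votes, weights):
    """Weighted voting: each base learner's vote counts with its own weight."""
    totals = {}
    for label, weight in zip(votes, weights):
        totals[label] = totals.get(label, 0.0) + weight
    return max(totals, key=totals.get)


# Three trees vote on one grid area (1 = high risk, 0 = low risk).
tree_outputs = [1, 0, 1]
combined = plurality_vote(tree_outputs)
```

With two classes and an odd number of trees, plurality and absolute majority voting give the same answer; they differ only when no label exceeds half of the votes.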
through the above process a binary classifier, namely a random forest model, is constructed to identify and predict risks in the traffic system; the classification function is a sign-type function whose output values are 0 and 1, representing low risk and high risk in a grid area respectively, as follows:
Figure BDA0002574365680000101
in the above formula, $f(x_i)$ represents the risk state of the i-th grid area, where 0 denotes low risk and 1 denotes high risk;
meanwhile, when ensemble learning theory is applied to construct a model for identifying and predicting risks of the traffic system, a suitable ensemble learning framework and model can be chosen according to the distribution characteristics of the data samples, further improving the effect of risk identification and prediction;
in step C2, the "performing risk identification and prediction with the ensemble learning model" is done as follows: in this step, based on the high-quality sample feature set extracted and screened in step B, i.e. the feature matrix
Figure BDA0002574365680000102
risks in the traffic system are identified and predicted with the ensemble learning model constructed in step C1; because differences in scale between features in the historical sample data set affect the performance of the ensemble learning model, the sample feature set of the object under study must first be feature-scaled before the model is used for risk identification and prediction, which eliminates the influence of differing feature scales on model accuracy, speeds up model convergence, and yields the standard sample feature matrix
Figure BDA0002574365680000103
The mainstream feature scaling methods in machine learning mainly include min-max normalization, mean normalization, and scaling to unit length; for the sample feature set
Figure BDA0002574365680000104
of the traffic system, a suitable one of these mainstream methods can be selected in practice according to the actual traffic system, the characteristics of the data feature set, and the characteristics of the machine learning method applied, thereby ensuring the maximum accuracy and precision of risk identification and prediction in the traffic system;
After the sample data set of the traffic system has been feature-scaled, in this step, based on the standard sample feature matrix
Figure BDA0002574365680000105
of the traffic system, risks are identified and predicted with the ensemble learning model constructed in step C1; because the ensemble learning model needs to learn the characteristics of risk in this process, the standard sample feature set
Figure BDA0002574365680000106
is randomly divided into a training set and a test set in a given proportion (a:b); the training set is used to train the random forest model so that it learns the risk characteristics as fully as possible, and the test set is used to test how well the model has been trained.
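The three scaling methods named in this step can be sketched per feature column as follows. This is a minimal illustration in plain Python; the function names and the treatment of a constant column (mapped to zeros to avoid division by zero) are assumptions, not part of the patent.

```python
import math


def min_max_normalize(column):
    """Min-max normalization: map the column linearly onto [0, 1]."""
    low, high = min(column), max(column)
    if high == low:  # assumption: a constant column carries no information
        return [0.0 for _ in column]
    return [(x - low) / (high - low) for x in column]


def mean_normalize(column):
    """Mean normalization: subtract the mean, divide by the range."""
    low, high = min(column), max(column)
    mean = sum(column) / len(column)
    if high == low:
        return [0.0 for _ in column]
    return [(x - mean) / (high - low) for x in column]


def scale_to_unit_length(vector):
    """Scaling to unit length: divide the vector by its Euclidean norm."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0:
        return [0.0 for _ in vector]
    return [x / norm for x in vector]
```

Min-max and mean normalization act on one feature across all samples, whereas scaling to unit length is usually applied to the feature vector of a single sample.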
Wherein the "model evaluation and verification" in step D is performed as follows: when the ensemble learning model constructed in step C is used to identify and predict risk in the traffic system, the performance of the model must be evaluated accurately and scientifically; in this step, first, evaluation indexes are chosen according to the actual traffic system and the final goal of the invention, for example accuracy, precision, recall, and F1 score, all of which are calculated from the confusion matrix; secondly, to prevent over-fitting and accurately assess the generalization ability of the model, the ensemble learning model is evaluated by cross-validation, further improving the rigor and reliability of the evaluation; the step comprises the following sub-steps:
step D1: selecting a model evaluation index;
step D2: evaluating and analyzing the model;
the "selecting model evaluation indexes" in step D1 is performed as follows: the invention identifies and predicts risk in the traffic system, and its final goal is to identify that risk accurately and scientifically with the ensemble learning model; in essence this is an anomaly detection problem in machine learning, whose main characteristic is class imbalance, i.e. normal samples are numerous and risk samples are few, so accuracy alone cannot objectively reflect model performance; for this risk identification scenario, the model is evaluated with two indexes, recall and accuracy, defined as follows:
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
$$\mathrm{Recall} = \frac{TP}{TP + FN}$$
in the formulas, Accuracy denotes the accuracy and Recall denotes the recall; TP is the number of positive cases correctly predicted; TN is the number of negative cases correctly predicted; FP is the number of negative cases predicted as positive; FN is the number of positive cases predicted as negative;
more attention should be paid to cases in which genuinely risky units in the traffic system are mispredicted, because if a real congestion risk is not identified, the traffic system will be severely harmed once it occurs; recall therefore deserves particular attention; meanwhile, to ensure that normal samples are correctly predicted as normal, to reduce the misprediction rate on normal samples, and to allow managers of the traffic system to control real risks as precisely as possible under limited resources, both accuracy and recall are adopted as evaluation indexes of the model;
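The two indexes can be computed from predicted and true labels as in the sketch below (plain Python; the function names and the sample labels are illustrative assumptions, with 1 denoting a high-risk grid area and 0 a low-risk one).

```python
def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = positive, 0 = negative)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Accuracy = (TP + TN) / (TP + TN + FP + FN)."""
    return (tp + tn) / (tp + tn + fp + fn)


def recall(tp, fn):
    """Recall = TP / (TP + FN); defined as 0.0 here when there are no positives."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Imbalanced toy example: 3 risky grid areas out of 10.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 1, 0]
tp, tn, fp, fn = confusion_counts(y_true, y_pred)
```

In this toy case the accuracy is 0.7 while the recall is only 1/3, which illustrates why accuracy alone can be misleading when risk samples are rare.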
the "evaluating and analyzing the model" in step D2 is performed as follows: in this step, to prevent over-fitting and accurately assess the generalization ability of the model, the ensemble learning model is evaluated by a cross-validation method from machine learning, further improving the rigor and reliability of the evaluation; the classical methods include the leave-one-out method, K-fold cross-validation, and the bootstrap method; the invention uses the bootstrap method, with the following steps:
(1) Each time, one sample is randomly selected from a data set containing N samples and used as a training sample;
(2) The sample selected in step (1) is put back into the original data set; sampling with replacement in this way N times generates a data set of the same size as the original, and this new data set is the training set;
(3) After N draws, approximately $(1 - 1/N)^N \approx 1/e \approx 36.8\%$ of the samples in the original data set will not appear in the new data set; these samples are therefore used as the validation set;
(4) Repeating the above steps M times trains M models and yields M values of each evaluation index; their average is taken as the performance estimate of the model.
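Steps (1) to (4) can be sketched as follows. This is a minimal illustration in plain Python under stated assumptions: `train_and_score` is a hypothetical callback that trains a model on the training indices and returns one evaluation value on the validation indices, and the seed is only there to make the sketch reproducible.

```python
import random


def bootstrap_split(n_samples, rng):
    """Draw n_samples indices with replacement; unseen indices form the validation set."""
    train_idx = [rng.randrange(n_samples) for _ in range(n_samples)]
    drawn = set(train_idx)
    val_idx = [i for i in range(n_samples) if i not in drawn]
    return train_idx, val_idx


def bootstrap_evaluate(n_samples, n_rounds, train_and_score, seed=0):
    """Repeat the bootstrap split n_rounds times and average the scores."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_rounds):
        train_idx, val_idx = bootstrap_split(n_samples, rng)
        scores.append(train_and_score(train_idx, val_idx))
    return sum(scores) / len(scores)


train_idx, val_idx = bootstrap_split(1000, random.Random(0))
held_out_fraction = len(val_idx) / 1000  # close to 1/e for large N
```

The training set keeps duplicates on purpose, so it has the same size as the original data set while the validation set contains only samples the model never saw.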
Through the above steps, based on complex network theory and ensemble learning theory, both the function and the structure of the traffic system are considered from the complex network perspective, providing scientific and reliable technical and theoretical support for the identification of traffic risk; the technical method provided by the invention can identify and predict traffic system risk efficiently and accurately, and provides important support for risk diagnosis of the traffic system, for formulating targeted management and control measures, and for improving the reliability of traffic operation.
(III) advantages and effects
The invention provides a traffic risk prediction method based on a complex network theory, which has the following advantages:
(1) Global perspective: the invention constructs traffic network models at both the micro and macro levels to extract functional and structural features, which greatly improves the accuracy of risk prediction for the traffic system and is of great significance for understanding the risk evolution mechanism of the traffic system and improving its reliability;
(2) Timeliness: the invention can monitor the traffic state and predict future risk in real time, providing strong support for formulating and implementing risk control strategies for the traffic system and thereby ensuring healthy and stable operation of the system;
(3) Expandability: the risk prediction method provided by the invention can be expanded to the risk identification and prediction of other types of complex systems, such as biological systems, communication systems, financial systems and the like.
(4) The method of the invention is scientifically sound and practicable, and has broad value for popularization and application.
Drawings
Fig. 1 is a flow chart of a traffic risk prediction method according to the present invention.
FIG. 2 is a traffic risk profile of the present invention.
FIG. 3 is a logic diagram of the wrapper-type feature selection process of the present invention.
Fig. 4 is a random forest model architecture diagram of the present invention.
FIG. 5 is a trend chart of evaluation indexes of the random forest model of the present invention.
The sequence numbers, symbols and code numbers in the figure are explained as follows:
S: the area of the study region;
$V_i$: the speed vector of the R roads at time $t_i$;
Figure BDA0002574365680000131
the initial speed matrix;
Figure BDA0002574365680000132
the compensated and normalized speed matrix;
$G_1(N_1, L_1)$: the grid traffic congestion network model;
$G_2(N_2, L_2)$: the grid node traffic network model;
q(t): the percolation threshold of the traffic network at time t;
$V_{i\_ratio}$: the normalized speed vector;
$M_f$: the initial feature matrix;
Figure BDA0002574365680000133
the screened high-quality feature matrix;
Figure BDA0002574365680000141
the high-quality feature matrix after feature scaling;
$f(x_i)$: the risk state of the i-th grid area;
Accuracy: the model accuracy;
Recall: the model recall;
TP: the number of correctly predicted positive examples;
TN: the number of negative cases correctly predicted;
FP: predicting negative examples as the number of positive examples;
FN: the positive examples are predicted as the number of negative examples.
Detailed Description
In order to make the technical problems and technical solutions to be solved by the present invention clearer, the following detailed description is made with reference to the accompanying drawings and specific embodiments. It is to be understood that the embodiments described herein are for purposes of illustration and explanation only and are not intended to limit the invention.
The invention is further described in the following description and embodiments with reference to the drawings.
The actual traffic system data used in the embodiment of the invention are real-time floating-car speed data, provided by QF technology corporation, for each road section of all roads within the Fifth Ring Road of Beijing over a certain time span, recorded at a time interval of 1 minute (a relatively fine time granularity) and covering the period 0:00-23.
The traffic risk prediction method based on the complex network theory of the embodiment of the invention is shown in figure 1, and the specific implementation steps are as follows:
step A: dividing grids based on empirical data to construct a double-layer traffic network model;
step B: extracting and screening features based on complex network theory;
step C: performing risk prediction based on ensemble learning theory;
step D: evaluating and verifying the model.
Through the above steps, risk prediction for the traffic system can be achieved; the method is systematic, highly portable and easy to operate, and addresses the difficulty of identifying and predicting risks in complex traffic systems.
Step A, "dividing grids based on empirical data and constructing a double-layer traffic network model", is performed as follows: first, basic information on the roads in the study area is acquired, mainly the road information of the traffic network and the longitude and latitude of road intersections; according to the area and extent of the study area and the coordinates of road sections and intersections, the study area is divided into N×M grid areas, and the grid areas are labeled; secondly, for each grid area, on the microscopic level, a grid traffic congestion network model is constructed from actual traffic data using complex network theory and methods, with the intersections in the grid as nodes, road sections as edges, and the relative speed of each road section as the edge weight; on the macroscopic level, a grid node traffic network model is constructed using complex network theory and methods, with each grid area as a node, the existence of congested roads between two grids as the criterion for connecting them with an edge, and the number of congested roads between the grids as the edge weight.
Step A1: dividing grid areas based on geographic information;
step A2: preprocessing the speed data to obtain the relative speed matrix
Figure BDA0002574365680000151
Step A3: constructing the grid traffic congestion network model $G_1(N_1, L_1)$;
Step A4: constructing the grid node traffic network model $G_2(N_2, L_2)$;
The "dividing grid areas based on geographic information" in step A1 is performed as follows: first, the traffic network model and traffic road information required for grid division are extracted from MapInfo files using Python; the extracted information mainly includes the vehicle speed of each road at each time, the longitude and latitude of intersections, and the network topology of the traffic system within the Fifth Ring Road of Beijing; secondly, from the obtained road information and intersection coordinates, the area S within the Fifth Ring Road of Beijing is calculated to be 667 square kilometers, with a longitude range of 116.20-116.56 and a latitude range of 39.76-40.03, and according to the actual conditions within the Fifth Ring Road the number of grids is set to 2500, so that each grid has a side length of about 516 m (667 km² / 2500 ≈ 0.267 km² per grid); finally, for each grid area, the intersections lying within it are determined from the longitude and latitude of each intersection in the traffic network and recorded.
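The grid arithmetic and the assignment of intersections to grids can be sketched as follows. This is a minimal illustration in plain Python: the 50 × 50 layout of the 2500 grids, the function names, and the row/column labeling are assumptions made for the example, since the patent states only the total number of grids and the coordinate ranges.

```python
import math

LON_RANGE = (116.20, 116.56)  # longitude range stated in the embodiment
LAT_RANGE = (39.76, 40.03)    # latitude range stated in the embodiment


def cell_side_m(area_km2, n_cells):
    """Side length in metres of a square cell when area_km2 is split into n_cells."""
    return math.sqrt(area_km2 * 1e6 / n_cells)


def grid_index(lon, lat, n_cols=50, n_rows=50):
    """Return (row, col) of the grid containing the point; edges are clamped."""
    col = int((lon - LON_RANGE[0]) / (LON_RANGE[1] - LON_RANGE[0]) * n_cols)
    row = int((lat - LAT_RANGE[0]) / (LAT_RANGE[1] - LAT_RANGE[0]) * n_rows)
    return min(max(row, 0), n_rows - 1), min(max(col, 0), n_cols - 1)


side = cell_side_m(667, 2500)            # about 516.5 m
example_cell = grid_index(116.30, 39.80)  # intersection at a sample coordinate
```

Grouping intersections by the returned `(row, col)` label gives, for each grid area, the list of intersections it contains.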
The "preprocessing the speed data to obtain the relative speed matrix
Figure BDA0002574365680000161
" in step A2 is performed as follows: in this step, first, from the actual traffic operation data of the vehicle-mounted Global Positioning System (GPS), the speeds of all R roads at any time $t_i$ are obtained and, following the order of the roads, expressed as a vector $V_i = (v_1, v_2, \ldots, v_R)$; this is repeated for all T times, and the speed vectors $V_i$ at all times are finally assembled into the initial speed matrix
Figure BDA0002574365680000162
Secondly, when speed information for the traffic system within the Fifth Ring Road of Beijing is collected with floating-car technology, the speed of every area at every time cannot be completely collected and retained because of limitations of network communication technology and human and natural factors, so the original speed information must undergo speed compensation; that is, the original speed matrix
Figure BDA0002574365680000163
contains some missing values (recorded as 0), so it is necessary to find the missing speed values in the speed matrix
Figure BDA0002574365680000164
i.e. the elements equal to 0, and compensate them; to compensate the speed of road $R_j$ at time $t_i$, first find in the road network G(N, L) the neighbor road set of road $R_j$,
Figure BDA0002574365680000165
and check whether the roads in the set have speed records at that time; if at least one element of the set has a speed record, the average of the recorded elements is taken, according to the following formula:
Figure BDA0002574365680000166
in the above formula,
Figure BDA0002574365680000167
denotes the speed compensation value of the speed-missing road $R_j$ at time $t_i$;
Figure BDA0002574365680000168
denotes the sum of the non-zero element values in the neighbor road set
Figure BDA0002574365680000169
of the speed-missing road $R_j$; and J denotes the number of non-zero elements in the neighbor road set
Figure BDA00025743656800001610
of road $R_j$.
If none of the neighbor roads of road $R_j$ has a speed record, the speed of road $R_j$ is compensated as 0 for the time being; after each compensation pass, the original speed matrix
Figure BDA00025743656800001611
is updated to the compensated matrix
Figure BDA00025743656800001612
The above process is repeated at each time until all 0 values in the speed matrix have been compensated, giving the completed speed matrix
Figure BDA00025743656800001613
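The neighbor-averaging compensation described above can be sketched for a single time $t_i$ as follows. This is a minimal illustration in plain Python; the data layout (a list of speeds and a dictionary of neighbor indices), the function name, and the pass limit are assumptions, and each pass reads only the values of the previous pass, matching the "update after each compensation" description.

```python
def compensate_speeds(speeds, neighbors, max_passes=100):
    """Fill missing speeds (recorded as 0) with the mean of recorded neighbor speeds.

    speeds    -- speeds of all roads at one time, 0 meaning "missing"
    neighbors -- dict mapping a road index to the indices of its neighbor roads
    """
    current = list(speeds)
    for _ in range(max_passes):
        updated = list(current)
        changed = False
        for j, value in enumerate(current):
            if value != 0:
                continue
            recorded = [current[k] for k in neighbors.get(j, []) if current[k] != 0]
            if recorded:  # at least one neighbor has a speed record
                updated[j] = sum(recorded) / len(recorded)
                changed = True
        current = updated
        if not changed:  # nothing left that can be compensated
            break
    return current


# Four roads in a chain 0-1-2-3; roads 1 and 2 have no record.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
filled = compensate_speeds([60, 0, 0, 30], chain)
```

A road whose neighbors never obtain a record (for example an isolated road) keeps the value 0, so the loop stops once a pass changes nothing.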
After road speed compensation of the original absolute speed matrix
Figure BDA0002574365680000171
is completed, the compensated speed matrix is normalized to obtain relative speeds, because roads of different grades have different speed levels and a unified judgment standard is needed. For any road $R_j$, from the speed matrix
Figure BDA0002574365680000172
the speed vector of the road section at all times,
Figure BDA0002574365680000173
is taken, and the maximum speed limit of the road section,
Figure BDA0002574365680000174
is extracted; the speed vector
Figure BDA0002574365680000175
is divided by the maximum speed limit
Figure BDA0002574365680000176
to obtain the normalized speed
Figure BDA0002574365680000177
which gives the normalized speed matrix
Figure BDA0002574365680000178
as follows:
Figure BDA0002574365680000179
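The normalization can be sketched as a column-wise division of the compensated speed matrix by each road's maximum speed limit. This is a minimal illustration in plain Python; the matrix layout (rows are times, columns are roads) and the function name are assumptions.

```python
def normalize_speeds(speed_matrix, speed_limits):
    """Divide every speed by the maximum speed limit of its road.

    speed_matrix -- list of rows, one row per time, one column per road
    speed_limits -- maximum speed limit of each road, in the same column order
    """
    return [
        [speed / speed_limits[j] for j, speed in enumerate(row)]
        for row in speed_matrix
    ]


# Two times, two roads with speed limits of 60 and 80.
relative = normalize_speeds([[30, 40], [60, 20]], [60, 80])
```

A relative speed of 0.5 then has the same meaning on an expressway and on a side street, which is what makes a single congestion threshold usable across road grades.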
The "constructing the grid traffic congestion network model $G_1(N_1, L_1)$" in step A3 is performed as follows: for each grid area divided in step A1, first, from the actual map data of the area within the Fifth Ring Road of Beijing, the road structure information and road intersection information contained in each grid area are extracted with software tools such as Python and MapInfo; secondly, the Beijing Fifth Ring traffic network is selected; then, following the complex network method, the road intersections in each grid area are abstracted as nodes, the roads in the grid area as edges between nodes, and the relative speed of each road as the edge weight, thereby establishing a grid traffic congestion network in each grid area; moreover, since most roads of the Beijing Fifth Ring traffic network carry two-way traffic and are therefore directional, the grid traffic congestion network constructed by the invention is a directed weighted network.
The "constructing the grid node traffic network model $G_2(N_2, L_2)$" in step A4 is performed as follows: first, an inter-grid intersection traffic network model is constructed from the intersection information contained in each grid area and the road topology of the whole Beijing Fifth Ring traffic network (the whole network), i.e. the road topology contained inside the grid areas is deleted from the whole network; secondly, the number of congested roads between grid areas is counted and recorded; finally, based on this information and using complex network theory and methods, each grid area is abstracted as a node, the existence of congested roads between two grids determines whether they are connected by an edge, and the number of congested roads between the grids is taken as the edge weight, thereby establishing the grid node traffic network model.
Step B, "extracting and screening features based on complex network theory", is performed as follows: first, for the grid traffic congestion network and the grid node traffic network (together called the double-layer traffic network) at each time $t_i$, a percolation threshold q(t) is set for percolation analysis, and q(t) = 0.5 is determined through percolation analysis of the double-layer traffic network; secondly, with the percolation threshold at 0.5, for each grid traffic congestion network and for each node (grid) of the grid node traffic network at each time, the features of each grid area are extracted using complex network theory and methods, including structural and functional features such as the maximum congestion sub-cluster, mean node betweenness, mean node degree, average speed of the grid congestion network, and number of congested roads among first-order neighbors; on this basis, the extracted features are screened by machine learning methods, and the features that contribute most to traffic risk identification and prediction are selected to construct a high-quality sample feature set, maximizing the effectiveness and efficiency of traffic risk identification and prediction; meanwhile, each grid area at time t is labeled according to the proportion of congested roads in that grid area at time t + Δt. The specific steps of the process are as follows:
step B1: percolation analysis of the traffic network;
step B2: extracting risk features based on the complex network;
step B3: screening risk features based on machine learning;
the "percolation analysis of the traffic network" in step B1 is performed as follows: percolation theory is applied to the double-layer traffic network; first, a control variable, the percolation threshold q(t), is given for the traffic network at each time, so that each road in the traffic network is in one of two states: unblocked (i.e. $v_{i\_ratio}(t) > q(t)$) or congested (i.e. $v_{i\_ratio}(t) \le q(t)$); the unblocked edges are deleted from the original network and the congested edges are retained, and the remaining network is the traffic network in the congested state at time t, called the congestion network for short; each value of q(t) thus corresponds to one congestion network at each time, and as q(t) decreases, fewer roads count as congested, more edges are removed, and the congestion network becomes sparser; a suitable percolation threshold q(t) = 0.5 is therefore chosen, at which the urban traffic network is in the stage richest in congestion information, and the traffic congestion risk at the current time is identified and predicted on that basis;
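The thresholding step, and one structural quantity that can be read from the resulting congestion network, can be sketched as follows. This is a minimal illustration in plain Python; the edge format `(u, v, relative_speed)`, the function names, and the choice to measure the largest congested cluster in nodes while ignoring edge direction are assumptions made for the example.

```python
from collections import defaultdict, deque


def congestion_network(edges, q=0.5):
    """Keep only congested roads, i.e. edges whose relative speed is <= q."""
    return [(u, v) for u, v, relative_speed in edges if relative_speed <= q]


def largest_cluster_size(congested_edges):
    """Number of intersections in the largest connected congested cluster."""
    adjacency = defaultdict(set)
    for u, v in congested_edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    seen, best = set(), 0
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:  # breadth-first search over one cluster
            node = queue.popleft()
            size += 1
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        best = max(best, size)
    return best


roads = [(1, 2, 0.3), (2, 3, 0.5), (3, 4, 0.9), (5, 6, 0.1)]
congested = congestion_network(roads, q=0.5)
```

Lowering `q` in this sketch removes more edges, which is the sparsening effect described above.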
the "extracting risk features based on the complex network" in step B2 is performed as follows: in this step, for the grid traffic congestion networks and the grid node traffic network constructed at each time with percolation threshold q(t) = 0.5, the micro and macro features of the grid areas of the double-layer traffic network at each time are preliminarily extracted from the viewpoints of structure and function, using complex network theory and methods from the perspective of statistical physics. First, on the microscopic level, each grid traffic congestion network is taken as the object of study, and the micro features of each grid area are calculated at the critical percolation threshold at each time; the grid traffic congestion network has different features at different times, and as time evolves the congestion network in a grid area also changes dynamically in space, so the grid traffic congestion network has spatio-temporal characteristics; secondly, on the macroscopic level, the nodes (grid areas) of the constructed grid node traffic network model are taken as the objects of study, and the macro features of each grid area (node) at each time are calculated. As shown in fig. 2, the micro features include the maximum congestion sub-cluster, mean node betweenness, mean node degree, mean clustering coefficient, average speed, and congestion network growth rate of the grid traffic congestion network; the macro features include the node average path length, node strength, node betweenness, node degree, and growth rate of the grid node traffic network.
The invention thus provides a method for extracting features from the perspective of complex networks and gives examples of grid features; in practice, features can be preliminarily extracted in a targeted manner from the structure and function of the actual Beijing Fifth Ring traffic system according to its background and conditions, so as to construct the sample feature set and the initial feature matrix $M_f$, whose dimension is (8752, 40, 30), i.e. 8752 samples, each with 40 features.
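Two of the macro features named above, node degree and node strength of the grid node traffic network, can be sketched as follows. This is a minimal illustration in plain Python; the input format, the function names, and the treatment of the grid node network as undirected are assumptions made for the example.

```python
from collections import defaultdict


def grid_node_network(congested_roads, grid_of):
    """Edge weights between grids: the number of congested roads joining them.

    congested_roads -- iterable of (intersection_u, intersection_v)
    grid_of         -- dict mapping an intersection to its grid label
    """
    weights = defaultdict(int)
    for u, v in congested_roads:
        grid_u, grid_v = grid_of[u], grid_of[v]
        if grid_u != grid_v:  # roads inside one grid belong to the micro level
            weights[tuple(sorted((grid_u, grid_v)))] += 1
    return dict(weights)


def degree_and_strength(weights):
    """Node degree (number of neighbor grids) and strength (sum of edge weights)."""
    degree, strength = defaultdict(int), defaultdict(int)
    for (grid_a, grid_b), weight in weights.items():
        degree[grid_a] += 1
        degree[grid_b] += 1
        strength[grid_a] += weight
        strength[grid_b] += weight
    return dict(degree), dict(strength)


grid_of = {"a": 1, "b": 1, "c": 2, "d": 2, "e": 3}
weights = grid_node_network([("a", "c"), ("b", "c"), ("c", "d"), ("d", "e")], grid_of)
degree, strength = degree_and_strength(weights)
```

Each grid's degree and strength at each time would then become two entries of that grid's row in the feature matrix.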
The "screening risk features based on machine learning" in step B3 is performed as follows: in step B2, the functional and structural features of the grid areas at each time were extracted using complex network knowledge and the initial feature matrix $M_f$ was constructed. To improve the accuracy and precision of risk identification and prediction in the Beijing Fifth Ring traffic system, machine learning methods are applied in this step to perform feature selection on the preliminarily constructed sample feature set, screening out a high-quality sample feature set and maximizing the effect of risk identification and prediction in the traffic system; screening the structural and functional features of the Beijing Fifth Ring traffic system, retaining important features and removing irrelevant ones, also alleviates the curse of dimensionality, reduces the difficulty of the learning task, reduces over-fitting, and enhances the generalization ability of the machine learning model. Because the Beijing Fifth Ring traffic system has highly complex spatio-temporal evolution, and in order to optimize for the given learner, the invention uses the classical wrapper-type method LVW (Las Vegas Wrapper) for feature selection, as shown in FIG. 3. The high-quality features screened out by the LVW method are the node betweenness variance, the edge betweenness variance, the proportion of congested roads in the grid, the node betweenness of the grid node traffic network, etc., 10 features in total, and a high-quality feature matrix is constructed
Figure BDA0002574365680000201
with dimension (8752, 10, 30), i.e. 8752 samples in total, each with 10 high-quality features.
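The LVW search loop can be sketched as follows. This is a minimal illustration in plain Python of the general Las Vegas Wrapper idea, not the patent's implementation: `evaluate` is a hypothetical callback that returns the learner's validation error for a feature subset, the search starts from the full feature set, and the stopping parameter `max_stall` is an assumption.

```python
import random


def lvw_select(n_features, evaluate, max_stall=50, seed=0):
    """Random search over feature subsets, keeping the best one found.

    A candidate replaces the current best if its error is lower, or if the
    error is equal and it uses fewer features. The search stops after
    max_stall consecutive candidates that bring no improvement.
    """
    rng = random.Random(seed)
    best_subset = list(range(n_features))
    best_error = evaluate(best_subset)
    stall = 0
    while stall < max_stall:
        candidate = [f for f in range(n_features) if rng.random() < 0.5]
        if not candidate:  # an empty subset cannot be evaluated
            stall += 1
            continue
        error = evaluate(candidate)
        fewer = len(candidate) < len(best_subset)
        if error < best_error or (error == best_error and fewer):
            best_subset, best_error = candidate, error
            stall = 0
        else:
            stall += 1
    return best_subset, best_error


# Toy error function: only features 0 and 2 matter.
toy_error = lambda subset: 0.0 if {0, 2} <= set(subset) else 1.0
subset, error = lvw_select(4, toy_error)
```

Because every candidate requires retraining the learner, the cost of the search is dominated by `evaluate`, which is why the stall limit matters in practice.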
Step C, "identifying and predicting risk based on ensemble learning theory", is performed as follows: in order to accurately identify and predict congestion risk in the Beijing Fifth Ring traffic system and control it effectively, first, an ensemble learning model is constructed using machine learning and related mathematics; secondly, to eliminate the influence of inconsistent scales among feature vectors on the model, feature scaling is applied to standardize the data feature set
Figure BDA0002574365680000202
giving the standard sample feature matrix
Figure BDA0002574365680000203
of dimension (8752, 10, 30); finally, to ensure that the model learns as much as possible about the characteristics of risk in the Beijing Fifth Ring road traffic system, the standard sample feature matrix
Figure BDA0002574365680000204
is divided into a training set and a test set in the ratio 7:3, i.e. 6126 samples in the training set and 2626 samples in the test set; the ensemble learning model is trained on the training set, and the trained model is then used to identify and predict risks in the grid areas of the traffic system at the current time. The specific steps of the process are as follows:
step C1: constructing an ensemble learning model;
step C2: performing risk identification and prediction with the ensemble learning model;
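The 7:3 division of the 8752 samples can be sketched as a random split of sample indices. This is a minimal illustration in plain Python; the function name, the fixed seed, and rounding the training size down are assumptions, the last chosen because it reproduces the stated 6126 and 2626 sample counts.

```python
import random


def split_indices(n_samples, train_fraction=0.7, seed=0):
    """Shuffle sample indices and cut them into training and test parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = int(n_samples * train_fraction)  # rounded down
    return indices[:n_train], indices[n_train:]


train_idx, test_idx = split_indices(8752)
```

The rows of the standard sample feature matrix and their risk labels would then be selected with `train_idx` and `test_idx` respectively.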
the "constructing an ensemble learning model" in step C1 is performed as follows: the invention aims to learn a more stable, better-performing model from historical risk data of the Beijing Fifth Ring traffic system, and an ensemble learning model generally learns better than a single classifier model. Ensemble learning combines multiple weakly supervised models into a better and more comprehensive strongly supervised model; the underlying idea is that even if one weak classifier makes a wrong prediction, the other weak classifiers can correct the error. The current mainstream ensemble learning frameworks are Bagging, Boosting and Stacking.
Through the above process a binary classifier, namely a random forest model, is constructed to identify and predict risks in the Beijing Fifth Ring traffic system; the classification function is a sign-type function whose output values are 0 and 1, representing low risk and high risk in a grid area respectively, as follows:
Figure BDA0002574365680000211
in the above formula, f(x_i) indicates the risk status of the i-th grid area, with 0 representing a low congestion risk and 1 representing a high congestion risk.
Meanwhile, when an ensemble learning model is constructed with ensemble learning theory to identify and predict risks of the traffic system within Beijing's Fifth Ring Road, a suitable ensemble learning framework and model can be selected according to the distribution characteristics of the data samples, which further improves the effect of risk identification and prediction for the traffic system.
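As a rough illustration of the Bagging idea behind a random forest, the following Python sketch trains several single-feature threshold classifiers (decision stumps, used here as a simplified stand-in for full decision trees) on bootstrap resamples with random feature subsets and combines them by majority vote into a 0/1 risk label. The function names, the stump learner, and the toy data are assumptions made for illustration only and are not taken from the patent.

```python
import random
from collections import Counter

def train_stump(X, y, feature_ids):
    """Fit the best single-feature threshold rule over the given features."""
    best = None  # (errors, feature, threshold, label_above)
    for f in feature_ids:
        for thr in sorted({row[f] for row in X}):
            for label_above in (0, 1):
                preds = [label_above if row[f] > thr else 1 - label_above
                         for row in X]
                errors = sum(p != t for p, t in zip(preds, y))
                if best is None or errors < best[0]:
                    best = (errors, f, thr, label_above)
    _, f, thr, label_above = best
    return lambda row: label_above if row[f] > thr else 1 - label_above

def train_bagged_stumps(X, y, n_learners=25, n_sub_features=1, seed=0):
    """Train each base learner on a bootstrap sample and a random feature subset."""
    rng = random.Random(seed)
    m, n = len(X), len(X[0])
    learners = []
    for _ in range(n_learners):
        idx = [rng.randrange(m) for _ in range(m)]    # sampling with replacement
        feats = rng.sample(range(n), n_sub_features)  # random feature subset
        learners.append(
            train_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return learners

def predict(learners, row):
    """Majority vote of the base learners: 1 = high risk, 0 = low risk."""
    votes = Counter(h(row) for h in learners)
    return 1 if votes[1] > votes[0] else 0

# Hypothetical grid features: [congested-road ratio, mean relative speed]
X = [[0.1, 0.9], [0.2, 0.8], [0.15, 0.85], [0.8, 0.3], [0.9, 0.2], [0.85, 0.25]]
y = [0, 0, 0, 1, 1, 1]
learners = train_bagged_stumps(X, y)
print(predict(learners, [0.9, 0.2]), predict(learners, [0.1, 0.9]))
```

An odd number of base learners is used so that the vote cannot tie; a real random forest would replace the stumps with full decision trees.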
In step C2, "risk identification and prediction using the ensemble learning model" is specifically performed as follows: in this step, based on the high-quality sample feature set extracted and screened in step B, i.e. the feature matrix
Figure BDA0002574365680000212
the ensemble learning model constructed in step C1 is used to identify and predict risks in the traffic system. Because differences between feature dimensions (units and scales) in the historical sample data set can affect the performance of the ensemble learning model, the sample feature set of the research object is first feature-scaled before the model is used for risk identification and prediction; this eliminates the influence of differing dimensions among feature vectors on model precision, improves the convergence speed of the model, and yields a standard sample feature matrix
Figure BDA0002574365680000213
The mainstream feature scaling methods in machine learning mainly include min-max normalization, mean normalization, standardization, and scaling to unit length; for the sample feature set of the traffic system
Figure BDA0002574365680000214
the standardization method is selected from these mainstream methods according to the actual situation of the traffic system within Beijing's Fifth Ring Road, the characteristics of the data feature set, and the machine learning method applied, so as to maximize the accuracy and precision of risk identification and prediction in the traffic system.
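A minimal sketch of standardization (z-score scaling) is given below, assuming the feature set is a plain list of rows with one column per feature. The function name `standardize`, the use of the population standard deviation, and the toy values are illustrative assumptions; the patent does not specify these details.

```python
import math

def standardize(matrix):
    """Column-wise z-score scaling: (x - mean) / std for each feature."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    scaled = [[0.0] * n_cols for _ in range(n_rows)]
    for c in range(n_cols):
        column = [row[c] for row in matrix]
        mean = sum(column) / n_rows
        # Population standard deviation (an assumption of this sketch).
        std = math.sqrt(sum((v - mean) ** 2 for v in column) / n_rows)
        for r in range(n_rows):
            # A constant feature carries no information; map it to 0.
            scaled[r][c] = (matrix[r][c] - mean) / std if std > 0 else 0.0
    return scaled

# Hypothetical features on very different scales: [node degree, relative speed]
features = [[10.0, 0.2], [20.0, 0.4], [30.0, 0.6]]
scaled = standardize(features)
```

After scaling, every column has zero mean and unit variance, so no feature dominates merely because of its units.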
After feature scaling is performed on the sample data set of the road traffic system within Beijing's Fifth Ring Road, in this step the standard sample feature matrix of the traffic system
Figure BDA0002574365680000221
is used with the random forest model constructed in step C1 to identify and predict risks in the traffic system. Because the random forest model first needs to learn the characteristics of risk, this embodiment takes the standard sample feature set
Figure BDA0002574365680000222
and randomly divides it into a training set and a test set at a ratio of 7:3, giving 6126 samples in the training set and 2626 samples in the test set; the training set is used to train the random forest model so that it learns the characteristics of congestion risk to the greatest possible extent.
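The 7:3 random split can be sketched as follows. The helper name `split_train_test` and the fixed seed are assumptions for illustration; the sample count 8752 is the one stated in this embodiment, and integer placeholders stand in for the real feature rows.

```python
import random

def split_train_test(samples, train_ratio=0.7, seed=0):
    """Shuffle the sample indices and cut them at the given ratio."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    n_train = int(len(samples) * train_ratio)  # rounds down
    train = [samples[i] for i in indices[:n_train]]
    test = [samples[i] for i in indices[n_train:]]
    return train, test

# Placeholder samples standing in for the 8752 rows of the feature matrix.
train, test = split_train_test(list(range(8752)))
print(len(train), len(test))  # 6126 2626
```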
The model evaluation and verification in step D is performed as follows: when the ensemble learning model constructed in step C is used to identify and predict risks in the traffic system, the performance of the model must be evaluated accurately and scientifically. In this step, first, evaluation indexes are reasonably selected based on the actual traffic system conditions and the final target of the invention, for example accuracy, precision, recall, and F1 value, all of which are calculated from a confusion matrix (Confusion Matrix); secondly, in order to prevent over-fitting and to evaluate the generalization ability of the model accurately, the ensemble learning model is evaluated with a cross validation method, which further improves the scientific soundness and reliability of the model evaluation. The step specifically comprises the following substeps:
step D1: selecting a model evaluation index;
step D2: evaluating and analyzing the model;
The specific way of selecting the model evaluation index in step D1 is as follows: the invention identifies and predicts risks in a traffic system, and its final aim is to identify those risks accurately and scientifically with an ensemble learning model, which is essentially an anomaly detection problem in machine learning. For this risk identification and detection scenario, the model is evaluated with two evaluation indexes, recall rate and accuracy, with the following formulas:
Figure BDA0002574365680000223
Figure BDA0002574365680000231
In the formulas, Accuracy denotes the accuracy and Recall denotes the recall rate; TP is the number of positive cases correctly predicted as positive, TN is the number of negative cases correctly predicted as negative, FP is the number of negative cases incorrectly predicted as positive, and FN is the number of positive cases incorrectly predicted as negative.
Units of the road traffic system within Beijing's Fifth Ring Road that are truly at risk should be mispredicted as rarely as possible: if a real congestion risk is not identified and then materializes, the traffic system is greatly damaged, so the recall rate deserves the most attention. Meanwhile, so that normal samples are predicted as normal, the error rate on normal samples is reduced, and managers of the traffic system can control the real risks as precisely as possible under limited resources, the accuracy is introduced as a second evaluation index of the model. When the random forest model from ensemble learning is used to identify and predict the congestion risk of the road traffic system within Beijing's Fifth Ring Road, the accuracy is 89.83% and the recall rate is 86.74%; both are at a high level, indicating good model performance.
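For reference, the two indexes can be computed from the confusion-matrix counts as sketched below, assuming the standard definitions Accuracy = (TP + TN) / (TP + TN + FP + FN) and Recall = TP / (TP + FN), which match the symbols TP, TN, FP, and FN defined above. The function names and the toy labels are illustrative assumptions, not data from the patent.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = high risk, 0 = low risk)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def accuracy(tp, tn, fp, fn):
    """Share of all samples that were predicted correctly."""
    return (tp + tn) / (tp + tn + fp + fn)

def recall(tp, fn):
    """Share of truly high-risk samples that were identified."""
    return tp / (tp + fn) if (tp + fn) else 0.0

# Toy labels for eight grid areas (hypothetical values).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print(accuracy(tp, tn, fp, fn), recall(tp, fn))  # 0.75 0.75
```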
The "evaluation and analysis of the model" described in step D2 is specifically performed as follows: in this step, in order to prevent over-fitting and to evaluate the generalization ability of the model accurately, the ensemble learning model is evaluated with a cross validation method from machine learning, which further improves the scientific soundness and reliability of the model evaluation. The classical cross validation methods mainly include the leave-one-out method, the K-fold cross validation method, and the self-help (bootstrap) method; the invention uses the self-help method for cross validation, with the following steps:
(1) Randomly select one sample from the data set containing 8752 samples and use it as a training sample;
(2) Put the sample selected in (1) back into the original data set, and repeat this sampling with replacement 8752 times to generate a data set of the same size as the original data set; this new data set is the training set;
(3) After 8752 draws, about 3221 samples of the original data set do not appear in the new data set; these samples are taken as the validation set;
(4) Repeat the above steps 10 times to train 10 models and obtain the values of their evaluation indexes, then average these values to obtain the performance evaluation value of the model.
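The four steps above can be sketched as follows. The function names and the `fit_and_score` callback are hypothetical; in practice the callback would train the random forest on the drawn indices and return an evaluation index such as recall on the validation indices.

```python
import random

def bootstrap_split(n_samples, rng):
    """Draw n_samples indices with replacement; undrawn indices form the validation set."""
    train_idx = [rng.randrange(n_samples) for _ in range(n_samples)]
    drawn = set(train_idx)
    val_idx = [i for i in range(n_samples) if i not in drawn]
    return train_idx, val_idx

def bootstrap_evaluate(n_samples, fit_and_score, rounds=10, seed=0):
    """Repeat the bootstrap split and average the score returned by the callback."""
    rng = random.Random(seed)
    scores = []
    for _ in range(rounds):
        train_idx, val_idx = bootstrap_split(n_samples, rng)
        scores.append(fit_and_score(train_idx, val_idx))
    return sum(scores) / len(scores)

# Check the share of samples left out of one bootstrap draw of 8752 samples.
train_idx, val_idx = bootstrap_split(8752, random.Random(0))
print(len(train_idx), len(val_idx) / 8752)
```

The left-out share approaches (1 - 1/N)^N, roughly 36.8% for large N, which is consistent with the figure of about 3221 out of 8752 samples stated above.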
As shown in fig. 5, a random forest model is used to identify and predict the congestion risk of the road traffic system within Beijing's Fifth Ring Road, and the model is cross-validated 10 times with the self-help method. The average accuracy is about 92.84% and the average recall rate is about 92.45%, both at a high level, which indicates that the model has strong generalization ability and good performance, can identify and predict the congestion risk of the road traffic system within Beijing's Fifth Ring Road accurately and reliably, and provides a strong guarantee for the safe, stable, and healthy operation of the traffic system.
Matters not described in detail in the present invention belong to techniques well known to those skilled in the art.
The above description is only a part of the embodiments of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention.

Claims (1)

1. A traffic risk prediction method based on complex network theory, characterized in that the method comprises the following steps:
step A: dividing grids based on empirical data to construct a double-layer traffic network model;
step B: extracting and screening features based on complex network theory;
step C: performing risk prediction based on ensemble learning theory;
step D: evaluating and verifying the model;
wherein constructing the double-layer traffic network model by dividing grids based on empirical data in step A is performed as follows: first, basic information on the roads in the research area is acquired, comprising two parts, namely road information of the traffic network and longitude and latitude information of the road intersections; according to the area and extent of the research area and the longitude and latitude information of the road sections and intersections, the research area is divided into N*M grid areas, and the grid areas are labeled; secondly, on the microscopic level, for each grid area, a grid traffic congestion network model is constructed from actual traffic data by applying complex network theory and methods, with the intersections in the grid as nodes, the road sections as edges, and the relative speeds of the road sections as edge weights; on the macroscopic level, each grid area is taken as a node, whether congested roads exist between two grids is taken as the criterion for whether an edge connects them, the number of congested roads between the grids is taken as the edge weight, and a grid node traffic network model is constructed by applying complex network theory and methods; the specific steps are as follows:
step A1: dividing grid areas based on geographic information;
step A2: preprocessing speed data to obtain relative speed matrix
Figure FDA0003926665530000011
Step A3: constructing a grid traffic jam network model;
step A4: constructing a grid node traffic network model;
wherein dividing the grid areas based on geographic information in step A1 is specifically performed as follows: first, the traffic network model and traffic road information required for dividing the grid areas are extracted from a geographic information system (Mapinfo file) using the programming software Python; the extracted information comprises the vehicle-mounted speed of each road at each moment, the longitude and latitude information of the intersections, and the network topology information of the traffic system under study; Python is used to call the Baidu Map application programming interface (API), and the topology of the road network is matched with the intersection names by sequential traversal to obtain the longitude and latitude of the intersections; roads and intersections for which this lookup fails because the intersection names differ between Baidu Map and Mapinfo are processed separately, yielding an accurate, standard longitude and latitude data set for the road network of the traffic system; secondly, the area S and the longitude and latitude ranges of the research area are calculated from the obtained traffic road information and intersection longitude and latitude information, and the number of grids N*M is determined reasonably according to the actual background of the research area, so that the area of each grid is S/(N*M); finally, for the divided grid areas, the intersections lying in each grid are determined from the longitude and latitude of each intersection in the traffic network and recorded;
wherein the speed data preprocessing described in step A2, which obtains the relative speed matrix
Figure FDA0003926665530000021
is performed as follows: first, actual traffic operation data are acquired from the vehicle-mounted Global Positioning System (GPS); at any time t_i, the speeds of all R roads are expressed, in road order, as a vector V_i = (v_1, v_2, …, v_R); this process is repeated for all T moments, and the speed vectors V_i of all moments are finally combined to generate an initial speed matrix
Figure FDA0003926665530000022
Secondly, speed compensation needs to be applied to the original speed information of the traffic system, because the original speed matrix
Figure FDA0003926665530000023
contains some missing values; therefore, the speed matrix
Figure FDA0003926665530000024
is searched for missing speed values, that is, elements with a value of 0, and these are compensated; to compensate a missing speed value of road R_j at time t_i, first find in the road network the set of neighbor roads of road R_j
Figure FDA0003926665530000025
and check whether any road in the set has a speed record at that moment; if at least one element of the set has a speed record, the average of the recorded (non-zero) elements of the set is taken, using the following formula:
Figure FDA0003926665530000026
in the above formula,
Figure FDA0003926665530000027
denotes the speed compensation value of the speed-missing road R_j at time t_i,
Figure FDA0003926665530000028
denotes the sum of the non-zero element values in the neighbor road set of the speed-missing road R_j
Figure FDA0003926665530000029
and J denotes the number of non-zero elements in the neighbor road set of the speed-missing road R_j
Figure FDA00039266655300000210
;
if none of the neighbor roads of road R_j has a speed record, the speed of road R_j is compensated as 0; after each compensation, the original speed matrix
Figure FDA00039266655300000211
is updated to the compensated matrix
Figure FDA00039266655300000212
The above process is repeated at each moment until all 0 values in the speed matrix are compensated, giving the completed speed matrix
Figure FDA00039266655300000213
After road speed compensation of the original absolute speed matrix
Figure FDA00039266655300000214
is completed, because road grades differ, the compensated speed matrix is normalized to obtain relative speeds and thereby unify the judgment standard; for any road R_j, based on the speed matrix
Figure FDA00039266655300000215
the speed vector of that road at all moments
Figure FDA00039266655300000216
is extracted, together with the maximum speed limit of the road section
Figure FDA00039266655300000217
the speed vector
Figure FDA00039266655300000218
is then divided by the maximum speed limit
Figure FDA00039266655300000219
to obtain the normalized speed
Figure FDA0003926665530000031
which gives the normalized speed matrix
Figure FDA0003926665530000032
as follows:
Figure FDA0003926665530000033
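The compensation and normalization of step A2 can be sketched in Python as follows. This is an illustrative reading rather than the claimed implementation: the names `compensate_speeds` and `relative_speeds`, the list-based data layout, and the choice to repeat passes within each moment until no further zero can be filled are assumptions; roads with no recorded neighbor remain at 0.

```python
def compensate_speeds(speeds, neighbors):
    """speeds[t][j]: speed of road j at moment t, with 0 meaning missing.
    neighbors[j]: indices of the roads adjacent to road j."""
    completed = [row[:] for row in speeds]
    for row in completed:
        changed = True
        while changed:  # repeat passes until nothing more can be filled
            changed = False
            snapshot = row[:]
            for j, v in enumerate(snapshot):
                if v == 0:
                    recorded = [snapshot[k] for k in neighbors[j]
                                if snapshot[k] != 0]
                    if recorded:
                        # Mean of the non-zero neighbor speeds.
                        row[j] = sum(recorded) / len(recorded)
                        changed = True
    return completed

def relative_speeds(completed, speed_limits):
    """Divide each road's speed by that road's maximum speed limit."""
    return [[v / speed_limits[j] for j, v in enumerate(row)]
            for row in completed]

# One moment, four roads; road 1 is missing, road 3 is missing and isolated.
speeds = [[60.0, 0.0, 40.0, 0.0]]
neighbors = [[1], [0, 2], [1], []]
completed = compensate_speeds(speeds, neighbors)
ratios = relative_speeds(completed, [80.0, 100.0, 80.0, 60.0])
print(completed, ratios)
```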
wherein constructing the grid traffic congestion network model in step A3 is specifically performed as follows: for each grid area divided in step A, first, the structure information among roads and the road intersection information contained in the grid area are extracted from the actual map data of that grid area using the Python and Mapinfo software tools; secondly, a suitable geographical coverage of the traffic network is selected according to the needs of the actual research, and, following the complex network method, the road intersections in each grid area are abstracted as nodes of the network, the roads of the grid area traffic network are abstracted as edges connecting the nodes, and the relative speed of each road is taken as the weight of its edge, thereby establishing a grid traffic congestion network in each grid area; because most roads of the traffic network carry two-way traffic and are therefore directional, the constructed grid traffic congestion network is a directed weighted network;
wherein constructing the grid node traffic network model in step A4 is specifically performed as follows: first, according to the intersection information contained in the grid areas and the road topology information of the traffic network of the whole research area, i.e. the whole network, an inter-grid intersection traffic network model is constructed, that is, the road topology information contained inside the grid areas is deleted from the whole network; secondly, the number of congested roads between the grid areas is counted and recorded; finally, applying complex network theory and methods, each grid area is abstracted as a node, the existence of congested roads between two grids is abstracted as an edge connecting them, and the number of congested roads between the grids is taken as the weight of the edge, thereby establishing the grid node traffic network model;
wherein the feature extraction and screening based on complex network theory in step B is performed as follows: first, for the grid traffic congestion networks and the grid node traffic network at each time t_i, referred to together as the double-layer traffic network, a seepage threshold q(t) is set and determined through seepage analysis of the double-layer traffic network; secondly, for each grid traffic congestion network and for each node, i.e. grid, of the grid node traffic network under the seepage threshold q(t) at each moment, the structural and functional features of each grid area are extracted using complex network theory and methods, including the maximum congestion subgroup, the mean node betweenness, the mean node degree, the average speed of the grid congestion network, and the number of first-order neighbor congested roads; on this basis, the extracted features are screened using a machine learning method, the features that contribute most to traffic risk identification and prediction are selected, and a high-quality sample feature set is constructed, which improves the effect and efficiency of traffic risk identification and prediction; meanwhile, the grid area at time t is labeled according to the proportion of congested roads in each grid area at time t+Δt; the specific steps are as follows:
step B1: seepage analysis of the traffic network;
step B2: extracting risk features based on the complex network;
step B3: screening risk features based on machine learning;
wherein the seepage analysis of the traffic network in step B1 is specifically performed as follows: seepage (percolation) theory is used to analyze the double-layer traffic network; first, a control variable, namely the seepage threshold q(t), is given to the traffic network at each moment, so that each road i in the traffic network is in one of two states: an unblocked state, v_i_ratio(t) > q(t), or a congested state, v_i_ratio(t) ≤ q(t); the unblocked edges are deleted from the original traffic network and the congested edges are retained, and the remaining network is the traffic network in the congested state at time t, referred to as the congestion network; each value of q(t) thus corresponds to one congestion network at each moment, and as q(t) decreases the congestion level required for a road to be retained becomes higher, that is, more edges are removed as invalid and the congestion network becomes sparser; a suitable seepage threshold q(t) is therefore selected, namely the one at which the urban traffic network is in the stage with the richest congestion information, to identify and predict the traffic congestion risk at the current moment;
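The thresholding step can be illustrated with the following sketch, which keeps the edges whose relative speed is at most q and measures the largest connected cluster of the resulting congestion network. The function names and the toy network are assumptions, and edge direction is ignored here (weak connectivity) for brevity, although the claimed network is directed.

```python
def congestion_network(edges, q):
    """edges: {(u, v): relative_speed}. Keep congested edges (relative speed <= q)."""
    return {e: w for e, w in edges.items() if w <= q}

def largest_cluster_size(nodes, kept_edges):
    """Size of the largest connected cluster, ignoring edge direction (union-find)."""
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for u, v in kept_edges:
        parent[find(u)] = find(v)
    sizes = {}
    for n in nodes:
        root = find(n)
        sizes[root] = sizes.get(root, 0) + 1
    return max(sizes.values())  # an isolated node counts as a cluster of size 1

# Hypothetical grid with five intersections and four road sections.
nodes = ["a", "b", "c", "d", "e"]
edges = {("a", "b"): 0.2, ("b", "c"): 0.3, ("c", "d"): 0.9, ("d", "e"): 0.4}
for q in (0.25, 0.5, 1.0):
    print(q, largest_cluster_size(nodes, congestion_network(edges, q)))
```

Lowering q retains fewer edges, so the largest congested cluster shrinks; scanning q in this way is one possible way to locate the threshold at which the congestion network changes most.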
wherein extracting risk features based on the complex network in step B2 is specifically performed as follows: the grid traffic congestion networks and the grid node traffic network are constructed at each moment under the seepage threshold q(t), and, from the viewpoint of statistical physics, complex network theory and methods are applied to preliminarily extract the microscopic and macroscopic features of the grid areas of the double-layer traffic network at each moment in terms of both structure and function; first, on the microscopic level, each grid traffic congestion network is taken as a research object, and the microscopic features of each grid area are calculated at the critical seepage threshold at each moment, including the maximum congestion subgroup, the mean node betweenness, the mean node degree, the mean clustering coefficient, the average speed, and the growth rate of the congestion network of the grid traffic congestion network; the grid traffic congestion network has different features at different moments, and the congestion network in a grid area shows spatial dynamics as time evolves, so these features are spatio-temporal; secondly, on the macroscopic level, for the constructed grid node traffic network model, each node, i.e. grid area, is taken as a research object and its macroscopic features at each moment are calculated, including the average path length, node strength, node betweenness, node degree, and node growth rate in the grid node traffic network; the features are thus extracted in a targeted, preliminary manner to construct a sample feature set and an initial feature matrix M_f;
wherein the risk feature screening based on machine learning in step B3 is specifically performed as follows: after the functional and structural features of the grid areas at each moment have been extracted in step B2 using complex network knowledge and the initial feature matrix M_f has been constructed, in this step machine learning methods are used to perform feature selection on the preliminarily constructed sample feature set, so that a high-quality sample feature set is screened out and the effect of risk identification and prediction in the traffic system is improved; in this way the structural and functional features of the traffic system are screened, important features are retained, and irrelevant features are removed; feature selection is performed with the classic wrapper-type LVW method, with the following specific steps:
(1) Set the initial optimal error E to infinity, set the current optimal feature subset to the full attribute set A, and set the repetition count t = 0;
(2) Randomly generate a feature subset A', and calculate the error E' of the classifier when this feature subset is used;
(3) If E' is smaller than E, set A = A' and E = E' and repeat steps (2) and (3); otherwise increment t (t++), and exit the loop when t is greater than or equal to the stop control parameter T;
in the calculation process, the LVW method directly takes the performance of the learner finally used as the evaluation criterion for feature subsets, and selects the feature subset that is tailored to the given learner and most beneficial to its performance; a high-quality sample feature set is thereby screened out, constructing the feature matrix
Figure FDA0003926665530000051
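The LVW loop of step B3 can be sketched as follows. The function name `lvw`, the `error_of` callback (which would in practice return the classifier's validation error on the candidate features), and the reset of the counter after an improvement are assumptions of this sketch; the reset follows the commonly described form of the Las Vegas Wrapper and is not stated in the claim.

```python
import random

def lvw(n_features, error_of, max_stall, seed=0):
    """Las Vegas Wrapper: random subset search scored by the learner's own error."""
    rng = random.Random(seed)
    best_subset = list(range(n_features))  # start from the full attribute set A
    best_error = float("inf")              # initial optimal error E
    stall = 0                              # repetition counter t
    while stall < max_stall:               # stop control parameter T
        size = rng.randint(1, n_features)
        candidate = sorted(rng.sample(range(n_features), size))
        err = error_of(candidate)
        if err < best_error:
            best_subset, best_error = candidate, err
            stall = 0  # assumption: counter is reset whenever the subset improves
        else:
            stall += 1
    return best_subset, best_error

# Toy error function: features 0 and 2 are the only useful ones (hypothetical).
def toy_error(subset):
    return len(set(subset) ^ {0, 2}) / 4

best_subset, best_error = lvw(4, toy_error, max_stall=500)
print(best_subset, best_error)
```

Because the search is random, the stop parameter trades running time against the chance of missing the best subset.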
wherein the risk identification and prediction based on ensemble learning theory in step C is performed as follows: first, an ensemble learning model is constructed using machine learning and related mathematical knowledge; secondly, a feature scaling method is applied to standardize the data feature set
Figure FDA0003926665530000052
to obtain the standard sample feature matrix
Figure FDA0003926665530000053
finally, the standard sample feature matrix
Figure FDA0003926665530000054
is divided into a training set and a test set according to a preset proportion, the ensemble learning model is trained with the training set data, and the trained ensemble learning model is then used to identify and predict risks in the grid areas of the traffic system at the current moment; the specific steps are as follows:
step C1: constructing an ensemble learning model;
step C2: carrying out risk identification and prediction by using the ensemble learning model;
wherein constructing the ensemble learning model in step C1 is specifically performed as follows: a random forest model is constructed using the Bagging framework and related ensemble learning methods to identify and predict risks of the traffic system, with the following implementation steps:
(1) Let there be a data set D = {x_i1, x_i2, …, x_in, y_i}, i ∈ [1, m], with n features; sampling with replacement generates a sampling space (m*n)^(m*n);
(2) Construct the base learners, namely decision trees: for each sample d_j = {x_i1, x_i2, …, x_ik, y_i}, i ∈ [1, m], where k < m, generate a decision tree and record the result h_j(x) of each decision tree;
(3) Train for T times
Figure FDA0003926665530000061
Wherein the first and second phases are represented by phi (x),
a binary classifier, namely a random forest model, is constructed through the above process to identify and predict risks in the traffic system; the classification function is a sign function whose output values 0 and 1 represent low risk and high risk of a grid area, respectively, as follows:
Figure FDA0003926665530000062
in the above formula, f(x_i) represents the risk status of the i-th grid area, 0 representing low risk and 1 representing high risk;
meanwhile, an ensemble learning model is constructed by applying an ensemble learning theory to identify and predict risks of the traffic system, and a proper ensemble learning framework and model can be selected according to the distribution characteristics of data samples to identify and predict the risks, so that the effects of identifying and predicting the risks of the traffic system are improved;
wherein performing risk identification and prediction with the ensemble learning model in step C2 is specifically performed as follows: based on the high-quality sample feature set extracted and screened in step B, namely the feature matrix
Figure FDA0003926665530000063
the ensemble learning model constructed in step C1 is used to identify and predict risks in the traffic system; when the model is used for risk identification and prediction, the sample feature set of the research object is first feature-scaled, which eliminates the influence of differing dimensions among feature vectors on model precision, improves the convergence speed of the model, and yields the standard sample feature matrix
Figure FDA0003926665530000064
for the sample feature set of the traffic system
Figure FDA0003926665530000065
the mainstream feature scaling methods in machine learning are available, comprising min-max normalization, mean normalization, standardization, and scaling to unit length;
after feature scaling is carried out on the sample data set of the traffic system, the standard sample feature matrix of the traffic system
Figure FDA0003926665530000066
is used with the ensemble learning model constructed in step C to identify risks in the traffic system; because the ensemble learning model first needs to learn the features of risk, the standard sample feature set
Figure FDA0003926665530000067
is randomly divided into a training set and a test set according to a preset proportion, wherein the training set is used to train the random forest model so that it learns the features of risk, and the test set is used to test the training effect of the model; training and testing are repeated with different training-to-test proportions until the effect of the model is optimal;
wherein the model evaluation and verification in step D is performed as follows: when the ensemble learning model constructed in step C is used to identify and predict risks in the traffic system, first, evaluation indexes are reasonably selected based on the actual traffic system conditions and the final target, and are calculated from a confusion matrix (Confusion Matrix); secondly, the ensemble learning model is evaluated with a cross validation method, which improves the scientific soundness and reliability of the model evaluation; the step specifically comprises the following substeps:
step D1: selecting a model evaluation index;
step D2: evaluating and analyzing the model;
selecting a model evaluation index in the step D1 specifically comprises the following steps: identifying and predicting risks in a traffic system, and evaluating a model by adopting two evaluation indexes of recall rate and accuracy, wherein the formula is as follows:
Figure FDA0003926665530000071
Figure FDA0003926665530000072
in the formulas, Accuracy denotes the accuracy and Recall denotes the recall rate; TP is the number of positive cases correctly predicted as positive, TN is the number of negative cases correctly predicted as negative, FP is the number of negative cases incorrectly predicted as positive, and FN is the number of positive cases incorrectly predicted as negative;
wherein the evaluation and analysis of the model in step D2 is specifically performed as follows: the ensemble learning model is evaluated with a cross validation method from machine learning, which improves the scientific soundness and reliability of the model evaluation; the self-help (bootstrap) method is used for cross validation, with the following steps:
(1) Randomly select one sample from a data set containing N samples and use it as a training sample;
(2) Put the sample selected in (1) back into the original data set, and repeat this sampling with replacement N times to generate a data set of the same size as the original data set; this new data set is the training set;
(3) After N draws, the following portion of the original data set
Figure FDA0003926665530000073
does not appear in the new data set; these samples are taken as the validation set;
(4) Repeat the above steps M times to train M models and obtain the values of their evaluation indexes, then take the average to obtain the performance evaluation value of the model.
CN202010649490.3A 2020-07-08 2020-07-08 Traffic risk prediction method based on complex network theory Active CN111967712B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010649490.3A CN111967712B (en) 2020-07-08 2020-07-08 Traffic risk prediction method based on complex network theory

Publications (2)

Publication Number Publication Date
CN111967712A CN111967712A (en) 2020-11-20
CN111967712B true CN111967712B (en) 2023-04-07

Family

ID=73361398

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010649490.3A Active CN111967712B (en) 2020-07-08 2020-07-08 Traffic risk prediction method based on complex network theory

Country Status (1)

Country Link
CN (1) CN111967712B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112989374B (en) * 2021-03-09 2021-11-26 闪捷信息科技有限公司 Data security risk identification method and device based on complex network analysis
CN113034913A (en) * 2021-03-22 2021-06-25 平安国际智慧城市科技股份有限公司 Traffic congestion prediction method, device, equipment and storage medium
CN112991743B (en) * 2021-04-22 2021-10-08 泰瑞数创科技(北京)有限公司 Real-time traffic risk AI prediction method based on driving path and system thereof
CN115985089B (en) * 2022-12-01 2024-03-19 西部科学城智能网联汽车创新中心(重庆)有限公司 Method and device for perceiving weak traffic participants based on cloud
CN116307737B (en) * 2023-05-06 2023-07-18 交通运输部水运科学研究所 Dangerous cargo container security risk prediction method based on port berth congestion degree

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109583494A (en) * 2018-11-28 2019-04-05 重庆邮电大学 The feature extraction and prediction technique of dynamic network link based on structure Sub-Image Feature
CN110211378A (en) * 2019-05-29 2019-09-06 北京航空航天大学 A kind of urban transportation health indicator system appraisal procedure based on Complex Networks Theory
CN111081016A (en) * 2019-12-18 2020-04-28 北京航空航天大学 Urban traffic abnormity identification method based on complex network theory

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109377750B (en) * 2018-09-18 2020-10-09 北京航空航天大学 Traffic system elastic critical point determining method based on seepage analysis

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
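"Recent Progress on the Resilience of Complex Networks"; Jianxi Gao et al.; Energies; 2015-10-27; full text *
"Scale-free resilience of real traffic jams"; Limiao Zhang et al.; PNAS; 2019-04-30; Vol. 116, No. 18; full text *
"Research on complex network theory and the complexity of urban traffic systems"; Gao Ziyou et al.; Journal of Transportation Systems Engineering and Information Technology; 2006-06-25 (No. 03); full text *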

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant