CN109712402B - Mobile object running time prediction method and device based on meta-path congestion mode mining - Google Patents


Info

Publication number
CN109712402B
Authority
CN
China
Prior art keywords
time
meta
path
congestion
paths
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910110832.1A
Other languages
Chinese (zh)
Other versions
CN109712402A (en
Inventor
韩京宇
王宁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Posts and Telecommunications
Original Assignee
Nanjing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Posts and Telecommunications filed Critical Nanjing University of Posts and Telecommunications
Priority to CN201910110832.1A priority Critical patent/CN109712402B/en
Publication of CN109712402A publication Critical patent/CN109712402A/en
Application granted granted Critical
Publication of CN109712402B publication Critical patent/CN109712402B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Traffic Control Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a method for predicting the travel time of a mobile object based on meta-path congestion pattern mining, comprising the following steps: collecting a plurality of GPS data at fixed time intervals as training samples, and matching the GPS data to a road network with a map matching algorithm to obtain the path trajectory corresponding to the GPS data; dividing the matched path trajectory into meta-paths, storing the meta-paths in a path dictionary, mining the congestion state of each meta-path at each time, and extracting congestion feature vectors according to the correlation between different meta-paths; adding the extracted congestion feature vectors to a feature matrix, and filling the missing values in the feature matrix with a K-Means clustering algorithm to obtain a prediction model; and inputting a path trajectory whose travel time is to be predicted. The method extracts the congestion features of local roads, captures how congestion changes at a finer granularity, and applies a k-means clustering algorithm to sparse trajectory data, thereby giving the prediction more complete data to work from.

Description

Mobile object running time prediction method and device based on meta-path congestion mode mining
Technical Field
The invention relates to the field of information technology, and in particular to a method and a device for predicting the travel time of a mobile object based on meta-path congestion pattern mining.
Background
With the rapid development of the mobile internet, satellite positioning and location-based service (LBS) technology, more and more location-related services need to predict accurately the travel time of a current or future trip. Accurate travel time prediction helps drivers plan reasonable routes and avoid congested roads, and provides a reference for urban traffic construction.
Conventional data collection relies on static sensors installed on fixed streets and highways in a city, but these sensors do not cover the entire road network and are costly to maintain. With the development of GPS technology, trajectory data can instead be collected by GPS-equipped vehicles and used to predict traffic conditions. Because traffic conditions differ greatly between areas, congestion changes from moment to moment, and vehicles move at high speed, the travel time of the same route may differ greatly at different times. Moreover, for reasons of cost, technology and privacy, trajectory data are severely sparse: the trajectory data available for a specific time and place are scarce.
Existing travel time prediction methods fall into two categories: route-based methods and trip-based methods. Route-based prediction is the conventional approach, in which the travel time of a specific route is obtained from historical trajectory data. Typical methods include building traffic flow models to simulate the travel time distribution of a path, dynamic Bayesian methods and pattern matching. However, because traffic conditions differ greatly between areas and road congestion changes constantly, the travel time is difficult to compute accurately, and the high computational cost of complex mathematical models makes online prediction harder. Trip-based prediction requires finding, in the historical data, trips with the same departure time, start point and end point, and bases the prediction on them. With sparse data such identical trips are hard to find, and when a new trip plan appears its travel time cannot be estimated accurately.
Disclosure of Invention
Purpose of the invention: to overcome the defects of the prior art, the invention provides a mobile object travel time prediction method based on meta-path congestion pattern mining, which addresses the low accuracy and poor real-time performance of travel time prediction for mobile objects on a road network. The invention also provides a mobile object travel time prediction device based on meta-path congestion pattern mining.
Technical solution: the method for predicting the travel time of a mobile object based on meta-path congestion pattern mining comprises the following steps:
(1) collecting a plurality of GPS data at fixed time intervals as training samples, and matching the GPS data to a road network with a map matching algorithm to obtain the path trajectory corresponding to the GPS data;
(2) dividing the matched path trajectory into meta-paths, storing the meta-paths in a path dictionary, mining the congestion state of each meta-path at each time, and extracting congestion feature vectors according to the correlation between different meta-paths;
(3) adding the extracted congestion feature vectors to a feature matrix, and filling the missing values in the feature matrix with a K-Means clustering algorithm to obtain a prediction model;
(4) inputting a path trajectory whose travel time is to be predicted.
Preferably, in the step (2), the meta-path is a path between any adjacent intersections on the road network.
Preferably, in the step (2), the path dictionary is a set of meta paths.
Preferably, in step (1), the road network is a directed graph consisting of a set of intersection nodes and a set of road-segment edges.
Preferably, in step (2), mining the congestion state of each meta-path at each time and extracting the congestion feature vectors according to the correlation between different meta-paths specifically comprises:
(21) extracting the travel time of each meta-path, computing the cumulative distribution function of the travel time, and obtaining the congestion state of the meta-path from the cumulative distribution function, the congestion state being uniformly distributed on [0,1];
(22) defining, for a given meta-path and its neighborhood, the feature vector of a fixed time interval, the feature vector being a multidimensional vector containing the congestion states.
Preferably, in step (3), filling the missing values in the feature matrix with the K-Means clustering algorithm further comprises refining the initial clustering result with temporal similarity constraints, namely the similarity between adjacent fixed time intervals and the periodic similarity of urban traffic flow.
In another aspect, the invention provides a mobile object travel time prediction device based on meta-path congestion pattern mining, comprising: a mobile object, an intelligent sensing device carried by the mobile object, a server and a road network. The mobile object sends GPS data to the server at fixed time intervals through the intelligent sensing device. The server matches the GPS data to the road network with a map matching algorithm and stores the resulting path trajectories, divides the matched path trajectories into meta-paths and stores them in a path dictionary, mines the congestion state of each meta-path at each time, and extracts congestion features according to the correlation between meta-paths. The server then adds the extracted congestion features to a feature matrix and fills the missing values in the feature matrix with a k-means clustering algorithm.
Preferably, the GPS data is in the format of a mobile object id, time, longitude, latitude, and speed.
Preferably, the meta-path is a path between any adjacent intersections on the road network.
Preferably, the path dictionary is a set of meta paths.
Beneficial effects: compared with the prior art, the invention has the following notable advantages. The roads of the road network are decomposed into meta-paths, the congestion features of local roads are extracted from the spatial relationship between meta-paths within a neighborhood, and the decomposed meta-paths capture how congestion changes at a finer granularity. For sparse trajectory data, a k-means clustering algorithm clusters the data to fill the missing values in the feature matrix, which gives the prediction more complete data and makes travel time prediction more accurate and efficient.
Drawings
FIG. 1 is a flow chart of a method according to the present invention;
FIG. 2 is a block diagram of a prediction method according to the present invention;
FIG. 3 is a schematic diagram illustrating the congestion status of a path according to the present invention;
FIG. 4 is a diagram of different congestion states of associated paths according to the present invention.
Detailed Description
Example 1
As shown in FIG. 1 and FIG. 2, the invention provides a method for predicting the travel time of a mobile object on a road network based on meta-path congestion pattern mining. The method extracts meta-paths from historical trajectories, mines spatio-temporally correlated congestion patterns, enriches the available trajectory data with a k-means-based clustering algorithm, and predicts the travel time of the mobile object online. The method comprises the following steps:
Step 1, first collect the training samples and preprocess them: a plurality of GPS data are collected at fixed time intervals as training samples, and the GPS data are matched to the road network with a map matching algorithm to obtain the path trajectory corresponding to the GPS data.
In one embodiment, the GPS data format is: mobile object id, time, longitude, latitude and speed.
Second, offline mining is performed, comprising steps 2 and 3. Step 2, divide the matched path trajectory into meta-paths, store the meta-paths in a path dictionary, mine the congestion state of each meta-path at each time, and extract congestion feature vectors according to the correlation between different meta-paths.
Here a meta-path is the path between two adjacent intersections of the road network; it may be traversed by one or more trajectories. The path dictionary is the collection of meta-paths.
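The division of a matched trajectory into meta-paths can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes the map-matching stage has already produced a list of `(intersection_id, timestamp_seconds)` pairs, and the names `split_into_meta_paths`, `add_to_path_dictionary` and `path_dictionary` are hypothetical.

```python
from collections import defaultdict


def split_into_meta_paths(matched_track):
    """Split a map-matched track into meta-paths.

    matched_track: list of (intersection_id, timestamp_seconds) in travel
    order (an assumed representation of the map-matching output).
    Returns a list of ((from_node, to_node), elapsed_seconds).
    """
    segments = []
    for (a, t_a), (b, t_b) in zip(matched_track, matched_track[1:]):
        # One meta-path per pair of consecutive intersections.
        segments.append(((a, b), t_b - t_a))
    return segments


def add_to_path_dictionary(path_dictionary, matched_track):
    """Record the elapsed time of every meta-path traversed by the track."""
    for meta_path, elapsed in split_into_meta_paths(matched_track):
        path_dictionary[meta_path].append(elapsed)
    return path_dictionary


# Path dictionary: meta-path -> list of observed travel times d(r).
path_dictionary = defaultdict(list)
add_to_path_dictionary(path_dictionary, [("A", 0), ("B", 6), ("C", 15)])
add_to_path_dictionary(path_dictionary, [("A", 100), ("B", 107)])
```

Keying the dictionary by the ordered pair of intersections keeps the two directions of a road apart, which matches the description of the road network as a directed graph.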
The traffic states between adjacent meta-paths are not independent, and the spatial relationship between the meta-paths is used for capturing the characteristics of local traffic patterns, and the specific steps are as follows:
Step 21, extract the travel time $d(r)$ of each meta-path $r$ from the historical data and compute its cumulative distribution function $F_r$. The historical data are the historical GPS trajectories of mobile objects, i.e. of mobile objects that have previously traversed the meta-path and uploaded their GPS data.
The congestion state of the meta-path is computed from the cumulative distribution function:
$$\mathrm{cs}(r) = F_r\big(d(r)\big)$$
It is uniformly distributed on $[0,1]$, so the congestion levels of different meta-paths and of different travel times can be compared.
Suppose the travel times observed on a meta-path $r$ take the values 6 s, 7 s, 8 s and 9 s.
The cumulative distribution function at 6 s is then:
$$F_r(6) = P\big(d(r) \le 6\big) = \tfrac{2}{6}$$
The cumulative distribution function at 7 s is:
$$F_r(7) = P\big(d(r) \le 7\big) = \tfrac{4}{6}$$
Similarly, the cumulative distribution function at 8 s is:
$$F_r(8) = P\big(d(r) \le 8\big) = \tfrac{5}{6}$$
The cumulative distribution function at 9 s is 1.
The resulting values 2/6, 4/6, 5/6 and 1 are the congestion states: 2/6 is the least congested state and 1 is the most congested state.
Step 22, the neighborhood set $NB(r) = \{o_1, \dots, o_s\}$ denotes the $s$ meta-paths adjacent to meta-path $r$, including $r$ itself. To compute its dynamic congestion state, the whole time range is discretized into fixed time intervals $f_i$, and the travel times observed on each meta-path $o$ within each time interval are collected:
$$d(o)_{f_i}$$
A time interval $f_i$ may contain several observations; they are represented by their expectation, and the congestion state is computed as:
$$\mathrm{cs}_{f_i}(o) = F_o\big(\mathbb{E}[d(o)_{f_i}]\big)$$
Step 23, given meta-path $r$ and its neighborhood $\{o_1, \dots, o_s\}$, define the feature vector $M(r)_i$ of time interval $f_i$. $M(r)_i$ is an $s$-dimensional vector containing the congestion states of $NB(r)$ in time interval $f_i$, i.e.
$$M(r)_i = \big[\mathrm{cs}_{f_i}(o_1), \dots, \mathrm{cs}_{f_i}(o_s)\big]$$
All computed $M(r)_i$ are stacked into the feature matrix $M(r)$.
Let $N$ be the number of feature vectors in the feature matrix, i.e. the total number of discrete time intervals. The matrix $M(r)$ represents the dynamic congestion state of $NB(r)$ and has the following structure:
$$M(r) = \begin{bmatrix} \mathrm{cs}_{f_1}(o_1) & \cdots & \mathrm{cs}_{f_1}(o_s) \\ \vdots & \ddots & \vdots \\ \mathrm{cs}_{f_N}(o_1) & \cdots & \mathrm{cs}_{f_N}(o_s) \end{bmatrix}$$
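The worked example of step 21 can be checked with an empirical cumulative distribution function. The sketch below is an illustration under an assumption: the text lists only the distinct travel times 6 s, 7 s, 8 s and 9 s, so the history of six observations used here (two at 6 s, two at 7 s, one each at 8 s and 9 s) is chosen to be consistent with the stated fractions 2/6, 4/6, 5/6 and 1. The function name `congestion_state` is hypothetical.

```python
from bisect import bisect_right


def congestion_state(history, travel_time):
    """Empirical CDF value: share of historical travel times <= travel_time."""
    ordered = sorted(history)
    return bisect_right(ordered, travel_time) / len(ordered)


# Assumed history, consistent with the fractions 2/6, 4/6, 5/6 and 1.
history = [6, 6, 7, 7, 8, 9]
states = [congestion_state(history, t) for t in (6, 7, 8, 9)]
```

Because the result is a rank within the meta-path's own history, a 9 s traversal of a short segment and a 90 s traversal of a long one can both map to the same congestion state, which is what makes states comparable across meta-paths.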
Step 3, add the extracted congestion feature vectors to the feature matrix, and fill the missing values in the feature matrix with a K-Means clustering algorithm to obtain the prediction model.
The invention can handle sparse data by filling the missing values in the feature matrix with a k-means clustering algorithm. The specific steps are as follows:
Step 31, cluster the rows of $M(r)$ into $k$ groups and compute the $k$ cluster centers $c_1, \dots, c_k$; then, for each row of $M(r)$, find the nearest cluster center and use it to initialize the missing values of that row. Because the size of the feature matrix varies greatly between meta-paths, the optimal value of $k$ is determined with the K-Means method.
Step 32, refine the initial clustering result by introducing temporal relationships into the congestion features. Given a feature matrix $M(r)$, the correlation matrix $W$ expresses the temporal similarity constraint between any two time intervals of $M(r)$; each entry of $W$ is the weight $w_{i,j}$ obtained below. The aim is to find $k$ cluster centers that deviate as little as possible from the actual observations while determining the soft assignment of the rows of $M(r)$ to the $k$ cluster centers.
Step 33, the similarity between adjacent fixed time intervals is taken as the first type of temporal similarity constraint; that is, the traffic condition of a meta-path is unlikely to change from completely clear to completely congested over consecutive time intervals. An exponential decay function is therefore defined between the $i$-th and the $j$-th time interval:
Figure GDA0003274322780000052
where $t_i$ and $t_j$ are the start times of time intervals $f_i$ and $f_j$.
Step 34, the second type of temporal similarity constraint is based on the periodic similarity of urban traffic flow. The quantity of interest is the time (SOD) taken to reach the peak period within time interval $f_i$, denoted $h_i$, regardless of whether the interval falls on a weekday or a weekend. The SOD weight between $f_i$ and $f_j$ is defined as:
Figure GDA0003274322780000053
Step 35, the edge weight $w_{i,j}$ is computed as a linear combination of $C_{SOD}$ and $C_{sm}$: $w_{i,j} = \theta C_{sm}(i,j) + (1-\theta) C_{SOD}(i,j)$, where $\theta$ is a coefficient.
Step 36, $Q$ is an $N \times k$ cluster assignment matrix. Each row $q_i$ of $Q$ is a binary vector: if time interval $f_i$ is assigned to cluster $j$, then $q_i(j) = 1$; otherwise $q_i(j) = 0$. The $k$ cluster centers are the row vectors of a $k \times s$ matrix $C$. $Q$ and $C$ are initialized with the results above and then updated iteratively until the following minimization problem is solved:
Figure GDA0003274322780000061
where $L$ is the Laplacian matrix, $L = D - W$, and $D$ is the diagonal degree matrix of $W$:
$$D_{i,i} = \sum_{j} w_{i,j}$$
The constant coefficient $\gamma$ controls the weight of the temporal consistency term in the clustering process, and the problem above is solved with an alternating direction optimization method.
Step 4, the online prediction stage: input the path trajectory whose travel time is to be predicted.
In another aspect, the invention further provides a mobile object travel time prediction device based on meta-path congestion pattern mining, comprising: a mobile object, an intelligent sensing device carried by the mobile object, a server and a road network. The mobile object sends GPS data to the server at fixed time intervals through the intelligent sensing device. The server matches the GPS data to the road network with a map matching algorithm and stores the resulting path trajectories, divides the matched path trajectories into meta-paths and stores them in a path dictionary, mines the congestion state of each meta-path at each time, and extracts congestion features according to the correlation between meta-paths. The server then adds the extracted congestion features to a feature matrix and fills the missing values in the feature matrix with a k-means clustering algorithm.
The data sent by the mobile object to the server have the following format: mobile object id, time, longitude, latitude, speed. Connection problems and transmission delay between the mobile object and the server are not considered; it is assumed that technologies such as WiFi and cellular networks cover the entire area and provide the corresponding service.
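A record in this format could be represented as below. This is a sketch only: the text fixes the five fields but not their encoding, so the comma-separated layout, the units, the sample values and the names `GpsRecord` and `parse_gps_record` are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GpsRecord:
    object_id: str    # mobile object id
    time: float       # assumed: seconds since an arbitrary epoch
    longitude: float  # assumed: decimal degrees
    latitude: float   # assumed: decimal degrees
    speed: float      # assumed: metres per second


def parse_gps_record(line: str) -> GpsRecord:
    """Parse one comma-separated record (the separator is an assumption)."""
    object_id, time, lon, lat, speed = (field.strip() for field in line.split(","))
    return GpsRecord(object_id, float(time), float(lon), float(lat), float(speed))


# Hypothetical sample record.
record = parse_gps_record("car-17, 1546300800, 118.7969, 32.0603, 11.5")
```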
The server partitions the urban road network, extracts the road segments between all adjacent intersections of the road network, and identifies each of them by a road-segment id.
In one embodiment, a meta-path is the path between two adjacent intersections of the road network and may be traversed by one or more trajectories. The path dictionary is the collection of meta-paths. Because the traffic states of adjacent meta-paths are not independent, the spatial relationship between meta-paths is used to capture the characteristics of local traffic patterns. The specific steps are as follows:
S1, extract the travel time $d(r)$ of each meta-path $r$ from the historical data and compute its cumulative distribution function $F_r$.
The congestion state of the meta-path is computed from the cumulative distribution function:
$$\mathrm{cs}(r) = F_r\big(d(r)\big)$$
It is uniformly distributed on $[0,1]$, so the congestion levels of different meta-paths can be compared.
S2, the neighborhood set $NB(r) = \{o_1, \dots, o_s\}$ denotes the $s$ meta-paths adjacent to meta-path $r$ (including $r$). To compute its dynamic congestion state, the whole time range is discretized into fixed time intervals $f_i$, and the travel times observed on each meta-path $o$ within each time interval are collected:
$$d(o)_{f_i}$$
A time interval $f_i$ may contain several observations; they are represented by their expectation, and the congestion state is computed as:
$$\mathrm{cs}_{f_i}(o) = F_o\big(\mathbb{E}[d(o)_{f_i}]\big)$$
S3, given meta-path $r$ and its neighborhood $\{o_1, \dots, o_s\}$, define the feature vector $M(r)_i$ of time interval $f_i$. $M(r)_i$ is an $s$-dimensional vector containing the congestion states of $NB(r)$ in time interval $f_i$, i.e.
$$M(r)_i = \big[\mathrm{cs}_{f_i}(o_1), \dots, \mathrm{cs}_{f_i}(o_s)\big]$$
All computed $M(r)_i$ are stacked into the feature matrix $M(r)$. Let $N$ be the number of feature vectors in the feature matrix, i.e. the total number of discrete time intervals. The matrix $M(r)$ represents the dynamic congestion state of $NB(r)$ and has the following structure:
$$M(r) = \begin{bmatrix} \mathrm{cs}_{f_1}(o_1) & \cdots & \mathrm{cs}_{f_1}(o_s) \\ \vdots & \ddots & \vdots \\ \mathrm{cs}_{f_N}(o_1) & \cdots & \mathrm{cs}_{f_N}(o_s) \end{bmatrix}$$
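Steps S1 to S3 can be sketched as a single routine that builds the feature matrix row by row, leaving a NaN wherever a meta-path has no observation in an interval. This is an illustrative reading of the text rather than the authors' code: the input layout (`history`, `observations` keyed by meta-path and interval index) and the name `build_feature_matrix` are assumptions, and the congestion state is taken to be the empirical CDF evaluated at the mean of the interval's observations.

```python
import math
from bisect import bisect_right
from statistics import mean


def build_feature_matrix(neighborhood, history, observations, num_intervals):
    """Return the N x s matrix M(r) as a list of rows.

    neighborhood:  meta-path ids o_1..o_s (including r itself)
    history:       {meta_path: [all historical travel times]}, used for the CDF
    observations:  {(meta_path, interval_index): [travel times in that interval]}
    Entries with no observation are NaN (the sparse, missing values).
    """
    matrix = []
    for i in range(num_intervals):
        row = []
        for o in neighborhood:
            observed = observations.get((o, i))
            if not observed:
                row.append(math.nan)
                continue
            ordered = sorted(history[o])
            # Several observations in one interval are replaced by their mean.
            row.append(bisect_right(ordered, mean(observed)) / len(ordered))
        matrix.append(row)
    return matrix


# Hypothetical data: two meta-paths, two time intervals.
history = {"r": [6, 6, 7, 7, 8, 9], "o2": [10, 12, 14, 16]}
observations = {("r", 0): [6, 8], ("o2", 0): [12], ("r", 1): [9]}
M = build_feature_matrix(["r", "o2"], history, observations, num_intervals=2)
```

In this example the second row has no observation for `o2`, so it carries a NaN that the clustering stage is later expected to fill.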
S4, sparse data processing: the missing values in the feature matrix are filled with a k-means clustering algorithm, specifically as follows.
Cluster the rows of $M(r)$ into $k$ groups and compute the $k$ cluster centers $c_1, \dots, c_k$; then, for each row of $M(r)$, find the nearest cluster center and use it to initialize the missing values of that row. Because the size of the feature matrix varies greatly between meta-paths, the optimal value of $k$ is determined with the K-Means method.
The initial clustering result is refined by introducing temporal relationships into the congestion features. Given the feature matrix $M(r)$, a correlation matrix $W$ expresses the temporal similarity constraint between any two time intervals of $M(r)$. The aim is to find $k$ cluster centers that deviate as little as possible from the actual observations while determining the soft assignment of the rows of $M(r)$ to the $k$ cluster centers.
The similarity between adjacent fixed time intervals is taken as the first type of temporal similarity constraint; that is, the traffic condition of a meta-path is unlikely to change from completely clear to completely congested over consecutive time intervals. An exponential decay function is therefore defined between the $i$-th and the $j$-th time interval:
Figure GDA0003274322780000075
where $t_i$ and $t_j$ are the start times of time intervals $f_i$ and $f_j$.
The second type of temporal similarity constraint is based on the periodic similarity of urban traffic flow. The quantity of interest is the time (SOD) taken to reach the peak period within time interval $f_i$, denoted $h_i$, regardless of whether the interval falls on a weekday or a weekend. The SOD weight between $f_i$ and $f_j$ is defined as:
Figure GDA0003274322780000081
The edge weight $w_{i,j}$ is computed as a linear combination of $C_{SOD}$ and $C_{sm}$, with the coefficient $\theta = 0.5$: $w_{i,j} = \theta C_{sm}(i,j) + (1-\theta) C_{SOD}(i,j)$.
$Q$ is an $N \times k$ cluster assignment matrix. Each row $q_i$ of $Q$ is a binary vector: if time interval $f_i$ is assigned to cluster $j$, then $q_i(j) = 1$; otherwise $q_i(j) = 0$. The $k$ cluster centers are the row vectors of a $k \times s$ matrix $C$. $Q$ and $C$ are initialized with the results above, and the optimal values are then found by solving the following minimization problem:
Figure GDA0003274322780000082
where $L$ is the Laplacian matrix, $L = D - W$, and $D$ is the diagonal degree matrix of $W$:
$$D_{i,i} = \sum_{j} w_{i,j}$$
The constant coefficient $\gamma$ controls the weight of the temporal consistency term in the clustering process, and the problem above is solved with an alternating direction optimization method.
To verify the effectiveness of the invention, the following experiment was carried out. The user sends trajectory data to the navigation system through an intelligent sensing device with GPS positioning, such as a smartphone. Each trajectory record contains: user id, time, longitude, latitude and speed. The user id may be a mobile phone model or the user's phone number.
The navigation system collects the user's trajectory information and stores it on the server; the server matches the trajectory to the road network with a map matching algorithm and removes invalid trajectory points.
The urban road network is divided into meta-paths, a meta-path being the road segment between two adjacent intersections of the road network, and a file named after each meta-path id is created on the server.
The matched trajectory is divided into meta-paths, each of which is stored in the file named after its meta-path id; the travel time $d(r)$ of each meta-path is computed, together with its cumulative distribution function $F_r$. The congestion state of the meta-path is computed from the cumulative distribution function:
$$\mathrm{cs}(r) = F_r\big(d(r)\big)$$
It is uniformly distributed on $[0,1]$, so the congestion levels of different meta-paths can be compared.
As shown in FIG. 3, a path has different congestion states at different times. The congestion states are uniformly distributed on $[0,1]$ and are represented by shades of color: the darkest color is the most congested state and the lightest color is the least congested state.
As shown in FIG. 4, $r_1$, $r_2$ and $r_3$ are different meta-paths at an intersection, and their three congestion states are visualized. Dark colors indicate congestion and light colors indicate free flow.
For any meta-path, its neighbors are found and a time interval $f_i$ is defined, for example 30 minutes. The day is divided into intervals $f_i$, and for each $f_i$ a feature vector is computed, namely the congestion states of the meta-path and its adjacent paths within that time interval. Finally, all computed feature vectors are added to an overall matrix $M(r)$.
Because the data for some roads are sparse, $M(r)$ may contain missing values. The optimal value of $k$ is determined with the K-Means method, the rows of $M(r)$ are clustered into $k$ groups, and the $k$ cluster centers $c_1, \dots, c_k$ are computed; then, for each row of $M(r)$, the nearest cluster center is found and used to initialize the missing values of that row.
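The initialization described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the patented procedure: it runs ordinary Lloyd-style k-means, measures distances only over the observed entries of each row, takes `k` as given rather than selecting it, and assumes every column has at least one observation. The names `masked_sq_dist` and `kmeans_fill` are hypothetical.

```python
import math
import random


def masked_sq_dist(row, center):
    """Squared distance using only the observed (non-NaN) entries of row."""
    return sum((x - c) ** 2 for x, c in zip(row, center) if not math.isnan(x))


def kmeans_fill(matrix, k, iterations=20, seed=0):
    """Fill NaN entries of each row with the value of its nearest cluster center."""
    n, s = len(matrix), len(matrix[0])
    # Column means over observed entries give a first complete version of
    # each row (assumes every column has at least one observation).
    col_means = []
    for j in range(s):
        seen = [row[j] for row in matrix if not math.isnan(row[j])]
        col_means.append(sum(seen) / len(seen))
    filled = [[col_means[j] if math.isnan(x) else x for j, x in enumerate(row)]
              for row in matrix]
    centers = [list(row) for row in random.Random(seed).sample(filled, k)]
    labels = [0] * n
    for _ in range(iterations):
        # Assign each row to the nearest center, comparing observed entries only.
        labels = [min(range(k), key=lambda c: masked_sq_dist(row, centers[c]))
                  for row in matrix]
        # Recompute each center as the mean of its (currently filled) members.
        for c in range(k):
            members = [filled[i] for i in range(n) if labels[i] == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
        # Observed entries stay fixed; missing ones take the center's value.
        filled = [[centers[labels[i]][j] if math.isnan(x) else x
                   for j, x in enumerate(row)]
                  for i, row in enumerate(matrix)]
    return filled, labels, centers


M = [[0.1, 0.2], [0.15, math.nan], [0.9, 0.8], [math.nan, 0.85]]
filled, labels, centers = kmeans_fill(M, k=2)
```

Observed congestion states are never overwritten; only the gaps are replaced, so the filled matrix remains consistent with the actual measurements.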
Similarity between adjacent time intervals is calculated, and an exponential decay function is defined between the ith time interval and the jth time interval:
Figure GDA0003274322780000091
where $t_i$ and $t_j$ are the start times of time intervals $f_i$ and $f_j$, and $\sigma_{sm}$ defaults to 2.
Calculating the periodic similarity based on the urban traffic flow:
Figure GDA0003274322780000092
The edge weight $w_{i,j}$ is computed from $C_{SOD}$ and $C_{sm}$, with the coefficient $\theta = 0.5$:
$$w_{i,j} = \theta C_{sm}(i,j) + (1-\theta) C_{SOD}(i,j)$$
The similarity constraint between any two time intervals of $M(r)$ is computed from these two similarities and represented by the correlation matrix $W$.
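The linear combination of the two similarities can be sketched as follows. The combination rule and the values $\theta = 0.5$ and $\sigma_{sm} = 2$ come from the text; everything else is an assumption made loudly here, because the two similarity functions survive only as figures: the sketch assumes the decay form $\exp(-|t_i - t_j| / \sigma_{sm})$ for $C_{sm}$, an analogous form with a hypothetical parameter `sigma_sod` for $C_{SOD}$, and times expressed in hours.

```python
import math


def c_sm(t_i, t_j, sigma_sm=2.0):
    """Adjacent-interval similarity (ASSUMED form: exp(-|t_i - t_j| / sigma_sm))."""
    return math.exp(-abs(t_i - t_j) / sigma_sm)


def c_sod(h_i, h_j, sigma_sod=2.0):
    """Periodic (SOD) similarity (ASSUMED form; sigma_sod is hypothetical)."""
    return math.exp(-abs(h_i - h_j) / sigma_sod)


def edge_weight(t_i, t_j, h_i, h_j, theta=0.5):
    """w_ij = theta * C_sm(i, j) + (1 - theta) * C_SOD(i, j), as in the text."""
    return theta * c_sm(t_i, t_j) + (1 - theta) * c_sod(h_i, h_j)


# Two intervals 24 hours apart (t in hours) that share the same SOD value.
w_periodic = edge_weight(t_i=8, t_j=32, h_i=8, h_j=8)
# An interval compared with itself.
w_self = edge_weight(t_i=8, t_j=8, h_i=8, h_j=8)
```

Under these assumptions, intervals far apart in absolute time still keep a weight near $1 - \theta$ when their SOD values coincide, which is how the periodic term lets observations from other days inform a sparse interval.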
$Q$ is an $N \times k$ cluster assignment matrix. Each row $q_i$ of $Q$ is a binary vector: if time interval $f_i$ is assigned to cluster $j$, then $q_i(j) = 1$; otherwise $q_i(j) = 0$.
The $k$ cluster centers are the row vectors of a $k \times s$ matrix $C$. $Q$ and $C$ are initialized with the results above, and the optimal values are then found by solving the following minimization problem:
Figure GDA0003274322780000101
where $L$ is the Laplacian matrix, $L = D - W$, and $D$ is the diagonal degree matrix of $W$:
$$D_{i,i} = \sum_{j} w_{i,j}$$
The clustering result is refined once the final values have been obtained.
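The role of the Laplacian can be illustrated with a small numeric sketch. The definition $L = D - W$ is from the text; that the temporal consistency term weighted by $\gamma$ takes the common graph-regularization form $\mathrm{tr}(Q^\top L Q)$ is an assumption, since the objective itself is available only as a figure. The names `laplacian` and `temporal_penalty` and the sample weights are hypothetical.

```python
def laplacian(W):
    """L = D - W, where D is the diagonal matrix of the row sums of W."""
    n = len(W)
    return [[(sum(W[i]) if i == j else 0.0) - W[i][j] for j in range(n)]
            for i in range(n)]


def temporal_penalty(Q, L):
    """tr(Q^T L Q): ASSUMED form of the temporal consistency term."""
    n, k = len(Q), len(Q[0])
    return sum(Q[i][c] * L[i][j] * Q[j][c]
               for c in range(k) for i in range(n) for j in range(n))


# Intervals 0 and 1 are strongly similar; interval 2 is weakly tied to both.
W = [[0.0, 1.0, 0.1],
     [1.0, 0.0, 0.1],
     [0.1, 0.1, 0.0]]
L = laplacian(W)

# Assignment that keeps the similar intervals 0 and 1 in the same cluster.
penalty_together = temporal_penalty([[1, 0], [1, 0], [0, 1]], L)
# Assignment that separates them.
penalty_split = temporal_penalty([[1, 0], [0, 1], [0, 1]], L)
```

For a symmetric $W$ this quantity equals $\tfrac{1}{2}\sum_{i,j} w_{i,j}\lVert q_i - q_j\rVert^2$, so separating two strongly similar intervals costs more than separating weakly similar ones; a larger $\gamma$ would then push the clustering toward temporally consistent assignments.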
The user inputs the time of travel and the travel trajectory, and the system returns the predicted travel time for the user's reference.

Claims (4)

1. A method for predicting the travel time of a mobile object based on meta-path congestion pattern mining is characterized by comprising the following steps:
(1) collecting a plurality of GPS data at fixed time intervals as training samples, and matching the GPS data to a road network through a map matching algorithm to obtain a path track corresponding to the GPS data;
(2) dividing the matched path trajectory into meta-paths, storing the meta-paths in a path dictionary, mining the congestion state of each meta-path at each time, and extracting congestion feature vectors according to the correlation between different meta-paths, wherein a meta-path is the path between any adjacent intersections of the road network, and the path dictionary is the set of meta-paths;
the step (2) specifically comprises:
step (21), extracting the travel time $d(r)$ of each meta-path $r$ from historical data and computing its cumulative distribution function $F_r$, wherein the historical data are the historical GPS trajectories of mobile objects, i.e. of mobile objects that have previously traversed the meta-path and uploaded their GPS data;
computing the congestion state of the meta-path from the cumulative distribution function:
$$\mathrm{cs}(r) = F_r\big(d(r)\big)$$
which is uniformly distributed on $[0,1]$, so that the congestion levels of different meta-paths and of different travel times can be compared;
step (22), the neighborhood set $NB(r) = \{o_1, \dots, o_s\}$ denotes the $s$ meta-paths adjacent to meta-path $r$, including $r$ itself; to compute its dynamic congestion state, the whole time range is discretized into fixed time intervals $f_i$, and the travel times observed on each meta-path $o$ within each time interval are collected:
$$d(o)_{f_i}$$
a time interval $f_i$ containing several observations is represented by the expectation of those observations, and the congestion state is computed as:
$$\mathrm{cs}_{f_i}(o) = F_o\big(\mathbb{E}[d(o)_{f_i}]\big)$$
step (23), given meta-path $r$ and its neighborhood $\{o_1, \dots, o_s\}$, defining the feature vector $M(r)_i$ of time interval $f_i$, $M(r)_i$ being an $s$-dimensional vector containing the congestion states of $NB(r)$ in time interval $f_i$, i.e.:
$$M(r)_i = \big[\mathrm{cs}_{f_i}(o_1), \dots, \mathrm{cs}_{f_i}(o_s)\big]$$
all computed $M(r)_i$ being stacked into the feature matrix $M(r)$;
letting $N$ be the number of feature vectors in the feature matrix, i.e. the total number of discrete time intervals, the matrix $M(r)$ represents the dynamic congestion state of $NB(r)$ and has the following structure:
$$M(r) = \begin{bmatrix} \mathrm{cs}_{f_1}(o_1) & \cdots & \mathrm{cs}_{f_1}(o_s) \\ \vdots & \ddots & \vdots \\ \mathrm{cs}_{f_N}(o_1) & \cdots & \mathrm{cs}_{f_N}(o_s) \end{bmatrix}$$
(3) adding the extracted congestion feature vectors to a feature matrix, and filling the missing values in the feature matrix with a K-Means clustering algorithm to obtain a prediction model;
in the step (3), the missing values in the feature matrix are filled with the K-Means clustering algorithm and the initial clustering result is refined with temporal similarity constraints, the temporal similarity constraints comprising the similarity between adjacent fixed time intervals and the periodic similarity of urban traffic flow; the step (3) specifically comprises the following steps:
step (31), clustering the rows of $M(r)$ into $k$ groups and computing the $k$ cluster centers $c_1, \dots, c_k$; then, for each row of $M(r)$, finding the nearest cluster center and using it to initialize the missing values of that row, the optimal value of $k$ being determined with the K-Means method;
step (32), refining the initial clustering result by introducing temporal relationships into the congestion features; given a feature matrix $M(r)$, the correlation matrix $W$ expresses the temporal similarity constraint between any two time intervals of $M(r)$, each entry of $W$ being the weight $w_{i,j}$ obtained below; finding $k$ cluster centers that reduce the deviation from the actual observations while determining the soft assignment of the rows of $M(r)$ to the $k$ cluster centers;
step (33), taking the similarity between adjacent fixed time intervals as the first type of temporal similarity constraint, i.e. the traffic condition of a meta-path cannot change from completely clear to completely congested over consecutive time intervals, and therefore defining an exponential decay function between the $i$-th and the $j$-th time interval:
Figure FDA0003274322770000022
where $t_i$ and $t_j$ are the start times of time intervals $f_i$ and $f_j$;
step (34), the second type of temporal similarity constraint is based on the periodic similarity of urban traffic flow; regardless of whether an interval falls on a weekday or a weekend, the weight between the times taken to reach the peak period in the $i$-th time interval $f_i$ and the $j$-th time interval $f_j$ is defined as:
Figure FDA0003274322770000023
wherein h_i is the time taken to reach the peak period within time interval f_i;
step (35), the edge weight w_{i,j} is calculated as a linear combination of C_SOD and C_sm:
w_{i,j} = θ·C_sm(i, j) + (1 − θ)·C_SOD(i, j), where θ is a coefficient;
step (36), Q is an N×k cluster assignment matrix, and each row q_i of Q is a binary vector: if time interval f_i is assigned to cluster j, then q_i(j) is 1, otherwise 0; the k cluster centers are the row vectors of the k×s matrix C; Q and C are initialized using the above results, and the optimum is then found by iteratively solving the following minimization problem:
Figure FDA0003274322770000031
wherein L is the Laplacian matrix, L = D − W,
Figure FDA0003274322770000032
the constant coefficient γ controls the weight of the temporal consistency term in the clustering process, and the above formula is solved using an alternating direction optimization method;
(4) inputting a path trajectory whose travel time is to be predicted.
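Step (31) of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the claim does not state how rows with missing entries are compared against the cluster centers, so the sketch assumes a partial distance over the observed entries only, takes k as given instead of searching for it, and seeds the centers with the first k complete rows. The names `kmeans_fill` and `partial_sq_dist` are hypothetical.

```python
from typing import List, Optional

Row = List[Optional[float]]  # one row of M(r); None marks a missing congestion value


def partial_sq_dist(row: Row, center: List[float]) -> float:
    """Mean squared distance over the observed (non-None) entries of row."""
    pairs = [(x, c) for x, c in zip(row, center) if x is not None]
    if not pairs:
        return 0.0  # a fully empty row is equally close to every center
    return sum((x - c) ** 2 for x, c in pairs) / len(pairs)


def kmeans_fill(rows: List[Row], k: int, iters: int = 20) -> List[List[float]]:
    """Cluster the rows into k groups and fill gaps from the nearest center."""
    s = len(rows[0])
    complete = [r for r in rows if all(x is not None for x in r)]
    if len(complete) < k:
        raise ValueError("need at least k complete rows to seed the centers")
    # Assumption (not from the source): deterministic seeding with complete rows.
    centers = [list(r) for r in complete[:k]]
    assign = [0] * len(rows)
    for _ in range(iters):
        # Assignment step: nearest center under the partial distance.
        assign = [
            min(range(k), key=lambda j: partial_sq_dist(r, centers[j]))
            for r in rows
        ]
        # Update step: per-column mean of the observed values in each cluster.
        for j in range(k):
            for col in range(s):
                vals = [
                    r[col]
                    for r, a in zip(rows, assign)
                    if a == j and r[col] is not None
                ]
                if vals:  # keep the old coordinate if nobody observed this column
                    centers[j][col] = sum(vals) / len(vals)
    # Initialise each missing entry with the matching coordinate of its center.
    return [
        [centers[a][c] if x is None else x for c, x in enumerate(r)]
        for r, a in zip(rows, assign)
    ]
```

Observed entries are returned unchanged; only the gaps are initialised, which matches the role of step (31) as a starting point for the refinement in steps (32) to (36).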
2. The method according to claim 1, wherein in step (1), the road network is a directed graph composed of a set of intersection nodes and a set of road-segment edges.
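As an illustration of the directed-graph road network in claim 2, the following Python sketch stores intersections as nodes and road segments as directed edges, and splits a map-matched route into meta-paths (paths between adjacent intersections). The class `RoadNetwork`, the function `split_into_meta_paths`, and the use of string identifiers for intersections are assumptions made for the example, not details from the patent.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

MetaPath = Tuple[str, str]  # (from_intersection, to_intersection)


@dataclass
class RoadNetwork:
    """Directed graph: intersections are nodes, road segments are directed edges."""

    successors: Dict[str, List[str]] = field(default_factory=dict)

    def add_segment(self, src: str, dst: str) -> None:
        """Add a one-way road segment from src to dst."""
        self.successors.setdefault(src, []).append(dst)
        self.successors.setdefault(dst, [])  # make sure dst exists as a node

    def meta_paths(self) -> List[MetaPath]:
        """Every directed edge is one meta-path; together they form the dictionary."""
        return [(u, v) for u, outs in self.successors.items() for v in outs]


def split_into_meta_paths(route: List[str], net: RoadNetwork) -> List[MetaPath]:
    """Split a map-matched route (a sequence of intersections) into meta-paths."""
    pairs = list(zip(route, route[1:]))
    for u, v in pairs:
        if v not in net.successors.get(u, []):
            raise ValueError(f"no road segment from {u} to {v}")
    return pairs
```

Because the graph is directed, a route that follows a one-way segment in the wrong direction is rejected instead of being silently split.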
3. A mobile object travel time prediction device based on meta-path congestion pattern mining, comprising: a mobile object, an intelligent sensing device carried by the mobile object, a server, and a road network, wherein the mobile object sends GPS data to the server at fixed time intervals through the intelligent sensing device; the server matches the GPS data to the road network through a map matching algorithm and stores the resulting path trajectories, divides the matched path trajectories into meta-paths and stores the meta-paths in a path dictionary, mines the congestion state of each meta-path at each time, and extracts congestion features according to the correlation among the meta-paths; the server adds the extracted congestion features to a feature matrix, fills the missing values in the feature matrix using a K-Means clustering algorithm, and optimizes the initial clustering result using time similarity constraints, wherein the time similarity constraints comprise the similarity between adjacent fixed time intervals and the periodic similarity of urban traffic flow;
according to the correlation among the meta-paths, the congestion features are extracted as follows:
the meta-path is a path between any adjacent intersections on the road network, and the path dictionary is a set of meta-paths;
step (21), extracting the passage time d(r) of each meta-path from historical data and calculating the cumulative distribution function of the passage time d(r), wherein the historical data are the historical GPS trajectories of mobile objects, i.e. the GPS data uploaded by mobile objects that traversed the meta-path in the past;
calculating the congestion status on the meta-path according to the cumulative distribution function:
Figure FDA0003274322770000041
Figure FDA0003274322770000042
is uniformly distributed on [0, 1], which allows the degree of congestion to be compared across different meta-paths and different travel times;
step (22), the neighborhood set nb(r) = {o_1, …, o_s} of meta-path r represents the s meta-paths adjacent to meta-path r, including meta-path r itself; to calculate the dynamic congestion state, the whole time range is discretized into fixed time intervals f_i, and the passage time observed on the meta-path in each time interval is calculated:
Figure FDA0003274322770000043
where a time interval f_i contains multiple observations, the observations are represented by their expectation, and the congestion state is calculated:
Figure FDA0003274322770000044
step (23), given the meta-path r and its neighborhood {o_1, …, o_s}, defining the feature vector M(r)_i of time interval f_i, where M(r)_i is an s-dimensional vector containing the congestion status of nb(r) in time interval f_i, i.e.:
Figure FDA0003274322770000045
all calculated M(r)_i are stacked into the feature matrix M(r);
let N be the number of feature vectors in the feature matrix, i.e. the total number of discrete time intervals; the matrix M(r) represents the dynamic congestion status of nb(r) and has the following structure:
Figure FDA0003274322770000046
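Steps (21) to (23) can be sketched in Python as follows. The exact formulas are in figures that are not reproduced in this text, so the sketch rests on stated assumptions: the congestion state of a meta-path is taken to be the empirical cumulative distribution function of its historical passage times, evaluated at the mean passage time observed in an interval, and intervals without observations are left as `None` for the later filling step. The names `empirical_cdf` and `feature_matrix` and the tuple layout of `observations` are hypothetical.

```python
from bisect import bisect_right
from typing import Callable, Dict, List, Optional, Tuple


def empirical_cdf(history: List[float]) -> Callable[[float], float]:
    """Return F(x) = fraction of historical passage times that are <= x."""
    ordered = sorted(history)
    n = len(ordered)
    return lambda x: bisect_right(ordered, x) / n


def feature_matrix(
    history: Dict[str, List[float]],               # meta-path id -> past passage times
    observations: List[Tuple[str, float, float]],  # (meta-path id, timestamp s, passage time)
    neighborhood: List[str],                       # nb(r) = [o_1, ..., o_s], including r
    interval_s: float,                             # length of one fixed interval f_i
    n_intervals: int,                              # N, the number of intervals
) -> List[List[Optional[float]]]:
    """Build the N x s matrix M(r); None marks a missing congestion value."""
    cdfs = {o: empirical_cdf(history[o]) for o in neighborhood}
    buckets: Dict[Tuple[int, str], List[float]] = {}
    for path, ts, passage in observations:
        i = int(ts // interval_s)  # index of the interval containing the timestamp
        if path in cdfs and 0 <= i < n_intervals:
            buckets.setdefault((i, path), []).append(passage)
    matrix: List[List[Optional[float]]] = []
    for i in range(n_intervals):
        row: List[Optional[float]] = []
        for o in neighborhood:
            obs = buckets.get((i, o))
            # Several observations in one interval are replaced by their mean.
            row.append(cdfs[o](sum(obs) / len(obs)) if obs else None)
        matrix.append(row)
    return matrix
```

Mapping passage times through each meta-path's own distribution puts long and short segments on the same [0, 1] scale, which is what makes the columns of M(r) comparable.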
the initial clustering result is optimized using time similarity constraints, with the similarity between adjacent fixed time intervals and the periodic similarity of urban traffic flow serving as the time similarity constraints, specifically comprising the following steps:
step (31), clustering the rows of M(r) into k groups and calculating the k cluster centers c_1, …, c_k; then finding the cluster center nearest to each row of M(r) and initializing the missing values in M(r) with that cluster center, the optimal k value being found by the K-Means method;
step (32), optimizing the initial clustering result by introducing temporal relations into the congestion features; given a feature matrix M(r), the correlation matrix W represents the time similarity constraint between two time intervals in M(r), each entry of the matrix being w_{i,j} as obtained below; k cluster centers are sought that reduce the difference from the actual observed values, while the soft assignment of the rows of M(r) to the k cluster centers is determined at the same time;
step (33), setting the similarity between adjacent fixed time intervals as the first type of time similarity constraint, i.e. the traffic condition of a meta-path cannot transition from a completely free-flowing state to a completely congested state in consecutive time intervals; an exponential decay function between the i-th time interval and the j-th time interval is therefore defined:
Figure FDA0003274322770000051
t_i and t_j are the start times of time intervals f_i and f_j;
step (34), the second type of time similarity constraint is based on the periodic similarity of urban traffic flow; regardless of whether the day is a weekday or a weekend, a weight is defined between the times taken to reach the peak period in the i-th time interval f_i and the j-th time interval f_j:
Figure FDA0003274322770000052
wherein h_i is the time taken to reach the peak period within time interval f_i;
step (35), the edge weight w_{i,j} is calculated as a linear combination of C_SOD and C_sm:
w_{i,j} = θ·C_sm(i, j) + (1 − θ)·C_SOD(i, j), where θ is a coefficient;
step (36), Q is an N×k cluster assignment matrix, and each row q_i of Q is a binary vector: if time interval f_i is assigned to cluster j, then q_i(j) is 1, otherwise 0; the k cluster centers are the row vectors of the k×s matrix C; Q and C are initialized using the above results, and the optimum is then found by iteratively solving the following minimization problem:
Figure FDA0003274322770000053
wherein L is the Laplacian matrix, L = D − W,
Figure FDA0003274322770000054
the constant coefficient γ controls the weight of the temporal consistency term in the clustering process, and the above formula is solved using an alternating direction optimization method.
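The weight combination of step (35) and the Laplacian of step (36) can be sketched in Python as below. The definitions of C_sm, C_SOD and D are in figures that are not reproduced here, so the sketch takes the two similarity matrices as precomputed inputs and assumes D is the usual diagonal degree matrix with D_ii equal to the sum of row i of W. The minimization problem itself is not implemented; the helper `smoothness` only evaluates the quadratic form x^T L x, which for a symmetric W equals half the sum of w_{i,j}·(x_i − x_j)^2 and shows why a Laplacian term penalises assignments that differ between similar time intervals. All function names are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def edge_weights(c_sm: Matrix, c_sod: Matrix, theta: float) -> Matrix:
    """w[i][j] = theta * C_sm(i, j) + (1 - theta) * C_SOD(i, j), as in step (35)."""
    n = len(c_sm)
    return [
        [theta * c_sm[i][j] + (1 - theta) * c_sod[i][j] for j in range(n)]
        for i in range(n)
    ]


def laplacian(w: Matrix) -> Matrix:
    """L = D - W, assuming D is the diagonal matrix of the row sums of W."""
    n = len(w)
    return [
        [(sum(w[i]) if i == j else 0.0) - w[i][j] for j in range(n)]
        for i in range(n)
    ]


def smoothness(x: List[float], lap: Matrix) -> float:
    """Quadratic form x^T L x; it is zero when x is constant across intervals."""
    n = len(x)
    return sum(x[i] * lap[i][j] * x[j] for i in range(n) for j in range(n))
```

With two intervals whose combined weight is 0.75, assigning them different values costs 0.75 under `smoothness`, while assigning them the same value costs nothing.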
4. The apparatus of claim 3, wherein the GPS data is in the form of a mobile object id, time, longitude, latitude, and speed.
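The GPS record of claim 4 can be represented as a small Python data class. The claim lists only the five fields; the comma-separated wire format, the ISO 8601 timestamp, the coordinate range check, and the names `GpsRecord` and `parse_gps_record` are assumptions made for this sketch, and the unit of `speed` is left unspecified because the source does not state it.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class GpsRecord:
    """One GPS sample: mobile object id, time, longitude, latitude, speed."""

    object_id: str
    time: datetime
    longitude: float
    latitude: float
    speed: float  # unit not stated in the source


def parse_gps_record(line: str) -> GpsRecord:
    """Parse 'id,time,longitude,latitude,speed' (assumed comma-separated layout)."""
    fields = [part.strip() for part in line.split(",")]
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, got {len(fields)}")
    object_id, time_text, lon, lat, speed = fields
    record = GpsRecord(
        object_id=object_id,
        time=datetime.fromisoformat(time_text),  # assumes an ISO 8601 timestamp
        longitude=float(lon),
        latitude=float(lat),
        speed=float(speed),
    )
    # Sanity check on coordinate ranges before the record reaches map matching.
    if not (-180.0 <= record.longitude <= 180.0 and -90.0 <= record.latitude <= 90.0):
        raise ValueError("longitude or latitude out of range")
    return record
```

Rejecting malformed samples at parse time keeps obviously invalid points out of the map matching stage.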
CN201910110832.1A 2019-02-12 2019-02-12 Mobile object running time prediction method and device based on meta-path congestion mode mining Active CN109712402B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910110832.1A CN109712402B (en) 2019-02-12 2019-02-12 Mobile object running time prediction method and device based on meta-path congestion mode mining

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910110832.1A CN109712402B (en) 2019-02-12 2019-02-12 Mobile object running time prediction method and device based on meta-path congestion mode mining

Publications (2)

Publication Number Publication Date
CN109712402A CN109712402A (en) 2019-05-03
CN109712402B true CN109712402B (en) 2021-11-12

Family

ID=66264245

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910110832.1A Active CN109712402B (en) 2019-02-12 2019-02-12 Mobile object running time prediction method and device based on meta-path congestion mode mining

Country Status (1)

Country Link
CN (1) CN109712402B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110570650B (en) * 2019-05-17 2021-05-11 东南大学 Travel path and node flow prediction method based on RFID data
CN110598747B (en) * 2019-08-13 2023-05-02 广东工业大学 Road classification method based on self-adaptive K-means clustering algorithm
CN111739283B (en) * 2019-10-30 2022-05-20 腾讯科技(深圳)有限公司 Road condition calculation method, device, equipment and medium based on clustering

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101127159A (en) * 2007-09-18 2008-02-20 中国科学院软件研究所 Traffic flow data sampling and analysis based on network limited moving object database
CN104778274A (en) * 2015-04-23 2015-07-15 山东大学 Wide-range urban road network travel time estimation method based on sparse taxi GPS (Global Positioning System) data
CN104778834A (en) * 2015-01-23 2015-07-15 哈尔滨工业大学 Urban road traffic jam judging method based on vehicle GPS data
CN105006147A (en) * 2015-06-19 2015-10-28 武汉大学 Road segment travel time deducing method based on road space-time incidence relation
CN105679021A (en) * 2016-02-02 2016-06-15 重庆云途交通科技有限公司 Travel time fusion prediction and query method based on traffic big data
CN106127662A (en) * 2016-06-23 2016-11-16 福州大学 A kind of system of selection of the K means initial cluster center for taxi track data
US9779357B1 (en) * 2013-03-07 2017-10-03 Steve Dabell Method and apparatus for providing estimated patrol properties and historic patrol records
CN108629978A (en) * 2018-06-07 2018-10-09 重庆邮电大学 A kind of traffic trajectory predictions method based on higher-dimension road network and Recognition with Recurrent Neural Network
CN108765940A (en) * 2018-05-28 2018-11-06 南京邮电大学 Road congestion based on high-order Markov model finds method
CN108986453A (en) * 2018-06-15 2018-12-11 华南师范大学 A kind of traffic movement prediction method based on contextual information, system and device

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060247850A1 (en) * 2005-04-18 2006-11-02 Cera Christopher D Data-driven traffic views with keyroute status
US8620568B2 (en) * 2010-12-28 2013-12-31 Telenav, Inc. Navigation system with congestion estimation mechanism and method of operation thereof
CN103226892B (en) * 2013-04-08 2016-02-03 福建工程学院 A kind of road congestion state discovery method of Optimization-type
CN103839409B (en) * 2014-02-27 2015-09-09 南京大学 Based on the traffic flow modes method of discrimination of multibreak facial vision sensing cluster analysis
CN108922191B (en) * 2018-07-27 2021-05-04 重庆大学 Travel time calculation method based on soft set

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
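An improved K-means algorithm for the missing-data problem; Zhang Jianmin; Yao Liang; Hu Xuegang; Journal of Hefei University of Technology (Natural Science); 20080930; pp. 1455-1457 *
Urban road travel time prediction based on gradient boosting regression trees; Gong Yue; Luo Xiaoqin; Wang Dianhai; Yang Shaohui; Journal of Zhejiang University (Engineering Science); 20180331; pp. 453-460 *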

Also Published As

Publication number Publication date
CN109712402A (en) 2019-05-03

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant