CN109712402B - Mobile object running time prediction method and device based on meta-path congestion mode mining

Info

Publication number: CN109712402B
Application number: CN201910110832.1A
Authority: CN
Inventors: 韩京宇; 王宁
Original assignee: Nanjing University of Posts and Telecommunications
Current assignee: Nanjing University of Posts and Telecommunications
Priority date: 2019-02-12
Filing date: 2019-02-12
Publication date: 2021-11-12
Anticipated expiration: 2039-02-12
Also published as: CN109712402A

Abstract

The invention discloses a method for predicting the travel time of a mobile object based on meta-path congestion pattern mining, comprising the following steps: collecting a plurality of GPS data points at fixed time intervals as training samples, and matching the GPS data to a road network with a map matching algorithm to obtain the path track corresponding to the GPS data; dividing the matched path track into meta-paths, storing the meta-paths in a path dictionary, mining the congestion state of each meta-path at each time, and extracting congestion feature vectors according to the correlation among different meta-paths; adding the extracted congestion feature vectors to a feature matrix, and filling the missing values in the feature matrix with a K-Means clustering algorithm to obtain a prediction model; and inputting the path track whose travel time is to be predicted. The method extracts the congestion features of local roads, captures congestion change patterns at a finer granularity, and applies a K-Means clustering algorithm to sparse trajectory data, thereby providing accurate support for prediction.

Description

Mobile object running time prediction method and device based on meta-path congestion mode mining

Technical Field

The invention relates to the field of information technology, and in particular to a method and a device for predicting the travel time of a mobile object based on meta-path congestion pattern mining.

Background

With the rapid development of the mobile internet, satellite positioning technology, and LBS technology, more and more location-related services need to predict accurately the travel time of a current or future trip. Accurate travel time prediction helps drivers plan a reasonable route and avoid road congestion, and provides a reference for urban traffic construction.

Conventional data collection relies on static sensors installed on fixed streets and highways in a city, but these sensors do not cover the entire road network and are costly to maintain. With the development of GPS technology, trajectory data can be collected by GPS-equipped vehicles and used to predict traffic conditions. Because traffic conditions differ greatly between areas, congestion changes from moment to moment, and vehicles move at high speed, the travel time of the same route can differ greatly at different times. Moreover, for reasons of cost, technology, and privacy, trajectory data are often severely sparse: little trajectory data is available for any specific time and place.

Existing travel time prediction methods fall into two categories: route-based methods and trip-based methods. Route-based prediction is the conventional approach, in which the travel time of a specific route is derived from historical trajectory data. Typical methods include building traffic flow models to simulate the travel time distribution of a path, and using dynamic Bayesian methods or pattern matching. However, because traffic conditions differ greatly between areas and road congestion changes constantly, travel time is difficult to calculate accurately, and the computational cost of complex mathematical models is high, which makes online prediction harder. Trip-based prediction requires finding, in the historical data, a trip with the same departure time, start point, and end point, and bases the prediction on it. However, such a matching trip is hard to find when data are sparse, and when a new trip plan appears its travel time cannot be estimated accurately.

Disclosure of Invention

Purpose of the invention: to overcome the deficiencies of the prior art, the invention provides a method for predicting the travel time of a mobile object based on meta-path congestion pattern mining, which addresses the low accuracy and poor real-time performance of travel time prediction for mobile objects on a road network. The invention also provides a corresponding device for predicting the travel time of a mobile object based on meta-path congestion pattern mining.

Technical scheme: the method for predicting the travel time of a mobile object based on meta-path congestion pattern mining comprises the following steps:

(1) collecting a plurality of GPS data points at fixed time intervals as training samples, and matching the GPS data to a road network with a map matching algorithm to obtain the path track corresponding to the GPS data;

(2) dividing the matched path track into meta-paths, storing the meta-paths in a path dictionary, mining the congestion state of each meta-path at each time, and extracting congestion feature vectors according to the correlation among different meta-paths;

(3) adding the extracted congestion feature vectors to a feature matrix, and filling the missing values in the feature matrix with a K-Means clustering algorithm to obtain a prediction model;

(4) inputting the path track whose travel time is to be predicted.

Preferably, in the step (2), the meta-path is a path between any adjacent intersections on the road network.

Preferably, in the step (2), the path dictionary is a set of meta paths.

Preferably, in step (1), the road network is a directed graph consisting of a set of intersection nodes and a set of road-segment edges.

Preferably, in step (2), mining the congestion state of each meta-path at each time and extracting the congestion feature vectors according to the correlation among different meta-paths specifically comprises:

(21) extracting the elapsed time of each meta-path, calculating the cumulative distribution function of the elapsed time, and obtaining the congestion state of the meta-path from the cumulative distribution function, wherein the congestion state is uniformly distributed on [0,1];

(22) defining, for a given meta-path and its neighborhood, a feature vector for each fixed time interval, the feature vector being a multidimensional vector containing the congestion states.

Preferably, in step (3), filling the missing values in the feature matrix with the K-Means clustering algorithm further comprises optimizing the initial clustering result with time similarity constraints, namely the similarity between adjacent fixed time intervals and the periodic similarity of urban traffic flow.

In another aspect, the invention provides a device for predicting the travel time of a mobile object based on meta-path congestion pattern mining, comprising a mobile object, an intelligent sensing device carried by the mobile object, a server, and a road network. The mobile object sends GPS data to the server at fixed time intervals through the intelligent sensing device. The server matches the GPS data to the road network with a map matching algorithm and stores the resulting path tracks, divides the matched path tracks into meta-paths and stores them in a path dictionary, mines the congestion state of each meta-path at each time, and extracts congestion features according to the correlation among the meta-paths. The server then adds the extracted congestion features to a feature matrix and fills the missing values in the feature matrix with a K-Means clustering algorithm.

Preferably, the GPS data is in the format of a mobile object id, time, longitude, latitude, and speed.

Preferably, the meta-path is a path between any adjacent intersections on the road network.

Preferably, the path dictionary is a set of meta paths.

Advantageous effects: compared with the prior art, the invention has the following notable advantages. The roads of the road network are decomposed into meta-paths, and the congestion features of local roads are extracted from the spatial relationships among meta-paths in a neighborhood, so that congestion change patterns are captured at a finer granularity. For sparse trajectory data, a K-Means clustering algorithm clusters the data to fill the missing values in the feature matrix, which supports accurate and efficient travel time prediction.

Drawings

FIG. 1 is a flow chart of a method according to the present invention;

FIG. 2 is a block diagram of a prediction method according to the present invention;

FIG. 3 is a schematic diagram illustrating the congestion status of a path according to the present invention;

FIG. 4 is a diagram of different congestion states of associated paths according to the present invention.

Detailed Description

Example 1

As shown in FIG. 1 and FIG. 2, the invention provides a method for predicting the travel time of a mobile object on a road network based on meta-path congestion pattern mining. The method extracts meta-paths from historical tracks, mines spatio-temporally correlated congestion patterns, enriches the available trajectory data with a K-Means-based clustering algorithm, and predicts the travel time of the mobile object online. The method comprises the following steps:

Step 1: first, training samples are collected and preprocessed. A plurality of GPS data points are collected at fixed time intervals as training samples, and the GPS data are matched to a road network with a map matching algorithm to obtain the path track corresponding to the GPS data.

In one embodiment, the GPS data format is: mobile object id, time, longitude, latitude, speed.
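
As an illustration only, a record in this format could be represented as follows. The comma-separated layout, the ISO 8601 timestamp, and the names `GpsRecord` and `parse_record` are assumptions made for this sketch; the specification only lists the five fields.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class GpsRecord:
    """One GPS sample: mobile object id, time, longitude, latitude, speed."""
    object_id: str
    time: datetime
    longitude: float
    latitude: float
    speed: float


def parse_record(line: str) -> GpsRecord:
    """Parse 'id,time,longitude,latitude,speed' (assumed CSV layout)."""
    object_id, time_str, lon, lat, speed = (p.strip() for p in line.split(","))
    return GpsRecord(
        object_id=object_id,
        time=datetime.fromisoformat(time_str),  # assumed ISO 8601 timestamps
        longitude=float(lon),
        latitude=float(lat),
        speed=float(speed),
    )


record = parse_record("car42, 2019-02-12T08:30:00, 118.78, 32.04, 11.5")
```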

Second, offline mining is performed, comprising steps 2 and 3.

Step 2: the matched path track is divided into meta-paths, the meta-paths are stored in a path dictionary, the congestion state of each meta-path at each time is mined, and congestion feature vectors are extracted according to the correlation among different meta-paths.

Here a meta-path is the road segment between two adjacent intersections on the road network; it may be traversed by one or more tracks. The path dictionary is the set of meta-paths.
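
A minimal sketch of this division, assuming the map-matched track has already been reduced to the ordered intersections it passes together with the passing times in seconds. That input shape and the names `split_into_meta_paths` and `add_to_dictionary` are hypothetical and are not taken from the specification.

```python
from collections import defaultdict


def split_into_meta_paths(track):
    """Split a matched track into meta-paths.

    track: list of (intersection_id, timestamp_seconds) in travel order.
    Returns a list of ((from_node, to_node), elapsed_seconds) pairs.
    """
    return [((a, b), tb - ta) for (a, ta), (b, tb) in zip(track, track[1:])]


def add_to_dictionary(path_dictionary, track):
    """Record the elapsed time of every meta-path traversed by the track."""
    for meta_path, elapsed in split_into_meta_paths(track):
        path_dictionary[meta_path].append(elapsed)


# The path dictionary maps each meta-path to its observed elapsed times.
path_dictionary = defaultdict(list)
add_to_dictionary(path_dictionary, [("A", 0), ("B", 6), ("C", 15)])
add_to_dictionary(path_dictionary, [("A", 100), ("B", 107)])
```

Keying the dictionary by the ordered pair of intersections keeps the two directions of a road apart, which matches the directed road network described above.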

The traffic states between adjacent meta-paths are not independent, and the spatial relationship between the meta-paths is used for capturing the characteristics of local traffic patterns, and the specific steps are as follows:

Step 21: the elapsed time d(r) of each meta-path is extracted from the historical data, and its cumulative distribution function is calculated. The historical data are the historical GPS tracks of mobile objects, i.e. mobile objects that have previously traversed the meta-path and uploaded their GPS data.

The congestion state on a meta-path is calculated from the cumulative distribution function, that is, as the value of the cumulative distribution function at the observed elapsed time d(r). The congestion state is uniformly distributed on [0,1], so that the degree of congestion can be compared across different meta-paths and different travel times.

For example, suppose a meta-path r has been observed with the travel times 6 s, 7 s, 8 s and 9 s.

The cumulative distribution function for 6 s is then 2/6;

the cumulative distribution function for 7 s is 4/6;

similarly, the cumulative distribution function for 8 s is 5/6;

and the cumulative distribution function for 9 s is 1.

The values 2/6, 4/6, 5/6 and 1 so obtained are the congestion states; 2/6 is the least congested state and 1 is the most congested state.

Step 22: the neighborhood set NB(r) = {o₁, …, o_s} of a meta-path r denotes the s meta-paths adjacent to r, including r itself. To calculate its dynamic congestion state, the whole time range is discretized into fixed time intervals f_i, and the elapsed times observed on each meta-path within each time interval are collected.

A time interval f_i may contain several observations; these are represented by their expectation, from which the congestion state for the interval is calculated.
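
The mapping from elapsed times to congestion states can be sketched with an empirical cumulative distribution function. The observation list `[6, 6, 7, 7, 8, 9]` is an assumption chosen so that the result matches the values 2/6, 4/6, 5/6 and 1 given in the example; the specification does not list the individual observations, and the function name is hypothetical.

```python
from bisect import bisect_right
from fractions import Fraction


def congestion_states(elapsed_times):
    """Map each distinct elapsed time to its empirical CDF value.

    The CDF value at t is the share of observations that are <= t,
    so every congestion state lies in (0, 1].
    """
    ordered = sorted(elapsed_times)
    n = len(ordered)
    return {t: Fraction(bisect_right(ordered, t), n) for t in set(ordered)}


# Assumed observations (seconds) on one meta-path.
observations = [6, 6, 7, 7, 8, 9]
states = congestion_states(observations)
```

Exact fractions are used here only to make the comparison with the worked example unambiguous; floating-point values would serve equally well in practice.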

Step 23: given a meta-path r and its neighborhood {o₁, …, o_s}, a feature vector M(r)_i is defined for each time interval f_i. M(r)_i is an s-dimensional vector containing the congestion states of the meta-paths in NB(r) during f_i, i.e. its j-th component is the congestion state of o_j in the time interval f_i.

All calculated M(r)_i are stacked into the feature matrix M(r).

Let N be the number of feature vectors in the feature matrix, i.e. the total number of discrete time intervals. The matrix M(r) represents the dynamic congestion state of NB(r) and has the following structure: it is an N × s matrix whose i-th row is M(r)_i.
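
A sketch of how such a matrix might be assembled, with NaN marking intervals in which a meta-path has no observation. The dictionary keyed by `(meta_path_id, interval_index)` and the function name are assumptions made for this illustration.

```python
import numpy as np


def build_feature_matrix(states, neighborhood, n_intervals):
    """Stack the per-interval feature vectors M(r)_i into an N x s matrix.

    states: dict mapping (meta_path_id, interval_index) to a congestion
            state in [0, 1]; missing keys mean no observation.
    neighborhood: list of the s meta-path ids in NB(r), r included.
    """
    matrix = np.full((n_intervals, len(neighborhood)), np.nan)
    for col, path_id in enumerate(neighborhood):
        for i in range(n_intervals):
            if (path_id, i) in states:
                matrix[i, col] = states[(path_id, i)]
    return matrix


neighborhood = ["r", "o2", "o3"]
states = {
    ("r", 0): 0.2, ("o2", 0): 0.3, ("o3", 0): 0.1,
    ("r", 1): 0.8, ("o3", 1): 0.9,
    ("o2", 2): 0.5,
}
M = build_feature_matrix(states, neighborhood, n_intervals=3)
```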

Step 3: the extracted congestion feature vectors are added to the feature matrix, and the missing values in the feature matrix are filled with a K-Means clustering algorithm to obtain a prediction model.

The invention can handle sparse data by filling the missing values in the feature matrix with a K-Means clustering algorithm. The specific steps are as follows:

Step 31: the rows of M(r) are clustered into k groups and the k cluster centers c₁, …, c_k are calculated. For each row of M(r), the nearest cluster center is found and used to initialize the missing values in that row. Because the size of the feature matrix varies considerably between meta-paths, an optimal value of k is determined for each with the K-Means method.
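
The initial filling can be sketched as below. Several details are assumptions of this illustration rather than statements of the specification: missing entries start at their column mean, initial centers are rows picked at evenly spaced positions, distances are measured on observed coordinates only, and every column is assumed to hold at least one observation. The name `kmeans_fill` is hypothetical.

```python
import numpy as np


def kmeans_fill(matrix, k, n_iter=20):
    """Fill NaN entries of `matrix` with values from the nearest cluster center."""
    matrix = np.asarray(matrix, dtype=float)
    observed = ~np.isnan(matrix)
    # Provisional fill with column means so every row is a complete vector.
    filled = np.where(observed, matrix, np.nanmean(matrix, axis=0))
    # Deterministic start: k rows at evenly spaced positions (assumption).
    start_rows = np.linspace(0, len(filled) - 1, k).astype(int)
    centers = filled[start_rows].copy()
    labels = np.zeros(len(filled), dtype=int)
    for _ in range(n_iter):
        # Squared distance of each row to each center, observed entries only.
        diff = np.where(observed[:, None, :],
                        filled[:, None, :] - centers[None, :, :], 0.0)
        labels = (diff ** 2).sum(axis=2).argmin(axis=1)
        for j in range(k):
            members = labels == j
            if members.any():
                centers[j] = filled[members].mean(axis=0)
        # Observed values are kept; missing ones take the center's value.
        filled = np.where(observed, matrix, centers[labels])
    return filled, labels, centers


M = np.array([[0.10, 0.20],
              [0.15, np.nan],
              [0.90, 0.80],
              [np.nan, 0.85]])
filled, labels, centers = kmeans_fill(M, k=2)
```

In this small example the two lightly congested rows and the two heavily congested rows end up in separate clusters, and each missing entry is drawn toward the value observed in its own cluster.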

Step 32: the initial clustering result is optimized by introducing temporal relationships into the congestion features. Given a feature matrix M(r), the correlation matrix W represents the time similarity constraint between any two time intervals in M(r); each entry of W is the weight w_i,j obtained below. The aim is to find k cluster centers that differ as little as possible from the actual observed values, while also determining the soft assignment of the rows of M(r) to the k cluster centers.

Step 33: the similarity between adjacent fixed time intervals is taken as the first type of time similarity constraint, i.e. the traffic condition of a meta-path is unlikely to change from completely clear to completely congested over consecutive time intervals. An exponential decay function C_sm(i,j) is therefore defined between the i-th and the j-th time interval,

where t_i and t_j are the start times of the time intervals f_i and f_j.
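
The exact expression of the exponential decay function is not reproduced in this text. Purely as an assumed form for illustration, the sketch below uses exp(-|t_i - t_j| / σ_sm), with start times expressed in interval counts and σ_sm = 2 as the default mentioned in the experiment; the function name is hypothetical.

```python
import numpy as np


def smoothness_similarity(start_times, sigma_sm=2.0):
    """Pairwise similarity that decays exponentially with the time gap.

    ASSUMED form: C_sm(i, j) = exp(-|t_i - t_j| / sigma_sm).
    """
    t = np.asarray(start_times, dtype=float)
    return np.exp(-np.abs(t[:, None] - t[None, :]) / sigma_sm)


# Four consecutive intervals, start times counted in intervals.
C_sm = smoothness_similarity([0, 1, 2, 3])
```

Whatever the precise formula, the property relied on by the method is visible here: intervals close in time receive a weight near 1, and the weight falls off quickly as the gap grows.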

Step 34: the second type of time similarity constraint is based on the periodic similarity of urban traffic flow. The quantity considered is the time taken to reach the peak period (SOD), denoted h_i for the time interval f_i, regardless of whether the interval falls on a weekday or a weekend. The SOD weight C_SOD(i,j) between f_i and f_j is defined on this basis.

Step 35: the edge weight w_i,j is calculated as a linear combination of C_SOD and C_sm: w_i,j = θ·C_sm(i,j) + (1-θ)·C_SOD(i,j), where θ is a coefficient.
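
The linear combination can be written directly on the two similarity matrices. The numeric matrices below are made-up inputs for illustration, and θ = 0.5 is the value used in the embodiments; the function name `edge_weights` is hypothetical.

```python
import numpy as np


def edge_weights(c_sm, c_sod, theta=0.5):
    """w_ij = theta * C_sm(i, j) + (1 - theta) * C_SOD(i, j)."""
    return theta * np.asarray(c_sm, dtype=float) + (1.0 - theta) * np.asarray(c_sod, dtype=float)


# Illustrative 2 x 2 similarity matrices (not taken from the specification).
c_sm = [[1.0, 0.5], [0.5, 1.0]]
c_sod = [[1.0, 0.1], [0.1, 1.0]]
W = edge_weights(c_sm, c_sod)
```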

Step 36: Q is an N × k cluster assignment matrix. Each row q_i of Q is a binary vector: if the time interval f_i is assigned to cluster j, then q_i(j) = 1, and otherwise q_i(j) = 0. The k cluster centers are the row vectors of a k × s matrix C. Q and C are initialized with the results above and then updated iteratively to find the optimum of the minimization problem,

where L is the Laplacian matrix, L = D - W, and D is the diagonal degree matrix of W.

The constant coefficient γ controls the weight of temporal consistency in the clustering process, and the minimization problem is solved with an alternating direction optimization method.
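
The objective itself is not reproduced in this text, so the sketch below covers only the ingredients that are described: the binary assignment matrix Q and the Laplacian L = D - W. The quantity tr(QᵀLQ) is shown as the usual graph-Laplacian consistency penalty; that it is the term weighted by γ is an assumption of this illustration, and all function names are hypothetical.

```python
import numpy as np


def one_hot_assignment(labels, k):
    """Build the N x k binary matrix Q with q_i(j) = 1 iff interval i is in cluster j."""
    q = np.zeros((len(labels), k))
    q[np.arange(len(labels)), labels] = 1.0
    return q


def laplacian(w):
    """Graph Laplacian L = D - W, with D the diagonal matrix of row sums of W."""
    w = np.asarray(w, dtype=float)
    return np.diag(w.sum(axis=1)) - w


def temporal_penalty(q, w):
    """tr(Q^T L Q); for symmetric W this equals 0.5 * sum_ij w_ij * ||q_i - q_j||^2."""
    return float(np.trace(q.T @ laplacian(w) @ q))


# Illustrative weights between three time intervals.
W = np.array([[0.0, 1.0, 0.2],
              [1.0, 0.0, 0.5],
              [0.2, 0.5, 0.0]])
Q = one_hot_assignment([0, 0, 1], k=2)
penalty = temporal_penalty(Q, W)
```

The penalty grows only when two intervals with a large weight w_i,j are placed in different clusters, which is how temporal similarity discourages abrupt changes of cluster between related intervals.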

Step 4, online prediction stage: the path track whose travel time is to be predicted is input.

In another aspect, the invention further provides a device for predicting the travel time of a mobile object based on meta-path congestion pattern mining, comprising a mobile object, an intelligent sensing device carried by the mobile object, a server, and a road network. The mobile object sends GPS data to the server at fixed time intervals through the intelligent sensing device. The server matches the GPS data to the road network with a map matching algorithm and stores the resulting path tracks, divides the matched path tracks into meta-paths and stores them in a path dictionary, mines the congestion state of each meta-path at each time, and extracts congestion features according to the correlation among the meta-paths. The server then adds the extracted congestion features to a feature matrix and fills the missing values in the feature matrix with a K-Means clustering algorithm.

The data sent by the mobile object to the server have the following format: mobile object id, time, longitude, latitude, speed. Connection problems and communication delay between the mobile object and the server are not considered; it is assumed that technologies such as WiFi and cellular networks cover the entire area and provide the corresponding service.

The server divides the urban road network, extracts road sections between all adjacent intersections in the road network, and expresses the road sections by using road section ids.

In one embodiment, a meta-path is the road segment between two adjacent intersections on the road network, and it may be traversed by one or more tracks. The path dictionary is the set of meta-paths. Because the traffic states of adjacent meta-paths are not independent, the spatial relationships among meta-paths are used to capture the features of local traffic patterns. The specific steps are as follows:

S1. The elapsed time d(r) of each meta-path is extracted from the historical data, and its cumulative distribution function is calculated. The congestion state on the meta-path is calculated from the cumulative distribution function and is uniformly distributed on [0,1], so that the degree of congestion of different meta-paths can be compared.

S2. The neighborhood set NB(r) = {o₁, …, o_s} of a meta-path r denotes the s meta-paths adjacent to r (including r). To calculate its dynamic congestion state, the whole time range is discretized into fixed time intervals f_i, and the elapsed times observed on each meta-path within each time interval are collected.

S3. Given a meta-path r and its neighborhood {o₁, …, o_s}, a feature vector M(r)_i is defined for each time interval f_i. M(r)_i is an s-dimensional vector containing the congestion states of the meta-paths in NB(r) during f_i.

All calculated M(r)_i are stacked into the feature matrix M(r). Let N be the number of feature vectors in the feature matrix, i.e. the total number of discrete time intervals. The matrix M(r) represents the dynamic congestion state of NB(r) and is an N × s matrix whose i-th row is M(r)_i.

S4. Sparse data are handled by filling the missing values in the feature matrix with a K-Means clustering algorithm. The specific steps are as follows:

The rows of M(r) are clustered into k groups and the k cluster centers c₁, …, c_k are calculated. For each row of M(r), the nearest cluster center is found and used to initialize the missing values in that row. Because the size of the feature matrix varies considerably between meta-paths, an optimal value of k is determined for each with the K-Means method.

The initial clustering result is optimized by introducing temporal relationships into the congestion features. Given a feature matrix M(r), the correlation matrix W represents the time similarity constraint between any two time intervals in M(r). The aim is to find k cluster centers that differ as little as possible from the actual observed values, while also determining the soft assignment of the rows of M(r) to the k cluster centers.

The similarity between adjacent fixed time intervals is taken as the first type of time similarity constraint, i.e. the traffic condition of a meta-path is unlikely to change from completely clear to completely congested over consecutive time intervals. An exponential decay function C_sm(i,j) is therefore defined between the i-th and the j-th time interval,

where t_i and t_j are the start times of the time intervals f_i and f_j.

The second type of time similarity constraint is based on the periodic similarity of urban traffic flow. The quantity considered is the time taken to reach the peak period (SOD), denoted h_i for the time interval f_i, regardless of whether the interval falls on a weekday or a weekend. The SOD weight C_SOD(i,j) between f_i and f_j is defined on this basis.

The edge weight w_i,j is calculated as a linear combination of C_SOD and C_sm, with the coefficient θ = 0.5: w_i,j = θ·C_sm(i,j) + (1-θ)·C_SOD(i,j).

Q is an N × k cluster assignment matrix. Each row q_i of Q is a binary vector: if the time interval f_i is assigned to cluster j, then q_i(j) = 1, and otherwise q_i(j) = 0. The k cluster centers are the row vectors of a k × s matrix C. Q and C are initialized with the results above, and the optimal values are then found by solving the minimization problem,

where L is the Laplacian matrix, L = D - W, and D is the diagonal degree matrix of W.

To verify the effectiveness of the invention, the following experiment was carried out. A user sends trajectory data to a navigation system through an intelligent sensing device with GPS positioning, such as a smartphone. Each trajectory record contains: user id, time, longitude, latitude, speed. The user id may be the mobile phone model or the user's phone number.

The navigation system collects the users' trajectory information and stores it on the server. The server matches each track to the road network with a map matching algorithm and removes invalid track points.

The urban road network is divided into meta-paths, a meta-path being the road segment between two adjacent intersections on the road network, and a file named after each meta-path id is created on the server.

Each matched track is divided into meta-paths, which are stored in the files named after the corresponding meta-path ids. The elapsed time d(r) of each meta-path is calculated, together with its cumulative distribution function. The congestion state on each meta-path is calculated from the cumulative distribution function; it is uniformly distributed on [0,1], so that the degree of congestion of different meta-paths can be compared.

As shown in FIG. 3, a path has different congestion states at different times. The congestion states are uniformly distributed on [0,1], and the congestion state of a path is represented by the shade of its color: the darkest color is the most congested state and the lightest color is the least congested state.

As shown in FIG. 4, r₁, r₂ and r₃ are different meta-paths at an intersection, and their three congestion states are visualized separately. Dark colors indicate congestion and light colors indicate free flow.

For any meta-path, its neighbors are found and a time interval f_i is defined, e.g. 30 minutes. The day is divided into intervals f_i, and a feature vector is calculated for each f_i; the feature vector contains the congestion states of the meta-path and its adjacent paths within the time interval f_i. Finally, all calculated feature vectors are added to an overall matrix M(r).
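
With 30-minute intervals a day is divided into 48 intervals. A minimal sketch of the mapping from a timestamp to its interval index follows; the function name and the zero-based indexing are assumptions of this illustration.

```python
from datetime import datetime

INTERVAL_MINUTES = 30  # example interval length from the embodiment


def interval_index(timestamp, interval_minutes=INTERVAL_MINUTES):
    """Return the zero-based index of the daily interval containing `timestamp`."""
    minutes_since_midnight = timestamp.hour * 60 + timestamp.minute
    return minutes_since_midnight // interval_minutes


morning = interval_index(datetime(2019, 2, 12, 8, 45))
```

Because only the time of day is used, observations from different days fall into the same row of M(r), which is one simple way to pool sparse data.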

Because the data for some roads are sparse, M(r) may contain missing values. An optimal value of k is determined with the K-Means method, the rows of M(r) are clustered into k groups, and the k cluster centers c₁, …, c_k are calculated. For each row of M(r), the nearest cluster center is found and used to initialize the missing values in that row.

The similarity between adjacent time intervals is calculated by defining an exponential decay function C_sm(i,j) between the i-th and the j-th time interval,

where t_i and t_j are the start times of the time intervals f_i and f_j, and σ_sm defaults to 2.

The periodic similarity C_SOD(i,j) based on urban traffic flow is calculated.

The edge weight w_i,j is calculated as a linear combination of C_SOD and C_sm, with the coefficient θ = 0.5:

w_i,j = θ·C_sm(i,j) + (1-θ)·C_SOD(i,j).

From these two similarities, the similarity constraint between any two time intervals in M(r) is calculated and represented by the correlation matrix W.

Q is an N × k cluster assignment matrix. Each row q_i of Q is a binary vector: if the time interval f_i is assigned to cluster j, then q_i(j) = 1, and otherwise q_i(j) = 0.

The k cluster centers are the row vectors of a k × s matrix C. Q and C are initialized with the results above, and the optimal values are then found by solving the minimization problem,

where L is the Laplacian matrix, L = D - W, and D is the diagonal degree matrix of W.

The clustering result is optimized once the final values have been obtained.

The user inputs the time of travel and the travel track, and the system returns the predicted travel time for the user's reference.

Claims

1. A method for predicting the travel time of a mobile object based on meta-path congestion pattern mining, characterized by comprising the following steps:

(2) dividing the matched path track into meta-paths, storing the meta-paths in a path dictionary, mining the congestion state of each meta-path at each time, and extracting congestion feature vectors according to the correlation among different meta-paths, wherein a meta-path is the path between any two adjacent intersections on the road network, and the path dictionary is the set of meta-paths;

step (2) specifically comprises:

step (21), extracting the elapsed time d(r) of each meta-path from historical data and calculating its cumulative distribution function, wherein the historical data are the historical GPS tracks of mobile objects, i.e. mobile objects that passed through the meta-path in the past and uploaded their GPS data;

step (22), the neighborhood set NB(r) = {o₁, …, o_s} of a meta-path r denotes the s meta-paths adjacent to r, including r itself; to calculate its dynamic congestion state, the whole time range is discretized into fixed time intervals f_i, and the elapsed times observed on each meta-path within each time interval are calculated;

where a time interval f_i contains several observations, they are represented by their expectation, from which the congestion state is calculated;

step (23), given a meta-path r and its neighborhood {o₁, …, o_s}, defining for each time interval f_i a feature vector M(r)_i, M(r)_i being an s-dimensional vector containing the congestion states of NB(r) in the time interval f_i;

all calculated M(r)_i are stacked into the feature matrix M(r);

in step (3), the missing values in the feature matrix are filled with a K-Means clustering algorithm, and the initial clustering result is optimized with time similarity constraints, the time similarity constraints comprising the similarity between adjacent fixed time intervals and the periodic similarity based on urban traffic flow; step (3) specifically comprises:

step (31), clustering the rows of M(r) into k groups, calculating the k cluster centers c₁, …, c_k, then finding for each row of M(r) the nearest cluster center and using it to initialize the missing values in that row, an optimal value of k being determined with the K-Means method;

step (32), optimizing the initial clustering result by introducing temporal relationships into the congestion features; given a feature matrix M(r), the correlation matrix W represents the time similarity constraint between two time intervals in M(r), each entry of W being the weight w_i,j obtained below; finding k cluster centers that reduce the difference from the actual observed values, while determining the soft assignment of the rows of M(r) to the k cluster centers;

step (33), taking the similarity between adjacent fixed time intervals as the first type of time similarity constraint, i.e. the traffic condition of a meta-path cannot change from completely clear to completely congested in consecutive time intervals, and accordingly defining an exponential decay function C_sm(i,j) between the i-th and the j-th time interval,

where t_i and t_j are the start times of the time intervals f_i and f_j;

step (34), the second type of time similarity constraint is based on the periodic similarity of urban traffic flow and, regardless of whether the interval falls on a weekday or a weekend, defines the weight C_SOD(i,j) between the i-th time interval f_i and the j-th time interval f_j in terms of the time taken to reach the peak period,

where h_i is the time taken to reach the peak period in the time interval f_i;

step (35), the edge weight w_i,j is calculated as a linear combination of C_SOD and C_sm:

w_i,j = θ·C_sm(i,j) + (1-θ)·C_SOD(i,j), where θ is a coefficient;

step (36), Q is an N × k cluster assignment matrix, each row q_i of Q being a binary vector: if the time interval f_i is assigned to cluster j, then q_i(j) = 1, and otherwise q_i(j) = 0; the k cluster centers are the row vectors of a k × s matrix C; Q and C are initialized with the results above and then updated iteratively to find the optimum of the minimization problem,

where L is the Laplacian matrix, L = D - W;

the constant coefficient γ controls the weight of temporal consistency in the clustering process, and the minimization problem is solved with an alternating direction optimization method;

(4) inputting the path track whose travel time is to be predicted.

2. The method according to claim 1, wherein in step (1) the road network is a directed graph consisting of a set of intersection nodes and a set of road-segment edges.

3. A device for predicting the travel time of a mobile object based on meta-path congestion pattern mining, comprising: a mobile object, an intelligent sensing device carried by the mobile object, a server, and a road network, wherein the mobile object sends GPS data to the server at fixed time intervals through the intelligent sensing device; the server matches the GPS data to the road network with a map matching algorithm and stores the path track, divides the matched path track into meta-paths and stores the meta-paths in a path dictionary, mines the congestion state of each meta-path at each time, and extracts congestion features according to the correlation among the meta-paths; the server adds the extracted congestion features to a feature matrix, fills the missing values in the feature matrix with a K-Means clustering algorithm, and optimizes the initial clustering result with time similarity constraints, the time similarity constraints comprising the similarity between adjacent fixed time intervals and the periodic similarity based on urban traffic flow;

the congestion features are extracted according to the correlation among the meta-paths as follows:

a meta-path is the path between any two adjacent intersections on the road network, and the path dictionary is the set of meta-paths;

the congestion state is uniformly distributed on [0,1], so that the degree of congestion can be compared across different meta-paths and different travel times;

step (22), the neighborhood set NB(r) = {o₁, …, o_s} of a meta-path r denotes the s meta-paths adjacent to r, including r itself; to calculate its dynamic congestion state, the whole time range is discretized into fixed time intervals f_i, and the elapsed times observed on each meta-path within each time interval are calculated;

all calculated M(r)_i are stacked into the feature matrix M(r);

the initial clustering result is optimized with time similarity constraints, the similarity between adjacent fixed time intervals and the periodic similarity based on urban traffic flow being taken as the time similarity constraints, specifically comprising the following steps:

step (32), optimizing the initial clustering result by introducing temporal relationships into the congestion features; given a feature matrix M(r), the correlation matrix W represents the time similarity constraint between two time intervals in M(r), each entry of W being the weight w_i,j obtained below; finding k cluster centers that reduce the difference from the actual observed values, while determining the soft assignment of the rows of M(r) to the k cluster centers;

where t_i and t_j are the start times of the time intervals f_i and f_j;

where h_i is the time taken to reach the peak period in the time interval f_i;

w_i,j = θ·C_sm(i,j) + (1-θ)·C_SOD(i,j), where θ is a coefficient;

step (36), Q is an N × k cluster assignment matrix, each row q_i of Q being a binary vector: if the time interval f_i is assigned to cluster j, then q_i(j) = 1, and otherwise q_i(j) = 0; the k cluster centers are the row vectors of a k × s matrix C; Q and C are initialized with the results above and then updated iteratively to find the optimum of the minimization problem,

where L is the Laplacian matrix, L = D - W.

4. The apparatus of claim 3, wherein the GPS data is in the form of a mobile object id, time, longitude, latitude, and speed.