WO2015170289A1 - Method and system for vehicular traffic prediction - Google Patents

Method and system for vehicular traffic prediction

Publication number
WO2015170289A1
Authority
WO
WIPO (PCT)
Prior art keywords
cell
event data
network
data
traffic
Application number
PCT/IB2015/053376
Other languages
French (fr)
Inventor
Stefano Marzorati
Davide TOSI
Original Assignee
Vodafone Omnitel B.V.
Application filed by Vodafone Omnitel B.V.
Publication of WO2015170289A1

Classifications

    • G PHYSICS
    • G08 SIGNALLING
    • G08G TRAFFIC CONTROL SYSTEMS
    • G08G1/00 Traffic control systems for road vehicles
    • G08G1/01 Detecting movement of traffic to be counted or controlled
    • G08G1/0104 Measuring and analyzing of parameters relative to traffic conditions
    • G08G1/0108 Measuring and analyzing of parameters relative to traffic conditions based on the source of data
    • G08G1/012 Measuring and analyzing of parameters relative to traffic conditions based on the source of data from other sources than vehicle or roadside beacons, e.g. mobile networks
    • G PHYSICS
    • G08 SIGNALLING
    • G08G TRAFFIC CONTROL SYSTEMS
    • G08G1/00 Traffic control systems for road vehicles
    • G08G1/01 Detecting movement of traffic to be counted or controlled
    • G08G1/0104 Measuring and analyzing of parameters relative to traffic conditions
    • G08G1/0125 Traffic data processing
    • G08G1/0129 Traffic data processing for creating historical data or processing based on historical data
    • G PHYSICS
    • G08 SIGNALLING
    • G08G TRAFFIC CONTROL SYSTEMS
    • G08G1/00 Traffic control systems for road vehicles
    • G08G1/01 Detecting movement of traffic to be counted or controlled
    • G08G1/0104 Measuring and analyzing of parameters relative to traffic conditions
    • G08G1/0125 Traffic data processing
    • G08G1/0133 Traffic data processing for classifying traffic situation
    • G PHYSICS
    • G08 SIGNALLING
    • G08G TRAFFIC CONTROL SYSTEMS
    • G08G1/00 Traffic control systems for road vehicles
    • G08G1/01 Detecting movement of traffic to be counted or controlled
    • G08G1/015 Detecting movement of traffic to be counted or controlled with provision for distinguishing between two or more types of vehicles, e.g. between motor-cars and cycles

Definitions

  • Mobility in urban areas has been studied extensively in recent years, mainly because of interest in environmental and social policies aimed at reducing the environmental impact of the movement of people and goods.
  • A growing range of analysis technologies and greater attention to sustainable mobility suggest improvements in socio-economic conditions and living standards.
  • the active or passive traffic monitoring systems use FCD (Floating Car Data) technologies in order to calculate real-time updated traffic data.
  • the vehicles provided with a device connected to a data collection system periodically provide information related to the vehicle velocity and position in a given instant.
  • a device installed on the vehicle is not required and the data are passively collected by statistical detection equipment and conveniently processed in order to infer the users' position.
  • a passive FCD technology is FCND (Floating Cellular Network Data), in which each mobile terminal (2G, 3G, 4G) connected to the network becomes an information source that can be used to monitor mobility patterns by anonymously processing the position of each mobile station.
  • an automatic method for predicting the vehicle traffic in a geographic area covered by a mobile telecommunications network and comprising a plurality of radio cells comprising:
  • each network event datum comprises data identifying a signalling message exchanged between the mobile network and a mobile terminal, an identification code of the mobile terminal user, data identifying a radio cell where the mobile terminal has been localized receiving/transmitting the signalling message, and an event time instant associated with said signalling message
  • a collection time interval comprising the request instant and collecting in real time network event data localized in the second cell by intercepting by a network passive probe current network event data in the collection time interval, wherein the network event data are associated with at least one event described by a signalling message of the same type as the signalling message of the at least one event of the sample network event data stored for the assessment interval,
  • N_det: a number of detected network event data, (N_det)_2, collected in the second cell in the collection time interval, analyzing in the event database the stored network event data that are localized in the second cell and occurred in the assessment time interval and applying said mathematical relationships to the stored network event data for the second cell to determine the numerical traffic prediction datum for the second cell.
  • generating mathematical relationships based on the mathematical regression function of best approximation comprises:
  • calculating a plurality (n-1) of normalization factors f_i, i = 1, 2, ..., (n-1), based on the ratio between a threshold value (N_i)_1 of the number of sample network event data for the first cell and the threshold value (N_n)_1 of the number of sample network event data relating to the maximum numerical traffic value n for the same cell.
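As a hedged illustration of the normalization step, the sketch below computes each factor as the ratio between a threshold event count for the first cell and the threshold associated with the maximum traffic value n in the same cell. The function name and the sample thresholds are hypothetical, and the way the thresholds themselves are obtained from the regression function is not shown.

```python
def normalization_factors(thresholds):
    """Return the factors f_i = N_i / N_n for i = 1 .. n-1.

    `thresholds` is a list [N_1, ..., N_n] of event-count thresholds for the
    first cell, ordered by increasing traffic value (hypothetical input).
    """
    if len(thresholds) < 2:
        raise ValueError("need at least two thresholds")
    n_max = thresholds[-1]
    if n_max <= 0:
        raise ValueError("threshold for the maximum traffic value must be positive")
    # One factor per traffic value below the maximum.
    return [n_i / n_max for n_i in thresholds[:-1]]


# Hypothetical thresholds for traffic values 1..4 in the first cell.
factors = normalization_factors([100, 250, 400, 500])
```

With four traffic values the function returns three factors, one for each value below the maximum.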
  • applying said mathematical relationships to the stored network event data for the second cell to determine the numerical traffic prediction datum in the second cell comprises:
  • the collection time interval comprising the request instant has the same duration as the sampling time interval.
  • receiving network event data in an event database comprises:
  • the assessment time interval comprises a plurality of sequential observation intervals and the method comprises, before transmitting and loading network event data in the event database:
  • transmitting and loading network event data comprises transmitting and loading the coded network event data in the event database.
  • the assessment time interval is divided in a plurality of identical sampling intervals, each sampling interval being formed by a respective plurality of observation intervals.
  • the step of aggregating and coding the sample network event data comprises creating a .csv file for each observation interval, wherein each network event datum in an observation interval is codified as input of the .csv file for that observation interval.
  • after collecting in real time network event data and before calculating the number of event data detected in real time in the collection time interval, aggregating and coding detected network event data localized in the second cell in sequential observation intervals corresponding to the observation intervals of the sample event data.
  • the collection time interval is composed of a plurality of sequential observation intervals.
  • calculating the number of network event data detected in the collection time interval in the second cell, (N_det)_2, is carried out by calculating the number of current network events located in the second cell for each observation interval constituting the collection time interval and adding the respective numbers for each observation interval in order to determine the number of event data collected in the collection time interval.
  • the number of network event data detected in an observation time interval is the sum of the network event data collected in real-time in respective event time instants (e.g. Timestamp) comprised in said observation interval.
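The counting described above can be sketched as follows: events are binned by Timestamp into consecutive observation intervals, and the per-interval counts are summed over the collection interval. This is a minimal sketch; the function names, the half-open interval convention and the sample timestamps are assumptions made for illustration.

```python
from collections import Counter


def count_per_observation_interval(timestamps, start, interval_s):
    """Count events per observation interval.

    Interval k covers [start + k*interval_s, start + (k+1)*interval_s).
    """
    counts = Counter()
    for ts in timestamps:
        if ts >= start:
            counts[(ts - start) // interval_s] += 1
    return counts


def count_in_collection_interval(timestamps, start, interval_s, n_intervals):
    """Sum the per-interval counts over the first `n_intervals` intervals."""
    counts = count_per_observation_interval(timestamps, start, interval_s)
    return sum(counts[k] for k in range(n_intervals))


# Hypothetical event timestamps (seconds) for one cell.
events = [1000, 1001, 1004, 1005, 1012, 1019, 1020]
n_det = count_in_collection_interval(events, start=1000, interval_s=5, n_intervals=4)
```

The event at 1020 falls into a fifth observation interval and is therefore not counted in a collection interval made of four 5-second intervals.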
  • correlating the sample network event data and the traffic sampling data for the first radio cell is carried out by applying a plurality of regression mathematical functions, each being described by a continuous curve and by a respective plurality of numerical parameters, wherein correlating comprises determining the correlation coefficient for each plurality of mathematical functions to determine a plurality of correlation coefficients, and selecting and storing the best approximation regression mathematical function is carried out by selecting the regression mathematical function in the plurality of the regression functions having the highest correlation coefficient of the plurality of correlation coefficients.
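One way to picture the selection of the best approximation function is the sketch below: several candidate curves are fitted to (event count, traffic value) pairs, a correlation coefficient between observed and fitted values is computed for each, and the candidate with the highest coefficient is kept. The three candidate families used here (linear, logarithmic, exponential) are an assumption chosen because they can be fitted in closed form; they are not stated to be the families used in the source, and all names and sample data are hypothetical. Inputs must be strictly positive for the logarithmic and exponential fits.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def fit_line(us, vs):
    """Least-squares fit v = a + b*u; returns (a, b)."""
    n = len(us)
    mu, mv = sum(us) / n, sum(vs) / n
    suu = sum((u - mu) ** 2 for u in us)
    b = sum((u - mu) * (v - mv) for u, v in zip(us, vs)) / suu
    return mv - b * mu, b


def candidates(xs, ys):
    """Yield (name, fitted function) for each candidate regression curve."""
    a, b = fit_line(xs, ys)
    yield "linear", lambda x, a=a, b=b: a + b * x
    a, b = fit_line([math.log(x) for x in xs], ys)
    yield "logarithmic", lambda x, a=a, b=b: a + b * math.log(x)
    a, b = fit_line(xs, [math.log(y) for y in ys])
    yield "exponential", lambda x, a=a, b=b: math.exp(a + b * x)


def best_regression(event_counts, traffic_values):
    """Return (name, correlation, function) with the highest correlation."""
    best = None
    for name, f in candidates(event_counts, traffic_values):
        r = pearson(traffic_values, [f(x) for x in event_counts])
        if best is None or r > best[1]:
            best = (name, r, f)
    return best


# Hypothetical sample: traffic grows linearly with the event count.
name, r, model = best_regression([1, 2, 3, 4, 5], [3, 5, 7, 9, 11])
```

The returned function plays the role of the stored regression function: it can later be evaluated on an event count collected in real time.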
  • selecting sample network event data in the event database is carried out by selecting all stored network event data in the event database localized in the first radio cell and having event instants comprised in the assessment time interval.
  • the degree of the polynomial regression mathematical function is equal to or greater than 1.
  • analyzing and filtering the sample network event data to eliminate the non-significant network events comprises:
  • step (viii) repeating the steps from (iv) to (vii) for the remaining user identification codes of the plurality of user identification codes so as to select sample network event data relating to mobile terminals associated with vehicle movements, and (ix) transmitting and loading the plurality of selected network event data in the event database, wherein correlating the plurality of numbers of sample event data with the traffic sampling data is carried out by taking the number of filtered network event data resulting from step (ix) as an independent variable for the first cell of the plurality of cells.
  • the step of selecting comprises eliminating the network event data related to a target identification code for which the spatial path comprises at least one displacement between at least one pair of adjacent and contiguous cells Ci and Cj having a displacement velocity equal to or smaller than the velocity threshold value.
  • the determination of the displacement velocity between two adjacent and contiguous cells comprises:
  • determining if the displacement velocity is equal to or greater than or smaller than a displacement velocity threshold value comprises:
  • each tag being associated with a range of displacement velocity values and establishing a velocity tag threshold value
  • determining if the displacement velocity is equal to or greater than or smaller than a velocity threshold value comprises determining if the velocity tag is greater than a velocity tag threshold value.
  • detecting and filtering network event data to eliminate the non-significant network events further comprises, after the step (viii) and before the step (ix):
  • the mobile network is a 2G or 3G radio access network.
  • the signalling messages are one or more messages selected in the group consisting of: CM Service Request, Common ID, Paging Response, Location Update Request, Location Updating Accept, TMSI Reallocation, Location Report, Relocation Command, Handover Request, Handover Complete and Handover Performed.
  • the identification code of the mobile terminal user is the IMSI code obscured by means of a hash algorithm (O-IMSI).
  • the passive network probe is installed on a signalling interface of a mobile telecommunication network and configured to intercept network event data passively and the signalling interface is between the mobile network core and at least one radio access station.
  • the connection interface is an A interface of a 2G network or an Iu interface of a 3G network, and the mobile network core is an MSC server.
  • the mobile network is a 2G or 3G radio access network and the data identifying the radio cell where the mobile terminal has been localized receiving/transmitting the signalling message comprise the LAC and the SAC.
  • the present disclosure is related also to a traffic prediction system in a mobile network comprising a plurality of radio cells, the system comprising:
  • a passive network probe configured to passively intercept network event data located in the plurality of radio cells and occurred in event instants comprised in an assessment time interval, wherein each given network datum comprises identification data of a signalling message exchanged between the mobile network and a mobile terminal, an identification code of the mobile terminal user, identification data of a radio cell where the mobile terminal has been localized receiving/transmitting the signalling message, and an event time instant associated with said signalling message,
  • a monitoring server configured to receive the network event data from the probe and to process the received data in order to create structured data
  • a storage configured to receive and temporarily store structured network event data received from the monitoring server
  • a traffic data source containing traffic sampling numerical data
  • a database containing the mobile network topology
  • an event database configured to receive network event data from the storage
  • a tool configured to carry out statistical data processing, to extract from the event database network event data occurring during an assessment time interval and related to at least one cell of the plurality of cells of the geographic area, and to extract traffic sampling data from the traffic data source for a time interval corresponding to the assessment interval, the statistical processing tool being also configured to analyse the extracted data by calculating a plurality of numbers of network event data related to the assessment time interval, to find a regression mathematical function between the number of network event data and the traffic sampling data and, if at least one statistically valid mathematical function is identified, to store said mathematical function, and
  • a traffic prediction tool configured to be triggered by an external traffic prediction request associated with at least one cell of the plurality of cells, to receive real-time network event data collected by means of the network probe and stored in the event database, to determine an event number related to the request instant by taking the detected network event data from the event database during a collection time interval comprising the request instant and to apply mathematical relationships, defined by the regression function stored in the statistical processing tool, in order to determine the traffic numerical values related to the request instant
  • the traffic prediction result can be used by navigation systems to calculate the paths having a better practicability based on current traffic information or by simulators with the purpose of adopting traffic plans, for example to modify the public transportation frequency and route or the dimensions of the critical urban areas having a high congestion level.
  • the method comprises a first macro-step in which the statistical models are derived and which is executed off-line and a second macro-step, subsequent to the first one, being executed on line, in which the statistical models are executed on network data being collected in real-time with the purpose of predicting the vehicular traffic situations and/or calculating traffic indicators without a real traffic observation.
  • Figure 1 schematically shows a geographic area of interest divided in a plurality of cells.
  • Figure 2 is a diagram of a mobile network infrastructure provided with a network data detection system, according to the present invention.
  • FIG. 3 is a diagram showing the architecture of the traffic prediction system according to one embodiment of the present invention.
  • FIG. 4 schematically shows the main steps of the traffic flow prediction method according to one embodiment of the present invention.
  • FIG. 5 is a graph showing a linear regression curve compared with an exemplary logistic regression curve.
  • FIG. 6 is an exemplary time-event diagram.
  • FIG. 7 is an exemplary map resulting from the application of the method according to one embodiment.
  • FIG. 8 is a graph reporting a regression function of a polynomial type, which best approximates the points (diamonds) whose coordinates are the real traffic data imported from an external data source and the total event number detected by a network probe.
  • the traffic flow prediction is carried out in a geographic area covered by a mobile telecommunications network and divided in a plurality of radio cells.
  • the geographic area is an urban area, for example corresponding to a medium-sized or large city.
  • Figure 1 schematically shows a geographic area 20 divided in a plurality of radio cells 21.
  • a radio cell covers an area of about 300 meters and the geographic area consists of 2000 cells, being at least two by two adjacent and contiguous.
  • in each cell, a base transceiver station (BTS), if the network is a 2G (GSM) one, or a Node B station, if the network is a 3G (UMTS) one, is arranged.
  • BTS or Nodes B communicate, by means of a transceiver antenna, with the mobile terminals, which transmit information, such as voice, data, text (SMS), system signalling, etc.
  • FIG. 2 is a block diagram of a mobile network infrastructure, which can be a 2G, 3G or 4G network.
  • in a 2G network, a plurality of BTSs 23, each of them having a coverage area defining a cell, are connected to a base station controller (BSC) 24, which concentrates the traffic towards a Mobile Switching Centre (MSC) 25 and sorts the calls towards the BTSs.
  • BSC base station controller
  • MSC Mobile Switching Centre
  • the MSC manages the control and routing of the calls and the resource allocation, and contains information related to a mobile terminal, such as the IMSI (International Mobile Subscriber Identity) and the IMEI of the users radio connected to the BTSs.
  • the MSC is connected to a plurality of BSCs 24 (only one BSC is shown in the figure).
  • the MSC 25 is made of two network logical entities, the MSC Server (MSC-S), managing the signalling, and a circuit switch (Media Gateway), managing user data.
  • MSC-S MSC Server
  • Media Gateway Media Gateway
  • BSC and MSC communicate with each other by means of an interface, named interface A, which handles the resource allocation to the mobile terminals and their mobility.
  • the mobile network could be a 3G one.
  • the Node B stations are connected to a Radio Network Controller (RNC), which is in turn connected to a digital switching centre; MSC and RNC communicate by means of an Iu interface.
  • RNC Radio Network Controller
  • a passive network probe 26 is physically connected to the MSC-S of the MSC managing an area comprising the geographic area to be monitored.
  • the network probe is configured to monitor real-time signalling messages being generated at the interface A of the MSC, or, in case of a 3G network, at the interface Iu, where the RNC ends the connection.
  • the network probe is configured to passively intercept the signalling messages passing in the mobile network and to collect the signalling messages exchanged between mobile network and mobile terminals.
  • the probe is configured to capture all the signalling messages being exchanged between mobile network and mobile terminals.
  • Each intercepted signalling message is associated with a message intercepting time instant, an identification of the mobile terminal user and with an identification of the cell transmitting and/or receiving the messages.
  • the time instant associated with a captured signalling message is the Timestamp being recorded by the mobile network, i.e. the event recording time at the interface A or Iu.
  • the network data each of them comprising identification data of a signalling message exchanged between the mobile network and a mobile terminal, the associated time instant, identification data of the radio cell where the mobile terminal is located while using the network (particularly the cell from which the terminal receives or transmits the signalling message) and user identification data are overall indicated as network event data.
  • the probe is configured to convert and code the collected network event data by converting them from their local format (provider specific) to a serialized data format, such as xDR (external Data Representation).
  • the network probe transmits the collected network event data to a monitoring server 27, which is configured to receive data from the probe and to process received data, particularly by assembling and filtering them as described in the following.
  • the network probe is configured to trace a plurality of network event data generated by the network and described in the Technical Specifications of 2G and 3G networks, each plurality of network event data comprising:
  • CM Service Request: a request by a mobile terminal for a service allocation
  • Paging Response: the allocation procedure for the localization of a mobile terminal
  • Location Update Request: the request to update the localization area of a moving mobile terminal
  • Location Updating Accept: the handover procedure when a mobile terminal moves from one cell and enters another cell
  • HO Request, HO Complete and HO Performed: the handover procedure when a mobile terminal moves from one cell and enters another cell
  • the piece of information temporally identifying a certain network event, namely the recording time instant of the signalling message (Timestamp), and, optionally, the TMSI, namely the identity temporarily assigned by the BSC to each mobile terminal when it is turned on, to avoid the IMSI being intercepted fraudulently.
  • if the probe monitors both 2G network and 3G network devices, it is preferred that the probe also captures the network indicator (signalling message Channel Descriptor), which distinguishes the events coming from the 2G network from those coming from the 3G network.
  • the network event data are transmitted to the monitoring server 27.
  • FIG. 3 is a diagram showing the architecture of the traffic prediction system according to one embodiment.
  • the network event data captured by the network probe 26 are transmitted to the monitoring server 27, which processes them by compressing, coding and aggregating the network event data collected by the probe.
  • the server 27 is configured to code the received data as a sequence of .csv (comma separated value) files, being identified with a file name of the form <StartDate>-<EndDate><ServerName>.csv, where <StartDate> and <EndDate> specify the starting instant and the final instant defining the network observation time interval.
  • the observation time interval ranges from 1 second to 5 seconds.
  • Each .csv file has as records (inputs) a plurality of coded network event data (e.g. xDR) that happened at a respective time instant comprised in the network observation time interval.
  • the .csv files are sequentially created by the monitoring server and are associated with consecutive observation time intervals.
  • the monitoring server 27 is configured to filter the network event data that can reveal the mobile terminal user identity in order to make those identification data anonymous.
  • the monitoring server is configured to process the IMSI associated with the mobile terminal that generated/received the event by obscuring it by means of irreversible encryption algorithms, being known per se.
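A minimal sketch of how an IMSI could be obscured before storage is given below. The source only states that an irreversible algorithm (a hash) is applied; the use of HMAC-SHA-256 with a secret key, the truncation to 16 hexadecimal characters (matching the length of the sample O-IMSI shown in the file record example) and all names and values here are assumptions. A keyed construction is used in this sketch because IMSIs are short numeric strings, so an unkeyed hash could be reversed by enumerating candidates.

```python
import hashlib
import hmac


def obscure_imsi(imsi: str, key: bytes) -> str:
    """Return a 16-hex-character pseudonym for `imsi` (hypothetical O-IMSI format)."""
    digest = hmac.new(key, imsi.encode("ascii"), hashlib.sha256).hexdigest()
    # Truncation to 64 bits is an assumption made to mimic the sample record.
    return digest[:16].upper()


# Hypothetical key and IMSI values, for illustration only.
KEY = b"demo-secret-key"
o_imsi = obscure_imsi("222101234567890", KEY)
```

The same IMSI always maps to the same pseudonym under a given key, which is what allows events of one terminal to be grouped later without revealing the user identity.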
  • the network data being structured in a data table 28, e.g. a .csv file, are transmitted to a traffic prediction system 29 being implemented on a server managing traffic data from mobile networks (not shown in the figure).
  • the traffic prediction system comprises a storage 30, preferably having high capacity and performance. In one embodiment, the storage has a capacity of 10 TB and uses 10,000 rpm drives.
  • the storage 30 receives and temporarily stores the received data; for example, the storage is configured to keep a 6-month history of .csv files.
  • the transmission mode of the network data from the monitoring server is preferably of the "push" type, that is the server 27 pushes a given data quantity in a continuous way at given time intervals, for example a 1 Mb file every 5 seconds (corresponding to the observation time interval) by means of a secure data transfer network protocol, such as SFTP.
  • the monitoring server outputs a plurality of files in .csv format, wherein each .csv file has a plurality of inputs, each input containing the following network information being captured by the network probe:
  • Timestamp: the time elapsed since 01/01/1970, in seconds (in order to temporally locate the event);
  • Event type: the event collected by the network probe, defined by a respective signalling message, wherein the latter is identified by an event identification code (i.e. the signalling message identification data).
  • the following signalling messages are monitored: CM Service Request, Common ID (the procedure informing the RNC or BSC about the user IMSI), Paging Response, Location Updating Request, Location Updating Accept, TMSI Reallocation Complete (the completion of the procedure used to protect the mobile terminal identity at every cell change), Location Report, Relocation Command, and events related to the handover procedure (HO Request, HO Complete and HO Performed);
  • LAC (Location Area Code): the code identifying the area covered by the radio cell of the mobile network (e.g. area 21 in figure 1);
  • SAC or CI: the Service Area Code or the cell identification;
  • O-IMSI: the IMSI code obscured by means of hash mechanisms;
  • Channel Descriptor: 2G or 3G channel descriptor.
  • the identification code of the signalling message is an integer. For example, "1" identifies the CM Service Request, "2" the Common ID event, "3" Paging Response, "4" Location Updating Accept, etc.
  • the .csv file records are structured in the storage 30 in the following way: "Timestamp, event identification code, LAC, SAC (or CI), O-IMSI, Channel Descriptor".
  • for example, a file record is "1297868695,3,51504,6360,3E6AF3303D865D23,1".
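The record layout above can be read with a few lines of code. The sketch below parses the sample record into a typed structure; the class and field names are hypothetical, and treating the Channel Descriptor as an integer is an assumption based only on the sample value.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class NetworkEvent:
    timestamp: int    # seconds elapsed since 01/01/1970
    event_code: int   # integer code of the signalling message
    lac: int          # Location Area Code
    cell_id: int      # SAC or CI
    o_imsi: str       # obscured IMSI
    channel: int      # Channel Descriptor (2G or 3G indicator)

    @property
    def time_utc(self) -> datetime:
        """Event time as a timezone-aware UTC datetime."""
        return datetime.fromtimestamp(self.timestamp, tz=timezone.utc)


def parse_record(line: str) -> NetworkEvent:
    """Parse one comma-separated record; surrounding spaces are tolerated."""
    fields = [f.strip() for f in line.split(",")]
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}")
    ts, code, lac, cell, o_imsi, channel = fields
    return NetworkEvent(int(ts), int(code), int(lac), int(cell), o_imsi, int(channel))


event = parse_record("1297868695,3,51504,6360,3E6AF3303D865D23,1")
```

For the sample record, `event.event_code` is 3 (Paging Response in the example coding) and `event.time_utc` falls on 16 February 2011.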
  • the traffic prediction system 29 receives also data related to the mobile network topology, namely the division of the monitored geographic area in a plurality of cells and each cell covering area.
  • the data related to the network topology can be made available by the service provider owning the network, for example in a first data table being structured in a .csv file containing the identification of each cell (LAC and SAC) and its spatial geometry.
  • the server 33 is configured to receive that data table and to process data contained therein producing a second data table, for example a shapefile recording the geometrical data (spatial geometries) and the attributes they represent.
  • the traffic prediction system 29 comprises a database 31 containing the mobile network topology and the network server 33 transmits the topological data 32 to the database 31 that stores and manages them.
  • the traffic prediction system 29 also receives the spatial data defining the geometrical and geographical elements of a geographical map of the monitored geographic area.
  • the spatial data are made available in one or more central repositories 34, external to the prediction system, for example Internet publicly available repositories, such as OpenStreetMap and Google Satellite Maps.
  • the spatial data are imported by the prediction system in a spatial database 35, which stores them.
  • the data stored in the storage 30 and databases 31 and 35 are inputted to a vehicular traffic prediction software tool 36 being suitable to process the input data and to determine a traffic prediction based on the input data.
  • the traffic prediction tool 36 comprises a parallel loading tool 37 configured to take the data being structured in .csv files from the storage 30 and to load them en bloc and in real-time in an event database 38.
  • the data transmitted and loaded in the event database which are related to a pre-set time interval, in the following specified as assessment time interval, and originating from a plurality of radio cells are stored in the event database and constitute the "historical" network event data for analysing the correlations between those data and the traffic indexes, as it will be described with more details in the following.
  • a filtering tool 39 being configured to analyse and filter the network events that are non-relevant to the vehicular traffic prediction calculation, extracts network event data being structured in the event database 38 and network topological data being structured in the database 31, wherein the network event data are associated with time instants comprised in a pre-set time interval.
  • the filtering tool 39 extracts the network event data from a .csv file being related to a pre-set time interval corresponding to the assessment time interval of the network event data stored for the plurality of cells.
  • the assessment time interval is formed by a sequence of network observation time intervals, corresponding to the observation intervals where the output data from the monitoring server 27 are coded.
  • the filtering tool 39, as well as the other tools included in the traffic prediction system 29 are implemented by software modules in order to be executed by a processor.
  • a first filtering mathematical algorithm implemented in the filtering tool 39, takes as input data the network topology data from the database 31 and the network event data stored in the event database 38 and it is configured to detect mobility modes that are not consistent with a movement on a user vehicle, for example mobile users that likely moved or are moving on foot or by bicycle.
  • the network event data comprise O-IMSI, LAC, SAC (or CI), the identification code of the signalling message captured by the probe (e.g. "1", "2", etc.) and the time instant associated with each event (e.g. Timestamp).
  • the first filtering mathematical algorithm carries out a procedure comprising receiving a plurality of network event data associated with time instants being coded as a file sequence associated with sequential observation time intervals and being comprised in an assessment time interval and eliminating from the plurality of network event data the data being associated with a mobile terminal whose spatial path through a plurality of adjacent and contiguous cells presents at least one displacement velocity from a cell to an adjacent and contiguous cell being lower than a pre-set threshold velocity value.
  • the first filtering algorithm carries out a procedure comprising: extracting a plurality of network event data from the event database 38 related to an assessment time interval, grouping network event data of the plurality of event data being associated with a same O-IMSI, in order to create a plurality of groups related to respective and different mobile terminals and temporally ordering the network event data of each group associated with an O-IMSI by establishing a spatial path being followed by each mobile terminal associated with a respective O-IMSI through a plurality of cells comprised in the plurality of cells forming the geographic area (during the reference time interval, that is the assessment interval).
  • each cell of the plurality of cells is identified by respective LAC and SAC (or CI) and the spatial path of each O-IMSI is represented by a sequence of identifications SAC (or CI) ordered in an increasing time order.
  • the plurality of network event data extracted from the event database comprise all the network event data loaded in the event database during the assessment time interval.
  • the procedure carried out by the first filtering algorithm comprises extracting from the network topology database the topological data referred to the cells crossed by each O-IMSI, cross-referencing the event data for each O-IMSI to the respective network topology data (being both identified by the same SAC/CI) in order to map the network event data to the network cells that generated them, and calculating a respective centroid for each cell of the plurality of cells crossed along the established path of each O-IMSI by means of mathematical functions being known per se.
  • the procedure carried out by the first filtering algorithm comprises: calculating the displacement velocity between the centroid of a cell Ci and the centroid of an adjacent and contiguous cell Cj for each pair <Ci, Cj> along the path, during a time interval defined by the difference between the time instant associated with the event generated in cell Ci and the time instant associated with the event in cell Cj, and associating to the calculated velocity a velocity tag Tij. If the velocity tag Tij is greater than a pre-set threshold velocity tag value, it is assumed that the mobile terminal user is moving using a (motor) vehicle.
  • the network event data associated with an O-IMSI whose path during the assessment time interval is associated with at least one velocity tag equal to or smaller than the threshold value for the movement between two adjacent and contiguous cells are eliminated from the plurality of network event data being received as input in the algorithm.
  • the procedure for time ordering the network event data is carried out, for example, by ordering in an ascending way the Timestamps associated with each event for a certain mobile terminal (O-IMSI).
  • the time ordering can be carried out by means of known mathematical functions of hierarchical ordering, by first ordering on the basis of the O-IMSI and then on the basis of the Timestamp.
  • the velocity of the mobile terminal identified by an O-IMSI is calculated by the ratio (distance between cell Ci centroid and cell Cj centroid)/(difference between the time instant being associated with LAC of cell Cj and the time instant being associated with LAC of cell Ci).
  • the first filtering algorithm automatically eliminates a network event associated with a target O-IMSI and with a first time instant if a network event, associated with the same O-IMSI and with a second time instant immediately preceding the first time instant, is located in the same radio cell.
  • the first filtering algorithm eliminates a given network datum indicating a movement with respect to a given network datum being intercepted immediately before inside the same cell, i.e. those subsequent network event data comprise the same LAC.
  • the movements inside the same cell are eliminated because the displacement velocity is automatically determined being zero, thus smaller than the threshold velocity value.
  • the first filtering mathematical algorithm uses the following classification for the velocity tags associated with a respective velocity interval calculated for a user, being identified by the O-IMSI, who moves between cell Ci and cell Cj, assuming a threshold velocity tag value equal to 15:
  • the filtering tool selects the O-IMSI having Tij > 15 for each movement between adjacent and contiguous cells.
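  • by way of non-limiting illustration, the first filtering procedure described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of the filtering tool 39: the function name, the tuple layout of the events, the planar centroid coordinates in metres and the threshold expressed directly as a speed in m/s (instead of a velocity tag) are all hypothetical.

```python
from collections import defaultdict
from math import hypot, inf


def filter_slow_terminals(events, centroids, min_speed):
    """Keep only the events of terminals whose every cell-to-cell hop is fast.

    events    -- iterable of (o_imsi, cell_id, timestamp_s) tuples (hypothetical layout)
    centroids -- dict mapping cell_id to an (x_m, y_m) centroid (hypothetical layout)
    min_speed -- threshold speed in m/s (hypothetical unit, stands in for the tag threshold)
    """
    groups = defaultdict(list)
    for o_imsi, cell, ts in events:
        groups[o_imsi].append((ts, cell))

    kept = []
    for o_imsi, path in groups.items():
        path.sort()  # ascending Timestamp within each O-IMSI group
        # Drop an event located in the same cell as the immediately preceding one.
        hops = [path[0]]
        for ts, cell in path[1:]:
            if cell != hops[-1][1]:
                hops.append((ts, cell))
        # Speed between the centroids of each pair of consecutive cells.
        speeds = []
        for (t0, c0), (t1, c1) in zip(hops, hops[1:]):
            (x0, y0), (x1, y1) = centroids[c0], centroids[c1]
            dt = t1 - t0
            speeds.append(hypot(x1 - x0, y1 - y0) / dt if dt > 0 else inf)
        # Assumption: a terminal that never changes cell has zero speed and is dropped.
        if speeds and min(speeds) > min_speed:
            kept.extend((o_imsi, cell, ts) for ts, cell in hops)
    return kept


# Made-up example data: u1 crosses 1 km per minute, u2 needs 15 minutes for 1 km.
centroids = {"A": (0.0, 0.0), "B": (1000.0, 0.0), "C": (2000.0, 0.0)}
events = [
    ("u1", "A", 0), ("u1", "B", 60), ("u1", "B", 65), ("u1", "C", 120),
    ("u2", "A", 0), ("u2", "B", 900),
]
vehicle_events = filter_slow_terminals(events, centroids, min_speed=4.0)
```

  • in this sketch a single slow hop removes all the events of the terminal, mirroring the "at least one velocity tag" criterion; whether a real deployment would rather drop only the slow segment is a design choice the sketch does not settle.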
  • the filtering tool 39 is configured to input the output data of the first filtering algorithm to a second filtering algorithm in order to detect and filter network anomalies, such as a too fast movement between cells and/or the ping-pong effect in the sequence of cells crossed by a mobile terminal.
  • the ping-pong effect takes place when a mobile terminal carries out the handover between two base stations, namely between two cells Ck-1 and Ck, passing from Ck-1 to Ck and back to Ck-1 within a pre-set time interval, indicated in the following as target time interval.
  • the second algorithm verifies, for each pair of adjacent and contiguous cells <Ck-1, Ck> crossed by a mobile terminal, if the localization updates of the mobile device are compatible with the target time interval for the crossing from a cell Ck-1 to an adjacent and contiguous cell Ck and back from cell Ck to cell Ck-1; for example, it verifies that they are not too quick.
  • the second filtering algorithm executes the following procedure: receiving as input data a plurality of network event data being associated with time instants comprised in the assessment time interval, grouping network event data of the plurality of event data in groups being associated with the same mobile terminal identified by a respective O-IMSI, temporally ordering the network event data of each group associated with each O-IMSI and establishing a spatial path being followed by each mobile terminal associated with a respective O-IMSI through a plurality of cells, the path being defined by a sequence of radio cells crossed by a mobile terminal and each cell being associated with an event localization time for a specific cell, <Ck, tk>.
  • the localization time associated with the event recording is the Timestamp.
  • the second filtering algorithm verifies that the crossing time of a mobile terminal along the spatial path from each cell Ck-1 to an adjacent and contiguous cell Ck, (tk - tk-1), and the back crossing time between cells Ck and Ck-1, (tk - tk-2), where tk-2 is the time two instants before the instant tk, are compatible with a pre-set target crossing time, ts.
  • the second algorithm determines if at least one of the following conditions is verified: (tk - tk-1) > ts and (tk - tk-2) < 2ts. If at least one of the two conditions is verified, the event associated with <Ck, tk> is eliminated.
  • the filtering tool iteratively carries out the second filtering algorithm on all the mobile terminals being identified by a respective O-IMSI, i.e. on all the network event data groups related to the respective O-IMSI.
  • the second filtering algorithm receives the network event data being filtered by the first filtering algorithm, which eliminated the network event data related to movements of a mobile terminal inside the same cell. Therefore, the input data of the second algorithm for a specific mobile terminal associated with consecutive time instants are network event data related to different radio cells.
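  • purely as an illustration of the ping-pong check, the following sketch removes a bounce A -> B -> A that completes within a target time interval. It is a simplified stand-in and does not reproduce the exact pair of timing conditions of the second filtering algorithm; the function name, the (cell, time) tuple layout and the choice of dropping both the bounce and the return event are assumptions.

```python
def drop_ping_pong(path, target_interval):
    """Remove quick A -> B -> A bounces from a terminal's path.

    path            -- list of (cell_id, timestamp_s), time-ordered, consecutive cells differ
    target_interval -- maximum duration in seconds of a bounce treated as ping-pong (hypothetical)
    """
    kept = []
    for cell, t in path:
        bounced_back = (
            len(kept) >= 2
            and kept[-2][0] == cell              # returned to the cell seen two events ago
            and t - kept[-2][1] < target_interval
        )
        if bounced_back:
            kept.pop()       # drop the short visit to the neighbouring cell
            continue         # and drop the return event: the terminal never really left
        kept.append((cell, t))
    return kept


# Made-up example: the A -> B -> A bounce lasts 3 s, well inside a 10 s target interval.
cleaned = drop_ping_pong([("A", 0), ("B", 2), ("A", 3), ("C", 60)], target_interval=10)
```

  • a slow A -> B -> A sequence, e.g. with 100 s between events, passes through unchanged, since it is compatible with a genuine round trip.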
  • the filtering tool could be configured to execute the second filtering algorithm before the first filtering algorithm.
  • the first filtering algorithm receives the network event data being filtered by the second filtering algorithm.
  • the filtering tool transmits and loads into the target database 38 the input data processed by the filtering tool 39 in order to eliminate non-relevant network events.
  • the target database is a relational database comprising two tables implementing the respective relationships of non-filtered data, transmitted and loaded by the parallel loading tool 36, and of filtered data, transmitted and loaded by the filtering tool 39.
  • a statistical data processing tool 40 receives the structured and filtered network data from the target database 38 and real traffic sampling data being observed by existing traffic systems.
  • the traffic sampling data are "historical" traffic data coming from at least one traffic information source and collected in one or more repositories 43 of information publicly available on the Internet, such as InfoTraffic, WebCam, Google Traffic or Waze, or from a manual traffic sampling in pre-set points of the geographic area.
  • the statistical processing tool 40 is configured to extract the traffic data contained in the repository 43 in a time interval corresponding to the assessment time interval and related to the plurality of cells of the geographic area and to extract the stored network event data for the assessment time interval.
  • the statistical processing tool is further configured to analyse the received data and to find a mathematical relationship between the historical network events being structured and stored in the database 38 during an assessment time frame and the traffic sampling data extracted from the repository 43 being associated with the same time frame.
  • the at least one statistically significant mathematical relationship is stored in the statistical data processing tool 40.
  • the at least one mathematical relationship is a regression mathematical function being applied to the network event data, as described in more detail in the following.
  • the stored regression mathematical functions are used by a traffic prediction tool 41 during a second on-line phase of the method according to the present disclosure.
  • the traffic prediction tool 41 is activated by external requests (for example navigation systems or simulators generating requests automatically) for the real-time calculation of the vehicular traffic indexes.
  • an external request to the traffic prediction system is activated by a mobile terminal user through an application interface.
  • when the traffic prediction tool 41 is triggered by an external request, it is configured to receive real-time network event data collected by means of the network probe 26, to determine the event number related to the request instant by taking the detected network event data from the database 38 in a collection time interval comprising the request instant, and to apply mathematical relationships, defined by the regression function stored in the statistical processing tool 40, in order to determine the traffic numerical values related to the request instant and to the radio cells associated with the request, which form the prediction geographic area, as the final processing output.
  • the statistical processing tool can be configured also to create three-layer maps grouping the calculated traffic data for that request instant, the mobile network topology contained in 31 and the geographical map of the geographic area monitored by the spatial data contained in the database 35 in a single map.
  • the above-described tools (39, 40 and 41) are software modules implemented in the usual way, for example using programming languages such as Java, C or scripting languages to receive, transmit and process data contained in the databases.
  • FIG. 4 schematically shows the main steps of the traffic flow prediction method according to one embodiment of the present invention.
  • the method comprises a first plurality of steps 51 being carried out off-line, before monitoring the real traffic conditions.
  • the first plurality of steps 51 comprises: receiving "historical" network event data 49a that have been collected through network probes and stored in an event database (e.g. the event database 38 in the system of figure 3) related to a plurality of cells forming the geographic area of interest; receiving network topology data 49b (e.g. from database 31); creating a data structure based on historical events data and network topology data in order to map the network event data to the network cells generating them; filtering the sample event data by eliminating the non-relevant network event data (e.g. by means of the filtering tool 39); selecting sample network event data from the stored network event data; correlating (e.g. by means of the data statistical processing tool 40 in the system of figure 3) the sample event data to the traffic sampling data 50 coming from at least one external traffic information source; and applying at least one regression mathematical function, described by a continuous curve and by a plurality of numerical parameters, to the number of sample network event data as independent variable and to the traffic sampling data as dependent variable, finding the plurality of numerical parameters of the regression function which best approximates the curve of discrete input values.
  • the step of applying at least one regression function is executed in at least one cell of the plurality of cells and preferably in a sub-plurality of the plurality of cells.
  • a plurality of regression functions is applied to the sample event data and to the traffic sampling data and the method comprises selecting a regression function of the plurality of regression functions that best approximates the curve being formed by the discrete values of the independent and dependent variables.
  • the selected regression function is stored in order to be used in a subsequent step for the real traffic prediction in the radio cell.
  • the sample network event data and the traffic sampling data are related to the same time frame, indicated as assessment time interval, starting from a starting instant, for example two hours, twenty four hours or one week starting from a specific time instant, and related to the same radio cell.
  • the data sample is selected in order to be statistically relevant, generally comprising at least one hundred detections correlated to respective traffic sampling data and network event data.
  • the method comprises a second plurality of steps carried out on-line for the real-time traffic prediction in a geographic area of interest comprising a plurality of cells.
  • the stored historical network event data are analysed to determine a plurality of relationships or mathematical models between network event data stored in the assessment time interval and traffic numerical values starting from the application of the regression function being determined in the off-line step for the at least one first cell and those mathematical relationships are applied to the real-time network event data collected during a collection time interval comprising the request instant for each cell of the plurality of cells forming the geographic area of interest (step 47) in order to determine the corresponding traffic prediction numeric datum (step 48).
  • a dotted line 44 shows the division between the first plurality of off-line steps and the second plurality of on-line steps.
  • the method selects the network event data related to an event type, namely to a specific signalling message, for example Location Update Request.
  • the independent variable used to determine the regression function in the off-line step is the number of "historical" network event data related to the specific signalling message and, accordingly, the independent variable in the on-line phase is the number of real-time event data captured by the probe at a detection instant or during a time interval related to the selected event, always related to the same event, i.e. Location Update Request.
  • the independent variable both to determine the regression function and to apply the same, following traffic prediction requests, is the total number of network event data captured by the probe. For example, if the probe is configured to trace the signalling messages CM Service Request, Common ID, Paging Response, Location Updating Accept, TMSI Reallocation Complete, Location Report, Relocation Command, HO Request, HO Complete and HO Performed, the total number of network event data captured by the probe is the sum of all event data described by those signalling messages captured in an observation time interval.
  • the independent variable to determine off-line the regression function is the total number of network event data related to a plurality (i.e. at least two) of events, namely the network event data used to calculate the independent variable are related to at least two different signalling messages.
  • the current network event data being intercepted by the network probe during a collection time interval are aggregated and coded in sequential observation intervals by the monitoring server 27 in a way similar to the "historical" network event data during the off-line step.
  • the current event data structured in a data table, are transmitted by the server 27 to the storage 30.
  • the parallel loading tool 37 takes the structured data and loads them en bloc and in real time in the event database 38.
  • the observation intervals dividing the collection interval have all the same duration, for example 2 seconds, and they are selected to be equal to the ones of the network event data being detected and stored in the off-line step. Therefore, the collection time interval is divided in observation intervals and the calculation of the number of detected network event data is carried out by calculating the number of current network event data for each observation interval constituting the collection time interval.
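  • as a hedged illustration of the counting just described, the sketch below bins event timestamps into equal observation intervals and sums the per-interval counts over the collection interval. The function name and the representation of timestamps as seconds are assumptions; the 2 second width follows the example in the text.

```python
from collections import Counter


def count_per_observation_interval(timestamps, start, n_intervals, width_s=2.0):
    """Return the number of events falling in each observation interval.

    timestamps  -- event times in seconds (hypothetical representation)
    start       -- start of the collection interval, in seconds
    n_intervals -- number of observation intervals forming the collection interval
    width_s     -- duration of one observation interval, in seconds
    """
    counts = Counter()
    for ts in timestamps:
        idx = int((ts - start) // width_s)
        if 0 <= idx < n_intervals:  # ignore events outside the collection interval
            counts[idx] += 1
    return [counts[i] for i in range(n_intervals)]


# Made-up timestamps; the collection interval is 4 observation intervals of 2 s each.
per_interval = count_per_observation_interval([0.1, 1.9, 2.0, 5.5, 7.9, 8.0], 0.0, 4)
n_detected = sum(per_interval)  # number of events in the whole collection interval
```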
  • the current network event data are taken by the filtering tool 39 in order to filter away the non-relevant network events for the vehicular traffic prediction calculation in a way similar to the one previously described for the off-line step.
  • a first and second filtering algorithm are applied to the current network event data and the filtered event data are inputted to the event database 38.
  • the filtering tool sequentially extracts all the network event data related to a plurality of consecutive observation intervals forming the collection time interval, so as to continuously filter the network event data being loaded in the event database 38.
  • the real-time detected network event data are inputted into the traffic prediction tool 41 in order to calculate the traffic indexes in the cells forming the geographic area of interest during the collection interval, as explained in more detail in the following.
  • the Applicant observed that the accuracy of the traffic condition prediction depends on the correlation coefficient of the correlation mathematical function being identified during the first off-line step of the invention.
  • the correlation function is a polynomial regression function having degree three calculated by means of the least squares technique.
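  • a third-degree polynomial fit by least squares can be sketched, without external libraries, by solving the normal equations. The sketch below is only one possible way of doing so and is not taken from the statistical data processing tool 40; the function name and the synthetic data are assumptions.

```python
def polyfit_ols(z, y, degree=3):
    """Return [a0, a1, ..., a_degree] minimising sum((y - (a0 + a1*z + ...))**2)."""
    n = degree + 1
    # Normal equations (X^T X) a = X^T y for the Vandermonde matrix X.
    A = [[sum(zi ** (r + c) for zi in z) for c in range(n)] for r in range(n)]
    b = [sum(yi * zi ** r for zi, yi in zip(z, y)) for r in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for r in reversed(range(n)):
        s = sum(A[r][c] * coeffs[c] for c in range(r + 1, n))
        coeffs[r] = (b[r] - s) / A[r][r]
    return coeffs


# Synthetic data lying exactly on a known cubic, so the fit should recover it.
true_coeffs = [1.0, 0.5, -0.2, 0.1]
z_samples = [-2, -1, 0, 1, 2, 3]
y_samples = [sum(a * zi ** k for k, a in enumerate(true_coeffs)) for zi in z_samples]
fitted = polyfit_ols(z_samples, y_samples)
```

  • with event numbers in the hundreds, the normal equations become poorly conditioned, so rescaling z beforehand or relying on an established numerical library would be the safer choice in practice.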
  • Table 1 reports an example of sample event data coming from the mobile network (i.e., number of signalling messages z1 and z2) and of input real traffic sampling data, i.e. Yh (off-line step 51 in figure 4), for a specific cell.
  • the sampling time intervals all have the same duration, equal to 2 minutes.
  • each sampling time interval is formed by a sequence of observation time intervals all having the same duration, for example 5 seconds.
  • the sampling intervals range from 1 minute to 5 minutes.
  • "Handover N°" lists the number of event data of the handover type, i.e., the events where the mobile terminal, during a service session (call, data sending/receiving, etc.), is redirected from its actual cell to a new cell.
  • the handover events include Handover Request, Handover Complete and Handover Performed (therefore z2 is the sum of events related to three event types). Those events z1 and z2 are recorded by the network probe as events of the reference cell and during the time interval Δt corresponding to the one related to the real traffic index reported in the same table row and received as input data from an external data source.
  • the sample event data of Table 1 are data being filtered by the first and second filtering algorithm in the ways described above.
  • the least squares technique is applied in order to determine the numerical parameters of the regression function by using known software tools included, for example, in the statistical data processing tool 40 in the embodiment of figure 3.
  • R²: correlation coefficient, describing the correlation level between Y and z.
  • SSresid: sum of the squared residuals of the regression (residual deviation), equal to zero in case of a perfect linear relationship.
  • an F test and/or a T test is applied in order to verify if the relationship between Y and z is random, namely if the errors are following a normal distribution.
  • the F critical value is 4.53. If the F value is greater than the critical value, the correlation is not random.
  • similarly, if the T value is greater than a critical value, the correlation is not random.
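  • the F value of a fitted regression can be computed from the residual and total sums of squares. The sketch below uses the textbook form F = (SSreg / p) / (SSresid / (n - p - 1)) for a least-squares fit with intercept and p regressors; that the tool 40 uses exactly this form is an assumption, and the function name and sample data are hypothetical.

```python
def f_statistic(y, y_hat, n_regressors):
    """F value of a least-squares fit with intercept.

    y            -- observed values of the dependent variable
    y_hat        -- values predicted by the fitted regression function
    n_regressors -- number p of regressors, intercept excluded
    """
    n = len(y)
    mean_y = sum(y) / n
    ss_resid = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    ss_reg = ss_total - ss_resid  # valid for least-squares fits with intercept
    df_resid = n - n_regressors - 1
    return (ss_reg / n_regressors) / (ss_resid / df_resid)


# Made-up data with its least-squares line y = 2.2 + 0.6 * x for x = 1..5.
y_obs = [2, 4, 5, 4, 5]
y_fit = [2.2 + 0.6 * x for x in (1, 2, 3, 4, 5)]
f_value = f_statistic(y_obs, y_fit, n_regressors=1)
```

  • the resulting value is then compared with the critical value for the relevant degrees of freedom, which has to be taken from an F table or a statistics library.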
  • a regression function f(z) is generated starting from data Yh and z (or z1, z2, z3, ... in case of a multivariable regression function) collected in the same pre-set assessment time interval (for example, a week) for each cell. In this way, several "significant" statistical models can be found, and every statistical model is associated with a regression function describing the correlation between the number of events generated/exchanged by the mobile network and the real traffic indexes collected from external data sources.
  • a plurality of regression functions are selected (e.g. by means of the statistical data processing tool 40), each regression function is a mathematical function described by a continuous curve and by a plurality of unknown numerical parameters.
  • the plurality of regression functions is applied to the input data, i.e. sample event data and traffic sampling data, and the regression function better describing the correlation between the network events and the traffic indexes is selected, preferably selecting the regression function having the highest correlation coefficient R².
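  • selecting the candidate with the highest R² can be sketched as follows, assuming that each candidate regression function has already been fitted and is available as a callable. The function names, the candidate labels and the sample data are hypothetical.

```python
def r_squared(y, y_hat):
    """R^2 = 1 - SSresid / SStotal."""
    mean_y = sum(y) / len(y)
    ss_resid = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_resid / ss_total


def select_best(candidates, z, y):
    """Return (name of the candidate with the highest R^2, all scores).

    candidates -- dict mapping a label to an already fitted function f(z)
    """
    scores = {name: r_squared(y, [f(zi) for zi in z]) for name, f in candidates.items()}
    return max(scores, key=scores.get), scores


# Made-up samples following y = z**3, with a least-squares line as competitor.
z_samples = [1, 2, 3, 4, 5]
y_samples = [1, 8, 27, 64, 125]
candidates = {
    "linear": lambda z: 30.4 * z - 46.2,
    "cubic": lambda z: z ** 3,
}
best_name, scores = select_best(candidates, z_samples, y_samples)
```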
  • the regression function between the network event data and the traffic data can be described by a generalised linear model.
  • by generalised linear model it is meant a linear function and polynomial functions having a degree greater than one.
  • a linear regression function or more generally a polynomial regression function, generally requires that the errors distribution is normal in order to apply the least squares procedures. In other words, it is preferable that the above-described F or T tests are verified.
  • the correlation between the sample event data and the real traffic sampling data can be described by a single variable or multivariable logistic regression function.
  • a logistic regression does not assume a linear relationship between independent variables and dependent variables, and it requires neither that the distribution of the regression errors be normal nor that the error variance be constant (homoscedasticity).
  • f(z) is the probability that Y is equal to a
  • Figure 5 is a graph showing a linear regression curve 61 compared to an exemplary logistic regression curve 62.
  • the linear regression function 61 can predict values that are greater than 1 or smaller than zero, or generally values in between 0 and 1.
  • the independent variable z of the logistic regression function has a linear relationship with the logit of the dependent variable:
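  • the logit relation can be checked numerically with the standard single-variable logistic form p = 1 / (1 + exp(-(b0 + b1*z))). The sketch below is illustrative only: the coefficient values b0 and b1 and the straight line used for comparison are hypothetical and are not taken from the regression functions of the present disclosure.

```python
from math import exp, log


def logistic(z, b0, b1):
    """Standard logistic function; its output always lies strictly between 0 and 1."""
    return 1.0 / (1.0 + exp(-(b0 + b1 * z)))


def logit(p):
    """Log-odds of a probability p, with 0 < p < 1."""
    return log(p / (1.0 - p))


b0, b1 = -3.0, 0.02          # hypothetical coefficients
event_numbers = [0, 100, 200, 300]

probabilities = [logistic(z, b0, b1) for z in event_numbers]
# The logit of the logistic output is linear in z: it gives back b0 + b1 * z.
log_odds = [logit(p) for p in probabilities]

# A straight line (hypothetical coefficients) is not confined to the range 0..1.
line_values = [0.005 * z - 0.2 for z in event_numbers]
```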
  • the sample events z collected for a cell can be expressed as a function of the sampling time interval, Δti, in a time-event diagram.
  • Figure 6 shows an example of time-event diagram, reporting the event number, in this case all the events that the probe is able to monitor, as a function of time t.
  • a plurality of event number threshold values, N1, N2, N3 and N4, have been defined. Each event number threshold value is associated with the transition between a traffic level and the next one; the traffic levels are taken from external sources of traffic sampling data and describe four traffic conditions, represented by four respective numerical values, also referred to as indexes.
  • the traffic level is considered to be moving and is shown with index 1; between threshold value N1 and value N2, the vehicle flow is considered to be high and is shown with index 2; and between threshold value N2 and value N3, the vehicle flow is considered to be congested and is shown with index 3. Beyond threshold value N3, the vehicle flow is defined by a level 4, being "impossible".
  • the method comprises generating a time-event diagram for said first cell, that describes the event number, i.e. signalling messages, recorded for the first cell as a function of the event recording time obtained by the analysis of the "historical" network events being stored with reference to the first cell.
  • the assessment time interval of the historical data is divided in a plurality of sampling time intervals and the event number in the diagram is a y-axis point referring to a specific sampling time interval.
  • the normalization factors are stored in the traffic prediction tool 41.
  • the normalization factors determined for the first cell during the off-line phase are then applied during the on-line phase to the recorded values of real-time network event number for each respective cell, different from the first cell, of the plurality of cells forming the geographic area, in order to calculate the corresponding traffic indexes for each cell of the geographic area.
  • the network event data localized in cell Ck are collected during a pre-set collection time interval comprising the time instant at which the request has been generated.
  • the number of network event data, (Ndet)k, detected during the collection time interval is calculated by summing the events detected during that interval.
  • the collection interval is selected to be equal to the sampling time interval where the number of network event data in the first cell during the off-line phase have been calculated.
  • the network event data collected in real-time by the probe are aggregated and coded in the monitoring server 27 during sequential observation intervals corresponding to observation intervals of the sample event data.
  • the collection time interval is formed by a plurality of sequential observation intervals and (Ndet)k is determined by summing the respective numbers for each observation interval forming the collection interval.
  • the detected value (Ndet)k is compared to the threshold values (N1)k, (N2)k, ..., (Nn)k in order to determine the highest event number threshold value smaller than or equal to the value (Ndet)k, and the traffic numeric value corresponding to said highest event number threshold value is associated with (Ndet)k. For example, if (N1)k ≤ (Ndet)k < (N2)k, the traffic index associated with (Ndet)k is 1; if (N3)k ≤ (Ndet)k < (N4)k, the traffic index associated with (Ndet)k is 3.
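  • the comparison rule of the preceding paragraph (highest threshold smaller than or equal to the detected value) maps naturally onto a binary search over the ascending thresholds. The sketch below is a minimal illustration; the function name, the example thresholds and the return value 0 for a detected value below the first threshold are assumptions.

```python
from bisect import bisect_right


def traffic_index(n_detected, thresholds):
    """Index of the highest threshold that is <= n_detected.

    thresholds -- ascending list [N1, N2, ..., Nn] for one cell
    Returns 0 when n_detected is below N1 (assumption: no traffic index applies).
    """
    # bisect_right counts how many thresholds are <= n_detected.
    return bisect_right(thresholds, n_detected)


# Hypothetical thresholds for one cell.
cell_thresholds = [10, 20, 30, 40]
indexes = [traffic_index(n, cell_thresholds) for n in (5, 10, 19, 35, 40, 99)]
```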
  • the steps of selecting the threshold value of event data number associated with the traffic index n, calculating the threshold values of event data number associated with the traffic indexes 1, ..., (n-1) by applying the normalization factors and comparing the detected event number to the n threshold values are repeated for all the cells forming the geographic area.
  • the threshold value (Nn)k, calculated based on historical off-line data and on a pre-set sampling time for each cell of the plurality of cells of the geographic area, is updated during the on-line phase every time a new maximum value is observed during the sampling time interval.
  • the system calculates the event number registered real-time during a time interval around 16:00 (e.g. between 15:58 and 16:00), which defines the collection interval, it calculates the threshold values for the second cell by applying the normalization factors for the first cell, and it selects a time-event diagram associated with the second cell. Then, the system outputs the traffic index being calculated by applying the regression function and having as input datum the recorded event number.
  • the circle represents the event number recorded real-time around 16:00 being associated with a traffic level 3, "congested flow".
  • the procedure is iteratively applied to all the remaining cells of the plurality of cells forming the prediction geographic area.
  • the prediction geographic area can correspond with the geographic area formed by the plurality of cells for which "historical" event data are stored or it can be an area of interest for the prediction included in that area.
  • FIG. 7 shows the map of the area (the main roads of the geographic area of interest are schematically visible), the cells topology, whose borders are shown using a white continuous line and a graphical representation of the traffic indexes by means of a grey scale for each cell.
  • White colour shows a traffic index 1, light grey the traffic index 2, dark grey a traffic index 3 and black a traffic index 4, respectively.
  • the maps can be updated on line, for example by means of an application triggering the execution of traffic predictions at regular time intervals (e.g. every 3 minutes) that calculates again the traffic indexes for each cell of the plurality of cells of the geographic area of interest.
  • the above-described method has been implemented and applied (in its first off-line phase) to six different urban areas of the city of Milan in order to find a statistical model based on a regression being able to describe the correlation between mobile network events and real traffic indexes of the considered areas.
  • Areas have been selected following heterogeneity criteria: the areas represent different peculiarities often existing in a big city, namely narrow streets, multi-lane avenues, congested traffic flow areas, etc.
  • Each area was formed by a respective radio cell, indicated in the following as C1-C6.
  • samples have been collected related to the event number for each cell associated with the 6 selected geographical areas and during different times of the day.
  • the event number is calculated considering all the network event types generated during a 2-minute time interval (i.e. sampling interval) centred on the Timestamp of the sampling request.
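  • counting the events in a window centred on the Timestamp of a sampling request can be sketched with two binary searches over time-ordered event timestamps. The function name, the representation of time in seconds and the choice of a window closed at both ends are assumptions.

```python
from bisect import bisect_left, bisect_right


def events_in_window(sorted_timestamps, centre, width_s=120.0):
    """Number of events with centre - width/2 <= t <= centre + width/2.

    sorted_timestamps -- event times in seconds, in ascending order
    centre            -- Timestamp of the sampling request, in seconds
    width_s           -- total window duration (2 minutes in the example above)
    """
    half = width_s / 2.0
    first = bisect_left(sorted_timestamps, centre - half)
    last = bisect_right(sorted_timestamps, centre + half)
    return last - first


# Made-up timestamps; the window around t = 100 s spans 40 s to 160 s.
event_number = events_in_window([0, 30, 59, 61, 100, 130, 181], centre=100)
```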
  • the traffic sampling data for the dependent variable Yh have been collected from a source outside the system, available on the Internet, providing real-time traffic data calculated starting from cameras and inductive coils located at relevant points of the main streets of the urban areas under analysis.
  • the real traffic data Yh and the event number for each cell during an assessment time interval have been correlated by means of the least squares technique (OLS, Ordinary Least Squares) using three regression functions: a linear function according to Eq. (1), a single-variable third degree polynomial function, and a single variable logistic function of the type described by Eq. (4).
  • the statistical significance threshold has been set equal to 0.01.
  • the regression functions that best correlated the traffic behaviour related to the network event number were the third degree polynomial functions. Linear and logistic functions for the six selected areas did not provide any significant statistical model. Table 2 reports statistical significance data of the found regression functions for the six reference areas, correlating traffic data Yh (80 samples) and samplings related to the total event number, z, associated with the plurality of six cells, C1-C6.
  • the identified regression functions are third degree polynomial regression functions, of the form:
  • the polynomial regression function identified for the area covered by cell C3 has been used as statistical model to describe the relationship between real traffic and network events for all the system cells.
  • Figure 8 is a graph showing the real traffic data, Yh, imported from an external data source on the x-axis and the total event number detected by the network probe for each sampling time interval during the 76 days evaluation period on the y-axis.
  • the traffic data range from a value 1, corresponding to the moving flow condition to a value 4, corresponding to the impossible traffic condition.
  • the continuous line is the polynomial regression curve having degree three being calculated by means of the OLS technique.
  • the F value is 124.68, much higher than the critical value equal to 7 in case of 100 Df, therefore the residuals are supposed to follow a normal distribution and the null hypothesis is rejected with a probability, p ⁇ 0.01.
  • using the polynomial regression function found for cell C3, it is possible to calculate the normalization factors in order to identify the traffic threshold values.
  • the system calculated the maximum event number during the data sampling time interval (2 minutes) to be equal to 315, corresponding to the value calculated by the regression function (see figure 8) for the traffic index 4. Then by observing the intersection of the regression curve with every traffic index, it has been possible to calculate the event number threshold correlated to respective traffic indexes smaller than 4.
  • the network events number reduction is 26.984% (from 315 to 230);
  • the network events number reduction is 57.143% (from 315 to 135);
  • the network events number reduction is 68.254% (from 315 to 100).
  • the network event data are detected in cells C1-C6 during a collection interval equal to the sampling interval (i.e. 2 minutes) and comprising the request time instant.
  • the system calculated the number of network event data detected during the collection time interval for each cell, (Ndet)1, .., (Ndet)6, and collected from off-line data the N4 values for each cell C1-C6 and the normalization factors calculated for cell C3.
  • the respective number of collected network event data (Ndet)1-(Ndet)6 has been compared with the event number threshold values for the respective cell in order to determine which traffic index each number of event data had to be associated with.
  • the event number detected in the collection interval is 74 for cell C1.
  • the number (Ndet)1 is compared to the N1-N4 values being calculated in that way for cell C1. Since (Ndet)1 is greater than N1 and smaller than N2, the real-time traffic index associated with cell C1 is the one corresponding to N2, namely 2 (high). If, again by way of example, (Ndet)1 had been greater than or equal to 147, the traffic index would be 4, namely impossible.
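
The arithmetic in the points above can be checked with a short sketch. The reference values 315, 230, 135 and 100 for cell C3 and the count of 74 events come from the text; the value 147 used as N4 for cell C1 is inferred from the closing example. The function names, the rule of picking the highest threshold not exceeding the detected count, the labelling of the scaled thresholds by traffic index, and the fallback to the lowest index are all assumptions of this sketch.

```python
# Reference thresholds for cell C3, keyed by traffic index (values from the text).
REFERENCE_THRESHOLDS = {1: 100, 2: 135, 3: 230, 4: 315}


def normalization_factors(reference):
    """Percent reduction of each threshold relative to the maximum-index threshold."""
    top = max(reference)
    n_max = reference[top]
    return {i: (1 - n / n_max) * 100 for i, n in reference.items() if i != top}


def cell_thresholds(n_max_cell, factors):
    """Scale a cell's own maximum threshold by the reference percent reductions."""
    thresholds = {i: n_max_cell * (1 - f / 100) for i, f in factors.items()}
    thresholds[max(factors) + 1] = float(n_max_cell)
    return thresholds


def traffic_index(n_detected, thresholds):
    """Index of the highest threshold that is <= n_detected (assumed rule).

    Falls back to the lowest index when n_detected is below every threshold;
    that fallback is an assumption, not something the text specifies.
    """
    reached = [i for i, n in thresholds.items() if n <= n_detected]
    return max(reached) if reached else min(thresholds)


factors = normalization_factors(REFERENCE_THRESHOLDS)
thresholds_c1 = cell_thresholds(147, factors)  # 147: N4 inferred for cell C1
index_c1 = traffic_index(74, thresholds_c1)    # 74 events detected in cell C1
```

Under these assumptions the factors reproduce the three percent reductions listed above, and the count of 74 events maps to traffic index 2 for cell C1.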

Landscapes

  • Chemical & Material Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

A method for vehicular traffic prediction in a geographic area covered by a mobile telecommunications network and comprising a plurality of radio cells, wherein the method comprises a first plurality of steps being carried out off-line, before monitoring the real traffic conditions. The first plurality of steps comprises receiving "historical" sample network event data being generated in a first cell of the plurality of cells and determining mathematical relationships correlating the sample data to traffic numerical values being acquired from an external data source. The method comprises a second plurality of steps wherein the network event data are collected in real time, in a collection time interval, from each cell of the plurality of cells in order to calculate the corresponding traffic prediction numerical datum based on mathematical models being determined starting from the mathematical relationships being determined in the off-line step, and then to calculate the current real-time traffic condition, without having a current traffic observation available.

Description

"Method and system for vehicular traffic prediction"
Mobility in urban areas has been a subject of study in recent years, mainly because of the interest in environmental and social policies aimed at reducing the environmental impact of the movement of people and goods. A growing offering of analysis technologies and greater attention to sustainable mobility point towards improvements in socio-economic conditions and living standards.
Traditional approaches for calculating the real-time vehicular traffic conditions are often based on road sensors, such as inductive coil sensors, closed circuit cameras and on emergency calls. More recently, GPS data and social network discussions are used to identify areas where the traffic congestion is higher. However, these latter approaches are generally limited by the social discussions reliability and by the reduced availability of data collected from GPS.
The active or passive traffic monitoring systems use FCD (Floating Car Data) technologies in order to calculate real-time updated traffic data. In active monitoring, vehicles provided with a device connected to a data collection system periodically provide information related to the vehicle velocity and position at a given instant. In passive monitoring, a device installed on the vehicle is not required and the data are passively collected by statistical detection equipment and conveniently processed in order to infer the users' position.
A passive FCD technology is FCND (Floating Cellular Network Data), in which each mobile terminal (2G, 3G, 4G) connected to the network becomes an information source that can be used to monitor mobility patterns by anonymously processing the position of each mobile station.
M. T. Alrefaie et al. in "SUPERHUB: the User Centric Approach for New Traffic Prediction Models", presented to the 9th ITS European Congress, Dublin, Ireland, 4-7 June 2013, describe a microscopic real-time traffic predicting model based on GPS data and two macroscopic models using data from a mobile network. A macroscopic model, called probabilistic approach for prediction of traffic (PAPT), is based on the assumption that probabilistic models can describe the correlations between the real traffic situations and the mobile network events collected in real time by probes. These correlations are modelled by means of regression functions trying to define the relationships between the dependent variables Y (the "historical" data collected from other data sources, for example Google traffic or navigation systems GPS) and the independent variables Z (the data collected by the probes connected to the mobile network).
The Applicant observed that the selection of probabilistic models describing the correlation between the real traffic conditions and the events collected by the mobile network is important in order to obtain an accurate prediction of the traffic flow. According to an aspect consistent with the present disclosure, there is provided an automatic method for predicting the vehicle traffic in a geographic area covered by a mobile telecommunications network and comprising a plurality of radio cells, the method comprising:
receiving network event data between the mobile network and mobile terminals in an event database and storing network event data localized in the plurality of radio cells and occurred at event instants comprised in an assessment time interval, wherein each network event datum comprises data identifying a signalling message exchanged between the mobile network and a mobile terminal, an identification code of the mobile terminal user, data identifying a radio cell where the mobile terminal has been localized receiving/transmitting the signalling message, and an event time instant associated with said signalling message,
dividing the assessment time interval into a plurality of sampling time intervals and calculating the number of sample network event data for each sampling interval comprised in the assessment time interval so as to determine a plurality of numbers of sample event data for each cell of the plurality of cells,
selecting sample network event data in the data stored in the event database as network event data associated with at least one event described by a respective signalling message localized in a cell of the plurality of radio cells in the assessment time interval,
- receiving, from a traffic data source, numerical traffic sampling data indicative of the traffic and relating to time intervals comprised during a time interval corresponding to the assessment time interval and in a sampling area corresponding to the area covered by a first radio cell of the plurality of cells, wherein the numerical traffic sampling data are defined with a plurality n of ascending numerical traffic values as traffic indexes,
correlating the plurality of numbers of sample event data calculated for the first cell with the numerical traffic sampling data by applying at least one mathematical regression function, described by a continuous curve and by a plurality of numerical parameters, to the numbers of sample network event data for the first cell as an independent variable and to the traffic sampling data as a dependent variable searching the values of the plurality of numerical parameters of the regression function which best approximate point values having as their coordinates the numbers of sample network event data and the numerical traffic values,
selecting and storing the mathematical regression function of best approximation of the at least one mathematical regression function,
generating mathematical relationships based on the mathematical regression function of best approximation,
receiving, at a time instant of request, a request of prediction of the traffic in a prediction geographic area comprising a second radio cell, wherein the second cell is comprised in the plurality of radio cells comprising the first cell,
defining a collection time interval comprising the request instant and collecting in real time network event data localized in the second cell by intercepting by a network passive probe current network event data in the collection time interval, wherein the network event data are associated with at least one event described by a signalling message of the same type as the signalling message of the at least one event of the sample network event data stored for the assessment interval,
calculating a number of detected network event data, (Ndet)2, collected in the second cell in the collection time interval, analyzing in the event database the stored network event data that are localized in the second cell and occurred in the assessment time interval and applying said mathematical relationships to the stored network event data for the second cell to determine the numerical traffic prediction datum for the second cell.
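
The structure of a network event datum and the per-cell counting over sampling intervals described in the steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the class name, field names, the use of seconds as the time unit and the in-memory list of events are all hypothetical and are not taken from the disclosure.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class NetworkEvent:
    """One network event datum; field names are hypothetical."""
    message_type: str   # identifies the signalling message
    user_code: str      # identification code of the mobile terminal user
    cell_id: str        # radio cell where the terminal was localized
    timestamp: float    # event time instant, in seconds (assumed unit)


def counts_per_sampling_interval(events, start, end, sampling_seconds):
    """Count events per (cell, sampling-interval index) within [start, end)."""
    counts = Counter()
    for ev in events:
        if start <= ev.timestamp < end:
            slot = int((ev.timestamp - start) // sampling_seconds)
            counts[(ev.cell_id, slot)] += 1
    return counts


events = [
    NetworkEvent("CM Service Request", "u1", "C1", 10.0),
    NetworkEvent("Paging Response", "u2", "C1", 50.0),
    NetworkEvent("Handover Request", "u1", "C2", 130.0),
    NetworkEvent("Location Update Request", "u3", "C1", 125.0),
]
# Two sampling intervals of 2 minutes over a 4-minute assessment interval.
counts = counts_per_sampling_interval(
    events, start=0.0, end=240.0, sampling_seconds=120.0
)
```

Each key of `counts` pairs a cell with a sampling interval, so the values for one cell form the plurality of numbers of sample event data used as the independent variable in the correlation step.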
In some preferred embodiments, generating mathematical relationships based on the mathematical regression function of best approximation comprises:
calculating a first plurality n of threshold values of number of sample network event data, (N1)1, .., (Nn)1, localized in the first cell by applying said regression function of best approximation respectively on each value of the plurality n of numerical traffic values,
calculating a plurality (n-1) of normalization factors fi, i = 1, 2, .. (n-1), based on the ratio between a threshold value (Ni)1 of number of sample network event data for the first cell and the threshold value (Nn)1 of number of sample network event data relating to the maximum numerical traffic value n for the same cell.
Preferably, applying said mathematical relationships to the stored network event data for the second cell to determine the numerical traffic prediction datum in the second cell comprises:
- selecting the maximum number of event data for said second cell as a threshold value of number of detected events (Nn)2 associated with the maximum traffic index n for the second cell, and
applying said plurality of normalization factors fi to the threshold value of number of events (Nn)2 associated with the maximum traffic index n for the second cell by calculating a respective plurality (n-1) of threshold values of number of network event data lower than (Nn)2 to obtain a second plurality of n threshold values associated with the respective plurality n of numerical traffic values,
comparing the number of network event data collected in the second cell (Ndet)2 to the second plurality n of threshold values of number of network event data so as to determine a highest threshold value of number of event data of the second plurality of threshold values (N1)2, .., (Nn)2 for the second cell, which is less than or equal to the number (Ndet)2 of network event data collected and
associating a numerical traffic value corresponding to said highest threshold value of number of event data as a numerical traffic prediction datum for the second cell.
Preferably, the collection time interval comprising the request instant has the same duration as the sampling time interval.
In some embodiments, the normalization factors are percent factors defined by fi = (1 - (Ni)1/(Nn)1) x 100, i = 1, 2, .. (n-1).
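
Written out compactly, the threshold relationships of the preceding paragraphs take the following form. The second line is one natural reading of "applying" the percent factors to the maximum threshold of the second cell, and the symbol \hat{y}_2 for the predicted traffic index is introduced here only for illustration.

```latex
\begin{align}
f_i &= \left(1 - \frac{(N_i)_1}{(N_n)_1}\right) \times 100,
      \qquad i = 1, 2, \dots, n-1 \\
(N_i)_2 &= (N_n)_2 \left(1 - \frac{f_i}{100}\right)
         = (N_n)_2 \, \frac{(N_i)_1}{(N_n)_1} \\
\hat{y}_2 &= \max \left\{\, i : (N_i)_2 \le (N_{det})_2 \,\right\}
\end{align}
```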
In some embodiments, receiving network event data in an event database comprises:
intercepting network event data between the mobile network and mobile terminals in the geographic area comprising the plurality of cells by a network passive probe, and
- transmitting and loading network event data intercepted in the event database.
Preferably, the assessment time interval comprises a plurality of sequential observation intervals and the method comprises, before transmitting and loading network event data in the event database:
- aggregating and coding the network event data intercepted in sequential observation time intervals,
wherein transmitting and loading network event data comprises transmitting and loading the coded network event data in the event database.
Preferably, the assessment time interval is divided in a plurality of identical sampling intervals, each sampling interval being formed by a respective plurality of observation intervals.
In one embodiment, the step of aggregating and coding the sample network event data comprises creating a .csv file for each observation interval, wherein each network event datum in an observation interval is codified as input of the .csv file for that observation interval.
In some embodiments, the method comprises, after collecting in real time network event data and before calculating the number of event data detected in real time in the collection time interval, aggregating and coding detected network event data localized in the second cell in sequential observation intervals corresponding to the observation intervals of the sample event data.
Preferably, the collection time interval is composed of a plurality of sequential observation intervals.
Preferably, calculating the network event data number detected in the collection time interval in the second cell, (Ndet)2, is carried out by calculating the number of current network events located in the second cell for each observation interval constituting the collection time interval and adding the respective numbers for each observation interval in order to determine the number of event data collected in the collection time interval.
Preferably, the number of network event data detected in an observation time interval is the sum of the network event data collected in real-time in respective event time instants (e.g. Timestamp) comprised in said observation interval.
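
The summation over observation intervals described above can be sketched in a few lines. The in-memory layout (one list of `(timestamp, cell_id)` tuples per observation interval) and the function name are hypothetical choices made for this illustration.

```python
def count_in_collection_interval(observation_batches, cell_id):
    """Sum, over the observation intervals, the events localized in cell_id.

    observation_batches: one inner list of (timestamp, cell_id) tuples per
    observation interval (a hypothetical in-memory layout).
    """
    per_interval = [
        sum(1 for _, cell in batch if cell == cell_id)
        for batch in observation_batches
    ]
    return sum(per_interval), per_interval


# Three consecutive observation intervals forming one collection interval.
batches = [
    [(0.4, "C2"), (0.9, "C1")],
    [(1.2, "C2"), (1.5, "C2"), (1.8, "C3")],
    [],
]
n_det_2, per_interval = count_in_collection_interval(batches, "C2")
```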
Preferably, correlating the sample network event data and the traffic sampling data for the first radio cell is carried out by applying a plurality of regression mathematical functions, each being described by a continuous curve and by a respective plurality of numerical parameters, wherein correlating comprises determining the correlation coefficient for each mathematical function of the plurality of mathematical functions to determine a plurality of correlation coefficients, and selecting and storing the best approximation regression mathematical function is carried out by selecting the regression mathematical function in the plurality of the regression functions having the highest correlation coefficient of the plurality of correlation coefficients.
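
A minimal sketch of this selection step is given below, restricted to polynomial candidates fitted by ordinary least squares. It is an illustration under stated assumptions rather than the method of the disclosure: the function names are hypothetical, the data are synthetic and noise-free, the correlation coefficient is taken as the Pearson correlation between observed and fitted values, and no statistical significance test is performed.

```python
import math


def fit_polynomial(xs, ys, degree):
    """Ordinary least squares fit; returns c with y ~ c[0] + c[1]*x + ..."""
    size = degree + 1
    # Normal equations (X^T X) c = X^T y, stored as an augmented matrix.
    rows = [
        [sum(x ** (r + c) for x in xs) for c in range(size)]
        + [sum(y * x ** r for x, y in zip(xs, ys))]
        for r in range(size)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(size):
            if r != col:
                factor = rows[r][col] / rows[col][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [rows[i][size] / rows[i][i] for i in range(size)]


def predict(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** k for k, c in enumerate(coeffs))


def correlation(ys, fitted):
    """Pearson correlation between observed and fitted values."""
    my, mf = sum(ys) / len(ys), sum(fitted) / len(fitted)
    cov = sum((y - my) * (f - mf) for y, f in zip(ys, fitted))
    return cov / math.sqrt(
        sum((y - my) ** 2 for y in ys) * sum((f - mf) ** 2 for f in fitted)
    )


def select_best(xs, ys, degrees):
    """Fit one polynomial per degree and keep the highest correlation."""
    results = {}
    for d in degrees:
        coeffs = fit_polynomial(xs, ys, d)
        results[d] = (correlation(ys, [predict(coeffs, x) for x in xs]), coeffs)
    best = max(results, key=lambda d: results[d][0])
    return best, results


# Synthetic, noise-free illustration data (not taken from the document).
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1 + 0.5 * x - 0.4 * x ** 2 + 0.05 * x ** 3 for x in xs]
best_degree, results = select_best(xs, ys, degrees=[1, 3])
```

Because the synthetic points lie exactly on a cubic, the third-degree candidate is selected here; with real event counts, which can reach several hundred per interval, rescaling the independent variable before forming the normal equations would help numerical stability.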
In some embodiments, selecting sample network event data in the event database is carried out by selecting all stored network event data in the event database localized in the first radio cell and having event instants comprised in the assessment time interval.
In some embodiments, the degree of the polynomial regression mathematical function is equal to or greater than 1.
In some preferred embodiments, the method comprises, before the step of dividing the assessment time interval in sampling intervals and calculating the number of sample network event data for each sampling interval, analyzing and filtering the sample network event data to eliminate the non-significant network events, wherein analyzing and filtering comprises:
(i) extracting network event data from the event database localized in the plurality of cells and relating to at least one event and at time instants comprised in the time interval corresponding to the assessment time interval,
(ii) grouping the extracted network event data that are associated with a same identification code of a user of a mobile terminal so as to create a plurality of groups associated with respective and different user identification codes,
(iii) chronologically ordering the network event data relating to each identification code of a user of the plurality of identification codes of the user,
(iv) based on the chronological order of the event data, establishing a spatial path traced by a target mobile terminal associated with a target identification code in the assessment time interval through at least two of the plurality of radio cells, wherein the spatial path is defined as a sequence of pairs of adjacent and contiguous cells,
(v) determining the displacement velocity from a starting cell Ci to a second arrival cell Cj that is adjacent and contiguous to the starting cell for each pair of adjacent and contiguous cells along the spatial path established for the target mobile terminal, based on a first localization instant of a first event in the cell Ci associated with the target identification code and a second localization instant of a second event in the cell Cj associated with the target identification code,
(vi) determining whether the displacement velocity is less than or equal to or whether it is greater than a displacement velocity threshold value,
(vii) selecting the network event data relating to the target identification code of the user in the assessment interval associated with a displacement velocity that is greater than the velocity threshold value,
(viii) repeating the steps from (iv) to (vii) for the remaining user identification codes of the plurality of user identification codes so as to select sample network event data relating to mobile terminals associated with vehicle movements, and
(ix) transmitting and loading the plurality of selected network event data in the event database, wherein correlating the plurality of numbers of sample event data with the traffic sampling data is carried out by taking the number of filtered network event data resulting from step (ix) as an independent variable for the first cell of the plurality of cells.
Preferably, in the step (vii), the step of selecting comprises eliminating the network event data related to a target identification code for which the spatial path comprises at least one displacement between at least one pair of adjacent and contiguous cells Ci and Cj having a displacement velocity equal to or smaller than the velocity threshold value.
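
Steps (ii) to (viii), together with the preferred elimination rule, can be sketched as below. The tuple layout, the function names, the 300 m hop length and the 3 m/s threshold are hypothetical; the sketch also treats any two consecutive events in different cells as a hop and does not check that the cells are actually adjacent and contiguous.

```python
from collections import defaultdict


def select_vehicle_events(events, hop_velocity, velocity_threshold):
    """Keep events of users whose every cell-to-cell hop exceeds the threshold.

    events: iterable of (user_code, cell_id, timestamp) tuples.
    hop_velocity: callable (cell_a, t_a, cell_b, t_b) -> velocity (hypothetical).
    """
    by_user = defaultdict(list)
    for ev in events:                                  # step (ii): group by user
        by_user[ev[0]].append(ev)

    selected = []
    for user_events in by_user.values():
        user_events.sort(key=lambda ev: ev[2])         # step (iii): time order
        # step (iv): path as consecutive events localized in different cells
        hops = [
            (a, b) for a, b in zip(user_events, user_events[1:]) if a[1] != b[1]
        ]
        if not hops:
            continue                                   # no displacement observed
        velocities = [hop_velocity(a[1], a[2], b[1], b[2]) for a, b in hops]
        if all(v > velocity_threshold for v in velocities):  # steps (vi)-(vii)
            selected.extend(user_events)
    return selected


def toy_velocity(cell_a, t_a, cell_b, t_b):
    """Toy model: every hop spans 300 m (assumed), velocity in m/s."""
    return 300.0 / (t_b - t_a)


events = [
    ("u1", "C1", 0.0), ("u1", "C2", 20.0), ("u1", "C3", 40.0),  # 15 m/s hops
    ("u2", "C1", 0.0), ("u2", "C2", 600.0),                     # 0.5 m/s hop
]
selected = select_vehicle_events(events, toy_velocity, velocity_threshold=3.0)
```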
Preferably, the determination of the displacement velocity between two adjacent and contiguous cells comprises:
receiving data relating to the topology of the mobile network from a topologic data base,
based on the data relating to the network topology, calculating a respective centroid for each cell of the pair of adjacent and contiguous cells Ci and Cj passed through along the established path of a target mobile terminal as the starting cell Ci and the arrival cell Cj, and
calculating the displacement velocity between the centroid of the starting cell Ci and the centroid of the adjacent and contiguous arrival cell Cj as the ratio between the distance between the centroids of the cells of the pair of adjacent and contiguous cells and the difference between the localization time instants of the respective first and second network events generated respectively by the starting cell and the arrival cell.
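
The velocity computation just described reduces to a distance divided by a time difference. In the sketch below, representing each cell by planar vertex coordinates in metres and taking the centroid as the mean of those vertices are simplifying assumptions; the disclosure only states that centroids are derived from the network topology data.

```python
import math


def centroid(vertices):
    """Arithmetic mean of a cell's vertex coordinates (a simplifying assumption)."""
    xs, ys = zip(*vertices)
    return sum(xs) / len(xs), sum(ys) / len(ys)


def displacement_velocity(cell_a, t_a, cell_b, t_b):
    """Distance between centroids divided by the localization time difference."""
    (xa, ya), (xb, yb) = centroid(cell_a), centroid(cell_b)
    return math.hypot(xb - xa, yb - ya) / (t_b - t_a)


# Two square 300 m cells side by side, planar coordinates in metres.
cell_i = [(0, 0), (300, 0), (300, 300), (0, 300)]
cell_j = [(300, 0), (600, 0), (600, 300), (300, 300)]
v = displacement_velocity(cell_i, t_a=0.0, cell_b=cell_j, t_b=20.0)  # in m/s
```

With centroids 300 m apart and localization instants 20 s apart, the sketch gives 15 m/s, i.e. 54 km/h.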
Preferably, determining if the displacement velocity is equal to or greater than or smaller than a displacement velocity threshold value comprises:
establishing a plurality of velocity tags, each tag being associated with a range of displacement velocity values and establishing a velocity tag threshold value, and
determining if the displacement velocity is equal to or greater than or smaller than a velocity threshold value comprises determining if the velocity tag is greater than a velocity tag threshold value.
In some embodiments, detecting and filtering network event data to eliminate the non-significant network events further comprises, after the step (viii) and before the step (ix):
extracting the network event data selected in step (viii) and which are associated with vehicle movements,
associating a localization time tk of the event in the cell to each cell Ck of the plurality of cells of the spatial path of a mobile terminal associated with a target identification code and selecting a target passing-through time ts,
for each cell Ck of the plurality of cells of the spatial path associated with the target identification code, calculating a first passing-through time (tk - t(k-1)) from a starting cell C(k-1) associated with a localization time t(k-1) to an adjacent and contiguous arrival cell Ck associated with a localization time tk and a second passing-through time (tk - t(k-2)) backwards between the cell Ck and the cell C(k-2),
- determining whether at least one of the conditions (tk - t(k-1)) > ts and (tk - t(k-2)) < 2ts is met,
if at least one of the two conditions is met, eliminating from the established path of the mobile terminal the network event datum associated with the cell Ck and to the localization instant tk, and
- repeating the previous steps for the remaining user identification codes of the plurality of identification codes of the user so as to create a plurality of filtered network event data to be transmitted and loaded in the event database.
Preferably, the mobile network is a 2G or 3G radio access network.
Preferably, the signalling messages are one or more messages selected in the group consisting of: CM Service Request, Common ID, Paging Response, Location Update Request, Location Updating Accept, TMSI Reallocation, Location Report, Relocation Command, Handover Request, Handover Complete and Handover Performed.
Preferably, the identification code of the mobile terminal user is the IMSI code being obscured by means of a hash algorithm (O-IMSI).
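
The disclosure does not name the hash algorithm used to obscure the user identification code (the IMSI). As a purely illustrative sketch, a keyed SHA-256 digest (HMAC) from the Python standard library is assumed below; the function name, the key and the sample identifiers are hypothetical.

```python
import hashlib
import hmac


def obscure_imsi(imsi: str, secret_key: bytes) -> str:
    """Return a keyed SHA-256 digest of the IMSI as a hex string (assumed scheme)."""
    return hmac.new(secret_key, imsi.encode("ascii"), hashlib.sha256).hexdigest()


key = b"example-key-not-for-production"           # hypothetical secret
o_imsi_a = obscure_imsi("001010123456789", key)   # made-up identifier
o_imsi_b = obscure_imsi("001010123456780", key)   # made-up identifier
```

The same input always yields the same obscured code, which is what allows events to be grouped per user without storing the identifier in clear.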
Preferably, the passive network probe is installed on a signalling interface of a mobile telecommunication network and configured to intercept network event data passively and the signalling interface is between the mobile network core and at least one radio access station.
Preferably, the connection interface is an A interface of a 2G network or an Iu interface of a 3G network and the mobile network core is a MSC server.
According to some preferred embodiments, the mobile network is a 2G or 3G radio access network and the data identifying the radio cell where the mobile terminal has been localized receiving/transmitting the signalling message comprise the LAC and the SAC.
The present disclosure is related also to a traffic prediction system in a mobile network comprising a plurality of radio cells, the system comprising:
a passive network probe configured to passively intercept network event data located in the plurality of radio cells and occurred in event instants comprised in an assessment time interval, wherein each given network datum comprises identification data of a signalling message exchanged between the mobile network and a mobile terminal, an identification code of the mobile terminal user, identification data of a radio cell where the mobile terminal has been localized receiving/transmitting the signalling message, and an event time instant associated with said signalling message,
a monitoring server configured to receive the network event data from the probe and to process the received data in order to create structured data,
a storage configured to receive and temporarily store structured network event data received from the monitoring server,
a traffic data source containing traffic sampling numerical data,
a database containing the mobile network topology,
- a traffic prediction system comprising:
an event database configured to receive network event data from the storage,
a tool configured to carry out a statistical data processing and extract network event data from the event database during an assessment time interval and related to at least one cell of the plurality of cells of the geographic area and to extract traffic sampling data from the traffic data source during a time interval corresponding to the assessment interval, the statistical processing tool being also configured to analyse the extracted data by calculating a plurality of numbers of network event data related to the assessment time interval and to find a regression mathematical function between the number of network event data and the traffic sampling data and, if at least one statistically valid mathematical function is identified, to store said mathematical function, and
a traffic prediction tool configured to be triggered by an external traffic prediction request associated with at least one cell of the plurality of cells, to receive real-time network event data collected by means of the network probe and stored in the event database, to determine an event number related to the request instant by taking the detected network event data from the event database during a collection time interval comprising the request instant and to apply mathematical relationships, defined by the regression function stored in the statistical processing tool, in order to determine the traffic numerical values related to the request instant and to the at least one cell as the processing final output.
The traffic prediction result can be used by navigation systems to calculate the paths having a better practicability based on current traffic information or by simulators with the purpose of adopting traffic plans, for example to modify the public transportation frequency and route or the dimensions of the critical urban areas having a high congestion level.
According to some embodiments of the present invention, the method comprises a first macro-step in which the statistical models are derived and which is executed off-line and a second macro-step, subsequent to the first one, being executed on line, in which the statistical models are executed on network data being collected in real-time with the purpose of predicting the vehicular traffic situations and/or calculating traffic indicators without a real traffic observation.
Further characteristics and advantages of the invention will be clear from the following detailed description made referring to the invention embodiments given by way of non-limiting example and the attached figures where:
- figure 1 schematically shows a geographic area of interest divided in a plurality of cells.
- figure 2 is a diagram of a mobile network infrastructure provided with a network data detection system, according to the present invention.
- figure 3 is a diagram showing the architecture of the traffic prediction system according to one embodiment of the present invention.
- figure 4 schematically shows the main steps of the traffic flow prediction method according to one embodiment of the present invention.
- figure 5 is a graph showing a linear regression curve compared with an exemplary logistic regression curve.
- figure 6 is an exemplary time-event diagram.
- figure 7 is an exemplary map resulting from the application of the method according to one embodiment.
- figure 8 is a graph reporting a regression function of a polynomial type, which better approximates the points (diamonds) whose coordinates are the real traffic data imported from an external data source and the total event number detected by a network probe.
The traffic flow prediction is carried out in a geographic area covered by a mobile telecommunications network and divided in a plurality of radio cells. In the preferred forms, the geographic area is an urban area, for example corresponding to a medium-sized or large city. Figure 1 schematically shows a geographic area 20 divided in a plurality of radio cells 21. For example, a radio cell covers an area of about 300 meters and the geographic area consists of 2000 cells, being at least two by two adjacent and contiguous. If the mobile network is a 2G one (GSM), a base transceiver station (BTS) is arranged on each cell or, if the network is a 3G one (UMTS), a Node B station is arranged. BTSs or Nodes B communicate, by means of a transceiver antenna, with the mobile terminals, which transmit information, such as voice, data, text (SMS), system signalling, etc.
Figure 2 is a block diagram of a mobile network infrastructure, which can be a 2G, 3G or 4G network. In case of a 2G network, a plurality of BTSs 23, each of them having a covering area defining a cell, are connected to a base station controller (BSC) 24 to concentrate the traffic towards a Mobile Switching Centre (MSC) 25 and to sort the calls towards BTSs. As is known, the MSC manages the control and routing of the calls, the resources allocation and contains information related to a mobile terminal, such as IMSI (International Mobile Subscriber Identity) and IMEI of the users being radio connected to BTSs. Usually, the MSC is connected to a plurality of BSCs 24 (only one BSC is shown in the figure). Generally, the MSC 25 is made of two network logical entities, the MSC Server (MSC-S), managing the signalling, and a circuit switch (Media Gateway), managing user data. BSC and MSC communicate with each other by means of an interface, named interface A, which handles the resource allocation to the mobile terminals and their mobility.
It should be understood that the mobile network could be a 3G one. In this case and as generally known, the Node B stations are connected to a Radio Network Controller (RNC), being connected in turn to a digital switching centre and MSC and RNC communicate by means of an interface Iu.
A passive network probe 26 is physically connected to the MSC-S of the MSC managing an area comprising the geographic area to be monitored. The network probe is configured to monitor real-time signalling messages being generated at the interface A of the MSC, or, in case of a 3G network, at the interface Iu, where the RNC ends the connection. The network probe is configured to passively intercept the signalling messages passing in the mobile network and to collect the signalling messages exchanged between mobile network and mobile terminals. Preferably, the probe is configured to capture all the signalling messages being exchanged between mobile network and mobile terminals. Each intercepted signalling message is associated with a message intercepting time instant, an identification of the mobile terminal user and with an identification of the cell transmitting and/or receiving the messages. In one embodiment, the time instant associated with a captured signalling message is the Timestamp being recorded by the mobile network, i.e. the event recording time at the interface A or Iu.
In the present description and claims, the network data, each of them comprising identification data of a signalling message exchanged between the mobile network and a mobile terminal, the associated time instant, identification data of the radio cell where the mobile terminal is located while using the network (particularly the cell from which the terminal receives or transmits the signalling message) and user identification data are overall indicated as network event data. Preferably, the probe is configured to convert and code the collected network event data by converting them from their local format (provider specific) to a serialized data format, such as xDR (external Data Representation).
The network probe transmits the collected network event data to a monitoring server 27, which is configured to receive data from the probe and to process received data, particularly by assembling and filtering them as described in the following.
In some embodiments, the network probe is configured to trace a plurality of network event data generated by the network and described in the Technical Specifications of 2G and 3G networks, each plurality of network event data comprising:
data identifying a signalling message selected in the group consisting of: a request by a mobile terminal of a service allocation (CM Service Request), for example a call; the allocation procedure for the localization of a mobile terminal (Paging Response); the request to update the localization area of a moving mobile terminal (Location Update Request); the completion of the localization update procedure (Location Updating Accept); the handover procedure when a mobile terminal leaves one cell and enters another cell (HO Request, HO Complete and HO Performed);
the IMSI univocally identifying the SIM and thus the user of the mobile terminal transmitting/receiving the signalling message,
the piece of information temporally identifying a certain network event, namely the recording time instant of the signalling message (Timestamp) and, optionally, the TMSI, namely the identity temporarily assigned by the BSC to each mobile terminal when it is turned on to avoid the IMSI being intercepted fraudulently.
In case the probe monitors both 2G network and 3G network devices, it is preferred that the probe captures the network indicator too (signalling message Channel Descriptor), which disambiguates the events coming from the 2G network from those coming from the 3G network. The network event data are transmitted to the monitoring server 27.
Figure 3 is a diagram showing the architecture of the traffic prediction system according to one embodiment. The network event data captured by the network probe 26 are transmitted to the monitoring server 27, which processes them by compressing, coding and aggregating the network event data collected by the probe. In one embodiment, the server 27 is configured to code the received data as a sequence of .csv (comma-separated values) files, being identified with a file name of the form <StartDate>-<EndDate> <ServerName>.csv, where <StartDate> and <EndDate> specify the starting instant and the final instant defining the network observation time interval. Preferably, the observation time interval is from 1 second to 5 seconds. Each .csv file has as records (inputs) a plurality of coded network event data (e.g. xDR) that happened at a respective time instant comprised in the network observation time interval. The .csv files are sequentially created by the monitoring server and are associated with consecutive observation time intervals.
The monitoring server 27 is configured to filter the network event data that can reveal the mobile terminal user identity in order to make those identification data anonymous. For this purpose, the monitoring server is configured to process the IMSI associated with the mobile terminal that generated/received the event by obscuring it by means of irreversible encryption algorithms, being known per se. The network data being structured in a data table 28, e.g. a .csv file, are transmitted to a traffic prediction system 29 being implemented on a server managing traffic data from mobile networks (not shown in the figure). The traffic prediction system comprises a storage 30, preferably having high capacity and performances. In one embodiment, the storage capacity is 10 TB having 10.000 rpm drives. The storage 30 receives and temporarily stores the received data, for example, the storage is configured in order to keep a .csv file history 6 months long. The transmission mode of the network data from the monitoring server is preferably of the "push" type, that is the server 27 pushes a given data quantity in a continuous way at given time intervals, for example a 1 Mb file every 5 seconds (corresponding to the observation time interval) by means of a secure data transfer network protocol, such as SFTP.
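The description does not name the irreversible algorithm used to obscure the IMSI. The following is a minimal sketch of one possible approach, assuming a keyed HMAC-SHA-256 digest truncated to 16 hexadecimal characters; the function name `obscure_imsi`, the key and the token length are all hypothetical choices made for illustration.

```python
import hashlib
import hmac

# Hypothetical secret held only by the monitoring server (assumption).
SECRET_KEY = b"replace-with-a-long-random-secret"


def obscure_imsi(imsi: str, key: bytes = SECRET_KEY) -> str:
    """Return a fixed-length, non-reversible token standing in for the IMSI."""
    digest = hmac.new(key, imsi.encode("ascii"), hashlib.sha256).hexdigest()
    # Truncation to 16 hex characters is an assumed token length.
    return digest[:16].upper()
```

Because the same IMSI always maps to the same token, events of one terminal can still be grouped downstream without exposing the subscriber identity.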
In one embodiment, the monitoring server outputs a plurality of files in .csv format, wherein each .csv file has a plurality of inputs, each input containing the following network information being captured by the network probe:
Timestamp: the time elapsed since 01/01/1970 in seconds (in order to temporally locate the event);
Event type being collected by the network probe, the event being defined by a respective signalling message, wherein the latter is identified by an event identification code (i.e. the signalling message identification data). The following signalling messages are monitored: CM Service Request, Common ID (the procedure informing the RNC or BSC about the user IMSI), Paging Response, Location Updating Request, Location Updating Accept, TMSI Reallocation Complete (the completion of the procedure used to protect the mobile terminal identity at every cell change), Location Report, Relocation Command, and events related to the handover procedure (HO Request, HO Complete and HO Performed);
LAC (Location Area Code), the code identifying the area covered by the radio cell of the mobile network (e.g. area 21 in figure 1);
SAC or CI: the Service Area Code or the cell identification;
O-IMSI: the IMSI code obscured by means of hash mechanisms;
Channel Descriptor: 2G or 3G channel descriptor.
In one embodiment, the identification code of the signalling message is an integer. For example, "1" identifies the CM Service Request, "2" the Common ID event, "3" Paging Response, "4" Location Update Accept, etc.
For example, the .csv file records (inputs) are structured in the storage 30 in the following way: "Timestamp, event identification code, LAC, SAC (or CI), O-IMSI, Channel Descriptor". For example, a file record is "1297868695,3,51504,6360,3E6AF3303D865D23,1".
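As an illustration of the record layout just described, the sketch below parses one record into named fields. The class name `NetworkEvent`, the field names and the choice of integer types are assumptions made for the example; only the field order comes from the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NetworkEvent:
    timestamp: int   # seconds since 01/01/1970
    event_code: int  # signalling message identification code
    lac: int         # Location Area Code
    cell_id: int     # SAC or CI
    o_imsi: str      # obscured IMSI
    channel: int     # Channel Descriptor (2G or 3G indicator)


def parse_record(line: str) -> NetworkEvent:
    """Parse one comma-separated record; stray spaces around fields are tolerated."""
    fields = [f.strip() for f in line.split(",")]
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}")
    ts, code, lac, cell, o_imsi, channel = fields
    return NetworkEvent(int(ts), int(code), int(lac), int(cell), o_imsi, int(channel))


event = parse_record("1297868695,3,51504,6360,3E6AF3303D865D23,1")
```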
From a network server 33, the traffic prediction system 29 receives also data related to the mobile network topology, namely the division of the monitored geographic area in a plurality of cells and each cell covering area. The data related to the network topology can be made available by the service provider owning the network, for example in a first data table being structured in a .csv file containing the identification of each cell (LAC and SAC) and its spatial geometry. The server 33 is configured to receive that data table and to process data contained therein producing a second data table, for example a shapefile recording the geometrical data (spatial geometries) and the attributes they represent.
In the embodiment of figure 3, the traffic prediction system 29 comprises a database 31 containing the mobile network topology and the network server 33 transmits the topological data 32 to the database 31 that stores and manages them. The traffic prediction system 29 also receives the spatial data defining the geometrical and geographical elements of a geographical map of the monitored geographic area. The spatial data are made available in one or more central repositories 34, external to the prediction system, for example Internet publicly available repositories, such as OpenStreetMap and Google Satellite Maps. The spatial data are imported by the prediction system in a spatial database 35, which stores them.
The data stored in the storage 30 and databases 31 and 35 are inputted to a vehicular traffic prediction software tool 36 being suitable to process the input data and to determine a traffic prediction based on the input data.
The traffic prediction tool 36 comprises a parallel loading tool 37 configured to take the data being structured in .csv files from the storage 30 and to load them en bloc and in real-time in an event database 38. The data transmitted and loaded in the event database, which are related to a pre-set time interval, in the following specified as assessment time interval, and originating from a plurality of radio cells are stored in the event database and constitute the "historical" network event data for analysing the correlations between those data and the traffic indexes, as it will be described with more details in the following.
A filtering tool 39, being configured to analyse and filter the network events that are non-relevant to the vehicular traffic prediction calculation, extracts network event data being structured in the event database 38 and network topological data being structured in the database 31, wherein the network event data are associated with time instants comprised in a pre-set time interval. In one embodiment, the filtering tool 39 extracts the network event data from a .csv file being related to a pre-set time interval corresponding to the assessment time interval of the network event data stored for the plurality of cells.
The assessment time interval is formed by a sequence of network observation time intervals, corresponding to the observation intervals where the output data from the monitoring server 27 are coded.
The filtering tool 39, as well as the other tools included in the traffic prediction system 29 are implemented by software modules in order to be executed by a processor.
For that purpose and according to one embodiment, a first filtering mathematical algorithm, implemented in the filtering tool 39, takes as input data the network topology data from the database 31 and the network event data stored in the event database 38 and it is configured to detect mobility modes that are not consistent with a movement on a user vehicle, for example mobile users that likely moved or are moving on foot or by bicycle. The network event data comprise O-IMSI, LAC, SAC (or CI), the identification code of the signalling message captured by the probe (e.g. "1", "2", etc.) and the time instant associated with each event (e.g. Timestamp). The first filtering mathematical algorithm carries out a procedure comprising receiving a plurality of network event data associated with time instants being coded as a file sequence associated with sequential observation time intervals and being comprised in an assessment time interval and eliminating from the plurality of network event data the data being associated with a mobile terminal whose spatial path through a plurality of adjacent and contiguous cells presents at least one displacement velocity from a cell to an adjacent and contiguous cell being lower than a pre-set threshold velocity value. 
Particularly and according to one embodiment, the first filtering algorithm carries out a procedure comprising: extracting a plurality of network event data from the event database 38 related to an assessment time interval, grouping network event data of the plurality of event data being associated with a same O-IMSI, in order to create a plurality of groups related to respective and different mobile terminals and temporally ordering the network event data of each group associated with an O-IMSI by establishing a spatial path being followed by each mobile terminal associated with a respective O-IMSI through a plurality of cells comprised in the plurality of cells forming the geographic area (during the reference time interval, that is the assessment interval). In the 2G and 3G networks, each cell of the plurality of cells is identified by respective LAC and SAC (or CI) and the spatial path of each O-IMSI is represented by a sequence of identifications SAC (or CI) ordered in an increasing time order. Preferably, the plurality of network event data extracted from the event database comprise all the network event data loaded in the event database during the assessment time interval. Subsequently, the procedure carried out by the first filtering algorithm comprises extracting from the network topology database the topological data referred to the cells crossed by each O-IMSI, cross-referencing the event data for each O-IMSI to the respective network topology data (being both identified by the same SAC/CI) in order to map the network event data to the network cells that generated them, and calculating a respective centroid for each cell of the plurality of cells crossed along the established path of each O-IMSI by means of mathematical functions being known per se.
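The description leaves the centroid calculation to known mathematical functions. One common option, shown here only as a sketch, is the area-weighted centroid of a simple polygon computed with the shoelace formula. It assumes each cell geometry is available as a list of vertices in planar (projected) coordinates; the function name `polygon_centroid` is hypothetical.

```python
def polygon_centroid(vertices):
    """Centroid of a simple polygon given as [(x, y), ...] in planar coordinates."""
    area2 = 0.0  # twice the signed area of the polygon
    cx = cy = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]  # wrap around to close the polygon
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if area2 == 0:
        raise ValueError("degenerate polygon: zero area")
    # Centroid = sum / (6 * area) and area = area2 / 2, hence the factor 3.
    return cx / (3 * area2), cy / (3 * area2)
```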
Then, the procedure carried out by the first filtering algorithm comprises: calculating the displacement velocity between the centroid of a cell Ci and the centroid of an adjacent and contiguous cell Cj for each pair <Ci, Cj> along the path, during a time interval defined by the difference between the time instant being associated with the event generated in cell Ci and the time instant associated with the event in cell Cj, and associating to the calculated velocity a velocity tag Tij. If the velocity tag Tij is greater than a pre-set threshold velocity tag value, it is assumed that the mobile terminal user is moving using a (motor) vehicle. The network event data associated with an O-IMSI whose path during the assessment time interval is associated with at least one velocity tag equal to or smaller than the threshold value for the movement between two adjacent and contiguous cells are eliminated from the plurality of network event data being received as input in the algorithm.
The procedure for time ordering the network event data is carried out, for example, by ordering in an ascending way the Timestamps associated with each event for a certain mobile terminal (O-IMSI). The time ordering can be carried out by means of known mathematical functions of hierarchical ordering, by first ordering on the basis of the O-IMSI and then on the basis of the Timestamp. The velocity of the mobile terminal identified by an O-IMSI is calculated by the ratio (distance between cell Ci centroid and cell Cj centroid)/(difference between the time instant being associated with LAC of cell Cj and the time instant being associated with LAC of cell Ci).
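The ordering and velocity steps above can be sketched as follows. The example assumes events are available as `(o_imsi, timestamp, cell_id)` tuples and that cell centroids are given in metres, so the result is in metres per second; these representations and the function names are illustrative assumptions. Consecutive events in the same cell are skipped, in line with the embodiment in which movements inside one cell are discarded.

```python
from collections import defaultdict
from math import hypot


def build_paths(events):
    """Group (o_imsi, timestamp, cell_id) tuples per terminal, ordered by time."""
    paths = defaultdict(list)
    # Hierarchical ordering: first by O-IMSI, then by Timestamp.
    for o_imsi, ts, cell in sorted(events, key=lambda e: (e[0], e[1])):
        # Skip an event recorded in the same cell as the previous one.
        if paths[o_imsi] and paths[o_imsi][-1][1] == cell:
            continue
        paths[o_imsi].append((ts, cell))
    return dict(paths)


def hop_velocities(path, centroids):
    """Velocity for each consecutive cell pair <Ci, Cj> along one path."""
    speeds = []
    for (t_i, c_i), (t_j, c_j) in zip(path, path[1:]):
        dt = t_j - t_i
        if dt <= 0:
            continue  # simultaneous events give no usable velocity
        (xi, yi), (xj, yj) = centroids[c_i], centroids[c_j]
        speeds.append(hypot(xj - xi, yj - yi) / dt)
    return speeds
```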
In one embodiment, the first filtering algorithm automatically eliminates a network event being associated with a target O-IMSI and to a first event instant if a network event, being associated with the same O-IMSI and to a second time instant immediately preceding the first temporal instant, is located in the same radio cell. Particularly, for a given O-IMSI, the first filtering algorithm eliminates a given network datum indicating a movement with respect to a given network datum being intercepted immediately before inside the same cell, i.e. those subsequent network event data comprise the same LAC. The movements inside the same cell are eliminated because the displacement velocity is automatically determined being zero, thus smaller than the threshold velocity value. For example, the first filtering mathematical algorithm uses the following classification for the velocity tags associated with a respective velocity interval calculated for a user, being identified by the O-IMSI, who moves between cell Ci and cell Cj, assuming a threshold velocity tag value equal to 15:
if Tij < 3, the user is stationary,
if 3 < Tij < 6, the user is moving on foot,
if 6 < Tij < 15, the user is moving by bicycle, and
if Tij > 15, the user is moving by a motor vehicle.
The filtering tool selects the O-IMSI having Tij > 15 for each movement between adjacent and contiguous cells.
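A minimal sketch of this classification and selection is given below. The listed ranges leave the boundary values 3, 6 and 15 unassigned; the sketch assumes they fall into the slower class, which agrees with the statement that tags equal to or smaller than the threshold lead to elimination. The function names and the dictionary input are hypothetical.

```python
def mobility_mode(tag):
    """Map a velocity tag Tij to a mobility mode (boundary handling is assumed)."""
    if tag > 15:
        return "motor vehicle"
    if tag > 6:
        return "bicycle"
    if tag > 3:
        return "foot"
    return "stationary"


def keep_vehicle_terminals(tags_by_terminal, threshold=15):
    """Return the O-IMSIs whose every cell-to-cell tag exceeds the threshold."""
    return {
        o_imsi
        for o_imsi, tags in tags_by_terminal.items()
        if tags and all(t > threshold for t in tags)
    }
```

A terminal with no cell-to-cell movement at all has no tags and is therefore not selected.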
The filtering tool 39 is configured to input the output data of the first filtering algorithm to a second filtering algorithm in order to detect and filter network anomalies, such as a too fast movement between cells and/or the ping-pong effect in the sequence of cells crossed by a mobile terminal. The ping-pong effect takes place when a mobile terminal carries out the handover between two base stations, namely between two cells Ck-1 and Ck, passing from Ck-1 to Ck and back to Ck-1 within a pre-set time interval, indicated in the following as target time interval. The second algorithm verifies, for each adjacent and contiguous cells pair <Ck-1, Ck> crossed by a mobile terminal, if the localization updates of the mobile device are compatible with the target time interval for the crossing from a cell Ck-1 to an adjacent and contiguous cell Ck and back from cell Ck to cell Ck-1, for example, it verifies that they are not too quick.
The second filtering algorithm executes the following procedure: receiving as input data a plurality of network event data being associated with time instants comprised in the assessment time interval, grouping network event data of the plurality of event data in groups being associated with the same mobile terminal identified by a respective O-IMSI, temporally ordering the network event data of each group associated with each O-IMSI and establishing a spatial path being followed by each mobile terminal associated with a respective O-IMSI through a plurality of cells, the path being defined by a sequence of radio cells crossed by a mobile terminal and each cell being associated with an event localization time for a specific cell, <Ck, tk>. In one embodiment, the localization time associated with the event recording is the Timestamp. The second filtering algorithm verifies that the crossing time of a mobile terminal along the spatial path from each cell Ck-1 to an adjacent and contiguous cell Ck, (tk − tk-1), and that the back crossing time between cells Ck and Ck-1, (tk − tk-2), where tk-2 is the time at two previous instants with respect to the instant tk, are compatible with a pre-set target crossing time, ts. In one embodiment, the second algorithm determines if at least one of the following conditions is verified: (tk − tk-1) > ts and (tk − tk-2) < 2ts. If at least one of the two conditions is verified, the event associated with <Ck, tk> is eliminated. The filtering tool iteratively carries out the second filtering algorithm on all the mobile terminals being identified by a respective O-IMSI, i.e. on all the network event data groups related to the respective O-IMSI.
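The ping-pong part of this check can be sketched as below. The sketch covers only the return to a cell visited two events earlier within twice the target crossing time ts; the separate crossing-time condition is left out because its comparison direction is not unambiguous in the text. Comparing against the events already kept, rather than against the raw sequence, is a design assumption, and the function name `drop_ping_pong` is hypothetical.

```python
def drop_ping_pong(path, ts):
    """Remove ping-pong returns from one terminal's path.

    path: time-ordered list of (timestamp, cell_id) pairs <Ck, tk>.
    ts:   pre-set target crossing time, in the same unit as the timestamps.
    """
    kept = []
    for t_k, c_k in path:
        if len(kept) >= 2:
            t_prev2, c_prev2 = kept[-2]
            # Back in the cell seen two events ago, sooner than 2 * ts.
            if c_prev2 == c_k and (t_k - t_prev2) < 2 * ts:
                continue  # eliminate the event <Ck, tk>
        kept.append((t_k, c_k))
    return kept
```

Running the function once per O-IMSI group reproduces the iteration over all mobile terminals described above.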
It should be noted that in the present embodiment, the second filtering algorithm receives the network event data being filtered by the first filtering algorithm, which eliminated the network event data related to movements of a mobile terminal inside the same cell. Therefore, the input data of the second algorithm for a specific mobile terminal associated with consecutive time instants are network event data related to different radio cells.
It should be understood that the filtering tool could be configured to execute the second filtering algorithm before the first filtering algorithm. In that embodiment, the first filtering algorithm receives the network event data being filtered by the second filtering algorithm.
The filtering tool 39 transmits the data it has processed to eliminate non-relevant network events and loads them in the target database 38. According to one embodiment, the target database is a relational database comprising two tables implementing the respective relationships of non-filtered data, being transmitted and loaded by the parallel loading tool 37, and of filtered data being transmitted and loaded by the filtering tool 39.
A statistical data processing tool 40 receives the structured and filtered network data from the target database 38 and real traffic sampling data being observed by existing traffic systems.
The traffic sampling data are "historical" traffic data coming from at least one traffic information source and being collected in one or more repositories 43 of information publicly available on the Internet, such as InfoTraffic, WebCam, Google Traffic, Waze or from a manual traffic sampling in pre-set points of the geographic area. The statistical processing tool 40 is configured to extract the traffic data contained in the repository 43 in a time interval corresponding to the assessment time interval and related to the plurality of cells of the geographic area and to extract the stored network event data for the assessment time interval. The statistical processing tool is further configured to analyse the received data and to find a mathematical relationship between the historical network events being structured and stored in the database 38 during an assessment time frame and the traffic sampling data extracted from the repository 43 being associated with the same time frame. If at least one statistically significant mathematical relationship is identified, it is stored in the statistical data processing tool 40. Preferably, the at least one mathematical relationship is a regression mathematical function being applied to the network event data, as described in more detail in the following. The stored regression mathematical functions are used by a traffic prediction tool 41 during a second on-line phase of the method according to the present disclosure. The traffic prediction tool 41 is activated by external requests (for example navigation systems or simulators generating requests automatically) for the real-time calculation of the vehicular traffic indexes. In one embodiment, an external request to the traffic prediction system is activated by a mobile terminal user through an application interface.
When the traffic prediction tool 41 is triggered by an external request, it is configured to receive real-time network event data collected by means of the network probe 26, to determine the event number related to the request instant by taking the detected network event data from the database 38 in a collection time interval comprising the request instant and to apply mathematical relationships, being defined by the regression function stored in the statistical processing tool 40, in order to determine the traffic numerical values related to the request instant and to the radio cells associated with the request, which form the prediction geographic area as the processing final output.
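As a rough illustration of the on-line step, the sketch below applies a stored polynomial regression function to the number of events counted for one cell. Representing the stored function as a coefficient list `[a0, a1, ...]`, and rounding and clamping the result to the integer traffic levels 1 to 4 used elsewhere in the description, are assumptions made for the example.

```python
def predict_traffic_index(event_count, coeffs, levels=(1, 4)):
    """Evaluate a stored polynomial a0 + a1*z + a2*z^2 + ... at the event count z.

    The result is rounded and clamped to the range of traffic levels
    (an assumed post-processing step, not stated in the description).
    """
    value = sum(c * event_count ** i for i, c in enumerate(coeffs))
    lowest, highest = levels
    return min(highest, max(lowest, round(value)))
```

In a fuller implementation one coefficient list would be stored per radio cell and looked up by the cell identification (LAC and SAC or CI) associated with the request.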
The statistical processing tool can be configured also to create three-layer maps grouping the calculated traffic data for that request instant, the mobile network topology contained in the database 31 and the geographical map of the geographic area monitored by the spatial data contained in the database 35 in a single map. The above-described tools (39, 40 and 41) are software modules implemented in the usual way, for example using programming languages such as Java, C or scripting languages to receive, transmit and process data contained in the databases.
Figure 4 schematically shows the main steps of the traffic flow prediction method according to one embodiment of the present invention. The method comprises a first plurality of steps 51 being carried out off-line, before monitoring the real traffic conditions. The first plurality of steps 51 comprises: receiving "historical" network event data 49a that have been collected through network probes and stored in an event database (e.g. the event database 38 in the system of figure 3) related to a plurality of cells forming the geographic area of interest; receiving network topology data 49b (e.g. from database 31); creating a data structure based on historical events data and network topology data in order to map the network event data to the network cells generating them; filtering the sample event data by eliminating the non-relevant network event data (e.g. by means of the filtering tool 39); selecting sample network event data from the stored network event data; correlating (e.g. by means of the data statistical processing tool 40 in the system of figure 3) the sample event data to the traffic sampling data 50 coming from at least one traffic information external source, and applying at least one regression mathematical function being described by a continuous curve and by a plurality of numerical parameters to the number of sample network event data as independent variable and to the traffic sampling data as dependent variable finding the plurality of numerical parameters of the regression function which best approximates the curve of discrete input values in said regression function. In figure 4, only the correlate step is shown in the plurality of steps 51. The step of applying at least one regression function is executed in at least one cell of the plurality of cells and preferably in a sub-plurality of the plurality of cells.
Preferably, a plurality of regression functions is applied to the sample event data and to the traffic sampling data and the method comprises selecting a regression function of the plurality of regression functions that best approximates the curve being formed by the discrete values of the independent and dependent variables. The selected regression function is stored in order to be used in a subsequent step for the real traffic prediction in the radio cell. The sample network event data and the traffic sampling data are related to the same time frame, indicated as assessment time interval, starting from a starting instant, for example two hours, twenty four hours or one week starting from a specific time instant, and related to the same radio cell. The data sample is selected in order to be statistically relevant, generally comprising at least one hundred detections correlated to respective traffic sampling data and network event data.
In the embodiment of figure 4, after determining the regression mathematical function, the method comprises a second plurality of steps carried out on-line for the real-time traffic prediction in a geographic area of interest comprising a plurality of cells. Particularly, for each cell, the stored historical network event data are analysed to determine a plurality of relationships or mathematical models between network event data stored in the assessment time interval and traffic numerical values starting from the application of the regression function being determined in the off-line step for the at least one first cell and those mathematical relationships are applied to the real-time network event data collected during a collection time interval comprising the request instant for each cell of the plurality of cells forming the geographic area of interest (step 47) in order to determine the corresponding traffic prediction numeric datum (step 48). In the diagram of figure 4, a dotted line 44 shows the division between the first plurality of off-line steps and the second plurality of on-line steps.
According to some embodiments, the method selects the network event data related to an event type, namely to a specific signalling message, for example Location Update Request. In this case, the independent variable used to determine the regression function in the off-line step is the number of "historical" network event data related to the specific signalling message and, accordingly, the independent variable in the on-line phase is the number of real-time event data captured by the probe at a detection instant or during a time interval related to the selected event, always related to the same event, i.e. Location Update Request.
In some preferred embodiments, the independent variable, both to determine the regression function and to apply the same, following traffic prediction requests, is the total number of network event data captured by the probe. For example, if the probe is configured to trace the signalling messages CM Service Request, Common ID, Paging Response, Location Updating Accept, TMSI Reallocation Complete, Location Report, Relocation Command, HO Request, HO Complete and HO Performed, the total number of network event data captured by the probe is the sum of all event data described by those signalling messages captured in an observation time interval. In one embodiment, the independent variable to determine off-line the regression function is the total number of network event data related to a plurality (i.e. at least two) of events, namely the network event data used to calculate the independent variable are related to at least two different signalling messages.
The current network event data being intercepted by the network probe during a collection time interval are aggregated and coded in sequential observation intervals by the monitoring server 27 in a way similar to the "historical" network event data during the off-line step. Particularly, the current event data, structured in a data table, are transmitted by the server 27 to the storage 30. The parallel loading tool 37 is configured to take the structured data and load them en bloc and in real-time in the event database 38. Preferably, the observation intervals dividing the collection interval all have the same duration, for example 2 seconds, and they are selected to be equal to the ones of the network event data being detected and stored in the off-line step. Therefore, the collection time interval is divided in observation intervals and the calculation of the number of detected network event data is carried out by calculating the number of current network event data for each observation interval constituting the collection time interval.
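The per-interval counting can be sketched as follows, assuming event Timestamps in whole seconds and a collection interval that starts at a known instant. The function name and its parameters are hypothetical; the 2-second default mirrors the example duration mentioned above.

```python
def events_per_observation_interval(timestamps, start, collection_s, interval_s=2):
    """Count events in each observation interval of a collection interval.

    timestamps:   event Timestamps in seconds
    start:        first second of the collection interval
    collection_s: length of the collection interval in seconds
    interval_s:   length of each observation interval in seconds
    """
    n_intervals = collection_s // interval_s
    counts = [0] * n_intervals
    for ts in timestamps:
        idx = (ts - start) // interval_s
        if 0 <= idx < n_intervals:  # ignore events outside the collection interval
            counts[idx] += 1
    return counts
```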
Preferably, the current network event data are taken by the filtering tool 39 in order to filter away the non-relevant network events for the vehicular traffic prediction calculation in a way similar to the one previously described for the off-line step. Particularly, a first and second filtering algorithm are applied to the current network event data and the filtered event data are inputted to the event database 38.
In one embodiment, the filtering tool sequentially extracts all the network event data related to a plurality of consecutive observation intervals forming the collection time interval, so as to continuously filter the network event data being loaded in the event database 38.
The real-time detected network event data are inputted into the traffic prediction tool 41 in order to calculate the traffic indexes in the cells forming the geographic area of interest during the collection interval, as explained in more details in the following. The Applicant observed that the accuracy of the traffic condition prediction depends on the correlation coefficient of the correlation mathematical function being identified during the first off-line step of the invention. In some preferred embodiments, the correlation function is a polynomial regression function having degree three calculated by means of the least squares technique.
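A degree-three least squares fit can be sketched without external libraries by building and solving the normal equations. This is an illustrative implementation under the assumption of a single independent variable; the function names are hypothetical and the description does not state which software performs the fit.

```python
def polyfit_least_squares(xs, ys, degree=3):
    """Return [a0, a1, ..., an] minimising the sum of squared residuals."""
    n = degree + 1
    # Normal equations A a = b with A[r][c] = sum(x^(r+c)), b[r] = sum(y * x^r).
    a = [[sum(x ** (r + c) for x in xs) for c in range(n)] for r in range(n)]
    b = [sum(y * x ** r for x, y in zip(xs, ys)) for r in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("not enough distinct x values for this degree")
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(a[r][c] * coeffs[c] for c in range(r + 1, n))
        coeffs[r] = (b[r] - tail) / a[r][r]
    return coeffs


def evaluate(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))
```

Here `xs` would hold the event counts per sampling interval and `ys` the corresponding traffic indexes for one cell. For large event counts the normal equations become ill-conditioned, so rescaling the counts before fitting may be advisable.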
Table 1 reports an example of sample event data coming from the mobile network (i.e., number of signalling messages z1 and z2) and of input real traffic sampling data, i.e. Yh (off-line step 51 in figure 4), for a specific cell.
The assessment time interval is divided in sampling intervals Δti, i = 1, 2, 3, ..., and the event number reported in Table 1 is calculated on one sampling interval. The sampling time intervals all have the same duration, equal to 2 minutes.
Table 1
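[Table 1 is provided as an image in the published document (imgf000028_0001). Columns: Yh (traffic indexes), z1 (Location Update Request N.), z2 (Handover N.).]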
In Table 1, the column <Yh: Traffic indexes> lists all the traffic indexes that have been observed in a specific area corresponding to the cell by analysing the traffic data coming from external sources. Every index is associated with a specific sampling time interval Δti, i = 1, 2, 3, ..., and to a specific elementary area of the mobile network of the geographic area, i.e. to a cell. The sampling time intervals Δti are sequential and their sum defines the assessment time interval. In one embodiment, each sampling time interval is formed by a sequence of observation time intervals all having the same duration, for example 5 seconds.
In some embodiments, the sampling intervals are from 1 minute to 5 minutes. The indexes Yh group the traffic in four levels having integer consecutive index numbers: 1 = moving flow, 2 = high flow, 3 = congested flow, 4 = impossible flow, according to the standard definitions often used when describing the traffic condition. The column <z1: Location Update Request N.> lists the number of events Location Update Request being recorded by the probe. According to the 2G and 3G network protocols, the events z1 are messages transmitted by a mobile terminal when it received the service allocation. Finally, the column <z2: Handover N.> lists the number of event data of the handover type, i.e., the events where the mobile terminal, during a service session (call, data sending/receiving, etc.), is redirected from its actual cell to a new cell. In the current example, the handover events include Handover Request, Handover Complete and Handover Performed (therefore z2 is the sum of events related to three event types). Those events z1 and z2 are recorded by the network probe as events of the reference cell and during the time interval Δti corresponding to the one related to the real traffic index reported in the same table row and received as input data from an external data source.
The sample event data of Table 1 are data being filtered by the first and second filtering algorithm in the ways described above.
From the data reported in Table 1, it is noted that a linear relationship holds between the dependent variable Yh (traffic data) and the independent variable z1: when the real traffic index increases, the number of Location Update Requests increases. It is also noted that the relationship between the variables Yh and z2 seems to be linear too. A linear multivariable relationship exists between Yh and <z1, z2> as well.
Specifically, a linear single variable regression function, Y=f(z), as a function of the input data can be expressed by the following equation: f(z) = m·z + b, (1) where f(z) is the dependent variable Y as a function of the independent variable z, b is a constant and m is the slope coefficient calculated by:
$$ m = \frac{\sum (z - \bar{z})(y - \bar{y})}{\sum (z - \bar{z})^{2}} \qquad (2) $$
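As an illustration of Eq. (1) and Eq. (2), the following minimal sketch computes the slope m and the constant b for a single independent variable. The function name `fit_line` and the sample values are hypothetical and are not taken from Table 1; the constant is obtained from the usual least-squares relation b = ȳ − m·z̄.

```python
from statistics import mean

def fit_line(z, y):
    """Return (m, b) of f(z) = m*z + b, with the slope m computed as in Eq. (2)."""
    z_bar, y_bar = mean(z), mean(y)
    numerator = sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y))
    denominator = sum((zi - z_bar) ** 2 for zi in z)
    m = numerator / denominator
    b = y_bar - m * z_bar  # least-squares constant term
    return m, b

# Hypothetical event numbers per sampling interval and observed traffic indexes.
events = [10, 20, 30, 40]
indexes = [1, 2, 3, 4]
m, b = fit_line(events, indexes)
```

With these illustrative values the fitted line passes through every point, so the slope is 0.1 and the constant is 0.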
A linear multivariable regression function can be expressed by the following equation: f(z) = m1·z1 + m2·z2 + m3·z3 + ... + mn·zn + b (3) where b is a constant and mi, i=1, 2, ..., n, are the slope coefficients of the independent variables zi. The constant and the slope coefficients represent the numerical parameters of the regression function.
Generally, a polynomial regression function is defined by its degree n (n=1 for a linear function) and by a plurality of numerical parameters.
In the example reported in Table 1, the slope coefficient and the constant are the numerical parameters for which the regression curve gets as close as possible to the points of the plane (Yh, z) in the case of single-variable regression and to the set of points (Yh, z1, z2, ...) in the case of multivariable regression. In a known way, the least squares technique is applied in order to determine the numerical parameters of the regression function by using known software tools included, for example, in the statistical data processing tool 40 in the embodiment of figure 3.
The estimation accuracy of the numerical parameters of the regression function and dependent variable is given by the following standard errors:
- Seb: standard error for the constant b;
- Sey: standard error for Y;
- Sem: standard error for the slope coefficient m;
- R2: correlation coefficient, describing the correlation level between Y and z. There is a perfect correlation (i.e. in a linear regression, all the Y points are on the line described by f(z)) when R2=1, while there is a total lack of correlation when R2=0;
- Df: degrees of freedom, describing the minimum number of data enough to evaluate the quantity of information contained in the statistic;
- SSreg: sum of the regression squares, i.e. Σ(ŷ − ȳ)², where ŷ are the estimated values, and
- SSresid: sum of the squared residuals of the regression (residual deviation); if [expression shown as an image in the original publication], there is a perfect linear relationship.
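The quantities listed above can be sketched for a single-variable linear fit as follows. This is only an illustration under stated assumptions: the data are hypothetical, m and b are assumed to be the least-squares parameters for those data, the residual degrees of freedom are taken as the number of samples minus two, and R2 is computed as SSreg/(SSreg + SSresid).

```python
from statistics import mean

def fit_statistics(z, y, m, b):
    """Goodness-of-fit figures for f(z) = m*z + b (m, b assumed least-squares)."""
    y_bar = mean(y)
    y_hat = [m * zi + b for zi in z]                      # estimated values
    ss_reg = sum((yh - y_bar) ** 2 for yh in y_hat)       # regression sum of squares
    ss_resid = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # residual sum of squares
    df = len(y) - 2                                       # one slope plus one constant
    return {
        "ss_reg": ss_reg,
        "ss_resid": ss_resid,
        "df": df,
        "r2": ss_reg / (ss_reg + ss_resid),
        "se_y": (ss_resid / df) ** 0.5,                   # standard error for Y
        "f_value": ss_reg / (ss_resid / df),              # F statistic, one regressor
    }

# Hypothetical samples; m = 0.7 and b = 0.3 are their least-squares parameters.
stats = fit_statistics([1, 2, 3, 4, 5], [1, 2, 2, 3, 4], m=0.7, b=0.3)
```

The resulting F value would then be compared with the critical value for the chosen significance level and degrees of freedom, which has to be looked up separately.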
Preferably, in the case of linear regression, an F test and/or a T test is applied to the collected data Y and z in order to verify whether the relationship between Y and z is random, namely whether the errors follow a normal distribution. For example, with a value p=0.05 and Df=6, the F critical value is 4.53. If the F value is greater than the critical value, the correlation is not random. In a similar way, if the T value is greater than a critical value, the correlation is not random.
A regression function f(z) is generated starting from data Yh and z (or z1, z2, z3, ... in the case of a multivariable regression function) collected in the same pre-set assessment time interval (for example, a week) for each cell. In this way, several "significant" statistical models can be found, and every statistical model is associated with a regression function describing the correlation between the number of events generated/exchanged by the mobile network and the real traffic indexes collected from external data sources. A plurality of regression functions is selected (e.g. by means of the statistical data processing tool 40), each regression function being a mathematical function described by a continuous curve and by a plurality of unknown numerical parameters. The plurality of regression functions is applied to the input data, i.e. sample event data and traffic sampling data, and the regression function that best describes the correlation between the network events and the traffic indexes is selected, preferably by selecting the regression function having the highest correlation coefficient R2.
The Applicant noticed that, since the traffic flow can be defined by a small plurality of values (e.g. the traffic indexes of Table 1 and figure 6), in some embodiments, the regression function between the network event data and the traffic data can be described by a generalised linear model. A generalised linear model is understood here to mean a linear function or a polynomial function having a degree greater than one.
A linear regression function, or more generally a polynomial regression function, generally requires that the errors distribution is normal in order to apply the least squares procedures. In other words, it is preferable that the above-described F or T tests are verified.
In other embodiments, the correlation between the sample event data and the real traffic sampling data can be described by a single variable or multivariable logistic regression function. A logistic regression does not assume a linear relationship between independent variables and dependent variables and requires neither that the regression error distribution be normal nor that the error variance be constant (homoscedasticity). A logistic regression is often used to determine a dependent variable Y=f(z) of a dichotomous type, namely taking two values α and β only. In this case, f(z) is the probability that Y is equal to α and (1-f(z)) is the probability that Y=β. The Applicant understood that a model using a logistic regression function could estimate the probability of whether a path has a congested traffic flow or not. For example, non-congested flow = 0 and congested flow = 1.
Figure 5 is a graph showing a linear regression curve 61 compared to an exemplary logistic regression curve 62. In the example, the logistic regression curve is described by the following function:
[Eq. (4), the logistic function of the variable z, is shown as an image in the original publication] (4)
where the variable z is a linear function of the explanatory variable x, z = b0 + b1·x. The logistic curve of Eq. (4) is particularly adapted to model a dichotomous dependent variable because it better approximates the points Y=0 and Y=1 on the Y-axis. On the contrary, the linear regression function 61 can predict values that are greater than 1 or smaller than zero, or generally values in between 0 and 1.
The independent variable z of the logistic regression function has a linear relationship with the logit of the dependent variable:
$$ \operatorname{logit}(p) = \log\frac{p}{1-p} \qquad (5) $$
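The relationship between the logistic curve and the logit can be sketched as follows. This is a hedged illustration that assumes Eq. (4) is the standard logistic function 1/(1 + e^(-z)); the function names and the coefficients b0 and b1 used here are hypothetical.

```python
import math

def logistic(z):
    """Standard logistic function (assumed form of Eq. (4)); result lies in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def logit(p):
    """Logit of a probability p, the inverse of the logistic function."""
    return math.log(p / (1.0 - p))

def congestion_probability(x, b0, b1):
    """Probability of congested flow for explanatory variable x, with z = b0 + b1*x."""
    return logistic(b0 + b1 * x)

# Hypothetical coefficients: the probability grows with the event number x.
p_low = congestion_probability(50, b0=-6.0, b1=0.03)
p_high = congestion_probability(300, b0=-6.0, b1=0.03)
```

Because the logit is the inverse of the logistic function, `logit(congestion_probability(x, b0, b1))` returns the linear term b0 + b1·x, up to floating-point rounding.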
The sample events z collected for a cell can be expressed as a function of the sampling time interval, Δti, in a time-event diagram. Figure 6 shows an example of a time-event diagram, reporting the event number, in this case all the events that the probe is able to monitor, as a function of time t. A plurality of event number threshold values, N1, N2, N3 and N4, have been defined. Each event number threshold value is associated with the transition between one traffic level and the next. The traffic levels are taken from external sources of traffic sampling data and represent four traffic conditions, each denoted by a respective numerical value used as an index. Up to the threshold value N1, the traffic level is considered to be moving and is shown with index 1; between threshold values N1 and N2, the vehicle flow is considered to be high and is shown with index 2; and between threshold values N2 and N3, the vehicle flow is considered to be congested and is shown with index 3. Beyond threshold value N3, the vehicle flow is defined by level 4, "impossible".
In one embodiment, in the off-line step and after determining a regression function that better describes the correlation between historical real traffic data and network event data for a first radio cell, the method comprises generating a time-event diagram for said first cell, that describes the event number, i.e. signalling messages, recorded for the first cell as a function of the event recording time obtained by the analysis of the "historical" network events being stored with reference to the first cell. Basically, the assessment time interval of the historical data is divided in a plurality of sampling time intervals and the event number in the diagram is a y-axis point referring to a specific sampling time interval.
By applying the regression function determined during the off-line step for that cell (therefore having known numerical parameters) to the first radio cell, a plurality n of event number threshold values, e.g. N1, N2, N3, ..., Nn, corresponding to respective n traffic values, e.g. numerical values 1, 2, ..., n (in the example of figure 6, n=4), is calculated. Then, by taking the event number threshold values N1, N2, N3, etc., calculated for the first cell as input data, a respective percentage normalization factor fi, i=1, ..., (n-1), based on the ratio between the event number threshold value of the traffic level i and that of the maximum traffic level n, fi=(1-Ni/Nn)×100, is calculated. The normalization factors are stored in the traffic prediction tool 41.
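The normalization step can be illustrated with a minimal sketch of the formula fi=(1-Ni/Nn)×100. The function name and the threshold values below are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
def normalization_factors(thresholds):
    """Percentage factors f_i = (1 - N_i / N_n) * 100 for i = 1 .. n-1.

    `thresholds` holds N_1 .. N_n in ascending order for the first cell.
    """
    n_max = thresholds[-1]
    return [(1 - n_i / n_max) * 100 for n_i in thresholds[:-1]]

# Hypothetical thresholds N1..N4 for the first cell.
factors = normalization_factors([50, 100, 150, 200])
```

For these values the factors are 75, 50 and 25 percent: each one is the percentage reduction in the event number needed to pass from the threshold of the maximum traffic level to the threshold of level i.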
The normalization factors determined for the first cell during the off-line phase, are then applied during the on-line phase to the recorded values of real-time network event number for each respective cell, different from the first cell, of the plurality of cells forming the geographic area, in order to calculate the corresponding traffic indexes for each cell of the geographic area.
In the on-line phase, following on from a traffic prediction request in a cell Ck of the plurality of cells different from the first cell, the network event data localized in cell Ck are collected during a pre-set collection time interval comprising the time instant at which the request has been generated. The number of network event data, (Ndet)k, detected during the collection time interval is calculated by summing the events detected during that interval. The collection interval is selected to be equal to the sampling time interval where the number of network event data in the first cell during the off-line phase have been calculated.
In some preferred embodiments, the network event data collected in real-time by the probe are aggregated and coded in the monitoring server 27 during sequential observation intervals corresponding to observation intervals of the sample event data. Preferably, the collection time interval is formed by a plurality of sequential observation intervals and (Ndet)k is determined by summing the respective numbers for each observation interval forming the collection interval.
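A minimal sketch of how (Ndet)k might be obtained from aggregated counts is given below. It assumes a 2-minute collection interval built from 5-second observation intervals, the example durations mentioned in the description; the function name and the per-interval counts are hypothetical.

```python
def detected_events(observation_counts, observations_per_collection):
    """Sum the most recent observation-interval counts covering one collection interval."""
    if len(observation_counts) < observations_per_collection:
        raise ValueError("not enough observation intervals for one collection interval")
    return sum(observation_counts[-observations_per_collection:])

# A 2-minute collection interval made of 5-second observation intervals: 24 counts.
OBSERVATIONS_PER_COLLECTION = 120 // 5

# Hypothetical aggregated counts for one cell, oldest first (4 minutes of data).
counts = [3] * 24 + [1] * 24
n_det = detected_events(counts, OBSERVATIONS_PER_COLLECTION)
```

Only the last 24 observation intervals contribute, so the detected event number for this illustrative series is 24.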
The traffic prediction tool 41 is configured to analyse the network event data detected during the off-line step and stored in the assessment time interval in the event database 38 and to retrieve the event number threshold value (Nn)k associated with the maximum traffic index n by selecting the maximum number of event data among the "historical" network event data for that cell Ck. The maximum number of event data is associated with a sampling time comprised in the assessment interval. Then, the event number threshold values (N1)k, (N2)k, ..., (N(n-1))k are calculated by applying the normalization factors fi, i=1, 2, ..., (n-1), determined before for the first cell. The detected value (Ndet)k is compared to the threshold values (N1)k, (N2)k, ..., (Nn)k in order to determine which one is the highest event number threshold value being smaller than or equal to the value (Ndet)k, and the traffic numeric value corresponding to said highest event number threshold value is associated with it. For example, if (N1)k < (Ndet)k < (N2)k, the traffic index associated with (Ndet)k is 1; if (N3)k < (Ndet)k < (N4)k, the traffic index associated with (Ndet)k is 3.
If the traffic request is related to a geographic area comprising a plurality of radio cells, the steps of selecting the threshold value of event data number associated with the traffic index n, calculating the threshold values of event data number associated with the traffic indexes 1, ..., (n-1) by applying the normalization factors, and comparing the detected event number to the n threshold values are repeated for all the cells forming the geographic area.
Preferably, the threshold value (Nn)k, being calculated based on historical off-line data and on a pre-set sampling time for each cell of the plurality of cells of the geographic area, is updated during the on-line phase every time a new maximum value is observed during the sampling time interval.
For example, and referring to figure 6, if an external client requests the real-time traffic index for "current Monday at 16:00" in a geographic area covering a plurality of network cells including at least one second cell different from the first cell, the system calculates the event number registered real-time during a time interval around 16:00 (e.g. between 15:58 and 16:00), which defines the collection interval, it calculates the threshold values for the second cell by applying the normalization factors for the first cell, and it selects a time-event diagram associated with the second cell. Then, the system outputs the traffic index being calculated by applying the regression function and having as input datum the recorded event number. In the example, the circle represents the event number recorded real-time around 16.00 being associated with a traffic level 3, "congested flow". The procedure is iteratively applied to all the remaining cells of the plurality of cells forming the prediction geographic area. The prediction geographic area can correspond with the geographic area formed by the plurality of cells for which "historical" event data are stored or it can be an area of interest for the prediction included in that area.
By carrying out the previously described steps for each one of the plurality of cells of the geographic area of interest, it is possible to generate a traffic flow map covering the geographic area by means of the method of the present invention. Figure 7 shows the map of the area (the main roads of the geographic area of interest are schematically visible), the cells topology, whose borders are shown using a white continuous line and a graphical representation of the traffic indexes by means of a grey scale for each cell. White colour shows a traffic index 1, light grey the traffic index 2, dark grey a traffic index 3 and black a traffic index 4, respectively.
Since the traffic indexes are varying real-time, the maps can be updated on line, for example by means of an application triggering the execution of traffic predictions at regular time intervals (e.g. every 3 minutes) that calculates again the traffic indexes for each cell of the plurality of cells of the geographic area of interest.
Example
The above-described method has been implemented and applied (in its first off-line phase) to six different urban areas of the city of Milan in order to find a regression-based statistical model able to describe the correlation between mobile network events and real traffic indexes of the considered areas. The areas have been selected following heterogeneity criteria: they represent different features often found in a big city, namely narrow streets, multi-lane avenues, congested traffic flow areas, etc. Each area was formed by a respective radio cell, indicated in the following as C1-C6.
During an evaluation time interval equal to 76 days, samples of the event number have been collected for each cell associated with the 6 selected geographical areas, at different times of the day. The event number is calculated considering all the network event types generated during a 2-minute time interval (i.e. the sampling interval) centred on the Timestamp of the sampling request.
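Counting the events in a 2-minute window centred on the request Timestamp can be sketched as follows. The function name and the event times are hypothetical, and treating the window as closed at its start and open at its end is an assumption made here, since the description does not specify the boundary handling.

```python
from datetime import datetime, timedelta

def count_events_centred(event_times, request_time, window=timedelta(minutes=2)):
    """Count events in a window of length `window` centred on `request_time`.

    The window is taken as [start, end): start included, end excluded (assumption).
    """
    half = window / 2
    start, end = request_time - half, request_time + half
    return sum(1 for t in event_times if start <= t < end)

# Hypothetical event timestamps recorded for one cell.
request = datetime(2015, 5, 4, 16, 0, 0)
event_times = [
    datetime(2015, 5, 4, 15, 58, 30),  # before the window
    datetime(2015, 5, 4, 15, 59, 0),   # window start, counted
    datetime(2015, 5, 4, 15, 59, 30),
    datetime(2015, 5, 4, 16, 0, 59),
    datetime(2015, 5, 4, 16, 1, 0),    # window end, not counted
]
event_number = count_events_centred(event_times, request)
```

Three of the five illustrative events fall inside the window from 15:59:00 to 16:01:00.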
The traffic sampling data for the dependent variable Yh have been collected from a source outside the system, available on the Internet, providing real-time traffic data calculated from cameras and inductive coils located at relevant points of the main streets of the urban areas under analysis. The real traffic data Yh and the event number for each cell during an assessment time interval have been correlated by means of the least squares technique (OLS, Ordinary Least Squares) using three regression functions: a linear function according to Eq. (1), a single-variable third degree polynomial function, and a single-variable logistic function of the type described by Eq. (4). In the analysis, the statistical significance threshold has been set equal to 0.01. The normality of the residual distribution, being considered a requirement in order to apply the OLS regression, has been verified by means of the Fisher Test. In agreement with the selected statistical significance threshold, the probability values (i.e. p-values) of obtaining a result equal to or greater than the observed one are smaller than or equal to 0.01.
The regression functions that best correlated the traffic behaviour related to the network event number were the third degree polynomial functions. Linear and logistic functions for the six selected areas did not provide any significant statistical model. Table 2 reports statistical significance data of the found regression functions for the six reference areas, correlating traffic data Yh (80 samples) and samplings related to the total event number, z, associated with the plurality of six cells, C1-C6. The identified regression functions are third degree polynomial regression functions, of the form:
Yh = c1·z³ + c2·z² + c3·z + c4 (8) with c1-c4 the numerical parameters of the function.
In Table 2, SSreg is the regression sum of squares, SSresid is the residual sum of squares, Sey is the standard error for Y and the F value is the Fisher Test result.
Table 2
Cell ID | Model significance | R2 | F value | SSreg | SSresid | Sey
C1 | no | 0.4341 | 26.4147 | 11.3637 | 23.6680 | 0.6560
C2 | no | 0.2808 | 16.2912 | 3.3735 | 8.9042 | 0.4550
C3 | yes | 0.7229 | 124.6879 | 41.2810 | 19.2023 | 0.5754
C4 | no | 0.3158 | 7.7379 | 1.4945 | 8.3054 | 0.4394
C5 | no | 0.1500 | 6.4388 | 2.2328 | 14.9115 | 0.5888
C6 | no | 0.2428 | 10.9038 | 1.1517 | 3.5913 | 0.3250
The results show that the regression model is meaningful for cell C3, whose correlation coefficient, R2, is equal to about 0.72. For the remaining cells, the model is not particularly meaningful. For cell C3, about 72% of the traffic flow variation is described by the total event number collected by the probe.
The polynomial regression function identified for the area covered by cell C3 has been used as statistical model to describe the relationship between real traffic and network events for all the system cells.
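The choice of the reference cell can be sketched as picking the cell with the highest R2 among those reported in Table 2. The selection rule coded here (maximum R2 only) is a simplifying assumption for illustration; the variable names are hypothetical, and the R2 values are those of Table 2.

```python
# R2 values per cell as reported in Table 2.
table2_r2 = {
    "C1": 0.4341,
    "C2": 0.2808,
    "C3": 0.7229,
    "C4": 0.3158,
    "C5": 0.1500,
    "C6": 0.2428,
}

# Cell whose regression function correlates best with the traffic data.
reference_cell = max(table2_r2, key=table2_r2.get)
```

With the Table 2 values this selects cell C3, the only cell whose model is marked as significant.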
Figure 8 is a graph showing the real traffic data, Yh, imported from an external data source on the x-axis and the total event number detected by the network probe for each sampling time interval during the 76-day evaluation period on the y-axis. The traffic data range from a value 1, corresponding to the moving flow condition, to a value 4, corresponding to the impossible traffic condition. The continuous line is the third degree polynomial regression curve calculated by means of the OLS technique. The regression function numerical parameters of Eq. (8) for cell C3 are c1=2.3652, c2=18.522, c3=78.896 and c4=0. The F value is 124.68, much higher than the critical value equal to 7 in the case of 100 Df, therefore the residuals are supposed to follow a normal distribution and the null hypothesis is rejected with a probability p<0.01. By means of the polynomial regression function found for cell C3, it is possible to calculate the normalization factors in order to identify the traffic threshold values. By analysing the sample event data for cell C3, the system calculated the maximum event number during the data sampling time interval (2 minutes) to be equal to 315, corresponding to the value calculated by the regression function (see figure 8) for the traffic index 4. Then, by observing the intersection of the regression curve with every traffic index, it has been possible to calculate the event number threshold correlated to the respective traffic indexes smaller than 4. In figure 8, it is possible to note that:
- the most severe traffic index 4 correlates with N4=315 network events;
- the traffic index 3 correlates with N3=230 network events;
- the traffic index 2 correlates with N2=135 network events;
- the traffic index 1 correlates with N1=100 network events.
The normalization factors have been calculated as a percentage reduction in the network event number to pass from traffic index threshold n to traffic index threshold i=1, 2, ..., (n-1), fi=(1-Ni/Nn)×100.
In figure 8, it is possible to note that:
- to pass from traffic index 4 to index 3, the network events number reduction is 26.984% (from 315 to 230);
- to pass from traffic index 4 to index 2, the network events number reduction is 57.143% (from 315 to 135);
- to pass from traffic index 4 to index 1, the network events number reduction is 68.254% (from 315 to 100).
Then these normalization factors have been applied to all the real-time network event number recorded for each cell of the geographic area of interest, in order to calculate the corresponding traffic thresholds for each cell.
Particularly, following a request of current traffic prediction at a request instant, the network event data are detected in cells C1-C6 during a collection interval equal to the sampling interval (i.e. 2 minutes) and comprising the request time instant. The system calculated the number of network event data detected during the collection time interval for each cell, (Ndet)1, ..., (Ndet)6, and collected from the off-line data the N4 values for each cell C1-C6 and the normalization factors calculated for cell C3. Then, the system applied those normalization factors to the N4 values of the cells C1, C2, C4-C6 in order to determine a respective plurality of event number threshold values Ni, with i=1, ..., (n-1), for each cell C1, C2, C4, C5 and C6.
For each cell C1-C6, the respective number of collected network event data (Ndet)1-(Ndet)6 has been compared with the event number threshold values for the respective cell in order to determine which traffic index each number of event data had to be associated with.
For example, the event number detected in the collection interval, (Ndet)1, is 74 for cell C1. If the N4 value, taken as the maximum event number stored for cell C1, is equal to 200, the threshold values N1, N2 and N3 for cell C1 are calculated as N3 = 200×(1-26.984%) = 146, N2 = 200×(1-57.143%) = 86 and N1 = 200×(1-68.254%) = 63. The number (Ndet)1 is compared to the N1-N4 values calculated in that way for cell C1. Since (Ndet)1 is greater than N1 and smaller than N2, the real-time traffic index associated with cell C1 is the one corresponding to N2, namely 2 (high). If, again by way of example, (Ndet)1 had been greater than or equal to 147, the traffic index would be 4, namely impossible.
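The worked example for cell C1 can be reproduced with the short sketch below. It is an illustration under stated assumptions: thresholds are rounded to the nearest integer, and a detected count is mapped to index i when it lies above N(i-1) and at or below Ni, with anything above N3 mapped to index 4, which is the reading used in this example. The function names are hypothetical; the factors are those calculated for cell C3.

```python
from bisect import bisect_left

# Normalization factors f1, f2, f3 (percent) calculated for cell C3.
FACTORS_C3 = [68.254, 57.143, 26.984]

def thresholds_for_cell(n_max, factors):
    """Return [N1, ..., N(n-1), Nn] for a cell whose maximum event number is n_max."""
    lower = [round(n_max * (1 - f / 100)) for f in factors]  # rounding is an assumption
    return lower + [n_max]

def traffic_index(n_det, thresholds):
    """Map a detected event number to a traffic index 1..n.

    Counts up to and including N_i belong to index i; counts above N_(n-1)
    belong to the maximum index n (reading taken from the worked example).
    """
    return bisect_left(thresholds[:-1], n_det) + 1

thresholds_c1 = thresholds_for_cell(200, FACTORS_C3)
index_74 = traffic_index(74, thresholds_c1)
index_147 = traffic_index(147, thresholds_c1)
```

For N4 = 200 the thresholds come out as 63, 86, 146 and 200; a detected count of 74 maps to index 2 and a count of 147 maps to index 4, matching the figures given in the example.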
This process is repeated for cells C2, C4, C5 and C6 in order to determine the traffic index for each cell during the collection interval comprising the request instant.

Claims

1. An automatic method for predicting the vehicle traffic in a geographic area covered by a mobile telecommunications network and comprising a plurality of radio cells, the method comprising:
- receiving network event data between the mobile network and mobile terminals in an event database and storing network event data localized in the plurality of radio cells and occurred at event instants comprised in an assessment time interval, wherein each network event datum comprises data identifying a signalling message exchanged between the mobile network and a mobile terminal, an identification code of the mobile terminal user, data identifying a radio cell where the mobile terminal has been localized receiving/transmitting the signalling message, and an event time instant associated with said signalling message,
selecting sample network event data in the data stored in the event database as network event data associated with at least one event described by a respective signalling message localized in a cell of the plurality of radio cells in the assessment time interval,
dividing the assessment time interval into a plurality of sampling time intervals and calculating the number of sample network event data for each sampling interval comprised in the assessment time interval so as to determine a plurality of numbers of sample event data for each cell of the plurality of cells,
receiving, from a traffic data source, numerical traffic sampling data indicative of the traffic and relating to time intervals comprised in a time interval corresponding to the assessment time interval and in a sampling area corresponding to the area covered by a first radio cell of the plurality of cells, wherein the numerical traffic sampling data are defined with a plurality n of ascending numerical traffic values as traffic indexes,
correlating the plurality of numbers of sample event data calculated for the first cell with the numerical traffic sampling data by applying at least one mathematical regression function, described by a continuous curve and by a plurality of numerical parameters, to the numbers of sample network event data as an independent variable and to the traffic sampling data as a dependent variable searching the values of the plurality of numerical parameters of the regression function which best approximate point values having as their coordinates the numbers of sample network event data and the numerical traffic values,
selecting and storing the mathematical regression function of best approximation of the at least one mathematical regression function,
generating mathematical relationships based on the mathematical regression function of best approximation,
receiving, at a time instant of request, a request of prediction of the traffic in a prediction geographic area comprising a second radio cell, wherein the second cell is comprised in the plurality of radio cells comprising the first cell,
defining a collection time interval comprising the request instant and collecting in real time network event data localized in the second cell by intercepting by a network passive probe current network event data in the collection time interval, wherein the network event data are associated with at least one event described by a signalling message of the same type as the signalling message of the at least one event of the sample network event data stored for the assessment interval,
calculating a number of detected network event data, (Ndet)2, collected in the second cell in the collection time interval,
analyzing in the event database the stored network event data that are localized in the second cell and occurred in the assessment time interval and applying said mathematical relationships to the stored network event data for the second cell to determine the numerical traffic prediction datum in the second cell.
2. The method according to claim 1, wherein generating mathematical relationships based on the mathematical regression function of best approximation comprises: calculating a first plurality n of threshold values of number of sample network event data, (N1)1, ..., (Nn)1, localized in the first cell by applying said regression function of best approximation respectively on each value of the plurality n of numerical traffic values,
calculating a plurality (n-1) of normalization factors fi, i= 1, 2, ..., (n-1), based on the ratio between a threshold value (Ni)1 of number of sample network event data for the first cell and the threshold value (Nn)1 of number of sample network event data relating to the maximum numerical traffic value n for the same cell.
3. The method according to claim 2, wherein applying said mathematical relationships to the stored network event data for the second cell to determine the numerical traffic prediction datum in the second cell comprises:
selecting the maximum number of event data for said second cell as a threshold value of number of detected events (Nn)2 associated with the maximum traffic index n for the second cell, and
applying said plurality of normalization factors fi to the threshold value of number of events (Nn)2 associated with the maximum traffic index n for the second cell by calculating a respective plurality (n-1) of threshold values of number of network event data lower than (Nn)2 obtaining a second plurality of n threshold values associated with the respective plurality n of numerical traffic values,
comparing the number of network event data collected in the second cell (Ndet)2 to the second plurality n of threshold values of number of network event data so as to determine a highest threshold value of number of event data of the second plurality of threshold values (N1)2, ..., (Nn)2 for the second cell, which is less than or equal to the number (Ndet)2 of network event data collected and
associating a numerical traffic value corresponding to said highest threshold value of number of event data as a numerical traffic prediction datum for the second cell.
4. The method according to claim 2 or 3, wherein the normalization factors are percent factors defined by
fi = (1 - (Ni)1/(Nn)1) × 100,
i= 1, 2, .. (n-1).
5. The method according to one of the claims 1 to 4, wherein receiving network event data in an event database comprises:
intercepting network event data between the mobile network and mobile terminals in the geographic area comprising the plurality of cells by a network passive probe, and
- transmitting and loading network event data intercepted in the event database.
6. The method according to claim 5, wherein the assessment time interval comprises a plurality of sequential observation intervals and the method comprises, before transmitting and loading network event data in the event data base:
aggregating and coding the network event data intercepted in sequential observation time intervals,
wherein transmitting and loading network event data comprises transmitting and loading the coded network event data in the event database.
7. The method according to claim 6, further comprising, after collecting in real time network event data and before calculating the number of event data detected in real time in the collection time interval, aggregating and coding detected network event data localized in the second cell in sequential observation intervals corresponding to the observation intervals of the sample event data.
8. The method of one of the preceding claims, wherein selecting sample network event data in the event database is carried out by selecting all stored network event data in the event database localized in the first radio cell and having event instants comprised in the assessment time interval.
9. The method of one of the preceding claims, further comprising, before the step of dividing the assessment time interval in sampling intervals and calculating the number of sample network event data for each sampling interval, analyzing and filtering the sample network event data to eliminate the non-significant network events, wherein analyzing and filtering comprises:
(i) extracting network event data from the event database localized in the plurality of cells and relating to at least one event and at time instants comprised in the time interval corresponding to the assessment time interval,
(ii) grouping the extracted network event data that are associated with a same identification code of a user of a mobile terminal so as to create a plurality of groups associated with respective and different user identification codes,
(iii) chronologically ordering the network event data relating to each identification code of a user of the plurality of identification codes of the user,
(iv) based on the chronological order of the event data, establishing a spatial path traced by a target mobile terminal associated with a target identification code in the assessment time interval through at least two of the plurality of radio cells, wherein the spatial path is defined as a sequence of pairs of adjacent and contiguous cells,
(v) determining the displacement velocity from a starting cell Ci to a second arrival cell Cj that is adjacent and contiguous to the starting cell for each pair of adjacent and contiguous cells along the spatial path established for the target mobile terminal, based on a first localization instant of a first event in the cell Ci associated with the target identification code and on a second localization instant of a second event in the cell Cj associated with the target identification code,
(vi) determining whether the displacement velocity is less than or equal to, or greater than, a displacement velocity threshold value,
(vii) selecting the network event data relating to the target identification code of the user in the assessment interval associated with a displacement velocity that is greater than the velocity threshold value,
(viii) repeating the steps from (iv) to (vii) for the remaining user identification codes of the plurality of user identification codes so as to select sample network event data relating to mobile terminals associated with vehicle movements, and
(ix) transmitting and loading the plurality of selected network event data in the event database,
wherein correlating the plurality of numbers of sample event data with the traffic sampling data is carried out by taking the number of filtered network event data resulting from step (ix) as an independent variable for the first cell.
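Steps (ii) to (viii) of claim 9 can be sketched in Python as follows. This is an illustrative reading, not the applicants' implementation: the `Event` record, the function `select_vehicle_events`, and the `velocity_between` callback are hypothetical names. The sketch assumes that two consecutive events in different cells form a pair of adjacent cells, and that an event is kept when at least one hop it belongs to exceeds the velocity threshold.

```python
from collections import defaultdict
from typing import NamedTuple


class Event(NamedTuple):
    user_id: str   # user identification code
    cell_id: str   # radio cell where the event was localized
    t: float       # localization instant, in seconds


def select_vehicle_events(events, velocity_between, v_threshold):
    """Keep the events of hops whose displacement velocity exceeds
    v_threshold. velocity_between(a, b) is a caller-supplied function
    returning the velocity for the hop from event a to event b."""
    # (ii) group events by user identification code
    by_user = defaultdict(list)
    for ev in events:
        by_user[ev.user_id].append(ev)

    selected = []
    for user_events in by_user.values():
        # (iii) order each user's events chronologically
        user_events.sort(key=lambda e: e.t)
        keep = set()
        # (iv) the path is the sequence of consecutive cell pairs
        for a, b in zip(user_events, user_events[1:]):
            if a.cell_id == b.cell_id or b.t <= a.t:
                continue  # no cell change, or no usable time difference
            # (v)-(vii) compare the hop velocity with the threshold
            if velocity_between(a, b) > v_threshold:
                keep.update((a, b))
        selected.extend(e for e in user_events if e in keep)
    return selected
```

The count of events returned for a given cell is what the claim then uses as the independent variable in the correlation with the traffic sampling data.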
10. The method according to claim 9, wherein the determination of the displacement velocity between two adjacent and contiguous cells comprises:
receiving data relating to the topology of the mobile network from a topologic data base,
based on the data relating to the network topology, calculating a respective centroid for each cell of the pair of adjacent and contiguous cells Ci and Cj passed through along the established path of a target mobile terminal as the starting cell Ci and the arrival cell Cj, and
calculating the displacement velocity between the centroid of the starting cell Ci and the centroid of the adjacent and contiguous arrival cell Cj as the ratio between the distance between the centroids of the cells of the pair of adjacent and contiguous cells and the difference between the localization time instants of the respective first and second network events generated respectively by the starting cell and the arrival cell.
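The velocity calculation of claim 10 reduces to a distance divided by a time difference. The Python sketch below assumes that each cell is described by a list of planar (x, y) vertices in metres and that the centroid is the simple average of those vertices; the claim does not state how the topologic database represents a cell or how the centroid is computed, so both are assumptions, and the function names are hypothetical.

```python
import math


def centroid(vertices):
    """Centroid of a cell, taken here as the mean of its vertices.
    Assumption: planar coordinates in metres; not area-weighted."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def displacement_velocity(centroid_i, centroid_j, t_i, t_j):
    """Velocity from starting cell Ci to arrival cell Cj: distance
    between the centroids divided by the difference between the
    localization instants (seconds). Result is in metres per second."""
    dt = t_j - t_i
    if dt <= 0:
        raise ValueError("arrival event must be later than starting event")
    distance = math.hypot(centroid_j[0] - centroid_i[0],
                          centroid_j[1] - centroid_i[1])
    return distance / dt


c_i = centroid([(0, 0), (2, 0), (2, 2), (0, 2)])
v = displacement_velocity((0.0, 0.0), (3.0, 4.0), 0.0, 10.0)
```

With geographic coordinates, the Euclidean distance would need to be replaced by a geodesic distance.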
11. The method according to claim 9 or 10, wherein detecting and filtering network event data to suppress the non-significant network events further comprises, after the step (viii) and before the step (ix):
extracting the network event data selected in step (viii) and which are associated with vehicle movements,
associating a localization time t_k of the event in the cell to each cell C_k of the plurality of cells of the spatial path of a mobile terminal associated with a target identification code and selecting a target passing-through time t_s,
for each cell C_k of the plurality of cells of the spatial path associated with the target identification code, calculating a first passing-through time (t_k - t_{k-1}) from a starting cell C_{k-1} associated with a localization time t_{k-1} to an adjacent and contiguous arrival cell C_k associated with a localization time t_k and a second passing-through time (t_k - t_{k-2}) backwards between the cell C_k and the cell C_{k-2},
determining whether at least one of the conditions (t_k - t_{k-1}) > t_s and (t_k - t_{k-2}) < 2t_s is met,
if at least one of the two conditions is met, eliminating from the established path of the mobile terminal the network event datum associated with the cell C_k and with the localization instant t_k, and
repeating the previous steps for the remaining user identification codes of the plurality of identification codes of the user so as to create a plurality of filtered network event data to be transmitted and loaded in the event data base.
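The passing-through time filter of claim 11 can be sketched in Python for a single user path. The function name `filter_path` is hypothetical. Two points are assumptions because the claim leaves them open: the time differences are computed on the original path rather than on the progressively filtered one, and a condition that cannot be evaluated (no predecessor at k = 0, no second predecessor at k = 1) is treated as not met.

```python
def filter_path(times, t_s):
    """Return the indices k of the path events that are kept.

    times: localization times t_k of the path, in chronological order.
    t_s:   target passing-through time, same unit as times.

    An event is eliminated when at least one condition from the claim
    holds: (t_k - t_{k-1}) > t_s  or  (t_k - t_{k-2}) < 2 * t_s.
    """
    kept = []
    for k, t_k in enumerate(times):
        first_cond = k >= 1 and (t_k - times[k - 1]) > t_s
        second_cond = k >= 2 and (t_k - times[k - 2]) < 2 * t_s
        if not (first_cond or second_cond):
            kept.append(k)
    return kept


# With t_s = 60 s, the events at index 2 and 3 exceed the
# first passing-through time and are dropped.
kept = filter_path([0, 30, 100, 400], 60)
```

Repeating the call for every user identification code yields the filtered event data that the claim loads into the event database.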
12. The method of one of the preceding claims, wherein the mobile network is a 2G or 3G radio access network and the data identifying the radio cell where the mobile terminal has been localized receiving/transmitting the signalling message comprise the LAC and the SAC.
PCT/IB2015/053376 2014-05-09 2015-05-08 Method and system for vehicular traffic prediction WO2015170289A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
ITMI2014A000850 2014-05-09
ITMI20140850 2014-05-09

Publications (1)

Publication Number Publication Date
WO2015170289A1 (en) 2015-11-12

Family

ID=51179014

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB2015/053376 WO2015170289A1 (en) 2014-05-09 2015-05-08 Method and system for vehicular traffic prediction

Country Status (1)

Country Link
WO (1) WO2015170289A1 (en)

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105913664A (en) * 2016-06-29 2016-08-31 肖锐 Traffic flow monitoring and predicting system
CN106128098A (en) * 2016-06-29 2016-11-16 肖锐 A kind of multi-display apparatus that can carry out traffic flow forecasting
CN106157615A (en) * 2016-06-29 2016-11-23 肖锐 A kind of traffic flow information management handheld terminal
RU2664034C1 (en) * 2017-04-05 2018-08-14 Общество С Ограниченной Ответственностью "Яндекс" Traffic information creation method and system, which will be used in the implemented on the electronic device cartographic application
CN112183221A (en) * 2020-09-04 2021-01-05 北京科技大学 Semantic-based dynamic object self-adaptive trajectory prediction method
CN112884190A (en) * 2019-11-29 2021-06-01 杭州海康威视数字技术股份有限公司 Flow prediction method and device
CN114037160A (en) * 2021-11-10 2022-02-11 西南交通大学 Method for constructing passenger flow prediction model of SEM-Logit tourism railway
CN114173356A (en) * 2021-11-04 2022-03-11 中国联合网络通信集团有限公司 Network quality detection method, device, equipment and storage medium
CN114724414A (en) * 2022-03-14 2022-07-08 中国科学院地理科学与资源研究所 Method, device, electronic equipment and medium for determining urban air traffic sharing rate
CN115394086A (en) * 2022-10-26 2022-11-25 北京闪马智建科技有限公司 Traffic parameter prediction method, device, storage medium and electronic device
CN116502125A (en) * 2023-04-28 2023-07-28 成都赛力斯科技有限公司 Vehicle event dividing method and device and vehicle networking server
CN117910660A (en) * 2024-03-18 2024-04-19 华中科技大学 Bus arrival time prediction method and system based on GPS data and space-time correlation

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2001069570A2 (en) * 2000-03-17 2001-09-20 Makor Issues And Rights Ltd. Real time vehicle guidance and traffic forecasting system
US20030014181A1 (en) * 2001-07-10 2003-01-16 David Myr Traffic information gathering via cellular phone networks for intelligent transportation systems
US20030014180A1 (en) * 2001-07-10 2003-01-16 David Myr Method for regional system wide optimal signal timing for traffic control based on wireless phone networks
WO2004027729A1 (en) * 2002-09-19 2004-04-01 Politecnico Di Torino A method and system for detecting and estimating road traffic from location data of mobile terminals in a radiocommunication system, so as a program therefor

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
M. T. ALREFAIE ET AL.: "SUPERHUB: the User Centric Approach for New Traffic Prediction Models", THE 9TH ITS EUROPEAN CONGRESS, 4 June 2013 (2013-06-04)

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106128098A (en) * 2016-06-29 2016-11-16 肖锐 A kind of multi-display apparatus that can carry out traffic flow forecasting
CN106157615A (en) * 2016-06-29 2016-11-23 肖锐 A kind of traffic flow information management handheld terminal
CN105913664A (en) * 2016-06-29 2016-08-31 肖锐 Traffic flow monitoring and predicting system
RU2664034C1 (en) * 2017-04-05 2018-08-14 Общество С Ограниченной Ответственностью "Яндекс" Traffic information creation method and system, which will be used in the implemented on the electronic device cartographic application
US10720049B2 (en) 2017-04-05 2020-07-21 Yandex Europe Ag Method and system for generating traffic information to be used in map application executed on electronic device
CN112884190B (en) * 2019-11-29 2023-11-03 杭州海康威视数字技术股份有限公司 Flow prediction method and device
CN112884190A (en) * 2019-11-29 2021-06-01 杭州海康威视数字技术股份有限公司 Flow prediction method and device
CN112183221A (en) * 2020-09-04 2021-01-05 北京科技大学 Semantic-based dynamic object self-adaptive trajectory prediction method
CN112183221B (en) * 2020-09-04 2024-05-03 北京科技大学 Semantic-based dynamic object self-adaptive track prediction method
CN114173356A (en) * 2021-11-04 2022-03-11 中国联合网络通信集团有限公司 Network quality detection method, device, equipment and storage medium
CN114173356B (en) * 2021-11-04 2024-01-09 中国联合网络通信集团有限公司 Network quality detection method, device, equipment and storage medium
CN114037160B (en) * 2021-11-10 2023-04-18 西南交通大学 Method for constructing passenger flow prediction model of SEM-Logit tourism railway
CN114037160A (en) * 2021-11-10 2022-02-11 西南交通大学 Method for constructing passenger flow prediction model of SEM-Logit tourism railway
CN114724414B (en) * 2022-03-14 2023-06-09 中国科学院地理科学与资源研究所 Method and device for determining urban air traffic sharing rate, electronic equipment and medium
CN114724414A (en) * 2022-03-14 2022-07-08 中国科学院地理科学与资源研究所 Method, device, electronic equipment and medium for determining urban air traffic sharing rate
CN115394086A (en) * 2022-10-26 2022-11-25 北京闪马智建科技有限公司 Traffic parameter prediction method, device, storage medium and electronic device
CN116502125A (en) * 2023-04-28 2023-07-28 成都赛力斯科技有限公司 Vehicle event dividing method and device and vehicle networking server
CN116502125B (en) * 2023-04-28 2024-03-12 重庆赛力斯凤凰智创科技有限公司 Vehicle event dividing method and device and vehicle networking server
CN117910660A (en) * 2024-03-18 2024-04-19 华中科技大学 Bus arrival time prediction method and system based on GPS data and space-time correlation

Similar Documents

Publication Publication Date Title
WO2015170289A1 (en) Method and system for vehicular traffic prediction
EP3132592B1 (en) Method and system for identifying significant locations through data obtainable from a telecommunication network
EP3335209B1 (en) Method and system for computing an o-d matrix obtained through radio mobile network data
Calabrese et al. Urban sensing using mobile phone network data: a survey of research
Asgari et al. A survey on human mobility and its applications
EP2608181B1 (en) Method for detecting traffic
EP2608144A2 (en) Mobile device user categorisation based on location statistics
EP2603893A1 (en) Aggregating demographic distribution information
US11765549B2 (en) Contact tracing involving an index case, based on comparing geo-temporal patterns that include mobility profiles
Holleczek et al. Traffic measurement and route recommendation system for mass rapid transit (mrt)
Tosi Cell phone big data to compute mobility scenarios for future smart cities
CN105101399B (en) Pseudo-base station mobile route acquisition methods, device and pseudo-base station localization method, device
WO2020002094A1 (en) Method and system for traffic analysis
Bahoken et al. Designing origin-destination flow matrices from individual mobile phone paths: the effect of spatiotemporal filtering on flow measurement
US20230199513A1 (en) Method and system for calculating origin-destination matrices exploiting mobile communication network data
CN115002697B (en) Contact user identification method, device and equipment of user to be checked and storage medium
CN116701551A (en) Abnormality prediction method, device, equipment and storage medium
EP3563592B1 (en) Method for determining the mobility status of a user of a wireless communication network
Kishore et al. Mobile phone data analysis guidelines: applications to monitoring physical distancing and modeling COVID-19
CN112601177B (en) Method, system, server and storage medium for guiding people flow in public area
CN113409018B (en) People stream density determining method, device, equipment and storage medium
Jormakka Validation of mobile network data in producing Origin-Destination matrices
WO2022219457A1 (en) Method for characterization of paths travelled by mobile user terminals
Rajna Mobility analysis with mobile phone data
Victor et al. Smartphone-collected mobile network events for mobility modeling

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15732345

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15732345

Country of ref document: EP

Kind code of ref document: A1