CN104464344B - Vehicle travel path prediction method and system - Google Patents
Vehicle travel path prediction method and system
- Publication number
- CN104464344B CN201410628190.1A
- Authority
- CN
- China
- Prior art keywords
- path
- sequence
- path sequence
- pattern
- sequence pattern
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- G—PHYSICS
- G08—SIGNALLING
- G08G—TRAFFIC CONTROL SYSTEMS
- G08G1/00—Traffic control systems for road vehicles
- G08G1/09—Arrangements for giving variable traffic instructions
- G08G1/0962—Arrangements for giving variable traffic instructions having an indicator mounted inside the vehicle, e.g. giving voice messages
- G08G1/0968—Systems involving transmission of navigation instructions to the vehicle
- G08G1/096805—Systems involving transmission of navigation instructions to the vehicle where the transmitted instructions are used to compute a route
- G08G1/096811—Systems involving transmission of navigation instructions to the vehicle where the transmitted instructions are used to compute a route where the route is computed offboard
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2228—Indexing structures
- G06F16/2246—Trees, e.g. B+trees
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
- G06F16/24552—Database cache management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2474—Sequence data queries, e.g. querying versioned data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/29—Geographical information databases
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Software Systems (AREA)
- Remote Sensing (AREA)
- Computational Linguistics (AREA)
- Radar, Positioning & Navigation (AREA)
- Fuzzy Systems (AREA)
- Mathematical Physics (AREA)
- Probability & Statistics with Applications (AREA)
- Traffic Control Systems (AREA)
Abstract
A vehicle travel path prediction method and system based on a Hadoop platform, comprising: determining the minimum node memory and scanning for the longest path sequence, and dividing the original path sequence database evenly into n disjoint path sequence sub-databases; uploading the original path sequence database and the n path sequence sub-databases to HDFS separately; having the master node dispatch the n path sequence sub-databases to different Map nodes, where each Map node runs the improved GSP algorithm and, according to a preset minimum support ξ, scans the sub-database held in the Map node's memory and computes local path sequence patterns, and a Reduce node merges them to obtain global candidate sequence patterns; scanning the original path sequence database again to obtain global path sequence patterns; and generating path association rules from the global path sequence patterns and computing their confidence to obtain the vehicle travel path prediction result.
Description
Technical field
The invention belongs to the technical field of intelligent transportation systems, and in particular relates to a vehicle travel path prediction method and system.
Background art
(1) Intelligent transportation systems
With the development and maturation of geographic positioning technology and the rise of mobile computing, applications based on paths and geographic positions have become a shared focus of academia, industry and even government. Path information and geographic position are important attributes of moving objects and can provide important support for improving many services and application systems. Taking the path and position information of moving objects as system input has given rise to numerous emerging applications. Intelligent transportation systems are one of the best-known application fields. The predecessor of the intelligent transportation system was the intelligent vehicle-highway system. An intelligent transportation system effectively integrates advanced information technology, data communication and transmission technology, electronic sensing technology, electronic control technology, computer processing technology and so on, applies them to the entire traffic management system, and thereby establishes a real-time, accurate and efficient integrated transportation and management system that works over a large area and in all respects. An intelligent transportation system is a complex, comprehensive system that, from the viewpoint of system composition, can be divided into the following subsystems:
1) Advanced traffic information service system (ATIS)
ATIS is built on a comprehensive information network. Through sensors and transmission equipment installed on roads, in vehicles, at transfer stations, in parking lots and at meteorological centers, traffic participants supply the traffic information center with real-time traffic information from each location. After obtaining and processing this information, ATIS provides traffic participants in real time with road traffic information, public transport information, transfer information, traffic weather information, parking lot information and other travel-related information. Travelers use this information to decide their mode of travel and choose a route. Furthermore, when the vehicle is equipped with automatic positioning and navigation systems, the system can help the driver select a travel route automatically.
2) Advanced traffic management system (ATMS)
ATMS shares part of its information collection, processing and transmission systems with ATIS, but ATMS is mainly intended for traffic managers. It is used to detect, control and manage road traffic and provides communication links among roads, vehicles and drivers. It monitors in real time the traffic conditions, traffic accidents, weather conditions and traffic environment in the road system, relies on advanced vehicle detection technology and computer information processing technology to obtain information about traffic conditions, and controls traffic according to the collected information, for example through signal lights, publication of guidance information, road control, accident handling and rescue.
3) Advanced public transportation system (APTS)
The main purpose of APTS is to use various intelligent technologies to promote the development of the public transport industry, so that the public transport system becomes safe, convenient, economical and high-capacity. For example, personal computers, closed-circuit television and the like are used to advise the public on the choice of travel mode and time, route and service number, and displays at bus stops give waiting passengers real-time vehicle running information. At the public transport vehicle management center, departure and return-to-depot plans can be arranged sensibly according to the real-time status of vehicles, improving work efficiency and service quality.
4) Advanced vehicle control system (AVCS)
The purpose of AVCS is to develop technologies that help the driver control the vehicle, so that cars travel safely and efficiently. AVCS includes warnings and assistance for the driver and automated driving technologies such as obstacle avoidance.
5) Freight management system
This refers to an intelligent logistics management system that is based on the expressway network and an information management system and is managed using logistics theory. It makes combined use of satellite positioning, geographic information systems, logistics information and network technology to organize freight transport effectively and improve freight efficiency.
6) Electronic toll collection system (ETC)
ETC is currently the most advanced road and bridge toll collection method. It uses dedicated short-range microwave communication between the on-board unit mounted on the vehicle windshield and the microwave antenna in the ETC lane of the toll station, together with computer networking technology for background settlement with the bank. A vehicle can thus pay road and bridge tolls without stopping at the toll station, and the tolls paid are distributed, after background processing, to the relevant revenue owners. Installing an electronic non-stop toll collection system on an existing lane can raise the lane's traffic capacity by 3 to 5 times.
7) Emergency rescue system (EMS)
EMS is a special system built on ATIS, ATMS and the relevant rescue organizations and facilities. Through ATIS and ATMS, the traffic monitoring center and professional rescue organizations are joined into an organic whole that provides road users with services such as on-site emergency handling of vehicle breakdowns, towing, on-site rescue and removal of accident vehicles.
(2) Path prediction techniques
Path prediction methods fall mainly into the following two classes:
1) Path prediction methods based on Markov models. Document [1]: Simmons R, Browning B, Zhang Y, et al. Learning to predict driver route and destination intent [C]. Proceedings of Intelligent Transportation Systems Conference, 2006: 127-132, proposes that even when a better path exists, people habitually choose a familiar route they have taken before. On this premise, a Markov probability model is built from observations of a driver's historical travel path data and a Markov probability tree is generated, so that the vehicle's path choice at the next moment can be predicted from the state at the current moment. Document [2]: Research on ETC toll data mining based on a mixed Markov model [J]. Transportation Systems Engineering and Information, 2012, 12(4), selects ETC historical data to build a path sequence transaction database, proposes a method for predicting expressway vehicle paths based on a mixed Markov path prediction model, and uses the method to predict the future passing states of expressway ETC vehicles. However, the prediction distance of this method is short: it can only predict the road segment the vehicle will reach at the next moment.
2) Path prediction methods based on sequential pattern mining. Document [3]: Yang J, Hu M. TrajPattern: mining sequential patterns from imprecise trajectories of mobile objects [C]. Proceedings of the International Conference on Extending Database Technology, 2006: 664-681, addresses the position prediction problem for moving targets in mobile computing environments and proposes a method for mining movement rules from historical trajectory data. The movement area is first divided into a number of grid cells of equal area, each target trajectory is then converted into the ordered sequence of grid cells it passes through, and the standard GSP algorithm is then used to mine the frequent sequential patterns and generate inference rules. Document [4]: Giannotti F, Nanni M, Pedreschi D. Trajectory pattern mining [C]. Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2007: 330-339, proposes a frequent sequential pattern mining algorithm for data supplied by GPS devices which, building on the algorithm of document [3], adds the dwell time within a grid cell as a parameter. However, when processing massive data, the computing capability of these methods falls far short of what is required. It is therefore necessary to make full use of the latest advances in computer software and hardware to improve computational efficiency.
At present, intelligent transportation systems use large numbers of advanced sensing devices, network technologies, camera equipment and high-speed computer systems and can monitor and collect large amounts of traffic data in real time. Assume that the road intersections at which electronic eyes are installed are the nodes of the traffic network; a vehicle travel path sequence (hereinafter, path sequence) can then be represented as an ordered arrangement of nodes. Let I = {i_k, k = 1, 2, …, n} be an item set, where item i_k denotes a road network node, that is, a road intersection at which an electronic eye is installed, and n is the number of intersections. A path sequence is an ordered arrangement of distinct items, and a path sequence S can be written S = <s_1, s_2, …, s_j, …, s_n>, where s_j is an item of the item set I. A sequence formed by several consecutive items of a path sequence is called a subpath sequence of that path sequence. If path sequence α is a subpath sequence of path sequence β, then β is said to contain α. The support count of a path sequence S in a path sequence database is the number of path sequences in the database that contain S. The support of S in the database is the percentage of the database's path sequences that contain S, written Support(S). Given a minimum support ξ, if the support of S in the database is not less than ξ, then S is called a path sequence pattern. Path sequences have the following property (hereinafter, property 1): every two adjacent items of a path sequence are two adjacent nodes of the road network.
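The definitions above can be illustrated with a minimal Python sketch. The function names and the three-sequence toy database are illustrative assumptions, not part of the patent; containment is read here as "the items of S appear consecutively", following the definition of a subpath sequence.

```python
from typing import List, Sequence


def contains(path: Sequence[str], sub: Sequence[str]) -> bool:
    """Return True if `sub` occurs in `path` as a run of consecutive items."""
    k = len(sub)
    return any(list(path[i:i + k]) == list(sub) for i in range(len(path) - k + 1))


def support_count(db: List[Sequence[str]], s: Sequence[str]) -> int:
    """Number of path sequences in the database that contain `s`."""
    return sum(1 for path in db if contains(path, s))


def support(db: List[Sequence[str]], s: Sequence[str]) -> float:
    """Fraction of path sequences in the database that contain `s`."""
    return support_count(db, s) / len(db)


def is_path_sequence_pattern(db: List[Sequence[str]], s: Sequence[str], xi: float) -> bool:
    """`s` is a path sequence pattern if its support is not less than `xi`."""
    return support(db, s) >= xi


# Hypothetical toy database of three path sequences.
toy_db = [["A", "B", "D", "F"], ["A", "B", "D", "G"], ["A", "C", "E"]]
print(support_count(toy_db, ["A", "B", "D"]))                  # 2
print(is_path_sequence_pattern(toy_db, ["A", "B", "D"], 0.5))  # True
```

Note that <A D> is not contained in <A B D> under this reading, because A and D are not consecutive there.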
(3) The Map-Reduce programming framework
Map-Reduce is a programming framework that uses the concepts "Map" and "Reduce" for parallel computation over large data sets (larger than 1 TB). It was proposed in the related document: [3] Jeffrey Dean and Sanjay Ghemawat. MapReduce: Simplified data processing on large clusters [C]. Communications of the ACM, 2008, 51(1): 107-113. The user only needs to write two functions, called Map and Reduce. The system manages the parallel execution of Map and Reduce tasks and the coordination among tasks, handles the failure of any such task, and at the same time provides tolerance of hardware faults.
The Map-Reduce computation proceeds as follows:
1) The Map-Reduce library in the user program first splits the input file into M data splits, each typically 16 to 64 MB in size (the user can control the size of each split through an optional parameter). The Map-Reduce library then starts many copies of the program on a cluster of machines.
2) One of these program copies is special: the master. The other copies are workers that are assigned tasks by the master. There are M Map tasks and R Reduce tasks to assign, and the master assigns a Map task or a Reduce task to an idle worker.
3) A worker that has been assigned a Map task reads the corresponding input split, parses key-value (key, value) pairs out of the input data, and passes each pair to the user-defined Map function. The intermediate key-value pairs produced by the Map function are buffered in local memory.
4) The buffered key-value pairs are divided into R regions by the partition function and periodically written to local disk. The locations of the buffered pairs on local disk are passed back to the master, which is responsible for forwarding these locations to the workers assigned Reduce tasks.
5) When a worker assigned a Reduce task receives the storage locations from the master, it uses remote procedure calls to read the buffered data from the local disks of the machines hosting the Map workers. When the Reduce worker has read all the intermediate data, it sorts the data by key so that data with the same key are grouped together. Sorting is necessary because many different keys can map to the same Reduce task. If the intermediate data are too large to sort in memory, an external sort is used.
6) The Reduce worker iterates over the sorted intermediate data and, for each unique intermediate key, passes the key and the set of intermediate values associated with it to the user-defined Reduce function. The output of the Reduce function is appended to the output file of the corresponding partition.
7) When all Map and Reduce tasks have completed, the master wakes up the user program. At that point, the Map-Reduce call in the user program returns.
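The data flow of steps 3) to 6) can be imitated in a single process. The following Python sketch is only an illustration of the map, partition, sort and reduce phases; it is not Hadoop code, it omits the master, workers, disks and fault tolerance, and the names `run_map_reduce`, `map_crossings` and `reduce_sum` as well as the CRC32-based partition function are assumptions made for the example.

```python
import zlib
from collections import defaultdict
from typing import Any, Callable, Dict, Iterable, List, Tuple


def run_map_reduce(
    splits: List[Any],
    map_fn: Callable[[Any], Iterable[Tuple[Any, Any]]],
    reduce_fn: Callable[[Any, List[Any]], Any],
    num_reducers: int = 2,
) -> Dict[Any, Any]:
    # Map phase: each split is processed independently; the intermediate
    # pairs are routed to one of R regions by a partition function.
    regions = [defaultdict(list) for _ in range(num_reducers)]
    for split in splits:
        for key, value in map_fn(split):
            r = zlib.crc32(repr(key).encode("utf-8")) % num_reducers
            regions[r][key].append(value)
    # Reduce phase: each region is sorted by key, and every unique key is
    # passed to the reduce function together with all of its values.
    output = {}
    for region in regions:
        for key in sorted(region):
            output[key] = reduce_fn(key, region[key])
    return output


def map_crossings(split):
    """Emit (crossing, 1) for every crossing of every path sequence in a split."""
    for path in split:
        for crossing in path:
            yield crossing, 1


def reduce_sum(key, values):
    return sum(values)


# Two hypothetical input splits, each a list of path sequences.
splits = [[["A", "B"], ["A", "C"]], [["B", "D"]]]
print(sorted(run_map_reduce(splits, map_crossings, reduce_sum).items()))
# [('A', 2), ('B', 2), ('C', 1), ('D', 1)]
```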
(4) The Hadoop cloud computing platform
Hadoop is an open-source software project for reliable, scalable, distributed computing developed by the Apache Software Foundation. Users can develop distributed programs without knowing the low-level details of distribution and can make full use of the power of a cluster for high-speed computation and storage. Hadoop implements a distributed file system, the Hadoop Distributed File System, HDFS for short. HDFS is highly fault tolerant and is designed to be deployed on inexpensive hardware. It provides high-throughput access to application data and suits applications with very large data sets. HDFS relaxes some POSIX requirements so that data in the file system can be accessed as a stream.
The core of the Hadoop framework design is HDFS and Map-Reduce: HDFS provides storage for massive data, and Map-Reduce provides computation over massive data.
However, for a specific technical problem, one still has to work out how to design a technical scheme that can be implemented in parallel with Map-Reduce. No technical scheme with satisfactory results has yet appeared in this field.
Summary of the invention
Existing path prediction methods based on Markov models have a short prediction distance and can only predict the road segment the vehicle will reach at the next moment, and existing path prediction methods based on sequential pattern mining are computationally inefficient on massive and high-dimensional data. To address these problems, and exploiting property 1 of vehicle travel path sequences, the present invention improves the candidate sequence pattern generation process of the original GSP algorithm to raise its computational performance, parallelizes the improved GSP algorithm with the Map-Reduce programming framework, and designs a sequence database decomposition strategy that meets the requirements of parallel computation and reduces I/O overhead. On this basis it makes full use of the large-scale parallel computing capability of the Hadoop cloud computing platform to improve the efficiency of sequential pattern mining over massive data and to shorten the running time.
The technical scheme provided by the present invention is a vehicle travel path prediction method that performs the following steps on a Hadoop platform:
Step 1: according to the memory of each computer in the Hadoop platform, determine the smallest memory among all nodes, denoted Q, in GB;
Step 2: scan the original path sequence database, which stores the vehicle travel path sequences, to obtain the number of path sequences in it, denoted m, where every path sequence includes one or more intersections; the actual storage size of the longest path sequence in the original path sequence database is denoted P, in bytes (B);
Step 3: divide the original path sequence database evenly, by horizontal partitioning, into n disjoint path sequence sub-databases, where P × (m/n) ≤ Q × 10^9;
Step 4: upload the original path sequence database to a designated folder in HDFS;
Step 5: upload the n path sequence sub-databases to another designated folder in HDFS;
Step 6: the master node of the Hadoop platform dispatches the n path sequence sub-databases uploaded in step 5 to different Map nodes; each Map node runs the improved GSP algorithm and, according to the preset minimum support ξ, scans the sub-database held in the Map node's memory, computes the local path sequence patterns, and passes them to the Reduce node as <key, value> pairs, where key is a local path sequence pattern and value is the support count of that local path sequence pattern;
Each Map node runs the improved GSP algorithm as follows:
Operation a: for the path sequence sub-database assigned to this Map node, scan the sub-database to obtain the 1-path sequence patterns L_1, and let k = 1;
Operation b: generate the candidate (k+1)-path sequences C_{k+1} from the k-path sequence patterns L_k, scan the sequence database again, compute the support of each candidate path sequence, and produce the (k+1)-path sequence patterns L_{k+1}. The candidate (k+1)-path sequences C_{k+1} are generated in one of the following two ways:
(1) If candidate 2-path sequence patterns are generated from the 1-path sequence patterns, scan the adjacency list that stores the traffic road network information, look up the adjacent nodes of each path sequence pattern s_1 in the 1-path sequence patterns L_1, and append each node item adjacent to s_1 to s_1;
(2) If candidate (k+1)-path sequence patterns are generated from the k-path sequence patterns with k > 1:
First, for any two path sequence patterns s_1 and s_2 among the k-path sequence patterns, if the path sequence obtained by removing the first item of s_1 is identical to the path sequence obtained by removing the last item of s_2, join s_1 and s_2. Then prune: if some subpath sequence of a candidate path sequence pattern is not a path sequence pattern, delete that candidate from the candidate path sequence patterns;
Operation c: let k = k + 1 and repeat operation b until no new candidate path sequences are generated;
Step 7: the Reduce node merges the <key, value> pairs passed from the Map nodes to obtain the global candidate sequence patterns;
Step 8: scan again the original path sequence database stored in HDFS in step 4, count the global candidate sequence patterns, and find the sequence patterns whose support is not less than the preset minimum support ξ, which gives the global path sequence patterns;
Step 9: generate path association rules from the global path sequence patterns produced in step 8 and compute the confidence of the path association rules to obtain the vehicle travel path prediction result.
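Operations a to c of the improved GSP algorithm run by each Map node can be sketched in Python as follows. This is a minimal single-machine sketch under stated assumptions, not the patented implementation: the function names are hypothetical, path sequences are tuples of node labels, containment means "appears as consecutive items", and the minimum support is a fraction of the sub-database size. Under this contiguous reading, the two length-k subpaths of a joined candidate are exactly s_1 and s_2, so the pruning check always passes; it is kept only to mirror the text.

```python
from typing import Dict, List, Set, Tuple

Path = Tuple[str, ...]


def count_containing(db: List[Path], s: Path) -> int:
    """Number of path sequences in `db` that contain `s` as consecutive items."""
    k = len(s)
    return sum(any(p[i:i + k] == s for i in range(len(p) - k + 1)) for p in db)


def generate_candidates(patterns: Set[Path], adjacency: Dict[str, List[str]]) -> Set[Path]:
    """Candidate (k+1)-path sequences from a non-empty set of k-patterns."""
    k = len(next(iter(patterns)))
    if k == 1:
        # Case (1): extend each 1-pattern with every node adjacent to it.
        return {s + (nb,) for s in patterns for nb in adjacency.get(s[0], [])}
    # Case (2): join s1 and s2 when s1 without its first item equals
    # s2 without its last item, then prune on the length-k subpaths.
    joined = {s1 + s2[-1:] for s1 in patterns for s2 in patterns if s1[1:] == s2[:-1]}
    return {c for c in joined if c[:-1] in patterns and c[1:] in patterns}


def improved_gsp(db: List[Path], adjacency: Dict[str, List[str]], min_support: float) -> Dict[Path, int]:
    """Return {local path sequence pattern: support count} for one sub-database."""
    threshold = min_support * len(db)

    def frequent(candidates: Set[Path]) -> Dict[Path, int]:
        counted = {c: count_containing(db, c) for c in candidates}
        return {c: n for c, n in counted.items() if n > 0 and n >= threshold}

    level = frequent({(node,) for p in db for node in p})  # operation a
    result: Dict[Path, int] = {}
    while level:                                            # operations b and c
        result.update(level)
        level = frequent(generate_candidates(set(level), adjacency))
    return result


# Hypothetical sub-database and road network fragment.
sub_db = [("A", "B", "D"), ("A", "B", "D"), ("A", "C", "E")]
roads = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "E"], "D": ["B"], "E": ["C"]}
for pattern, count in sorted(improved_gsp(sub_db, roads, 0.6).items()):
    print(pattern, count)
```

Generating 2-candidates from the adjacency list rather than from all pairs of frequent nodes is where property 1 reduces the candidate set: pairs of nodes that are not adjacent in the road network can never occur consecutively in a path sequence.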
The present invention correspondingly provides a vehicle travel path prediction system in which the following modules are arranged on a Hadoop platform:
Memory determination module, for determining, according to the memory of each computer in the Hadoop platform, the memory of the machine with the least memory among all nodes, denoted Q, in GB;
Longest path sequence determination module, for scanning the original path sequence database, which stores the vehicle travel path sequences, to obtain the number of path sequences in it, denoted m, where every path sequence includes one or more intersections; the actual storage size of the longest path sequence in the original path sequence database is denoted P, in bytes (B);
Sub-database division module, for dividing the original path sequence database evenly, by horizontal partitioning, into n disjoint path sequence sub-databases, where P × (m/n) ≤ Q × 10^9;
Original database upload module, for uploading the original path sequence database to a designated folder in HDFS;
Sub-database upload module, for uploading the n path sequence sub-databases to another designated folder in HDFS;
Local path sequence pattern module, for making the master node of the Hadoop platform dispatch the n path sequence sub-databases uploaded by the sub-database upload module to different Map nodes; each Map node runs the improved GSP algorithm and, according to the preset minimum support ξ, scans the sub-database held in the Map node's memory, computes the local path sequence patterns, and passes them to the Reduce node as <key, value> pairs, where key is a local path sequence pattern and value is the support count of that local path sequence pattern;
Each Map node runs the improved GSP algorithm as follows:
Operation a: for the path sequence sub-database assigned to this Map node, scan the sub-database to obtain the 1-path sequence patterns L_1, and let k = 1;
Operation b: generate the candidate (k+1)-path sequences C_{k+1} from the k-path sequence patterns L_k, scan the sequence database again, compute the support of each candidate path sequence, and produce the (k+1)-path sequence patterns L_{k+1}. The candidate (k+1)-path sequences C_{k+1} are generated in one of the following two ways:
(1) If candidate 2-path sequence patterns are generated from the 1-path sequence patterns, scan the adjacency list that stores the traffic road network information, look up the adjacent nodes of each path sequence pattern s_1 in the 1-path sequence patterns L_1, and append each node item adjacent to s_1 to s_1;
(2) If candidate (k+1)-path sequence patterns are generated from the k-path sequence patterns with k > 1:
First, for any two path sequence patterns s_1 and s_2 among the k-path sequence patterns, if the path sequence obtained by removing the first item of s_1 is identical to the path sequence obtained by removing the last item of s_2, join s_1 and s_2. Then prune: if some subpath sequence of a candidate path sequence pattern is not a path sequence pattern, delete that candidate from the candidate path sequence patterns;
Operation c: let k = k + 1 and repeat operation b until no new candidate path sequences are generated;
Global candidate sequence pattern module, for making the Reduce node merge the <key, value> pairs passed from the Map nodes to obtain the global candidate sequence patterns;
Global path sequence pattern module, for scanning again the original path sequence database stored in HDFS by the original database upload module, counting the global candidate sequence patterns, and finding the sequence patterns whose support is not less than the preset minimum support ξ, which gives the global path sequence patterns;
Prediction result module, for generating path association rules from the global path sequence patterns produced by the global path sequence pattern module and computing the confidence of the path association rules to obtain the vehicle travel path prediction result.
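The work of the global candidate sequence pattern module and the global path sequence pattern module (steps 7 and 8 of the method) can be sketched as below. The function names and the toy data are hypothetical, and the merge is shown as a plain in-memory function rather than a Hadoop Reduce task. The second scan is needed because a merged local count covers only the sub-databases in which the pattern was locally frequent, so it can underestimate the true global count.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Path = Tuple[str, ...]


def contains(path: Path, sub: Path) -> bool:
    """True if `sub` occurs in `path` as consecutive items."""
    k = len(sub)
    return any(path[i:i + k] == sub for i in range(len(path) - k + 1))


def merge_local_patterns(local_results: Iterable[Dict[Path, int]]) -> Dict[Path, int]:
    """Reduce step: merge the <key, value> pairs of all Map nodes by key."""
    merged: Dict[Path, int] = defaultdict(int)
    for local in local_results:
        for pattern, count in local.items():
            merged[pattern] += count
    return dict(merged)


def global_path_sequence_patterns(
    original_db: List[Path], candidates: Iterable[Path], min_support: float
) -> Dict[Path, int]:
    """Recount every global candidate on the original database and keep
    those whose support is not less than the minimum support."""
    threshold = min_support * len(original_db)
    result = {}
    for cand in candidates:
        count = sum(1 for path in original_db if contains(path, cand))
        if count >= threshold:
            result[cand] = count
    return result


# Hypothetical outputs of two Map nodes and a hypothetical original database.
local_1 = {("A", "B"): 2}
local_2 = {("A", "B"): 1, ("C",): 2}
original = [("A", "B"), ("A", "B", "D"), ("C", "A", "B"), ("C",), ("E",), ("E", "F")]

candidates = merge_local_patterns([local_1, local_2])
print(global_path_sequence_patterns(original, candidates, 0.5))  # {('A', 'B'): 3}
```

With the same relative threshold ξ on every sub-database, any pattern that is frequent in the whole database must be frequent in at least one sub-database, so no global pattern is lost by recounting only the merged candidates.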
Compared with existing vehicle travel path prediction methods at home and abroad, the present invention, following the basic requirements of the Map-Reduce programming framework, redesigns the process of mining sequential patterns from vehicle travel path sequences and generating path association rules. The invention also improves the candidate sequence pattern generation process of the original GSP algorithm by exploiting property 1 of vehicle travel path sequences, and designs a sound sequence database decomposition strategy that parallelizes the improved GSP algorithm, reduces I/O overhead, makes full use of the processing capability of cluster computers with shared storage, and improves efficiency. The technical scheme is simple and fast and improves the efficiency of mining sequential patterns from vehicle travel path sequences and generating path association rules.
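The decomposition strategy rests on the constraint P × (m/n) ≤ Q × 10^9: since P is the size of the longest path sequence, P × (m/n) is an upper bound on the size of a sub-database holding m/n sequences, and it must fit in Q GB of memory. A minimal sketch of choosing the smallest admissible n and splitting the database row-wise is given below; the function names and the example values of P and m are assumptions made for illustration.

```python
import math
from typing import List, Sequence


def min_partitions(p_bytes: int, m: int, q_gb: float) -> int:
    """Smallest n with P * (m / n) <= Q * 10**9, and at least 1."""
    return max(1, math.ceil(p_bytes * m / (q_gb * 10**9)))


def horizontal_split(db: Sequence, n: int) -> List[list]:
    """Split `db` row-wise into n disjoint parts whose sizes differ by at most 1."""
    base, extra = divmod(len(db), n)
    parts, start = [], 0
    for i in range(n):
        size = base + (1 if i < extra else 0)
        parts.append(list(db[start:start + size]))
        start += size
    return parts


# Hypothetical values: longest sequence 1000 B, 6 million sequences, Q = 2 GB.
print(min_partitions(1000, 6_000_000, 2))               # 3
print([len(part) for part in horizontal_split(range(10), 4)])  # [3, 3, 2, 2]
```

A larger n than the minimum is also admissible under the constraint; the choice may then depend on how many Map nodes are available.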
Brief description of the drawings
Fig. 1 is the flow chart of the embodiment of the present invention;
Fig. 2 is a schematic diagram of the simulated traffic road network of the embodiment of the present invention;
Fig. 3 is the adjacency list storing the simulated traffic road network of the embodiment of the present invention;
Fig. 4 is a schematic diagram of the division of the original path sequence database in the embodiment of the present invention;
Fig. 5 is a schematic diagram of the Map task performed on path sequence sub-database 1 in the embodiment of the present invention;
Fig. 6 is a schematic diagram of the Map task performed on path sequence sub-database 2 in the embodiment of the present invention;
Fig. 7 is a schematic diagram of the Map task performed on path sequence sub-database 3 in the embodiment of the present invention.
Detailed description of the embodiments
The technical scheme of the present invention is described in detail below with reference to the drawings and an embodiment.
The embodiment takes the simulated traffic road network shown in Fig. 2 as an example; electronic eyes collect data at all 14 road intersections A to N. Because the invention uses the road network information, an adjacency list is used to store it. The adjacency list for this road network is shown in Fig. 3: intersection A is adjacent to B and C; B to A and D; C to A and E; D to B, G and F; E to C, F and H; F to D, G, J, H and E; G to D, I and F; H to F, K and E; I to G and L; J to F and N; K to H and M; L to I and N; M to K and N; N to J, L and M. The path sequences corresponding to the vehicle travel records collected by the electronic eyes are stored in the vehicle travel path sequence database; every path sequence includes one or more intersections, as shown in the following table.
Path sequence |
<A B D F H K> |
<A C E F G I L> |
<A B D F H K M N> |
<C E F G I L N> |
<A B D F H K> |
<C E F G I L N> |
<A B D G I L N> |
<A B D F H K M> |
<A B D F H K> |
<E F G I L N> |
<A B D G I L N> |
<A B D F H K M N> |
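The road network and the path sequence database described above can be written down directly as data. The following is a minimal Python sketch; the names `ADJACENCY`, `PATH_DB`, `is_valid_path` and `is_symmetric` are illustrative and do not come from the patent.

```python
# Adjacency list of the simulated road network, as listed in the text (Fig. 3).
ADJACENCY = {
    "A": ["B", "C"], "B": ["A", "D"], "C": ["A", "E"],
    "D": ["B", "G", "F"], "E": ["C", "F", "H"],
    "F": ["D", "G", "J", "H", "E"], "G": ["D", "I", "F"],
    "H": ["F", "K", "E"], "I": ["G", "L"], "J": ["F", "N"],
    "K": ["H", "M"], "L": ["I", "N"], "M": ["K", "N"],
    "N": ["J", "L", "M"],
}

# The 12 path sequences of the original path sequence database.
PATH_DB = [tuple(s.split()) for s in [
    "A B D F H K", "A C E F G I L", "A B D F H K M N", "C E F G I L N",
    "A B D F H K", "C E F G I L N", "A B D G I L N", "A B D F H K M",
    "A B D F H K", "E F G I L N", "A B D G I L N", "A B D F H K M N",
]]


def is_valid_path(path, adjacency=ADJACENCY):
    """Return True if every consecutive pair of intersections is adjacent."""
    return all(b in adjacency[a] for a, b in zip(path, path[1:]))


def is_symmetric(adjacency=ADJACENCY):
    """Return True if every adjacency is recorded in both directions."""
    return all(a in adjacency[b] for a, nbrs in adjacency.items() for b in nbrs)
```

Checking that every recorded path only moves between adjacent intersections is a cheap sanity test on the input data before any mining is done.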
A path sequence pattern reflects the regularity of vehicles' route choices. Directional path association rules are generated from the path sequence patterns: the antecedent of a rule represents the path sequence the vehicle has already travelled, and the consequent represents the path sequence the vehicle is about to travel. For example, the confidence conf(<A B D>→<F H K>) of the path association rule <A B D>→<F H K> is defined as the ratio of the number of path sequences in the path sequence database that contain <A B D F H K> to the number that contain <A B D>. That is, a vehicle that has passed through the three nodes A, B and D will subsequently pass through nodes F, H and K with probability conf(<A B D>→<F H K>).
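The confidence definition can be illustrated on the 12 path sequences of the embodiment. This is a sketch only: it assumes that a path sequence "contains" a pattern when the pattern appears as a run of consecutive intersections, which the patent does not state explicitly, and the function names are hypothetical.

```python
PATH_DB = [tuple(s.split()) for s in [
    "A B D F H K", "A C E F G I L", "A B D F H K M N", "C E F G I L N",
    "A B D F H K", "C E F G I L N", "A B D G I L N", "A B D F H K M",
    "A B D F H K", "E F G I L N", "A B D G I L N", "A B D F H K M N",
]]


def contains(path, pattern):
    """True if `pattern` occurs in `path` as consecutive intersections (assumption)."""
    path, pattern = tuple(path), tuple(pattern)
    k = len(pattern)
    return any(path[i:i + k] == pattern for i in range(len(path) - k + 1))


def support_count(db, pattern):
    """Number of path sequences in `db` that contain `pattern`."""
    return sum(contains(p, pattern) for p in db)


def confidence(db, antecedent, consequent):
    """conf(antecedent -> consequent) = count(antecedent + consequent) / count(antecedent)."""
    base = support_count(db, antecedent)
    if base == 0:
        return 0.0  # the rule is undefined when the antecedent never occurs
    return support_count(db, tuple(antecedent) + tuple(consequent)) / base
```

On this data, <A B D> occurs in 8 path sequences and <A B D F H K> in 6, so `confidence(PATH_DB, ("A", "B", "D"), ("F", "H", "K"))` evaluates to 0.75.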
Based on the original path sequence database generated above, the flow of the vehicle travel path prediction method designed by the present invention on the Map-Reduce programming framework is shown in Fig. 1. Those skilled in the art can implement all the steps with computer software so that the flow runs automatically. The embodiment is carried out as follows:
Step 1: according to the memory of each computer in the Hadoop platform, determine the memory of the machine with the least memory among all nodes, and denote it Q (unit: GB). In the embodiment, Q = 2 GB.
Step 3 will evenly divide the original path sequence database into n disjoint sub-path sequence databases and load each sub-path sequence database into node memory. To keep a computer with less memory from becoming the computational bottleneck, it is recommended that, in a specific implementation, every computer in the Hadoop platform have the same memory and computing performance.
Step 2: scan the original path sequence database once (it can be stored as a text document, which makes it easy to upload to HDFS). Denote the number of path sequences in the database as m, and the actual storage size of the longest path sequence in the database as P (unit: B). In the embodiment the database contains 12 path sequences; since one character occupies 1 B, the longest sequence actually occupies 17 B (including spaces and angle brackets). Hence m = 12 and P = 17 B.
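The values m = 12 and P = 17 B can be reproduced with a single pass over the text form of the database. The sketch below assumes one path sequence per line, written with angle brackets and single spaces as in the table above; `database_stats` is an illustrative name.

```python
DB_LINES = [
    "<A B D F H K>", "<A C E F G I L>", "<A B D F H K M N>", "<C E F G I L N>",
    "<A B D F H K>", "<C E F G I L N>", "<A B D G I L N>", "<A B D F H K M>",
    "<A B D F H K>", "<E F G I L N>", "<A B D G I L N>", "<A B D F H K M N>",
]


def database_stats(lines):
    """Return (m, P): the number of path sequences and the byte size of the longest one."""
    records = [line.strip() for line in lines if line.strip()]
    m = len(records)
    # Each ASCII character occupies 1 byte, including spaces and angle brackets.
    p = max((len(r.encode("utf-8")) for r in records), default=0)
    return m, p
```

The longest record, `<A B D F H K M N>`, has 8 letters, 7 spaces and 2 angle brackets, which gives the 17 B stated in the text.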
Step 3: evenly divide the original path sequence database into n disjoint sub-path sequence databases by horizontal partitioning (the n sub-path sequence databases can also be stored as text documents). In general n should divide m exactly, so that each sub-path sequence database contains m/n path sequences: the first sub-path sequence database contains path sequences 1 to m/n of the original path sequence database, the k-th (1 < k < n) contains path sequences (k-1) × (m/n) + 1 to k × (m/n), and the n-th contains path sequences (n-1) × (m/n) + 1 to m. So that the original path sequence database held in external storage need not be scanned when candidate path sequence patterns are counted, which reduces I/O overhead, each sub-path sequence database should fit in memory; that is, P × (m/n) ≤ Q × 10^9 should hold. When P and Q are expressed in other units, the corresponding condition should likewise hold, and this also falls within the scope of protection of the present invention.
As shown in Fig. 4, the embodiment divides the original path sequence database into n = 3 sub-path sequence databases. In the embodiment 17 × (12/3) < 2 × 10^9, which satisfies the requirement that each sub-path sequence database fit in memory.
Dividing the original path sequence database yields sub-path sequence databases 1, 2 and 3 as follows:
Path sequence table of sub-path sequence database 1
Path sequence |
<A B D F H K> |
<A C E F G I L> |
<A B D F H K M N> |
<C E F G I L N> |
Path sequence table of sub-path sequence database 2
Path sequence |
<A B D F H K> |
<C E F G I L N> |
<A B D G I L N> |
<A B D F H K M> |
Path sequence table of sub-path sequence database 3
Path sequence |
<A B D F H K> |
<E F G I L N> |
<A B D G I L N> |
<A B D F H K M N> |
Each path sequence contains some of the items in the item set {A, B, C, D, E, F, G, H, I, J, K, L, M, N}. Sub-path sequence database 1 contains path sequences 1 to 4 of the original path sequence database, sub-path sequence database 2 contains path sequences 5 to 8, and sub-path sequence database 3 contains path sequences 9 to 12.
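The horizontal partitioning of step 3 and its memory condition can be sketched as follows. The function names are illustrative; the sketch assumes, as the text recommends, that n divides m exactly, and it takes P in bytes and Q in GB with 1 GB counted as 10^9 B.

```python
DB_LINES = [
    "<A B D F H K>", "<A C E F G I L>", "<A B D F H K M N>", "<C E F G I L N>",
    "<A B D F H K>", "<C E F G I L N>", "<A B D G I L N>", "<A B D F H K M>",
    "<A B D F H K>", "<E F G I L N>", "<A B D G I L N>", "<A B D F H K M N>",
]


def partition(db, n):
    """Split `db` into n consecutive, disjoint blocks of m/n path sequences each."""
    m = len(db)
    if n <= 0 or m % n != 0:
        raise ValueError("n must be positive and divide the number of path sequences")
    size = m // n
    # Block k (0-based) holds records k*size .. (k+1)*size - 1.
    return [db[k * size:(k + 1) * size] for k in range(n)]


def fits_in_memory(p_bytes, m, n, q_gb):
    """Check the condition P x (m/n) <= Q x 10^9 from step 3."""
    return p_bytes * (m / n) <= q_gb * 10**9
```

With m = 12 and n = 3, `partition(DB_LINES, 3)` returns three blocks of four path sequences each, in the same order as the three tables above.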
Let q be the number of Map nodes in the Hadoop platform. It is recommended that the number of sub-path sequence databases equal the number of Map nodes, i.e. n = q. If n < q, then when the method runs without task failures, (q-n) Map nodes go unused and node utilization is low. If n > q, then when the method runs without task failures, n-q sub-path sequence databases can be processed only after the q Map nodes have finished the first q sub-path sequence databases, and processing efficiency is low. Therefore n = q satisfies both node utilization and processing efficiency.
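The trade-off between n and q can be made concrete with a deliberately simplified model. It assumes that each Map node processes one sub-path sequence database at a time, that all sub-databases take equal time, and that no task fails. None of this is specified by the patent beyond the reasoning above, and `schedule_summary` is a hypothetical helper.

```python
import math


def schedule_summary(n, q):
    """Return (rounds, idle_nodes) for n sub-databases on q Map nodes.

    rounds: how many successive batches of Map tasks are needed.
    idle_nodes: Map nodes that receive no task in the first batch.
    """
    if n <= 0 or q <= 0:
        raise ValueError("n and q must be positive")
    rounds = math.ceil(n / q)
    idle_nodes = max(q - n, 0)
    return rounds, idle_nodes
```

Under these assumptions n = q gives one round and no idle node, n < q leaves q - n nodes idle, and n > q needs more than one round.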
Step 4: upload the original path sequence database to a designated folder of HDFS; step 8 will scan the path sequence database stored in this folder.
Step 5: upload the n sub-path sequence databases to another designated folder of HDFS; the n sub-path sequence databases in this folder are the input files processed in step 6.
Step 6: the master node (the computer node that runs the master program) dispatches the n sub-path sequence databases uploaded in step 5 to different Map nodes (the computer nodes that execute Map tasks). Each Map node executes the improved GSP algorithm: according to the preset minimum support ξ, it scans the sub-path sequence database held in its memory, computes the local path sequence patterns, and passes them to the Reduce node (the computer node that executes the Reduce task) as <key, value> pairs, where key is a local path sequence pattern and value is the support count of that pattern.
The improved GSP algorithm executed by each Map node is as follows:
Operation a: for the sub-path sequence database assigned to this Map node, first scan the sub-path sequence database to obtain the 1-path sequence patterns L1, i.e. the set of path sequences of length 1 whose support in the sub-path sequence database is not less than ξ. Let the set of path sequences of length k whose support in the sub-path sequence database is not less than ξ be the k-path sequence patterns Lk; set k = 1.
Operation b: generate the candidate (k+1)-path sequences Ck+1 from the k-path sequence patterns Lk, scan the original sequence database again, compute the support of each candidate path sequence, and produce the (k+1)-path sequence patterns Lk+1.
Operation c: set k = k+1 and repeat operation b until no new candidate path sequence is generated. The resulting 1-path sequence patterns L1, 2-path sequence patterns L2, ... are all local path sequence patterns. The number of database scans equals the maximum length of the path sequence patterns produced.
Candidate path sequence patterns are generated in one of the following two ways:
(1) When candidate 2-path sequence patterns are generated from the 1-path sequence patterns, scan the adjacency list and check the adjacent nodes of each path sequence pattern s1 in the 1-path sequence patterns L1. If an adjacent node of s1 is also in L1, connect s1 with that adjacent node, i.e. append the item of the adjacent node to s1.
(2) When candidate (k+1)-path sequence patterns (k > 1) are generated from the k-path sequence patterns, generation proceeds in two steps:
First, join: for any two path sequence patterns s1 and s2 among the k-path sequence patterns, if the path sequence obtained by removing the first item of s1 is identical to the path sequence obtained by removing the last item of s2, then s1 can be joined with s2, i.e. the last item of s2 is appended to s1. Then, prune: if some sub-path sequence of a candidate path sequence pattern is not a path sequence pattern, that candidate cannot be a path sequence pattern, and it is deleted from the candidate path sequence patterns.
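The two candidate-generation cases can be sketched as below. Patterns are represented as tuples of intersection names. The pruning step is interpreted here as checking the contiguous length-k sub-paths of each candidate, which is an assumption since the text does not say which sub-path sequences are meant. The function names are illustrative.

```python
ADJACENCY = {
    "A": ["B", "C"], "B": ["A", "D"], "C": ["A", "E"],
    "D": ["B", "G", "F"], "E": ["C", "F", "H"],
    "F": ["D", "G", "J", "H", "E"], "G": ["D", "I", "F"],
    "H": ["F", "K", "E"], "I": ["G", "L"], "J": ["F", "N"],
    "K": ["H", "M"], "L": ["I", "N"], "M": ["K", "N"],
    "N": ["J", "L", "M"],
}


def candidates_2(l1, adjacency):
    """Case (1): extend each frequent intersection by its frequent neighbours."""
    frequent = {p[0] for p in l1}
    return {(a, b) for a in frequent for b in adjacency[a] if b in frequent}


def candidates_next(lk):
    """Case (2): join s1 and s2 when s1 without its first item equals
    s2 without its last item, then prune candidates with an infrequent sub-path."""
    lk = set(lk)
    if not lk:
        return set()
    k = len(next(iter(lk)))
    joined = {s1 + s2[-1:] for s1 in lk for s2 in lk if s1[1:] == s2[:-1]}
    # Prune: every contiguous length-k sub-path must itself be a k-pattern.
    return {c for c in joined
            if all(c[i:i + k] in lk for i in range(len(c) - k + 1))}
```

Because case (1) only pairs adjacent intersections, it avoids generating candidates such as <A D> that no vehicle could travel directly, which keeps the candidate set much smaller than pairing every two frequent intersections.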
The embodiment sets the minimum support to 50%; the concrete steps of the improved GSP algorithm are shown in Figs. 5, 6 and 7. The Map node assigned sub-path sequence database 1 scans it to obtain the 1-path sequence patterns L1, generates the candidate 2-path sequence patterns C2 from L1, scans the original sequence database again to compute the support of each candidate path sequence pattern, produces the 2-path sequence patterns L2, and repeats these operations until no new candidate path sequence patterns are generated. Sub-path sequence databases 2 and 3 are processed in the same way by the Map nodes to which they are assigned.
See Fig. 5; the results obtained when the algorithm is executed on sub-path sequence database 1 are given in the following tables:
L1 (1-path sequence patterns)
Path sequence | Support count |
<A> | 3 |
<B> | 2 |
<C> | 2 |
<D> | 2 |
<E> | 2 |
<F> | 4 |
<G> | 2 |
<H> | 2 |
<I> | 2 |
<K> | 2 |
<L> | 2 |
<N> | 2 |
C2 (candidate 2-path sequence patterns)
Path sequence |
<A B> |
<A C> |
<B A> |
<B D> |
<C A> |
<C E> |
<D B> |
<D G> |
<D F> |
<E F> |
<E H> |
<E C> |
<F D> |
<F G> |
<F H> |
<F E> |
<G D> |
<G I> |
<G F> |
<H E> |
<H F> |
<H K> |
<I G> |
<I L> |
<K H> |
<L I> |
<L N> |
<N L> |
L2 (2-path sequence patterns)
Path sequence | Support count |
<A B> | 2 |
<B D> | 2 |
<C E> | 2 |
<D F> | 2 |
<E F> | 2 |
<F G> | 2 |
<F H> | 2 |
<G I> | 2 |
<H K> | 2 |
<I L> | 2 |
C3 (candidate 3-path sequence patterns)
Path sequence |
<A B D> |
<B D F> |
<C E F> |
<D F G> |
<D F H> |
<E F G> |
<E F H> |
<F G I> |
<F H K> |
<G I L> |
L3 (3-path sequence patterns)
Path sequence | Support count |
<A B D> | 2 |
<B D F> | 2 |
<C E F> | 2 |
<D F H> | 2 |
<E F G> | 2 |
<F G I> | 2 |
<F H K> | 2 |
<G I L> | 2 |
C4 (candidate 4-path sequence patterns)
Path sequence |
<A B D F> |
<B D F H> |
<C E F G> |
<D F H K> |
<E F G I> |
<F G I L> |
L4 (4-path sequence patterns)
Path sequence | Support count |
<A B D F> | 2 |
<B D F H> | 2 |
<C E F G> | 2 |
<D F H K> | 2 |
<E F G I> | 2 |
<F G I L> | 2 |
C5 (candidate 5-path sequence patterns)
Path sequence |
<A B D F H> |
<B D F H K> |
<C E F G I> |
<E F G I L> |
L5 (5-path sequence patterns)
Path sequence | Support count |
<A B D F H> | 2 |
<B D F H K> | 2 |
<C E F G I> | 2 |
<E F G I L> | 2 |
C6 (candidate 6-path sequence patterns)
Path sequence |
<A B D F H K> |
<C E F G I L> |
L6 (6-path sequence patterns)
Path sequence | Support count |
<A B D F H K> | 2 |
<C E F G I L> | 2 |
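The tables for sub-path sequence database 1 can be reproduced with a compact, single-machine sketch of the improved GSP loop (operations a to c). Two assumptions are made: a path sequence supports a pattern when the pattern appears as consecutive intersections, and the count threshold is the minimum support multiplied by the number of path sequences. The name `local_gsp` is illustrative, and this is not the Hadoop implementation itself.

```python
import math

ADJACENCY = {
    "A": ["B", "C"], "B": ["A", "D"], "C": ["A", "E"],
    "D": ["B", "G", "F"], "E": ["C", "F", "H"],
    "F": ["D", "G", "J", "H", "E"], "G": ["D", "I", "F"],
    "H": ["F", "K", "E"], "I": ["G", "L"], "J": ["F", "N"],
    "K": ["H", "M"], "L": ["I", "N"], "M": ["K", "N"],
    "N": ["J", "L", "M"],
}

SUB_DB_1 = [tuple(s.split()) for s in [
    "A B D F H K", "A C E F G I L", "A B D F H K M N", "C E F G I L N",
]]


def contains(path, pattern):
    """True if `pattern` occurs in `path` as consecutive intersections (assumption)."""
    k = len(pattern)
    return any(path[i:i + k] == pattern for i in range(len(path) - k + 1))


def local_gsp(db, adjacency, min_support):
    """Return {pattern: support count} for all local path sequence patterns."""
    threshold = math.ceil(min_support * len(db))

    def frequent(candidates):
        counts = {c: sum(contains(p, c) for p in db) for c in candidates}
        return {c: n for c, n in counts.items() if n >= threshold}

    # Operation a: 1-path sequence patterns.
    level = frequent({(x,) for p in db for x in p})
    patterns = dict(level)
    k = 1
    while level:
        if k == 1:
            # Case (1): extend by adjacent intersections that are also frequent.
            cands = {a + (b,) for a in level
                     for b in adjacency[a[0]] if (b,) in level}
        else:
            # Case (2): join. The two contiguous length-k sub-paths of a joined
            # candidate are s1 and s2, so contiguous pruning removes nothing more.
            cands = {s1 + s2[-1:] for s1 in level for s2 in level
                     if s1[1:] == s2[:-1]}
        # Operation b: count candidates; operation c: move to the next length.
        level = frequent(cands)
        patterns.update(level)
        k += 1
    return patterns
```

Calling `local_gsp(SUB_DB_1, ADJACENCY, 0.5)` yields the patterns of tables L1 to L6 above, which are also the <key, value> pairs this Map node would emit.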
See Fig. 6; the results obtained when the algorithm is executed on sub-path sequence database 2 are given in the following tables:
L1 (1-path sequence patterns)
Path sequence | Support count |
<A> | 3 |
<B> | 3 |
<D> | 3 |
<F> | 3 |
<G> | 2 |
<H> | 2 |
<I> | 2 |
<K> | 2 |
<L> | 2 |
<N> | 2 |
C2 (candidate 2-path sequence patterns)
Path sequence |
<A B> |
<B A> |
<B D> |
<D B> |
<D G> |
<D F> |
<F D> |
<F G> |
<F H> |
<G D> |
<G I> |
<G F> |
<H F> |
<H K> |
<I G> |
<I L> |
<K H> |
<L I> |
<L N> |
<N L> |
L2 (2-path sequence patterns)
Path sequence | Support count |
<A B> | 3 |
<B D> | 3 |
<D F> | 2 |
<F H> | 2 |
<G I> | 2 |
<H K> | 2 |
<I L> | 2 |
<L N> | 2 |
C3 (candidate 3-path sequence patterns)
Path sequence |
<A B D> |
<B D F> |
<D F H> |
<F H K> |
<G I L> |
<I L N> |
L3 (3-path sequence patterns)
Path sequence | Support count |
<A B D> | 3 |
<B D F> | 2 |
<D F H> | 2 |
<F H K> | 2 |
<G I L> | 2 |
<I L N> | 2 |
C4 (candidate 4-path sequence patterns)
Path sequence |
<A B D F> |
<B D F H> |
<D F H K> |
<G I L N> |
L4 (4-path sequence patterns)
Path sequence | Support count |
<A B D F> | 2 |
<B D F H> | 2 |
<D F H K> | 2 |
<G I L N> | 2 |
C5 (candidate 5-path sequence patterns)
Path sequence |
<A B D F H> |
<B D F H K> |
L5 (5-path sequence patterns)
Path sequence | Support count |
<A B D F H> | 2 |
<B D F H K> | 2 |
C6 (candidate 6-path sequence patterns)
Path sequence |
<A B D F H K> |
L6 (6-path sequence patterns)
Path sequence | Support count |
<A B D F H K> | 2 |
See Fig. 7; the results obtained when the algorithm is executed on sub-path sequence database 3 are given in the following tables:
L1 (1-path sequence patterns)
Path sequence | Support count |
<A> | 3 |
<B> | 3 |
<D> | 3 |
<F> | 3 |
<G> | 2 |
<H> | 2 |
<I> | 2 |
<K> | 2 |
<L> | 2 |
<N> | 3 |
C2 (candidate 2-path sequence patterns)
Path sequence |
<A B> |
<B A> |
<B D> |
<D B> |
<D G> |
<D F> |
<F D> |
<F G> |
<F H> |
<G D> |
<G I> |
<G F> |
<H F> |
<H K> |
<I G> |
<I L> |
<K H> |
<L I> |
<L N> |
<N L> |
L2 (2-path sequence patterns)
Path sequence | Support count |
<A B> | 3 |
<B D> | 3 |
<D F> | 2 |
<F H> | 2 |
<G I> | 2 |
<H K> | 2 |
<I L> | 2 |
<L N> | 2 |
C3 (candidate 3-path sequence patterns)
Path sequence |
<A B D> |
<B D F> |
<D F H> |
<F H K> |
<G I L> |
<I L N> |
L3 (3-path sequence patterns)
Path sequence | Support count |
<A B D> | 3 |
<B D F> | 2 |
<D F H> | 2 |
<F H K> | 2 |
<G I L> | 2 |
<I L N> | 2 |
C4 (candidate 4-path sequence patterns)
Path sequence |
<A B D F> |
<B D F H> |
<D F H K> |
<G I L N> |
L4 (4-path sequence patterns)
Path sequence | Support count |
<A B D F> | 2 |
<B D F H> | 2 |
<D F H K> | 2 |
<G I L N> | 2 |
C5 (candidate 5-path sequence patterns)
Path sequence |
<A B D F H> |
<B D F H K> |
L5 (5-path sequence patterns)
Path sequence | Support count |
<A B D F H> | 2 |
<B D F H K> | 2 |
C6 (candidate 6-path sequence patterns)
Path sequence |
<A B D F H K> |
L6 (6-path sequence patterns)
Path sequence | Support count |
<A B D F H K> | 2 |
The <key, value> pairs that the Map worker nodes pass to the Reduce worker node are listed in the following table:
key | value |
<A> | 3 |
<B> | 2 |
<C> | 2 |
<D> | 2 |
<E> | 2 |
<F> | 4 |
<G> | 2 |
<H> | 2 |
<I> | 2 |
<K> | 2 |
<L> | 2 |
<N> | 2 |
<A B> | 2 |
<B D> | 2 |
<C E> | 2 |
<D F> | 2 |
<E F> | 2 |
<F G> | 2 |
<F H> | 2 |
<G I> | 2 |
<H K> | 2 |
<I L> | 2 |
<A B D> | 2 |
<B D F> | 2 |
<C E F> | 2 |
<D F H> | 2 |
<E F G> | 2 |
<F G I> | 2 |
<F H K> | 2 |
<G I L> | 2 |
<A B D F> | 2 |
<B D F H> | 2 |
<C E F G> | 2 |
<D F H K> | 2 |
<E F G I> | 2 |
<F G I L> | 2 |
<A B D F H> | 2 |
<B D F H K> | 2 |
<C E F G I> | 2 |
<E F G I L> | 2 |
<A B D F H K> | 2 |
<C E F G I L> | 2 |
<A> | 3 |
<B> | 3 |
<D> | 3 |
<F> | 3 |
<G> | 2 |
<H> | 2 |
<I> | 2 |
<K> | 2 |
<L> | 2 |
<N> | 2 |
<A B> | 3 |
<B D> | 3 |
<D F> | 2 |
<F H> | 2 |
<G I> | 2 |
<H K> | 2 |
<I L> | 2 |
<L N> | 2 |
<A B D> | 3 |
<B D F> | 2 |
<D F H> | 2 |
<F H K> | 2 |
<G I L> | 2 |
<I L N> | 2 |
<A B D F> | 2 |
<B D F H> | 2 |
<D F H K> | 2 |
<G I L N> | 2 |
<A B D F H> | 2 |
<B D F H K> | 2 |
<A B D F H K> | 2 |
<A> | 3 |
<B> | 3 |
<D> | 3 |
<F> | 3 |
<G> | 2 |
<H> | 2 |
<I> | 2 |
<K> | 2 |
<L> | 2 |
<N> | 3 |
<A B> | 3 |
<B D> | 3 |
<D F> | 2 |
<F H> | 2 |
<G I> | 2 |
<H K> | 2 |
<I L> | 2 |
<L N> | 2 |
<A B D> | 3 |
<B D F> | 2 |
<D F H> | 2 |
<F H K> | 2 |
<G I L> | 2 |
<I L N> | 2 |
<A B D F> | 2 |
<B D F H> | 2 |
<D F H K> | 2 |
<G I L N> | 2 |
<A B D F H> | 2 |
<B D F H K> | 2 |
<A B D F H K> | 2 |
Hadoop's Master node can automatically dispatch the n sub-path sequence databases to different Map worker nodes, manage the execution of the parallel Map tasks and the coordination among them, and handle the task-failure situations mentioned above. Implementing the method this way is relatively simple and fast.
Step 7: the Reduce node merges the <key, value> pairs passed over by the Map nodes to obtain the global candidate sequence patterns; that is, pairs with the same key are combined, converting the <key, value> pairs into pairs of the form <key, set of values associated with that key>. The global candidate sequence patterns produced in the embodiment are listed in the following table.
key | Set of values |
<A> | {3,3,3} |
<B> | {2,3,3} |
<C> | {2} |
<D> | {2,3,3} |
<E> | {2} |
<F> | {4,3,3} |
<G> | {2,2,2} |
<H> | {2,2,2} |
<I> | {2,2,2} |
<K> | {2,2,2} |
<L> | {2,2,2} |
<N> | {2,2,3} |
<A B> | {2,3,3} |
<B D> | {2,3,3} |
<C E> | {2} |
<D F> | {2,2,2} |
<E F> | {2} |
<F G> | {2} |
<F H> | {2,2,2} |
<G I> | {2,2,2} |
<H K> | {2,2,2} |
<I L> | {2,2,2} |
<A B D> | {2,3,3} |
<B D F> | {2,2,2} |
<C E F> | {2} |
<D F H> | {2,2,2} |
<E F G> | {2} |
<F G I> | {2} |
<F H K> | {2,2,2} |
<G I L> | {2,2,2} |
<A B D F> | {2,2,2} |
<B D F H> | {2,2,2} |
<C E F G> | {2} |
<D F H K> | {2,2,2} |
<E F G I> | {2} |
<F G I L> | {2} |
<A B D F H> | {2,2,2} |
<B D F H K> | {2,2,2} |
<C E F G I> | {2} |
<E F G I L> | {2} |
<A B D F H K> | {2,2,2} |
<C E F G I L> | {2} |
<L N> | {2,2} |
<I L N> | {2,2} |
<G I L N> | {2,2} |
The merge is performed automatically by Hadoop, so that identical local sequence patterns are not counted repeatedly.
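The grouping that Hadoop performs between the Map and Reduce phases can be imitated in a few lines. The sketch below uses only a handful of the <key, value> pairs from the tables above as sample input; `merge_pairs` is an illustrative name, not a Hadoop API.

```python
from collections import defaultdict

# A few of the pairs emitted by the three Map nodes (sample taken from the tables).
MAP_OUTPUTS = [
    [(("A",), 3), (("B",), 2), (("C",), 2), (("F", "G"), 2)],  # sub-database 1
    [(("A",), 3), (("B",), 3), (("L", "N"), 2)],               # sub-database 2
    [(("A",), 3), (("B",), 3), (("L", "N"), 2)],               # sub-database 3
]


def merge_pairs(map_outputs):
    """Group values by key, turning <key, value> pairs into <key, list of values>."""
    grouped = defaultdict(list)
    for pairs in map_outputs:
        for key, value in pairs:
            grouped[key].append(value)
    return dict(grouped)
```

Each key of the merged result is one global candidate sequence pattern. The local counts are kept only for reference, since step 8 recounts every candidate against the original database.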
Step 8: scan again the original path sequence database that step 4 stored in HDFS, count the global candidate sequence patterns, and find the sequence patterns whose support is not less than the preset minimum support ξ. The <key, value> pairs output by the embodiment are listed in the following table. The Map tasks produce only local sequence patterns, which do not necessarily satisfy the global minimum support, so the original sequence database is scanned again to obtain the global path sequence patterns. Scanning the original sequence database and counting the keys of the global candidate path sequence patterns from step 7 yields the global path sequence patterns, i.e. the keys in the following table.
key | value |
<A> | 9 |
<B> | 8 |
<D> | 8 |
<F> | 10 |
<G> | 6 |
<H> | 6 |
<I> | 6 |
<K> | 6 |
<L> | 6 |
<N> | 7 |
<A B> | 8 |
<B D> | 8 |
<D F> | 6 |
<F H> | 6 |
<G I> | 6 |
<H K> | 6 |
<I L> | 6 |
<A B D> | 8 |
<B D F> | 6 |
<D F H> | 6 |
<F H K> | 6 |
<G I L> | 6 |
<A B D F> | 6 |
<B D F H> | 6 |
<D F H K> | 6 |
<A B D F H> | 6 |
<B D F H K> | 6 |
<A B D F H K> | 6 |
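Step 8 amounts to one more counting pass over the full database, restricted to the global candidates. The sketch below again assumes containment means consecutive intersections and a count threshold of ξ times the number of path sequences; `global_patterns` is a hypothetical name and only a few candidates are used as sample input.

```python
PATH_DB = [tuple(s.split()) for s in [
    "A B D F H K", "A C E F G I L", "A B D F H K M N", "C E F G I L N",
    "A B D F H K", "C E F G I L N", "A B D G I L N", "A B D F H K M",
    "A B D F H K", "E F G I L N", "A B D G I L N", "A B D F H K M N",
]]


def contains(path, pattern):
    """True if `pattern` occurs in `path` as consecutive intersections (assumption)."""
    k = len(pattern)
    return any(path[i:i + k] == pattern for i in range(len(path) - k + 1))


def global_patterns(db, candidates, min_support):
    """Recount each global candidate on the full database and keep the frequent ones."""
    threshold = min_support * len(db)
    counts = {c: sum(contains(p, c) for p in db) for c in candidates}
    return {c: n for c, n in counts.items() if n >= threshold}


# A sample of the global candidate sequence patterns from step 7.
CANDIDATES = [("F",), ("N",), ("C",), ("E",), ("L", "N"), ("I", "L", "N"),
              ("A", "B"), ("A", "B", "D", "F", "H", "K")]
```

With a minimum support of 50% on 12 path sequences the threshold is 6, so locally frequent candidates such as <C> (3 occurrences) and <L N> (5 occurrences) are dropped at this stage.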
Step 9: generate path association rules from the global path sequence patterns produced in step 8 and compute the confidence of each rule, obtaining the vehicle travel path prediction result. The rules are generated from the global path sequence patterns as follows: for an L-path sequence pattern (L > 1), the first n items (1 ≤ n < L) form the rule antecedent and the remaining L-n items form the rule consequent; the confidence of the rule is the ratio of the support of the whole path sequence pattern to the support of the rule antecedent. The path association rules generated and their confidences are listed in the following table:
Path association rule | Confidence |
<A>→<B> | 88.89% |
<B>→<D> | 100% |
<D>→<F> | 75% |
<F>→<H> | 60% |
<G>→<I> | 100% |
<H>→<K> | 100% |
<I>→<L> | 100% |
<A>→<B D> | 88.89% |
<A B>→<D> | 100% |
<B>→<D F> | 75% |
<B D>→<F> | 75% |
<D>→<F H> | 75% |
<D F>→<H> | 100% |
<F>→<H K> | 60% |
<F H>→<K> | 100% |
<G>→<I L> | 100% |
<G I>→<L> | 100% |
<A>→<B D F> | 66.67% |
<A B>→<D F> | 75% |
<A B D>→<F> | 75% |
<B>→<D F H> | 75% |
<B D>→<F H> | 75% |
<B D F>→<H> | 100% |
<D>→<F H K> | 75% |
<D F>→<H K> | 100% |
<D F H>→<K> | 100% |
<A>→<B D F H> | 66.67% |
<A B>→<D F H> | 75% |
<A B D>→<F H> | 75% |
<A B D F>→<H> | 100% |
<B>→<D F H K> | 75% |
<B D>→<F H K> | 75% |
<B D F>→<H K> | 100% |
<B D F H>→<K> | 100% |
<A>→<B D F H K> | 66.67% |
<A B>→<D F H K> | 75% |
<A B D>→<F H K> | 75% |
<A B D F>→<H K> | 100% |
<A B D F H>→<K> | 100% |
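Rule generation in step 9 needs only the global pattern table from step 8. The sketch below splits every pattern of length greater than 1 at each position and divides the pattern's support count by the support count of its prefix. `association_rules` is an illustrative name; the sketch relies on every prefix of a global pattern being present in the table, which holds for the embodiment's data.

```python
# Global path sequence patterns and support counts, copied from the step 8 table.
GLOBAL_PATTERNS = {tuple(k.split()): v for k, v in {
    "A": 9, "B": 8, "D": 8, "F": 10, "G": 6, "H": 6, "I": 6, "K": 6, "L": 6, "N": 7,
    "A B": 8, "B D": 8, "D F": 6, "F H": 6, "G I": 6, "H K": 6, "I L": 6,
    "A B D": 8, "B D F": 6, "D F H": 6, "F H K": 6, "G I L": 6,
    "A B D F": 6, "B D F H": 6, "D F H K": 6,
    "A B D F H": 6, "B D F H K": 6, "A B D F H K": 6,
}.items()}


def association_rules(patterns):
    """Return {(antecedent, consequent): confidence} for all prefix/suffix splits."""
    rules = {}
    for pattern, support in patterns.items():
        for n in range(1, len(pattern)):
            antecedent, consequent = pattern[:n], pattern[n:]
            # Confidence = support(whole pattern) / support(antecedent).
            rules[(antecedent, consequent)] = support / patterns[antecedent]
    return rules


def predict_next(rules, travelled):
    """Rank the consequents of all rules whose antecedent equals `travelled`."""
    matches = [(cons, conf) for (ante, cons), conf in rules.items()
               if ante == tuple(travelled)]
    return sorted(matches, key=lambda item: -item[1])
```

For example, the rule <A>→<B> gets confidence 8/9, about 88.89%. `predict_next(rules, ("A", "B", "D"))` lists the continuations <F>, <F H> and <F H K>, each with confidence 0.75.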
In a specific implementation, steps 1 to 5 can be performed by the master node of the Hadoop platform, step 6 is dispatched by the master node to the Map nodes for execution, and steps 7, 8 and 9 are performed by the Reduce node of the Hadoop platform.
The present invention correspondingly provides a vehicle travel path prediction system, in which the following modules are provided on the basis of the Hadoop platform:
Memory confirmation module, for determining, according to the memory of each computer in the Hadoop platform, the memory of the machine with the least memory among all nodes, denoted Q;
Longest path sequence confirmation module, for scanning the original path sequence database that stores vehicle travel path sequences, denoting the number of path sequences in the original path sequence database as m, each path sequence containing one or more intersections, and denoting the actual storage size of the longest path sequence in the original path sequence database as P;
Sub-path sequence database partitioning module, for evenly dividing the original path sequence database into n disjoint sub-path sequence databases by horizontal partitioning;
Original database upload module, for uploading the original path sequence database to a designated folder of HDFS;
Sub-database upload module, for uploading the n sub-path sequence databases to another designated folder of HDFS;
Local path sequence pattern module, for causing the master node of the Hadoop platform to dispatch the n sub-path sequence databases uploaded by the sub-database upload module to different Map nodes; each Map node executes the improved GSP algorithm, scans the sub-path sequence database held in its memory according to the preset minimum support ξ, computes the local path sequence patterns, and passes them to the Reduce node as <key, value> pairs, where key is a local path sequence pattern and value is the support count of that pattern;
The improved GSP algorithm executed by each Map node is as follows,
Operation a: for the sub-path sequence database assigned to this Map node, scan the sub-path sequence database to obtain the 1-path sequence patterns L1, and set k = 1;
Operation b: generate the candidate (k+1)-path sequences Ck+1 from the k-path sequence patterns Lk, scan the original sequence database again, compute the support of each candidate path sequence, and produce the (k+1)-path sequence patterns Lk+1; the candidate (k+1)-path sequences Ck+1 are generated in one of the following two ways,
(1) when candidate 2-path sequence patterns are generated from the 1-path sequence patterns, scan the adjacency list that stores the traffic network information and check the adjacent nodes of each path sequence pattern s1 in the 1-path sequence patterns L1; if an adjacent node of s1 is also in L1, append the item of that adjacent node to s1;
(2) when candidate (k+1)-path sequence patterns are generated from the k-path sequence patterns, k > 1,
first, for any two path sequence patterns s1 and s2 among the k-path sequence patterns, if the path sequence obtained by removing the first item of s1 is identical to the path sequence obtained by removing the last item of s2, join s1 with s2; then prune: if some sub-path sequence of a candidate path sequence pattern is not a path sequence pattern, delete that candidate from the candidate path sequence patterns;
Operation c: set k = k+1 and repeat operation b until no new candidate path sequence is generated;
Global candidate sequence pattern module, for causing the Reduce node to merge the <key, value> pairs passed over by the Map nodes to obtain the global candidate sequence patterns;
Global path sequence pattern module, for scanning again the original path sequence database that the original database upload module stored in HDFS, counting the global candidate sequence patterns, and finding the sequence patterns whose support is not less than the preset minimum support ξ, to obtain the global path sequence patterns;
Prediction result module, for generating path association rules from the global path sequence patterns produced by the global path sequence pattern module and computing the confidence of each path association rule, to obtain the vehicle travel path prediction result.
The specific embodiment described herein merely illustrates the spirit of the present invention. Those skilled in the art to which the present invention belongs may make various modifications or additions to the described embodiment, or substitute it in similar ways, without departing from the spirit of the present invention or exceeding the scope defined by the appended claims.
Claims (2)
1. A vehicle travel path prediction method, characterized in that the following steps are carried out on the basis of the Hadoop platform,
Step 1: according to the memory of each computer in the Hadoop platform, determine the minimum memory among all nodes, denoted Q, in GB;
Step 2: scan the original path sequence database that stores vehicle travel path sequences; denote the number of path sequences in the original path sequence database as m, each path sequence containing one or more intersections, and denote the actual storage size of the longest path sequence in the original path sequence database as P, in B;
Step 3: evenly divide the original path sequence database into n disjoint sub-path sequence databases by horizontal partitioning, where P × (m/n) ≤ Q × 10^9;
Step 4: upload the original path sequence database to a designated folder of HDFS;
Step 5: upload the n sub-path sequence databases to another designated folder of HDFS;
Step 6: the master node of the Hadoop platform dispatches the n sub-path sequence databases uploaded in step 5 to different Map nodes; each Map node executes the improved GSP algorithm, scans the sub-path sequence database held in its memory according to the preset minimum support ξ, computes the local path sequence patterns, and passes them to the Reduce node as <key, value> pairs, where key is a local path sequence pattern and value is the support count of that pattern;
The improved GSP algorithm executed by each Map node is as follows,
Operation a: for the sub-path sequence database assigned to this Map node, scan the sub-path sequence database to obtain the 1-path sequence patterns L1, and set k = 1;
Operation b: generate the candidate (k+1)-path sequences Ck+1 from the k-path sequence patterns Lk, scan the original path sequence database again, compute the support of each candidate path sequence, and produce the (k+1)-path sequence patterns Lk+1; the candidate (k+1)-path sequences Ck+1 are generated in one of the following two ways,
(1) when candidate 2-path sequence patterns are generated from the 1-path sequence patterns, scan the adjacency list that stores the traffic network information and check the adjacent nodes of each path sequence pattern s1 in the 1-path sequence patterns L1; if an adjacent node of s1 is also in L1, append the item of that adjacent node to s1;
(2) when candidate (k+1)-path sequence patterns are generated from the k-path sequence patterns, k > 1,
first, for any two path sequence patterns s1 and s2 among the k-path sequence patterns, if the path sequence obtained by removing the first item of s1 is identical to the path sequence obtained by removing the last item of s2, join s1 with s2; then prune: if some sub-path sequence of a candidate path sequence pattern is not a path sequence pattern, delete that candidate from the candidate path sequence patterns;
Operation c: set k = k+1 and repeat operation b until no new candidate path sequence is generated;
Step 7: the Reduce node merges the <key, value> pairs passed over by the Map nodes to obtain the global candidate sequence patterns;
Step 8: scan again the original path sequence database that step 4 stored in HDFS, count the global candidate sequence patterns, and find the sequence patterns whose support is not less than the preset minimum support ξ, to obtain the global path sequence patterns;
Step 9: generate path association rules from the global path sequence patterns produced in step 8 and compute the confidence of each path association rule, to obtain the vehicle travel path prediction result.
2. a vehicle running path prognoses system, it is characterised in that: arrange based on Hadoop platform with lower module,
Internal memory confirms module, for according to the internal memory situation of every computer in Hadoop platform, determines internal memory in all nodes
The internal memory of minimum machine, and it is designated as Q, unit is GB;
Longest path sequence confirms module, has the original path sequence library of vehicle running path sequence for scanning storage,
The bar number scale obtaining path sequence in original path sequence library is m bar, and every paths sequence includes more than one crossing, former
In beginning path sequence data base, the actual storage size of longest path sequence is designated as P, and unit is B;
Subpath sequence library divides module, for being averagely divided into by horizontal division mode by original path sequence library
N disjoint subpath sequence library, wherein P × (m/n)≤Q × 109;
Transmission module on raw data base, for uploading to original path sequence library in certain specified folder of HDFS;
Transmission module on subdata base, for uploading to n sub-path sequence data base in another specified folder of HDFS;
Local path sequence pattern module, for the n making the main controlled node of Hadoop platform be uploaded by transmission module on subdata base
Individual sub-path sequence data base is dispatched to different Map nodes, and each Map node performs the GSP algorithm improved, according to setting in advance
Fixed minimum support ξ, scanning is left the subpath sequence library in Map node memory in, is calculated local path sequence
Pattern, with<key, value>to form pass to Reduce node, wherein key is local path sequence pattern, and value is
The support counting of local path sequence pattern;
The improved GSP algorithm executed by each Map node is as follows,
Operation a: for the sub-path-sequence database assigned to this Map node, scan the sub-path-sequence database to obtain the 1-path sequence patterns L_1, and set k = 1,
Operation b: generate the candidate (k+1)-path sequences C_{k+1} from the k-path sequence patterns L_k, scan the original path sequence database again, compute the support of each candidate path sequence, and produce the (k+1)-path sequence patterns L_{k+1}; the candidate (k+1)-path sequences C_{k+1} are generated in one of the following two cases,
(1) if candidate 2-path sequence patterns are generated from the 1-path sequence patterns, scan the adjacency list that stores the traffic network information and check the adjacent nodes of each path sequence pattern s_1 in the 1-path sequence patterns L_1; if an adjacent node of s_1 is also in L_1, append that adjacent node item to s_1;
(2) if candidate (k+1)-path sequence patterns are generated from the k-path sequence patterns, k > 1,
first, for any two path sequence patterns s_1 and s_2 in the k-path sequence patterns, if the path sequence obtained by removing the first item of s_1 is identical to the path sequence obtained by removing the last item of s_2, join s_1 with s_2; then prune: if some subpath sequence of a candidate path sequence pattern is not a path sequence pattern, delete that candidate from the candidate path sequence patterns;
Operation c: set k = k+1 and repeat operation b until no new candidate path sequences are generated;
A global candidate sequence pattern module, for having the Reduce node merge the <key, value> pairs passed over by the Map nodes to obtain the global candidate sequence patterns;
A global path sequence pattern module, for scanning again the original path sequence database that the original database upload module stored in HDFS, counting the global candidate sequence patterns, and finding the sequence patterns whose support is not less than the preset minimum support ξ, thereby obtaining the global path sequence patterns;
A prediction result module, for generating path association rules from the global path sequence patterns produced by the global path sequence pattern module and computing the confidence of each path association rule, to obtain the vehicle travel path prediction result.
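The modules above can be illustrated with a small single-process sketch. It is not the patented Hadoop implementation: the Map and Reduce stages are simulated with ordinary function calls, and several points are assumptions made for illustration only. Support is taken as the number of paths that contain a pattern as a contiguous run of nodes, ξ is treated as a fraction of the database size, and each rule has the form "prefix -> next node" with confidence support(pattern) / support(prefix). All function names (`count_support`, `local_gsp`, `mine_global`, `path_rules`, `predict_next`) and the toy road network are hypothetical.

```python
def count_support(db, pattern):
    """Count the paths in db that contain pattern as a contiguous run of nodes."""
    k = len(pattern)
    return sum(
        any(tuple(path[i:i + k]) == pattern for i in range(len(path) - k + 1))
        for path in db
    )


def local_gsp(db, adjacency, xi):
    """Operations a-c on one sub-database; returns {pattern: support count}."""
    if not 0 < xi <= 1:
        raise ValueError("xi must be in (0, 1]")
    min_count = xi * len(db)  # assumption: xi is a relative support threshold
    patterns = {}
    candidates = {(node,) for path in db for node in path}
    k = 1
    while candidates:
        # Scan the sub-database and keep the candidates that reach the threshold.
        level = set()
        for cand in candidates:
            n = count_support(db, cand)
            if n >= min_count:
                level.add(cand)
                patterns[cand] = n
        if k == 1:
            # Case (1): extend each frequent node with its frequent neighbours.
            candidates = {s + (nb,) for s in level
                          for nb in adjacency.get(s[0], ()) if (nb,) in level}
        else:
            # Case (2): join s1 and s2 when s1 minus its first item equals
            # s2 minus its last item. With contiguous subpaths the only
            # k-subpaths of the result are s1 and s2, so pruning removes nothing.
            candidates = {s1 + (s2[-1],) for s1 in level for s2 in level
                          if s1[1:] == s2[:-1]}
        k += 1
    return patterns


def mine_global(partitions, adjacency, xi):
    """Simulated Map (local mining), Reduce (merge keys), then a global recount."""
    candidates = set()
    for part in partitions:
        candidates.update(local_gsp(part, adjacency, xi))
    full_db = [path for part in partitions for path in part]
    min_count = xi * len(full_db)
    counts = {c: count_support(full_db, c) for c in candidates}
    return {c: n for c, n in counts.items() if n >= min_count}


def path_rules(patterns):
    """Rules (prefix, next_node) -> confidence = sup(pattern) / sup(prefix)."""
    return {(p[:-1], p[-1]): n / patterns[p[:-1]]
            for p, n in patterns.items() if len(p) >= 2}


def predict_next(rules, travelled):
    """Pick the highest-confidence rule whose prefix ends the travelled path."""
    travelled = tuple(travelled)
    matches = {rule: conf for rule, conf in rules.items()
               if travelled[-len(rule[0]):] == rule[0]}
    if not matches:
        return None
    (_, nxt), conf = max(matches.items(), key=lambda kv: kv[1])
    return nxt, conf


# Toy data: a directed road network and two sub-path-sequence databases.
adjacency = {"A": ["B"], "B": ["C", "D"], "C": [], "D": []}
partitions = [
    [["A", "B", "C"], ["A", "B", "C"]],
    [["A", "B", "D"], ["B", "C"]],
]
patterns = mine_global(partitions, adjacency, xi=0.5)
rules = path_rules(patterns)
```

With this toy data, `predict_next(rules, ["A", "B"])` selects node C. Applying the same relative threshold ξ locally and globally means that any pattern frequent in the whole database is frequent in at least one sub-database, which is why merging the local patterns yields a complete set of global candidates before the recount.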
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410628190.1A CN104464344B (en) | 2014-11-07 | 2014-11-07 | A kind of vehicle running path Forecasting Methodology and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410628190.1A CN104464344B (en) | 2014-11-07 | 2014-11-07 | A kind of vehicle running path Forecasting Methodology and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104464344A (en) | 2015-03-25 |
CN104464344B (en) | 2016-09-14 |
Family
ID=52910319
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410628190.1A (Expired - Fee Related) CN104464344B (en) | 2014-11-07 | 2014-11-07 | A kind of vehicle running path Forecasting Methodology and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104464344B (en) |
Families Citing this family (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR101714250B1 (en) * | 2015-10-28 | 2017-03-08 | 현대자동차주식회사 | Method for predicting driving path around the vehicle |
CN106652440B (en) * | 2015-10-30 | 2019-05-21 | 杭州海康威视数字技术股份有限公司 | A kind of determination method and device in the frequent activities region of vehicle |
US10152882B2 (en) * | 2015-11-30 | 2018-12-11 | Nissan North America, Inc. | Host vehicle operation using remote vehicle intention prediction |
CN105716620B (en) * | 2016-03-16 | 2018-03-23 | 沈阳建筑大学 | A kind of air navigation aid based on cloud computing and big data |
CN107316016B (en) * | 2017-06-19 | 2020-06-23 | 桂林电子科技大学 | Vehicle track statistical method based on Hadoop and monitoring video stream |
CN107862868B (en) * | 2017-11-09 | 2019-08-20 | 泰华智慧产业集团股份有限公司 | A method of track of vehicle prediction is carried out based on big data |
CN108717786B (en) * | 2018-07-17 | 2022-06-17 | 南京航空航天大学 | Traffic accident cause mining method based on universality meta-rule |
CN110084402B (en) * | 2019-03-25 | 2022-03-11 | 广东工业大学 | Bus self-adaptive scheduling method based on station optimization and ant tracing |
CN113468245B (en) * | 2021-07-19 | 2023-05-05 | 金陵科技学院 | Dynamic minimum support calculation method for rail transit application |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP4682865B2 (en) * | 2006-02-17 | 2011-05-11 | アイシン・エィ・ダブリュ株式会社 | Route search system, route guidance method in route guidance system, and navigation apparatus |
JP4353192B2 (en) * | 2006-03-02 | 2009-10-28 | トヨタ自動車株式会社 | Course setting method, apparatus, program, and automatic driving system |
CN102509170A (en) * | 2011-10-10 | 2012-06-20 | 浙江鸿程计算机***有限公司 | Location prediction system and method based on historical track data mining |
CN103298059B (en) * | 2013-05-13 | 2015-09-30 | 西安电子科技大学 | The degree of communication perception method for routing of position-based prediction in car self-organization network |
CN103366566B (en) * | 2013-06-25 | 2015-05-06 | 中国科学院信息工程研究所 | Running track prediction method aiming at specific vehicle potential group |
CN103929804A (en) * | 2014-03-20 | 2014-07-16 | 南京邮电大学 | Position predicting method based on user moving rule |
2014
- 2014-11-07 CN CN201410628190.1A patent/CN104464344B/en: not active (Expired - Fee Related)
Also Published As
Publication number | Publication date |
---|---|
CN104464344A (en) | 2015-03-25 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20160914; Termination date: 20171107 |