CN106211071A - Method and system for collecting group activity data based on multi-source spatio-temporal trajectory data - Google Patents
- Publication number
- CN106211071A, CN201610517438.6A
- Authority
- CN
- China
- Prior art keywords
- data
- activity
- time
- activity point
- trajectory
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W4/00—Services specially adapted for wireless communication networks; Facilities therefor
- H04W4/02—Services making use of location information
- H04W4/029—Location-based management or tracking services
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W64/00—Locating users or terminals or network equipment for network management purposes, e.g. mobility management
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a method and system for collecting group activity data based on multi-source spatio-temporal trajectory data. The method includes: a backend obtains raw mobile terminal signaling data and raw social software check-in data and preprocesses them, generating to-be-processed signaling data and to-be-processed check-in data that conform to a specific format; the backend extracts activity point trajectory data from the to-be-processed signaling data; prior information on group activity patterns is built and learned; activity location data is obtained from the activity point trajectory data; and the backend, based on the activity point trajectory data, the prior information on group activity patterns, and the activity location data, labels the activity point trajectories with semantic information using a Bayesian model and generates spatio-temporal activity trajectory chains. The invention uses a Bayesian model to infer individual activities and takes into account the influence of the activity type at the previous moment on the activity type at the next moment in a spatio-temporal activity trajectory, achieving large-scale, accurate, fast, and efficient extraction and collection of massive group activity data.
Description
Technical field
The present invention relates to the technical field of data processing, and in particular to a method and system for collecting group activity data based on multi-source spatio-temporal trajectory data.
Background art
Traditional activity collection methods rely on activity diaries or activity surveys; sample sizes are small, collection takes a long time, and the work is time-consuming and labor-intensive. The explosion of spatio-temporal trajectory data provides a new means of collecting activity data for large groups. Existing research on spatio-temporal data analysis focuses mainly on identifying individual activities in real space, especially travel activities, and lacks extraction of basic activity attribute information. A group activity extraction method that fuses multi-source spatio-temporal trajectory data needs to be developed, to lay a data foundation for urban science research based on massive activity data. Although spatio-temporal trajectory data (such as mobile phone signaling data, vehicle GPS data, and social check-in data) contains rich temporal and positional information, its semantic information is relatively scarce and its spatial and temporal resolutions differ, so it cannot directly provide group activity information.
Therefore, the prior art still needs to be improved and developed.
Summary of the invention
In view of the deficiencies of the prior art, the purpose of the present invention is to provide a method and system for collecting group activity data based on multi-source spatio-temporal trajectory data.
The technical solution of the invention is as follows:
A method for collecting group activity data based on multi-source spatio-temporal trajectory data, wherein the method includes:
A. The backend obtains raw mobile terminal signaling data and raw social software check-in data and preprocesses each of them, generating the corresponding to-be-processed signaling data and to-be-processed check-in data that conform to a specific format;
B. The backend extracts activity points from the to-be-processed signaling data using preset temporal and spatial rules, obtaining activity point trajectory data; builds and learns prior information on group activity patterns from the check-in category information in the to-be-processed check-in data; and obtains activity location data from the activity point trajectory data;
C. The backend, based on the activity point trajectory data, the prior information on group activity patterns, and the activity location data, labels the activity point trajectories with semantic information using a Bayesian model and generates spatio-temporal activity trajectory chains.
In the described method for collecting group activity data based on multi-source spatio-temporal trajectory data, step A specifically includes:
A1. The backend obtains the raw mobile terminal signaling data and performs quality cleaning on it: removing duplicate data, removing data with missing attributes, removing data whose time or location is outside the predefined range, and removing the data of users whose number of points is below or above certain thresholds, generating preprocessed signaling data;
A2. The backend obtains the raw social software check-in data and performs quality cleaning on it: removing duplicate data, removing data with missing attributes, removing data whose time or location is outside the study range, removing the data of users whose number of check-ins falls within a certain range, and removing the data of users who checked in at only one place, generating preprocessed check-in data;
A3. The spatial resolution of the preprocessed signaling data and the preprocessed check-in data is converted to the resolution of a predetermined regular grid, generating the corresponding to-be-processed signaling data and to-be-processed check-in data.
In the described method for collecting group activity data based on multi-source spatio-temporal trajectory data, extracting activity points from the to-be-processed signaling data in step B using preset temporal and spatial rules to obtain activity point trajectory data specifically includes:
B11. The backend obtains the to-be-processed signaling data and sorts it by person and time according to a specific time rule, obtaining each person's time-ordered trajectory;
B12. From the person's time-ordered trajectory, the times at which the person enters and leaves each specific position are calculated; each position the person enters in turn is set as an activity point, and the first position the person enters is set as the first activity point in the activity point trajectory;
B13. The spatial distance and time difference between each point in the time-ordered trajectory and the existing activity point are calculated; if the spatial distance is less than a set threshold and the time difference is less than a set threshold, the point is added to that activity point; otherwise, the point is set as a new activity point; this continues until all points in the time-ordered trajectory have been processed, yielding the candidate activity point trajectory;
B14. For each candidate activity point in the candidate activity point trajectory, when the difference between its entry time and departure time is detected to be less than a second set threshold, the candidate activity point is removed from the candidate activity point trajectory; the activity point trajectory data is then generated.
In the described method for collecting group activity data based on multi-source spatio-temporal trajectory data, building and learning prior information on group activity patterns in step B from the check-in category information in the to-be-processed check-in data specifically includes:
B21. From the check-in categories of the social check-in platform and the total amount of user check-in data in different time periods of a day, the intensity probability distribution of different group activities within a day is calculated;
B22. From the users' check-in data, the activity transition probability distribution of different group activities at different times is calculated;
B23. From the users' check-in data, the probability distribution of different group activities being carried out in different areas is calculated.
In the described method for collecting group activity data based on multi-source spatio-temporal trajectory data, obtaining activity location data from the activity point trajectory data in step B specifically includes:
B31. Time identification windows for a person's activity locations are preset, denoted the first activity window and the second activity window;
B32. The person's activity point trajectory data is obtained, and the duration of each activity point is matched against the first activity window and the second activity window; if the duration of an activity point falls within an activity window and accounts for more than 50% of the total length of that window, the activity location corresponding to that window is taken as a candidate activity position for this activity point;
B33. The candidate activity position with the longest matching time is taken as the user's activity location data.
In the described method for collecting group activity data based on multi-source spatio-temporal trajectory data, step C specifically includes:
C1. According to the Bayesian model, given the position, the time, and the activity type at the previous moment, a probability formula is generated for carrying out a certain type of activity at the next moment;
C2. For each activity point in the activity point trajectory data, the probability of engaging in each activity is calculated, and the activity with the maximum probability is labeled as the maximum-probability activity type of that activity point;
C3. After all activity points in the activity point trajectory data have been labeled, the spatio-temporal activity trajectory chain is output.
A system for collecting group activity data based on multi-source spatio-temporal trajectory data, wherein the system includes:
A preprocessing module, used by the backend to obtain raw mobile terminal signaling data and raw social software check-in data and to preprocess each of them, generating the corresponding to-be-processed signaling data and to-be-processed check-in data that conform to a specific format;
An activity location data acquisition module, used by the backend to extract activity points from the to-be-processed signaling data using preset temporal and spatial rules, obtaining activity point trajectory data; to build and learn prior information on group activity patterns from the check-in category information in the to-be-processed check-in data; and to obtain activity location data from the activity point trajectory data;
A semantic labeling module, used by the backend to label the activity point trajectories with semantic information using a Bayesian model, based on the activity point trajectory data, the prior information on group activity patterns, and the activity location data, and to generate spatio-temporal activity trajectory chains.
In the described system for collecting group activity data based on multi-source spatio-temporal trajectory data, the preprocessing module specifically includes:
A signaling data processing unit, used by the backend to obtain the raw mobile terminal signaling data and perform quality cleaning on it: removing duplicate data, removing data with missing attributes, removing data whose time or location is outside the predefined range, and removing the data of users whose number of points is below or above certain thresholds, generating preprocessed signaling data;
A check-in data processing unit, used by the backend to obtain the raw social software check-in data and perform quality cleaning on it: removing duplicate data, removing data with missing attributes, removing data whose time or location is outside the study range, removing the data of users whose number of check-ins falls within a certain range, and removing the data of users who checked in at only one place, generating preprocessed check-in data;
A resolution conversion unit, used to convert the spatial resolution of the preprocessed signaling data and the preprocessed check-in data to the resolution of a predetermined regular grid, generating the corresponding to-be-processed signaling data and to-be-processed check-in data.
In the described system for collecting group activity data based on multi-source spatio-temporal trajectory data, the activity location data acquisition module specifically includes:
A sorting unit, used by the backend to obtain the to-be-processed signaling data and sort it by person and time according to a specific time rule, obtaining each person's time-ordered trajectory;
An activity point marking unit, used to calculate, from the person's time-ordered trajectory, the times at which the person enters and leaves each specific position, to set each position the person enters in turn as an activity point, and to set the first position the person enters as the first activity point in the activity point trajectory;
A candidate activity point trajectory generation unit, used to calculate the spatial distance and time difference between each point in the time-ordered trajectory and the existing activity point; if the spatial distance is less than a set threshold and the time difference is less than a set threshold, the point is added to that activity point; otherwise, the point is set as a new activity point; this continues until all points in the time-ordered trajectory have been processed, yielding the candidate activity point trajectory;
An activity point trajectory data processing unit, used to examine each candidate activity point in the candidate activity point trajectory; when the difference between its entry time and departure time is detected to be less than a second set threshold, the candidate activity point is removed from the candidate activity point trajectory, and the activity point trajectory data is then generated;
A first probability calculation unit, used to calculate the intensity probability distribution of different group activities within a day from the check-in categories of the social check-in platform and the total amount of user check-in data in different time periods of a day;
A second probability calculation unit, used to calculate, from the users' check-in data, the activity transition probability distribution of different group activities at different times;
A third probability calculation unit, used to calculate, from the users' check-in data, the probability distribution of different group activities being carried out in different areas;
A presetting unit, used to preset the time identification windows for a person's activity locations, denoted the first activity window and the second activity window;
A candidate activity position determination unit, used to obtain the person's activity point trajectory data and match the duration of each activity point against the first activity window and the second activity window; if the duration of an activity point falls within an activity window and accounts for more than 50% of the total length of that window, the activity location corresponding to that window is taken as a candidate activity position for this activity point;
An activity location data acquisition unit, used to take the candidate activity position with the longest matching time as the user's activity location data.
In the described system for collecting group activity data based on multi-source spatio-temporal trajectory data, the semantic labeling module specifically includes:
A fourth probability calculation unit, used to generate, according to the Bayesian model and given the position, the time, and the activity type at the previous moment, a probability formula for carrying out a certain type of activity at the next moment;
A maximum-probability activity type marking unit, used to calculate, for each activity point in the activity point trajectory data, the probability of engaging in each activity, and to label the activity with the maximum probability as the maximum-probability activity type of that activity point;
A spatio-temporal activity trajectory chain generation unit, used to output the spatio-temporal activity trajectory chain after all activity points in the activity point trajectory data have been labeled.
The invention provides a method and system for collecting group activity data based on multi-source spatio-temporal trajectory data. The invention uses a Bayesian model to infer individual activities and takes into account the influence of the activity type at the previous moment on the activity type at the next moment in a spatio-temporal activity trajectory, achieving large-scale, accurate, fast, and efficient extraction and collection of massive group activity data.
Brief description of the drawings
Fig. 1 is a flowchart of a preferred embodiment of the method for collecting group activity data based on multi-source spatio-temporal trajectory data according to the invention.
Fig. 2 is a functional block diagram of a preferred embodiment of the system for collecting group activity data based on multi-source spatio-temporal trajectory data according to the invention.
Detailed description of the invention
To make the purpose, technical solution, and effect of the invention clearer and more definite, the invention is described in further detail below. It should be understood that the specific embodiments described herein only explain the invention and do not limit it.
The invention provides a flowchart of a preferred embodiment of a method for collecting group activity data based on multi-source spatio-temporal trajectory data, as shown in Fig. 1, wherein the method includes:
Step S100: the backend obtains raw mobile terminal signaling data and raw social software check-in data and preprocesses each of them, generating the corresponding to-be-processed signaling data and to-be-processed check-in data that conform to a specific format. The mobile terminal is preferably a mobile phone.
In a further embodiment, step S100 specifically includes:
Step S101: the backend obtains the raw mobile terminal signaling data and performs quality cleaning on it: removing duplicate data, removing data with missing attributes, removing data whose time or location is outside the predefined range, and removing the data of users whose number of points is below or above certain thresholds, generating preprocessed signaling data;
Step S102: the backend obtains the raw social software check-in data and performs quality cleaning on it: removing duplicate data, removing data with missing attributes, removing data whose time or location is outside the study range, removing the data of users whose number of check-ins falls within a certain range, and removing the data of users who checked in at only one place, generating preprocessed check-in data;
Step S103: the spatial resolution of the preprocessed signaling data and the preprocessed check-in data is converted to the resolution of a predetermined regular grid, generating the corresponding to-be-processed signaling data and to-be-processed check-in data.
In a specific implementation, the mobile phone signaling data and the social check-in data are preprocessed so that they meet the requirements of subsequent processing. This includes:
Performing quality cleaning on the mobile phone signaling data, including removing duplicate data, removing data with missing attributes, removing data whose time or location is outside the study range, and removing the data of users whose number of points is below or above certain thresholds. The choice of thresholds depends on the specific data type, data format, and data quality. Preferably, users with fewer than 3 points per day or more than 100 points per day are removed.
Performing quality cleaning on the social check-in data, including removing duplicate data, removing data with missing attributes, and removing data whose time or location is outside the study range; removing the data of users with fewer than 2 or more than 100 check-ins; and removing the data of users who checked in at only one place;
For multi-source spatio-temporal trajectory data, the influence of spatial resolution must be considered. The spatial resolution of the mobile phone signaling data and the social check-in data is uniformly converted to the scale of a regular grid. The cell size of the regular grid generally depends on the spatial resolution of the two kinds of data themselves. The preferred cell size is 500 m × 500 m.
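The cleaning and grid-conversion steps above can be sketched as follows. This is only a minimal illustration, not the patent's implementation: the `Record` layout, the use of projected coordinates in metres, timestamps in seconds, and the per-user-per-day reading of the point-count thresholds are all assumptions made for the example; `clean` and `to_grid` are hypothetical names.

```python
from collections import Counter
from typing import NamedTuple, Optional

DAY = 86400.0  # seconds per day


class Record(NamedTuple):
    user: str
    t: Optional[float]  # seconds since the start of the study period (assumed)
    x: Optional[float]  # projected easting in metres (assumed)
    y: Optional[float]  # projected northing in metres (assumed)


def clean(records, t_range, x_range, y_range, min_pts=3, max_pts=100):
    """Quality cleaning: duplicates, missing attributes, range, point counts."""
    seen, kept = set(), []
    for r in records:
        if None in r:          # a missing attribute
            continue
        if r in seen:          # an exact duplicate
            continue
        if not (t_range[0] <= r.t <= t_range[1]
                and x_range[0] <= r.x <= x_range[1]
                and y_range[0] <= r.y <= y_range[1]):
            continue           # outside the study time or space range
        seen.add(r)
        kept.append(r)
    # Assumed interpretation: count points per user per day and keep only
    # user-days whose count lies within [min_pts, max_pts].
    per_day = Counter((r.user, int(r.t // DAY)) for r in kept)
    return [r for r in kept
            if min_pts <= per_day[(r.user, int(r.t // DAY))] <= max_pts]


def to_grid(r, cell=500.0):
    """Map a record to the (column, row) index of a regular grid cell."""
    return (int(r.x // cell), int(r.y // cell))
```

Both data sources would be passed through `to_grid` with the same `cell` value so that signaling points and check-ins share one spatial index.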
Step S200: the backend extracts activity points from the to-be-processed signaling data using preset temporal and spatial rules, obtaining activity point trajectory data; builds and learns prior information on group activity patterns from the check-in category information in the to-be-processed check-in data; and obtains activity location data from the activity point trajectory data.
Further, extracting activity points from the to-be-processed signaling data in step S200 using preset temporal and spatial rules to obtain activity point trajectory data specifically includes:
Step S211: the backend obtains the to-be-processed signaling data and sorts it by person and time according to a specific time rule, obtaining each person's time-ordered trajectory;
Step S212: from the person's time-ordered trajectory, the times at which the person enters and leaves each specific position are calculated; each position the person enters in turn is set as an activity point, and the first position the person enters is set as the first activity point in the activity point trajectory;
Step S213: the spatial distance and time difference between each point in the time-ordered trajectory and the existing activity point are calculated; if the spatial distance is less than a set threshold and the time difference is less than a set threshold, the point is added to that activity point; otherwise, the point is set as a new activity point; this continues until all points in the time-ordered trajectory have been processed, yielding the candidate activity point trajectory;
Step S214: for each candidate activity point in the candidate activity point trajectory, when the difference between its entry time and departure time is detected to be less than a second set threshold, the candidate activity point is removed from the candidate activity point trajectory; the activity point trajectory data is then generated.
In a specific implementation, a person's activity point trajectory is obtained from the processed mobile phone signaling data by extracting that person's activity points. Activity points are extracted mainly by applying set temporal and spatial rules, as follows:
The mobile phone signaling data is sorted by person and time, obtaining each person's time-ordered trajectory;
Using the person's time-ordered trajectory, the times of entering and leaving each position (grid cell) are calculated, and the first position is set as the first activity point in the activity point trajectory;
Moving forward in time, the spatial distance and time difference between each point in the time-ordered trajectory and the activity point in the existing activity point trajectory are calculated; if the spatial distance is less than a set threshold and the time difference is less than a set threshold, the point is merged into that activity point; otherwise, the point is set as a new activity point; this continues until all points in the time-ordered trajectory have been processed, yielding the candidate activity point trajectory. The spatial distance threshold is preferably in the range of 500 m to 1000 m.
For each candidate activity point in the candidate activity point trajectory, if the difference between its entry time and departure time is less than a certain threshold, the point is not considered an activity point and is removed from the candidate activity point trajectory, finally yielding the activity point trajectory. Preferably, this threshold is in the range of 1 hour to 3 hours.
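A compact sketch of the extraction rule described above, for one person's trajectory, might look like this. It assumes points are `(t, x, y)` tuples with time in seconds and projected coordinates in metres, and that the distance is measured to the position of the most recent activity point; the time-gap threshold `gap_thr` is not given a value in the text, so the one-hour default is purely illustrative, as are the names `ActivityPoint` and `extract_activity_points`.

```python
import math
from dataclasses import dataclass


@dataclass
class ActivityPoint:
    x: float      # position of the first point merged into this activity point
    y: float
    enter: float  # entry time, seconds
    leave: float  # departure time, seconds


def extract_activity_points(track, d_thr=500.0, gap_thr=3600.0, min_stay=3600.0):
    """track: iterable of (t, x, y) tuples for one person."""
    candidates = []
    for t, x, y in sorted(track):  # time-ordered trajectory
        cur = candidates[-1] if candidates else None
        if (cur is not None
                and math.hypot(x - cur.x, y - cur.y) < d_thr
                and t - cur.leave < gap_thr):
            cur.leave = t  # merge the point into the existing activity point
        else:
            candidates.append(ActivityPoint(x, y, t, t))  # new activity point
    # Drop candidates whose stay (departure minus entry) is below the threshold.
    return [p for p in candidates if p.leave - p.enter >= min_stay]
```

With the preferred values from the text, `d_thr` would be chosen between 500 and 1000 (metres) and `min_stay` between 3600 and 10800 (seconds).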
Further, building and learning prior information on group activity patterns in step S200 from the check-in category information in the to-be-processed check-in data specifically includes:
Step S221: from the check-in categories of the social check-in platform and the total amount of user check-in data in different time periods of a day, the intensity probability distribution of different group activities within a day is calculated;
Step S222: from the users' check-in data, the activity transition probability distribution of different group activities at different times is calculated;
Step S223: from the users' check-in data, the probability distribution of different group activities being carried out in different areas is calculated.
In a specific implementation, the rich check-in category information contained in the processed social check-in data is used to build and learn prior information on group activity patterns. The specific method is as follows:
From the check-in categories provided by the social check-in platform and the total amount of user check-in data in different time periods of a day, the intensity probability distribution of different group activities within a day, $\Pr(AT_i \mid t)$, is calculated, expressed as:
$$\Pr(AT_i \mid t) = \frac{\mathrm{checkins}(AT_i, t)}{\sum_{t} \mathrm{checkins}(AT_i, t)}$$
where $AT_i$ denotes activity type $i$, $\mathrm{checkins}(AT_i, t)$ is the number of check-ins of activity type $i$ at moment $t$, and $\sum_{t} \mathrm{checkins}(AT_i, t)$ is the number of check-ins of activity type $i$ over all moments of a day.
From the users' check-in trajectories, the activity transition probability distribution of different group activities at different times is calculated, expressed as $\Pr(AT_i, t \mid AT_j, t-1)$, where $i$ and $j$ denote activity categories and $t$ denotes time. $(AT_i, t)$ refers to check-ins of activity type $i$ at moment $t$, and $(AT_j, t-1)$ refers to check-ins of activity type $j$ at moment $t-1$. The probability $\Pr(AT_i, t \mid AT_j, t-1)$ is the probability of engaging in activity $i$ at moment $t$ given that activity $j$ was engaged in at the previous moment $t-1$; $\Pr(X)$ denotes the probability distribution of event $X$;
From the users' check-in trajectories, the probability distribution of different group activities being carried out in different grid cells is calculated, expressed as $\Pr(Grid_m \mid AT_i, t)$, where $m$ is the grid index, $Grid_m$ denotes the $m$-th grid cell, $i$ is the activity category, and $t$ is the time.
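The three priors described above can be estimated by simple counting, as in the sketch below. Several points are assumptions made for illustration rather than details from the text: check-ins are given as `(user, hour, grid, activity)` tuples, each user's check-ins are treated as one day ordered by hour, the transition counts are normalised over the current activity for a fixed hour and previous activity, and the location counts are normalised over grid cells for a fixed activity and hour. The function name `learn_priors` is hypothetical.

```python
from collections import Counter, defaultdict


def learn_priors(checkins):
    """checkins: iterable of (user, hour, grid, activity) tuples."""
    by_type_hour = Counter()  # checkins(AT_i, t)
    by_grid = Counter()       # check-ins per (grid, activity, hour)
    per_user = defaultdict(list)
    for user, hour, grid, act in checkins:
        by_type_hour[(act, hour)] += 1
        by_grid[(grid, act, hour)] += 1
        per_user[user].append((hour, act))

    # Intensity: checkins(AT_i, t) divided by the day total for AT_i.
    type_total = Counter()
    for (act, _hour), n in by_type_hour.items():
        type_total[act] += n
    intensity = {(a, h): n / type_total[a] for (a, h), n in by_type_hour.items()}

    # Transition: consecutive check-ins of the same user, keyed (cur, hour, prev).
    trans = Counter()
    for seq in per_user.values():
        seq.sort()
        for (_, prev), (hour, cur) in zip(seq, seq[1:]):
            trans[(cur, hour, prev)] += 1
    trans_total = Counter()
    for (_cur, hour, prev), n in trans.items():
        trans_total[(hour, prev)] += n
    transition = {k: n / trans_total[(k[1], k[2])] for k, n in trans.items()}

    # Location: share of check-ins of (activity, hour) that fall in each grid.
    location = {(g, a, h): n / by_type_hour[(a, h)]
                for (g, a, h), n in by_grid.items()}
    return intensity, transition, location
```

The returned dictionaries are keyed `(activity, hour)`, `(current, hour, previous)`, and `(grid, activity, hour)` respectively, mirroring the three distributions defined in the text.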
In a further embodiment, obtaining activity location data from the activity point trajectory data in step S200 specifically includes:
Step S231: time identification windows for a person's activity locations are preset, denoted the first activity window and the second activity window;
Step S232: the person's activity point trajectory data is obtained, and the duration of each activity point is matched against the first activity window and the second activity window; if the duration of an activity point falls within an activity window and accounts for more than 50% of the total length of that window, the activity location corresponding to that window is taken as a candidate activity position for this activity point;
Step S233: the candidate activity position with the longest matching time is taken as the user's activity location data.
In a specific implementation, a person's home and work activities are detected from the activity point trajectory data obtained above. The specific method is as follows:
Based on common knowledge, identification windows are set for home activity and work activity: 00:00 to 07:00 and 09:00 to 17:00 respectively;
For a person's activity point trajectory data, the duration of each activity point is matched against the two identification windows above; if the duration of the activity point falls within an identification window and accounts for more than 50% of the total length of that window, the match is considered successful and the point is taken as a candidate home or work activity position;
The home or work activity position with the longest matching time is taken as the user's home or work activity position; if no match succeeds, no home or work activity position is considered to have been found for this user.
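The window-matching rule above can be illustrated with the following sketch. It assumes activity points are `(grid, enter_hour, leave_hour)` tuples within a single day, and it measures the match as the overlap between the stay and the window; both are simplifying assumptions, and `detect_home_work` is a hypothetical name.

```python
# Identification windows in hours of the day, as given in the text.
WINDOWS = {"home": (0.0, 7.0), "work": (9.0, 17.0)}


def detect_home_work(points, windows=WINDOWS, min_share=0.5):
    """points: iterable of (grid, enter_hour, leave_hour) within one day."""
    best = {name: (None, 0.0) for name in windows}
    for grid, enter, leave in points:
        for name, (start, end) in windows.items():
            # Time the stay spends inside the window.
            overlap = max(0.0, min(leave, end) - max(enter, start))
            # Candidate only if it covers more than half of the window;
            # keep the candidate with the longest matching time.
            if overlap > min_share * (end - start) and overlap > best[name][1]:
                best[name] = (grid, overlap)
    return {name: grid for name, (grid, _) in best.items()}
```

A value of `None` for a key corresponds to the case in which no home or work activity position is found for the user.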
Step S300: the backend, based on the activity point trajectory data, the prior information on group activity patterns, and the activity location data, labels the activity point trajectories with semantic information using a Bayesian model and generates spatio-temporal activity trajectory chains.
Using the activity point trajectories, the prior information on group activity timing, and the home and work activity information obtained above, the activity point trajectories are labeled with semantic information based on the Bayesian model. The labeled activity information mainly includes home, work, and other (for example, entertainment, shopping, study, leisure, or travel), yielding the spatio-temporal activity trajectory chain.
The resulting spatio-temporal activity trajectory chains are important for research on urban planning and on dynamic changes in urban functional areas. Based on changes in spatio-temporal activity, adjustments and predictions can be made quickly for dynamic changes in urban functional areas that have already been planned.
In a further embodiment, step S300 specifically includes:
Step S301: according to the Bayesian model, given the position, the time, and the activity type at the previous moment, a probability formula is generated for carrying out a certain type of activity at the next moment;
Step S302: for each activity point in the activity point trajectory data, the probability of engaging in each activity is calculated, and the activity with the maximum probability is labeled as the maximum-probability activity type of that activity point;
Step S303: after all activity points in the activity point trajectory data have been labeled, the spatio-temporal activity trajectory chain is output.
In a specific implementation, according to the Bayesian model, given the specific position, the time, and the activity type at the previous moment, the probability of carrying out a given type of activity at the next moment is the posterior $\Pr(AT_i \mid Grid_m, t, AT_j)$ given by formula (1), where m is the grid index, j is the activity type at the previous moment, t is the current time, and i is the activity type at the current time.
For $\Pr(Grid_m \mid AT_i, t, AT_j)$, $AT_j$ and $Grid_m$ are taken to be conditionally independent, so this term simplifies to:
$$\Pr(Grid_m \mid AT_i, t, AT_j) = \Pr(Grid_m \mid AT_i, t) \tag{2}$$
The term $\Pr(AT_i \mid t, AT_j)$ can be rewritten as:
$$\Pr(AT_i \mid t, AT_j) = \Pr(AT_{i,t} \mid AT_{j,t-1}) \tag{3}$$
Combining formulas (2) and (3), formula (1) becomes:
$$\Pr(AT_i \mid Grid_m, t, AT_j) \propto \Pr(Grid_m \mid AT_i, t)\,\Pr(AT_{i,t} \mid AT_{j,t-1})\,\Pr(AT_j \mid t) \tag{5}$$
The activity points of the trajectory are fed into formula (5) in sequence, the probability of each activity type is calculated, and the activity type of maximum probability is taken as the label of that activity point;
In particular, for a grid position already labelled with the home or work activity type, $\Pr(Grid_m \mid AT_i, t)$ is set to 1 and $AT_{j,t-1}$ is set to $AT_{home}$ or $AT_{working}$, and labelling continues with the next activity point. This is repeated until every activity point in the activity point trajectory has been labelled, and the activity space-time chain is output. $AT_{home}$ denotes the at-home activity type and $AT_{working}$ denotes the at-work activity type.
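The labelling loop of steps S301 to S303 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `label_trajectory`, the dictionary layouts of `p_grid`, `p_trans`, and `p_time`, and the `first_prev` starting activity are all hypothetical. Each score is the right-hand side of formula (5); the factor Pr(AT_j | t) does not depend on the candidate activity i, so it is kept only for fidelity to the formula and does not change which activity wins.

```python
ACTIVITIES = ["home", "work", "other"]

def label_trajectory(points, p_grid, p_trans, p_time, known, first_prev="home"):
    """Label each activity point with its most probable activity type.

    points:  list of (grid_id, hour) in time order
    p_grid:  {(grid_id, activity, hour): Pr(Grid_m | AT_i, t)}
    p_trans: {(prev_activity, activity, hour): Pr(AT_i,t | AT_j,t-1)}
    p_time:  {(prev_activity, hour): Pr(AT_j | t)}
    known:   {grid_id: "home" or "work"} for already detected locations
    first_prev: assumed activity before the first point (hypothetical choice)
    """
    labels = []
    prev = first_prev
    for grid, hour in points:
        if grid in known:
            # Special case: the grid is a detected home or work location.
            label = known[grid]
        else:
            scores = {
                act: p_grid.get((grid, act, hour), 0.0)
                * p_trans.get((prev, act, hour), 0.0)
                * p_time.get((prev, hour), 0.0)
                for act in ACTIVITIES
            }
            label = max(scores, key=scores.get)
        labels.append(label)
        prev = label  # the label becomes AT_j for the next activity point
    return labels
```

Missing dictionary entries are treated as probability zero here; a real system would more likely smooth the priors so that unseen combinations are not ruled out entirely.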
The activity point trajectory extraction method depends on the specific data type and on the spatial and temporal resolution of the data, and is not limited to the method described in the present invention;
The home and work activity detection method is constrained by the observation duration of the space-time data, and the choice of thresholds is not limited to the method described in the present invention;
Building and learning the prior information on group activity patterns is not limited to social media check-in data; resident survey data, GPS trajectory data, volunteer data, and similar sources may also be used.
The present invention proposes a new group activity collection method based on multi-source space-time trajectory data. It uses a Bayesian model to infer individual activities, addresses the problems of existing methods (time-consuming and laborious, costly, small sample sizes), and achieves large-scale, accurate, fast, and efficient extraction and collection of massive group activity data. The group activity inference of the present invention considers not only the constraints that factors such as time and position in urban space place on human activity, but also the influence of the activity type at the previous moment on the activity type at the next moment in the spatio-temporal activity trajectory, so that activities are inferred within the human spatio-temporal activity chain.
The present invention also provides a functional block diagram of a preferred embodiment of a group activity data collection system based on multi-source space-time trajectory data. As shown in Fig. 2, the system includes:
Preprocessing module 100, used by the backend to obtain raw mobile terminal signaling data and raw social software check-in data and to preprocess each of them, generating the corresponding to-be-processed signaling data and to-be-processed check-in data in a specific format; as described in the method embodiment.
Activity location data acquisition module 200, used by the backend to extract activity points from the to-be-processed signaling data according to preset time and space rules, obtaining activity point trajectory data; to build and learn prior information on group activity patterns from the check-in category information in the to-be-processed check-in data; and to obtain activity location data from the activity point trajectory data; as described in the method embodiment.
Semantic labelling module 300, used by the backend to label the activity point trajectory with semantic information using a Bayesian model, based on the activity point trajectory data, the prior information on group activity patterns, and the activity location data, generating the activity space-time trajectory chain; as described in the method embodiment.
In the group activity data collection system based on multi-source space-time trajectory data, the preprocessing module specifically includes:
Signaling data processing unit, used by the backend to obtain the raw mobile terminal signaling data and perform quality cleaning on it: removing duplicate records, removing records with missing attributes, removing records whose time or position is outside the predefined range, and removing the data of users whose number of points is below or above given thresholds, generating preprocessed signaling data; as described in the method embodiment.
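The four cleaning rules of the signaling data processing unit can be sketched as a single pass followed by a per-user count filter. The record layout (dictionaries with `user`, `t`, `lon`, `lat`), the function name `clean_signaling`, and every threshold are assumptions made for illustration; the source does not specify them.

```python
from collections import Counter

def clean_signaling(records, bbox, t_range, min_points, max_points):
    """Apply the quality-cleaning rules to raw signaling records.

    records: list of dicts with keys "user", "t", "lon", "lat" (assumed layout)
    bbox:    (min_lon, min_lat, max_lon, max_lat) of the predefined area
    t_range: (t_min, t_max) of the predefined observation period
    """
    min_lon, min_lat, max_lon, max_lat = bbox
    t_min, t_max = t_range
    seen = set()
    kept = []
    for r in records:
        key = (r.get("user"), r.get("t"), r.get("lon"), r.get("lat"))
        if None in key:                      # missing attribute
            continue
        if key in seen:                      # duplicate record
            continue
        _, t, lon, lat = key
        if not (t_min <= t <= t_max):        # outside the time range
            continue
        if not (min_lon <= lon <= max_lon and min_lat <= lat <= max_lat):
            continue                         # outside the spatial range
        seen.add(key)
        kept.append(r)
    # Drop users with too few or too many remaining points.
    counts = Counter(r["user"] for r in kept)
    return [r for r in kept if min_points <= counts[r["user"]] <= max_points]
```

Counting points after the per-record filters, rather than before, is a design choice of this sketch; the source does not state the order.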
Check-in data processing unit, used by the backend to obtain the raw social software check-in data and perform quality cleaning on it: removing duplicate records, removing records with missing attributes, removing records whose time or position is outside the study range, removing the data of users whose check-in count falls within a certain range, and removing the data of users who checked in at only one place, generating preprocessed check-in data; as described in the method embodiment.
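Of the check-in cleaning rules, the one specific to this unit is the removal of users who checked in at only one place. A short sketch of that single rule follows; the record layout (dictionaries with `user` and `venue`) and the function name `drop_single_place_users` are hypothetical.

```python
from collections import defaultdict

def drop_single_place_users(checkins):
    """Remove all check-ins of users who only ever checked in at one venue.

    checkins: list of dicts with keys "user" and "venue" (assumed layout)
    """
    venues = defaultdict(set)
    for c in checkins:
        venues[c["user"]].add(c["venue"])
    return [c for c in checkins if len(venues[c["user"]]) > 1]
```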
Resolution conversion unit, used to convert the spatial resolution of the preprocessed signaling data and the preprocessed check-in data to the resolution of a regular grid of predetermined scale, generating the corresponding to-be-processed signaling data and to-be-processed check-in data; as described in the method embodiment.
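Converting both data sources to a common regular grid amounts to mapping each coordinate to a cell index. A minimal sketch, assuming a grid defined in coordinate units with a lower-left origin and row-major cell numbering (the function name `to_grid_id` and all parameters are hypothetical; the source fixes neither the grid scale nor the numbering scheme):

```python
import math

def to_grid_id(lon, lat, origin_lon, origin_lat, cell_size, n_cols, n_rows):
    """Map a coordinate to a row-major cell index of a regular grid.

    Returns None when the coordinate falls outside the grid.
    """
    col = math.floor((lon - origin_lon) / cell_size)
    row = math.floor((lat - origin_lat) / cell_size)
    if not (0 <= col < n_cols and 0 <= row < n_rows):
        return None
    return row * n_cols + col
```

Signaling records and check-in records mapped through the same function share grid indices, which is what lets the check-in priors be looked up for a signaling-derived activity point.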
In the group activity data collection system based on multi-source space-time trajectory data, the activity location data acquisition module specifically includes:
Sorting unit, used by the backend to obtain the to-be-processed signaling data and sort it by person and by time according to a specific time rule, obtaining each person's time-ordered trajectory; as described in the method embodiment.
Activity point labelling unit, used to calculate, from a person's time-ordered trajectory, the times at which the person enters and leaves specific positions, to set each position the person enters as an activity point in turn, and to set the first position the person enters as the first activity point in the activity point trajectory; as described in the method embodiment.
Candidate activity point trajectory generation unit, used to calculate the spatial distance and the time difference between each point in the time-ordered trajectory and the existing activity point; if the spatial distance is less than a set threshold and the time difference is less than a set threshold, the point is added to the activity point; otherwise the point is set as a new activity point; this continues until all points in the time-ordered trajectory have been processed, yielding the candidate activity point trajectory; as described in the method embodiment.
Activity point trajectory data processing unit, used to obtain the candidate activity points in the candidate activity point trajectory; when the difference between the entry time and the leaving time of a candidate activity point is detected to be less than a second set threshold, the corresponding candidate activity point is removed from the candidate activity point trajectory, and the activity point trajectory data is then generated; as described in the method embodiment.
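The four units above (sorting, activity point labelling, candidate generation, short-stay removal) can be sketched for one person as a single function. Several choices here are assumptions rather than details from the source: coordinates are taken to be projected metres, "the existing activity point" is read as the most recent one, an activity point keeps the position of its first record, and the name `build_activity_points` and all thresholds are hypothetical.

```python
import math

def build_activity_points(records, max_dist, max_gap, min_stay):
    """Extract one person's activity points from signaling records.

    records:  iterable of (t, x, y); t in seconds, x and y in metres
    max_dist: spatial threshold for merging a record into an activity point
    max_gap:  time threshold for merging a record into an activity point
    min_stay: second threshold; shorter stays are discarded
    """
    points = []
    for t, x, y in sorted(records):              # time-ordered trajectory
        if points:
            cur = points[-1]                     # most recent activity point
            dist = math.hypot(x - cur["x"], y - cur["y"])
            gap = t - cur["leave"]
            if dist < max_dist and gap < max_gap:
                cur["leave"] = t                 # merge: extend the stay
                continue
        # First record, or too far / too late: start a new activity point.
        points.append({"x": x, "y": y, "enter": t, "leave": t})
    # Remove candidates whose stay is shorter than the second threshold.
    return [p for p in points if p["leave"] - p["enter"] >= min_stay]
```

For example, with `max_dist=100`, `max_gap=300`, and `min_stay=100`, three records within about 20 m of each other over two minutes form one retained activity point, while a single isolated record is dropped as a pass-through.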
First probability calculation unit, used to calculate the intensity probability distribution of different group activities over a day, from the check-in categories of the social check-in platform and the total volume of users' check-in data in different time periods of the day; as described in the method embodiment.
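One plausible reading of this unit is an empirical estimate of Pr(AT | t): the share of each activity category among all check-ins made in a given hour. The sketch below follows that reading; normalising per hour rather than per activity is an assumption, as are the function name `activity_given_hour` and the input layout.

```python
from collections import Counter

def activity_given_hour(checkins):
    """Estimate Pr(activity | hour) from check-in counts.

    checkins: list of (activity, hour) pairs, one per check-in
    Returns {(activity, hour): probability}.
    """
    joint = Counter(checkins)
    per_hour = Counter(hour for _, hour in checkins)
    return {(act, hour): n / per_hour[hour] for (act, hour), n in joint.items()}
```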
Second probability calculation unit, used to calculate, from users' check-in data, the activity transition probability distribution of different group activities at different times; as described in the method embodiment.
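A transition distribution of this kind can be estimated by counting consecutive check-ins of the same user. The sketch below keys each transition by the hour of the later check-in and normalises over all transitions leaving the same previous activity at that hour; both conventions, the function name `transition_probs`, and the input layout are assumptions.

```python
from collections import Counter, defaultdict

def transition_probs(checkins):
    """Estimate Pr(activity at t | previous activity) from check-in sequences.

    checkins: list of (user, hour, activity)
    Returns {(prev_activity, activity, hour): probability}.
    """
    by_user = defaultdict(list)
    for user, hour, act in checkins:
        by_user[user].append((hour, act))
    pair = Counter()
    total = Counter()
    for seq in by_user.values():
        seq.sort()                                   # time order per user
        for (_, prev), (hour, cur) in zip(seq, seq[1:]):
            pair[(prev, cur, hour)] += 1
            total[(prev, hour)] += 1
    return {k: n / total[(k[0], k[2])] for k, n in pair.items()}
```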
Third probability calculation unit, used to calculate, from users' check-in data, the probability distribution of different group activities being carried out in different areas; as described in the method embodiment.
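For use in a Bayesian labelling step, the spatial prior is most convenient in the form Pr(Grid_m | AT_i, t): among all check-ins of one activity type in one hour, the share that fall in each grid cell. The sketch assumes that form; the function name `grid_given_activity` and the input layout are hypothetical.

```python
from collections import Counter

def grid_given_activity(checkins):
    """Estimate Pr(grid | activity, hour) from gridded check-ins.

    checkins: list of (grid_id, activity, hour)
    Returns {(grid_id, activity, hour): probability}.
    """
    joint = Counter(checkins)
    per_act = Counter((act, hour) for _, act, hour in checkins)
    return {(g, act, hour): n / per_act[(act, hour)]
            for (g, act, hour), n in joint.items()}
```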
Presetting unit, used to preset the time recognition windows for a person's activity locations, denoted the first activity window and the second activity window; as described in the method embodiment.
Candidate activity position determination unit, used to obtain a person's activity point trajectory data and match the duration of each activity point against the first activity window and the second activity window; if the duration of an activity point falls within an activity window and accounts for more than 50% of the total length of that window, the activity location corresponding to that window is taken as a candidate activity position for the activity point; as described in the method embodiment.
Activity location data acquisition unit, used to take the candidate activity position with the longest matching time as the user's activity location data; as described in the method embodiment.
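The window-matching rule of the last three units can be sketched as an overlap test. The sketch assumes a single observed day, times expressed in hours, and windows that do not wrap past midnight; the function name `detect_location` and the example windows (00:00 to 06:00 for home, 09:00 to 18:00 for work) are hypothetical, since the source does not give the window values.

```python
def detect_location(stays, window):
    """Return the grid cell that best matches an activity window, or None.

    stays:  list of (grid_id, enter, leave), times in hours of one day
    window: (start, end) of the activity window, in hours
    """
    w_start, w_end = window
    w_len = w_end - w_start
    best_grid, best_overlap = None, 0.0
    for grid, enter, leave in stays:
        # Time the stay spends inside the window.
        overlap = max(0.0, min(leave, w_end) - max(enter, w_start))
        # Candidate only if it covers more than 50% of the window.
        if overlap > 0.5 * w_len and overlap > best_overlap:
            best_grid, best_overlap = grid, overlap
    return best_grid
```

Calling the function once with the first activity window and once with the second yields the two labelled locations, for example `detect_location(stays, (0, 6))` for home and `detect_location(stays, (9, 18))` for work under the assumed windows.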
In the group activity data collection system based on multi-source space-time trajectory data, the semantic labelling module specifically includes:
Fourth probability calculation unit, used to generate, according to the Bayesian model and given the position, the time, and the activity type at the previous moment, the probability formula for carrying out a given type of activity at the next moment; as described in the method embodiment.
Maximum-probability activity type labelling unit, used to calculate, for each activity point in the activity point trajectory data, the probability of each activity type, and to label the activity point with the activity type of maximum probability; as described in the method embodiment.
Activity space-time trajectory chain generation unit, used to output the activity space-time trajectory chain after all activity points in the activity point trajectory data have been labelled; as described in the method embodiment.
In summary, the present invention provides a group activity data collection method and system based on multi-source space-time trajectory data. The method includes: the backend obtains raw mobile terminal signaling data and raw social software check-in data and preprocesses them, generating to-be-processed signaling data and to-be-processed check-in data in a specific format; the backend obtains activity point trajectory data from the to-be-processed signaling data; prior information on group activity patterns is built and learned; activity location data is obtained from the activity point trajectory data; and the backend labels the activity point trajectory with semantic information using a Bayesian model, based on the activity point trajectory data, the prior information on group activity patterns, and the activity location data, generating the activity space-time trajectory chain. The present invention uses a Bayesian model to infer individual activities and takes into account the influence of the activity type at the previous moment on the activity type at the next moment in the spatio-temporal activity trajectory, achieving large-scale, accurate, fast, and efficient extraction and collection of massive group activity data.
It should be understood that the application of the present invention is not limited to the above examples. Those of ordinary skill in the art may make improvements or variations based on the above description, and all such improvements and variations shall fall within the scope of protection of the appended claims of the present invention.
Claims (10)
1. A group activity data collection method based on multi-source space-time trajectory data, characterized in that the method includes:
A. the backend obtains raw mobile terminal signaling data and raw social software check-in data and preprocesses each of them, generating the corresponding to-be-processed signaling data and to-be-processed check-in data in a specific format;
B. the backend extracts activity points from the to-be-processed signaling data according to preset time and space rules, obtaining activity point trajectory data; builds and learns prior information on group activity patterns from the check-in category information in the to-be-processed check-in data; and obtains activity location data from the activity point trajectory data;
C. the backend labels the activity point trajectory with semantic information using a Bayesian model, based on the activity point trajectory data, the prior information on group activity patterns, and the activity location data, generating the activity space-time trajectory chain.
2. The group activity data collection method based on multi-source space-time trajectory data according to claim 1, characterized in that step A specifically includes:
A1. the backend obtains the raw mobile terminal signaling data and performs quality cleaning on it: removing duplicate records, removing records with missing attributes, removing records whose time or position is outside the predefined range, and removing the data of users whose number of points is below or above given thresholds, generating preprocessed signaling data;
A2. the backend obtains the raw social software check-in data and performs quality cleaning on it: removing duplicate records, removing records with missing attributes, removing records whose time or position is outside the study range, removing the data of users whose check-in count falls within a certain range, and removing the data of users who checked in at only one place, generating preprocessed check-in data;
A3. the spatial resolution of the preprocessed signaling data and the preprocessed check-in data is converted to the resolution of a regular grid of predetermined scale, generating the corresponding to-be-processed signaling data and to-be-processed check-in data.
3. The group activity data collection method based on multi-source space-time trajectory data according to claim 1, characterized in that, in step B, extracting activity points from the to-be-processed signaling data according to preset time and space rules to obtain activity point trajectory data specifically includes:
B11. the backend obtains the to-be-processed signaling data and sorts it by person and by time according to a specific time rule, obtaining each person's time-ordered trajectory;
B12. from a person's time-ordered trajectory, the times at which the person enters and leaves specific positions are calculated, each position the person enters is set as an activity point in turn, and the first position the person enters is set as the first activity point in the activity point trajectory;
B13. the spatial distance and the time difference between each point in the time-ordered trajectory and the existing activity point are calculated; if the spatial distance is less than a set threshold and the time difference is less than a set threshold, the point is added to the activity point; otherwise the point is set as a new activity point; this continues until all points in the time-ordered trajectory have been processed, yielding the candidate activity point trajectory;
B14. the candidate activity points in the candidate activity point trajectory are obtained; when the difference between the entry time and the leaving time of a candidate activity point is detected to be less than a second set threshold, the corresponding candidate activity point is removed from the candidate activity point trajectory, and the activity point trajectory data is then generated.
4. The group activity data collection method based on multi-source space-time trajectory data according to claim 3, characterized in that, in step B, building and learning prior information on group activity patterns from the check-in category information in the to-be-processed check-in data specifically includes:
B21. the intensity probability distribution of different group activities over a day is calculated from the check-in categories of the social check-in platform and the total volume of users' check-in data in different time periods of the day;
B22. the activity transition probability distribution of different group activities at different times is calculated from users' check-in data;
B23. the probability distribution of different group activities being carried out in different areas is calculated from users' check-in data.
5. The group activity data collection method based on multi-source space-time trajectory data according to claim 4, characterized in that, in step B, obtaining activity location data from the activity point trajectory data specifically includes:
B31. the time recognition windows for a person's activity locations are preset, denoted the first activity window and the second activity window;
B32. a person's activity point trajectory data is obtained, and the duration of each activity point is matched against the first activity window and the second activity window; if the duration of an activity point falls within an activity window and accounts for more than 50% of the total length of that window, the activity location corresponding to that window is taken as a candidate activity position for the activity point;
B33. the candidate activity position with the longest matching time is taken as the user's activity location data.
6. The group activity data collection method based on multi-source space-time trajectory data according to claim 5, characterized in that step C specifically includes:
C1. according to the Bayesian model, given the position, the time, and the activity type at the previous moment, the probability formula for carrying out a given type of activity at the next moment is generated;
C2. for each activity point in the activity point trajectory data, the probability of each activity type is calculated, and the activity point is labelled with the activity type of maximum probability;
C3. after all activity points in the activity point trajectory data have been labelled, the activity space-time trajectory chain is output.
7. A group activity data collection system based on multi-source space-time trajectory data, characterized in that the system includes:
a preprocessing module, used by the backend to obtain raw mobile terminal signaling data and raw social software check-in data and to preprocess each of them, generating the corresponding to-be-processed signaling data and to-be-processed check-in data in a specific format;
an activity location data acquisition module, used by the backend to extract activity points from the to-be-processed signaling data according to preset time and space rules, obtaining activity point trajectory data; to build and learn prior information on group activity patterns from the check-in category information in the to-be-processed check-in data; and to obtain activity location data from the activity point trajectory data;
a semantic labelling module, used by the backend to label the activity point trajectory with semantic information using a Bayesian model, based on the activity point trajectory data, the prior information on group activity patterns, and the activity location data, generating the activity space-time trajectory chain.
8. The group activity data collection system based on multi-source space-time trajectory data according to claim 7, characterized in that the preprocessing module specifically includes:
a signaling data processing unit, used by the backend to obtain the raw mobile terminal signaling data and perform quality cleaning on it: removing duplicate records, removing records with missing attributes, removing records whose time or position is outside the predefined range, and removing the data of users whose number of points is below or above given thresholds, generating preprocessed signaling data;
a check-in data processing unit, used by the backend to obtain the raw social software check-in data and perform quality cleaning on it: removing duplicate records, removing records with missing attributes, removing records whose time or position is outside the study range, removing the data of users whose check-in count falls within a certain range, and removing the data of users who checked in at only one place, generating preprocessed check-in data;
a resolution conversion unit, used to convert the spatial resolution of the preprocessed signaling data and the preprocessed check-in data to the resolution of a regular grid of predetermined scale, generating the corresponding to-be-processed signaling data and to-be-processed check-in data.
9. The group activity data collection system based on multi-source space-time trajectory data according to claim 8, characterized in that the activity location data acquisition module specifically includes:
a sorting unit, used by the backend to obtain the to-be-processed signaling data and sort it by person and by time according to a specific time rule, obtaining each person's time-ordered trajectory;
an activity point labelling unit, used to calculate, from a person's time-ordered trajectory, the times at which the person enters and leaves specific positions, to set each position the person enters as an activity point in turn, and to set the first position the person enters as the first activity point in the activity point trajectory;
a candidate activity point trajectory generation unit, used to calculate the spatial distance and the time difference between each point in the time-ordered trajectory and the existing activity point; if the spatial distance is less than a set threshold and the time difference is less than a set threshold, the point is added to the activity point; otherwise the point is set as a new activity point; this continues until all points in the time-ordered trajectory have been processed, yielding the candidate activity point trajectory;
an activity point trajectory data processing unit, used to obtain the candidate activity points in the candidate activity point trajectory; when the difference between the entry time and the leaving time of a candidate activity point is detected to be less than a second set threshold, the corresponding candidate activity point is removed from the candidate activity point trajectory, and the activity point trajectory data is then generated;
a first probability calculation unit, used to calculate the intensity probability distribution of different group activities over a day, from the check-in categories of the social check-in platform and the total volume of users' check-in data in different time periods of the day;
a second probability calculation unit, used to calculate, from users' check-in data, the activity transition probability distribution of different group activities at different times;
a third probability calculation unit, used to calculate, from users' check-in data, the probability distribution of different group activities being carried out in different areas;
a presetting unit, used to preset the time recognition windows for a person's activity locations, denoted the first activity window and the second activity window;
a candidate activity position determination unit, used to obtain a person's activity point trajectory data and match the duration of each activity point against the first activity window and the second activity window; if the duration of an activity point falls within an activity window and accounts for more than 50% of the total length of that window, the activity location corresponding to that window is taken as a candidate activity position for the activity point;
an activity location data acquisition unit, used to take the candidate activity position with the longest matching time as the user's activity location data.
10. The group activity data collection system based on multi-source space-time trajectory data according to claim 9, characterized in that the semantic labelling module specifically includes:
a fourth probability calculation unit, used to generate, according to the Bayesian model and given the position, the time, and the activity type at the previous moment, the probability formula for carrying out a given type of activity at the next moment;
a maximum-probability activity type labelling unit, used to calculate, for each activity point in the activity point trajectory data, the probability of each activity type, and to label the activity point with the activity type of maximum probability;
an activity space-time trajectory chain generation unit, used to output the activity space-time trajectory chain after all activity points in the activity point trajectory data have been labelled.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610517438.6A CN106211071B (en) | 2016-07-04 | 2016-07-04 | Group activity method of data capture and system based on multi-source space-time trajectory data |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610517438.6A CN106211071B (en) | 2016-07-04 | 2016-07-04 | Group activity method of data capture and system based on multi-source space-time trajectory data |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106211071A (en) | 2016-12-07 |
CN106211071B (en) | 2019-05-21 |
Family
ID=57464652
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610517438.6A Active CN106211071B (en) | 2016-07-04 | 2016-07-04 | Group activity method of data capture and system based on multi-source space-time trajectory data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106211071B (en) |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107169260A (en) * | 2017-03-23 | 2017-09-15 | 四川省公安厅 | Based on space-time track isomerous multi-source resonance data system and method |
CN107274058A (en) * | 2017-05-10 | 2017-10-20 | 福建海峡中创网络信息技术股份有限公司 | A kind of determination methods of mechanics |
CN108597224A (en) * | 2018-05-02 | 2018-09-28 | 深圳市数字城市工程研究中心 | A kind of recognition methods to be improved the traffic conditions and system based on space-time trajectory data |
CN108629000A (en) * | 2018-05-02 | 2018-10-09 | 深圳市数字城市工程研究中心 | A kind of the group behavior feature extracting method and system of mobile phone track data cluster |
CN109918395A (en) * | 2019-02-19 | 2019-06-21 | 北京明略软件***有限公司 | One kind of groups method for digging and device |
CN110543457A (en) * | 2019-09-11 | 2019-12-06 | 北京明略软件***有限公司 | Track type document processing method and device, storage medium and electronic device |
CN111275969A (en) * | 2020-02-15 | 2020-06-12 | 湖南大学 | Vehicle track filling method based on intelligent identification of road environment |
CN112069573A (en) * | 2020-08-24 | 2020-12-11 | 深圳大学 | City group space simulation method, system and equipment based on cellular automaton |
CN112070304A (en) * | 2020-09-09 | 2020-12-11 | 深圳大学 | City group element interaction measuring method, equipment and storage medium |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7373524B2 (en) * | 2004-02-24 | 2008-05-13 | Covelight Systems, Inc. | Methods, systems and computer program products for monitoring user behavior for a server application |
CN102880719A (en) * | 2012-10-16 | 2013-01-16 | 四川大学 | User trajectory similarity mining method for location-based social network |
CN104750829A (en) * | 2015-04-01 | 2015-07-01 | 华中科技大学 | User position classifying method and system based on signing in features |
CN104750751A (en) * | 2013-12-31 | 2015-07-01 | 华为技术有限公司 | Method and device for annotating trace data |
CN105243148A (en) * | 2015-10-25 | 2016-01-13 | 西华大学 | Checkin data based spatial-temporal trajectory similarity measurement method and system |
Non-Patent Citations (1)
Title |
---|
JINZHOU CAO: "Exploring the distribution and dynamics of functional regions using mobile phone data and social media data", 《CUPUM》 * |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107169260A (en) * | 2017-03-23 | 2017-09-15 | 四川省公安厅 | Based on space-time track isomerous multi-source resonance data system and method |
CN107169260B (en) * | 2017-03-23 | 2021-05-11 | 四川省公安厅 | Heterogeneous multi-source data resonance system and method based on space-time trajectory |
CN107274058A (en) * | 2017-05-10 | 2017-10-20 | 福建海峡中创网络信息技术股份有限公司 | A kind of determination methods of mechanics |
CN108597224B (en) * | 2018-05-02 | 2020-05-19 | 深圳市数字城市工程研究中心 | Method and system for identifying to-be-improved traffic facilities based on space-time trajectory data |
CN108629000A (en) * | 2018-05-02 | 2018-10-09 | 深圳市数字城市工程研究中心 | A kind of the group behavior feature extracting method and system of mobile phone track data cluster |
CN108597224A (en) * | 2018-05-02 | 2018-09-28 | 深圳市数字城市工程研究中心 | A kind of recognition methods to be improved the traffic conditions and system based on space-time trajectory data |
CN109918395A (en) * | 2019-02-19 | 2019-06-21 | 北京明略软件***有限公司 | One kind of groups method for digging and device |
CN110543457A (en) * | 2019-09-11 | 2019-12-06 | 北京明略软件***有限公司 | Track type document processing method and device, storage medium and electronic device |
CN111275969A (en) * | 2020-02-15 | 2020-06-12 | 湖南大学 | Vehicle track filling method based on intelligent identification of road environment |
CN111275969B (en) * | 2020-02-15 | 2022-02-25 | 湖南大学 | Vehicle track filling method based on intelligent identification of road environment |
CN112069573A (en) * | 2020-08-24 | 2020-12-11 | 深圳大学 | City group space simulation method, system and equipment based on cellular automaton |
CN112070304A (en) * | 2020-09-09 | 2020-12-11 | 深圳大学 | City group element interaction measuring method, equipment and storage medium |
CN112070304B (en) * | 2020-09-09 | 2021-05-18 | 深圳大学 | City group element interaction measuring method, equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN106211071B (en) | 2019-05-21 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN106211071A (en) | Group activity method of data capture based on multi-source space-time trajectory data and system | |
CN106448233B (en) | Public bus network timetable cooperative optimization method based on big data | |
CN108955693B (en) | Road network matching method and system | |
CN111681421B (en) | Mobile phone signaling data-based external passenger transport hub centralized-sparse space distribution analysis method | |
CN102799897B (en) | Computer recognition method of GPS (Global Positioning System) positioning-based transportation mode combined travelling | |
CN103533501B (en) | A kind of geography fence generation method | |
CN108761509B (en) | Automobile driving track and mileage prediction method based on historical data | |
CN102087788B (en) | Method for estimating traffic state parameter based on confidence of speed of float car | |
CN105740904B (en) | A kind of trip based on DBSCAN clustering algorithm and activity pattern recognition methods | |
CN104504099B (en) | Traffic trip state cutting method based on location track | |
CN105788260A (en) | Public transportation passenger OD calculation method based on intelligent public transportation system data | |
CN105809962A (en) | Traffic trip mode splitting method based on mobile phone data | |
CN105117789A (en) | Resident trip mode comprehensive judging method based on handset signaling data | |
CN108629000A (en) | A kind of the group behavior feature extracting method and system of mobile phone track data cluster | |
CN104766473A (en) | Traffic trip feature extraction method based on multi-mode public transport data matching | |
CN110836675B (en) | Decision tree-based automatic driving search decision method | |
CN110796337A (en) | System for evaluating accessibility of urban bus station service | |
CN103116702A (en) | Bicycle-mode traveling selection forecasting method based on activity chain mode | |
CN113780665B (en) | Private car stay position prediction method and system based on enhanced recurrent neural network | |
CN111144281A (en) | Urban rail transit OD passenger flow estimation method based on machine learning | |
CN105893352A (en) | Air quality early-warning and monitoring analysis system based on big data of social network | |
CN108171974A (en) | A kind of traffic trip mode discrimination method based on cellular triangulation location data | |
CN112512032A (en) | Mobile phone signaling data-based external trip crowd identification method | |
CN115100012A (en) | Method for calculating walking accessibility of rail transit station | |
CN102999789A (en) | Digital city safety precaution method based on semi-hidden-markov model |
Legal Events
Code | Title | Date | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |