CN106211071A - Method and system for collecting group activity data based on multi-source spatio-temporal trajectory data - Google Patents

Method and system for collecting group activity data based on multi-source spatio-temporal trajectory data

Info

Publication number
CN106211071A
Authority
CN
China
Prior art keywords
data
activity
time
moving point
track
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610517438.6A
Other languages
Chinese (zh)
Other versions
CN106211071B (en
Inventor
涂伟
曹劲舟
李清泉
乐阳
曹瑞
王振声
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen University
Original Assignee
Shenzhen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen University
Priority to CN201610517438.6A
Publication of CN106211071A
Application granted
Publication of CN106211071B
Active legal status
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/02 Services making use of location information
    • H04W4/029 Location-based management or tracking services
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W64/00 Locating users or terminals or network equipment for network management purposes, e.g. mobility management

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method and system for collecting group activity data based on multi-source spatio-temporal trajectory data. The method includes: a backend obtains raw mobile terminal signaling data and raw social-software check-in data and preprocesses them, generating to-be-processed signaling data and to-be-processed check-in data that conform to a specific format; the backend extracts activity point trajectory data from the to-be-processed signaling data; prior information on group activity patterns is built and learned; activity location data are obtained from the activity point trajectory data; and the backend, according to the activity point trajectory data, the prior information on group activity patterns, and the activity location data, labels the activity point trajectories with semantic information using a Bayesian model, generating activity spatio-temporal trajectory chains. The invention uses a Bayesian model to infer individual activities and takes into account the influence of the activity type at the previous moment on the activity type at the next moment within a spatio-temporal activity trajectory, thereby achieving large-scale, accurate, fast, and efficient extraction and collection of massive group activity data.

Description

Method and system for collecting group activity data based on multi-source spatio-temporal trajectory data
Technical field
The present invention relates to the technical field of data processing, and in particular to a method and system for collecting group activity data based on multi-source spatio-temporal trajectory data.
Background
Traditional activity collection methods rely on activity diaries or activity surveys; their sample sizes are small and collection takes a long time, making them time-consuming and labor-intensive. The explosive growth of spatio-temporal trajectory data provides a new means of collecting activity data for large groups. Existing research on spatio-temporal data analysis focuses mainly on identifying individual activities in physical space, especially travel activities, and lacks extraction of basic activity attribute information. A group activity extraction method that fuses multi-source spatio-temporal trajectory data needs to be developed, to lay a data foundation for urban science based on massive activity research. Although spatio-temporal trajectory data (such as mobile phone signaling data, vehicle GPS data, and social check-in data) contain rich temporal and positional information, they are relatively lacking in semantic information and differ in spatio-temporal resolution, so they cannot directly provide group activity information.
Therefore, the prior art still needs to be improved and developed.
Summary of the invention
In view of the deficiencies of the prior art, the object of the present invention is to provide a method and system for collecting group activity data based on multi-source spatio-temporal trajectory data.
The technical solution of the present invention is as follows:
A method for collecting group activity data based on multi-source spatio-temporal trajectory data, wherein the method includes:
A. The backend obtains raw mobile terminal signaling data and raw social-software check-in data, preprocesses each of them, and generates corresponding to-be-processed signaling data and to-be-processed check-in data that conform to a specific format;
B. The backend extracts activity points from the to-be-processed signaling data according to preset temporal and spatial rules, obtaining activity point trajectory data; builds and learns prior information on group activity patterns from the check-in category information in the to-be-processed check-in data; and obtains activity location data from the activity point trajectory data;
C. The backend, according to the activity point trajectory data, the prior information on group activity patterns, and the activity location data, labels the activity point trajectories with semantic information using a Bayesian model and generates activity spatio-temporal trajectory chains.
The method for collecting group activity data based on multi-source spatio-temporal trajectory data, wherein step A specifically includes:
A1. The backend obtains the raw mobile terminal signaling data and performs quality cleaning on it: removing duplicate data, removing data with missing attributes, removing data whose time or location falls outside a predefined range, and removing data of users whose number of points is below or above certain thresholds, thereby generating preprocessed signaling data;
A2. The backend obtains the raw social-software check-in data and performs quality cleaning on it: removing duplicate data, removing data with missing attributes, removing data whose time or location falls outside the study range, removing data of users whose number of check-ins falls within a certain range, and removing data of users who checked in at only one place, thereby generating preprocessed check-in data;
A3. The spatial resolution of the preprocessed signaling data and the preprocessed check-in data is converted to the resolution of a predefined regular grid, generating the corresponding to-be-processed signaling data and to-be-processed check-in data.
The method for collecting group activity data based on multi-source spatio-temporal trajectory data, wherein in step B, extracting activity points from the to-be-processed signaling data according to preset temporal and spatial rules to obtain activity point trajectory data specifically includes:
B11. The backend obtains the to-be-processed signaling data and sorts it by person and time according to a specific time rule, obtaining each person's time-ordered trajectory;
B12. From the person's time-ordered trajectory, the times at which the person enters and leaves each specific position are calculated; each position the person enters is set as an activity point in turn, and the first position the person enters is set as the first activity point in the activity point trajectory;
B13. For every point in the time-ordered trajectory, the spatial distance and the time difference to the existing activity point are calculated; if the spatial distance is less than a set threshold and the time difference is less than a set threshold, the point is added to that activity point; otherwise, the point is set as a new activity point; this continues until all points in the time-ordered trajectory have been processed, yielding a candidate activity point trajectory;
B14. The candidate activity points in the candidate activity point trajectory are obtained; when the difference between the entry time and the departure time of a candidate activity point is detected to be less than a second set threshold, that candidate activity point is removed from the candidate activity point trajectory, after which the activity point trajectory data are generated.
The method for collecting group activity data based on multi-source spatio-temporal trajectory data, wherein in step B, building and learning prior information on group activity patterns from the check-in category information in the to-be-processed check-in data specifically includes:
B21. From the check-in categories of the social check-in platform and the total amount of user check-in data in the different time periods of a day, the intensity probability distribution of the different group activities over a day is calculated;
B22. From the users' check-in data, the activity transition probability distribution of the different group activities at different times is calculated;
B23. From the users' check-in data, the probability distribution of the different group activities being carried out in different areas is calculated.
The method for collecting group activity data based on multi-source spatio-temporal trajectory data, wherein in step B, obtaining activity location data from the activity point trajectory data specifically includes:
B31. Time identification windows for a person's activity locations are preset and denoted as a first activity window and a second activity window;
B32. The person's activity point trajectory data are obtained, and the duration of each activity point is matched against the first activity window and the second activity window respectively; if the duration of the activity point falls within an activity window and accounts for more than 50% of the total length of that window, the activity location corresponding to that window is taken as a candidate activity location for the activity point;
B33. The candidate activity location with the longest matching time is taken as the user's activity location data.
The method for collecting group activity data based on multi-source spatio-temporal trajectory data, wherein step C specifically includes:
C1. According to the Bayesian model, given the position, the time, and the activity type at the previous moment, a probability formula for carrying out a certain type of activity at the next moment is generated;
C2. For each activity point in the activity point trajectory data, the probabilities of engaging in the different activities are calculated, and the activity with the maximum probability is labeled as the maximum-probability activity type of that activity point;
C3. After all activity points in the activity point trajectory data have been labeled, the activity spatio-temporal trajectory chain is output.
A system for collecting group activity data based on multi-source spatio-temporal trajectory data, wherein the system includes:
A preprocessing module, configured for the backend to obtain raw mobile terminal signaling data and raw social-software check-in data, preprocess each of them, and generate corresponding to-be-processed signaling data and to-be-processed check-in data that conform to a specific format;
An activity location data acquisition module, configured for the backend to extract activity points from the to-be-processed signaling data according to preset temporal and spatial rules, obtaining activity point trajectory data; to build and learn prior information on group activity patterns from the check-in category information in the to-be-processed check-in data; and to obtain activity location data from the activity point trajectory data;
A semantic labeling module, configured for the backend to label the activity point trajectories with semantic information using a Bayesian model, according to the activity point trajectory data, the prior information on group activity patterns, and the activity location data, and to generate activity spatio-temporal trajectory chains.
The system for collecting group activity data based on multi-source spatio-temporal trajectory data, wherein the preprocessing module specifically includes:
A signaling data processing unit, configured for the backend to obtain the raw mobile terminal signaling data and perform quality cleaning on it: removing duplicate data, removing data with missing attributes, removing data whose time or location falls outside a predefined range, and removing data of users whose number of points is below or above certain thresholds, thereby generating preprocessed signaling data;
A check-in data processing unit, configured for the backend to obtain the raw social-software check-in data and perform quality cleaning on it: removing duplicate data, removing data with missing attributes, removing data whose time or location falls outside the study range, removing data of users whose number of check-ins falls within a certain range, and removing data of users who checked in at only one place, thereby generating preprocessed check-in data;
A resolution conversion unit, configured to convert the spatial resolution of the preprocessed signaling data and the preprocessed check-in data to the resolution of a predefined regular grid, generating the corresponding to-be-processed signaling data and to-be-processed check-in data.
The system for collecting group activity data based on multi-source spatio-temporal trajectory data, wherein the activity location data acquisition module specifically includes:
A sorting unit, configured for the backend to obtain the to-be-processed signaling data and sort it by person and time according to a specific time rule, obtaining each person's time-ordered trajectory;
An activity point marking unit, configured to calculate, from the person's time-ordered trajectory, the times at which the person enters and leaves each specific position, to set each position the person enters as an activity point in turn, and to set the first position the person enters as the first activity point in the activity point trajectory;
A candidate activity point trajectory generation unit, configured to calculate, for every point in the time-ordered trajectory, the spatial distance and the time difference to the existing activity point; if the spatial distance is less than a set threshold and the time difference is less than a set threshold, the point is added to that activity point; otherwise, the point is set as a new activity point; this continues until all points in the time-ordered trajectory have been processed, yielding a candidate activity point trajectory;
An activity point trajectory data processing unit, configured to obtain the candidate activity points in the candidate activity point trajectory and, when the difference between the entry time and the departure time of a candidate activity point is detected to be less than a second set threshold, to remove that candidate activity point from the candidate activity point trajectory, after which the activity point trajectory data are generated;
A first probability calculation unit, configured to calculate the intensity probability distribution of the different group activities over a day from the check-in categories of the social check-in platform and the total amount of user check-in data in the different time periods of a day;
A second probability calculation unit, configured to calculate, from the users' check-in data, the activity transition probability distribution of the different group activities at different times;
A third probability calculation unit, configured to calculate, from the users' check-in data, the probability distribution of the different group activities being carried out in different areas;
A presetting unit, configured to preset the time identification windows for a person's activity locations, denoted as a first activity window and a second activity window;
A candidate activity location determination unit, configured to obtain the person's activity point trajectory data and match the duration of each activity point against the first activity window and the second activity window respectively; if the duration of the activity point falls within an activity window and accounts for more than 50% of the total length of that window, the activity location corresponding to that window is taken as a candidate activity location for the activity point;
An activity location data acquisition unit, configured to take the candidate activity location with the longest matching time as the user's activity location data.
The system for collecting group activity data based on multi-source spatio-temporal trajectory data, wherein the semantic labeling module specifically includes:
A fourth probability calculation unit, configured to generate, according to the Bayesian model and given the position, the time, and the activity type at the previous moment, a probability formula for carrying out a certain type of activity at the next moment;
A maximum-probability activity type marking unit, configured to calculate, for each activity point in the activity point trajectory data, the probabilities of engaging in the different activities, and to label the activity with the maximum probability as the maximum-probability activity type of that activity point;
An activity spatio-temporal trajectory chain generation unit, configured to output the activity spatio-temporal trajectory chain after all activity points in the activity point trajectory data have been labeled.
The invention provides a method and system for collecting group activity data based on multi-source spatio-temporal trajectory data. The invention uses a Bayesian model to infer individual activities and takes into account the influence of the activity type at the previous moment on the activity type at the next moment within a spatio-temporal activity trajectory, thereby achieving large-scale, accurate, fast, and efficient extraction and collection of massive group activity data.
Brief description of the drawings
Fig. 1 is a flowchart of a preferred embodiment of the method of the present invention for collecting group activity data based on multi-source spatio-temporal trajectory data.
Fig. 2 is a functional block diagram of a preferred embodiment of the system of the present invention for collecting group activity data based on multi-source spatio-temporal trajectory data.
Detailed description
To make the purpose, technical solution, and effects of the present invention clearer and more definite, the present invention is described in further detail below. It should be understood that the specific embodiments described herein serve only to explain the present invention and are not intended to limit it.
The invention provides a flowchart of a preferred embodiment of a method for collecting group activity data based on multi-source spatio-temporal trajectory data. As shown in Fig. 1, the method includes:
Step S100. The backend obtains raw mobile terminal signaling data and raw social-software check-in data, preprocesses each of them, and generates corresponding to-be-processed signaling data and to-be-processed check-in data that conform to a specific format. The mobile terminal is preferably a mobile phone.
In a further embodiment, step S100 specifically includes:
Step S101. The backend obtains the raw mobile terminal signaling data and performs quality cleaning on it: removing duplicate data, removing data with missing attributes, removing data whose time or location falls outside a predefined range, and removing data of users whose number of points is below or above certain thresholds, thereby generating preprocessed signaling data;
Step S102. The backend obtains the raw social-software check-in data and performs quality cleaning on it: removing duplicate data, removing data with missing attributes, removing data whose time or location falls outside the study range, removing data of users whose number of check-ins falls within a certain range, and removing data of users who checked in at only one place, thereby generating preprocessed check-in data;
Step S103. The spatial resolution of the preprocessed signaling data and the preprocessed check-in data is converted to the resolution of a predefined regular grid, generating the corresponding to-be-processed signaling data and to-be-processed check-in data.
In a specific implementation, the mobile phone signaling data and the social check-in data are preprocessed so that they meet the requirements of subsequent processing. This includes:
Quality cleaning of the mobile phone signaling data: removing duplicate data, removing data with missing attributes, removing data whose time or location falls outside the study range, and removing data of users whose number of points is below or above certain thresholds. The choice of thresholds depends on the specific data type, data format, and data quality. Preferably, users with fewer than 3 points per day or more than 100 points per day are removed.
Quality cleaning of the social check-in data: removing duplicate data, removing data with missing attributes, and removing data whose time or location falls outside the study range; removing data of users with fewer than 2 or more than 100 check-ins; and removing data of users who checked in at only one place;
For multi-source spatio-temporal trajectory data, the influence of spatial resolution is taken into account. The spatial resolution of the mobile phone signaling data and of the social check-in data is uniformly converted to the scale of a regular grid. The grid scale generally depends on the native spatial resolution of the two kinds of data. The preferred scale is 500 m × 500 m.
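The cleaning and grid-conversion steps above can be sketched as follows. This is a minimal illustration and not the patent's implementation: the record layout (dicts with `user`, `t`, `x`, `y`), the use of projected coordinates in metres, and the reading of the point-count threshold as a per-user, per-day filter are all assumptions made for the example; `clean_records` and `to_grid` are hypothetical names.

```python
from collections import Counter
from math import floor

GRID_SIZE_M = 500.0  # preferred grid scale stated in the text (500 m x 500 m)


def clean_records(records, bounds, t_range, min_per_day=3, max_per_day=100):
    """Quality-clean signaling records.

    records: iterable of dicts with keys user, t (seconds), x, y (metres).
    bounds:  (xmin, ymin, xmax, ymax) of the study area (assumed layout).
    t_range: (t_start, t_end) of the study period in seconds.
    """
    xmin, ymin, xmax, ymax = bounds
    t0, t1 = t_range
    seen, kept = set(), []
    for r in records:
        key = (r.get("user"), r.get("t"), r.get("x"), r.get("y"))
        if None in key:        # record with a missing attribute
            continue
        if key in seen:        # exact duplicate
            continue
        seen.add(key)
        user, t, x, y = key
        if not (t0 <= t < t1 and xmin <= x < xmax and ymin <= y < ymax):
            continue           # outside the study period or study area
        kept.append({"user": user, "t": t, "x": x, "y": y})
    # Assumption: the point-count threshold is applied per user and per day.
    per_day = Counter((r["user"], r["t"] // 86400) for r in kept)
    return [
        r for r in kept
        if min_per_day <= per_day[(r["user"], r["t"] // 86400)] <= max_per_day
    ]


def to_grid(x, y, size=GRID_SIZE_M):
    """Map a projected coordinate to the (column, row) index of its grid cell."""
    return (floor(x / size), floor(y / size))
```

Check-in data could be passed through the same `to_grid` function so that both sources share one spatial unit.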
Step S200. The backend extracts activity points from the to-be-processed signaling data according to preset temporal and spatial rules, obtaining activity point trajectory data; builds and learns prior information on group activity patterns from the check-in category information in the to-be-processed check-in data; and obtains activity location data from the activity point trajectory data.
Further, in step S200, extracting activity points from the to-be-processed signaling data according to preset temporal and spatial rules to obtain activity point trajectory data specifically includes:
Step S211. The backend obtains the to-be-processed signaling data and sorts it by person and time according to a specific time rule, obtaining each person's time-ordered trajectory;
Step S212. From the person's time-ordered trajectory, the times at which the person enters and leaves each specific position are calculated; each position the person enters is set as an activity point in turn, and the first position the person enters is set as the first activity point in the activity point trajectory;
Step S213. For every point in the time-ordered trajectory, the spatial distance and the time difference to the existing activity point are calculated; if the spatial distance is less than a set threshold and the time difference is less than a set threshold, the point is added to that activity point; otherwise, the point is set as a new activity point; this continues until all points in the time-ordered trajectory have been processed, yielding a candidate activity point trajectory;
Step S214. The candidate activity points in the candidate activity point trajectory are obtained; when the difference between the entry time and the departure time of a candidate activity point is detected to be less than a second set threshold, that candidate activity point is removed from the candidate activity point trajectory, after which the activity point trajectory data are generated.
In a specific implementation, a person's activity point trajectory is obtained by extracting the person's activity points from the processed mobile phone signaling data. Activity points are identified mainly by rules that set temporal and spatial thresholds, as follows:
The mobile phone signaling data are sorted by person and time, obtaining each person's time-ordered trajectory;
Using the person's time-ordered trajectory, the times at which the person enters and leaves each position (grid cell) are calculated, and the first position is set as the first activity point in the activity point trajectory;
Moving forward in time, the spatial distance and the time difference between every point in the time-ordered trajectory and the activity point in the existing activity point trajectory are calculated. If the spatial distance is less than a set threshold and the time difference is less than a set threshold, the point is added to that activity point; otherwise, the point is set as a new activity point. This continues until all points in the time-ordered trajectory have been processed, yielding the candidate activity point trajectory. Preferably, the distance threshold is in the range 500 m to 1000 m.
For each candidate activity point in the candidate activity point trajectory, if the difference between its entry time and its departure time is less than a certain threshold, the point is considered not to be an activity point and is removed from the candidate activity point trajectory, finally yielding the activity point trajectory. Preferably, this threshold ranges from 1 hour to 3 hours.
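The extraction rule described above can be sketched in a few lines. This is an illustrative reading under stated assumptions, not the patented procedure: each new point is compared with the most recent activity point only, the time difference is measured from that activity point's latest timestamp, and the default `max_gap_s` value is invented for the example because the text gives no figure for it. The function name and the `(t, x, y)` tuple layout are hypothetical.

```python
from math import hypot


def extract_activity_points(points, max_dist_m=500.0, max_gap_s=3600.0,
                            min_stay_s=3600.0):
    """Extract activity points from one person's signaling points.

    points: list of (t, x, y) tuples, t in seconds, x and y in metres.
    Returns a list of dicts with keys x, y, enter, leave.
    """
    candidates = []
    for t, x, y in sorted(points):          # time-ordered trajectory
        cur = candidates[-1] if candidates else None
        if (cur is not None
                and hypot(x - cur["x"], y - cur["y"]) < max_dist_m
                and t - cur["leave"] < max_gap_s):
            cur["leave"] = t                # merge into the current activity point
        else:
            # First point, or too far away / too late: start a new activity point.
            candidates.append({"x": x, "y": y, "enter": t, "leave": t})
    # Drop candidates whose stay (leave - enter) is shorter than the second threshold.
    return [c for c in candidates if c["leave"] - c["enter"] >= min_stay_s]
```

With the defaults, a brief pass through a distant cell is discarded, while stays of an hour or more survive as activity points.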
Further, in step S200, building and learning prior information on group activity patterns from the check-in category information in the to-be-processed check-in data specifically includes:
Step S221. From the check-in categories of the social check-in platform and the total amount of user check-in data in the different time periods of a day, the intensity probability distribution of the different group activities over a day is calculated;
Step S222. From the users' check-in data, the activity transition probability distribution of the different group activities at different times is calculated;
Step S223. From the users' check-in data, the probability distribution of the different group activities being carried out in different areas is calculated.
In a specific implementation, the processed social check-in data contain rich check-in category information, which is used to build and learn the prior information on group activity patterns, as follows:
From the check-in categories provided by the social check-in platform and the total amount of user check-in data in the different time periods of a day, the intensity probability distribution of the different group activities over a day, $\Pr(AT_i \mid t)$, is calculated as:
$$\Pr(AT_i \mid t) = \frac{\mathrm{checkins}(AT_i, t)}{\sum_{t} \mathrm{checkins}(AT_i, t)}$$
where $\mathrm{checkins}(AT_i, t)$ denotes the number of check-ins of activity type $i$ at time $t$, $\sum_{t} \mathrm{checkins}(AT_i, t)$ is the number of check-ins of activity type $i$ over all times of the day, and $AT_i$ denotes activity type $i$. From the users' check-in trajectories, the activity transition probability distribution of the different group activities at different times is calculated, expressed as $\Pr(AT_i, t \mid AT_j, t-1)$, where $i$ and $j$ denote activity categories and $t$ denotes time. $(AT_i, t)$ denotes engaging in activity type $i$ at time $t$, and $(AT_j, t-1)$ denotes engaging in activity type $j$ at time $t-1$; $\Pr(AT_i, t \mid AT_j, t-1)$ is therefore the probability of engaging in activity $i$ at time $t$ given that activity $j$ was engaged in at the previous time $t-1$. $\Pr(X)$ denotes the probability distribution of event $X$;
From the users' check-in trajectories, the probability distribution of the different group activities being carried out in different grid cells is calculated, expressed as $\Pr(Grid_m \mid AT_i, t)$, where $m$ is the grid index, $Grid_m$ denotes the $m$-th grid cell, $i$ is the activity category, and $t$ is the time.
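As a rough illustration of how the three priors above might be estimated by counting, consider the sketch below. It assumes check-ins arrive as `(user, hour, grid, activity)` tuples covering a single day, that consecutive check-ins of one user define a transition, and that the transition and grid distributions are normalised over the activity and over the grid cell respectively; none of these details is specified in the text, and `learn_priors` is a hypothetical name.

```python
from collections import Counter, defaultdict


def learn_priors(checkins):
    """Estimate the three priors from (user, hour, grid, activity) tuples.

    Returns three dicts:
      intensity[(activity, hour)]             ~ Pr(AT_i | t), normalised over t
      transition[(prev_act, hour, activity)]  ~ Pr(AT_i, t | AT_j, t-1)
      grid_prob[(activity, hour, grid)]       ~ Pr(Grid_m | AT_i, t)
    """
    intensity_n, grid_n, trans_n = Counter(), Counter(), Counter()
    by_user = defaultdict(list)
    for user, hour, grid, act in checkins:
        intensity_n[(act, hour)] += 1
        grid_n[(act, hour, grid)] += 1
        by_user[user].append((hour, act))

    # Assumption: consecutive check-ins of the same user form one transition.
    for seq in by_user.values():
        seq.sort()
        for (_, prev_act), (hour, act) in zip(seq, seq[1:]):
            trans_n[(prev_act, hour, act)] += 1

    # Pr(AT_i | t): check-ins of type i at t over all check-ins of type i in the day.
    act_total = Counter()
    for (act, _), n in intensity_n.items():
        act_total[act] += n
    intensity = {k: n / act_total[k[0]] for k, n in intensity_n.items()}

    # Pr(Grid_m | AT_i, t): share of each grid cell among check-ins of type i at t.
    grid_prob = {k: n / intensity_n[(k[0], k[1])] for k, n in grid_n.items()}

    # Pr(AT_i, t | AT_j, t-1): share of each next activity given (previous activity, t).
    ctx_total = Counter()
    for (prev, hour, _), n in trans_n.items():
        ctx_total[(prev, hour)] += n
    transition = {k: n / ctx_total[(k[0], k[1])] for k, n in trans_n.items()}
    return intensity, transition, grid_prob
```

Plain counting like this leaves unseen combinations without an entry, so any downstream use would need some form of smoothing or a default value.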
In a further embodiment, obtaining activity location data from the activity point trajectory data in step S200 specifically includes:
Step S231. Time identification windows for a person's activity locations are preset and denoted as a first activity window and a second activity window;
Step S232. The person's activity point trajectory data are obtained, and the duration of each activity point is matched against the first activity window and the second activity window respectively; if the duration of the activity point falls within an activity window and accounts for more than 50% of the total length of that window, the activity location corresponding to that window is taken as a candidate activity location for the activity point;
Step S233. The candidate activity location with the longest matching time is taken as the user's activity location data.
In a specific implementation, the person's home and work activities are detected from the obtained activity point trajectory data, as follows:
Based on common knowledge, the identification windows for home activity and work activity are set to 0:00 to 7:00 and 9:00 to 17:00 respectively;
For the person's activity point trajectory data, the duration of each activity point is matched against the two identification windows above. If the duration of the activity point falls within an identification window and accounts for more than 50% of the total length of that window, the match is considered successful and the point becomes a candidate home or work activity location;
The home or work activity location with the longest matching time is taken as the user's home or work activity location. If no match succeeds, the user is considered to have no identifiable home or work activity location.
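A compact sketch of the window-matching rule follows. It assumes each activity point is given as `(grid, enter_hour, leave_hour)` within a single day and measures the match as the overlap between the stay and the window; the names `overlap` and `detect_anchor` are hypothetical, and handling of stays that cross midnight is left out.

```python
HOME_WINDOW = (0.0, 7.0)    # 0:00 to 7:00, as stated in the text
WORK_WINDOW = (9.0, 17.0)   # 9:00 to 17:00, as stated in the text


def overlap(a0, a1, b0, b1):
    """Length of the intersection of intervals [a0, a1] and [b0, b1]."""
    return max(0.0, min(a1, b1) - max(a0, b0))


def detect_anchor(activity_points, window, min_share=0.5):
    """Return the grid cell best matching the window, or None if nothing matches.

    activity_points: list of (grid, enter_hour, leave_hour) for one user.
    A point is a candidate when its overlap with the window exceeds
    min_share of the window length; the longest overlap wins.
    """
    w0, w1 = window
    best_grid, best_len = None, 0.0
    for grid, enter, leave in activity_points:
        matched = overlap(enter, leave, w0, w1)
        if matched > min_share * (w1 - w0) and matched > best_len:
            best_grid, best_len = grid, matched
    return best_grid
```

Calling `detect_anchor(points, HOME_WINDOW)` and `detect_anchor(points, WORK_WINDOW)` on the same trajectory would give the home and work cells, with `None` standing for "not found".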
Step S300. The backend, according to the activity point trajectory data, the prior information on group activity patterns, and the activity location data, labels the activity point trajectories with semantic information using a Bayesian model and generates activity spatio-temporal trajectory chains.
Using the activity point trajectories obtained above, the temporal prior information on group activities, and the person's home and work activity information, the activity point trajectories are labeled with semantic information based on the Bayesian model. The labeled activity information mainly comprises home, work, and other (for example entertainment, shopping, study, leisure, or travel), yielding the activity spatio-temporal trajectory chain.
The resulting spatio-temporal activity trajectory chains are of great significance for studying urban planning and the dynamic changes of urban functional zones. Based on changes in spatio-temporal activity, adjustments and predictions can be made quickly in response to dynamic changes in already planned urban functional zones.
In a further embodiment, step S300 specifically includes:
Step S301. According to the Bayesian model, given the position, the time, and the activity type at the previous moment, a probability formula for carrying out a certain type of activity at the next moment is generated;
Step S302. For each activity point in the activity point trajectory data, the probabilities of engaging in the different activities are calculated, and the activity with the maximum probability is labeled as the maximum-probability activity type of that activity point;
Step S303. After all activity points in the activity point trajectory data have been labeled, the activity spatio-temporal trajectory chain is output.
In a specific implementation, according to the Bayesian model, given the specific location, the time, and the activity type at the previous moment, the probability of performing a given type of activity at the next moment is:
$$\Pr(AT_i \mid Grid_m, t, AT_j) = \frac{\Pr(Grid_m \mid AT_i, t, AT_j)\,\Pr(AT_i \mid t, AT_j)\,\Pr(AT_j \mid t)\,\Pr(t)}{\Pr(Grid_m, t, AT_j)} \tag{1}$$
where m is the grid index, j is the activity type at the previous moment, t is the current time, and i is the activity type at the current time.
For $\Pr(Grid_m \mid AT_i, t, AT_j)$, $AT_j$ and $Grid_m$ are assumed to be conditionally independent, so this term simplifies to:
$$\Pr(Grid_m \mid AT_i, t, AT_j) = \Pr(Grid_m \mid AT_i, t) \tag{2}$$
For $\Pr(AT_i \mid t, AT_j)$, the term can be rewritten as:
$$\Pr(AT_i \mid t, AT_j) = \Pr(AT_{i,t} \mid AT_{j,t-1}) \tag{3}$$
Combining formulas (2) and (3), formula (1) becomes:
$$\Pr(AT_i \mid Grid_m, t, AT_j) = \frac{\Pr(Grid_m \mid AT_i, t)\,\Pr(AT_{i,t} \mid AT_{j,t-1})\,\Pr(AT_j \mid t)\,\Pr(t)}{\Pr(Grid_m, t, AT_j)} \tag{4}$$
Since $\Pr(t)$ and $\Pr(Grid_m, t, AT_j)$ do not depend on the candidate activity type i, this gives:
$$\Pr(AT_i \mid Grid_m, t, AT_j) \propto \Pr(Grid_m \mid AT_i, t)\,\Pr(AT_{i,t} \mid AT_{j,t-1})\,\Pr(AT_j \mid t) \tag{5}$$
Each activity point of the track is fed into formula (5) in sequence, the probability of each activity type is computed, and the point is labeled with the activity type of maximum probability;
In particular, for a grid location already labeled with the home or work activity type, $\Pr(Grid_m \mid AT_i, t)$ is set to 1 and $AT_{j,t-1}$ is set to $AT_{home}$ or $AT_{working}$, and processing continues with the next activity point, until every activity point in the track has been labeled and the activity spatio-temporal chain is output. $AT_{home}$ denotes the home activity type and $AT_{working}$ denotes the work activity type.
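As an illustration of the labeling loop built on formula (5), the following Python sketch scores each candidate activity type and keeps the maximum. It rests on several assumptions not stated in the text: the probability tables are plain dictionaries with the hypothetical key layouts shown in the comments, missing entries are floored at a small constant, the initial previous activity is passed in as `start`, and a point on a detected home or work cell simply takes that label and becomes the previous activity for the next point.

```python
from typing import Dict, List, Tuple

ACTIVITIES = ("home", "work", "other")
EPS = 1e-9  # floor for probabilities missing from the toy tables


def label_chain(
    points: List[Tuple[int, int]],             # (grid index m, time slot t)
    place: Dict[Tuple[str, int, int], float],  # Pr(Grid_m | AT_i, t), key (i, t, m)
    trans: Dict[Tuple[str, str, int], float],  # Pr(AT_{i,t} | AT_{j,t-1}), key (i, j, t)
    prior: Dict[Tuple[str, int], float],       # Pr(AT_j | t), key (j, t)
    anchors: Dict[int, str],                   # grid index -> "home" or "work"
    start: str = "home",                       # assumed initial previous activity
) -> List[str]:
    """Label each activity point with its most probable activity type."""
    chain: List[str] = []
    prev = start
    for grid, slot in points:
        if grid in anchors:
            # Detected home/work cell: take its label directly.
            best = anchors[grid]
        else:
            # Unnormalized score from formula (5) for every candidate type.
            scores = {
                act: place.get((act, slot, grid), EPS)
                * trans.get((act, prev, slot), EPS)
                * prior.get((prev, slot), EPS)
                for act in ACTIVITIES
            }
            best = max(scores, key=scores.get)
        chain.append(best)
        prev = best  # becomes AT_{j,t-1} for the next point
    return chain
```

Because the factor $\Pr(AT_j \mid t)$ is the same for every candidate i, it does not change which type wins; it is kept here only so the score mirrors formula (5) term by term.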
The activity point track extraction method depends on the specific data type and on the spatial and temporal resolution of the data, and is not limited to the method described in the present invention;
The home and work activity detection method is constrained by the observation duration of the spatio-temporal data, and the choice of thresholds is not limited to the method described in the present invention;
Building and learning the prior information on group activity patterns is not limited to social media check-in data; resident survey data, GPS track data, volunteer data, and similar sources may also be used.
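Whatever the labeled source, the three kinds of prior information (activity intensity over the day, activity transitions, and activity locations) can be estimated by counting and normalizing. The sketch below is only one plausible estimator: the record layout, the hourly time slots, and the function names are hypothetical, and all records are assumed to fall within a single day.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

# One labeled record: (user id, hour of day, grid index, activity category).
CheckIn = Tuple[str, int, int, str]


def normalize(counts: Counter) -> Dict:
    """Turn raw counts into a probability distribution."""
    total = sum(counts.values())
    return {key: n / total for key, n in counts.items()}


def learn_priors(checkins: List[CheckIn]):
    by_hour = defaultdict(Counter)      # hour -> activity counts
    by_activity = defaultdict(Counter)  # activity -> grid cell counts
    by_prev = defaultdict(Counter)      # (previous activity, hour) -> activity counts

    last_activity: Dict[str, str] = {}
    # Process each user's records in time order to observe transitions.
    for user, hour, grid, activity in sorted(checkins, key=lambda c: (c[0], c[1])):
        by_hour[hour][activity] += 1
        by_activity[activity][grid] += 1
        if user in last_activity:
            by_prev[(last_activity[user], hour)][activity] += 1
        last_activity[user] = activity

    intensity = {h: normalize(c) for h, c in by_hour.items()}    # activity share per hour
    location = {a: normalize(c) for a, c in by_activity.items()}  # grid share per activity
    transition = {k: normalize(c) for k, c in by_prev.items()}    # next activity given previous
    return intensity, location, transition
```

The same function would accept survey or volunteer records as long as they are mapped to the same four fields.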
The present invention proposes a new group activity collection method based on multi-source spatio-temporal trajectory data. It uses a Bayesian model to infer individual activities, addresses the time and labor demands, high cost, and small sample sizes of existing methods, and achieves large-scale, accurate, fast, and efficient extraction and collection of massive group activity data. The group activity inference of the present invention considers not only the constraints that factors such as time and location in urban space place on human activities, but also the influence of the activity type at the previous moment on the activity type at the next moment within a spatio-temporal activity track, so that activities are inferred in the context of the human spatio-temporal activity chain.
The present invention also provides a functional block diagram of a preferred embodiment of a group activity data collection system based on multi-source spatio-temporal trajectory data. As shown in Fig. 2, the system includes:
Preprocessing module 100, used by the backend to obtain raw mobile terminal signaling data and raw social software check-in data and to preprocess each of them, generating pending signaling data and pending check-in data that conform to a specific format; details are as described in the method embodiment.
Activity location data acquisition module 200, used by the backend to extract activity points from the pending signaling data according to preset temporal and spatial rules and obtain activity point track data; to build and learn the prior information on group activity patterns from the check-in category information in the pending check-in data; and to obtain activity location data from the activity point track data; details are as described in the method embodiment.
Semantic labeling module 300, used by the backend to perform semantic labeling of the activity point track using a Bayesian model, based on the activity point track data, the prior information on group activity patterns, and the activity location data, and to generate the activity spatio-temporal trajectory chain; details are as described in the method embodiment.
In the group activity data collection system based on multi-source spatio-temporal trajectory data, the preprocessing module specifically includes:
Signaling data processing unit, used by the backend to obtain raw mobile terminal signaling data and perform quality cleaning on it: removing duplicate records, removing records with missing attributes, removing records whose time or location lies outside the predefined range, and removing the data of users whose number of points is below or above given thresholds, to generate preprocessed signaling data; details are as described in the method embodiment.
Check-in data processing unit, used by the backend to obtain raw social software check-in data and perform quality cleaning on it: removing duplicate records, removing records with missing attributes, removing records whose time or location lies outside the study area, removing the data of users whose number of check-ins falls within a certain range, and removing the data of users who checked in at only one location, to generate preprocessed check-in data; details are as described in the method embodiment.
Resolution conversion unit, used to convert the spatial resolution of the preprocessed signaling data and the preprocessed check-in data to the resolution of a predetermined regular grid, generating the corresponding pending signaling data and pending check-in data; details are as described in the method embodiment.
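The cleaning and grid conversion performed by these units can be sketched as follows. This is a simplified illustration under stated assumptions: the `Record` layout, the bounding box given as (min lon, min lat, max lon, max lat), the per-user point limits, and a grid cell size expressed in degrees are all hypothetical choices, not values taken from the patent.

```python
from collections import Counter
from typing import List, NamedTuple, Optional, Tuple

BBox = Tuple[float, float, float, float]  # (min lon, min lat, max lon, max lat)


class Record(NamedTuple):
    user: Optional[str]
    t: Optional[float]    # timestamp in seconds
    lon: Optional[float]
    lat: Optional[float]


def clean(records: List[Record], bbox: BBox, t_range: Tuple[float, float],
          min_pts: int, max_pts: int) -> List[Record]:
    """Drop incomplete, duplicate, and out-of-range records, then sparse or noisy users."""
    lon0, lat0, lon1, lat1 = bbox
    seen = set()
    kept = []
    for r in records:
        if None in r:                                   # missing attribute
            continue
        if r in seen:                                   # duplicate record
            continue
        if not (t_range[0] <= r.t <= t_range[1]):       # outside time range
            continue
        if not (lon0 <= r.lon <= lon1 and lat0 <= r.lat <= lat1):  # outside area
            continue
        seen.add(r)
        kept.append(r)
    per_user = Counter(r.user for r in kept)
    # Keep only users whose number of points lies within the thresholds.
    return [r for r in kept if min_pts <= per_user[r.user] <= max_pts]


def to_grid(r: Record, bbox: BBox, cell_deg: float) -> Tuple[int, int]:
    """Map a record to the (row, col) of a regular grid anchored at the bbox corner."""
    col = int((r.lon - bbox[0]) // cell_deg)
    row = int((r.lat - bbox[1]) // cell_deg)
    return row, col
```

Applying `to_grid` to both the signaling records and the check-in records puts the two sources on the same spatial units, which is what lets the priors learned from check-ins be applied to signaling tracks.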
In the group activity data collection system based on multi-source spatio-temporal trajectory data, the activity location data acquisition module specifically includes:
Sorting unit, used by the backend to obtain the pending signaling data and sort it by person and by time according to a specific time rule, obtaining each person's time-ordered track; details are as described in the method embodiment.
Activity point labeling unit, used to compute, from a person's time-ordered track, the times at which the person enters and leaves each specific location, to set each location the person enters as an activity point in turn, and to set the first location the person enters as the first activity point of the activity point track; details are as described in the method embodiment.
Candidate activity point track generation unit, used to compute the spatial distance and time difference between each point in the time-ordered track and the existing activity point; if the spatial distance is less than a set threshold and the time difference is less than a set threshold, the point is merged into the activity point, otherwise the point is set as a new activity point; once all points in the time-ordered track have been processed, the candidate activity point track is obtained; details are as described in the method embodiment.
Activity point track data processing unit, used to obtain the candidate activity points in the candidate activity point track and, when the difference between the entry time and the departure time of a candidate activity point is detected to be less than a second set threshold, to remove that candidate activity point from the candidate activity point track and then generate the activity point track data; details are as described in the method embodiment.
First probability calculation unit, used to compute the intensity probability distribution of different group activities over a day, from the check-in categories of the social check-in platform and the total volume of user check-in data in different periods of a day; details are as described in the method embodiment.
Second probability calculation unit, used to compute, from user check-in data, the activity transition probability distribution of different group activities at different times; details are as described in the method embodiment.
Third probability calculation unit, used to compute, from user check-in data, the probability distribution of different group activities taking place in different areas; details are as described in the method embodiment.
Presetting unit, used to preset the time identification windows of a person's activity locations, denoted the first activity window and the second activity window; details are as described in the method embodiment.
Candidate activity location determination unit, used to obtain a person's activity point track data and match the duration of each activity point against the first activity window and the second activity window; if the duration of an activity point falls within one of the activity windows and accounts for more than 50% of the total time span of that window, the activity location corresponding to that window is taken as a candidate activity location for the activity point; details are as described in the method embodiment.
Activity location data acquisition unit, used to take the candidate activity location with the longest matching time as the user's activity location data; details are as described in the method embodiment.
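The sorting, activity point labeling, candidate generation, and filtering units above can be sketched together in Python. The sketch makes assumptions the text leaves open: coordinates are already projected to meters, each new point is compared only with the most recent activity point, an activity point keeps the position of its first fix, and the three thresholds are hypothetical parameters.

```python
from dataclasses import dataclass
from math import hypot
from typing import List, Tuple

Fix = Tuple[float, float, float]  # (time in s, x in m, y in m), projected coordinates


@dataclass
class ActivityPoint:
    x: float
    y: float
    enter: float  # time the person entered the location
    leave: float  # time of the last fix merged into this point


def extract_activity_points(fixes: List[Fix], max_dist: float,
                            max_gap: float, min_stay: float) -> List[ActivityPoint]:
    """Group a person's fixes into activity points and drop short stays."""
    candidates: List[ActivityPoint] = []
    for t, x, y in sorted(fixes):  # time-ordered track of one person
        cur = candidates[-1] if candidates else None
        if (cur is not None
                and hypot(x - cur.x, y - cur.y) < max_dist
                and t - cur.leave < max_gap):
            cur.leave = t  # close in space and time: merge into current point
        else:
            candidates.append(ActivityPoint(x, y, t, t))  # start a new activity point
    # Second threshold: remove candidates whose stay is too short.
    return [p for p in candidates if p.leave - p.enter >= min_stay]
```

With a 200 m distance threshold, a one-hour gap threshold, and a 20-minute minimum stay, a brief pass through a cell between two longer stays would be discarded as a pass-by rather than kept as an activity.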
In the group activity data collection system based on multi-source spatio-temporal trajectory data, the semantic labeling module specifically includes:
Fourth probability calculation unit, used to derive, based on the Bayesian model and given the location, the time, and the activity type at the previous moment, the probability formula for performing a given type of activity at the next moment; details are as described in the method embodiment.
Maximum probability activity type labeling unit, used to compute, for each activity point in the activity point track data, the probability of each activity type, and to label the activity point with the activity type of maximum probability; details are as described in the method embodiment.
Activity spatio-temporal trajectory chain generation unit, used to output the activity spatio-temporal trajectory chain after all activity points in the activity point track data have been labeled; details are as described in the method embodiment.
In summary, the present invention provides a group activity data collection method and system based on multi-source spatio-temporal trajectory data. The method includes: the backend obtains raw mobile terminal signaling data and raw social software check-in data and preprocesses them, generating pending signaling data and pending check-in data that conform to a specific format; the backend obtains activity point track data from the pending signaling data, builds and learns prior information on group activity patterns, and obtains activity location data from the activity point track data; the backend then performs semantic labeling of the activity point track using a Bayesian model, based on the activity point track data, the prior information on group activity patterns, and the activity location data, and generates an activity spatio-temporal trajectory chain. The present invention uses a Bayesian model to infer individual activities and takes into account the influence of the activity type at the previous moment on the activity type at the next moment within a spatio-temporal activity track, achieving large-scale, accurate, fast, and efficient extraction and collection of massive group activity data.
It should be understood that the application of the present invention is not limited to the examples above. Those of ordinary skill in the art may make improvements or variations based on the above description, and all such improvements and variations shall fall within the scope of protection of the appended claims.

Claims (10)

1. A group activity data collection method based on multi-source spatio-temporal trajectory data, characterized in that the method comprises:
A. the backend obtains raw mobile terminal signaling data and raw social software check-in data and preprocesses each of them, generating pending signaling data and pending check-in data that conform to a specific format;
B. the backend extracts activity points from the pending signaling data according to preset temporal and spatial rules and obtains activity point track data; builds and learns prior information on group activity patterns from the check-in category information in the pending check-in data; and obtains activity location data from the activity point track data;
C. the backend performs semantic labeling of the activity point track using a Bayesian model, based on the activity point track data, the prior information on group activity patterns, and the activity location data, and generates an activity spatio-temporal trajectory chain.
2. The group activity data collection method based on multi-source spatio-temporal trajectory data according to claim 1, characterized in that step A specifically comprises:
A1. the backend obtains raw mobile terminal signaling data and performs quality cleaning on it: removing duplicate records, removing records with missing attributes, removing records whose time or location lies outside the predefined range, and removing the data of users whose number of points is below or above given thresholds, to generate preprocessed signaling data;
A2. the backend obtains raw social software check-in data and performs quality cleaning on it: removing duplicate records, removing records with missing attributes, removing records whose time or location lies outside the study area, removing the data of users whose number of check-ins falls within a certain range, and removing the data of users who checked in at only one location, to generate preprocessed check-in data;
A3. the spatial resolution of the preprocessed signaling data and the preprocessed check-in data is converted to the resolution of a predetermined regular grid, generating the corresponding pending signaling data and pending check-in data.
3. The group activity data collection method based on multi-source spatio-temporal trajectory data according to claim 1, characterized in that, in step B, extracting activity points from the pending signaling data according to preset temporal and spatial rules and obtaining activity point track data specifically comprises:
B11. the backend obtains the pending signaling data and sorts it by person and by time according to a specific time rule, obtaining each person's time-ordered track;
B12. from a person's time-ordered track, the times at which the person enters and leaves each specific location are computed, each location the person enters is set as an activity point in turn, and the first location the person enters is set as the first activity point of the activity point track;
B13. the spatial distance and time difference between each point in the time-ordered track and the existing activity point are computed; if the spatial distance is less than a set threshold and the time difference is less than a set threshold, the point is merged into the activity point, otherwise the point is set as a new activity point; once all points in the time-ordered track have been processed, the candidate activity point track is obtained;
B14. the candidate activity points in the candidate activity point track are obtained; when the difference between the entry time and the departure time of a candidate activity point is detected to be less than a second set threshold, that candidate activity point is removed from the candidate activity point track, and the activity point track data is then generated.
4. The group activity data collection method based on multi-source spatio-temporal trajectory data according to claim 3, characterized in that, in step B, building and learning the prior information on group activity patterns from the check-in category information in the pending check-in data specifically comprises:
B21. from the check-in categories of the social check-in platform and the total volume of user check-in data in different periods of a day, computing the intensity probability distribution of different group activities over a day;
B22. from user check-in data, computing the activity transition probability distribution of different group activities at different times;
B23. from user check-in data, computing the probability distribution of different group activities taking place in different areas.
5. The group activity data collection method based on multi-source spatio-temporal trajectory data according to claim 4, characterized in that, in step B, obtaining activity location data from the activity point track data specifically comprises:
B31. presetting the time identification windows of a person's activity locations, denoted the first activity window and the second activity window;
B32. obtaining a person's activity point track data and matching the duration of each activity point against the first activity window and the second activity window; if the duration of an activity point falls within one of the activity windows and accounts for more than 50% of the total time span of that window, the activity location corresponding to that window is taken as a candidate activity location for the activity point;
B33. taking the candidate activity location with the longest matching time as the user's activity location data.
6. The group activity data collection method based on multi-source spatio-temporal trajectory data according to claim 5, characterized in that step C specifically comprises:
C1. based on the Bayesian model, and given the location, the time, and the activity type at the previous moment, deriving the probability formula for performing a given type of activity at the next moment;
C2. for each activity point in the activity point track data, computing the probability of each activity type, and labeling the activity point with the activity type of maximum probability;
C3. after all activity points in the activity point track data have been labeled, outputting the activity spatio-temporal trajectory chain.
7. A group activity data collection system based on multi-source spatio-temporal trajectory data, characterized in that the system comprises:
a preprocessing module, used by the backend to obtain raw mobile terminal signaling data and raw social software check-in data and to preprocess each of them, generating pending signaling data and pending check-in data that conform to a specific format;
an activity location data acquisition module, used by the backend to extract activity points from the pending signaling data according to preset temporal and spatial rules and obtain activity point track data; to build and learn prior information on group activity patterns from the check-in category information in the pending check-in data; and to obtain activity location data from the activity point track data;
a semantic labeling module, used by the backend to perform semantic labeling of the activity point track using a Bayesian model, based on the activity point track data, the prior information on group activity patterns, and the activity location data, and to generate an activity spatio-temporal trajectory chain.
8. The group activity data collection system based on multi-source spatio-temporal trajectory data according to claim 7, characterized in that the preprocessing module specifically comprises:
a signaling data processing unit, used by the backend to obtain raw mobile terminal signaling data and perform quality cleaning on it: removing duplicate records, removing records with missing attributes, removing records whose time or location lies outside the predefined range, and removing the data of users whose number of points is below or above given thresholds, to generate preprocessed signaling data;
a check-in data processing unit, used by the backend to obtain raw social software check-in data and perform quality cleaning on it: removing duplicate records, removing records with missing attributes, removing records whose time or location lies outside the study area, removing the data of users whose number of check-ins falls within a certain range, and removing the data of users who checked in at only one location, to generate preprocessed check-in data;
a resolution conversion unit, used to convert the spatial resolution of the preprocessed signaling data and the preprocessed check-in data to the resolution of a predetermined regular grid, generating the corresponding pending signaling data and pending check-in data.
9. The group activity data collection system based on multi-source spatio-temporal trajectory data according to claim 8, characterized in that the activity location data acquisition module specifically comprises:
a sorting unit, used by the backend to obtain the pending signaling data and sort it by person and by time according to a specific time rule, obtaining each person's time-ordered track;
an activity point labeling unit, used to compute, from a person's time-ordered track, the times at which the person enters and leaves each specific location, to set each location the person enters as an activity point in turn, and to set the first location the person enters as the first activity point of the activity point track;
a candidate activity point track generation unit, used to compute the spatial distance and time difference between each point in the time-ordered track and the existing activity point; if the spatial distance is less than a set threshold and the time difference is less than a set threshold, the point is merged into the activity point, otherwise the point is set as a new activity point; once all points in the time-ordered track have been processed, the candidate activity point track is obtained;
an activity point track data processing unit, used to obtain the candidate activity points in the candidate activity point track and, when the difference between the entry time and the departure time of a candidate activity point is detected to be less than a second set threshold, to remove that candidate activity point from the candidate activity point track and then generate the activity point track data;
a first probability calculation unit, used to compute the intensity probability distribution of different group activities over a day, from the check-in categories of the social check-in platform and the total volume of user check-in data in different periods of a day;
a second probability calculation unit, used to compute, from user check-in data, the activity transition probability distribution of different group activities at different times;
a third probability calculation unit, used to compute, from user check-in data, the probability distribution of different group activities taking place in different areas;
a presetting unit, used to preset the time identification windows of a person's activity locations, denoted the first activity window and the second activity window;
a candidate activity location determination unit, used to obtain a person's activity point track data and match the duration of each activity point against the first activity window and the second activity window; if the duration of an activity point falls within one of the activity windows and accounts for more than 50% of the total time span of that window, the activity location corresponding to that window is taken as a candidate activity location for the activity point;
an activity location data acquisition unit, used to take the candidate activity location with the longest matching time as the user's activity location data.
10. The group activity data collection system based on multi-source spatio-temporal trajectory data according to claim 9, characterized in that the semantic labeling module specifically comprises:
a fourth probability calculation unit, used to derive, based on the Bayesian model and given the location, the time, and the activity type at the previous moment, the probability formula for performing a given type of activity at the next moment;
a maximum probability activity type labeling unit, used to compute, for each activity point in the activity point track data, the probability of each activity type, and to label the activity point with the activity type of maximum probability;
an activity spatio-temporal trajectory chain generation unit, used to output the activity spatio-temporal trajectory chain after all activity points in the activity point track data have been labeled.
CN201610517438.6A 2016-07-04 2016-07-04 Group activity method of data capture and system based on multi-source space-time trajectory data Active CN106211071B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610517438.6A CN106211071B (en) 2016-07-04 2016-07-04 Group activity method of data capture and system based on multi-source space-time trajectory data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610517438.6A CN106211071B (en) 2016-07-04 2016-07-04 Group activity method of data capture and system based on multi-source space-time trajectory data

Publications (2)

Publication Number Publication Date
CN106211071A true CN106211071A (en) 2016-12-07
CN106211071B CN106211071B (en) 2019-05-21

Family

ID=57464652

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610517438.6A Active CN106211071B (en) 2016-07-04 2016-07-04 Group activity method of data capture and system based on multi-source space-time trajectory data

Country Status (1)

Country Link
CN (1) CN106211071B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7373524B2 (en) * 2004-02-24 2008-05-13 Covelight Systems, Inc. Methods, systems and computer program products for monitoring user behavior for a server application
CN102880719A (en) * 2012-10-16 2013-01-16 四川大学 User trajectory similarity mining method for location-based social network
CN104750829A (en) * 2015-04-01 2015-07-01 华中科技大学 User position classifying method and system based on signing in features
CN104750751A (en) * 2013-12-31 2015-07-01 华为技术有限公司 Method and device for annotating trace data
CN105243148A (en) * 2015-10-25 2016-01-13 西华大学 Checkin data based spatial-temporal trajectory similarity measurement method and system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
JINZHOU CAO: "Exploring the distribution and dynamics of functional regions using mobile phone data and social media data", 《CUPUM》 *

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107169260A (en) * 2017-03-23 2017-09-15 四川省公安厅 Based on space-time track isomerous multi-source resonance data system and method
CN107169260B (en) * 2017-03-23 2021-05-11 四川省公安厅 Heterogeneous multi-source data resonance system and method based on space-time trajectory
CN107274058A (en) * 2017-05-10 2017-10-20 福建海峡中创网络信息技术股份有限公司 A kind of determination methods of mechanics
CN108597224B (en) * 2018-05-02 2020-05-19 深圳市数字城市工程研究中心 Method and system for identifying to-be-improved traffic facilities based on space-time trajectory data
CN108629000A (en) * 2018-05-02 2018-10-09 深圳市数字城市工程研究中心 A kind of the group behavior feature extracting method and system of mobile phone track data cluster
CN108597224A (en) * 2018-05-02 2018-09-28 深圳市数字城市工程研究中心 A kind of recognition methods to be improved the traffic conditions and system based on space-time trajectory data
CN109918395A (en) * 2019-02-19 2019-06-21 北京明略软件***有限公司 One kind of groups method for digging and device
CN110543457A (en) * 2019-09-11 2019-12-06 北京明略软件***有限公司 Track type document processing method and device, storage medium and electronic device
CN111275969A (en) * 2020-02-15 2020-06-12 湖南大学 Vehicle track filling method based on intelligent identification of road environment
CN111275969B (en) * 2020-02-15 2022-02-25 湖南大学 Vehicle track filling method based on intelligent identification of road environment
CN112069573A (en) * 2020-08-24 2020-12-11 深圳大学 City group space simulation method, system and equipment based on cellular automaton
CN112070304A (en) * 2020-09-09 2020-12-11 深圳大学 City group element interaction measuring method, equipment and storage medium
CN112070304B (en) * 2020-09-09 2021-05-18 深圳大学 City group element interaction measuring method, equipment and storage medium

Also Published As

Publication number Publication date
CN106211071B (en) 2019-05-21

Similar Documents

Publication Publication Date Title
CN106211071A (en) Group activity method of data capture based on multi-source space-time trajectory data and system
CN106448233B (en) Public bus network timetable cooperative optimization method based on big data
CN108955693B (en) Road network matching method and system
CN111681421B (en) Mobile phone signaling data-based external passenger transport hub centralized-sparse space distribution analysis method
CN102799897B (en) Computer recognition method of GPS (Global Positioning System) positioning-based transportation mode combined travelling
CN103533501B (en) Geofence generation method
CN108761509B (en) Automobile driving track and mileage prediction method based on historical data
CN102087788B (en) Method for estimating traffic state parameter based on confidence of speed of float car
CN105740904B (en) Trip and activity pattern recognition method based on the DBSCAN clustering algorithm
CN104504099B (en) Traffic trip state cutting method based on location track
CN105788260A (en) Public transportation passenger OD calculation method based on intelligent public transportation system data
CN105809962A (en) Traffic trip mode splitting method based on mobile phone data
CN105117789A (en) Resident trip mode comprehensive judging method based on handset signaling data
CN108629000A (en) Group behavior feature extraction method and system based on mobile phone trajectory data clustering
CN104766473A (en) Traffic trip feature extraction method based on multi-mode public transport data matching
CN110836675B (en) Decision tree-based automatic driving search decision method
CN110796337A (en) System for evaluating accessibility of urban bus station service
CN103116702A (en) Bicycle-mode traveling selection forecasting method based on activity chain mode
CN113780665B (en) Private car stay position prediction method and system based on enhanced recurrent neural network
CN111144281A (en) Urban rail transit OD passenger flow estimation method based on machine learning
CN105893352A (en) Air quality early-warning and monitoring analysis system based on big data of social network
CN108171974A (en) Traffic trip mode discrimination method based on cellular triangulation location data
CN112512032A (en) Mobile phone signaling data-based external trip crowd identification method
CN115100012A (en) Method for calculating walking accessibility of rail transit station
CN102999789A (en) Digital city safety precaution method based on semi-hidden-markov model

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant