CN104268243B - Position data processing method and apparatus - Google Patents

Info

Publication number
CN104268243B
Authority
CN
China
Prior art keywords
region
node
user
tree
hierarchical tree
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201410513908.2A
Other languages
Chinese (zh)
Other versions
CN104268243A (en)
Inventor
王飞
邵钏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to CN201410513908.2A
Publication of CN104268243A
Application granted
Publication of CN104268243B
Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/901: Indexing; Data structures therefor; Storage structures
    • G06F16/9027: Trees

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Navigation (AREA)
  • Position Fixing By Use Of Radio Waves (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

The invention discloses a position data processing method and apparatus that address how to analyze and process massive position data quickly and efficiently, so that applications built on such data become more efficient and more accurate. In some feasible embodiments of the invention, a position data processing method may include: obtaining user position data in a target area; aggregating the user position data to obtain a user trajectory; determining, according to the position information that the user passes through in the target area, an optimal node of the user trajectory in a region hierarchical tree, where the region hierarchical tree is a tree structure that takes the target area as its root node and the subregions contained in the target area as child nodes, and the optimal node is the lowest-level node that contains all the position information that the user passes through in the target area; and establishing a mapping relation between the user trajectory and the optimal node to obtain a region level index tree.

Description

Position data processing method and apparatus
Technical field
The present invention relates to the field of data processing, and in particular to a position data processing method and apparatus.
Background
With the popularization of smartphones, the volume of user position data reported by communication networks keeps growing and is now large enough to support many kinds of high-value analysis services built on these data. User position data is also receiving increasing attention in the construction of "smart cities", and research based on user position data into urban planning, crowd movement characteristics, commercial value evaluation and many other topics is in full swing.
Position data analysis and mining mainly targets user position data, such as the measurement report (MR) data of a communication network. The data volume is large, on average about 0.5 TB per day for one million users, and all of the data carries time-series characteristics. At present, user position data is mainly analyzed and mined with distributed batch-processing frameworks such as Spark. When such a distributed framework handles position data analysis, the configuration of its cluster and memory depends entirely on the data volume: as the data volume grows, the input of a single analysis task grows linearly and performance drops sharply. Current region-oriented analysis first has to download the data of the whole network and then filter it. As the data volume grows, this increases system load and requires filtering out a large amount of irrelevant data, which wastes computing resources. Therefore, in the era of big data, how to analyze and process massive position data quickly and efficiently, so that applications built on such data become more efficient and more accurate, has become an urgent problem.
Summary of the invention
To address the technical problem of how to analyze and process massive position data quickly and efficiently in the era of big data, so that applications built on such data become more efficient and more accurate, the embodiments of the present invention provide a position data processing method and apparatus, specifically as follows.
In a first aspect, a position data processing method includes:
obtaining user position data in a target area;
aggregating the user position data to obtain a user trajectory, where the user trajectory includes the position information that the user passes through in the target area;
determining, according to the position information that the user passes through in the target area, an optimal node of the user trajectory in a region hierarchical tree, where the region hierarchical tree is a tree structure that takes the target area as its root node and the subregions contained in the target area as child nodes, and the optimal node is the lowest-level node that contains all the position information that the user passes through in the target area;
establishing a mapping relation between the user trajectory and the optimal node to obtain a region level index tree, where the region level index tree is the region hierarchical tree that contains the mapping relation.
With reference to the first aspect, in a first possible embodiment of the first aspect, before the determining of the optimal node of the user trajectory in the region hierarchical tree, the method further includes:
dividing the target area, with reference to the user position data and according to a data balancing principle and a transregional minimum principle, to obtain the region hierarchical tree.
With reference to the first possible embodiment of the first aspect, in a second possible embodiment of the first aspect, the dividing of the target area, with reference to the user position data and according to the data balancing principle and the transregional minimum principle, to obtain the region hierarchical tree specifically includes:
obtaining an initial region hierarchical tree according to the data balancing principle and the transregional minimum principle;
optimizing the initial region hierarchical tree according to an optimization principle to obtain the region hierarchical tree.
With reference to the second possible embodiment of the first aspect, in a third possible embodiment of the first aspect, the obtaining of the initial region hierarchical tree according to the data balancing principle and the transregional minimum principle and the optimizing of the initial region hierarchical tree according to the optimization principle to obtain the region hierarchical tree specifically include:
y_min = a × M + b × N;
where, among the values y obtained from a × M + b × N, the smallest is y_min; y represents the initial region hierarchical tree and y_min represents the region hierarchical tree; M reflects the data balancing principle, and the more balanced the data, the smaller M is, with M = |0.5 - (number of position points in the node) / (total number of position points in the root node)|; N reflects the transregional minimum principle, and the fewer the cross-region trajectories, the smaller N is, with N = (number of trajectories that cross child nodes) / (total number of trajectories); a and b are the weights of M and N respectively, and a + b = 1.
With reference to the first aspect or any one of the first to third possible embodiments of the first aspect, in a fourth possible embodiment of the first aspect, after the aggregating of the user position data to obtain the user trajectory, the method further includes: removing abnormal points from the user trajectory and smoothing the user trajectory.
With reference to the first aspect or any one of the first to fourth possible embodiments of the first aspect, in a fifth possible embodiment of the first aspect, after the establishing of the mapping relation between the user trajectory and the optimal node, the method further includes: storing the user trajectory in a storage location corresponding to the optimal node.
With reference to the first aspect or any one of the first to fifth possible embodiments of the first aspect, in a sixth possible embodiment of the first aspect, the method further includes:
obtaining, according to the region level index tree, the user trajectories mapped to the optimal node;
performing adaptive region optimization on the optimal node according to the user trajectories mapped to the optimal node, to obtain an optimized node;
obtaining a mapping relation between the user trajectory and the optimized node according to the optimized node and the mapping relation between the user trajectory and the optimal node;
forming an optimized region level index tree according to the region level index tree and the mapping relation between the user trajectory and the optimized node.
In a second aspect, a position data processing apparatus includes:
a data acquisition module, configured to obtain user position data in a target area;
a trajectory acquisition module, configured to aggregate the user position data to obtain a user trajectory, where the user trajectory includes the position information that the user passes through in the target area;
an optimal node determining module, configured to determine, according to the position information that the user passes through in the target area, an optimal node of the user trajectory in a region hierarchical tree, where the region hierarchical tree is a tree structure that takes the target area as its root node and the subregions contained in the target area as child nodes, and the optimal node is the lowest-level node that contains all the position information that the user passes through in the target area;
a region level index tree acquisition module, configured to establish a mapping relation between the user trajectory and the optimal node to obtain a region level index tree, where the region level index tree is the region hierarchical tree that contains the mapping relation.
With reference to the second aspect, in a first possible embodiment of the second aspect, the apparatus further includes:
a region hierarchical tree division module, configured to divide the target area, with reference to the user position data and according to a data balancing principle and a transregional minimum principle, to obtain the region hierarchical tree.
With reference to the first possible embodiment of the second aspect, in a second possible embodiment of the second aspect, the region hierarchical tree division module is configured to:
obtain an initial region hierarchical tree according to the data balancing principle and the transregional minimum principle;
optimize the initial region hierarchical tree according to an optimization principle to obtain the region hierarchical tree.
With reference to the second possible embodiment of the second aspect, in a third possible embodiment of the second aspect, the region hierarchical tree division module is specifically configured to apply:
y_min = a × M + b × N;
where, among the values y obtained from a × M + b × N, the smallest is y_min; y represents the initial region hierarchical tree and y_min represents the region hierarchical tree; M reflects the data balancing principle, and the more balanced the data, the smaller M is, with M = |0.5 - (number of position points in the node) / (total number of position points in the root node)|; N reflects the transregional minimum principle, and the fewer the cross-region trajectories, the smaller N is, with N = (number of trajectories that cross child nodes) / (total number of trajectories); a and b are the weights of M and N respectively, and a + b = 1.
With reference to the second aspect or any one of the first to third possible embodiments of the second aspect, in a fourth possible embodiment of the second aspect, the apparatus further includes a trajectory smoothing module, configured to remove abnormal points from the user trajectory and smooth the user trajectory.
With reference to the second aspect or any one of the first to fourth possible embodiments of the second aspect, in a fifth possible embodiment of the second aspect, the apparatus further includes a storage module, configured to store the user trajectory in a storage location corresponding to the optimal node.
With reference to the second aspect or any one of the first to fifth possible embodiments of the second aspect, in a sixth possible embodiment of the second aspect, the apparatus further includes an adaptive region optimization module, configured to:
obtain, according to the region level index tree, the user trajectories mapped to the optimal node;
perform adaptive region optimization on the optimal node according to the user trajectories mapped to the optimal node, to obtain an optimized node;
obtain a mapping relation between the user trajectory and the optimized node according to the optimized node and the mapping relation between the user trajectory and the optimal node;
form an optimized region level index tree according to the region level index tree and the mapping relation between the user trajectory and the optimized node.
In summary, a region level index tree is built so that user position data is mapped and managed through the region level index tree with the user trajectory as the smallest unit. Applications and management built on such user position data thereby become more efficient and more accurate. For the massive user position data of the big data era in particular, the method provided by the embodiments of the present invention makes applications and management of that data faster and more efficient.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings required for describing the embodiments are briefly introduced below. Apparently, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
Fig. 1 is a flowchart of the position data processing method provided by Embodiment one of the present invention;
Fig. 2a is a schematic diagram of the position data processing method provided by Embodiment two of the present invention;
Fig. 2b is another schematic diagram of the position data processing method provided by Embodiment two of the present invention;
Fig. 3 is a schematic diagram of the position data processing method provided by Embodiment three of the present invention;
Fig. 4 is a flowchart of the position data processing method provided by yet another embodiment of the present invention;
Fig. 5a is a schematic diagram of the position data processing method provided by yet another embodiment of the present invention;
Fig. 5b is another schematic diagram of the position data processing method provided by yet another embodiment of the present invention;
Fig. 6 is a schematic diagram of the position data processing apparatus provided by Embodiment four of the present invention;
Fig. 7 is a schematic diagram of a computer device provided by an embodiment of the present invention.
Detailed description of the embodiments
To make the objectives, technical solutions and advantages of the present invention clearer, the embodiments of the present invention are described in further detail below with reference to the accompanying drawings.
Embodiment one. Refer to Fig. 1, which shows a flowchart of a position data processing method provided by one embodiment of the present invention. The method includes the following steps:
S103: obtain user position data in a target area;
S105: aggregate the user position data to obtain a user trajectory, where the user trajectory includes the position information that the user passes through in the target area;
S107: determine, according to the position information that the user passes through in the target area, an optimal node of the user trajectory in a region hierarchical tree, where the region hierarchical tree is a tree structure that takes the target area as its root node and the subregions contained in the target area as child nodes, and the optimal node is the lowest-level node that contains all the position information that the user passes through in the target area;
S109: establish a mapping relation between the user trajectory and the optimal node to obtain a region level index tree, where the region level index tree is the region hierarchical tree that contains the mapping relation.
In summary, the position data processing method provided by this embodiment obtains the user position data in a target area, aggregates it into user trajectories, determines the optimal node of each user trajectory in the region hierarchical tree according to the position information that the user passes through in the target area, and establishes the mapping relation between the user trajectory and the optimal node to obtain a region level index tree. Massive user position data is thereby partitioned according to the region level index tree, and the user trajectories of a designated area can be retrieved efficiently from that tree. When the volume of position data surges, an analysis task can still use the region level index tree described in this embodiment to find the designated area quickly, without downloading, filtering and analyzing data outside the designated area, which saves resources and improves analysis performance.
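Steps S107 and S109 can be illustrated with a minimal sketch. It assumes that every region is an axis-aligned rectangle and that a trajectory is a list of (x, y) points lying inside the target area; the names `RegionNode`, `find_optimal_node` and `index_trajectory` are hypothetical and do not come from the patent text.

```python
from dataclasses import dataclass, field


@dataclass
class RegionNode:
    """One node of the region hierarchical tree; bounds are (min_x, min_y, max_x, max_y)."""
    name: str
    bounds: tuple
    children: list = field(default_factory=list)
    trajectories: list = field(default_factory=list)  # trajectories mapped to this node

    def contains(self, point):
        x, y = point
        min_x, min_y, max_x, max_y = self.bounds
        return min_x <= x <= max_x and min_y <= y <= max_y


def find_optimal_node(node, trajectory):
    """Return the lowest-level node whose region contains every point of the trajectory."""
    for child in node.children:
        if all(child.contains(p) for p in trajectory):
            return find_optimal_node(child, trajectory)
    return node  # no child covers the whole trajectory, so this node is optimal


def index_trajectory(root, trajectory):
    """Record the trajectory -> optimal node mapping (step S109)."""
    optimal = find_optimal_node(root, trajectory)
    optimal.trajectories.append(trajectory)
    return optimal


# Hypothetical target area split into a west and an east subregion.
west = RegionNode("west", (0, 0, 5, 10))
east = RegionNode("east", (5, 0, 10, 10))
root = RegionNode("target", (0, 0, 10, 10), [west, east])

local_track = [(1, 1), (2, 3), (4, 8)]   # stays inside "west"
crossing_track = [(4, 5), (6, 5)]        # crosses from "west" into "east"
print(index_trajectory(root, local_track).name)     # west
print(index_trajectory(root, crossing_track).name)  # target
```

A trajectory that crosses subregions is attached to their common ancestor, which is why keeping cross-region trajectories few keeps most data at the lower levels of the tree.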
It is worth noting that a user trajectory in the present invention can be understood as the curve formed as the user's position in space changes continuously over time; that is, the trajectory contains both temporal and spatial information. This understanding applies to every mention of a user trajectory in this document and is not repeated below.
Embodiment two. On the basis of Embodiment one, in the position data processing method provided by another embodiment of the present invention, before step S107 of Embodiment one, the method further includes step S106: dividing the target area, with reference to the user position data and according to the data balancing principle and the transregional minimum principle, to obtain the region hierarchical tree, as shown in Fig. 2a. For example, rectangle 1 on the left of Fig. 2a represents the target area and, correspondingly, circle 1 on the right of Fig. 2a represents the root node corresponding to the target area. Rectangles 2 and 3 on the left of Fig. 2a represent subregions 2 and 3 obtained by dividing the target area; rectangles 4 and 5 represent subregions 4 and 5 obtained by dividing subregion 2; rectangles 6 and 7 represent subregions 6 and 7 obtained by dividing subregion 3. Proceeding in the same way, the target area on the left is divided until the corresponding region hierarchical tree is obtained. It is worth noting that the figure only helps to explain how the target area is divided to obtain the region hierarchical tree; the specific dividing manner and numerical characteristics in it do not limit this solution in any way.
Further, the data balancing principle means that the user position data is distributed as evenly as possible among the subregions of the target area; the transregional minimum principle means that the number of subregions a user trajectory crosses among the subregion nodes of the divided region hierarchical tree is kept as small as possible. The data balancing principle can be understood as horizontal data balancing, and the transregional minimum principle can be understood as vertical data balancing.
For example, consider the core arterial traffic area of a city, through which 80% of the city's daily population flow passes. If this area is made a single subregion, 80% of the user trajectories fall in this one region and the remaining 20% are scattered across the other regions, which does not satisfy the transregional minimum principle. Moreover, when task analysis is performed on the region that holds 80% of the user trajectories, the overly coarse granularity leads to many redundant operations.
Specifically, the dividing of the target area, with reference to the user position data and according to the data balancing principle and the transregional minimum principle, to obtain the region hierarchical tree includes:
obtaining an initial region hierarchical tree according to the data balancing principle and the transregional minimum principle;
optimizing the initial region hierarchical tree according to an optimization principle to obtain the region hierarchical tree.
Further, the obtaining of the initial region hierarchical tree according to the data balancing principle and the transregional minimum principle and the optimizing of the initial region hierarchical tree according to the optimization principle to obtain the region hierarchical tree specifically include:
y_min = a × M + b × N;
where, among the values y obtained from a × M + b × N, the smallest is y_min; y represents the initial region hierarchical tree and y_min represents the region hierarchical tree; M reflects the data balancing principle, and the more balanced the data, the smaller M is, with M = |0.5 - (number of position points in the node) / (total number of position points in the root node)|; N reflects the transregional minimum principle, and the fewer the cross-region trajectories, the smaller N is, with N = (number of trajectories that cross child nodes) / (total number of trajectories); a and b are the weights of M and N respectively, and a + b = 1. It should be noted that the remaining steps of Embodiment two are processed in the same way as the corresponding steps of Embodiment one and are not repeated here.
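The cost y = a × M + b × N can be sketched for one simple case. The sketch assumes a region is split in two by a vertical line and that candidate split positions are supplied by the caller; both are illustrative assumptions, since the text does not prescribe how candidate divisions are generated. The function names and the default weights a = b = 0.5 are hypothetical.

```python
def split_cost(points, tracks, in_child, a=0.5, b=0.5):
    """Cost y = a*M + b*N of one candidate split.

    points:   all position points in the root region
    tracks:   list of trajectories, each a list of (x, y) points
    in_child: predicate telling whether a point falls in the child node under test
    """
    child_points = sum(1 for p in points if in_child(p))
    m = abs(0.5 - child_points / len(points))  # data balancing term M
    crossing = sum(1 for t in tracks if len({in_child(p) for p in t}) > 1)
    n = crossing / len(tracks)                 # transregional term N
    return a * m + b * n


def best_vertical_split(tracks, candidates, a=0.5, b=0.5):
    """Pick the candidate x-coordinate whose split gives the smallest cost (y_min)."""
    points = [p for t in tracks for p in t]
    return min(
        candidates,
        key=lambda x: split_cost(points, tracks, lambda p: p[0] < x, a, b),
    )


# Hypothetical data: four short trajectories along the x axis.
tracks = [
    [(1, 1), (2, 2)],
    [(3, 1), (4, 2)],
    [(6, 1), (7, 2)],
    [(8, 1), (9, 2)],
]
print(best_vertical_split(tracks, [2.5, 5, 8.5]))  # 5
```

Splitting at x = 5 puts half of the points on each side and cuts no trajectory, so both M and N are zero; the split at x = 8.5 is penalized on both terms.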
In summary, the target area is divided according to the data balancing principle and the transregional minimum principle to obtain the region hierarchical tree. Because data balance is considered both horizontally and vertically, the data is distributed more evenly among the subregions into which the region hierarchical tree divides the target area. On the basis of a region hierarchical tree whose data is as balanced as possible, the optimal node of a user trajectory in the region hierarchical tree is determined according to the position information that the user passes through in the target area, so the region level index tree finally obtained has a more suitable granularity. In use, task-oriented processing can locate a designated area more efficiently and avoids wasting processing resources on unrelated redundant data.
It is worth noting that dividing the target area according to administrative regions is a common way of dividing regions. The embodiments of the present invention, however, do not limit the specific dividing manner; the division only needs to follow the data balancing principle and the transregional minimum principle. If an administrative division happens to satisfy the data balancing principle and the transregional minimum principle, it can also serve as a division in an embodiment of the present invention. As shown in Fig. 2b, taking Hangzhou as the target area, and assuming that the administrative division on the left of the figure happens to satisfy the two principles, the tree diagram on the right, which corresponds to the administrative division on the left, is the region hierarchical tree obtained by dividing the target area Hangzhou. The figure shows only one possible division that satisfies the data balancing principle and the transregional minimum principle and is not a limiting feature of this solution.
Embodiment three. On the basis of Embodiment one and Embodiment two, refer to Fig. 3, which shows a schematic diagram of a position data processing method provided by another embodiment of the present invention. The method includes the following.
Specifically, in step S105 of Embodiment one, the user position data is aggregated to obtain a user trajectory, where the user trajectory includes the position information that the user passes through in the target area. The aggregation can be performed in the manner shown in Fig. 3. The left side of Fig. 3 shows the user position data, where P1 corresponds to position point 1, P2 corresponds to position point 2, and so on up to Pn; each position point may include, but is not limited to, its longitude, latitude and time information. The trajectory on the right is the user trajectory obtained by aggregating the position data on the left, and it contains the position information of P1, P2, ..., Pn.
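The aggregation in step S105 can be sketched as grouping records by user and ordering them by time. The record layout (user_id, timestamp, longitude, latitude) and the function name are assumptions made for illustration only.

```python
from collections import defaultdict


def aggregate_trajectories(records):
    """Group (user_id, timestamp, lon, lat) records into time-ordered trajectories."""
    by_user = defaultdict(list)
    for user_id, timestamp, lon, lat in records:
        by_user[user_id].append((timestamp, lon, lat))
    # Sort each user's points by time so that P1, P2, ..., Pn form a trajectory.
    return {user: sorted(points) for user, points in by_user.items()}


# Hypothetical records; timestamps are in seconds.
records = [
    ("u1", 20, 120.16, 30.27),
    ("u2", 10, 120.20, 30.25),
    ("u1", 10, 120.15, 30.26),
]
trajectories = aggregate_trajectories(records)
```

Each value of `trajectories` is one user's list of position points in time order, which is the form the later cleaning and indexing steps operate on.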
Optionally, Fig. 4 shows a flowchart of a position data processing method provided by yet another embodiment of the present invention. After the user position data is aggregated to obtain the user trajectory in S105 of Embodiment one, the method may further include S105a: removing abnormal points from the user trajectory and smoothing the user trajectory.
Specifically, the removal of abnormal points and the smoothing of the trajectory are mainly based on whether the trajectory is reasonable. There are many methods for removing abnormal points and smoothing trajectories, and the embodiments of the present invention do not limit them. For ease of understanding, the embodiments of the present invention provide one optional set of cleaning rules and explain it with specific examples, as follows:
1. Removing abnormal points:
An abnormal point here is a position point that clearly contradicts common sense. For example, for a person moving within a city, even with the help of vehicles (including the subway, excluding aircraft), the moving speed does not exceed 120 km/h. Taking into account the measurement errors of the user's two consecutive position points, for two adjacent points now and pre, the distance between the two points should not exceed T1:
T1 = (now.time - pre.time) × 120 km/h + now.error + pre.error (1.1)
Because the measurement error of each point is currently not available, it can be set to a fixed value. Since the measurement error of the vast majority of points is within 300 meters, formula (1.1) can be replaced with T2:
T2 = (now.time - pre.time) × 120 km/h + 2 × 300 m (1.2)
Decision logic: of two adjacent points, first determine one to be a normal point; when the distance between the two points is greater than the value given by formula (1.2), delete the other point.
It is worth noting that the values 120 km/h and 300 m above are parameters that can be adjusted based on experience or need.
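The threshold of formula (1.2) can be sketched as follows. The sketch assumes planar coordinates in meters and timestamps in seconds, and it treats the first point of a trajectory as the normal point, which simplifies the decision logic; the function names are hypothetical.

```python
import math

MAX_SPEED_M_PER_S = 120 * 1000 / 3600  # 120 km/h, tunable
POINT_ERROR_M = 300                    # assumed per-point measurement error, tunable


def max_allowed_distance(prev_time, now_time):
    """Threshold T2 from formula (1.2), in meters; times are in seconds."""
    return (now_time - prev_time) * MAX_SPEED_M_PER_S + 2 * POINT_ERROR_M


def remove_outliers(trajectory):
    """Drop points that lie too far from the last accepted point.

    trajectory: time-ordered list of (time_s, x_m, y_m). The first point is
    assumed to be normal, which is a simplification.
    """
    if not trajectory:
        return []
    kept = [trajectory[0]]
    for now in trajectory[1:]:
        prev = kept[-1]
        distance = math.hypot(now[1] - prev[1], now[2] - prev[2])
        if distance <= max_allowed_distance(prev[0], now[0]):
            kept.append(now)
    return kept


# Hypothetical trajectory: the third point jumps 49 km in one minute.
track = [(0, 0, 0), (60, 1000, 0), (120, 50000, 0), (180, 3000, 0)]
cleaned = remove_outliers(track)
```

Comparing each point against the last accepted point, rather than the last raw point, keeps a single bad fix from also disqualifying the good point that follows it.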
2. Removing repeated points:
Multiple longitude-latitude points (position points) of the same user at the same moment are removed. That is, when the same user has multiple position points at the same moment, abnormal points are considered certain to exist; such abnormal points are called repeated points. The goal of removing repeated points is to make the same user correspond to only one position point at any moment.
Specifically, repeated points may be removed as follows:
2.1. When the multiple position points are close to one another, take the average of the multiple positions with different longitudes and latitudes at that moment.
2.2. When the multiple position points are far from one another, compare the accuracy of each point against the position points of the closest historical moments; the accuracy may be judged with T1 or T2 above. For example, take the position points A, B and C of the three closest historical moments, and let the two repeated points at the current moment be D and E. The accuracy between D and A, B, C is judged with T1 or T2: if D is within the allowable error of T1 or T2 with respect to two of the three points A, B and C, the accuracy of D is 2. The accuracy between E and A, B, C is judged in the same way: if E is within the allowable error with respect to one of the three points, the accuracy of E is 1. The accuracy of D at the current moment is then considered higher than that of E. Further, the method in item 1 is used to judge whether the distance between D and the position point of the previous moment is normal; if it is normal, D is retained, and otherwise no operation is performed.
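Rules 2.1 and 2.2 can be sketched together under several assumptions that are not in the text: planar coordinates in meters, timestamps in seconds, a hypothetical 100 m threshold for "close to one another", the most recent accepted point standing in for the position point of the previous moment, and `None` returned when the final distance check fails.

```python
import math

SPEED = 120 * 1000 / 3600  # 120 km/h in m/s
ERROR = 300                # assumed per-point measurement error in meters
NEAR_M = 100               # hypothetical "close together" threshold, not from the source


def dist(p, q):
    return math.hypot(p[1] - q[1], p[2] - q[2])


def within_t2(p, q):
    """True when p and q are no farther apart than T2 of formula (1.2)."""
    return dist(p, q) <= abs(p[0] - q[0]) * SPEED + 2 * ERROR


def resolve_duplicates(history, candidates):
    """Reduce several same-time candidates to at most one point.

    history:    earlier accepted points (time_s, x_m, y_m), time-ordered
    candidates: points that share one timestamp
    """
    spread = max(dist(p, q) for p in candidates for q in candidates)
    if spread <= NEAR_M:
        # Rule 2.1: nearby duplicates are replaced by their mean position.
        n = len(candidates)
        return (candidates[0][0],
                sum(p[1] for p in candidates) / n,
                sum(p[2] for p in candidates) / n)
    # Rule 2.2: score each candidate against the three most recent points.
    recent = history[-3:]
    scores = [sum(within_t2(c, h) for h in recent) for c in candidates]
    best = candidates[scores.index(max(scores))]
    # Keep the winner only if it also passes the distance check of rule 1.
    return best if recent and within_t2(best, recent[-1]) else None


history = [(0, 0, 0), (60, 1000, 0), (120, 2000, 0)]
far_apart = [(180, 3000, 0), (180, 40000, 0)]   # D and E in the text's example
close_by = [(180, 3000, 0), (180, 3040, 20)]
chosen = resolve_duplicates(history, far_apart)
averaged = resolve_duplicates(history, close_by)
```

Here the first candidate is consistent with all three historical points and the second with none, so the first is chosen; the two nearby candidates are merged into their mean.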
In summary, the user trajectory is smoothed and its abnormal points are cleaned out, so the resulting user trajectory is more accurate. The optimal node found for the user trajectory in the region hierarchical tree is therefore also more accurate, and the region level index tree that is output is more reliable.
In addition, on the basis of Embodiments one, two and three and the above solution, storage processing can be further performed. Specifically, after S109, in which the mapping relation between the user trajectory and the optimal node is established, the method further includes: storing the user trajectory in a storage location corresponding to the optimal node.
With this operation, a storage location in the corresponding region is found for each user trajectory, which makes calls for managing user trajectories more concise.
It should also be noted that, on the basis of Embodiments one, two and three and the above solution, optimization processing can be further performed. Specifically, after S109, in which the mapping relation between the user trajectory and the optimal node is established to obtain the region level index tree, the method further includes:
obtaining, according to the region level index tree, the user trajectories mapped to the optimal node;
performing adaptive region optimization on the optimal node according to the user trajectories mapped to the optimal node, to obtain an optimized node;
obtaining a mapping relation between the user trajectory and the optimized node according to the optimized node and the mapping relation between the user trajectory and the optimal node;
forming an optimized region level index tree according to the region level index tree and the mapping relation between the user trajectory and the optimized node.
Specifically, the adaptive region optimization in this solution can proceed as follows:
Determine whether the optimal node contains multiple trajectory clustering centers, where "multiple" means two or more. If it does, partition the optimal node into multiple nodes, and then apply minimum bounding rectangle processing to the region corresponding to each of those nodes to obtain the optimization nodes. As shown in Fig. 5a, the large rectangle on the left represents the optimal node, which contains two trajectory clustering centers, and the small dashed rectangles on the right are the optimization nodes. The figure is only an example, and the data characteristics in it do not limit this solution in any way. Note that the number of optimization nodes may equal the number of nodes produced by the partitioning, or may be smaller; this is not limited here.
If the optimal node does not contain multiple trajectory clustering centers, apply minimum bounding rectangle processing to the region corresponding to the optimal node to obtain the optimization node. As shown in Fig. 5b, the large rectangle on the left represents the optimal node, which contains one trajectory clustering center, and the large dashed rectangle on the right is the optimization node. The figure is only an example, and the features in it do not limit this solution in any way.
It is worth noting that "minimum bounding rectangle" (MBR) is a common term in the art; it is also translated as minimum enclosing rectangle, minimum containing rectangle, or minimum outer rectangle. The minimum bounding rectangle is the maximum extent of a two-dimensional shape (such as a point, a line, or a polygon) expressed in two-dimensional coordinates, that is, the rectangle whose boundary is fixed by the maximum abscissa, minimum abscissa, maximum ordinate, and minimum ordinate among the vertices of the given shape.
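The definition above can be illustrated with a minimal Python sketch. The function name and the sample coordinates are illustrative assumptions and do not come from the patent; the sketch only shows that the rectangle is fixed by the extreme coordinates of the given points.

```python
from typing import Iterable, Tuple

Point = Tuple[float, float]  # (x, y), for example (longitude, latitude)


def minimum_bounding_rectangle(points: Iterable[Point]) -> Tuple[float, float, float, float]:
    """Return (min_x, min_y, max_x, max_y) for a non-empty set of 2-D points."""
    pts = list(points)
    if not pts:
        raise ValueError("at least one point is required")
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    # The rectangle is fixed by the extreme abscissas and ordinates.
    return (min(xs), min(ys), max(xs), max(ys))


# Hypothetical trajectory points that fall inside one optimal node.
cluster = [(120.10, 30.25), (120.14, 30.28), (120.12, 30.31)]
print(minimum_bounding_rectangle(cluster))
```

When an optimal node is first partitioned around several clustering centers, the same function could be applied to each partition separately to obtain one rectangle per optimization node.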
Through adaptive region optimization, the optimized region level index tree that is finally formed has a finer node division than the region level index tree before optimization (referred to as the "region level index tree" in the embodiments of the present invention). Task operations on a designated area can therefore be more accurate and more efficient, operations on unrelated redundant data are further avoided, resource consumption is reduced, and task performance is improved.
To make the foregoing solution easier to understand, this embodiment again uses Hangzhou, shown in Fig. 2b, as the target area. Again assume that the administrative division shown in Fig. 2b happens to satisfy the data balancing principle and the transregional minimum principle. The region hierarchical tree obtained by dividing the target area then corresponds to the tree diagram on the right of the figure, and mapping the user trajectories of each region into the region hierarchical tree according to their mapping relations with the optimal nodes yields the region level index tree. When a task needs to be performed on the location data of, for example, West Lake north, the task can be any related operation; here it is assumed to be a statistic of mobility within the region. The corresponding node (assumed to be node 6) is found directly from the region level index tree, and all user trajectories corresponding to node 6 are obtained and analyzed. With this method, a task aimed at a designated area no longer has to download the data of the whole large region and then filter it, for example by first downloading the user location data of all of Hangzhou, filtering out the data unrelated to West Lake north item by item, and finally performing the task on the remaining location data of West Lake north. Such processing wastes resources and gives very poor analysis performance. It can be seen that, in some feasible embodiments of the present invention, user trajectories are mapped one by one onto optimal nodes on the region hierarchical tree, where the optimal node is the lowest-level node that contains all the positional information the user passes through in the target area. The region level index tree is obtained from the mapping relations established between user trajectories and optimal nodes, so that location-data-oriented task processing and applications based on the region level index tree are more efficient and more accurate.
To better implement the foregoing solutions of the embodiments of the present invention, related devices for implementing them are also provided below.
Embodiment 4: Referring to Fig. 6, an embodiment of the present invention provides a position data processing device 600, which may include:
A data acquisition module 630, configured to obtain location data in a target area;
A track acquisition module 650, configured to aggregate the location data to obtain a user trajectory, where the user trajectory contains the positional information that the user passes through in the target area;
An optimal node determining module 670, configured to determine, according to the positional information that the user passes through in the target area, the optimal node of the user trajectory on the region hierarchical tree, where the region hierarchical tree is a tree structure with the target area as the root node and the subregions contained in the target area as child nodes, and the optimal node is the lowest-level node that contains all the positional information that the user passes through in the target area;
A region level index tree acquisition module 690, configured to establish the mapping relations between the user trajectory and the optimal node to obtain a region level index tree, where the region level index tree is the region hierarchical tree containing the mapping relations.
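The roles of the optimal node determining module and the region level index tree acquisition module can be sketched roughly as follows. This is a simplified illustration rather than the patented implementation: it assumes every region is an axis-aligned rectangle, and the names `RegionNode`, `find_optimal_node`, and `build_index` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]


@dataclass
class RegionNode:
    name: str
    box: Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y); rectangular regions are an assumption
    children: List["RegionNode"] = field(default_factory=list)
    trajectories: List[str] = field(default_factory=list)  # IDs of trajectories mapped to this node

    def contains(self, p: Point) -> bool:
        min_x, min_y, max_x, max_y = self.box
        return min_x <= p[0] <= max_x and min_y <= p[1] <= max_y


def find_optimal_node(root: RegionNode, track: List[Point]) -> Optional[RegionNode]:
    """Return the lowest-level node whose region contains every point of the track."""
    if not track or not all(root.contains(p) for p in track):
        return None  # empty track, or the track leaves the target area
    node = root
    while True:
        # Descend while some child still contains the whole track.
        child = next((c for c in node.children if all(c.contains(p) for p in track)), None)
        if child is None:
            return node
        node = child


def build_index(root: RegionNode, tracks: Dict[str, List[Point]]) -> None:
    """Record the trajectory-to-optimal-node mapping on the tree itself."""
    for track_id, track in tracks.items():
        node = find_optimal_node(root, track)
        if node is not None:
            node.trajectories.append(track_id)
```

With the mapping stored on the nodes, a task aimed at one region can start from that region's node instead of scanning the location data of the whole target area.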
In summary, in the position data processing device 600 provided by this embodiment, the data acquisition module 630 obtains the location data in the target area, the track acquisition module 650 aggregates the location data to obtain a user trajectory, the optimal node determining module 670 determines the optimal node of the user trajectory on the region hierarchical tree according to the positional information that the user passes through in the target area, and the region level index tree acquisition module 690 establishes the mapping relations between the user trajectory and the optimal node to obtain the region level index tree. Massive user location data is thus divided according to the region level index tree, and the user trajectories of a designated area can be retrieved efficiently based on that tree. When the volume of location data grows sharply, a task can still use the region level index tree described in the embodiments of the present invention to find the designated area quickly and efficiently, without downloading and filtering data outside the designated area, which saves resources and improves analysis performance.
It is worth noting that a user trajectory in the present invention can be understood as the curve formed as the user's position in space changes continuously over time; that is, the trajectory contains both temporal and spatial information. This understanding applies to every use of "user trajectory" in this document and is not repeated below.
In some embodiments of the present invention, the device 600 may further include:
A region hierarchical tree division module, configured to divide the target area with reference to the location data, according to the data balancing principle and the transregional minimum principle, to obtain the region hierarchical tree.
Further optionally, the region hierarchical tree division module is configured to:
Obtain an initial region hierarchical tree according to the data balancing principle and the transregional minimum principle;
Optimize the initial region hierarchical tree according to an optimization principle to obtain the region hierarchical tree.
Further optionally, the region hierarchical tree division module is specifically configured to compute:
$y_{\min} = a \times M + b \times N$;
Here, among the values $y$ obtained from $a \times M + b \times N$, the smallest is $y_{\min}$; $y$ represents the initial region hierarchical tree and $y_{\min}$ represents the region hierarchical tree. $M$ embodies the data balancing principle: the more balanced the data, the smaller $M$, with $M = \left| 0.5 - \dfrac{\text{number of location points in the child node}}{\text{total number of location points in the root node}} \right|$. $N$ embodies the transregional minimum principle: the fewer trajectories cross regions, the smaller $N$, with $N = \dfrac{\text{number of trajectories crossing child nodes}}{\text{total number of trajectories}}$. $a$ and $b$ are the weights of $M$ and $N$ respectively, and $a + b = 1$.
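As a rough illustration of how the weighted score could select a division, the sketch below evaluates y = a × M + b × N for several candidate ways of splitting one region into two child nodes and keeps the candidate with the smallest y. Splitting along a vertical line, the candidate list, and the function names are assumptions made for the example; the patent does not specify how candidate divisions are generated.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Track = List[Point]


def split_cost(tracks: Sequence[Track], split_x: float, a: float = 0.5, b: float = 0.5) -> float:
    """Compute y = a*M + b*N for a split of the region at the vertical line x = split_x."""
    total_points = sum(len(t) for t in tracks)
    if not tracks or total_points == 0:
        raise ValueError("at least one non-empty trajectory is required")
    # M: deviation of the left child's share of location points from one half.
    left_points = sum(1 for t in tracks for (x, _) in t if x < split_x)
    m = abs(0.5 - left_points / total_points)
    # N: fraction of trajectories that have points on both sides of the split.
    crossing = sum(1 for t in tracks if len({x < split_x for (x, _) in t}) > 1)
    n = crossing / len(tracks)
    return a * m + b * n


def best_split(tracks: Sequence[Track], candidates: Sequence[float],
               a: float = 0.5, b: float = 0.5) -> float:
    """Return the candidate split position with the smallest y."""
    return min(candidates, key=lambda s: split_cost(tracks, s, a, b))


# Hypothetical trajectories and candidate split positions.
tracks = [[(1, 0), (2, 0)], [(3, 0), (4, 0)], [(7, 0), (8, 0)], [(8, 0), (9, 0)]]
print(best_split(tracks, [2.5, 5.0, 8.5]))
```

Raising a favours divisions with evenly distributed location points, while raising b favours divisions that cut through fewer trajectories.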
In summary, the region hierarchical tree division module divides the target area according to the data balancing principle and the transregional minimum principle to obtain the region hierarchical tree, taking data balance into account both horizontally and vertically, so that data is distributed more evenly among the subregions into which the target area is divided. Based on a region hierarchical tree whose data is as balanced as possible, the optimal node of the user trajectory on the region hierarchical tree is determined from the positional information that the user passes through in the target area, and the region level index tree finally obtained has a more suitable granularity. Task processing and applications can then locate a designated area more efficiently and avoid wasting processing resources on unrelated redundant data.
It is worth noting that dividing the target area by administrative region is a common way of dividing regions. The embodiments of the present invention do not restrict the specific way of dividing; the division follows the data balancing principle and the transregional minimum principle. If an administrative division happens to satisfy both principles, it can also serve as one way of dividing in an embodiment of the present invention. For example, as shown in Fig. 2b, take Hangzhou as the target area and assume that the administrative division on the left of the figure happens to satisfy the data balancing principle and the transregional minimum principle; the tree diagram on the right, which corresponds to the administrative division on the left, is then the region hierarchical tree obtained by dividing the target area Hangzhou. The figure is only one possible division that satisfies the two principles and is not a limiting feature of this solution.
In other embodiments of the present invention, the device 600 may further include: a smooth trajectory module, configured to remove the abnormal points in the user trajectory and smooth the user trajectory.
In practice, trajectory smoothing by removing abnormal points is based mainly on whether the trajectory is reasonable. There are many methods for removing abnormal points and smoothing trajectories, and the embodiments of the present invention do not limit them. For ease of understanding, the smooth trajectory module in this embodiment provides one optional cleaning rule, explained with a specific example as follows:
1. Removing abnormal points:
An abnormal point here is a location point that clearly contradicts common sense. For example, a person moving within a city, even with the help of vehicles (including the subway but not aircraft), does not move faster than 120 km/h. Taking into account the measurement errors of the two successive location points of the user, the distance between two adjacent points now and pre should not exceed the value T1:
T1 = (now.time - pre.time) × 120 km/h + now.error + pre.error (1.1)
Because the measurement error of each individual point is currently not available, the error can be set to a fixed value. Since the measurement error of the vast majority of points is within 300 meters, formula (1.1) can be replaced with T2:
T2 = (now.time - pre.time) × 120 km/h + 2 × 300 m (1.2)
Decision logic: first determine which of the two adjacent points is normal; when the distance between the two points exceeds the value given by formula (1.2), delete the other point.
It is worth noting that 120 km/h and 300 m above are parameters that can be adjusted based on experience or need.
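The T2 rule above can be sketched in Python as follows. Several details are assumptions made for illustration and are not stated in the patent: timestamps are in seconds, the distance between two fixes is computed with the haversine formula, and the first fix of a trajectory is taken as the known normal point.

```python
import math
from typing import List, NamedTuple


class Fix(NamedTuple):
    time: float  # seconds
    lat: float   # degrees
    lon: float   # degrees


MAX_SPEED_MPS = 120_000 / 3600  # 120 km/h expressed in metres per second
MAX_ERROR_M = 300.0             # assumed fixed measurement error per point


def haversine_m(p: Fix, q: Fix) -> float:
    """Great-circle distance in metres between two fixes."""
    r = 6_371_000.0  # mean Earth radius in metres
    phi1, phi2 = math.radians(p.lat), math.radians(q.lat)
    dphi = phi2 - phi1
    dlmb = math.radians(q.lon - p.lon)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(h))


def t2_threshold_m(pre: Fix, now: Fix) -> float:
    """Formula (1.2): elapsed time times maximum speed, plus two fixed errors."""
    return (now.time - pre.time) * MAX_SPEED_MPS + 2 * MAX_ERROR_M


def remove_abnormal_points(track: List[Fix]) -> List[Fix]:
    """Drop every fix that is farther than T2 from the last fix that was kept."""
    if not track:
        return []
    kept = [track[0]]  # assumption: the first fix is normal
    for now in track[1:]:
        pre = kept[-1]
        if haversine_m(pre, now) <= t2_threshold_m(pre, now):
            kept.append(now)
    return kept
```

Comparing each fix with the last kept fix, rather than with its immediate predecessor, keeps a single abnormal point from causing the following normal point to be deleted as well.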
2. Removing repeated points:
Remove multiple latitude/longitude points (location points) of the same user at the same moment. When the same user has multiple location points at the same moment, an abnormal point certainly exists; such an abnormal point is called a repeated point. The goal is to remove repeated points so that each user corresponds to only one location point at any one moment.
A specific method for removing repeated points can be:
2.1. When the multiple location points are close to each other, take the average of the positions with different latitude and longitude at that moment.
2.2. When the multiple location points are far apart, compare each repeated point with the location points of the nearest historical moments to judge its accuracy, using the T1 or T2 threshold described above. For example, take the location points A, B, and C corresponding to the three nearest historical moments, and let D and E be the two repeated points at the current moment. Check D against each of A, B, and C using T1 or T2; if D is within the allowable error of T1 or T2 for two of the three points, D's accuracy is 2. Check E against A, B, and C in the same way; if E is within the allowable error for one of the three points, E's accuracy is 1. D is then considered more accurate than E at the current moment. Next, use the method in item 1 to check whether the distance between D and the location point at the penultimate moment is normal. If it is normal, D is retained; otherwise no operation is performed.
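Rules 2.1 and 2.2 can be sketched as follows. This is only an illustration under stated assumptions: positions are planar coordinates in metres, the T2 check uses the same 120 km/h and 300 m values as formula (1.2), and the `near_m` cutoff that separates "close" from "far apart" is a hypothetical parameter that the patent does not give. The final check against the penultimate moment is omitted for brevity.

```python
import math
from typing import List, NamedTuple


class Fix(NamedTuple):
    time: float  # seconds
    x: float     # metres in an assumed planar projection
    y: float


def within_t2(pre: Fix, now: Fix, speed_mps: float = 120_000 / 3600, error_m: float = 300.0) -> bool:
    """True if the distance between the two fixes does not exceed T2."""
    limit = abs(now.time - pre.time) * speed_mps + 2 * error_m
    return math.dist((pre.x, pre.y), (now.x, now.y)) <= limit


def accuracy(candidate: Fix, history: List[Fix]) -> int:
    """Number of historical fixes that the candidate is consistent with."""
    return sum(1 for h in history if within_t2(h, candidate))


def resolve_repeated(candidates: List[Fix], history: List[Fix], near_m: float = 100.0) -> Fix:
    """Reduce several fixes reported for one moment to a single fix."""
    spread = max(math.dist((p.x, p.y), (q.x, q.y)) for p in candidates for q in candidates)
    if spread <= near_m:
        # Rule 2.1: the points are close together, so average them.
        n = len(candidates)
        return Fix(candidates[0].time,
                   sum(c.x for c in candidates) / n,
                   sum(c.y for c in candidates) / n)
    # Rule 2.2: the points are far apart, so keep the one that agrees
    # with the most of the three nearest historical fixes.
    recent = history[-3:]
    return max(candidates, key=lambda c: accuracy(c, recent))
```

If two candidates tie on accuracy, `max` returns the first one in the list; a real implementation would need an explicit tie-breaking rule.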
In summary, the smooth trajectory module smooths the user trajectory and clears out its abnormal points, so the resulting user trajectory is more accurate. The optimal node found for the user trajectory on the region hierarchical tree is therefore also more accurate, and the region level index tree that is output is more credible.
On the basis of all the foregoing device embodiments, the device 600 may further include: a memory module, configured to store the user trajectory in the storage location corresponding to the optimal node.
The memory module assigns each user trajectory the storage location of its corresponding region, which makes managing and retrieving user trajectories simpler.
On the basis of all the foregoing device embodiments, the device 600 may further include: an adaptive region optimization module, configured to:
Obtain, from the region level index tree, the user trajectory mapped to the optimal node;
Perform adaptive region optimization on the optimal node according to the user trajectory mapped to it, to obtain an optimization node;
Obtain the mapping relations between the user trajectory and the optimization node according to the mapping relations between the user trajectory and the optimal node and the optimization node;
Form the optimized region level index tree according to the region level index tree and the mapping relations between the user trajectory and the optimization node.
For the specific operation of the adaptive region optimization module, refer to the corresponding description in the method embodiments; details are not repeated here.
Through adaptive region optimization, the optimized region level index tree that is finally formed has a finer node division than the region level index tree before optimization (referred to as the "region level index tree" in the embodiments of the present invention). Task operations on a designated area can therefore be more accurate and more efficient, operations on unrelated redundant data are further avoided, resource consumption is reduced, and task performance is improved.
An embodiment of the present invention also provides a computer-readable medium that includes computer-executable instructions. When a processor of a computer executes the computer-executable instructions, the computer performs the position data processing method disclosed in the embodiment of Fig. 1.
Referring to Fig. 7, an embodiment of the present invention also provides a computer device 700, which may include: a processor 710, a memory 720, a communication interface 730, and a bus 740. The processor 710, the memory 720, and the communication interface 730 are connected by the bus 740 and communicate with one another. The communication interface 730 is configured to receive and send data. The memory 720 is configured to store computer-executable instructions. When the computer device runs, the processor 710 executes the computer-executable instructions in the memory, so that the computer device performs the position data processing method disclosed in the embodiment of Fig. 1.
In the above, an embodiment of the present invention discloses a computer device that uses user location data obtained from a network or locally, aggregates it to obtain user trajectories, determines the optimal node of each user trajectory on the region hierarchical tree, and then establishes the mapping relations between the user trajectory and the optimal node to obtain the region level index tree. The region level index tree obtained in this way makes it possible to analyze and process massive location data quickly and efficiently, so that applications built on this massive location data are more efficient and more accurate.
In the foregoing embodiments, the description of each embodiment has its own emphasis. For parts not described in detail in one embodiment, refer to the related descriptions of other embodiments.
It should be noted that, for brevity, each of the foregoing method embodiments is described as a series of combined actions. Those skilled in the art should understand, however, that the present invention is not limited by the described order of actions, because according to the present invention some steps can be performed in another order or simultaneously. Those skilled in the art should also understand that the embodiments described in this specification are all preferred embodiments, and the actions and modules involved are not necessarily required by the present invention.
A person of ordinary skill in the art will understand that all or part of the steps of the methods in the foregoing embodiments can be completed by a program instructing related hardware. The program can be stored in a computer-readable storage medium, and the storage medium can include ROM, RAM, a magnetic disk, an optical disc, and the like.
The position data processing method and device provided by the embodiments of the present invention have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present invention, and the description of the above embodiments is only intended to help in understanding the method of the present invention and its core idea. Meanwhile, a person of ordinary skill in the art may, following the idea of the present invention, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting the present invention.
The foregoing are only preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (12)

  1. A position data processing method, characterized by comprising:
    obtaining location data in a target area;
    aggregating the location data to obtain a user trajectory, wherein the user trajectory contains positional information that the user passes through in the target area;
    with reference to the location data, dividing the target area according to a data balancing principle and a transregional minimum principle to obtain a region hierarchical tree;
    determining, according to the positional information that the user passes through in the target area, an optimal node of the user trajectory on the region hierarchical tree, wherein the region hierarchical tree is a tree structure with the target area as the root node and the subregions contained in the target area as child nodes, and the optimal node is the lowest-level node that contains all the positional information that the user passes through in the target area;
    establishing mapping relations between the user trajectory and the optimal node to obtain a region level index tree, wherein the region level index tree is the region hierarchical tree containing the mapping relations.
  2. The method according to claim 1, characterized in that dividing the target area, with reference to the location data and according to the data balancing principle and the transregional minimum principle, to obtain the region hierarchical tree specifically comprises:
    obtaining an initial region hierarchical tree according to the data balancing principle and the transregional minimum principle;
    optimizing the initial region hierarchical tree according to an optimization principle to obtain the region hierarchical tree.
  3. The method according to claim 2, characterized in that obtaining the initial region hierarchical tree according to the data balancing principle and the transregional minimum principle, and optimizing the initial region hierarchical tree according to the optimization principle to obtain the region hierarchical tree, specifically comprise:
    $y_{\min} = a \times M + b \times N$;
    wherein, among the values $y$ obtained from $a \times M + b \times N$, the smallest is $y_{\min}$; $y$ represents the initial region hierarchical tree and $y_{\min}$ represents the region hierarchical tree; $M$ embodies the data balancing principle, the more balanced the data the smaller $M$, and $M = \left| 0.5 - \dfrac{\text{number of location points in the child node}}{\text{total number of location points in the root node}} \right|$; $N$ embodies the transregional minimum principle, the fewer trajectories cross regions the smaller $N$, and $N = \dfrac{\text{number of trajectories crossing child nodes}}{\text{total number of trajectories}}$; $a$ and $b$ are the weights of $M$ and $N$ respectively, and $a + b = 1$.
  4. The method according to any one of claims 1 to 3, characterized in that, after aggregating the location data to obtain the user trajectory, the method further comprises: removing abnormal points in the user trajectory and smoothing the user trajectory.
  5. The method according to claim 4, characterized in that, after establishing the mapping relations between the user trajectory and the optimal node, the method further comprises: storing the user trajectory in a storage location corresponding to the optimal node.
  6. The method according to claim 5, characterized in that the method further comprises:
    obtaining, from the region level index tree, the user trajectory mapped to the optimal node;
    performing adaptive region optimization on the optimal node according to the user trajectory mapped to the optimal node, to obtain an optimization node;
    obtaining mapping relations between the user trajectory and the optimization node according to the mapping relations between the user trajectory and the optimal node and the optimization node;
    forming an optimized region level index tree according to the region level index tree and the mapping relations between the user trajectory and the optimization node.
  7. A position data processing device, characterized by comprising:
    a data acquisition module, configured to obtain location data in a target area;
    a track acquisition module, configured to aggregate the location data to obtain a user trajectory, wherein the user trajectory contains positional information that the user passes through in the target area;
    a region hierarchical tree division module, configured to divide the target area, with reference to the location data and according to a data balancing principle and a transregional minimum principle, to obtain a region hierarchical tree;
    an optimal node determining module, configured to determine, according to the positional information that the user passes through in the target area, an optimal node of the user trajectory on the region hierarchical tree, wherein the region hierarchical tree is a tree structure with the target area as the root node and the subregions contained in the target area as child nodes, and the optimal node is the lowest-level node that contains all the positional information that the user passes through in the target area;
    a region level index tree acquisition module, configured to establish mapping relations between the user trajectory and the optimal node to obtain a region level index tree, wherein the region level index tree is the region hierarchical tree containing the mapping relations.
  8. The device according to claim 7, characterized in that the region hierarchical tree division module is configured to:
    obtain an initial region hierarchical tree according to the data balancing principle and the transregional minimum principle;
    optimize the initial region hierarchical tree according to an optimization principle to obtain the region hierarchical tree.
  9. The device according to claim 8, characterized in that the region hierarchical tree division module is specifically configured to compute:
    $y_{\min} = a \times M + b \times N$;
    wherein, among the values $y$ obtained from $a \times M + b \times N$, the smallest is $y_{\min}$; $y$ represents the initial region hierarchical tree and $y_{\min}$ represents the region hierarchical tree; $M$ embodies the data balancing principle, the more balanced the data the smaller $M$, and $M = \left| 0.5 - \dfrac{\text{number of location points in the child node}}{\text{total number of location points in the root node}} \right|$; $N$ embodies the transregional minimum principle, the fewer trajectories cross regions the smaller $N$, and $N = \dfrac{\text{number of trajectories crossing child nodes}}{\text{total number of trajectories}}$; $a$ and $b$ are the weights of $M$ and $N$ respectively, and $a + b = 1$.
  10. The device according to any one of claims 7 to 9, characterized in that the device further comprises a smooth trajectory module, configured to remove abnormal points in the user trajectory and smooth the user trajectory.
  11. The device according to claim 10, characterized in that the device further comprises a memory module, configured to store the user trajectory in a storage location corresponding to the optimal node.
  12. The device according to claim 11, characterized in that the device further comprises an adaptive region optimization module, and the adaptive region optimization module is configured to:
    obtain, from the region level index tree, the user trajectory mapped to the optimal node;
    perform adaptive region optimization on the optimal node according to the user trajectory mapped to the optimal node, to obtain an optimization node;
    obtain mapping relations between the user trajectory and the optimization node according to the mapping relations between the user trajectory and the optimal node and the optimization node;
    form an optimized region level index tree according to the region level index tree and the mapping relations between the user trajectory and the optimization node.
CN201410513908.2A 2014-09-29 2014-09-29 A kind of position data processing method and processing device Active CN104268243B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410513908.2A CN104268243B (en) 2014-09-29 2014-09-29 A kind of position data processing method and processing device

Publications (2)

Publication Number Publication Date
CN104268243A CN104268243A (en) 2015-01-07
CN104268243B true CN104268243B (en) 2017-11-17

Family

ID=52159764

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410513908.2A Active CN104268243B (en) 2014-09-29 2014-09-29 A kind of position data processing method and processing device

Country Status (1)

Country Link
CN (1) CN104268243B (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1948913A (en) * 2006-08-25 2007-04-18 北京航空航天大学 Heuristic path culculating method for treating large scale floating vehicle data
CN102291435A (en) * 2011-07-15 2011-12-21 武汉大学 Mobile information searching and knowledge discovery system based on geographic spatiotemporal data

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
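Title
"An Efficient Trajectory Index Structure for Moving Objects in Location-Based Services"; Jae-Woo Chang et al.; R. Meersman et al. (Eds.): OTM Workshops 2005, LNCS 3762; 2005-12-31; pp. 1107-1116 *
"基于用户轨迹挖掘的智能位置服务" [Intelligent location-based services based on user trajectory mining]; 郑宇 et al.; 《中国计算机学会通讯》 [Communications of the China Computer Federation]; 2010-12-31; Vol. 6, No. 6; pp. 1-11 *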

Also Published As

Publication number Publication date
CN104268243A (en) 2015-01-07

Similar Documents

Publication Title
CN104268243B (en) A kind of position data processing method and processing device
CN105893349B (en) Classification tag match mapping method and device
CN104504003B (en) The searching method and device of diagram data
CN106202335B (en) A kind of traffic big data cleaning method based on cloud computing framework
CN106250457B (en) The inquiry processing method and system of big data platform Materialized View
CN104657418A (en) Method for discovering complex network fuzzy association based on membership transmission
CN105630800A (en) Node importance ranking method and system
CN104200045A (en) Parallel computing method for distributed hydrodynamic model of large-scale watershed system
CN104077438A (en) Power grid large-scale topological structure construction method and system
CN105205052B (en) A kind of data digging method and device
CN106709503A (en) Large spatial data clustering algorithm K-DBSCAN based on density
CN103440309A (en) Automatic resource and environment model combination modeling semantic recognition and recommendation method
CN113516246A (en) Parameter optimization method, quantum chip control method and device
US10678967B2 (en) Adaptive resource reservoir development
CN111709102B (en) Water supply network partitioning method based on hierarchical clustering
CN104778088A (en) Method and system for optimizing parallel I/O (input/output) by reducing inter-progress communication expense
CN108683599B (en) Preprocessing-based method and system for determining maximum flow of flow network
CN115964875A (en) Digital twin water network tetrahedral model construction method
CN106933882A (en) A kind of big data incremental calculation method and device
CN114566048B (en) Traffic control method based on multi-view self-adaptive space-time diagram network
Heijnen et al. Maximising the worth of nascent networks
CN105095239A (en) Uncertain graph query method and device
Zexi et al. Cuckoo search algorithm for solving numerical integration
CN108446342A (en) A kind of environmental quality assessment system, method, apparatus and storage device
Shen et al. Improving time-dependent contraction hierarchies

Legal Events

Code Title
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant