CN106780262B - Co-location mode discovery method and device considering urban road network constraints - Google Patents

Co-location mode discovery method and device considering urban road network constraints

Info

Publication number
CN106780262B
Authority
CN
China
Prior art keywords
instance
instances
value
influence
types
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710023460.XA
Other languages
Chinese (zh)
Other versions
CN106780262A (en
Inventor
姚晓婧
彭玲
池天河
崔绍龙
陈六嘉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Aerospace Information Research Institute of CAS
Original Assignee
Aerospace Information Research Institute of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Aerospace Information Research Institute of CAS filed Critical Aerospace Information Research Institute of CAS
Priority to CN201710023460.XA priority Critical patent/CN106780262B/en
Publication of CN106780262A publication Critical patent/CN106780262A/en
Application granted granted Critical
Publication of CN106780262B publication Critical patent/CN106780262B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/10Services
    • G06Q50/26Government or public services

Abstract

The invention discloses a co-location pattern discovery method and device that take urban road network constraints into account. The method comprises the following steps: constructing a second-order instance proximity relation table for a target area under a map projection, the table containing the set of instance pairs in the target area that are of different types and whose reachable distance lies within a preset distance attenuation threshold, together with the reachable distance value of each pair; computing, from the preset distance attenuation threshold and the second-order instance proximity relation table, the network kernel density value of each instance under the influence of instance sets of other types; computing, from the network kernel density values, the average influence of each instance set on instance sets of other types; and computing the popularity of each candidate co-location pattern from the average influence, and determining the popular co-location patterns among the candidates according to a preset popularity threshold. The method and device improve the accuracy of co-location pattern mining on urban facility data.

Description

Co-location mode discovery method and device considering urban road network constraints
Technical Field
The invention relates to the field of spatial data pattern mining, in particular to a co-location pattern discovery method and device considering urban road network constraints.
Background
Over the past decades, growth in population, housing and infrastructure and the expansion of employment have driven rapid urban development in China. With the rapid progress of science and technology and the continuing growth of data resources, the "smart city" that followed is gradually turning to the integrated processing of multi-source spatial data. Urban basic service facility data, which records the location and attribute information of various urban elements, is a key component of the urban basic database. How to extract useful distribution rules and pattern characteristics from this database to guide the rational planning and layout of new towns has become a key problem in current urban development. Most urban facility data exists in point form, and a common approach to point pattern discovery, "co-location pattern mining", has developed rapidly in recent years and is widely applied in fields such as population distribution, public safety and environmental management.
Co-location pattern mining, that is, mining from a real point data set a series of feature-type combinations that depend on one another in space, generally comprises two steps: first, determining through a preset distance threshold whether individuals of different feature types are adjacent in a homogeneous Euclidean space, and building a second-order instance proximity relation table; then, obtaining the significant co-location patterns with a frequent itemset mining method.
However, real urban space is subject to various constraints, and this mining framework has major limitations. First, conventional methods are mostly based on Euclidean distance and treat planar space as homogeneous and isotropic, whereas in real urban space many human-related phenomena occur on the traffic network, such as traffic accidents, street events and the distribution of infrastructure. Second, conventional methods use a single hard distance threshold to make a yes-or-no judgment on the proximity of instances, treating nearer and farther instances within the threshold alike. In fact, according to the first law of geography, near things are more related than distant things, so a close instance relation should contribute more to the pattern popularity index than a distant one. For these two reasons, conventional algorithms bias the final result, so that some patterns that are not actually popular are wrongly determined to be co-location patterns, or some interesting patterns are missed.
Urban public service facilities are an important part of the urban basic database, and studying the characteristics of their spatial distribution can provide an important scientific basis for urban planning and decision-making. Intensive urban development calls for the rational arrangement of facilities, and discovering patterns in the distribution of urban facilities is a precondition for achieving this goal. Space has traditionally been understood as homogeneous, but facilities in a city are mostly distributed in a man-made, non-Euclidean space constrained by the road skeleton, so conventional co-location pattern mining methods cannot handle the urban problem well.
Disclosure of Invention
The present invention is directed to solving the problems described above. Its object is to provide a co-location pattern discovery method and device that take urban road network constraints into account.
The invention provides a co-location pattern discovery method considering urban road network constraints, which comprises the following steps: constructing a second-order instance proximity relation table for a target area under a map projection, the table containing the set of instance pairs in the target area that are of different types and whose reachable distance lies within a preset distance attenuation threshold, together with the reachable distance value of each pair; computing, from the preset distance attenuation threshold and the second-order instance proximity relation table, the network kernel density value of each instance under the influence of instance sets of other types; computing, from the network kernel density values, the average influence of each instance set on instance sets of other types; and computing the popularity of each candidate co-location pattern from the average influence, and determining the popular co-location patterns among the candidates according to a preset popularity threshold.
The method also has the following feature: constructing the second-order instance proximity relation table comprises:
storing the adjacent instance pairs, and the reachable distance value between the instances of each pair, in a two-dimensional hash table $TIns\_net_2$, each cell of which is expressed by a set of triplets:
$TIns\_net_2(e_x, e_y) = \{\langle o_i, o_j, Rdis(o_i, o_j)\rangle, \ldots\}$,
wherein $(e_x, e_y)$ is a pair of spatial object types, $o_i$ and $o_j$ are two adjacent instances, and $Rdis(o_i, o_j)$ is the reachable distance value between the two adjacent instances;
according to
$Rdis(o_i, o_j) = g(y) \cdot (o_i - o_j)_{net(t)} \mid h_t$
computing the reachable distance value between two adjacent instances, wherein $Rdis(o_i, o_j)$ is the reachable distance value between the two adjacent instances; $g(y)$ is a binary function that takes the value 0 if travel from instance $o_i$ to instance $o_j$ along the road runs against the permitted traffic direction, and 1 otherwise; $(o_i - o_j)_{net(t)}$ is the shortest path time from instance $o_i$ to instance $o_j$; and $h_t$ is the density attenuation threshold based on path time, which constrains the shortest-path search: the reachable distance between instances must satisfy the threshold $h_t$, otherwise instance $o_j$ is not reachable from instance $o_i$.
The method also has the following feature: computing, from the preset distance attenuation threshold and the second-order instance proximity relation table, the network kernel density value of each instance under the influence of instance sets of other types comprises:
according to
[Formula image GDA0002729957030000031]
computing the network kernel density value of an instance $o_i$ of type $e_x$ under the influence of the instance set $O'(e_y)$ of type $e_y$,
wherein
[Formula image GDA0002729957030000032]
is the network kernel density value of instance $o_i$ of type $e_x$ under the influence of the instance set $O'(e_y)$ of type $e_y$,
[Formula image GDA0002729957030000033]
is a subset of the set of all instances of type $e_y$ in the region,
$n_{max}$ is the maximum number of instances of any single type in the region,
$K$ is the spatial weighting function,
$n(O'(e_y) \to o_i)$ is the number of instances in $O'(e_y)$ from which $o_i$ can be reached,
and the value of the formula lies in $(0, 1]$.
The method also has the following feature: computing the average influence of each instance set on instance sets of other types from the network kernel density values comprises:
according to
[Formula image GDA0002729957030000034]
computing the average influence of instance set $O'(e_y)$ on instance set $O'(e_x)$,
wherein
[Formula image GDA0002729957030000041]
is the average influence of instance set $O'(e_y)$ on instance set $O'(e_x)$,
[Formula image GDA0002729957030000042]
is the network kernel density value of instance $o_i$ of type $e_x$ under the influence of the instance set $O'(e_y)$ of type $e_y$,
[Formula image GDA0002729957030000043]
is a subset of the set of all instances of type $e_x$ in the region,
$K$ is the spatial weighting function,
$n(O'(e_x))$ is the number of instances in the instance set $O'(e_x)$,
and $n(e_x)$ is the number of all instances of type $e_x$ in the region.
The method also has the following feature: computing the popularity of each candidate co-location pattern from the average influence, and determining the popular co-location patterns among the candidates according to a preset popularity threshold, comprises:
according to
[Formula image GDA0002729957030000044]
computing the popularity of a given candidate pattern,
wherein $PI_{CP}$ is the popularity of the given candidate pattern CP, with values in $(0, 1]$,
$TIns\_net_{CP}$ is the instance table formed by the clique instances of the candidate pattern CP, obtained by joining into clique instances the non-duplicate instance pairs in the second-order instance proximity relation table whose types belong to CP,
$\min\{\}$ returns the minimum of the input set,
[Formula image GDA0002729957030000045]
computes the projection of type $e_x$ on the instance table $TIns\_net_{CP}$,
[Formula image GDA0002729957030000046]
is the average influence of the instance set
[Formula image GDA0002729957030000047]
on the instance set
[Formula image GDA0002729957030000048]
;
when the popularity computed for a candidate co-location pattern is larger than the set popularity threshold, the candidate pattern is determined to be a popular co-location pattern.
The invention also provides a co-location pattern discovery device considering urban road network constraints, which comprises: a second-order instance proximity relation table building module, configured to construct a second-order instance proximity relation table for a target area under a map projection, the table containing the set of instance pairs in the target area that are of different types and whose reachable distance lies within a preset distance attenuation threshold, together with the reachable distance value of each pair;
the network kernel density calculation module is used for calculating and obtaining the network kernel density value of each instance under the influence of other types of instance sets different from the instance type according to a preset distance attenuation threshold and the second-order instance proximity relation table;
the average influence calculation module is used for calculating the average influence of each instance set on other types of instance sets according to the network kernel density value;
and the popular co-location mode acquisition module is used for calculating the popularity of each candidate co-location mode according to the average influence and determining popular co-location modes in the candidate co-location modes according to a preset popularity threshold.
The device also has the following feature: the second-order instance proximity relation table building module is specifically configured to store the adjacent instance pairs, and the reachable distance value between the instances of each pair, in a two-dimensional hash table $TIns\_net_2$, each cell of which is expressed by a set of triplets:
$TIns\_net_2(e_x, e_y) = \{\langle o_i, o_j, Rdis(o_i, o_j)\rangle, \ldots\}$,
wherein $(e_x, e_y)$ is a pair of spatial object types, $o_i$ and $o_j$ are two adjacent instances, and $Rdis(o_i, o_j)$ is the reachable distance value between the two adjacent instances;
and, according to
$Rdis(o_i, o_j) = g(y) \cdot (o_i - o_j)_{net(t)} \mid h_t$
to compute the reachable distance value between two adjacent instances, wherein $Rdis(o_i, o_j)$ is the reachable distance value between the two adjacent instances,
$g(y)$ is a binary function that takes the value 0 if travel from instance $o_i$ to instance $o_j$ along the road runs against the permitted traffic direction, and 1 otherwise,
$(o_i - o_j)_{net(t)}$ is the shortest path time from instance $o_i$ to instance $o_j$,
and $h_t$ is the density attenuation threshold based on path time, which constrains the shortest-path search: the reachable distance between instances must satisfy the threshold $h_t$, otherwise instance $o_j$ is not reachable from instance $o_i$.
The device also has the following feature: the network kernel density calculation module is specifically configured to, according to
[Formula image GDA0002729957030000051]
compute the network kernel density value of an instance $o_i$ of type $e_x$ under the influence of the instance set $O'(e_y)$ of type $e_y$,
wherein
[Formula image GDA0002729957030000061]
is a subset of the set of all instances of type $e_y$ in the region,
$n_{max}$ is the maximum number of instances of any single type in the region,
$K$ is the spatial weighting function,
$n(O'(e_y) \to o_i)$ is the number of instances in $O'(e_y)$ from which $o_i$ can be reached,
and the value of the formula lies in $(0, 1]$ and describes the magnitude of the influence of the instance set $O'(e_y)$ on instance $o_i$ under network constraints.
The device also has the following feature: the average influence calculation module is specifically configured to, according to
[Formula image GDA0002729957030000062]
compute the average influence of instance set $O'(e_y)$ on instance set $O'(e_x)$,
wherein
[Formula image GDA0002729957030000063]
is the network kernel density value of instance $o_i$ of type $e_x$ under the influence of the instance set $O'(e_y)$ of type $e_y$,
[Formula image GDA0002729957030000064]
is a subset of the set of all instances of type $e_x$ in the region,
$n(O'(e_x))$ is the number of instances in the instance set $O'(e_x)$,
and $n(e_x)$ is the number of all instances of type $e_x$ in the region.
The device also has the following feature: the popular co-location pattern acquisition module is specifically configured to, according to
[Formula image GDA0002729957030000065]
compute the popularity of a given candidate pattern,
wherein $PI_{CP}$ is the popularity of the given candidate pattern CP, with values in $(0, 1]$,
$TIns\_net_{CP}$ is the instance table formed by the clique instances of the candidate pattern CP, obtained by joining into clique instances the non-duplicate instance pairs in the second-order instance proximity relation table whose types belong to CP,
$\min\{\}$ returns the minimum of the input set,
[Formula image GDA0002729957030000066]
computes the projection of type $e_x$ on the instance table $TIns\_net_{CP}$,
[Formula image GDA0002729957030000067]
is the average influence of the instance set
[Formula image GDA0002729957030000068]
on the instance set
[Formula image GDA0002729957030000069]
;
when the popularity computed for a candidate co-location pattern is larger than the set popularity threshold, the candidate pattern is determined to be a popular co-location pattern.
The co-location pattern discovery method considering urban road network constraints provided by the invention extends the conventional co-location pattern discovery method in two ways:
(1) Because urban facility points are connected to one another along network paths rather than by Euclidean distance, the spatial kernel function method is placed in a network structure, and the proximity between spatial facilities is measured not by the conventional two-dimensional Euclidean distance but by a reachability index within a specific service time that takes into account attribute information such as the traffic capacity and direction of urban roads.
(2) The popularity index conventionally used to judge how interesting a pattern is has been reformulated: the original index, computed simply from the number of instance connections, is extended with an adjustment parameter that weights by reachable distance. Compared with other methods in the field, the method respects both the fact that the movement of people in a city mainly depends on the road network and the first law of geography, thereby improving the accuracy of co-location pattern mining on urban facility data, and it has practical significance and value.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate an embodiment of the invention and, together with the description, serve to explain the invention and not to limit the invention. In the drawings:
FIG. 1 is a flowchart of a co-location pattern discovery method considering urban road network constraints according to a first embodiment of the present invention;
FIG. 2 is a schematic diagram of a method of constructing a second-order instance proximity relation table;
FIG. 3 is a schematic diagram of an instance table formed by the clique instances of a candidate co-location pattern;
FIG. 4 is a schematic structural diagram of a co-location pattern discovery device considering urban road network constraints according to a second embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention. It should be noted that the embodiments and features of the embodiments in the present application may be arbitrarily combined with each other without conflict.
In order to better illustrate the technical solution provided by the embodiment of the present invention, the following concepts are first explained:
Shp data: short for shapefile, an open spatial data format developed by ESRI, in which spatial graphic elements and their corresponding two-dimensional attributes are managed through an index file.
Co-location pattern: a co-location pattern C is a subset of the set of spatial feature types, and the number of types contained in C is called its length, or order. In the mining problem two concepts are commonly used, popular co-location patterns and candidate co-location patterns, which are represented in the same way but differ in meaning. A candidate co-location pattern is a set of feature types with a potential co-location relationship, generally derived from popular co-location patterns of lower or higher order, and it becomes a popular co-location pattern if it passes popularity verification. A pattern of order n is called a candidate n-order co-location pattern or a popular n-order co-location pattern.
Clique instance: a set of instances of different types that are spatially adjacent to one another.
Instance table: to facilitate pattern computation, the clique instances are stored in a table with the types to which they belong arranged in a fixed order (for example, the lexicographic order of the types in the pattern).
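The instance table can be illustrated with a minimal Python sketch. The encoding of an instance as a `(type, id)` tuple and the helper names `build_instance_table` and `project` are assumptions made for illustration and do not come from the patent; `project` returns the distinct instances of one type in the table, which is the projection used later in the popularity formula.

```python
from typing import List, Set, Tuple

# Hypothetical encoding: an instance is (type name, instance id).
Instance = Tuple[str, int]


def build_instance_table(cliques: List[Set[Instance]]) -> List[Tuple[Instance, ...]]:
    """Store each clique instance as a row whose columns follow the
    lexicographic order of the types (each type occurs once per clique)."""
    return [tuple(sorted(clique)) for clique in cliques]


def project(table: List[Tuple[Instance, ...]], type_name: str) -> Set[Instance]:
    """Return the distinct instances of one type that appear in the table."""
    return {inst for row in table for inst in row if inst[0] == type_name}


# Three clique instances of the second-order pattern (A, B).
cliques = [{("B", 2), ("A", 1)}, {("A", 1), ("B", 3)}, {("B", 3), ("A", 4)}]
table = build_instance_table(cliques)
```

Sorting each row by type keeps the columns aligned, so a projection only needs to scan one logical column.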
The invention provides a co-location pattern discovery method and device considering urban road network constraints. The input is road line Shp data and service facility point Shp data $O = \{o_1, o_2, \ldots, o_n\}$ of the target area under the same map projection, where n is the number of facilities. The service facility data (hereinafter referred to as instances) undergoes sensitive-data removal, coordinate transformation and completeness processing, so that each record contains a facility point type, an X coordinate and a Y coordinate; the set of types in the facility data is $E = \{e_1, e_2, \ldots, e_m\}$, where m is the number of facility types. The instances of type $e_x$ in the area are stored in the set $O(e_x)$, and their number is denoted $n(e_x)$. The road network data undergoes topology checking, coordinate conversion and completeness processing, so that each record contains road grade, type and direction information. In addition, a distance threshold h and a popularity threshold PI_pre must be set. The output is the set of co-location patterns that satisfy the conditions.
The following describes in detail a co-location pattern discovery method and apparatus considering urban road network constraints according to an exemplary embodiment of the present invention with reference to the drawings.
Embodiment One
Fig. 1 is a flowchart illustrating a co-location pattern discovery method considering urban road network constraints according to a first embodiment of the present invention.
Step 101: construct a second-order instance proximity relation table for the target area under a map projection, the table containing the set of instance pairs in the target area that are of different types and whose reachable distance lies within the preset distance attenuation threshold, together with the reachable distance value of each pair.
In network co-location pattern mining, it must first be determined whether instance points of different types can reach each other; if the reachable distance is within the threshold h, the two instance points are adjacent. Urban road networks differ in traffic direction and traffic capacity, so the fact that instance A can reach B does not mean that B can reach A. Under this assumption, the second-order instance proximity relation table under the network spatial kernel density constraint is constructed as follows: first, the road network is converted into a linear reference system based on road segments, a road segment being the line segment between two adjacent road intersections; second, each road segment is divided into linear arc sections of equal length, called basic linear units; third, for each instance in the region, the instances of a different type within h of it are found, and the instance proximity relations and reachable distances are stored in an m × m two-dimensional hash table, defined as the second-order instance proximity relation table $TIns\_net_2$ under the network spatial kernel density constraint. As shown in FIG. 2, the triplet set of the non-empty cell in row x, column y is denoted $TIns\_net_2(e_x, e_y)$. Each cell of the table stores the instance pairs of the corresponding second-order pattern $(e_x, e_y)$ and their reachable distance values as a triplet set $TIns\_net_2(e_x, e_y) = \{\langle o_i, o_j, Rdis(o_i, o_j)\rangle, \ldots\}$.
Assume there is an instance $o_i$ of type $e_x$ at location x. Its neighbouring instances and their reachable distances are computed as follows: the basic linear unit nearest to $o_i$ is taken as the generator; the instance points of other types whose shortest path length from the generator is within the threshold h are searched for; and $o_i$ is paired with each of these instance points, retaining the distance value.
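As a rough illustration of the table layout, the following Python sketch groups directed instance pairs into cells keyed by their type pair. It assumes the directed reachable distances have already been computed and are passed in as a dictionary; the function name `build_tins_net2`, the input format, and the choice of `<=` for "within the threshold" are assumptions, not details taken from the patent.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Triplet = Tuple[str, str, float]  # (o_i, o_j, Rdis(o_i, o_j))


def build_tins_net2(
    types: Dict[str, str],
    rdis: Dict[Tuple[str, str], float],
    h: float,
) -> Dict[Tuple[str, str], List[Triplet]]:
    """Group directed instance pairs of different types whose reachable
    distance is within h into cells keyed by the type pair (e_x, e_y)."""
    table: Dict[Tuple[str, str], List[Triplet]] = defaultdict(list)
    for (oi, oj), d in rdis.items():
        # Keep only pairs of different types within the threshold (assumed <=).
        if types[oi] != types[oj] and d <= h:
            table[(types[oi], types[oj])].append((oi, oj, d))
    return dict(table)


# Hypothetical toy data: instance id -> type, and directed reachable distances.
types = {"a1": "A", "a2": "A", "b1": "B", "c1": "C"}
rdis = {("a1", "b1"): 3.0, ("b1", "a1"): 7.0, ("a1", "a2"): 1.0, ("a2", "c1"): 12.0}
tins_net2 = build_tins_net2(types, rdis, h=10.0)
```

Because reachability is directed, the cells `("A", "B")` and `("B", "A")` are filled independently and may hold different distances for the same two instances.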
If network direction and traffic capacity are considered, the reachable distance $Rdis(o_i, o_j)$ from instance $o_i$ to $o_j$ can be expressed as:
$Rdis(o_i, o_j) = g(y) \cdot (o_i - o_j)_{net(t)} \mid h_t$ (1)
In the above formula, $g(y)$ is a binary function that takes the value 0 if travel from $o_i$ to $o_j$ along the road runs against the permitted traffic direction, and 1 otherwise; $(o_i - o_j)_{net(t)}$ is the shortest path time from $o_i$ to $o_j$; and $h_t$ is the density attenuation threshold based on path time, which constrains the shortest-path search: the reachable distance between instances must satisfy the threshold $h_t$, otherwise $o_j$ is not reachable from $o_i$.
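One way to read formula (1) is as a time-bounded shortest-path search on a directed road graph. The sketch below is a minimal illustration under that reading, not the patent's implementation: the adjacency-list graph format and the function name `rdis` are assumptions, and one-way streets are modelled by omitting the reverse edge, so a pair that the formula would mark with $g(y) = 0$ is reported here as unreachable (`None`).

```python
import heapq
from typing import Dict, List, Optional, Tuple

# Hypothetical graph format: node -> list of (successor, travel time).
Graph = Dict[str, List[Tuple[str, float]]]


def rdis(graph: Graph, source: str, target: str, h_t: float) -> Optional[float]:
    """Shortest travel time from source to target along directed edges,
    or None if the target cannot be reached within the time threshold h_t."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        t, node = heapq.heappop(queue)
        if node == target:
            return t
        if t > best.get(node, float("inf")):
            continue  # stale queue entry
        for nxt, cost in graph.get(node, []):
            nt = t + cost
            # Prune paths that already exceed the threshold.
            if nt <= h_t and nt < best.get(nxt, float("inf")):
                best[nxt] = nt
                heapq.heappush(queue, (nt, nxt))
    return None


# One-way road p -> q -> r: r is reachable from p, but not the reverse.
road = {"p": [("q", 2.0)], "q": [("r", 3.0)], "r": []}
```

Pruning at `h_t` inside the search keeps the explored neighbourhood small, which matters when the search is repeated for every instance.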
Step 102: compute, from the preset distance attenuation threshold and the second-order instance proximity relation table, the network kernel density value of each instance under the influence of instance sets of other types.
The spatial kernel density value may be expressed in network space as:
[Formula image GDA0002729957030000101] (2)
In the above formula, $f(x)$ is the kernel density value at spatial position x; h is the distance attenuation threshold, the maximum distance at which an instance point is considered adjacent to x; $(x - o_i)_{net}$ is the network reachable distance between x and the adjacent instance point $o_i$; n is the number of instance points whose distance from x is less than h; and K is the spatial weighting function, whose geometric meaning is that the density contribution decreases gradually as the distance from location x to an instance point increases. Many scholars have shown that the choice of K has little effect on the resulting pattern distribution; the present invention uses a Gaussian function as the weighting function, expressed as follows:
$K(x) = a \exp\left(-\frac{(x - b)^2}{2c^2}\right)$ (3)
In the above equation, $\exp(\cdot)$ is the exponential function with the natural constant e as its base. $K(x)$ has the typical bell-shaped curve, with three adjustable parameters a, b and c, where a determines the peak height of the curve, b determines the abscissa at which the peak occurs, and c determines the width of the curve. The invention uses a standard two-dimensional Gaussian kernel function with $a = c = 1$ and $b = 0$.
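The three-parameter Gaussian described above can be written in a few lines of Python. The sketch assumes the standard form $K(x) = a \exp(-(x - b)^2 / (2c^2))$, which matches the stated roles of a, b and c; the function name `gaussian_kernel` is hypothetical.

```python
import math


def gaussian_kernel(x: float, a: float = 1.0, b: float = 0.0, c: float = 1.0) -> float:
    """Gaussian weight: a sets the peak height, b the peak position, c the width.
    The defaults a = c = 1, b = 0 are the parameter values stated in the text."""
    return a * math.exp(-((x - b) ** 2) / (2.0 * c ** 2))
```

With the default parameters the weight is 1 at distance 0 and decays monotonically, so nearer instances always contribute more than farther ones.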
In conventional co-location pattern mining, instance connections are treated as undirected; when road constraints are considered, however, the direction of an instance connection must be taken into account. On this premise, the network kernel density value of an instance $o_i$ (of type $e_x$) under the influence of the instance set $O'(e_y)$ of type $e_y$ is defined as:
[Formula image GDA0002729957030000103] (4)
In the above formula,
[Formula image GDA0002729957030000104]
is a subset of the set of all instances of type $e_y$ in the region, $n_{max}$ is the maximum number of instances of any single type in the region, K is the spatial weighting function, and $n(O'(e_y) \to o_i)$ is the number of instances in $O'(e_y)$ from which $o_i$ can be reached. The formula is obtained by transforming the network spatial kernel density model; its value lies in $(0, 1]$ and describes the magnitude of the influence of the set $O'(e_y)$ on instance $o_i$ under network constraints.
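The exact form of formula (4) is not recoverable from this text, so the following Python sketch only illustrates one form that is consistent with the stated ingredients: it sums a Gaussian weight of the scaled reachable distance over the instances that can reach $o_i$ and divides by $n_{max}$. Both the scaling of the distance by h and the normalisation by $n_{max}$ are assumptions, as is the function name `network_kernel_density`.

```python
import math
from typing import Iterable


def network_kernel_density(reach_dists: Iterable[float], h: float, n_max: int) -> float:
    """Assumed form: (1 / n_max) * sum of K(d / h) over the reachable
    distances d from instances of the other type to the target instance,
    with K the Gaussian weight using a = c = 1 and b = 0."""
    if h <= 0 or n_max <= 0:
        raise ValueError("h and n_max must be positive")
    return sum(math.exp(-0.5 * (d / h) ** 2) for d in reach_dists) / n_max
```

Under these assumptions each term is at most 1 and there are at most $n_{max}$ terms, so the value stays at or below 1, which agrees with the range given in the text whenever at least one instance is reachable.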
Step 103: compute, from the network kernel density values, the average influence of each instance set on instance sets of other types.
Based on equation (4), the average influence of instance set $O'(e_y)$ on instance set $O'(e_x)$ is computed by the following formula:
[Formula image GDA0002729957030000111] (5)
In the above formula,
[Formula image GDA0002729957030000112]
$n(O'(e_x))$ is the number of instances in the set $O'(e_x)$, and $n(e_x)$ is the number of all instances of type $e_x$ in the area. Compared with the conventional co-location pattern mining method, the average influence emphasizes both the interrelation between instances of different types in the candidate pattern and the degree of participation of instances of a single type in the candidate pattern.
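Formula (5) is likewise not recoverable from this text. The sketch below assumes one plausible reading of the description: the mean network kernel density over the participating instances $O'(e_x)$, multiplied by the participation ratio $n(O'(e_x)) / n(e_x)$. This product form and the function name `average_influence` are assumptions for illustration only.

```python
from typing import Sequence


def average_influence(densities: Sequence[float], n_type: int) -> float:
    """Assumed form: (mean density over O'(e_x)) * (n(O'(e_x)) / n(e_x)).

    densities: network kernel density values of the instances in O'(e_x)
    n_type:    n(e_x), the number of all instances of type e_x in the region
    """
    if n_type <= 0:
        raise ValueError("n_type must be positive")
    n_part = len(densities)
    if n_part == 0:
        return 0.0
    mean_density = sum(densities) / n_part   # strength of the relation
    participation = n_part / n_type          # share of e_x instances involved
    return mean_density * participation
```

For example, two participating instances with densities 0.5 and 1.0 out of four instances of the type give 0.75 × 0.5 = 0.375 under this assumed form.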
And 104, calculating the popularity of each candidate parity mode according to the average influence, and determining popular parity modes in the candidate parity modes according to a preset popularity threshold.
Based on equation (5), the popularity PI of the candidate patterns based on the network core density is given belowCPThe calculation formula of (2):
Figure GDA0002729957030000113
In the above formula, PI_CP is the popularity of the given candidate pattern CP, with value range (0, 1]; as the length of the candidate pattern increases, PI_CP decreases accordingly. TIns_netCP is the instance table formed by the clique instances of the candidate pattern CP; it is obtained by joining, through clique instances, the non-duplicate instance pairs in the second-order instance proximity relation table TIns_net2 that involve the types in CP (as shown in Fig. 3). min{ } computes the minimum value of the input set,
Figure GDA0002729957030000114
computes the projection of type e_x onto the instance table TIns_netCP, and
Figure GDA0002729957030000115
is the average influence, with
Figure GDA0002729957030000116
as the influencing instance set and
Figure GDA0002729957030000117
as the influenced instance set.
When the popularity PI_CP calculated for a candidate co-location pattern is greater than the set popularity threshold PI_pre, the candidate pattern is determined to be a popular co-location pattern.
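Step 104 can be sketched in Python as follows. The popularity formula is present here only as an image reference, so this sketch rests on an assumed reading of the definitions above: each type in CP is projected onto the clique instance table, and PI_CP is taken as the minimum average influence over all ordered pairs of distinct types. The names project, popularity and is_prevalent are hypothetical, and the influence function is passed in by the caller because its exact form is not reproduced in this text.

```python
from itertools import permutations


def project(instance_table, type_name):
    """Projection of one type onto the clique instance table:
    the distinct instances of that type that appear in any clique instance."""
    return {row[type_name] for row in instance_table}


def popularity(instance_table, types, influence):
    """Assumed PI_CP: minimum average influence over ordered pairs of types.

    instance_table: list of clique instances, each a dict type -> instance id.
    types:          the types that make up the candidate pattern CP.
    influence:      caller-supplied function
                    influence(e_y, set_y, e_x, set_x) -> float,
                    the average influence of set_y on set_x.
    """
    projections = {t: project(instance_table, t) for t in types}
    return min(
        influence(e_y, projections[e_y], e_x, projections[e_x])
        for e_x, e_y in permutations(types, 2)
    )


def is_prevalent(pi_cp, pi_pre):
    """A candidate is kept when its popularity exceeds the threshold PI_pre."""
    return pi_cp > pi_pre
```

Taking the minimum means a pattern is only as popular as its weakest pair of types, which is consistent with the statement that PI_CP decreases as the pattern grows longer.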
Fig. 4 is a schematic structural diagram illustrating a co-located mode discovery apparatus considering urban road network constraints according to a second embodiment of the present invention.
Referring to fig. 4, the co-location pattern discovery apparatus considering the urban road network constraint includes:
a second-order instance proximity relation table building module 401, configured to build a second-order instance proximity relation table for the target area under map projection, where the table contains, for every instance in the target area, the set of instance pairs it forms with instances of a different type whose reachable distance from it is within a preset distance attenuation threshold, together with the reachable distance value of each pair;
a network kernel density calculation module 402, configured to calculate, according to the preset distance attenuation threshold and the second-order instance proximity relation table, the network kernel density value of each instance under the influence of the instance sets of types different from that of the instance;
an average influence calculation module 403, configured to calculate, according to the network kernel density values, the average influence of each instance set on the instance sets of other types;
a popular co-location pattern obtaining module 404, configured to calculate the popularity of each candidate co-location pattern according to the average influence, and to determine the popular co-location patterns among the candidate co-location patterns according to a preset popularity threshold.
The second-order instance proximity relation table building module 401 is specifically configured to store the neighboring instance pairs and their corresponding reachable distance values in a two-dimensional hash table TIns_net2, where each cell of the table is expressed as a set of triplets of the following form:
TIns_net2(e_x, e_y) = {<o_i, o_j, Rdis(o_i, o_j)>, …},
wherein (e_x, e_y) are two spatial object types, o_i and o_j are two neighboring instances, and Rdis(o_i, o_j) is the reachable distance value between the two neighboring instances;
and to calculate the reachable distance value between two neighboring instances according to
Rdis(o_i, o_j) = g(y) * (o_i - o_j)_{net(t)} | h_t
wherein Rdis(o_i, o_j) is the reachable distance value between the two neighboring instances,
g(y) is a binary function whose value is 0 if the direction of travel along the road from instance o_i to instance o_j is opposite to the permitted traffic direction of the road, and 1 otherwise,
(o_i - o_j)_{net(t)} is the shortest path time from instance o_i to instance o_j,
h_t is the density attenuation threshold based on path time; it is a constraint on solving the shortest path time, indicating that the reachable distance between instances must satisfy the threshold h_t, otherwise instance o_j is not reachable from instance o_i.
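The table construction performed by module 401 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the road network is a hypothetical adjacency dictionary of directed edges weighted by travel time, so that a one-way street appears only in its permitted direction (standing in for the factor g(y)); the shortest path time is found with Dijkstra's algorithm, which the source does not name; and the search is cut off at h_t. The function names reachable_distance and build_proximity_table are invented for this example.

```python
import heapq
from collections import defaultdict


def reachable_distance(graph, source, target, h_t):
    """Shortest travel time from source to target, or None if it exceeds h_t.

    graph: dict node -> list of (neighbor, travel_time) directed edges.
           A one-way road is simply absent in the forbidden direction.
    """
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        t, node = heapq.heappop(heap)
        if node == target:
            return t
        if t > best.get(node, float("inf")):
            continue  # stale queue entry
        for nxt, cost in graph.get(node, ()):
            new_t = t + cost
            # Paths longer than h_t can never satisfy the threshold: prune them.
            if new_t <= h_t and new_t < best.get(nxt, float("inf")):
                best[nxt] = new_t
                heapq.heappush(heap, (new_t, nxt))
    return None


def build_proximity_table(graph, instances, h_t):
    """Build TIns_net2 as a hash table keyed by (e_x, e_y).

    instances: dict instance id -> (type, graph node the instance sits on).
    Each cell holds triplets (o_i, o_j, Rdis(o_i, o_j)) for pairs of
    different types that are reachable within h_t.
    """
    table = defaultdict(list)
    for o_i, (e_x, node_i) in instances.items():
        for o_j, (e_y, node_j) in instances.items():
            if e_x == e_y:
                continue  # only pairs of different types are stored
            rdis = reachable_distance(graph, node_i, node_j, h_t)
            if rdis is not None:
                table[(e_x, e_y)].append((o_i, o_j, rdis))
    return table
```

Because the edges are directed, the cell for (e_x, e_y) and the cell for (e_y, e_x) can differ, which reflects the one-way constraint described above.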
The network kernel density calculation module 402 is specifically configured to calculate, according to
Figure GDA0002729957030000121
the network kernel density value of instance o_i of type e_x under the influence of instance set O'(e_y) of type e_y,
wherein
Figure GDA0002729957030000131
is a subset of the set of all instances of type e_y in the region,
n_max is the maximum number of instances of any single type in the region,
K denotes the spatial weighting function,
n(O'(e_y)->o_i) is the number of reachable instance pairs from instances in O'(e_y) to o_i,
the value range of the result of the formula is (0, 1], and it describes the magnitude of the influence of instance set O'(e_y) on instance o_i under network constraints;
wherein the average influence calculation module 403 is specifically configured to calculate, according to
Figure GDA0002729957030000132
the average influence of instance set O'(e_y) on instance set O'(e_x),
wherein
Figure GDA0002729957030000133
is the network kernel density value of instance o_i of type e_x under the influence of instance set O'(e_y) of type e_y,
Figure GDA0002729957030000134
is a subset of the set of all instances of type e_x in the region,
n(O'(e_x)) is the number of instances in instance set O'(e_x), and
n(e_x) is the number of all instances of type e_x in the region.
The popular co-location pattern obtaining module 404 is specifically configured to calculate, according to
Figure GDA0002729957030000135
the popularity of a given candidate pattern,
wherein PI_CP is the popularity of the given candidate pattern CP, with value range (0, 1],
TIns_netCP is the instance table formed by the clique instances of the candidate pattern CP, obtained by joining, through clique instances, the non-duplicate instance pairs in the second-order instance proximity relation table TIns_net2 that involve the types in CP,
min{ } finds the minimum value of the input set,
Figure GDA0002729957030000136
computes the projection of type e_x onto the instance table TIns_netCP,
Figure GDA0002729957030000137
is the average influence, with
Figure GDA0002729957030000138
as the influencing instance set and
Figure GDA0002729957030000141
as the influenced instance set;
and to determine that a candidate co-location pattern is a popular co-location pattern when the popularity calculated for it is greater than the set popularity threshold.
The invention provides a co-location pattern discovery method considering urban road network constraints, which extends the traditional co-location pattern discovery method in two ways:
(1) Because the interaction between urban facility points takes place over network path distance rather than Euclidean distance, the spatial kernel function method is placed in a network structure, and the traditional two-dimensional Euclidean distance is replaced by a reachability index within a specific service time, which takes into account attribute information such as the traffic capacity and direction of urban roads, to measure the degree of proximity between spatial facilities.
(2) The popularity index traditionally used to judge how interesting a pattern is has been reworked: an adjustment parameter based on reachable-distance weights is added to the original index, which was calculated simply from the number of instance connections. Compared with other methods in the field, the method of the invention respects both the fact that the movement of people in a city relies mainly on the road network and the first law of geography, thereby improving the accuracy of co-location pattern mining on urban facility data and giving it practical significance and value.
The above-described aspects may be implemented individually or in various combinations, and such variations are within the scope of the present invention.
It will be understood by those skilled in the art that all or part of the steps of the above methods may be implemented by instructing the relevant hardware through a program, and the program may be stored in a computer readable storage medium, such as a read-only memory, a magnetic or optical disk, and the like. Alternatively, all or part of the steps of the foregoing embodiments may also be implemented by using one or more integrated circuits, and accordingly, each device/unit in the foregoing embodiments may be implemented in a form of hardware, and may also be implemented in a form of software functional device. The present invention is not limited to any specific form of combination of hardware and software.
It is to be noted that, in this document, the terms "comprises", "comprising" or any other variation thereof are intended to cover a non-exclusive inclusion, so that an article or apparatus including a series of elements includes not only those elements but also other elements not explicitly listed or inherent to such article or apparatus. Without further limitation, an element defined by the phrase "comprising … …" does not exclude the presence of additional like elements in the article or device comprising the element.
The above embodiments are merely to illustrate the technical solutions of the present invention and not to limit the present invention, and the present invention has been described in detail with reference to the preferred embodiments. It will be understood by those skilled in the art that various modifications and equivalent arrangements may be made without departing from the spirit and scope of the present invention and it should be understood that the present invention is to be covered by the appended claims.

Claims (8)

1. A co-location pattern discovery method considering urban road network constraints, the method comprising:
constructing a second-order instance proximity relation table for a target area under map projection, wherein the second-order instance proximity relation table contains, for every instance in the target area, the set of instance pairs it forms with instances of a different type whose reachable distance from it is within a preset distance attenuation threshold, together with the reachable distance value of each pair;
calculating, according to the preset distance attenuation threshold and the second-order instance proximity relation table, the network kernel density value of each instance under the influence of the instance sets of types different from that of the instance;
calculating, according to the network kernel density values, the average influence of each instance set on the instance sets of other types;
calculating the popularity of each candidate co-location pattern according to the average influence, and determining the popular co-location patterns among the candidate co-location patterns according to a preset popularity threshold;
wherein calculating, according to the preset distance attenuation threshold and the second-order instance proximity relation table, the network kernel density value of each instance under the influence of the instance sets of types different from that of the instance comprises:
according to
Figure FDA0002729957020000011
calculating the network kernel density value of instance o_i of type e_x under the influence of instance set O'(e_y) of type e_y,
wherein
Figure FDA0002729957020000012
is the network kernel density value of instance o_i of type e_x under the influence of instance set O'(e_y) of type e_y,
Figure FDA0002729957020000013
is a subset of the set of all instances of type e_y in the region,
n_max is the maximum number of instances of any single type in the region,
K denotes the spatial weighting function, o_i and o_j are two neighboring instances,
Rdis(o_i, o_j) is the reachable distance value between the two neighboring instances,
h_t denotes the density attenuation threshold based on path time,
n(O'(e_y)->o_i) is the number of reachable instance pairs from instances in O'(e_y) to o_i, and
the value range of the result of the formula is (0, 1].
2. The method of claim 1,
wherein constructing the second-order instance proximity relation table comprises:
storing the neighboring instance pairs and their corresponding reachable distance values in a two-dimensional hash table TIns_net2, each cell of which is expressed as a set of triplets of the following form:
TIns_net2(e_x, e_y) = {<o_i, o_j, Rdis(o_i, o_j)>, …},
wherein (e_x, e_y) are two spatial object types, o_i and o_j are two neighboring instances, and Rdis(o_i, o_j) is the reachable distance value between the two neighboring instances; and
calculating the reachable distance value between two neighboring instances according to
Rdis(o_i, o_j) = g(y) * (o_i - o_j)_{net(t)} | h_t
wherein Rdis(o_i, o_j) is the reachable distance value between the two neighboring instances; g(y) is a binary function whose value is 0 if the direction of travel along the road from instance o_i to instance o_j is opposite to the permitted traffic direction of the road, and 1 otherwise; (o_i - o_j)_{net(t)} is the shortest path time from instance o_i to instance o_j; and h_t is the density attenuation threshold based on path time, which is a constraint on solving the shortest path time, indicating that the reachable distance between instances must satisfy the threshold h_t, otherwise instance o_j is not reachable from instance o_i.
3. The method of claim 1,
wherein calculating, according to the network kernel density values, the average influence of each instance set on the instance sets of other types comprises:
calculating, according to
Figure FDA0002729957020000021
the average influence of instance set O'(e_y) on instance set O'(e_x),
wherein
Figure FDA0002729957020000022
is the average influence of instance set O'(e_y) on instance set O'(e_x),
Figure FDA0002729957020000023
is the network kernel density value of instance o_i of type e_x under the influence of instance set O'(e_y) of type e_y,
Figure FDA0002729957020000031
is a subset of the set of all instances of type e_x in the region,
n(O'(e_x)) is the number of instances in instance set O'(e_x), and
n(e_x) is the number of all instances of type e_x in the region.
4. The method of claim 1,
wherein calculating the popularity of each candidate co-location pattern according to the average influence, and determining the popular co-location patterns among the candidate co-location patterns according to a preset popularity threshold comprises:
calculating, according to
Figure FDA0002729957020000032
the popularity of a given candidate pattern,
wherein PI_CP is the popularity of the given candidate pattern CP, with value range (0, 1],
TIns_netCP is the instance table formed by the clique instances of the candidate pattern CP, obtained by joining, through clique instances, the non-duplicate instance pairs in the second-order instance proximity relation table that involve the types in CP,
min{ } finds the minimum value of the input set,
Figure FDA0002729957020000033
computes the projection of type e_x onto the instance table TIns_netCP,
Figure FDA0002729957020000034
is the average influence, with
Figure FDA0002729957020000035
as the influencing instance set and
Figure FDA0002729957020000036
as the influenced instance set; and
when the popularity calculated for a candidate co-location pattern is greater than the set popularity threshold, determining that the candidate pattern is a popular co-location pattern.
5. A co-location pattern discovery apparatus considering urban road network constraints, the apparatus comprising:
a second-order instance proximity relation table building module, configured to build a second-order instance proximity relation table for a target area under map projection, wherein the second-order instance proximity relation table contains, for every instance in the target area, the set of instance pairs it forms with instances of a different type whose reachable distance from it is within a preset distance attenuation threshold, together with the reachable distance value of each pair;
a network kernel density calculation module, configured to calculate, according to the preset distance attenuation threshold and the second-order instance proximity relation table, the network kernel density value of each instance under the influence of the instance sets of types different from that of the instance;
an average influence calculation module, configured to calculate, according to the network kernel density values, the average influence of each instance set on the instance sets of other types;
a popular co-location pattern acquisition module, configured to calculate the popularity of each candidate co-location pattern according to the average influence, and to determine the popular co-location patterns among the candidate co-location patterns according to a preset popularity threshold;
wherein the network kernel density calculation module is specifically configured to calculate, according to
Figure FDA0002729957020000041
the network kernel density value of instance o_i of type e_x under the influence of instance set O'(e_y) of type e_y,
wherein
Figure FDA0002729957020000042
is a subset of the set of all instances of type e_y in the region,
n_max is the maximum number of instances of any single type in the region,
K denotes the spatial weighting function, o_i and o_j are two neighboring instances,
Rdis(o_i, o_j) is the reachable distance value between the two neighboring instances,
h_t denotes the density attenuation threshold based on path time,
n(O'(e_y)->o_i) is the number of reachable instance pairs from instances in O'(e_y) to o_i,
the value range of the result of the formula is (0, 1], and it describes the magnitude of the influence of instance set O'(e_y) on instance o_i under network constraints.
6. The apparatus of claim 5, wherein the second-order instance proximity relation table building module is specifically configured to store the neighboring instance pairs and their corresponding reachable distance values in a two-dimensional hash table TIns_net2, each cell of which is expressed as a set of triplets of the following form:
TIns_net2(e_x, e_y) = {<o_i, o_j, Rdis(o_i, o_j)>, …},
wherein (e_x, e_y) are two spatial object types, o_i and o_j are two neighboring instances, and Rdis(o_i, o_j) is the reachable distance value between the two neighboring instances;
and to calculate the reachable distance value between two neighboring instances according to
Rdis(o_i, o_j) = g(y) * (o_i - o_j)_{net(t)} | h_t
wherein Rdis(o_i, o_j) is the reachable distance value between the two neighboring instances,
g(y) is a binary function whose value is 0 if the direction of travel along the road from instance o_i to instance o_j is opposite to the permitted traffic direction of the road, and 1 otherwise,
(o_i - o_j)_{net(t)} is the shortest path time from instance o_i to instance o_j,
h_t is the density attenuation threshold based on path time; it is a constraint on solving the shortest path time, indicating that the reachable distance between instances must satisfy the threshold h_t, otherwise instance o_j is not reachable from instance o_i.
7. The apparatus of claim 5,
wherein the average influence calculation module is specifically configured to calculate, according to
Figure FDA0002729957020000051
the average influence of instance set O'(e_y) on instance set O'(e_x),
wherein
Figure FDA0002729957020000052
is the network kernel density value of instance o_i of type e_x under the influence of instance set O'(e_y) of type e_y,
Figure FDA0002729957020000053
is a subset of the set of all instances of type e_x in the region,
n(O'(e_x)) is the number of instances in instance set O'(e_x), and
n(e_x) is the number of all instances of type e_x in the region.
8. The apparatus of claim 5, wherein the popular co-location pattern acquisition module is specifically configured to calculate, according to
Figure FDA0002729957020000054
the popularity of a given candidate pattern,
wherein PI_CP is the popularity of the given candidate pattern CP, with value range (0, 1],
TIns_netCP is the instance table formed by the clique instances of the candidate pattern CP, obtained by joining, through clique instances, the non-duplicate instance pairs in the second-order instance proximity relation table that involve the types in CP,
min{ } finds the minimum value of the input set,
Figure FDA0002729957020000055
computes the projection of type e_x onto the instance table TIns_netCP,
Figure FDA0002729957020000061
is the average influence, with
Figure FDA0002729957020000062
as the influencing instance set and
Figure FDA0002729957020000063
as the influenced instance set;
and to determine that a candidate co-location pattern is a popular co-location pattern when the popularity calculated for it is greater than the set popularity threshold.
CN201710023460.XA 2017-01-13 2017-01-13 Co-location mode discovery method and device considering urban road network constraints Active CN106780262B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710023460.XA CN106780262B (en) 2017-01-13 2017-01-13 Co-location mode discovery method and device considering urban road network constraints

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710023460.XA CN106780262B (en) 2017-01-13 2017-01-13 Co-location mode discovery method and device considering urban road network constraints

Publications (2)

Publication Number Publication Date
CN106780262A CN106780262A (en) 2017-05-31
CN106780262B true CN106780262B (en) 2020-12-25

Family

ID=58948408

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710023460.XA Active CN106780262B (en) 2017-01-13 2017-01-13 Co-location mode discovery method and device considering urban road network constraints

Country Status (1)

Country Link
CN (1) CN106780262B (en)

Also Published As

Publication number Publication date
CN106780262A (en) 2017-05-31

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20200825

Address after: 100080, No. 19 West Fourth Ring Road, Beijing, Haidian District

Applicant after: Research Institute of aerospace information innovation, Chinese Academy of Sciences

Address before: 100101 Beijing Chaoyang District Andingmen Datun Road No. 20 North

Applicant before: Institute of Remote Sensing and Digital Earth, Chinese Academy of Sciences

GR01 Patent grant