CN106503714B - Method for identifying city functional area based on point of interest data - Google Patents


Publication number
CN106503714B
Authority
CN
China
Prior art keywords
base station
interest point
category
grid
interest
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610887062.8A
Other languages
Chinese (zh)
Other versions
CN106503714A (en
Inventor
蒋云良
董墨萱
刘勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huzhou University
Original Assignee
Huzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huzhou University filed Critical Huzhou University
Priority to CN201610887062.8A priority Critical patent/CN106503714B/en
Publication of CN106503714A publication Critical patent/CN106503714A/en
Application granted granted Critical
Publication of CN106503714B publication Critical patent/CN106503714B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/46Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462Salient features, e.g. scale invariant feature transforms [SIFT]
    • G06V10/464Salient features, e.g. scale invariant feature transforms [SIFT] using a plurality of salient features, e.g. bag-of-words [BoW] representations

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

The invention provides a method for identifying urban functional areas based on point of interest data, comprising the following steps: step one, map segmentation: rasterizing the map; step two, finding the base station to which each interest point belongs: finding the base station closest to the interest point; step three, calculating the interest point distribution characteristics of each base station; step four, clustering: performing fuzzy clustering analysis on the matrix obtained in step three to obtain different clustering results; step five, identifying the urban functional areas: calculating the overlap rate, on the map, between the distribution of interest points with characteristic categories and the different clustering results obtained in step four, and labeling the clustered base stations accordingly. The method can identify the function of an urban area whether it is a tourist area, a working area, or a residential area; the results are basically consistent with reality, and the overall identification effect is improved.

Description

Method for identifying city functional area based on point of interest data
Technical Field
The invention relates to the field of big data analysis, in particular to a method for identifying a city functional area based on interest point data.
Background
With rapid economic development, a series of urban problems have followed, and they are particularly serious in some provincial capitals and metropolises. As a by-product of urbanization in developing countries, "urban diseases" manifest as traffic congestion, housing shortages, water shortages, energy shortages, environmental deterioration, employment difficulties, and so on. They burden cities, can even restrict urban development, and are likely to harm the physical and mental health of residents.
In recent years, experts and scholars have used various heterogeneous big data for urban computing in order to solve the problems caused by urbanization. Urban computing is an interdisciplinary field, new within computer science, that combines computer science with urban planning, transportation, energy, environment, sociology, economics and other disciplines against the background of the city. More specifically, urban computing addresses the challenges facing cities (e.g., environmental degradation, traffic congestion, increased energy consumption, lagging planning) by continuously acquiring, integrating, and analyzing a variety of heterogeneous big data in cities. City planning is one of its main applications. A prerequisite for planning a city is understanding the city and the distribution of its functional areas. An urban functional area is an area in which the land use function, use intensity, land use direction and benchmark land price are substantially consistent, and in which the degree of intensive use and the use potential are basically the same, such as a cultural and educational area, a business area, or a residential area.
At present, scholars in China and abroad mainly use mobile phone data, floating car data, POI data and the like to study urban functional areas. Among these, POI data is widely used in discovering urban functional areas. POI stands for point of interest. In a GIS, one POI record may be a residential community, a store, a bus stop, and so on. A POI record contains parameters such as name, longitude and latitude, detailed address, POI category, and contact telephone number. In recent years, research on discovering urban functional areas with POI data has mainly included the following: Yuan et al. proposed a DPoF framework (i.e., partitions Regions of Difference Functions) built from taxi GPS trajectory data and regional POI data; Du Run et al., when handling stay points of irregularly switching mobile phones, used the topic category with the largest number of POIs as the topic of a cell in order to merge adjacent cells; another study used public transit IC card swipe data and POI data to construct an urban functional area identification model (DZoF).
The location information of mobile phone base stations is often combined with Voronoi (Thiessen) polygons to divide a city into basic units. Research that divides the study area by mobile phone base stations mainly includes the following: Jameson L. Toole et al., when using dynamic data generated by mobile phone users to identify the relationship between land use and population dynamics, divided the map into regions using base station locations; Víctor Soto and Enrique Frías-Martínez likewise divided the map by base station locations when proposing a technique that automatically identifies and classifies land use from information generated by the mobile phone base station network.
In addition, POI data covers a comprehensive range of types, touches every aspect of urban life, and is easy to crawl, whereas some other data are often difficult to obtain. Currently, the mobile phone base stations of the three operators cover essentially all of China. Moreover, to better serve the public, operators site their base stations according to population density and urban planning. That is, in densely populated areas with many high-rise buildings the base stations are arranged relatively densely, while in relatively open areas the number of base stations is correspondingly smaller.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a method for identifying urban functional areas based on interest point data, which can identify the functions of all areas of a city using interest point data. To this end, the invention provides the following technical solution:
A method for identifying a city functional area based on point of interest data comprises the following steps:
step one, map segmentation: rasterizing the map, and numbering all grids; dividing the map according to the position of the mobile phone base station, calculating the distance between each grid and the base station, and specifying that the grid belongs to the base station closest to the grid, so as to obtain a grid number list closest to each base station and a grid number matrix G occupied by each base station;
step two, searching a base station to which the interest point belongs: finding a base station closest to the interest point, and judging that the interest point belongs to the base station to obtain all interest point lists belonging to each base station;
step three, calculating the distribution characteristics of the interest points of each base station: classifying and counting the interest points of each base station according to the parameter of the interest point category in the data of the interest point list of each base station, namely respectively counting the number of the interest points of different categories of each base station to obtain an interest point category distribution matrix D of each base station; combining the interest point category distribution matrix D with the occupied grid number matrix G, and processing the interest point category distribution matrix D by adopting a normalization method to obtain a matrix finally used for analysis, wherein the matrix is named as Y;
step four, clustering: carrying out fuzzy clustering analysis on the matrix Y in the third step to obtain different clustering results;
step five, identifying the urban functional area: calculating the overlap rate, on the map, between the distribution of interest points with characteristic categories and the different clustering results obtained in step four, and labeling the clustered base stations accordingly.
On the basis of the technical scheme, the invention can also adopt the following further technical scheme:
in the third step, the normalization processing method is as follows:
respectively carrying out normalization processing on the interest point category distribution matrix D and the occupied grid number matrix G by using a formula (1), normalizing the two matrixes to be in an interval of [0,1], and combining the normalization results of the two matrixes by using a formula (2),
$$x_i' = \frac{x_i - x_{\min}}{x_{\max} - x_{\min}} \quad (1)$$
$$Y = A \cdot e^{-X} \quad (2)$$
in formula (1), $\{x_i\}$ is the sample set, $x_i$ is a sample component of the sample set, $x_{\max}$ is the maximum value of each component over all samples in the sample set, and $x_{\min}$ is the minimum value of each component over all samples in the sample set;
in formula (2), Y is the matrix ultimately used for analysis, with dimensions n × m; a is a matrix obtained by normalizing an interest point category distribution matrix D according to a formula (1), and the dimension is n multiplied by m; x is a matrix obtained by normalizing the occupied grid number matrix G according to the formula (1), and the dimension is 1 multiplied by n; n is the number of base stations and m is the number of interest point categories.
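The normalization and combination in formulas (1) and (2) can be sketched in Python with NumPy. This is only an illustration under stated assumptions: the function names `min_max` and `build_feature_matrix` are hypothetical, the min-max scaling is assumed to be applied per column ("each component"), and formula (2) is assumed to scale row i of A by e raised to minus the normalized grid count of base station i. The toy values of D and G are invented for the example.

```python
import numpy as np

def min_max(x):
    """Formula (1), assumed per column: scale each component into [0, 1]."""
    x = np.asarray(x, dtype=float)
    lo, hi = x.min(axis=0), x.max(axis=0)
    # A constant column has zero range; use 1.0 there so it maps to 0.
    span = np.where(hi > lo, hi - lo, 1.0)
    return (x - lo) / span

def build_feature_matrix(D, G):
    """Formula (2), assumed row-wise: Y[i, :] = A[i, :] * exp(-X[i])."""
    A = min_max(D)                                        # n x m category counts
    X = min_max(np.asarray(G, dtype=float).reshape(-1))   # n grid counts
    return A * np.exp(-X)[:, None]

# Toy data (hypothetical): 3 base stations, 2 interest point categories.
D = np.array([[0, 10], [5, 5], [10, 0]])
G = np.array([100, 200, 300])
Y = build_feature_matrix(D, G)
```

Under this reading, a base station that covers many grids has its category counts damped, so a large, sparsely populated coverage area does not dominate the clustering.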
In step four, the fuzzy clustering analysis uses the fuzzy C-means clustering algorithm to divide all vectors into C clusters and obtain the center of each cluster such that the sum of within-cluster variances is minimized;
clustering with the fuzzy C-means algorithm yields, for each base station i, a list of probabilities of belonging to the different clusters; the cluster with the largest probability is taken as the class of base station i, and the resulting list of base station classes is the clustering result.
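A compact sketch of fuzzy C-means followed by the maximum-membership assignment described above is given here. It is not the patent's implementation: the fuzzifier `m = 2`, the stopping tolerance, and the deterministic choice of initial centers (evenly spaced rows of Y) are assumptions made so the example is reproducible, and the toy matrix Y is invented.

```python
import numpy as np

def fuzzy_c_means(Y, c, m=2.0, n_iter=100, tol=1e-6):
    """Return (centers, U); U[i, k] is the membership of row i in cluster k."""
    Y = np.asarray(Y, dtype=float)
    # Assumed initialization: evenly spaced rows of Y serve as starting centers.
    centers = Y[np.linspace(0, len(Y) - 1, c).astype(int)].copy()
    for _ in range(n_iter):
        # Distance of every row to every center; epsilon avoids division by zero.
        dist = np.linalg.norm(Y[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        inv = dist ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)   # memberships sum to 1 per row
        W = U ** m
        new_centers = (W.T @ Y) / W.sum(axis=0)[:, None]
        shift = np.abs(new_centers - centers).max()
        centers = new_centers
        if shift < tol:
            break
    return centers, U

# Toy feature matrix (hypothetical): two well-separated groups of base stations.
Y = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [1.0, 1.0], [0.9, 1.0], [1.0, 0.9]])
centers, U = fuzzy_c_means(Y, c=2)
labels = U.argmax(axis=1)   # class of base station i = cluster with largest membership
```

The `argmax` step corresponds to taking, for each base station, the class with the maximum probability; the membership matrix `U` itself is kept in case borderline stations need inspection.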
In step five, the overlap rate on the map between the distribution of interest points with category "s" and the base stations with cluster category "n" is calculated. Input: the list of grids containing the interest points whose category is "s", and the list of grid numbers covered by the base stations whose cluster category is "n". Output: the overlap rate.
In the fifth step, a specific method for calculating the overlapping rate of the distribution of the interest point with the interest point category of s and the base station with the cluster category of n on the map is as follows:
Step1, finding the grid number of each interest point whose category is s according to its longitude and latitude;
Step2, enlarging the area according to the characteristics of s, that is, taking each grid number obtained in Step1 as the center and expanding east, south, west and north to a square area, to obtain all grid numbers in the enlarged area;
Step3, collecting all distinct grid numbers obtained in Step2 and recording the set as S;
Step4, finding the grid numbers covered by cluster category n from the numbers of the base stations with cluster category n and the grid number list covered by each base station, and recording the set as N;
Step5, calculating the grid overlap rate (overlapRatio), that is, the overlap rate between the grid number set S of interest point category s and the grid number set N covered by cluster category n, according to formula (3);
[Formula (3): overlapRatio as a function of the sets S and N; image not reproduced in this text]
In step one, the map is divided by mobile phone base station location using the method of finding the base station closest to each grid.
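The nearest-base-station division of step one can be illustrated as follows. The sketch assumes that each grid is represented by its center, that grids are numbered row by row, and that a plain squared difference in degrees is an acceptable distance for ranking stations inside one city; none of these details is fixed by the text. The helper names and the two station coordinates are hypothetical.

```python
import numpy as np

def grid_centers(lon_min, lat_min, n_cols, n_rows, cell=0.0001):
    """(lon, lat) of every grid center; grid number = row * n_cols + col."""
    cols, rows = np.meshgrid(np.arange(n_cols), np.arange(n_rows))
    lon = lon_min + (cols.ravel() + 0.5) * cell
    lat = lat_min + (rows.ravel() + 0.5) * cell
    return np.column_stack([lon, lat])

def nearest_station(points, stations):
    """Index of the closest base station for every (lon, lat) point."""
    points = np.asarray(points, dtype=float)
    stations = np.asarray(stations, dtype=float)
    # Squared planar distance in degrees (an assumption of this sketch).
    d2 = ((points[:, None, :] - stations[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

# Toy map (hypothetical): 2 rows x 5 columns of grids and two base stations.
stations = np.array([[120.0401, 30.0901], [120.0405, 30.0901]])
centers = grid_centers(120.040, 30.090, n_cols=5, n_rows=2)
owner = nearest_station(centers, stations)          # base station of each grid
G = np.bincount(owner, minlength=len(stations))     # grids occupied per station
```

The same `nearest_station` function also covers step two if interest point coordinates are passed instead of grid centers. The full distance matrix is only practical for small inputs; for a whole city a spatial index such as SciPy's `cKDTree` would likely be a better fit.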
By adopting the above technical solution, the invention has the following beneficial effects: the method for identifying urban functional areas from interest point data can identify the function of an urban area whether it is a tourist area, a working area, or a residential area; the results are basically consistent with reality, and the overall identification effect is improved.
Description of the figures
Fig. 1 shows the study area in Hangzhou used by the invention.
Fig. 2 shows the base station division result for Fig. 1.
Fig. 3 shows the clustering result with clustering parameter C = 4.
Fig. 4 is the overall urban plan of Hangzhou for 2001-2020.
Fig. 5 is the projection of the "residential area" clustering result onto the overall urban plan of Hangzhou.
Fig. 6 is the projection of the "tourist area" clustering result onto a Baidu map.
Fig. 7 is a heat map of the distribution of interest points whose "interest point category" is "work".
Detailed Description
As shown in the figure, a method for identifying a city functional area based on point of interest data includes the following steps:
step one, map segmentation: rasterizing the map, and numbering all grids; dividing the map according to the position of the mobile phone base station, calculating the distance between each grid and the base station, and specifying that the grid belongs to the base station closest to the grid, so as to obtain a grid number list closest to each base station and a grid number matrix G occupied by each base station; and partitioning the map by using the position of the mobile phone base station by adopting a method of searching the base station closest to the grid.
Step two, searching a base station to which the interest point belongs: finding a base station closest to the interest point, and judging that the interest point belongs to the base station to obtain all interest point lists belonging to each base station;
step three, calculating the distribution characteristics of the interest points of each base station: classifying and counting the interest points of each base station according to the parameter of the interest point category in the data of the interest point list of each base station, namely respectively counting the number of the interest points of different categories of each base station to obtain an interest point category distribution matrix D of each base station; combining the interest point category distribution matrix D with the occupied grid number matrix G, and processing the interest point category distribution matrix D by adopting a normalization method to obtain a matrix finally used for analysis, wherein the matrix is named as Y;
the normalization processing method comprises the following steps:
respectively carrying out normalization processing on the interest point category distribution matrix D and the occupied grid number matrix G by using a formula (1), normalizing the two matrixes to be in an interval of [0,1], and combining the normalization results of the two matrixes by using a formula (2),
$$x_i' = \frac{x_i - x_{\min}}{x_{\max} - x_{\min}} \quad (1)$$
$$Y = A \cdot e^{-X} \quad (2)$$
in formula (1), $\{x_i\}$ is the sample set, $x_i$ is a sample component of the sample set, $x_{\max}$ is the maximum value of each component over all samples in the sample set, and $x_{\min}$ is the minimum value of each component over all samples in the sample set;
in formula (2), Y is the matrix ultimately used for analysis, with dimensions n × m; a is a matrix obtained by normalizing an interest point category distribution matrix D according to a formula (1), and the dimension is n multiplied by m; x is a matrix obtained by normalizing the occupied grid number matrix G according to the formula (1), and the dimension is 1 multiplied by n; n is the number of base stations and m is the number of interest point categories.
Step four, clustering: performing fuzzy clustering analysis on the matrix Y from step three to obtain different clustering results; the fuzzy clustering analysis uses the fuzzy C-means clustering algorithm to divide all vectors into C clusters and obtain the center of each cluster such that the sum of within-cluster variances is minimized;
clustering with the fuzzy C-means algorithm yields, for each base station i, a list of probabilities of belonging to the different clusters; the cluster with the largest probability is taken as the class of base station i, and the resulting list of base station classes is the clustering result.
Step five, identifying the urban functional area: calculating the overlap rate, on the map, between the distribution of interest points with characteristic categories and the different clustering results obtained in step four, and labeling the clustered base stations accordingly.
In step five, the overlap rate on the map between the distribution of interest points with category "s" and the base stations with cluster category "n" is calculated. Input: the list of grids containing the interest points whose category is "s", and the list of grid numbers covered by the base stations whose cluster category is "n". Output: the overlap rate.
The specific method for calculating the overlapping rate of the distribution of the interest points with the interest point category of s and the base station with the cluster category of n on the map is as follows:
Step1, finding the grid number of each interest point whose category is s according to its longitude and latitude;
Step2, enlarging the area according to the characteristics of s, that is, taking each grid number obtained in Step1 as the center and expanding east, south, west and north to a square area, to obtain all grid numbers in the enlarged area;
Step3, collecting all distinct grid numbers obtained in Step2 and recording the set as S;
Step4, finding the grid numbers covered by cluster category n from the numbers of the base stations with cluster category n and the grid number list covered by each base station, and recording the set as N;
Step5, calculating the grid overlap rate (overlapRatio), that is, the overlap rate between the grid number set S of interest point category s and the grid number set N covered by cluster category n, according to formula (3).
Here, the interest point category "s" is, for example, "residence" or "work", and a base station with cluster category "n" is any base station labeled "n" in the clustering result. The magnification in Step2 is determined by the characteristics of "s". For example, an interest point of the "residence" category is generally a residential building, whose footprint is generally about 30 m × 30 m = 900 m². If one grid measures 9.6 m × 11.1 m, an interest point of the "residence" category should be enlarged to nine grids, that is, a 3 × 3 square area centered on the grid where the interest point is located.
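Steps 1 to 5 can be sketched with Python sets. One caveat matters here: formula (3) is not legible in this text, so the ratio below, the share of the cluster's grids N that also lie in S, is an assumption chosen purely for illustration and may differ from the patent's definition. Grids are addressed as hypothetical (row, col) pairs, the helper names are invented, and clipping at the map border is omitted.

```python
def enlarge(cells, k):
    """Grow each (row, col) grid into a k x k square centered on it (k odd)."""
    r = k // 2
    return {(row + dr, col + dc)
            for row, col in cells
            for dr in range(-r, r + 1)
            for dc in range(-r, r + 1)}

def overlap_ratio(S, N):
    """ASSUMED stand-in for formula (3): |S intersect N| / |N|."""
    return len(S & N) / len(N) if N else 0.0

# Step1: grids holding interest points of category s (toy values).
poi_cells = {(5, 5), (5, 6)}
# Steps 2-3: 3 x 3 enlargement; the set removes repeated grids automatically.
S = enlarge(poi_cells, k=3)
# Step4: grids covered by base stations of cluster n (toy rectangle).
N = {(row, col) for row in range(4, 8) for col in range(0, 10)}
# Step5: overlap rate under the assumed definition.
ratio = overlap_ratio(S, N)
```

In this toy case the two enlarged squares merge into 12 distinct grids, all inside the 40 grids of N, so the assumed ratio is 12 / 40 = 0.3. A "tourism" interest point would use `k=9` instead of `k=3`.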
The functional area identification method provided by the invention is verified by taking the range of a single mobile phone base station as a unit area and using the point of interest data of a certain area in Hangzhou city.
Step one: map partitioning
A rectangular area of Hangzhou, Zhejiang, spanning longitude 120.040° to 120.410° and latitude 30.090° to 30.400°, as shown in Fig. 1, is selected as the study area. The area is divided into grids of 0.0001° × 0.0001° (about 9.6 m × 11.1 m), and urban unit areas are delimited with the grid attribution method using the longitude and latitude of an operator's mobile phone base stations in Hangzhou. The division result is shown in Fig. 2.
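With the stated bounds and cell size, the study area works out to 3,700 columns by 3,100 rows, or 11,470,000 grids. The following sketch shows that arithmetic and one possible numbering scheme; the row-major numbering starting from the south-west corner and the function name `grid_number` are assumptions, since the text only says that all grids are numbered.

```python
import math

LON_MIN, LON_MAX = 120.040, 120.410
LAT_MIN, LAT_MAX = 30.090, 30.400
CELL = 0.0001   # grid size in degrees

# round() absorbs floating-point error in the division.
N_COLS = round((LON_MAX - LON_MIN) / CELL)   # 3700
N_ROWS = round((LAT_MAX - LAT_MIN) / CELL)   # 3100

def grid_number(lon, lat):
    """Row-major grid number of a point, or None if it lies outside the area."""
    col = math.floor((lon - LON_MIN) / CELL)
    row = math.floor((lat - LAT_MIN) / CELL)
    if not (0 <= col < N_COLS and 0 <= row < N_ROWS):
        return None
    return row * N_COLS + col

first = grid_number(120.04005, 30.09005)      # center of the south-west grid
diagonal = grid_number(120.04015, 30.09015)   # one column east, one row north
outside = grid_number(119.0, 30.2)            # west of the study area
```

Points that fall exactly on a grid edge are sensitive to floating-point rounding, so a production version would likely work with integer coordinates (for example degrees multiplied by 10,000).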
As described above, after reading this disclosure, those skilled in the art can make various other modifications according to its technical solutions and concepts without creative effort, and all such modifications fall within the protection scope of the invention.
Step two: searching for the base station to which each interest point belongs
Baidu interest point data is widely used in China, and its distribution in urban space is basically consistent with actual conditions, which ensures the accuracy and reliability of the data. The Baidu interest point data within the study range was therefore extracted for this research. The data contains more than 110,000 interest point records within the study range, each with parameters such as name, longitude and latitude, detailed address, interest point category, and contact telephone number. In this study the interest point data is processed according to the "interest point category" parameter and divided into 16 categories: shopping, work, residence, tourism, higher education (colleges and universities), kindergartens and primary schools, middle schools, medical care, culture and entertainment, life services, financial services, automobile services, stations, parking lots, food, and hotels.
Step three: calculating the distribution characteristics of the points of interest of each base station
The base station number is denoted by i and the interest point category by j, where i = 1, 2, 3, … and j = 1, 2, 3, …, 16. The result is the number of interest points of category j belonging to base station i, listed in Table 1. Finally, the result of Table 1 is combined with the number of grids occupied by each base station and processed with the normalization method to obtain the matrix Y used for the subsequent analysis.
[Table 1: number of interest points of category j belonging to base station i; table image not reproduced in this text]
TABLE 1
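A count table of the kind listed in Table 1 can be built directly from two parallel arrays, one giving the base station of each interest point and one giving its category. The sketch below uses 0-based indices and invented toy data; the function name `category_counts` is hypothetical.

```python
import numpy as np

def category_counts(station_of_poi, category_of_poi, n_stations, n_categories):
    """D[i, j] = number of interest points of category j at base station i."""
    D = np.zeros((n_stations, n_categories), dtype=int)
    # np.add.at accumulates correctly when the same (i, j) pair repeats.
    np.add.at(D, (station_of_poi, category_of_poi), 1)
    return D

# Toy data (hypothetical): six interest points, three stations, three categories.
station_of_poi = np.array([0, 0, 1, 2, 2, 2])    # result of step two
category_of_poi = np.array([1, 1, 0, 2, 2, 0])   # 0-based category index
D = category_counts(station_of_poi, category_of_poi, n_stations=3, n_categories=3)
```

In the embodiment the matrix would have one row per base station and 16 columns, one per interest point category.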
Step four: clustering. Base station cluster analysis is performed on the result matrix Y according to the clustering method provided by the invention. The parameter C is set to 4, that is, the study area is divided into 4 different functional areas, and the analysis result is visualized in Fig. 3.
Step five: identifying urban functional areas
Three characteristic values of the "interest point category" parameter, "residence", "work" and "tourism", are selected to identify the functions of the base stations. The overlap rates of the clustering result are calculated with the overlap rate method described above, and the results are shown in Table 2. When enlarging the interest point areas, actual conditions are taken into account: interest points of the "residence" and "work" categories are enlarged to 30 m × 30 m, that is, a 3 × 3 square of grids centered on the grid containing each interest point, and interest points of the "tourism" category are enlarged to 90 m × 90 m, that is, a 9 × 9 square of grids centered on the grid containing each interest point.
Overlap rate (%)   Color 4   Color 3   Color 1   Color 2
Work               1.37      1.69      1.86      0.49
Residence          0.30      0.93      4.65      0.08
Tourism            0.14      0.17      0.36      0.88
TABLE 2
From the overlap rates in Table 2, it can first be determined that the color 1 region in Fig. 3 functions as a "residential area" and the color 2 region as a "tourist area", because the overlap rates of "residence" with color 1 and of "tourism" with color 2 are far higher than with the other colors. Second, the highest overlap rate for the "work" category also falls in the color 1 region, but because the overlap rate of "residence" with color 1 is far higher than with any other color, the color 1 region is clearly a "residential area" rather than a "work area". In addition, the overlap rates of "work" with the color 4 and color 3 regions (1.37% and 1.69%) are close to its rate with color 1 (1.86%), so one of these two regions must carry the "work" function. In practice, residential areas and work areas are closely linked and are often geographically adjacent. In Fig. 3 the color 3 region mostly borders the color 1 region, while the color 4 region mostly borders the color 3 and color 2 regions, so the color 3 region should be the "work area". Finally, area a in Fig. 3 is the well-known West Lake scenic area of Hangzhou, which includes West Lake, Longjing, Lingyin and other sites. The terrain there is mostly mountainous, so apart from "tourism" interest points, few interest points of other categories are found, and the ranges of some base stations contain essentially no interest points at all. Within this area, apart from color 2, color 4 covers slightly more than the remaining colors; since the color 2 region has already been identified as the "tourist area", the color 4 region is a "rarely visited area" with few interest points.
From the above analysis, the identification results for the city regions in Fig. 3 are as follows: the color 1 region is the "residential area"; the color 2 region is the "tourist area"; the color 3 region is the "work area"; the color 4 region is the "rarely visited area".
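The first part of this reasoning, picking the cluster with the highest overlap rate for each category and checking how clearly it leads, can be reproduced from the Table 2 values. The sketch below is only a reading aid: the `dominance` helper and the idea of comparing the top value with the runner-up are illustrative choices, not part of the patented method, and the adjacency argument that settles the "work area" is not modeled.

```python
# Overlap rates (%) copied from Table 2.
overlap = {
    "work":      {"color 4": 1.37, "color 3": 1.69, "color 1": 1.86, "color 2": 0.49},
    "residence": {"color 4": 0.30, "color 3": 0.93, "color 1": 4.65, "color 2": 0.08},
    "tourism":   {"color 4": 0.14, "color 3": 0.17, "color 1": 0.36, "color 2": 0.88},
}

def dominance(row):
    """Best cluster for a category and its lead over the runner-up (as a ratio)."""
    ranked = sorted(row.items(), key=lambda kv: kv[1], reverse=True)
    (best, top), (_, second) = ranked[0], ranked[1]
    return best, top / second

summary = {category: dominance(row) for category, row in overlap.items()}
```

"Residence" leads in color 1 by a factor of 5 and "tourism" leads in color 2 by a factor of about 2.4, whereas "work" leads in color 1 by only about 1.1, which is why the text falls back on spatial adjacency to assign the "work area".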
By applying the method of the invention, the goodness of fit of each functional area in the embodiment is as follows:
(1) Goodness of fit of the "residential area"
Fig. 4 is the overall urban plan of Hangzhou for 2001-2020. Fig. 5 projects the residential areas identified by the method of the invention onto that plan at the same longitude and latitude. The black parts of the figure, that is, the areas identified as residential by the method, basically coincide with the residential land in the overall plan. The identification result for the residential area is therefore basically consistent with reality.
(2) Goodness of fit of the "tourist area"
As shown in Fig. 6, the projection of the identified "tourist area" onto the Baidu map also basically matches reality. The experiment accurately identifies the function of the base stations covering scenic spots such as the "good tea culture village", "Xixi Wetland", "West Lake scenic area" and "Xianghu Lake".
(3) Goodness of fit of the "work area"
The "work area" identified by the method of the invention basically coincides with the distribution of "land for public administration and public service facilities", "land for commercial service facilities" and "industrial land" in the overall urban plan of Hangzhou for 2001-2020 (Fig. 4), all of which can be regarded as "work area". Comparing the heat map of "work" interest points in Fig. 7 with the color 3 region of Fig. 3 likewise shows that the identified "work area" basically matches reality.
Taking the goodness-of-fit analyses in (1), (2) and (3) together, the urban area functions identified by the method for identifying urban functional areas from point of interest data provided by the invention basically match reality.

Claims (6)

1. A method for identifying city functional areas based on interest point data, characterized by comprising the following steps:
step one, map segmentation: rasterizing the map and numbering all grids; dividing the map according to the positions of the mobile phone base stations by calculating the distance between each grid and each base station and assigning each grid to the base station closest to it, so as to obtain, for each base station, the list of grid numbers closest to it and the occupied grid number matrix G;
step two, searching for the base station to which each interest point belongs: finding the base station closest to the interest point and assigning the interest point to that base station, so as to obtain the list of all interest points belonging to each base station;
step three, calculating the interest point distribution characteristics of each base station: classifying and counting the interest points of each base station according to the interest point category parameter in the data of each base station's interest point list, to obtain the interest point category distribution matrix D of the base stations; combining the interest point category distribution matrix D with the occupied grid number matrix G and applying a normalization method to obtain the matrix Y finally used for analysis;
step four, clustering: carrying out fuzzy clustering analysis on the matrix Y obtained in step three to obtain different clustering results;
step five, identifying the urban functional areas: calculating, on the map, the distribution overlap rate between the interest points having category characteristics and the different clustering results obtained in step four, and thereby identifying the function of the clustered base stations.
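Steps one and two both reduce to a nearest-base-station search, which can be sketched as follows. This is only an illustrative sketch: the function names, the toy coordinates, and the use of planar Euclidean distance between grid centers and base stations are assumptions, since the claim does not state how distance is measured on longitude and latitude.

```python
import math

def nearest_station(point, stations):
    """Return the index of the station closest to `point` (planar distance, assumed)."""
    return min(range(len(stations)), key=lambda i: math.dist(point, stations[i]))

def assign_grids(grid_centers, stations):
    """Step one: map each station index to the list of grid numbers assigned to it."""
    owned = {i: [] for i in range(len(stations))}
    for grid_no, center in enumerate(grid_centers):
        owned[nearest_station(center, stations)].append(grid_no)
    return owned

def assign_pois(pois, stations):
    """Step two: map each station index to the interest points nearest to it."""
    owned = {i: [] for i in range(len(stations))}
    for x, y, category in pois:
        owned[nearest_station((x, y), stations)].append((x, y, category))
    return owned

# Hypothetical toy data: a 4 x 2 raster of unit cells and two base stations.
grid_centers = [(col + 0.5, row + 0.5) for row in range(2) for col in range(4)]
stations = [(1.0, 1.0), (3.0, 1.0)]
pois = [(0.2, 0.3, "residence"), (3.6, 1.7, "work"), (2.9, 0.1, "work")]

grids_by_station = assign_grids(grid_centers, stations)
pois_by_station = assign_pois(pois, stations)
# Occupied grid counts per base station, playing the role of matrix G.
G = [len(grids_by_station[i]) for i in range(len(stations))]
```

Counting the categories in each entry of `pois_by_station` would then give one row of the category distribution matrix D per base station.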
2. The method for identifying city functional areas based on interest point data as claimed in claim 1, wherein in step three the normalization is performed as follows:
the interest point category distribution matrix D and the occupied grid number matrix G are each normalized to the interval [0,1] using formula (1), and the two normalized results are combined using formula (2),
$$x_i' = \frac{x_i - x_{\min}}{x_{\max} - x_{\min}} \qquad (1)$$
$$Y = A \cdot e^{-x} \qquad (2)$$
in formula (1), $\{x_i\}$ is the sample set, $x_i$ is a sample component of the sample set, $x_i'$ is its normalized value, $x_{\max}$ is the maximum value of each component over all samples in the sample set, and $x_{\min}$ is the minimum value of each component over all samples in the sample set;
in formula (2), Y is the matrix finally used for analysis, with dimension n × m; A is the matrix obtained by normalizing the interest point category distribution matrix D according to formula (1), with dimension n × m; x is the matrix obtained by normalizing the occupied grid number matrix G according to formula (1), with dimension 1 × n; n is the number of base stations and m is the number of interest point categories.
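The normalization and combination in this claim can be illustrated with a short sketch. It rests on assumptions that the claim text does not settle: formula (1) is taken to be ordinary min-max scaling, D is scaled column by column (per interest point category), and formula (2) is read row-wise, so that row i of A is multiplied by exp(-x[i]). The function names and the toy values of D and G are hypothetical.

```python
import math

def min_max(values):
    """Min-max scaling of a list of numbers to [0, 1] (assumed form of formula (1))."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant input: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def min_max_columns(matrix):
    """Apply min_max to every column of a row-major matrix (assumed per-category scaling)."""
    scaled_cols = [min_max(list(col)) for col in zip(*matrix)]
    return [list(row) for row in zip(*scaled_cols)]

def build_feature_matrix(D, G):
    """Assumed row-wise reading of formula (2): Y[i][j] = A[i][j] * exp(-x[i])."""
    A = min_max_columns(D)   # n x m, normalized category counts
    x = min_max(G)           # length n, normalized grid counts
    return [[a * math.exp(-x[i]) for a in row] for i, row in enumerate(A)]

# Hypothetical input: 3 base stations, 2 interest point categories.
D = [[0, 10], [5, 5], [10, 0]]
G = [2, 4, 6]
Y = build_feature_matrix(D, G)
```

Under this reading, a base station that covers many grids has its category counts damped, which is one way to keep large, sparse cells from dominating the clustering.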
3. The method according to claim 1, wherein in step four the fuzzy clustering analysis uses the fuzzy C-means clustering algorithm to divide all vectors into C clusters and to find the cluster center of each cluster such that the sum of the within-cluster variances is minimized;
clustering with the fuzzy C-means algorithm yields, for each base station i, a list of probabilities of belonging to the different clusters; the cluster with the largest probability is then taken as the class of base station i, and the resulting list of the classes of all base stations is the clustering result.
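A compact sketch of this step is given below. It follows the textbook fuzzy C-means updates (membership-weighted centers, inverse-distance memberships) with a fuzzifier of 2; the fuzzifier, the random initialization, the stopping rule and all names are assumptions, as the claim does not specify them.

```python
import math
import random

def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Return (centers, U); U[i][k] is the membership of point i in cluster k."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, each row normalized to sum to 1.
    U = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        U.append([v / total for v in row])
    centers = []
    for _ in range(iters):
        # Center update: mean of the points weighted by membership ** m.
        centers = []
        for k in range(c):
            w = [U[i][k] ** m for i in range(n)]
            sw = sum(w)
            centers.append([sum(w[i] * points[i][d] for i in range(n)) / sw
                            for d in range(dim)])
        # Membership update: proportional to distance ** (-2 / (m - 1)).
        new_U = []
        for p in points:
            dist = [max(math.dist(p, ctr), 1e-12) for ctr in centers]
            inv = [d ** (-2.0 / (m - 1.0)) for d in dist]
            total = sum(inv)
            new_U.append([v / total for v in inv])
        shift = max(abs(new_U[i][k] - U[i][k]) for i in range(n) for k in range(c))
        U = new_U
        if shift < tol:
            break
    return centers, U

def hard_labels(U):
    """Assign each base station to the cluster with its largest membership."""
    return [max(range(len(row)), key=row.__getitem__) for row in U]

# Hypothetical rows of Y for six base stations with two obvious groups.
Y = [[0.0, 1.0], [0.1, 0.9], [0.05, 0.95], [1.0, 0.0], [0.9, 0.1], [0.95, 0.05]]
centers, U = fuzzy_c_means(Y, c=2)
labels = hard_labels(U)
```

The cluster indices themselves are arbitrary; only the grouping of base stations is meaningful, which is why step five is needed to attach a function to each cluster.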
4. The method as claimed in claim 1, wherein in step five the overlap rate, on the map, between the distribution of the interest points of category "s" and the base stations of cluster category "n" is calculated by obtaining the list of grids in which the interest points of category "s" are located and the list of grid numbers covered by the base stations of cluster category "n", and computing the overlap rate from these two lists.
5. The method as claimed in claim 4, wherein in step five the overlap rate, on the map, between the distribution of the interest points of category "s" and the base stations of cluster category "n" is calculated as follows:
Step1: finding the grid number of each interest point of category s from the longitude and latitude of the interest point;
Step2: enlarging the area according to the characteristics of the interest points of category s and obtaining all grid numbers in the enlarged area, the enlarged area being a square obtained by extending from the grid found in Step1, taken as the center, toward the east, south, west and north;
Step3: collecting all distinct grid numbers obtained in Step2 into a set, denoted S;
Step4: from the numbers of the base stations of cluster category n and the grid number list covered by each base station, finding the set of grid numbers covered by the base stations of cluster category n, denoted N;
Step5: calculating the grid overlap rate OverlapRatio according to formula (3), the grid overlap rate being the overlap rate between the grid number set S of interest point category s and the grid number set N covered by the base stations of cluster category n;
[Formula (3): equation image FDA0002193248220000031, not reproduced in the text]
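Step1 to Step5 can be sketched with plain set operations. Because formula (3) is only available as an image, the ratio used here, the share of S that also lies in N, is purely an assumption and may differ from the patented definition. Identifying grids by (row, column) pairs instead of grid numbers, the `radius` parameter and all names are likewise hypothetical.

```python
def expand_square(cell, radius, n_rows, n_cols):
    """Step2: all cells of the square of side 2*radius+1 centered on `cell`,
    clipped to the raster bounds."""
    row, col = cell
    return {(r, c)
            for r in range(max(0, row - radius), min(n_rows, row + radius + 1))
            for c in range(max(0, col - radius), min(n_cols, col + radius + 1))}

def overlap_ratio(poi_cells, radius, station_cells, cluster_of, n, n_rows, n_cols):
    """Overlap between interest points of one category and one cluster of stations."""
    # Step1 to Step3: distinct cells around every interest point of category s.
    S = set()
    for cell in poi_cells:
        S |= expand_square(cell, radius, n_rows, n_cols)
    # Step4: cells covered by the base stations whose cluster category is n.
    N = set()
    for station, cells in station_cells.items():
        if cluster_of[station] == n:
            N |= set(cells)
    # Step5 (ASSUMED definition, formula (3) is not available): |S & N| / |S|.
    return len(S & N) / len(S) if S else 0.0

# Hypothetical 4 x 4 raster: station "bs1" covers columns 0-1, "bs2" columns 2-3.
station_cells = {
    "bs1": [(r, c) for r in range(4) for c in (0, 1)],
    "bs2": [(r, c) for r in range(4) for c in (2, 3)],
}
cluster_of = {"bs1": 0, "bs2": 1}
ratio = overlap_ratio([(1, 1)], 1, station_cells, cluster_of, 0, 4, 4)
```

Computing this ratio for every pair of interest point category and cluster, and labeling each cluster with the category that overlaps it most, is one plausible way to carry out the identification in step five.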
6. The method as claimed in claim 1, wherein in step one the map is divided according to the positions of the mobile phone base stations by searching for the base station nearest to each grid.
CN201610887062.8A 2016-10-11 2016-10-11 Method for identifying city functional area based on point of interest data Active CN106503714B (en)

Publications (2)

Publication Number Publication Date
CN106503714A CN106503714A (en) 2017-03-15
CN106503714B true CN106503714B (en) 2020-01-03

Family

ID=58294741

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610887062.8A Active CN106503714B (en) 2016-10-11 2016-10-11 Method for identifying city functional area based on point of interest data

Country Status (1)

Country Link
CN (1) CN106503714B (en)

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106991142A (en) * 2017-03-22 2017-07-28 湖州师范学院 A kind of method that urban function region is recognized based on wechat data and interest point data
CN107220308B (en) * 2017-05-11 2021-07-20 百度在线网络技术(北京)有限公司 Method, device and equipment for detecting rationality of POI (Point of interest) and readable medium
CN109688532B (en) * 2017-10-16 2020-11-24 中移(苏州)软件技术有限公司 Method and device for dividing city functional area
CN108182253B (en) * 2017-12-29 2021-12-28 百度在线网络技术(北京)有限公司 Method and apparatus for generating information
CN108334624A (en) * 2018-02-09 2018-07-27 城市生活(北京)资讯有限公司 A kind of POI identification processing methods and device
CN108876475B (en) * 2018-07-12 2022-02-01 青岛理工大学 City functional area identification method based on interest point acquisition, server and storage medium
CN109523186B (en) * 2018-11-28 2021-02-09 杭州中奥科技有限公司 Urban area division method and device
CN111985514A (en) * 2019-05-23 2020-11-24 顺丰科技有限公司 Business circle identification method and device, electronic equipment and storage medium
CN110334321B (en) * 2019-06-24 2023-03-31 天津城建大学 City rail transit station area function identification method based on interest point data
CN112257970A (en) * 2019-07-22 2021-01-22 山东科技大学 Automatic city functional area dividing method based on interest point big data
CN110766589A (en) * 2019-10-28 2020-02-07 电子科技大学 Method for deducing city function based on communication data and interest point data
CN110866156B (en) * 2019-11-26 2022-05-17 北京明略软件***有限公司 Method, device, equipment and medium for identifying functional park based on social data
CN112784423A (en) * 2021-01-28 2021-05-11 河北师范大学 Urban area feature analysis method based on complex network
CN114937215B (en) * 2022-06-10 2023-03-24 中国科学院地理科学与资源研究所 Method and device for identifying urban functional area
CN116756262A (en) * 2023-08-15 2023-09-15 北京博道焦点科技有限公司 Electronic fence generation method and system based on map interest point auditing

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103456233A (en) * 2012-05-28 2013-12-18 腾讯科技(深圳)有限公司 Method and system for searching interest points based on electronic map
CN104457743A (en) * 2014-12-17 2015-03-25 深圳市亲觅科技有限公司 Footmark processing method of intelligent wearing device
CN104834666A (en) * 2015-03-06 2015-08-12 中山大学 Acoustic environment functional area partitioning method based on road network and interest points
CN105740401A (en) * 2016-01-28 2016-07-06 北京理工大学 Individual behavior and group interest-based interest place recommendation method and device
CN105657725A (en) * 2016-02-01 2016-06-08 重庆邮电大学 Method for defining radiating area of urban function area based on mobile phone signaling data

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
Automated Land Use Identification using Cell-Phone;Víctor Soto 等;《Hotplanet 11 proceedings of the 3rd ACM international workshop on MobiArch》;20110628;1-6 *
Discovering Regions of Different Functions in a City Using Human Mobility and POIs;Jing Yuan 等;《proceedings of the 18th ACM SIGKDD international conference on knowledge discovery and data mining》;20120816;186-194 *
城市功能区划分空间聚类算法研究;辜寄蓉 等;《测绘科学》;20110930;第36卷(第5期);65-67 *
基于POI数据的城市功能区定量识别及其可视化;池娇 等;《测绘地理信息》;20160430;第41卷(第2期);68-73 *
基于呼叫详细记录数据的城市功能区识别;江贵林 等;《计算机应用》;20160710;第36卷(第7期);2046-2050 *
基于手机基站数据的城市交通流量模拟;吴健生 等;《地理学报》;20121231;第67卷(第12期);1657-1665 *

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant