CN108205562B - Positioning data storage and retrieval method and device for geographic information system - Google Patents

Info

Publication number
CN108205562B
Authority
CN
China
Prior art keywords
data
grid
mobile terminal
positioning data
index
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201611178912.3A
Other languages
Chinese (zh)
Other versions
CN108205562A (en)
Inventor
吴超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qianxun Spatial Intelligence Inc
Original Assignee
Qianxun Spatial Intelligence Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qianxun Spatial Intelligence Inc filed Critical Qianxun Spatial Intelligence Inc
Priority to CN201611178912.3A priority Critical patent/CN108205562B/en
Publication of CN108205562A publication Critical patent/CN108205562A/en
Application granted granted Critical
Publication of CN108205562B publication Critical patent/CN108205562B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2282 Tablespace storage structures; Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29 Geographical information databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9537 Spatial or temporal dependent retrieval, e.g. spatiotemporal queries

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Remote Sensing (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to geographic information systems and discloses a method and a device for storing and retrieving positioning data in a geographic information system. In the invention, the geographic area is divided into a plurality of grids, and the correspondence among the identifier corresponding to a mobile terminal, the grid identifier, and the time period is stored in an index. All positioning data of the same mobile terminal in the same grid and the same time period correspond to only one index record, so that all grids containing the positioning data of a specified mobile terminal can be found easily during retrieval, the required positioning data can be obtained efficiently from those grids, and the work spent processing useless data is reduced. The identifier corresponding to the mobile terminal is hashed to select one of a plurality of storage data tables, so that positioning data is stored in and retrieved from a smaller storage data table, which improves storage and retrieval efficiency.

Description

Positioning data storage and retrieval method and device for geographic information system
Technical Field
The invention relates to a geographic information system, in particular to a geographic information data storage and retrieval technology.
Background
The amount of data generated every day by a Geographic Information System (GIS) project is huge and poses great challenges for storage and data query; traditional data indexing methods cannot cope with data volumes at the TB or even PB scale. A new way of handling GIS data is therefore needed.
In GIS projects, spatial indexing plays a key role.
1. Traditional GIS data is stored in databases such as PostGIS, ArcGIS, etc., and data storage is limited by hardware space. With a large amount of data, this storage approach is not easy to scale.
2. The conventional grid index retrieves data as follows:
a. As shown in fig. 1, a uniform grid is formed on a map layer according to the width Δw and the height Δh of each small grid cell, and the grid cell occupied by each primitive, or the set of grid cells it passes through, is calculated.
In fig. 1, the index of point A, with coordinates (x_i, y_i), in the two-dimensional grid array Grid[][] is:
Row: $\mathrm{int}\big((y_i - y_0)/\Delta h\big) + 1$
Column: $\mathrm{int}\big((x_i - x_0)/\Delta w\big) + 1$
The numbers of rows and columns of the two-dimensional grid array can also be obtained with the above formulas.
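The row and column formulas can be sketched in a few lines of Python. This is only an illustration: the function name `grid_cell` and the bounds checks are assumptions, and (x0, y0) is taken to be the origin of the gridded layer.

```python
def grid_cell(x, y, x0, y0, dw, dh):
    """Return the 1-based (row, column) of the grid cell containing (x, y).

    (x0, y0) is assumed to be the origin of the gridded layer; dw and dh are
    the cell width and height. int() truncates toward zero, like the (int)
    cast in the formulas, so the point must satisfy x >= x0 and y >= y0.
    """
    if dw <= 0 or dh <= 0:
        raise ValueError("cell width and height must be positive")
    if x < x0 or y < y0:
        raise ValueError("point lies outside the gridded layer")
    row = int((y - y0) / dh) + 1
    col = int((x - x0) / dw) + 1
    return row, col


print(grid_cell(2.5, 7.25, 0.0, 0.0, 1.0, 2.0))  # prints (4, 3)
```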
b. As shown in fig. 2, a rectangle is drawn around the route traveled by each primitive, and the grid array elements containing the upper left and lower right corners of the rectangle are determined, which yields the set of all grids covered by the rectangle; in fig. 2, the shaded grids are those the line passes through.
c. The elements in the grid set are traversed to obtain the primitives recorded in the list of each grid element.
It can be seen that, when retrieving data, the rectangular boundary of the grids containing the user's data is first calculated from a large amount of data, and to retrieve the truly useful data, every grid within the rectangular boundary is traversed, including a large number of grids that hold only useless data. Searching useless data consumes a great deal of time; if the grids where the data is located could be located directly, the number of grids searched would be reduced and search efficiency improved.
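The cost of the bounding-rectangle approach can be illustrated with a small Python sketch. The function names are hypothetical, and for simplicity a route is represented by its sampled points, so "touched" cells are those containing a sample rather than every cell a line segment crosses.

```python
def cells_in_bounding_rectangle(points, x0, y0, dw, dh):
    """Return every 1-based (row, col) cell inside the rectangle bounding points."""
    def cell(x, y):
        return int((y - y0) / dh) + 1, int((x - x0) / dw) + 1

    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    row_lo, col_lo = cell(min(xs), min(ys))
    row_hi, col_hi = cell(max(xs), max(ys))
    return [(r, c)
            for r in range(row_lo, row_hi + 1)
            for c in range(col_lo, col_hi + 1)]


def cells_touched(points, x0, y0, dw, dh):
    """Return only the cells that actually contain a sampled point."""
    return {(int((y - y0) / dh) + 1, int((x - x0) / dw) + 1) for x, y in points}


route = [(0.5, 0.5), (1.5, 1.5), (2.5, 2.5)]  # a diagonal route
candidates = cells_in_bounding_rectangle(route, 0.0, 0.0, 1.0, 1.0)
touched = cells_touched(route, 0.0, 0.0, 1.0, 1.0)
print(len(candidates), len(touched))  # prints 9 3
```

For this diagonal route, the rectangle forces a traversal of nine cells although only three hold data, which is the overhead the text describes.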
Disclosure of Invention
The invention aims to provide a method and a device for storing and searching positioning data of a geographic information system, which can easily find all grids corresponding to the positioning data of a specified mobile terminal during searching, further efficiently obtain required positioning data according to the corresponding grids and reduce the workload of processing useless data.
In order to solve the above technical problem, an embodiment of the present invention discloses a positioning data storage method for a geographic information system, where the system includes a storage data table for storing positioning data and an index, a geographic area involved in the positioning data is pre-divided into a plurality of grids, and a time range involved in the positioning data is pre-divided into a plurality of time periods, and the storage method includes the following steps:
acquiring positioning data of the mobile terminal, wherein the positioning data comprises an identifier, position information and time information corresponding to the mobile terminal;
obtaining a corresponding grid identifier according to the position information, and obtaining a corresponding time period according to the time information;
if the record comprising the identifier corresponding to the mobile terminal, the grid identifier and the time period does not exist in the index, adding the record into the index;
the positioning data and the grid identifier are combined into a record to be stored in a storage data table.
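The storage steps above can be sketched with in-memory structures. This is a minimal illustration, not the patented implementation: the 0.05 grid size, the textual grid identifier, the one-day period, and the names `store`, `grid_id_for`, and `period_for` are all assumptions, and coordinates are assumed non-negative.

```python
from datetime import datetime

GRID_SIZE = 0.05   # grid side length (assumed for this sketch)

index = set()      # index records: (terminal_id, grid_id, period)
data_table = []    # stored records: positioning data plus grid_id


def grid_id_for(lon, lat):
    """Hypothetical grid identifier: the cell's column and row joined as text."""
    return f"{int(lon / GRID_SIZE)}_{int(lat / GRID_SIZE)}"


def period_for(timestamp):
    """Assume one time period per calendar day."""
    return timestamp.date().isoformat()


def store(terminal_id, lon, lat, timestamp):
    grid_id = grid_id_for(lon, lat)
    period = period_for(timestamp)
    key = (terminal_id, grid_id, period)
    if key not in index:   # add the index record only if it does not exist
        index.add(key)
    data_table.append({"terminal_id": terminal_id, "lon": lon, "lat": lat,
                       "time": timestamp, "grid_id": grid_id})


store("user-1", 117.02, 39.01, datetime(2016, 4, 10, 8, 0))
store("user-1", 117.03, 39.02, datetime(2016, 4, 10, 9, 0))
print(len(index), len(data_table))  # prints 1 2
```

Two position fixes in the same grid on the same day produce two data records but a single index record.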
The embodiment of the invention also discloses a positioning data retrieval method of a geographic information system, the system comprises a storage data table for storing positioning data and an index, the geographic area related to the positioning data is divided into a plurality of grids in advance, the time range related to the positioning data is divided into a plurality of time intervals in advance, the index comprises the corresponding relation of an identifier corresponding to a mobile terminal, a grid identifier and the time intervals, and the retrieval method comprises the following steps:
searching indexes according to the time interval and the corresponding identification of the mobile terminal to obtain a corresponding grid identification set;
and inquiring a storage data table according to the time interval, the identification corresponding to the mobile terminal and the set of the corresponding grid identification to obtain the positioning data.
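The two retrieval steps can be sketched as follows. The sample records, the grid identifiers, and the choice to keep the period on each stored record are assumptions made only to keep the example self-contained.

```python
# Index records: (terminal_id, grid_id, period). Sample values are made up.
index = {
    ("user-1", "g100", "2016-04-10"),
    ("user-1", "g101", "2016-04-10"),
    ("user-2", "g100", "2016-04-10"),
}

# Stored positioning records, each carrying its grid identifier.
data_table = [
    {"terminal_id": "user-1", "grid_id": "g100", "period": "2016-04-10", "lon": 117.02, "lat": 39.01},
    {"terminal_id": "user-1", "grid_id": "g101", "period": "2016-04-10", "lon": 117.07, "lat": 39.01},
    {"terminal_id": "user-2", "grid_id": "g100", "period": "2016-04-10", "lon": 117.03, "lat": 39.02},
    {"terminal_id": "user-1", "grid_id": "g100", "period": "2016-04-11", "lon": 117.02, "lat": 39.03},
]


def find_grids(terminal_id, period):
    """Step 1: look up the index to get the set of grid identifiers."""
    return {g for t, g, p in index if t == terminal_id and p == period}


def retrieve(terminal_id, period):
    """Step 2: query the data table, restricted to the grids found in step 1."""
    grids = find_grids(terminal_id, period)
    return [r for r in data_table
            if r["grid_id"] in grids
            and r["terminal_id"] == terminal_id
            and r["period"] == period]


print(sorted(find_grids("user-1", "2016-04-10")))  # prints ['g100', 'g101']
print(len(retrieve("user-1", "2016-04-10")))       # prints 2
```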
The embodiment of the invention also discloses a positioning data storage device of a geographic information system, the system comprises a storage data table and an index for storing positioning information, a geographic area related to the positioning data is divided into a plurality of grids in advance, a time range related to the positioning data is divided into a plurality of time intervals in advance, and the storage device comprises:
the positioning unit is used for acquiring positioning data of the mobile terminal, and the positioning data comprises an identifier, position information and time information corresponding to the mobile terminal;
the corresponding unit is used for obtaining a corresponding grid mark according to the position information and obtaining a corresponding time interval according to the time information;
an index adding unit, configured to add a record including an identifier corresponding to the mobile terminal, a grid identifier, and a time period to the index if the record does not exist in the index;
and the storage unit is used for combining the positioning data and the grid identifier into one record to be stored in the storage data table.
The embodiment of the invention also discloses a positioning data retrieval device of a geographic information system, the system comprises a storage data table for storing positioning data and an index, the geographic area related to the positioning data is divided into a plurality of grids in advance, the time range related to the positioning data is divided into a plurality of time intervals in advance, the index comprises the corresponding relation of the identification corresponding to the mobile terminal, the grid identification and the time intervals, and the retrieval device comprises:
the grid identification acquisition unit is used for searching indexes according to the time interval and the identification corresponding to the mobile terminal to obtain a set of corresponding grid identifications;
and the retrieval unit is used for inquiring the storage data table according to the time interval, the identification corresponding to the mobile terminal and the set of the corresponding grid identifications to obtain the positioning data.
Compared with the prior art, the main differences and effects of the embodiments of the invention are as follows:
the geographic area is divided into a plurality of grids, the corresponding relation among the corresponding identification of the mobile terminal, the grid identification and the time period is stored in the index, and all positioning data of the same mobile terminal in the same grid and the same time period only correspond to one index record, so that all grids corresponding to the positioning data of the appointed mobile terminal can be easily found during retrieval, required positioning data can be efficiently obtained according to the corresponding grids, and the workload of useless data processing is reduced.
Furthermore, the identifier corresponding to the mobile terminal is hashed to select one of a plurality of storage data tables for storage, so that positioning data is stored in and retrieved from a smaller storage data table, which improves storage and retrieval efficiency.
Further, when the data amount corresponding to one grid exceeds a preset threshold, the grid is dynamically split, and therefore the query efficiency of data is guaranteed.
Drawings
FIG. 1 is a schematic diagram of a division of a geographic area into a grid;
FIG. 2 is a diagram of a prior art grid set;
FIG. 3 is a flow chart illustrating a method for storing positioning data of a geographic information system according to a first embodiment of the present invention;
FIG. 4 is a flowchart illustrating a method for retrieving location data of a geographic information system according to a second embodiment of the present invention;
FIG. 5 is a schematic diagram of a positioning data storage device of a geographic information system according to a third embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a positioning data retrieving device of a geographic information system according to a fourth embodiment of the present invention.
Detailed Description
In the following description, numerous technical details are set forth in order to provide a better understanding of the present application. However, it will be understood by those skilled in the art that the technical solutions claimed in the present application can be implemented without these technical details and with various changes and modifications based on the following embodiments.
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be described in detail with reference to the accompanying drawings.
A first embodiment of the present invention relates to a method for storing positioning data of a geographic information system. Fig. 3 is a flow chart of the positioning data storage method of the geographic information system. The system comprises a storage data table and an index for storing positioning data, wherein the geographic area related to the positioning data is divided into a plurality of grids in advance, and the time range related to the positioning data is divided into a plurality of time periods in advance.
It should be noted that, in the embodiments of the present invention, the geographic area referred to by the positioning data refers to a pre-specified geographic area, i.e. an area where the mobile terminal may be present, such as a city, a region, a country, and so on.
Specifically, as shown in fig. 3, the method for storing the positioning data of the geographic information system includes the following steps:
in step 301, positioning data of the mobile terminal is obtained, where the positioning data includes an identifier, location information, and time information corresponding to the mobile terminal.
The mobile terminal refers to a terminal device used by a user, such as a smart phone, a tablet computer, a navigator and the like. When the user uses the mobile terminal, the user needs to log in first, and the network side can know the user identification corresponding to the mobile terminal after logging in, wherein the user identification is used for identifying different users using the mobile terminal.
In one embodiment of the present invention, the identifier corresponding to the mobile terminal may be a user identifier of a user using the mobile terminal (one user identifier may correspond to one or more mobile terminals).
In another embodiment of the present invention, the identifier corresponding to the mobile terminal is a terminal identifier, such as a SIM card number, a MAC address, an IP address, and the like.
Then, in step 302, a corresponding grid identifier is obtained according to the position information, and a corresponding time period is obtained according to the time information.
In the step of obtaining the corresponding time period according to the time information, the time period is divided in advance, for example, each day is used as one time period, or each time period is 2 hours, and the corresponding time period can be known by knowing the time information. Each time period may be set with a time period identification.
Preferably, in various embodiments of the present invention, the period is one day.
For the identifier corresponding to the same mobile terminal, all position point data generated in the same grid on the same day correspond to only one record in the index.
Step 303 is then entered to determine whether a record comprising the identity, grid identity and time period corresponding to the mobile terminal exists in the index.
If yes, go to step 305; if not, go to step 304.
In step 304, the record including the identifier, the grid identifier and the time period corresponding to the mobile terminal is added to the index.
And if the record comprising the identification corresponding to the mobile terminal, the grid identification and the time period does not exist in the index, adding the record into the index.
Step 305 is then entered to store the positioning data and the grid identification as one record in the stored data table.
If the record (which may include other information) composed of the identifier corresponding to the mobile terminal, the corresponding grid and the corresponding time period already exists in the index, the existing record is kept without making new changes to the index. The corresponding relationship between the identifier corresponding to the mobile terminal, the corresponding grid and the corresponding time period is actually a record in the index table, and for a plurality of positioning data of the same mobile terminal (or user) in the same grid and the same time period, only one record needs to be in the index table.
This flow ends thereafter.
The geographic area is divided into a plurality of grids, the corresponding relation among the corresponding identification of the mobile terminal, the grid identification and the time period is stored in the index, and all positioning data of the same mobile terminal in the same grid and the same time period only correspond to one index record, so that all grids corresponding to the positioning data of the appointed mobile terminal can be easily found during retrieval, required positioning data can be efficiently obtained according to the corresponding grids, and the workload of useless data processing is reduced.
Further, it is preferable that the storage data table is plural. The positioning data storage method of the geographic information system may further include the following steps before step 305:
hashing the identifier corresponding to the mobile terminal to obtain the storage data table corresponding to the positioning data.
The identifier corresponding to the mobile terminal is hashed to select one of the plurality of storage data tables for storage, so that positioning data is stored in and retrieved from a smaller storage data table, which improves storage and retrieval efficiency.
The specific steps for hashing positioning data by user are as follows:
1. An algorithm generates a random value within a certain range and returns it as a 4-digit hexadecimal string.
2. When a user registers, the algorithm is called to obtain a 4-digit hexadecimal string, which is used as the suffix of the user ID.
3. Position data tables in the user dimension are created in advance, each with a 4-digit hexadecimal string as its table name suffix.
4. When a user uploads position data, the data table name is matched according to the 4-digit hexadecimal suffix of the user ID, and information such as the time, grid, and user is written into the index table.
5. When querying positioning data, the user quickly retrieves the data according to the information in the index table.
Hashing in the user dimension mainly means establishing the relationship between a user and a hash table. The approach described here is a user ID suffix.
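Steps 1 and 2 above can be sketched as follows. The use of Python's `secrets` module, the hyphen separating the suffix from the rest of the ID, and the function names are assumptions for illustration; the text only requires a random 4-digit hexadecimal suffix.

```python
import secrets


def new_suffix():
    """Step 1: return a random 4-digit hexadecimal string such as '0A3C'."""
    return f"{secrets.randbelow(0x10000):04X}"


def register_user(base_id):
    """Step 2: append the suffix to the user ID at registration time.

    The hyphen separator is an assumption of this sketch.
    """
    return f"{base_id}-{new_suffix()}"


user_id = register_user("WZ20160403112200")
print(user_id)
```

With four hexadecimal digits there are 16^4 = 65536 possible suffixes, so at most 65536 position data tables need to be created in advance in step 3.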
It should be noted that, in the embodiments of the present invention, data is hashed in the user dimension mainly because location information is ultimately associated with a user, and data is very likely to be retrieved with the user as the dimension. This is a preferred embodiment; in other embodiments of the present invention, the data may be hashed in other dimensions without limitation, the purpose of hashing being to make each data set smaller, which benefits data retrieval. Hashing by the device that uploads the data is also an option; it only requires establishing the correspondence between devices and hash tables.
However, some grids in which large cities are located may generate data hotspots, forming a hotspot grid. For example: the amount of one grid data in the Shanghai area is far larger than that in the Xinjiang area, and under the condition, the hot spot grid data can be refined and split, so that the data are relatively balanced as much as possible, and the generation of data hot spots is avoided.
Therefore, further, preferably, the positioning data storage method of the geographic information system further includes the following steps:
and periodically inquiring the data amount corresponding to each grid in the stored data table, and if the data amount corresponding to one grid is greater than a preset first threshold, splitting the grid into a plurality of grids so that the data amount of each split grid is less than a preset second threshold. Wherein the first threshold is greater than or equal to the second threshold.
When the data amount corresponding to one grid exceeds a preset threshold, the grid is dynamically split, so that the query efficiency of data is ensured.
If a grid that originally had no data hotspot becomes a hotspot grid, it can still be split in the above manner. After the split, however, the positioning data corresponding to the original grid is migrated to the new grids, and the grid identifiers in the index are cleaned up and changed to the new grid identifiers.
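The two-threshold splitting rule can be sketched as below. The strategy of increasing the number of sub-grids per side until every sub-grid falls under the second threshold, the `max_factor` cap, and the function name are assumptions; the text does not prescribe how the split count is chosen.

```python
from collections import Counter


def split_hot_grid(points, lon0, lat0, size,
                   first_threshold, second_threshold, max_factor=64):
    """Split one square grid if it holds more than first_threshold points.

    points are (lon, lat) pairs inside the grid whose lower-left corner is
    (lon0, lat0) and whose side length is size. Returns (k, counts), where k
    is the number of sub-grids per side and counts maps (i, j) sub-grid
    positions to the number of points they hold.
    """
    if first_threshold < second_threshold:
        raise ValueError("first_threshold must be >= second_threshold")
    if len(points) <= first_threshold:
        return 1, Counter({(0, 0): len(points)})   # not a hotspot: no split
    for k in range(2, max_factor + 1):
        step = size / k
        counts = Counter(
            (min(int((lon - lon0) / step), k - 1),
             min(int((lat - lat0) / step), k - 1))
            for lon, lat in points
        )
        if max(counts.values()) < second_threshold:
            return k, counts
    raise RuntimeError("could not balance the grid within max_factor")


pts = [(0.1, 0.1), (0.6, 0.1), (0.1, 0.6), (0.6, 0.6)]
k, counts = split_hot_grid(pts, 0.0, 0.0, 1.0, first_threshold=3, second_threshold=2)
print(k)  # prints 2
```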
A second embodiment of the present invention relates to a method for searching for location data in a geographic information system. Fig. 4 is a flow chart of the positioning data retrieval method of the geographic information system. The system comprises a storage data table used for storing positioning data and an index, wherein the geographic area related to the positioning data is divided into a plurality of grids in advance, the time range related to the positioning data is divided into a plurality of time periods in advance, and the index comprises the corresponding relation of the identification corresponding to the mobile terminal, the grid identification and the time periods.
It should be noted that, in the embodiments of the present invention, the geographic area referred to by the positioning data refers to a pre-specified geographic area, i.e. an area where the mobile terminal may be present, such as a city, a region, a country, and so on.
The mobile terminal refers to a terminal device used by a user, such as a smart phone, a tablet computer, a navigator and the like. When the user uses the mobile terminal, the user needs to log in first, and the network side can know the user identification corresponding to the mobile terminal after logging in, wherein the user identification is used for identifying different users using the mobile terminal.
In one embodiment of the present invention, the identifier corresponding to the mobile terminal may be a user identifier of a user using the mobile terminal (one user identifier may correspond to one or more mobile terminals).
In another embodiment of the present invention, the identifier corresponding to the mobile terminal is a terminal identifier, such as a SIM card number, a MAC address, an IP address, and the like.
The time period is divided in advance, for example, each day is used as one time period, or each 2 hours is used as one time period, and the corresponding time period can be known by knowing the time information. Each time period may be set with a time period identification.
Preferably, in various embodiments of the present invention, the period is one day.
For the identifier corresponding to the same mobile terminal, all position point data generated in the same grid on the same day correspond to only one record in the index.
Specifically, as shown in fig. 4, the positioning data retrieving method of the geographic information system includes the following steps:
in step 401, an index is searched according to the time interval and the identifier corresponding to the mobile terminal, and a set of corresponding grid identifiers is obtained.
In addition, it can be understood that in the index, only one record is provided for the identifier corresponding to the same mobile terminal in the same grid and the same time period.
Then, step 402 is entered, and the stored data table is queried according to the time interval, the identifier corresponding to the mobile terminal and the set of the corresponding grid identifiers, so as to obtain the positioning data.
This flow ends thereafter.
Further, it is preferable that the storage data table is plural. The positioning data retrieving method of the geographic information system may further include the following steps before step 402:
hashing the identifier corresponding to the mobile terminal to obtain the storage data table corresponding to the positioning data.
The identifier corresponding to the mobile terminal is hashed to select one of the plurality of storage data tables, so that positioning data is stored in and retrieved from a smaller storage data table, which improves storage and retrieval efficiency.
Further, preferably, the positioning data retrieving method of the geographic information system further includes the steps of:
and periodically inquiring the data amount corresponding to each grid in the stored data table, and if the data amount corresponding to one grid is greater than a preset first threshold, splitting the grid into a plurality of grids so that the data amount of each split grid is less than a preset second threshold. Wherein the first threshold is greater than or equal to the second threshold.
This embodiment is a method embodiment corresponding to the first embodiment, and may be implemented in cooperation with the first embodiment. The related technical details mentioned in the first embodiment remain valid in this embodiment and are not repeated here. Accordingly, the related technical details mentioned in this embodiment can also be applied to the first embodiment.
A preferred embodiment of the present invention will be described in detail below.
A grid splitting mode:
Preferably, the grid can be split by longitude and latitude, with every 0.05 of longitude and 0.05 of latitude forming one grid. For example, the intersection of east longitude 130.00-130.05 and north latitude 120.00-120.05 is logically defined as one grid, and so on. In some hotspot areas, the grid needs to be split more finely to solve the data hotspot problem. For example, Tianjin is located between 38°34′ and 40°15′ north latitude and between 116°43′ and 118°04′ east longitude, and the intersection of north latitude 39°00′-39°05′ and east longitude 117°00′-117°05′ is one logical grid. Some grids in which large cities are located may generate data hotspots. The data volume of a hotspot area is large; although the data has already been hashed in the user dimension during table design, data hotspots may still arise within the fragments of a hash table. In this case, the logical grid can be split further. For example, the intersection of north latitude 39°00′-39°05′ and east longitude 117°00′-117°05′ is originally a single grid; it can be split further, and the number of resulting grids can be determined according to the data volume, so as to obtain the best grid size and data retrieval experience.
For example, the grid covering north latitude 39°00′-39°05′ and east longitude 117°00′-117°05′ can be subdivided at intervals of 0.005, one tenth of the 0.05 spacing in each direction, so that one grid is split into 10 × 10 = 100 grids. The actual number of grids after splitting can be determined case by case.
If a grid that originally had no data hotspot becomes a hotspot grid, it can still be split in the above manner. After the split, however, the data in the original data fragment is moved to the new fragments, and the fragment-related information in the index is cleaned up and changed to the new fragment identifiers.
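The migration after a split can be sketched as follows: records of the old grid are reassigned to sub-grids, and the stale index rows are replaced. The sub-grid naming scheme ("old id, dot, sub-grid number"), the record layout, and the function name `migrate` are all hypothetical.

```python
def migrate(records, index, old_grid, lon0, lat0, size, factor=10):
    """Reassign records of old_grid to factor x factor sub-grids.

    records is a list of dicts updated in place; index is a set of
    (terminal_id, grid_id, period) rows. Returns the rebuilt index.
    (lon0, lat0) is the lower-left corner of the old grid, size its side.
    """
    step = size / factor
    migrated = []
    for rec in records:
        if rec["grid_id"] != old_grid:
            continue
        i = min(int((rec["lon"] - lon0) / step), factor - 1)
        j = min(int((rec["lat"] - lat0) / step), factor - 1)
        rec["grid_id"] = f"{old_grid}.{j * factor + i}"   # hypothetical naming
        migrated.append(rec)
    # Drop index rows that still point at the old grid, then add the new ones.
    new_index = {row for row in index if row[1] != old_grid}
    new_index |= {(r["terminal_id"], r["grid_id"], r["period"]) for r in migrated}
    return new_index


records = [
    {"terminal_id": "u1", "grid_id": "100", "period": "2016-04-10", "lon": 117.001, "lat": 39.001},
    {"terminal_id": "u1", "grid_id": "100", "period": "2016-04-10", "lon": 117.049, "lat": 39.049},
]
index = {("u1", "100", "2016-04-10")}
index = migrate(records, index, "100", 117.0, 39.0, 0.05)
print(sorted(index))
```

After the call, the single index row for grid 100 is replaced by one row per sub-grid that actually holds data.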
The specific implementation process of the technical scheme comprises the following steps:
1. mesh partitioning
Taking China as an example:
The easternmost point is at east longitude 135°2′, at the confluence of the Heilongjiang and Wusuli rivers.
The westernmost point is at east longitude 73°40′, at the Uzbel Pass on the Pamir Plateau (Wuqia County).
The southernmost point is at north latitude 3°52′, at Zengmu Ansha in the Nansha Islands.
The northernmost point is at north latitude 53°33′, on the main channel of the Heilongjiang River north of Mohe (Mohe County).
China therefore spans longitudes 73°40′ to 135°2′ and latitudes 3°52′ to 53°33′. Taking these bounds as 73.4 to 135.2 and 3.52 to 53.33, the longitude span of China is 61.8 and the latitude span is 49.81. With a grid interval of 0.05, the Chinese region has 61.8 × 20 = 1236 grids along the longitude direction and 49.81 × 20 = 996.2, rounded up to 997, along the latitude direction.
Dividing at 0.05 degrees, the maximum number of grids for China is therefore 1236 × 997 = 1232292.
Each grid corresponds to a grid number. For example, the number corresponding to the grid covering north latitude 39°00′-39°05′ and east longitude 117°00′-117°05′ is 100. The grid numbers for China range from 1 to 1232292.
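The grid-count arithmetic can be checked with a short sketch. The bounds are the decimal values used in the calculation above; the row-major numbering in `grid_number` is purely an assumption, since the text only states that numbers run from 1 to 1232292 and does not define the numbering order.

```python
import math

LON_MIN, LON_MAX = 73.4, 135.2   # bounds as used in the arithmetic above
LAT_MIN, LAT_MAX = 3.52, 53.33
STEP = 0.05

# round() removes floating-point noise before rounding up to whole grids.
COLS = math.ceil(round((LON_MAX - LON_MIN) / STEP, 6))
ROWS = math.ceil(round((LAT_MAX - LAT_MIN) / STEP, 6))


def grid_number(lon, lat):
    """Hypothetical row-major numbering starting at 1 in the south-west corner."""
    if not (LON_MIN <= lon <= LON_MAX and LAT_MIN <= lat <= LAT_MAX):
        raise ValueError("point lies outside the gridded region")
    col = min(int((lon - LON_MIN) / STEP), COLS - 1)
    row = min(int((lat - LAT_MIN) / STEP), ROWS - 1)
    return row * COLS + col + 1


print(COLS, ROWS, COLS * ROWS)  # prints 1236 997 1232292
```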
2. Grid index table design
When each user device uploads data, the data includes longitude and latitude information, and a program calculates the grid number corresponding to each data point from that information. If a user generates position point data in a certain grid on a certain date, the index table records the hash table corresponding to the user (that is, the hash table in which the user's data is stored; alternatively, the hash table ID may be left unrecorded and computed from the user ID each time), together with the grid number and the date of the position point data. Position point data generated in the same grid on the same day is recorded only once in the index table. For example, if a user's activity on April 10, 2016 falls within one grid, the index records that date and grid once, and further data uploaded for that grid on that day does not change the index; when the user uploads data for April 11, a new record with the date and grid position is added. Usually a user retrieves location data for a period of time, and the grids can be located quickly based on the date. If the location point data being searched spans multiple grids, the results must be merged and the data handled in program logic: because the data is fragmented, a query returns multiple result sets, which need to be combined.
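The merging of results from multiple grids can be sketched with the standard library. The assumption here is that each per-grid result list is already sorted by time and that records carry a "time" field in ISO 8601 text form; neither detail is specified in the text.

```python
import heapq
from operator import itemgetter


def merge_grid_results(results_per_grid):
    """Merge per-grid result lists, each sorted by time, into one ordered list."""
    return list(heapq.merge(*results_per_grid, key=itemgetter("time")))


grid_100 = [
    {"time": "2016-04-10T08:00", "grid": 100},
    {"time": "2016-04-10T10:00", "grid": 100},
]
grid_101 = [
    {"time": "2016-04-10T09:00", "grid": 101},
]

merged = merge_grid_results([grid_100, grid_101])
print([r["grid"] for r in merged])  # prints [100, 101, 100]
```

`heapq.merge` keeps the merge lazy and linear in the total number of records, which suits the case where each grid partition returns its own sorted result set.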
3. Data storage
The grid index data is stored in MySQL. Because its data volume is not large, locating a grid takes little time.
MySQL is a compact SQL database management system. It is distributed under a dual license (the open-source GPL and a commercial license), so in many cases it can be used freely. Owing to its powerful functions, flexibility, rich application programming interfaces (APIs) and well-designed architecture, it is favored by free software enthusiasts and commercial software users alike, and, combined with Apache and PHP/Perl, it provides strong support for building database-driven dynamic websites. MySQL is a true multi-user, multi-threaded SQL database server. SQL (Structured Query Language) is the most popular standardized database language in the world. MySQL is implemented in a client/server architecture consisting of a server daemon, mysqld, and many different client programs and libraries.
Meanwhile, the program logic uses a hash algorithm to determine the table in which a user's data is located. For example, suppose the user's ID is WZ20160403112200-0A3C. 0A3C is a hexadecimal string, and this code can correspond to the naming rule of the hash tables, which are named as follows:
Cores0000
Cores0001
…………
Cores0A3C
…………
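A minimal sketch of the table-naming rule follows. It assumes that the four hexadecimal characters after the last hyphen of the user ID select the table, as in the example ID above; the function name `table_for_user` and the strict validation are illustrative choices, not part of the described method.

```python
import string


def table_for_user(user_id: str, prefix: str = "Cores") -> str:
    """Derive the storage table name from the hexadecimal suffix of a user ID.

    Assumption: the suffix is the four characters after the last hyphen.
    """
    _, sep, suffix = user_id.rpartition("-")
    if not sep or len(suffix) != 4:
        raise ValueError("user ID has no 4-character suffix after a hyphen")
    if any(ch not in string.hexdigits for ch in suffix):
        raise ValueError("suffix is not a hexadecimal string")
    return prefix + suffix.upper()
```

With a four-digit hexadecimal suffix there are at most 65536 such tables, so each table holds only a fraction of all users' data.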
4. Data slicing
The positioning data is stored in an ODPS database, which may be sliced according to the grid number.
CREATE TABLE Cores0000 () PARTITIONED BY (gridNo)
The ODPS database is a cloud database that can store and process massive amounts of data and can be scaled out without practical limit. It mainly serves the storage and computation of batches of structured data, and can provide massive data warehouse solutions as well as analysis and modeling services for big data. As the means of collecting data are continuously enriched and improved, more and more industry data is being accumulated, and data sizes have grown to massive levels (hundreds of GB, TB, or even PB) that traditional software cannot handle. When analyzing massive data, analysts usually adopt distributed computing because of the limited processing capability of a single server. However, distributed computing models place high demands on data analysts and are not easy to maintain: analysts must not only understand the business requirements but also be familiar with the underlying computing model. The purpose of ODPS is to provide users with a convenient means of analyzing and processing massive data, so that users can analyze big data without being concerned with the details of distributed computing.
The ODPS database is a product from Aliyun (Alibaba Cloud) that integrates Hadoop, Spark, HBase and Hive, and it has advantages in massive data storage and computation. In the embodiments of the present invention, the data need not be stored in ODPS; however, because the volume of positioning data is large, the database storing it must be distributed, scalable and highly available. The Hadoop ecosystem, or databases such as Greenplum, can also meet these requirements.
5. Data indexing
Data is generally retrieved by time range, so the fields used in retrieval are indexed to increase retrieval efficiency.
6. Data retrieval process
The user equipment connects to the application service, and the application service uses the user table and the index table to locate the table and the grids in which the required data is stored, thereby achieving rapid retrieval.
In the present invention, the user is used as a dimension, and an index of user, grid, time and partition information is established in the MySQL database. The positioning data is stored in an ODPS database (or another cloud database that can store and process massive data and scale out without practical limit). The data is first dispersed along the user dimension and distributed into different data tables, so that the data is scattered as much as possible and the data of different users is separated, which reduces the amount of data searched and improves retrieval efficiency. Meanwhile, when the positioning data is stored in a data table, the grid in which each data point is located is identified, and the data table is partitioned by grid and time.
Compared with the traditional grid index, the grid index method disclosed by the invention has the following advantages:
(1) the boundary of the data required by the user is not required to be calculated in a large amount of data;
(2) invalid grid retrieval is reduced, thereby improving efficiency.
As can be seen from the above description, the present invention discloses an algorithm for indexing in data query optimization. In the invention, GIS data is divided into different grids according to different longitudes and latitudes, the data of each grid corresponds to a data partition, the numbers of the data partitions and the numbers of the grids have a corresponding relation, and a grid data relation table maintains the corresponding relation; the grids where all tracks generated by each user device at a certain time are located are recorded in a user grid relation table; the relation table rapidly positions the grids where the data required to be inquired by the user are located through three dimensions of the user, the grids and the time, and then searches the data partitions where the required data are located through the corresponding relation between the grids and the data partitions, so that the data are searched on the smaller data partitions every time, and the data inquiry efficiency is improved.
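The two-stage lookup summarized above (index first, then only the matching partitions) can be sketched with in-memory structures. Everything here is an illustrative assumption: the class `PositionStore`, the use of the last four characters of the user ID to pick a table, ISO timestamp strings, and plain dictionaries standing in for the MySQL index and the partitioned storage tables.

```python
from collections import defaultdict


class PositionStore:
    """Toy model: an index of (user, grid, day) plus grid-partitioned tables."""

    def __init__(self):
        self.index = set()  # entries are (user_id, grid_no, day)
        # table name -> grid number (partition) -> list of records
        self.tables = defaultdict(lambda: defaultdict(list))

    @staticmethod
    def table_of(user_id):
        # Assumed hashing rule: the last four characters select the table.
        return "Cores" + user_id[-4:]

    def store(self, user_id, grid_no, timestamp, lon, lat):
        day = timestamp[:10]  # "YYYY-MM-DD" part of an ISO timestamp
        # A set ignores duplicates, so each user/grid/day is indexed once.
        self.index.add((user_id, grid_no, day))
        record = (user_id, timestamp, lon, lat)
        self.tables[self.table_of(user_id)][grid_no].append(record)

    def retrieve(self, user_id, first_day, last_day):
        # Stage 1: the index yields the grids this user touched in the range.
        grids = {
            grid for (user, grid, day) in self.index
            if user == user_id and first_day <= day <= last_day
        }
        # Stage 2: scan only those partitions of the user's table.
        table = self.tables.get(self.table_of(user_id), {})
        hits = []
        for grid in grids:
            for record in table.get(grid, []):
                if record[0] == user_id and first_day <= record[1][:10] <= last_day:
                    hits.append(record)
        # Results from several grids are merged into one time-ordered list.
        return sorted(hits, key=lambda record: record[1])
```

The final sort is one simple way to merge results that span several grids; a real deployment would push the user, grid and time filters down into the database query instead of scanning in application code.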
The method embodiments of the present invention may be implemented in software, hardware, firmware, etc. Whether the present invention is implemented as software, hardware, or firmware, the instruction code may be stored in any type of computer-accessible memory (e.g., permanent or modifiable, volatile or non-volatile, solid or non-solid, fixed or removable media, etc.). Also, the Memory may be, for example, Programmable Array Logic (PAL), Random Access Memory (RAM), Programmable Read Only Memory (PROM), Read-Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), a magnetic disk, an optical disk, a Digital Versatile Disk (DVD), or the like.
A third embodiment of the present invention relates to a positioning data storage device of a geographic information system. Fig. 5 is a schematic structural diagram of the positioning data storage device of the geographic information system. The system comprises a storage data table and an index for storing positioning information, a geographical area related to the positioning data is divided into a plurality of grids in advance, and a time range related to the positioning data is divided into a plurality of time periods in advance.
Specifically, as shown in fig. 5, the positioning data storage device of the geographic information system includes:
a positioning unit, configured to acquire positioning data of the mobile terminal, the positioning data comprising an identifier corresponding to the mobile terminal, position information and time information;
a corresponding unit, configured to obtain the corresponding grid identifier according to the position information and to obtain the corresponding time period according to the time information;
an index adding unit, configured to add a record comprising the identifier corresponding to the mobile terminal, the grid identifier and the time period to the index if that record does not exist in the index; and
a storage unit, configured to combine the positioning data and the grid identifier into one record and store it in the storage data table.
Further, preferably, there are a plurality of storage data tables. The positioning data storage device of the geographic information system further comprises:
a hashing unit, configured to hash the identifier corresponding to the mobile terminal to obtain the storage data table corresponding to the positioning data.
Hashing the identifier corresponding to the mobile terminal selects one of the storage data tables for storage, so that storage and retrieval of the positioning data are carried out in a smaller storage data table, which improves storage and retrieval efficiency.
The device further comprises:
a splitting unit, configured to periodically query the data volume corresponding to each grid in the storage data table and, if the data volume corresponding to a grid is greater than a predetermined first threshold, to split that grid into a plurality of grids so that the data volume of each resulting grid is smaller than a predetermined second threshold, wherein the first threshold is greater than or equal to the second threshold.
When the data amount corresponding to one grid exceeds a preset threshold, the grid is dynamically split, so that the query efficiency of data is ensured.
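The text does not say how a grid is split, so the following sketch makes its own assumption: an overfull grid is divided recursively into four equal quadrants until every resulting cell holds fewer points than the second threshold. The function names, the half-open cell boundaries and the `max_depth` guard (which stops the recursion when many points coincide) are all hypothetical.

```python
def split_grid(bounds, points, second_threshold, max_depth=8):
    """Split a cell into quadrants until each holds < second_threshold points.

    bounds is (west, south, east, north); points are (lon, lat) pairs with
    west <= lon < east and south <= lat < north (half-open cells, assumed).
    Returns a list of (bounds, points) pairs.
    """
    if len(points) < second_threshold or max_depth == 0:
        return [(bounds, points)]
    west, south, east, north = bounds
    mid_lon = (west + east) / 2
    mid_lat = (south + north) / 2
    quadrants = [
        (west, south, mid_lon, mid_lat),
        (mid_lon, south, east, mid_lat),
        (west, mid_lat, mid_lon, north),
        (mid_lon, mid_lat, east, north),
    ]
    cells = []
    for quad in quadrants:
        inside = [
            p for p in points
            if quad[0] <= p[0] < quad[2] and quad[1] <= p[1] < quad[3]
        ]
        cells.extend(split_grid(quad, inside, second_threshold, max_depth - 1))
    return cells


def maybe_split(bounds, points, first_threshold, second_threshold):
    """Split only when the data volume exceeds the first threshold."""
    if first_threshold < second_threshold:
        raise ValueError("first threshold must be >= second threshold")
    if len(points) <= first_threshold:
        return [(bounds, points)]
    return split_grid(bounds, points, second_threshold)
```

Requiring the first threshold to be at least the second, as the text does, means a freshly split cell sits below the split trigger, so the periodic check does not immediately split it again.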
Of course, the hash unit and the split unit are only preferred, and in other embodiments of the present invention, the hash unit and the split unit may not be provided, and are not limited thereto.
The first embodiment is a method embodiment corresponding to the present embodiment, and the present embodiment can be implemented in cooperation with the first embodiment. The related technical details mentioned in the first embodiment are still valid in this embodiment, and are not described herein again in order to reduce repetition. Accordingly, the related-art details mentioned in the present embodiment can also be applied to the first embodiment.
A fourth embodiment of the present invention relates to a positioning data retrieval device of a geographic information system. Fig. 6 is a schematic structural diagram of the positioning data retrieval device of the geographic information system. The system comprises a storage data table for storing positioning data and an index, wherein the geographic area related to the positioning data is divided into a plurality of grids in advance, the time range related to the positioning data is divided into a plurality of time periods in advance, and the index comprises the correspondence between the identifier corresponding to the mobile terminal, the grid identifier and the time period.
Specifically, as shown in fig. 6, the positioning data retrieving apparatus of the geographic information system includes:
a grid identifier acquisition unit, configured to search the index according to the time period and the identifier corresponding to the mobile terminal to obtain a set of corresponding grid identifiers; and
a retrieval unit, configured to query the storage data table according to the time period, the identifier corresponding to the mobile terminal and the set of corresponding grid identifiers to obtain the positioning data.
Further, preferably, there are a plurality of storage data tables. The positioning data retrieval device of the geographic information system further comprises:
a hashing unit, configured to hash the identifier corresponding to the mobile terminal to obtain the storage data table corresponding to the positioning data.
Hashing the identifier corresponding to the mobile terminal selects one of the storage data tables, so that storage and retrieval of the positioning data are carried out in a smaller storage data table, which improves storage and retrieval efficiency.
The device further comprises:
a splitting unit, configured to periodically query the data volume corresponding to each grid in the storage data table and, if the data volume corresponding to a grid is greater than a predetermined first threshold, to split that grid into a plurality of grids so that the data volume of each resulting grid is smaller than a predetermined second threshold, wherein the first threshold is greater than or equal to the second threshold.
When the data amount corresponding to one grid exceeds a preset threshold, the grid is dynamically split, so that the query efficiency of data is ensured.
Of course, the hash unit and the split unit are only preferred, and in other embodiments of the present invention, the hash unit and the split unit may not be provided, and are not limited thereto.
The second embodiment is a method embodiment corresponding to the present embodiment, and the present embodiment can be implemented in cooperation with the second embodiment. The related technical details mentioned in the second embodiment are still valid in this embodiment, and are not described herein again in order to reduce repetition. Accordingly, the related-art details mentioned in the present embodiment can also be applied to the second embodiment.
It should be noted that each unit in the device embodiments of the present invention is a logical unit. Physically, a logical unit may be one physical unit, a part of one physical unit, or a combination of multiple physical units. The physical implementation of these logical units is not what matters most; the combination of the functions they implement is the key to solving the technical problem addressed by the present invention. Furthermore, in order to highlight the innovative part of the invention, the above device embodiments do not introduce units that are less relevant to solving that technical problem; this does not mean that no other units are present in those embodiments.
It is to be noted that in the claims and the description of the present patent, relational terms such as first and second are used solely to distinguish one entity or action from another entity or action, without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a" does not exclude the presence of additional identical elements in the process, method, article, or apparatus that comprises the element.
While the invention has been shown and described with reference to certain preferred embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention.

Claims (10)

1. A method for storing positioning data of a geographic information system, the system comprising a storage table for storing positioning data and an index, a geographic area referred to by the positioning data being pre-divided into a plurality of grids, a time range referred to by the positioning data being pre-divided into a plurality of time periods, the method comprising the steps of:
acquiring positioning data of the mobile terminal, wherein the positioning data comprises an identifier, position information and time information corresponding to the mobile terminal;
obtaining a corresponding grid identifier according to the position information, and obtaining a corresponding time period according to the time information;
if the record comprising the identifier corresponding to the mobile terminal, the grid identifier and the time period does not exist in the index, adding the record into the index; wherein, all the positioning data of the same mobile terminal in the same grid and the same time period only correspond to one index record;
combining the positioning data and the grid identifier into a record to be stored in the storage data table.
2. The method for storing positioning data of a geographic information system according to claim 1, wherein there are a plurality of said storage data tables;
the step of combining the positioning data and the grid identifier into a record to be stored in the storage data table comprises the following step:
hashing according to the identifier corresponding to the mobile terminal to obtain the storage data table corresponding to the positioning data.
3. The method for storing location data of a geographic information system according to claim 1, further comprising the steps of:
periodically inquiring the data volume corresponding to each grid in the storage data table, and if the data volume corresponding to one grid is greater than a predetermined first threshold, splitting the grid into a plurality of grids so that the data volume of each split grid is less than a predetermined second threshold; wherein the first threshold is greater than or equal to the second threshold.
4. The method for storing positioning data of a geographic information system according to claim 1, wherein said time period is one day;
for the identifier corresponding to the same mobile terminal, all position point data generated in the same grid on the same day corresponds to only one record in the index.
5. A positioning data retrieval method of a geographic information system, characterized in that the system comprises a storage data table for storing positioning data and an index, a geographic area related to the positioning data is divided into a plurality of grids in advance, a time range related to the positioning data is divided into a plurality of time periods in advance, the index comprises a correspondence between an identifier corresponding to a mobile terminal, a grid identifier and a time period, and all the positioning data of the same mobile terminal in the same grid and the same time period corresponds to only one index record, the method comprising the following steps:
searching the index according to the time period and the identifier corresponding to the mobile terminal to obtain a set of corresponding grid identifiers;
querying the storage data table according to the time period, the identifier corresponding to the mobile terminal and the set of corresponding grid identifiers to obtain the positioning data.
6. The positioning data retrieval method of a geographic information system according to claim 5, wherein there are a plurality of said storage data tables;
the following step is further performed before the step of querying the storage data table:
hashing according to the identifier corresponding to the mobile terminal to obtain the storage data table corresponding to the positioning data.
7. A positioning data storage device of a geographic information system, said system comprising a storage data table for storing positioning information and an index, a geographic area referred to by the positioning data being pre-divided into a plurality of grids, a time range referred to by the positioning data being pre-divided into a plurality of time periods, said storage device comprising:
the positioning unit is used for acquiring positioning data of the mobile terminal, and the positioning data comprises an identifier, position information and time information corresponding to the mobile terminal;
the corresponding unit is used for obtaining a corresponding grid identifier according to the position information and obtaining a corresponding time period according to the time information;
an index adding unit, configured to add a record including the identifier corresponding to the mobile terminal, the grid identifier, and the time period to the index if the record does not exist in the index; wherein, all the positioning data of the same mobile terminal in the same grid and the same time period only correspond to one index record;
and the storage unit is used for combining the positioning data and the grid identifier into a record to be stored in the storage data table.
8. The positioning data storage device of a geographic information system according to claim 7, wherein there are a plurality of said storage data tables;
the storage device further includes:
a hashing unit, used for hashing according to the identifier corresponding to the mobile terminal to obtain the storage data table corresponding to the positioning data.
9. The positioning data storage device of geographic information system of claim 7, further comprising:
the splitting unit is used for periodically inquiring the data volume corresponding to each grid in the storage data table, and if the data volume corresponding to one grid is greater than a preset first threshold, splitting the grid into a plurality of grids so that the data volume of each split grid is smaller than a preset second threshold; wherein the first threshold is greater than or equal to the second threshold.
10. A positioning data retrieval device of a geographic information system, characterized in that the system comprises a storage data table for storing positioning data and an index, a geographic area related to the positioning data is divided into a plurality of grids in advance, a time range related to the positioning data is divided into a plurality of time periods in advance, the index comprises a correspondence between an identifier corresponding to a mobile terminal, a grid identifier and a time period, and all the positioning data of the same mobile terminal in the same grid and the same time period corresponds to only one index record, the retrieval device comprising:
a grid identifier acquisition unit, used for searching the index according to the time period and the identifier corresponding to the mobile terminal to obtain a set of corresponding grid identifiers;
a retrieval unit, used for querying the storage data table according to the time period, the identifier corresponding to the mobile terminal and the set of corresponding grid identifiers to obtain the positioning data.
CN201611178912.3A 2016-12-19 2016-12-19 Positioning data storage and retrieval method and device for geographic information system Active CN108205562B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611178912.3A CN108205562B (en) 2016-12-19 2016-12-19 Positioning data storage and retrieval method and device for geographic information system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611178912.3A CN108205562B (en) 2016-12-19 2016-12-19 Positioning data storage and retrieval method and device for geographic information system

Publications (2)

Publication Number Publication Date
CN108205562A CN108205562A (en) 2018-06-26
CN108205562B true CN108205562B (en) 2022-04-08

Family

ID=62602363

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611178912.3A Active CN108205562B (en) 2016-12-19 2016-12-19 Positioning data storage and retrieval method and device for geographic information system

Country Status (1)

Country Link
CN (1) CN108205562B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant