CN113946700A - Space-time index construction method and device, computer equipment and storage medium - Google Patents

Info

Publication number
CN113946700A
Authority
CN
China
Prior art keywords
temporal
data
spatio
time
partition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111042088.XA
Other languages
Chinese (zh)
Inventor
阎继宁
王力哲
王志鹏
刘洪�
邓泽
陈云亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China University of Geosciences
Original Assignee
China University of Geosciences
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China University of Geosciences filed Critical China University of Geosciences
Priority to CN202111042088.XA priority Critical patent/CN113946700A/en
Publication of CN113946700A publication Critical patent/CN113946700A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/51Indexing; Data structures therefor; Storage structures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/587Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using geographical or spatial information, e.g. location

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Library & Information Science (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a spatio-temporal index construction method comprising: acquiring remote sensing data, extracting metadata of the remote sensing data, and uniformly encoding the metadata to obtain first data; encoding the first data based on its time attribute and space attribute to obtain a spatio-temporal grid index; constructing a distributed database, and pre-partitioning a data table in the distributed database based on the spatio-temporal grid index to obtain a plurality of partitions; and storing the spatio-temporal grid index in the corresponding partition of the distributed database. The method effectively combines the time and space information of the remote sensing data, which improves retrieval efficiency. Pre-partitioning the data table avoids the resource consumption caused by partition splitting and keeps the data volume of the partitions as balanced as possible.

Description

Space-time index construction method and device, computer equipment and storage medium
Technical Field
The invention relates to the technical field of remote sensing data retrieval, and in particular to a spatio-temporal index construction method and device, computer equipment and a storage medium.
Background
With the development of observation technology, more and more remote sensing satellites take part in earth observation, so the volume of remote sensing image data keeps growing, and various indexing schemes have been proposed to retrieve massive remote sensing data more accurately. One existing approach builds two tables to implement a secondary index. It supports range queries effectively, but having to build two tables makes the database hard to maintain, and query efficiency is low when queries must be repeated. Another approach builds GRIST, a two-layer spatio-temporal index based on GeoHash and R-Tree: the first layer uses GeoHash spatial grid coding, and the second layer uses an R-Tree to handle time. This index is a considerable improvement over the GeoMesa and PostGIS systems, but it does not combine time information and space information effectively, and the index is stored on the data nodes, which leads to uneven data distribution, so retrieval and query efficiency remain low.
Disclosure of Invention
The invention solves the problem of how to improve the retrieval efficiency of the remote sensing data with unevenly distributed time and space.
In order to solve the above problems, the present invention provides a method for constructing a spatio-temporal index, comprising:
acquiring remote sensing data, extracting metadata of the remote sensing data, and uniformly coding the metadata to obtain first data; coding the first data based on the time attribute and the space attribute of the first data to obtain a spatio-temporal grid index; constructing a distributed database, and pre-partitioning a data table in the distributed database based on the spatio-temporal grid index to obtain a plurality of partitions; and storing the spatio-temporal grid index in a partition corresponding to the distributed database.
Compared with the prior art, the invention combines the time attribute and the space attribute of the remote sensing data and makes full use of their temporal and spatial correlation when encoding the data, which improves retrieval efficiency. Because the data table is pre-partitioned according to the spatio-temporal grid index, a pre-partitioning scheme suited to the characteristics of the remote sensing data to be stored can be chosen when the distributed database is built, and the spatio-temporal grid index is stored by partition, which facilitates subsequent retrieval and further improves retrieval efficiency.
Optionally, the encoding the first data based on the temporal attribute and the spatial attribute of the first data, and the obtaining the spatio-temporal grid index includes:
carrying out global subdivision grid coding on the first data to obtain spatial grid coding; recursively subdividing the temporal attributes to obtain temporal codes, the recursively subdividing comprising: setting an initial time point and a maximum time span, dividing the maximum time span by preset levels, and for the time span of each level after division, differentially encoding the time attribute of the time point falling in the first half of the time span and the time attribute falling in the second half of the time span; combining the spatial grid coding with the temporal coding to obtain a spatio-temporal coding, the spatio-temporal grid index being determined based on the spatio-temporal coding.
In this way, the time attribute is recursively subdivided and converted into a one-dimensional code. Combined with the spatial grid code, the time attribute and the space attribute are together represented as a single one-dimensional code, which is stored in the distributed database as the primary key.
Optionally, the spatio-temporal grid index further comprises:
a satellite sensor code, wherein the satellite sensor code comprises: a six-bit binary code combining the satellite name and type with the sensor type.
In this way, the satellite sensor is included in the spatio-temporal grid index, so results can be filtered by satellite sensor source during retrieval, which preserves retrieval efficiency. The binary code uses the same numeral system as the spatial grid code and the time code, and six bits are sufficient to cover the number of sensor types currently in use.
Optionally, the constructing a distributed database, and pre-partitioning a data table in the distributed database based on the spatio-temporal grid index to obtain a plurality of partitions includes:
extracting the spatio-temporal codes; combining the spatio-temporal codes in Morton-code fashion to obtain Morton codes; converting the Morton codes into a preset numeral system to obtain conversion codes; and pre-partitioning the data table in the distributed database according to the base of the conversion codes.
In this way, combining the spatio-temporal codes as Morton codes keeps time and space bound together, and pre-partitioning by the conversion codes ensures that the number of pre-partitions matches the number of codes, so the most suitable partitioning scheme can be chosen.
Optionally, after the pre-partitioning of the data table in the distributed database according to the base of the conversion codes, the method further includes:
adding a Coprocessor interface to the distributed database and loading a Coprocessor; and using the Coprocessor to divide the spatio-temporal grid index in a balanced manner.
Because remote sensing data is non-uniform in time and space, using the Coprocessor to divide the spatio-temporal grid index in a balanced manner prevents the index data from becoming skewed.
Optionally, using the Coprocessor to divide the spatio-temporal grid index in a balanced manner comprises:
acquiring the number of spatio-temporal grid indexes in each partition; judging whether a partition is a small partition, a small partition being a partition in which that number is less than or equal to a first preset threshold; and if so, merging the small partitions in a first preset mode, wherein the first preset mode comprises obtaining the conversion code of each small partition and merging at least two small partitions whose conversion codes have the smallest absolute difference into one partition.
In this way, when partitions hold little data, two small partitions whose conversion codes differ only slightly are merged into one, so that the spatio-temporal proximity of the data within the partition is affected as little as possible by the merge.
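The selection rule for merging can be illustrated with a minimal Python sketch. It assumes each partition is identified by one integer conversion code and that per-partition index counts are already known; the function name closest_small_pair and the dictionary layout are hypothetical and are not taken from the patent or from the HBase API.

```python
def closest_small_pair(counts, first_threshold):
    """Pick the two small partitions whose conversion codes are closest.

    counts: dict mapping an integer conversion code to the number of
    spatio-temporal grid indexes stored in that partition (assumed layout).
    Returns a (code_a, code_b) tuple, or None if fewer than two partitions
    are small.
    """
    # A partition is "small" when its count is <= the first preset threshold.
    small = sorted(code for code, n in counts.items() if n <= first_threshold)
    if len(small) < 2:
        return None
    # In a sorted list the closest pair is always a pair of neighbours.
    return min(zip(small, small[1:]), key=lambda pair: pair[1] - pair[0])


if __name__ == "__main__":
    sample = {0: 5, 1: 100, 2: 3, 3: 4, 4: 120, 7: 2}
    print(closest_small_pair(sample, first_threshold=10))
```

Only the candidate pair is selected here; performing the actual merge inside the database is outside the scope of the sketch.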
Optionally, using the Coprocessor to divide the spatio-temporal grid index in a balanced manner comprises:
judging whether a partition is a large partition, a large partition being a partition in which the number of spatio-temporal grid indexes is greater than or equal to a second preset threshold; and if so, dividing the large partition in a second preset mode, wherein the second preset mode comprises dividing the large partition into at least two partitions such that the number of spatio-temporal grid indexes in each resulting partition is smaller than the second preset threshold and larger than the first preset threshold.
In this way, when a partition holds a large amount of data, it is divided into several partitions, so that no partition holds too much or too little data.
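One way to satisfy the two threshold conditions when dividing a large partition is sketched below in Python. The even split and the function name split_sizes are assumptions made for illustration; the patent only requires that every resulting partition lie strictly between the two thresholds.

```python
def split_sizes(n, first_threshold, second_threshold):
    """Return the sizes of the partitions obtained by splitting n entries.

    A partition with n >= second_threshold is split into the smallest
    number of near-equal parts (at least two) whose sizes are all below
    second_threshold. The result must also stay above first_threshold.
    """
    if not 0 <= first_threshold < second_threshold - 1:
        raise ValueError("thresholds must leave room between them")
    if n < second_threshold:
        return [n]  # not a large partition, nothing to split
    k = 2
    # -(-n // k) is the integer ceiling of n / k, i.e. the largest part.
    while -(-n // k) >= second_threshold:
        k += 1
    base, extra = divmod(n, k)
    sizes = [base + 1] * extra + [base] * (k - extra)
    if min(sizes) <= first_threshold:
        raise ValueError("no split satisfies both thresholds")
    return sizes


if __name__ == "__main__":
    print(split_sizes(2500, first_threshold=100, second_threshold=1000))
```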
On the other hand, the invention also provides a space-time index construction device, which comprises:
the acquisition module is used for acquiring remote sensing data, extracting metadata of the remote sensing data, and uniformly coding the metadata to obtain first data; the encoding module is used for encoding all the first data based on the time attribute and the space attribute to obtain a space-time grid index; the building module is used for building a distributed database, and pre-partitioning a data table in the distributed database based on the spatio-temporal grid index to obtain a plurality of partitions; a storage module for storing the spatiotemporal grid index in a corresponding partition of the distributed database.
Compared with the prior art and the spatio-temporal index construction method, the spatio-temporal index construction device has the same advantages, and is not repeated herein.
The invention also provides a computer device, which includes a computer readable storage medium storing a computer program and a processor, wherein the computer program is read by the processor and when running, implements the spatio-temporal index construction method as described above.
Compared with the prior art, the computer equipment has the same advantages as the spatio-temporal index construction method, and is not described herein again.
The present invention also provides a computer storage medium, in which a computer program is stored, and when the computer program is read and executed by a processor, the spatiotemporal index construction method as described above is implemented.
Compared with the prior art, the computer storage medium has the same advantages as the spatio-temporal index construction method, and is not described herein again.
Drawings
FIG. 1 is a flow chart of a spatiotemporal index construction method according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating a spatiotemporal index construction method according to an embodiment of the present invention after step S200 is refined;
FIG. 3 is a flowchart illustrating a spatiotemporal index construction method according to an embodiment of the present invention after step S300 is refined;
FIG. 4 is another flowchart illustrating the spatio-temporal index construction method according to an embodiment of the present invention after step S300 is refined;
FIG. 5 is a flowchart illustrating a spatiotemporal index construction method according to an embodiment of the present invention after step S300 is refined;
FIG. 6 is a block flow diagram of a spatiotemporal index construction method according to an embodiment of the present invention;
FIG. 7 is a graph comparing the efficiency of spatiotemporal queries using different models, according to an embodiment of the present invention.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in detail below.
The time and space distribution of multi-source remote sensing data is non-uniform mainly because, with the rapid development of the remote sensing field, more and more remote sensing satellites take part in earth observation. Data from different satellite sensors differ in orbit parameters, time resolution and spatial resolution, and therefore also in storage format, projection standard, resolution, update period and other respects. The prior art cannot build an index that effectively combines remote sensing data whose time and space distribution is non-uniform. Building effective indexes over the time dimension and the space dimension allows massive temporal and spatial data to be stored and analyzed effectively and improves retrieval and query efficiency. Common ways of organizing remote sensing data do not make full use of its spatio-temporal information for spatio-temporal index modeling, so retrieval of and access to large-scale multi-source remote sensing data is inefficient.
In addition, because satellite orbit parameters are designed differently and the satellites serve different user groups, remote sensing data is unevenly distributed in both the time dimension and the space dimension. For example, land satellites that survey earth resources and the environment show obvious spatial heterogeneity: their data is concentrated over land areas, with far less over the oceans. When an index is built over such unevenly distributed data, a large amount of index data tends to be stored on the same node. When many clients then access one or a few nodes in the cluster, those nodes receive too many read and write requests and become overloaded, which degrades system performance and slows data access.
In order to solve the above problem, the present application provides a method for constructing a non-uniform spatio-temporal index in a distributed database, comprising:
step S100, as shown in fig. 1 and fig. 6, obtaining remote sensing data, extracting metadata of the remote sensing data, and performing uniform coding on the metadata to obtain first data.
In one embodiment, the remote sensing data comes from earth observation satellites, including land satellites, meteorological satellites and ocean satellites. Land satellites are mainly used to survey and monitor land resources and are the main type of earth observation satellite; they acquire information over a wide range, observe many objects in fine detail, and place high demands on spatial and spectral resolution. A meteorological satellite is an artificial earth satellite that observes the earth and its atmosphere from outer space; it is mainly used for weather and climate forecasting and is characterized by a wide observation range, frequent observations, high time resolution and high data quality. An ocean satellite observes the earth's surface and oceans; it is mainly used to dynamically monitor sea temperature fields, ocean currents, waves, salinity and the like, and places high demands on spectral resolution.
In another embodiment, remote sensing data acquired by the Landsat series of satellites is used as an example. More than 800 pieces of Landsat remote sensing image metadata with global coverage are used. Because the data formats within the Landsat series are not fully consistent, the metadata formats of data from different Landsat satellites must be unified: common fields and extension fields are selected to build a unified metadata format. The unified metadata standard structure model is shown in the following table.
[Table image not reproduced: unified metadata standard structure model (Figure BDA0003249694050000061)]
The first data is obtained by effectively describing and interpreting the metadata of the remote sensing data. The metadata can be organized into semi-structured or structured data, that is, compiled into the first data, so that the first data can be retrieved, located and acquired.
Step S200, based on the time attribute and the space attribute of the first data, the first data is coded to obtain a space-time grid index.
In an embodiment, an HBase database is used for storing data, and the spatio-temporal grid index is RowKey in the HBase.
Optionally, encoding the first data further comprises encoding the satellite sensor, wherein the code combines the satellite name and type with the sensor type and is expressed as a six-bit binary code.
HBase is a distributed, column-oriented storage system built on top of HDFS. It is suitable when real-time reads and writes and random access to very large data sets are required. It scales linearly by adding nodes; column families are stored, access-controlled and retrieved independently; HBase tables can be very sparse; and the stored data are untyped strings.
Unlike data in other fields, remote sensing data has attributes such as time, space and satellite sensor. The time attribute and the space attribute both have a degree of continuity: a query often goes on to retrieve data from before and after a given time point, or retrieves data for the area surrounding the queried data at the same time. The correlation between time and space therefore has to be considered when encoding remote sensing data, and the time and space data should be stored together accordingly, which effectively improves query efficiency. Remote sensing data also has a satellite sensor attribute field, and querying by this field is a common retrieval mode on remote sensing satellite platforms, so the satellite sensor should be included in the HBase RowKey when the index is built.
In one embodiment, the combination of the Landsat1 satellite and the MSS sensor is coded as 000000. A six-bit code can represent 64 satellite sensor platforms, so the coding is extensible and can accommodate the number of satellite sensors currently in use.
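The six-bit satellite sensor code can be pictured as a small registry, sketched here in Python. Only the mapping of Landsat1 with MSS to 000000 comes from the text; assigning later codes in registration order, the class name SensorCodeRegistry and the second example pair are assumptions for illustration.

```python
class SensorCodeRegistry:
    """Assign fixed-width binary codes to (satellite, sensor) pairs."""

    def __init__(self, bits=6):
        self.bits = bits
        self._codes = {}  # (satellite, sensor) -> binary string

    def register(self, satellite, sensor):
        key = (satellite, sensor)
        if key not in self._codes:
            if len(self._codes) >= 2 ** self.bits:
                # Six bits give 2**6 = 64 distinct codes.
                raise ValueError("no free code left for this bit width")
            # Assumption: codes are handed out in registration order.
            self._codes[key] = format(len(self._codes), f"0{self.bits}b")
        return self._codes[key]


if __name__ == "__main__":
    registry = SensorCodeRegistry()
    print(registry.register("Landsat1", "MSS"))  # first pair registered
    print(registry.register("Landsat5", "TM"))   # hypothetical second pair
```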
Besides the RowKey, the design of the HBase table also covers the column family. In one embodiment, the HBase table has a single column family, fixed as "MD". The columns under this column family are ID, Platform, Sensor, ul_lon, ul_lat, ur_lon, ur_lat, ll_lon, ll_lat, lr_lon, lr_lat, center_lon, center_lat, startTime, end, Level, Path, Format, Cloud, and the like. The four corner coordinates of the remote sensing data are upper left (ul_lon, ul_lat), upper right (ur_lon, ur_lat), lower left (ll_lon, ll_lat) and lower right (lr_lon, lr_lat), and the time range is (startTime, end). Because the detailed information of the remote sensing data needs to be retrievable, it is stored in the column family of the HBase table.
Step S300, a distributed database is built, and the data tables in the distributed database are pre-partitioned based on the spatio-temporal grid index to obtain a plurality of partitions.
The spatio-temporal grid index is associated with the remote sensing data to form a grid index based on time and space, through which the remote sensing data is retrieved.
In HBase, the RowKey is the primary key used to retrieve records. A row in HBase can be accessed in three ways:
1) Access by a single RowKey.
2) Range scan by RowKey.
3) Full table scan.
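The three access paths can be mimicked with a plain in-memory table kept in RowKey order, as in the Python sketch below. This is a stand-in for illustration only and does not use the HBase client API; the class SortedTable and the sample keys are hypothetical.

```python
from bisect import bisect_left


class SortedTable:
    """Toy key-ordered table illustrating get, range scan and full scan."""

    def __init__(self, rows):
        self.rows = dict(rows)
        self.keys = sorted(self.rows)  # rows are kept in RowKey order

    def get(self, rowkey):
        # 1) Access by a single RowKey.
        return self.rows.get(rowkey)

    def scan(self, start, stop):
        # 2) Range scan over [start, stop) in lexicographic key order.
        lo = bisect_left(self.keys, start)
        hi = bisect_left(self.keys, stop)
        return [(k, self.rows[k]) for k in self.keys[lo:hi]]

    def full_scan(self):
        # 3) Full table scan: every row is visited.
        return [(k, self.rows[k]) for k in self.keys]


if __name__ == "__main__":
    table = SortedTable({"010-a": 1, "011-b": 2, "100-c": 3})
    print(table.get("011-b"))
    print(table.scan("01", "02"))
    print(len(table.full_scan()))
```

Because a range scan only touches keys between its bounds, placing the spatio-temporal code at the front of the RowKey lets nearby data be read with a single scan.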
In one embodiment, once the RowKeys have been constructed, the distributed database is built for the remote sensing data and its RowKeys.
In one embodiment, the time attribute and the space attribute of the remote sensing data are combined into a one-dimensional RowKey. Pre-partitioning the data table appropriately by RowKey reduces the resource consumption caused by HBase partition splitting and improves the response speed of HBase.
A newly created HBase table has only one partition. When a partition grows beyond the preset partition threshold, it is split into two. Splitting consumes a large amount of resources, and frequent splits severely affect HBase performance, so the table should be pre-partitioned when it is created to improve HBase performance.
In one embodiment, a dimension-reduction operation is applied to the space attribute and the time attribute: the three-dimensional information of time, longitude and latitude is reduced to a one-dimensional spatio-temporal grid code, which preserves the spatial and temporal correlation of the remote sensing data. The time codes and space codes generated at different subdivision levels serve as the basis for dividing the HBase pre-partitions. Once the time code and the space code are determined, the space code is converted to binary and combined with the time code into a string of binary digits. The number of possible code combinations determines the number of HBase pre-partitions, and the range of each pre-partition is then determined from the number of pre-partitions and the RowKey values.
And step S400, storing the spatio-temporal grid index in a partition corresponding to the distributed database.
And storing the space-time grid index and the corresponding remote sensing data into a distributed database to complete the construction of the uneven space-time index of the distributed database.
Optionally, as shown in fig. 3, the constructing a distributed database, and pre-partitioning a data table in the distributed database based on the spatio-temporal grid index to obtain a plurality of partitions includes:
step S301, extracting each space-time code.
In an embodiment, the spatial grid code and the time code are one-dimensional codes of the space attribute and the time attribute of the remote sensing data. The spatial grid code and the time code are extracted, and the spatio-temporal code is converted to binary to obtain a binary spatial grid code and a binary time code.
Step S302, each space-time code is combined according to the Morton code, and the Morton code corresponding to the space-time code is obtained.
The binary spatial grid code and time code from step S301 are combined in Morton-code fashion, and all combinations that can occur in the Morton code are enumerated to obtain the maximum number of combinations.
Step S303, converting each Morton code into a preset system to obtain a conversion code.
According to the number of possible code combinations enumerated in step S302, the Morton code is converted into the preset numeral system, whose base equals the maximum number of combinations.
Step S304, pre-partitioning the data table in the distributed database according to the base of the conversion code.
In one embodiment, Landsat remote sensing data is the data to be stored and a partition level of 2 is adopted, which yields 64 partitions. The data indexes are mapped into these 64 partitions, so that the number of pre-partitions matches the remote sensing data and HBase performance is preserved.
The level-1 pre-partitioning in step S304 is described in detail by way of example. In an embodiment, the global subdivision grid code is used to encode the space attribute, giving the spatial grid code, and the time attribute is subdivided to give the time code. When the level of both the time code and the space code is 1, the spatial grid code is 0, 1, 2 or 3, and the time code is 0 or 1. Converting the spatial code to binary gives 00, 01, 10 and 11; in this example, reading from left to right, the odd-numbered bits represent longitude and the even-numbered bits represent latitude. Combining the time code with the longitude and latitude code produces eight cases: 000, 001, 010, 011, 100, 101, 110 and 111. That is, at subdivision level 1 there are eight spatio-temporal region code combinations. Converting these codes to octal gives the conversion codes 0, 1, 2, 3, 4, 5, 6 and 7, which shortens the code. The total number of partitions is 9: with m and M denoting the lower and upper bounds of the RowKey, the RowKey ranges of the partitions are [m, 0), [0, 1), [1, 2), [2, 3), [3, 4), [4, 5), [5, 6), [6, 7) and [7, M]. When data is stored in the database, each RowKey is assigned to a partition according to its value. For example, a RowKey whose code begins with 01 falls in the [0, 1) partition, and the remote sensing image is stored in that partition.
Similar to the case of level 1, when the level of the time code and the space code is 2, the number of partitions is 65 (8^2 + 1).
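The level-1 and level-2 partition counts can be reproduced with a short Python sketch. It assumes that, within each level, the three bits are ordered time, longitude, latitude; the text does not fix this order, and the function names to_octal_key and partition_ranges are hypothetical. Open-ended boundaries are written as None in place of the RowKey bounds.

```python
from itertools import product


def to_octal_key(spatial_digits, time_bits):
    """Combine per-level quadtree digits (0..3) and time bits (0/1).

    Each level yields three bits, assumed here to be ordered
    (time, longitude, latitude), which become one octal digit.
    """
    if len(spatial_digits) != len(time_bits):
        raise ValueError("spatial and time codes must have the same level")
    out = []
    for s, t in zip(spatial_digits, time_bits):
        lon_bit, lat_bit = (s >> 1) & 1, s & 1  # odd bit: lon, even bit: lat
        out.append(str((t << 2) | (lon_bit << 1) | lat_bit))
    return "".join(out)


def partition_ranges(level):
    """Return (start, stop) RowKey ranges for a given subdivision level."""
    split_keys = ["".join(p) for p in product("01234567", repeat=level)]
    bounds = [None] + split_keys + [None]  # None marks an open end
    return list(zip(bounds, bounds[1:]))


if __name__ == "__main__":
    print(len(partition_ranges(1)), len(partition_ranges(2)))
    print(to_octal_key([1, 2], [0, 1]))
```

With 8^n split keys at level n, the table has 8^n + 1 ranges, which matches the 9 and 65 partitions described above.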
Alternatively, as shown in fig. 2, step S200 includes:
step S201, global mesh generation encoding is carried out on the first data, and space mesh encoding is obtained.
The core idea of the global subdivision grid code, GeoSOT, is based on the principle of earth subdivision: the earth's surface is divided according to a regular scheme to construct a dedicated grid suited to organizing spatial information or data. The grid system is divided using an equal longitude-latitude quadtree into 10 levels, from level 0 to level 9. The level-0 grid takes the intersection of the prime meridian and the equator as its center point, maps the earth onto a plane by a simple projection, and expands the 180° × 360° extent of the earth to 512° × 512°. The level-1 grid divides the level-0 plane evenly into four cells of 256° × 256° each. The level-2 grid divides each level-1 cell evenly into four again, and so on, forming the successive subdivision levels. These grid cells form subdivision patches with different longitude and latitude ranges. The grid cells of each level are geocoded along a reverse-Z space-filling curve, so each geocode is unique and corresponds to one rectangular area, and the two-dimensional longitude and latitude information is reduced to a one-dimensional spatial code.
With this coding, codes in the same or neighboring regions share the same prefix, which strengthens the spatial correlation between remote sensing data. At higher levels, that is, as the data is divided into smaller and smaller cells, the spatial area of each subdivision patch can be represented by the longitude and latitude of its center point. When computing the GeoSOT code, the area can therefore be represented by the GeoSOT code of the patch's center point alone. Two-dimensional longitude and latitude information is thus converted into a one-dimensional code, which improves the spatial proximity of the remote sensing data and the efficiency of data retrieval.
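The quadtree subdivision and its shared-prefix property can be sketched in Python as follows. This is a simplified illustration, not the official GeoSOT code: it assumes a 512° × 512° plane centered on (0°, 0°), that the half at or above the midpoint gets bit 1, and that each digit is 2 × longitude bit + latitude bit. The function name quadtree_code is hypothetical.

```python
def quadtree_code(lon, lat, level):
    """Return the list of quadtree digits (0..3) for a point, one per level."""
    if not (-180 <= lon <= 180 and -90 <= lat <= 90):
        raise ValueError("longitude/latitude out of range")
    # Level 0: the earth's extent expanded to a 512 x 512 degree plane.
    lon_lo, lon_hi = -256.0, 256.0
    lat_lo, lat_hi = -256.0, 256.0
    digits = []
    for _ in range(level):
        lon_mid = (lon_lo + lon_hi) / 2
        lat_mid = (lat_lo + lat_hi) / 2
        lon_bit = int(lon >= lon_mid)  # assumption: upper half -> 1
        lat_bit = int(lat >= lat_mid)
        digits.append(2 * lon_bit + lat_bit)
        # Narrow the cell to the quadrant that contains the point.
        if lon_bit:
            lon_lo = lon_mid
        else:
            lon_hi = lon_mid
        if lat_bit:
            lat_lo = lat_mid
        else:
            lat_hi = lat_mid
    return digits


if __name__ == "__main__":
    # Center point of a hypothetical scene; deeper levels extend the prefix.
    print(quadtree_code(114.3, 30.5, 2))
    print(quadtree_code(114.3, 30.5, 3))
```

Since each level only appends one digit, the code of a cell is always a prefix of the codes of the cells it contains.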
Step S202, performing recursive subdivision on the time attribute to obtain time codes, wherein the recursive subdivision comprises: setting an initial time point and a maximum time span, dividing the maximum time span by preset levels, and for the time span of each level after division, differentially encoding the time attribute of the time point falling in the first half of the time span and the time attribute falling in the second half of the time span.
In one embodiment, encoding the time includes setting 1970 as the initial time point and setting a maximum time span of 128 years, so that the time ranges from 1970 to 2098. The first half of the span (years 0 to 64) is coded as 0 and the second half (years 64 to 128) is coded as 1, and the subdivision is applied recursively to form the successive subdivision levels. At level 7, each code represents a time span of 1 year, and 1 year is extended to 16 months; at level 11, each code represents a time span of 1 month, and 1 month is extended to 32 days. On this basis, the time code over 1970-2098 can be divided into 21 levels, at which each code represents a time span of 1 hour. The subdivision levels of the time code are listed in the following table; the lower-right cell of the table is blank because only 21 levels are listed.
Figure BDA0003249694050000111
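The recursive halving of the time span can be sketched as follows. The sketch assumes that a day is extended to 32 hours in the same way that a year is extended to 16 months and a month to 32 days, which yields 7 + 4 + 5 + 5 = 21 bits; it also assumes zero-based month and day offsets. Both choices, and the name time_code, are illustrative assumptions rather than details stated in the table.

```python
from datetime import datetime

# Bits per field: year offset (128 years), month (16), day (32), hour (32).
FIELD_BITS = (7, 4, 5, 5)


def time_code(t: datetime, level: int = 21, start_year: int = 1970) -> str:
    """Return the first `level` bits of the recursive-halving time code.

    Writing each field as a fixed-width binary number, most significant bit
    first, is equivalent to halving its range repeatedly: 0 selects the
    first half of the current span and 1 selects the second half.
    """
    year_offset = t.year - start_year
    if not 0 <= year_offset < 128:
        raise ValueError("time is outside the 128-year span")
    if not 1 <= level <= sum(FIELD_BITS):
        raise ValueError("level must be between 1 and 21")
    fields = (year_offset, t.month - 1, t.day - 1, t.hour)
    bits = "".join(format(v, f"0{w}b") for v, w in zip(fields, FIELD_BITS))
    return bits[:level]


print(time_code(datetime(2021, 9, 7, 10), level=7))   # year-level code
print(time_code(datetime(2021, 9, 7, 10)))            # hour-level code
```

As with the spatial code, a coarser level is a prefix of every finer level, so acquisitions from the same year or month share a common code prefix.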
Step S203, combining the space grid coding and the time coding to obtain space-time coding, and determining the space-time grid index based on the space-time coding.
In one embodiment, to identify clearly the code obtained by combining the spatial grid code with the time code, the combined code is named the GeoSOT-ST code. The GeoSOT-ST encoding comprises: a six-bit binary code.
In another embodiment, the spatio-temporal grid index comprises a combination of a GeoSOT-ST code, a satellite sensor code, and a remote sensing data ID. For example, the spatio-temporal grid index of a piece of remote sensing data is 100011-. The time attribute is additionally processed: the acquisition time of the remote sensing data (year, month, day, hour, minute and second) is stored in the table as a long-type timestamp, so as to increase the temporal correlation of the remote sensing data.
In the spatio-temporal grid index, the GeoSOT-ST code is the main field for indexing the remote sensing data; the satellite sensor code is used to screen satellites and sensors; the timestamp is used to screen the exact time range of the remote sensing data; and the ID guarantees the uniqueness of the spatio-temporal grid index.
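How such a composite index supports retrieval can be sketched with a small in-memory model. The field order follows the listing above, while the "-" separator, the sample codes and IDs, and the names make_index and prefix_scan are assumptions made for illustration only. A sorted Python list stands in for a key-ordered table.

```python
from bisect import bisect_left


def make_index(st_code: str, sensor_code: str, data_id: str, sep: str = "-") -> str:
    """Join the three parts of the index into one key (separator is assumed)."""
    if len(sensor_code) != 6 or set(sensor_code) - {"0", "1"}:
        raise ValueError("sensor code must be a six-bit binary string")
    return sep.join((st_code, sensor_code, data_id))


def prefix_scan(sorted_keys: list[str], prefix: str) -> list[str]:
    """Return every key that starts with `prefix` from a sorted key list."""
    start = bisect_left(sorted_keys, prefix)   # first key >= prefix
    matches = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break                              # matching keys are contiguous
        matches.append(key)
    return matches


keys = sorted([
    make_index("0312", "100011", "a1"),        # hypothetical sample records
    make_index("0312", "100011", "a2"),
    make_index("0313", "000001", "b1"),
    make_index("1200", "100011", "c1"),
])
print(prefix_scan(keys, "031"))                # coarse space-time cell
print(prefix_scan(keys, "0312-100011"))        # same cell, one sensor
```

Placing the GeoSOT-ST code first means a query on a space-time cell becomes a single contiguous range scan, and the sensor code narrows the result within that range.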
Optionally, after the pre-partitioning of the data table in the distributed database according to the radix of the conversion code, the method further includes:
adding a Coprocessor interface to the distributed database, and loading a coprocessor;
using the coprocessor to perform balanced division on the spatio-temporal grid index.
Because remote sensing data are not uniformly distributed in time and space, the RowKeys of the remote sensing data are unevenly distributed, so the amount of data differs from partition to partition. After the data are stored in the database, the partitions are therefore further merged and split using the coprocessor, so that the amount of data in each partition is roughly balanced, data skew is avoided, and system performance is ensured.
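The pre-partitioning that this balancing step builds on can be sketched as follows. The sketch assumes that the Morton code is formed by interleaving the bits of the spatial code and the time code, that the preset radix is 16, and that one partition is created per leading hexadecimal digit; all three choices and all function names are illustrative assumptions.

```python
def morton_interleave(space_bits: str, time_bits: str) -> str:
    """Interleave two bit strings, padding the shorter one with zeros."""
    width = max(len(space_bits), len(time_bits))
    a = space_bits.ljust(width, "0")
    b = time_bits.ljust(width, "0")
    return "".join(x + y for x, y in zip(a, b))


def to_hex(bits: str) -> str:
    """Convert a bit string to hexadecimal, padding on the right to 4-bit groups."""
    padded = bits.ljust(-(-len(bits) // 4) * 4, "0")
    return "".join(format(int(padded[i:i + 4], 2), "x")
                   for i in range(0, len(padded), 4))


def partition_of(conversion_code: str) -> int:
    """Map a conversion code to one of 16 partitions by its leading digit."""
    return int(conversion_code[0], 16)


def split_points() -> list[str]:
    """Boundaries between the 16 partitions: '1', '2', ..., 'f'."""
    return [format(d, "x") for d in range(1, 16)]


code = to_hex(morton_interleave("1010", "0110"))
print(code, partition_of(code), split_points())
```

With radix 16 the table starts with 16 key ranges. The boundaries from split_points are what a key-ordered store would need in order to create those ranges before any data are loaded.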
Optionally, as shown in FIG. 4, the performing balanced division on the spatio-temporal grid index by using the coprocessor includes:
Step S310, acquiring the number of spatio-temporal grid indexes in each partition.
Step S311, determining whether the partition is a small partition, where a small partition is a partition whose number of spatio-temporal grid indexes is less than or equal to a first preset threshold.
Step S312, if yes, merging the small partitions in a first preset manner, where the first preset manner includes obtaining the conversion codes in each small partition, and merging into one partition at least two small partitions having the smallest absolute value of the difference between their conversion codes.
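Steps S310 to S312 can be sketched as a single merge step over an in-memory model of the partitions. Each partition is represented here by one integer conversion code and a count of indexes; the Partition class, the choice of the smaller code as the code of the merged partition, and the function name are assumptions made for illustration.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass
class Partition:
    code: int    # hypothetical representative conversion code of the partition
    count: int   # number of spatio-temporal grid indexes it holds


def merge_closest_small(partitions: list[Partition], first_threshold: int) -> list[Partition]:
    """Merge the two small partitions whose conversion codes are closest.

    A partition is small when count <= first_threshold. The input list is
    returned unchanged (as a copy) when fewer than two small partitions exist.
    """
    parts = [Partition(p.code, p.count) for p in partitions]
    small = [p for p in parts if p.count <= first_threshold]
    if len(small) < 2:
        return parts
    a, b = min(combinations(small, 2),
               key=lambda pair: abs(pair[0].code - pair[1].code))
    parts.remove(a)
    parts.remove(b)
    parts.append(Partition(min(a.code, b.code), a.count + b.count))
    return sorted(parts, key=lambda p: p.code)


before = [Partition(0, 5), Partition(16, 3), Partition(32, 100),
          Partition(48, 4), Partition(64, 2)]
print(merge_closest_small(before, first_threshold=10))
```

The function performs one merge per call, so a caller can repeat it until no two small partitions remain. In a range-partitioned store only neighbouring key ranges can normally be merged, so a practical version would likely restrict the candidate pairs to adjacent partitions.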
Optionally, referring to FIG. 5, the performing balanced division on the spatio-temporal grid index by using the coprocessor includes:
Step S320, acquiring the number of spatio-temporal grid indexes in each of the partitions.
Step S321, determining whether the partition is a large partition, where a partition whose number of spatio-temporal grid indexes is greater than or equal to a second preset threshold is taken as a large partition.
Step S322, if yes, dividing the large partition in a second preset manner, where the second preset manner includes splitting the large partition into at least two partitions, and the number of spatio-temporal grid indexes in each split partition is smaller than the second preset threshold and larger than the first preset threshold.
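Steps S320 to S322 can be sketched in the same in-memory style. The sketch splits the sorted keys of a large partition into the smallest number of nearly equal pieces whose sizes all lie strictly between the two thresholds; choosing equal-sized pieces is an assumption, since the text only constrains the resulting sizes, and the function name is hypothetical.

```python
def split_large(keys: list[str], first_threshold: int, second_threshold: int) -> list[list[str]]:
    """Split a large partition into pieces with first < size < second."""
    if second_threshold - first_threshold < 2:
        raise ValueError("no partition size fits strictly between the thresholds")
    n = len(keys)
    if n < second_threshold:
        return [sorted(keys)]                  # not a large partition
    k = 2
    while -(-n // k) >= second_threshold:      # ceil(n / k) is the largest piece
        k += 1
    if n // k <= first_threshold:              # floor(n / k) is the smallest piece
        raise ValueError("cannot satisfy both thresholds for this partition")
    ordered = sorted(keys)                     # keep neighbouring codes together
    base, extra = divmod(n, k)
    pieces, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        pieces.append(ordered[start:start + size])
        start += size
    return pieces


keys = [f"{i:04d}" for i in range(25)]         # hypothetical index keys
print([len(p) for p in split_large(keys, first_threshold=3, second_threshold=10)])
```

Splitting along the sorted key order keeps indexes with similar codes in the same piece, which preserves the spatio-temporal proximity that the index is designed to provide.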
Because remote sensing data are unevenly distributed in time and space, if only pre-partitioning is performed, some partitions store only a small amount of remote sensing data while a large amount is concentrated in others. Therefore, on the basis of pre-partitioning, the unevenly distributed partitions are further merged and split on the server side using the coprocessor of HBase. If the amount of data in a partition is less than or equal to the first preset threshold, the partition holds too little data and needs to be merged with other partitions; if the amount of data in a partition is greater than or equal to the second preset threshold, the partition holds too much data and needs to be split into multiple partitions.
Merging or splitting partitions can affect the spatio-temporal proximity of the remote sensing data, so the spatio-temporal correlation of the data must be taken into account. When partitions are merged, they are merged according to spatio-temporal correlation; for example, partitions whose data have similar conversion codes are merged into one partition. When a partition holds too much data, it needs to be split: part of the data can be transferred to the neighbouring partitions by expanding the conversion code range of the preceding and following partitions and/or reducing the conversion code range of the partition itself. When the preceding and following partitions also hold a large amount of data, the partition is divided into more partitions, to keep the data balanced and prevent data skew. After multiple merge and split operations, the indexes of the remote sensing data are distributed evenly across the partitions.
As shown in FIG. 7, the retrieval efficiency of several models was tested experimentally. Three comparison models were used in the experiment: (1) the Lon-Lat-Time model, which stores longitude, latitude and time; (2) the GeoSOT-Time model, which filters spatially by GeoSOT code and then filters by time; and (3) the GeoSOT-ST non-partitioned model, which adopts the default HBase partitioning strategy. All metadata were loaded into the database, points were selected at random within the range around the center point of the map, query areas covering 1/2, 1/4, 1/8, 1/16 and 1/32 of the whole map were constructed, and the precision ratio and query efficiency were tested for the different query ranges.
Figure BDA0003249694050000131
Through multiple experiments, the amount of accurate data among the queried data and the amount of data returned by each comparison model were calculated, giving a precision ratio of 100% for the Lon-Lat-Time model, 89% for the GeoSOT-Time model, and 96% for GeoSOT-ST. As shown in FIG. 7, the GeoSOT-ST-2 of the invention shows a considerable improvement in precision ratio and query efficiency over the other schemes.
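The precision ratio used in this comparison is the share of returned records that actually fall inside the query range. A minimal sketch, with hypothetical record IDs that are not experimental data, is:

```python
def precision_ratio(returned_ids, accurate_ids) -> float:
    """Fraction of returned records that are truly inside the query range."""
    returned = set(returned_ids)
    if not returned:
        raise ValueError("the query returned no records")
    return len(returned & set(accurate_ids)) / len(returned)


# Toy example: four records returned, three of them truly in range.
print(precision_ratio([1, 2, 3, 4], [1, 2, 3, 9]))
```

A model that filters on exact longitude, latitude and time returns only in-range records, whereas a grid-based filter may also return records from cells that only partly overlap the query area, which lowers this ratio.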
A spatiotemporal index constructing apparatus according to another embodiment of the present invention includes:
the acquisition module is used for acquiring remote sensing data, extracting metadata of the remote sensing data, and uniformly coding the metadata to obtain first data; an encoding module, configured to encode the first data based on a temporal attribute and a spatial attribute, to obtain a spatio-temporal grid index; the building module is used for building a distributed database, and pre-partitioning a data table in the distributed database based on the spatio-temporal grid index to obtain a plurality of partitions; a storage module for storing the spatiotemporal grid index in a corresponding partition of the distributed database.
The spatio-temporal index construction apparatus has the same advantages over the prior art as the spatio-temporal index construction method, which are not repeated herein.
A computer apparatus according to another embodiment of the present invention includes a computer readable storage medium storing a computer program and a processor, wherein the computer program is read by the processor and executed to implement the spatiotemporal index construction method as described above.
The computer device has the same advantages over the prior art as the spatio-temporal index construction method, which are not repeated herein.
A computer storage medium according to another embodiment of the present invention stores a computer program, which is read and executed by a processor to implement the spatio-temporal index construction method as described above.
The computer storage medium has the same advantages over the prior art as the spatio-temporal index construction method, which are not repeated herein.
Although the present disclosure has been described above, the scope of the present disclosure is not limited thereto. Various changes and modifications may be effected therein by one of ordinary skill in the pertinent art without departing from the spirit and scope of the present disclosure, and these changes and modifications are intended to be within the scope of the present disclosure.

Claims (10)

1. A spatio-temporal index construction method is characterized by comprising the following steps:
acquiring remote sensing data, extracting metadata of the remote sensing data, and uniformly coding the metadata to obtain first data;
coding the first data based on the time attribute and the space attribute of the first data to obtain a spatio-temporal grid index;
constructing a distributed database, and pre-partitioning a data table in the distributed database based on the spatio-temporal grid index to obtain a plurality of partitions;
and storing the spatio-temporal grid index in a partition corresponding to the distributed database.
2. The method of claim 1, wherein the encoding the first data based on the temporal and spatial attributes of the first data to obtain the spatio-temporal grid index comprises:
carrying out global subdivision grid coding on the first data to obtain spatial grid coding;
recursively subdividing the temporal attributes to obtain temporal codes, the recursively subdividing comprising: setting an initial time point and a maximum time span, dividing the maximum time span by preset levels, and for the time span of each level after division, differentially encoding the time attribute of the time point falling in the first half of the time span and the time attribute falling in the second half of the time span;
combining the spatial grid coding with the temporal coding to obtain a spatio-temporal coding, the spatio-temporal grid index being determined based on the spatio-temporal coding.
3. The spatio-temporal index construction method according to claim 2, wherein the spatio-temporal grid index further comprises:
a satellite sensor code, wherein the satellite sensor code comprises: a six-bit binary code combining the satellite name and type and the sensor type.
4. The spatio-temporal index construction method according to any one of claims 1-3, wherein the constructing a distributed database, and pre-partitioning a data table in the distributed database based on the spatio-temporal grid index to obtain a plurality of partitions comprises:
extracting each of the space-time codes;
combining each space-time code according to a Morton code to obtain a Morton code corresponding to the space-time code;
converting each Morton code into a preset radix to obtain a conversion code;
and pre-partitioning the data table in the distributed database according to the radix of the conversion code.
5. The spatio-temporal index construction method according to claim 4, further comprising, after the pre-partitioning of the data table in the distributed database according to the radix of the conversion code:
adding a Coprocessor interface to the distributed database, and loading a Coprocessor;
and using the coprocessor to perform balanced division on the space-time grid index.
6. The spatio-temporal index construction method of claim 5, wherein the using the coprocessor to perform balanced division on the spatio-temporal grid index comprises:
acquiring the number of spatio-temporal grid indexes in each partition;
judging whether the partition is a small partition, wherein a small partition is a partition whose number of spatio-temporal grid indexes is less than or equal to a first preset threshold value;
if so, merging the small partitions in a first preset mode, wherein the first preset mode comprises obtaining the conversion codes in each small partition, and merging into one partition at least two small partitions having the smallest absolute value of the difference between their conversion codes.
7. The spatio-temporal index construction method of claim 5, wherein the using the coprocessor to perform balanced division on the spatio-temporal grid index comprises:
judging whether the partition is a large partition or not, wherein the partition with the number of the spatio-temporal grid indexes being larger than or equal to a second preset threshold value is taken as the large partition;
if so, dividing the large partition in a second preset mode, wherein the second preset mode comprises that the large partition is divided into at least two partitions, and the number of the spatio-temporal grid indexes in each divided partition is smaller than a second preset threshold and larger than a first preset threshold.
8. A spatio-temporal index construction apparatus, comprising:
the acquisition module is used for acquiring remote sensing data, extracting metadata of the remote sensing data, and uniformly coding the metadata to obtain first data;
the encoding module is used for encoding all the first data based on the time attribute and the space attribute to obtain a space-time grid index;
the building module is used for building a distributed database, and pre-partitioning a data table in the distributed database based on the spatio-temporal grid index to obtain a plurality of partitions;
a storage module for storing the spatiotemporal grid index in a corresponding partition of the distributed database.
9. A computer device comprising a computer readable storage medium storing a computer program and a processor, the computer program being read and executed by the processor to implement the spatio-temporal index construction method according to any one of claims 1 to 7.
10. A computer-readable storage medium, wherein the computer-readable storage medium stores a computer program, which when read and executed by a processor, implements the spatio-temporal index construction method according to any one of claims 1 to 7.
CN202111042088.XA 2021-09-07 2021-09-07 Space-time index construction method and device, computer equipment and storage medium Pending CN113946700A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111042088.XA CN113946700A (en) 2021-09-07 2021-09-07 Space-time index construction method and device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111042088.XA CN113946700A (en) 2021-09-07 2021-09-07 Space-time index construction method and device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN113946700A true CN113946700A (en) 2022-01-18

Family

ID=79328066

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111042088.XA Pending CN113946700A (en) 2021-09-07 2021-09-07 Space-time index construction method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113946700A (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE102022117704A1 (en) 2022-07-15 2024-01-18 Bayerische Motoren Werke Aktiengesellschaft Method for storing and providing georeferenced vehicle data, computer-readable medium, and distributed system
CN115809277A (en) * 2023-02-06 2023-03-17 成都智元汇信息技术股份有限公司 Method for positioning administrative region of internet of things equipment based on longitude and latitude
CN115809360A (en) * 2023-02-08 2023-03-17 深圳大学 Large-scale space-time stream data real-time space connection query method and related equipment
CN115809360B (en) * 2023-02-08 2023-05-05 深圳大学 Real-time space connection query method for large-scale space-time data and related equipment
CN115840752A (en) * 2023-02-24 2023-03-24 西安索格亚航空科技有限公司 Method for storing and inquiring global aviation navigation data
CN116049521A (en) * 2023-03-16 2023-05-02 浪潮软件科技有限公司 Space-time data retrieval method based on space grid coding
CN116881308A (en) * 2023-07-31 2023-10-13 北京和德宇航技术有限公司 Satellite telemetry data display method, device, equipment and storage medium
CN117033526A (en) * 2023-10-09 2023-11-10 中国地质大学(武汉) Data storage method, data query method, device, equipment and storage medium
CN117033526B (en) * 2023-10-09 2023-12-29 中国地质大学(武汉) Data storage method, data query method, device, equipment and storage medium
CN117874301A (en) * 2024-03-11 2024-04-12 浙江省气象台 Method for processing, storing and calling pyramid slice based on grid data
CN117874301B (en) * 2024-03-11 2024-05-31 浙江省气象台 Method for processing, storing and calling pyramid slice based on grid data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination