CN110888880A - Proximity analysis method, device, equipment and medium based on spatial index - Google Patents

Publication number
CN110888880A
Authority
CN
China
Prior art keywords
data
distance
queue
spatial
initial
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911131958.3A
Other languages
Chinese (zh)
Inventor
张业鑫
李爱兵
李纯
杨扬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Optics Valley Information Technologies Co Ltd
Original Assignee
Wuhan Optics Valley Information Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan Optics Valley Information Technologies Co Ltd
Priority to CN201911131958.3A
Publication of CN110888880A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2462 Approximate or statistical queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2246 Trees, e.g. B+trees
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29 Geographical information databases

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Remote Sensing (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a proximity analysis method, apparatus, device and medium based on a spatial index. The method comprises the following steps: establishing an R-Tree algorithm, acquiring spatial data, building an R-Tree index structure with the R-Tree algorithm, fragmenting the computed data in the R-Tree index structure, and building a data fragment index; acquiring the initial position, calculating the distance value between the initial position and each piece of fragment data, and building a queue from the distance values and the corresponding fragment data; setting an initial distance, calculating the figure distance from the initial position to each geometric figure in each piece of fragment data, storing the figure distance and the label of the geometric figure in a queue, and extracting the smallest figure distance from the queue to update the initial distance. The invention replaces buffer search with fragment search and stores intermediate data in a queue with a sorting function, so that proximity analysis can run in a multi-threaded, multi-task manner and analysis efficiency improves.

Description

Proximity analysis method, device, equipment and medium based on spatial index
Technical Field
The present invention relates to the field of geographic information technologies, and in particular, to a method, an apparatus, a device, and a medium for proximity analysis based on spatial index.
Background
KNN (K-Nearest Neighbor), the K-nearest-neighbor classification algorithm, is one of the simplest methods in data mining classification. "K nearest neighbors" means that each sample can be represented by the K samples closest to it. Proximity analysis, a common GIS analysis function, can be implemented with a nearest-neighbor algorithm. The general idea of existing proximity analysis algorithms is to set a buffer radius centered on the input position and to search iteratively until the K elements closest to the input position are found.
However, existing proximity analysis algorithms have some problems. They are implemented serially, so when the data volume is large they are less efficient than a parallel implementation. The buffer radius is also highly uncertain: when many elements must be searched, the number of buffer operations increases and the number of iterations grows geometrically, which greatly reduces analysis efficiency and fails to meet display requirements.
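The buffer-based search described above can be sketched as follows. This is only an illustration of the general idea, not code from the application: the function name `buffer_knn`, the starting radius and the `growth` factor are assumptions, and elements are simplified to 2D points.

```python
import math

def buffer_knn(points, center, k, radius=1.0, growth=2.0):
    """Return the k points nearest to `center` by growing a circular buffer.

    `radius` and `growth` are hypothetical tuning parameters; the source only
    says that a buffer radius is set and the search is iterated.
    """
    if k > len(points):
        raise ValueError("k exceeds the number of points")
    while True:
        # Each iteration rescans every point, which is the cost the text criticises.
        hits = [p for p in points if math.dist(p, center) <= radius]
        if len(hits) >= k:
            hits.sort(key=lambda p: math.dist(p, center))
            return hits[:k]
        radius *= growth  # buffer too small: enlarge it and search again
```

The number of iterations depends entirely on how well the starting radius was guessed, which is the uncertainty the next paragraph of the background points out.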
The above is only for the purpose of assisting understanding of the technical aspects of the present invention, and does not represent an admission that the above is prior art.
Disclosure of Invention
In view of this, the present invention provides a proximity analysis method, apparatus, device and medium based on spatial index, and aims to solve the technical problem that the proximity analysis cannot be realized in a multi-thread and multi-task manner in the prior art.
The technical scheme of the invention is realized as follows:
In one aspect, the present invention provides a proximity analysis method based on a spatial index, including the following steps:
S1, establishing an R-Tree algorithm, acquiring spatial data, computing the spatial data through the R-Tree algorithm, building an R-Tree index structure from the computed data, fragmenting the computed data in the R-Tree index structure, and building a corresponding data fragment index for each piece of fragment data;
S2, acquiring the data fragment index and the initial position, calculating the distance value between the initial position and each piece of fragment data according to the data fragment index, and building a queue from the distance values and the corresponding fragment data;
S3, setting an initial distance, extracting geometric figures from the fragment data in the queue, calculating the figure distance from the initial position to each geometric figure, comparing the figure distance with the initial distance, and storing the figure distance and the label corresponding to the geometric figure in the queue according to the comparison result;
S4, setting a label quantity threshold, acquiring the number of labels in the queue in real time, and, when the number of labels in the queue is greater than the label quantity threshold, extracting the smallest figure distance from the queue to update the initial distance.
On the basis of the above technical solution, preferably, establishing the R-Tree algorithm in step S1 further includes the following steps: the number of spatial data items is set to n and the node capacity to fan, and the number of leaf nodes k is estimated as n/fan; all spatial data are sorted by the x value of the center point of their bounding rectangle, and the sorted data are divided into groups, each of size fan, where the last group may not be full; the data in each group are then sorted by the y value of the center point of their bounding rectangle and regrouped into groups of size fan; each of these groups is used as a leaf node, and the number of leaf nodes is nn.
On the basis of the foregoing technical solution, preferably, in step S1, fragmenting the computed data in the R-Tree index structure and building a corresponding data fragment index for each piece of fragment data further includes the following: the data fragment index comprises the minimum bounding rectangle of the fragment data, the geometric figures enclosed by that minimum bounding rectangle, and the labels corresponding to the geometric figures.
Based on the above technical solution, preferably, in step S2, the data segment indexes and the distances between the initial positions are obtained, the distance values between the initial positions and each segment data are calculated according to the data segment indexes, and a queue is established according to the distance values and the corresponding segment data.
On the basis of the above technical solution, preferably, in step S3, setting an initial distance, extracting geometric figures from the fragment data in the queue, and calculating the figure distance from the initial position to each geometric figure further includes the following steps: setting an initial distance and a fragment data quantity threshold; counting the fragment data in the queue; when the quantity of fragment data is greater than the fragment data quantity threshold, comparing the distance value corresponding to the fragment data with the initial distance; when the distance value is not greater than the initial distance, taking the fragment data out of the queue, extracting from it the geometric figures enclosed by its minimum bounding rectangle and the labels corresponding to those geometric figures, and calculating the figure distance from the initial position to each geometric figure.
Based on the above technical solution, preferably, in step S3, the graph distance is compared with the initial distance, and the tag corresponding to the graph distance and the geometric figure is stored in the queue according to the comparison result, and the method further includes the steps of comparing the graph distance with the initial distance, and storing the tag corresponding to the graph distance and the geometric figure in the queue when the graph distance is not greater than the initial distance.
On the basis of the above technical solution, preferably, in step S4, setting a label quantity threshold, obtaining the quantity of labels in the queue in real time, and when the quantity of labels in the queue is greater than the label quantity threshold, extracting the graph distance with the smallest value from the queue to update the initial distance, further including the steps of setting a label quantity threshold, obtaining the quantity of labels in the queue in real time, comparing the quantity of labels with the label quantity threshold, and when the quantity of labels is not less than the label quantity threshold, extracting the graph distance with the smallest value from the queue to update the initial distance; and when the number of the labels is less than the label number threshold value, reselecting new fragment data.
Still further preferably, the spatial index-based proximity analysis apparatus includes:
the data fragment index establishing module is used for establishing an R-Tree algorithm, acquiring spatial data, calculating the spatial data through the R-Tree algorithm, establishing an R-Tree index structure according to the calculated data, performing data fragmentation on the calculated data in the R-Tree index structure, and establishing a corresponding data fragmentation index according to each fragmented data;
the queue establishing module is used for acquiring the data fragment indexes and the distances between the initial positions, calculating the distance value between the distance between the initial positions and each fragment data according to the data fragment indexes, and establishing a queue according to the distance value and the corresponding fragment data;
the graph distance calculation module is used for setting an initial distance, extracting geometric figures from the fragment data in the queue, calculating the graph distance from the initial position to each geometric figure, comparing the graph distance with the initial distance, and storing a label corresponding to the graph distance and the geometric figures into the queue according to a comparison result;
and the initial distance updating module is used for setting a label quantity threshold value, acquiring the number of labels in the queue in real time, and extracting the graph distance with the minimum value from the queue to update the initial distance when the number of labels in the queue is greater than the label quantity threshold value.
In a second aspect, the present invention further provides a device, comprising: a memory, a processor, and a spatial index-based proximity analysis program stored on the memory and executable on the processor, the program being configured to implement the steps of the spatial index-based proximity analysis method described above.
In a third aspect, the present invention further provides a medium, which is a computer medium storing a spatial index-based proximity analysis program that, when executed by a processor, implements the steps of the spatial index-based proximity analysis method described above.
Compared with the prior art, the proximity analysis method based on the spatial index has the following beneficial effects:
(1) by establishing an R-Tree index structure, fragmenting spatial data, establishing a corresponding fragment index, and then replacing buffer search by fragment search, multiple elements can be searched, the search accuracy is high, and the efficiency of the whole analysis process is improved;
(2) the queue with the sorting function is used for data storage of the intermediate process, so that the analysis process of multithreading and multitasking can be realized, meanwhile, the sorting function can also directly, accurately and quickly extract required data, and the efficiency of the whole analysis process is improved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
FIG. 1 is a schematic diagram of an apparatus in a hardware operating environment according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating a first embodiment of a spatial index-based proximity analysis method according to the present invention;
FIG. 3 is a functional block diagram of a first embodiment of a spatial index-based proximity analysis apparatus according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.
As shown in fig. 1, the apparatus may include: a processor 1001, such as a Central Processing Unit (CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 enables communication between these components. The user interface 1003 may include a display screen and an input unit such as a keyboard; optionally, the user interface 1003 may also include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a Wireless Fidelity (Wi-Fi) interface). The memory 1005 may be a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as a disk memory. The memory 1005 may alternatively be a storage device separate from the processor 1001.
Those skilled in the art will appreciate that the configuration shown in fig. 1 does not constitute a limitation of the device, and that in actual implementations the device may include more or less components than those shown, or some components may be combined, or a different arrangement of components.
As shown in fig. 1, a memory 1005, which is a medium, may include therein an operating system, a network communication module, a user interface module, and a spatial index-based proximity analysis method program.
In the device shown in fig. 1, the network interface 1004 is mainly used for establishing a communication connection between the device and a server storing all data required in the spatial index-based proximity analysis method system; the user interface 1003 is mainly used for data interaction with a user; the processor 1001 and the memory 1005 of the spatial index-based proximity analysis method apparatus of the present invention may be disposed in the spatial index-based proximity analysis method apparatus, and the spatial index-based proximity analysis method apparatus calls the spatial index-based proximity analysis method program stored in the memory 1005 through the processor 1001 and executes the spatial index-based proximity analysis method provided by the present invention.
Referring to fig. 2, fig. 2 is a schematic flow chart of a first embodiment of a proximity analysis method based on spatial index according to the present invention.
In this embodiment, the method for proximity analysis based on spatial index includes the following steps:
s10: the method comprises the steps of establishing an R-Tree algorithm, obtaining spatial data, calculating the spatial data through the R-Tree algorithm, establishing an R-Tree index structure according to the calculated data, performing data fragmentation on the calculated data in the R-Tree index structure, and establishing a corresponding data fragmentation index according to each fragmented data.
It should be understood that an R-tree extends the B-tree to multidimensional space. It divides the object space into regions, and each node corresponds to a region and a disk page. The disk page of a non-leaf node stores the region ranges of all of its child nodes, and the regions of all child nodes fall within the region of that non-leaf node. The disk pages of the leaf nodes store the bounding rectangles of all spatial objects within their region. The R-tree is a dynamic index structure. Both the R-tree and the B-tree are balanced tree data structures; the R-tree is the one used for indexing spatial data.
It should be understood that the R-Tree algorithm in this implementation is described as follows: the number of spatial data items is set to n and the node capacity to fan, and the number of leaf nodes k is estimated as n/fan. All spatial data are sorted by the x value of the center point of their bounding rectangle, and the sorted data are divided into groups of size afan; the last group may not be full. The data in each group are then sorted by the y value of the center point of their bounding rectangle, and each sorted group is regrouped into groups of size fan. Each of these groups is used as a leaf node, and the number of leaf nodes is nn.
It should be understood that after the system acquires the spatial data, it resamples the data. Resampling generally refers to interpolating the information of one kind of pixel from the information of another; in this implementation, resampling is the process of extracting the required spatial data from the full spatial data. This reduces the number of data operations and improves analysis efficiency.
It should be understood that after the R-Tree index structure is built and the computed data in it are fragmented, a two-level index tag is also built. The first-level index tag is the data fragment index tag, whose content includes: the minimum bounding rectangle of the fragment data, the geometric figures enclosed by that minimum bounding rectangle, the labels corresponding to the geometric figures, the fragment id, the number of records contained in the fragment, the size of the fragment file and the name of the fragment file. The second-level index tag is the R-Tree index tag. The fragment id, record count, file size and file name help a user follow the whole fragmentation process more intuitively, and the R-Tree index tag helps the system find the corresponding fragment more quickly, which improves analysis efficiency.
It should be understood that data fragmentation refers to dividing the data of a distributed database into fragments that can be stored, and replicated, in physical databases at different sites of the network. Data fragmentation is realized through basic operations of relational algebra. In this embodiment, the computed data in the R-Tree index structure are fragmented to obtain more nodes, which facilitates the analysis process.
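The sort-by-x, group, sort-by-y, regroup procedure above resembles Sort-Tile-Recursive (STR) bulk loading. The sketch below packs leaf nodes that way. The group size of the x-sorted pass is not legible in the text, so the sketch assumes the usual STR value of ceil(sqrt(k)) * fan; the names `pack_leaves`, `center` and `mbr` and the tuple layout of a rectangle are likewise assumptions made for illustration.

```python
import math
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y), assumed layout

def center(r: Rect) -> Tuple[float, float]:
    """Center point of a bounding rectangle."""
    return ((r[0] + r[2]) / 2.0, (r[1] + r[3]) / 2.0)

def pack_leaves(rects: List[Rect], fan: int) -> List[List[Rect]]:
    """Group rectangles into leaf nodes holding at most `fan` entries each."""
    if fan < 1:
        raise ValueError("fan must be at least 1")
    n = len(rects)
    if n == 0:
        return []
    k = math.ceil(n / fan)                     # estimated number of leaf nodes
    slice_size = math.ceil(math.sqrt(k)) * fan  # ASSUMPTION: standard STR slice size
    by_x = sorted(rects, key=lambda r: center(r)[0])
    leaves: List[List[Rect]] = []
    for i in range(0, n, slice_size):          # the last slice may not be full
        column = sorted(by_x[i:i + slice_size], key=lambda r: center(r)[1])
        for j in range(0, len(column), fan):   # each group of `fan` becomes a leaf
            leaves.append(column[j:j + fan])
    return leaves

def mbr(group: List[Rect]) -> Rect:
    """Bounding rectangle of one leaf, as stored in its parent node."""
    return (min(r[0] for r in group), min(r[1] for r in group),
            max(r[2] for r in group), max(r[3] for r in group))
```

Applying `mbr` to each leaf and packing the resulting rectangles again with `pack_leaves` would give the next level up, repeated until a single root remains.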
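The fields listed for the data fragment index tag can be pictured as one record per fragment. The following is a minimal sketch under stated assumptions: the class name `FragmentIndexEntry`, the helper `make_entry`, the field names, and the representation of a geometric figure as a list of (x, y) vertices are all hypothetical and do not come from the application.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y), assumed layout

@dataclass
class FragmentIndexEntry:
    fragment_id: int
    mbr: Rect                      # rectangle enclosing every figure in the fragment
    geometries: List[List[Point]]  # each figure as a vertex list (assumed representation)
    labels: List[str]              # one label per figure, same order as `geometries`
    record_count: int
    file_size: int                 # size of the fragment file in bytes
    file_name: str

def make_entry(fragment_id: int, geometries: List[List[Point]], labels: List[str],
               file_name: str, file_size: int) -> FragmentIndexEntry:
    """Build an index entry, deriving the enclosing rectangle and the record count."""
    if not geometries or len(geometries) != len(labels):
        raise ValueError("need one label per figure and at least one figure")
    xs = [x for g in geometries for x, _ in g]
    ys = [y for g in geometries for _, y in g]
    rect = (min(xs), min(ys), max(xs), max(ys))
    return FragmentIndexEntry(fragment_id, rect, geometries, labels,
                              len(geometries), file_size, file_name)
```

Keeping the rectangle in the entry lets a search discard a whole fragment from its index record alone, without opening the fragment file.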
S20: the method comprises the steps of obtaining a data fragment index and the distance between an initial position and each fragment data, calculating the distance value between the distance between the initial position and each fragment data according to the data fragment index, and establishing a queue according to the distance value and the corresponding fragment data.
It should be understood that, during analysis, the system obtains the data fragment index and the initial position, where the initial position is set by the user. The system then calculates the distance value between the initial position and each fragment, that is, the distance from the point to the minimum bounding rectangle of the fragment data. It builds a queue from the distance values and the corresponding fragment data and orders the queue by distance value from small to large, so that the entry with the smallest distance sits at the head of the queue. The minimum distance can then be read directly, which improves analysis efficiency.
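The distance from the initial position to the rectangle enclosing a fragment, and the distance-ordered queue, can be sketched with Python's `heapq`. The function names and the dictionary mapping fragment ids to rectangles are assumptions made for illustration; a binary heap stands in for the "queue with a sorting function".

```python
import heapq
import math

def point_to_rect_distance(p, rect):
    """Euclidean distance from point p to an axis-aligned rectangle (0 if inside)."""
    px, py = p
    min_x, min_y, max_x, max_y = rect
    dx = max(min_x - px, 0.0, px - max_x)  # horizontal gap, zero when px lies within
    dy = max(min_y - py, 0.0, py - max_y)  # vertical gap, zero when py lies within
    return math.hypot(dx, dy)

def build_fragment_queue(start, fragments):
    """Return a min-heap of (distance, fragment_id) for a dict of id -> rectangle."""
    heap = [(point_to_rect_distance(start, rect), fid) for fid, rect in fragments.items()]
    heapq.heapify(heap)  # smallest distance ends up at heap[0]
    return heap
```

Because the rectangle distance is a lower bound on the distance to anything inside the fragment, popping fragments in this order visits the most promising ones first.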
S30: setting an initial distance, extracting geometric figures from the sliced data in the queue, calculating the figure distance from the initial position to each geometric figure, comparing the figure distance with the initial distance, and storing the figure distance and a label corresponding to the geometric figure into the queue according to a comparison result.
It should be understood that, after the queue is built, the system sets an initial distance and a fragment data quantity threshold; the threshold is set by the user and is generally 0. The system then checks the fragment data in the queue. When the quantity of fragment data in the queue is greater than the threshold, the system reads the distance value of the first fragment in the queue, which is the minimum distance value in the queue, and compares it with the initial distance. The fragment data is taken out only when that distance value is not greater than the initial distance.
It should be understood that after the fragment data is taken out, the system extracts from it the geometric figures enclosed by its minimum bounding rectangle and the labels corresponding to those geometric figures, and then calculates the figure distance between the initial position and each geometric figure; the geometric figures are composed of spatial data. The system compares each figure distance with the initial distance and, when the figure distance is not greater than the initial distance, stores the figure distance and the label of the geometric figure in the queue. The queue is kept ordered by figure distance from small to large, with the smallest figure distance at the head, so that data can be extracted faster and analysis efficiency improves.
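A hedged sketch of this step follows. It assumes each geometric figure is a point or an open polyline given as a vertex list, so a point inside a closed polygon is not treated as distance zero; the function names and the (label, vertices) pairing are hypothetical.

```python
import heapq
import math

def point_segment_distance(p, a, b):
    """Distance from point p to the line segment a-b."""
    px, py = p
    ax, ay = a
    dx, dy = b[0] - ax, b[1] - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:                        # degenerate segment: a == b
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))                    # clamp the projection onto the segment
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def geometry_distance(p, vertices):
    """Distance from p to a point (one vertex) or a polyline (several vertices)."""
    if len(vertices) == 1:
        return math.dist(p, vertices[0])
    return min(point_segment_distance(p, a, b) for a, b in zip(vertices, vertices[1:]))

def push_candidates(start, fragment, bound, results):
    """Push (distance, label) onto the min-heap `results` for figures within `bound`."""
    for label, vertices in fragment:
        d = geometry_distance(start, vertices)
        if d <= bound:                           # "not greater than the initial distance"
            heapq.heappush(results, (d, label))
```

With `results` kept as a heap, the smallest figure distance is always available at `results[0]` without re-sorting after every insertion.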
S40: and setting a label quantity threshold value, acquiring the quantity of labels in the queue in real time, and extracting the graph distance with the minimum value from the queue to update the initial distance when the quantity of the labels in the queue is greater than the label quantity threshold value.
It should be understood that the system sets a label quantity threshold, which is chosen by the user. The system obtains the number of labels in the whole queue in real time and compares it with the label quantity threshold. When the number of labels is smaller than the threshold, new fragment data is selected to supplement the labels in the queue. When the number of labels is not less than the threshold, the smallest figure distance is extracted from the queue to update the initial distance. This completes the analysis process and improves analysis efficiency.
The above description is only for illustrative purposes and does not limit the technical solutions of the present application in any way.
As can be easily found from the above description, in this embodiment, the spatial data is obtained by establishing an R-Tree algorithm, the spatial data is calculated by the R-Tree algorithm, an R-Tree index structure is established according to the calculated data, the calculated data in the R-Tree index structure is subjected to data fragmentation, and a corresponding data fragmentation index is established according to each fragmented data; acquiring a data fragment index and the distance between the initial position and each fragment data, calculating the distance value between the distance between the initial position and each fragment data according to the data fragment index, and establishing a queue according to the distance value and the corresponding fragment data; setting an initial distance, extracting geometric figures from the sliced data in the queue, calculating the figure distance from the initial position to each geometric figure, comparing the figure distance with the initial distance, and storing the figure distance and a label corresponding to the geometric figure into the queue according to a comparison result; and setting a label quantity threshold value, acquiring the quantity of labels in the queue in real time, and extracting the graph distance with the minimum value from the queue to update the initial distance when the quantity of the labels in the queue is greater than the label quantity threshold value. In the embodiment, the buffer lookup is replaced by the fragment lookup, and the queue with the sorting function is used for storing the intermediate process data, so that the proximity analysis can be performed in a multi-thread and multi-task manner, and the analysis efficiency is improved.
In addition, the embodiment of the invention also provides a proximity analysis device based on the spatial index. As shown in fig. 3, the proximity analysis apparatus based on spatial index includes: the system comprises a data slicing index establishing module 10, a queue establishing module 20, a graph distance calculating module 30 and an initial distance updating module 40.
The data fragment index establishing module 10 is configured to establish an R-Tree algorithm, acquire spatial data, calculate the spatial data through the R-Tree algorithm, establish an R-Tree index structure according to the calculated data, perform data fragmentation on the calculated data in the R-Tree index structure, and establish a corresponding data fragmentation index according to each piece of fragmented data;
the queue establishing module 20 is configured to obtain a data segment index and a distance between an initial position and each piece of segment data, calculate a distance value between the distance between the initial position and each piece of segment data according to the data segment index, and establish a queue according to the distance value and the corresponding piece of segment data;
the graph distance calculation module 30 is configured to set an initial distance, extract geometric figures from the sliced data in the queue, calculate a graph distance from an initial position to each geometric figure, compare the graph distance with the initial distance, and store a label corresponding to the graph distance and the geometric figure in the queue according to a comparison result;
and the initial distance updating module 40 is configured to set a tag quantity threshold, obtain the number of tags in the queue in real time, and extract the graph distance with the smallest value from the queue to update the initial distance when the number of tags in the queue is greater than the tag quantity threshold.
It should be noted that the device embodiment described above is merely illustrative and does not limit the scope of the present invention. In practical applications, a person skilled in the art may select some or all of the modules according to actual needs to achieve the purpose of the embodiment, and the present invention is not limited in this respect.
For technical details not described in this embodiment, refer to the proximity analysis method based on spatial index provided in any embodiment of the present invention; they are not repeated here.
Furthermore, an embodiment of the present invention provides a medium. The medium is a computer medium on which a spatial index-based proximity analysis method program is stored; when executed by a processor, the program implements the following operations:
S1, establishing an R-Tree algorithm, acquiring spatial data, processing the spatial data through the R-Tree algorithm, establishing an R-Tree index structure according to the processed data, dividing the processed data in the R-Tree index structure into data fragments, and establishing a corresponding data fragment index for each piece of fragment data;
S2, acquiring the data fragment index and the initial position, calculating the distance value between the initial position and each piece of fragment data according to the data fragment index, and establishing a queue according to the distance values and the corresponding fragment data;
S3, setting an initial distance, extracting geometric figures from the fragment data in the queue, calculating the figure distance from the initial position to each geometric figure, comparing the figure distance with the initial distance, and storing the figure distance and the label corresponding to the geometric figure in the queue according to the comparison result;
and S4, setting a label quantity threshold, obtaining the number of labels in the queue in real time, and extracting the figure distance with the minimum value from the queue to update the initial distance when the number of labels in the queue is greater than the label quantity threshold.
Further, the program of the proximity analysis method based on the spatial index, when executed by the processor, further implements the following operations:
The R-Tree algorithm is as follows: let the number of spatial data items be n and define the fan-out size as fan; the number k of leaf nodes is estimated as n/fan. All spatial data are sorted by the x value of the center point of their bounding rectangle, and the sorted data are divided into groups, the size of each group being set to x fan, where the last group may not be full. The data in each group are then sorted by the y value of the center point of their bounding rectangle, and each sorted group is divided again into groups of size fan; each resulting group serves as a leaf node, and the number of leaf nodes is nn.
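The grouping described above resembles Sort-Tile-Recursive (STR) bulk loading. The following Python sketch illustrates it under one explicit assumption: each x-sorted group holds ceil(sqrt(k)) × fan items, as in standard STR, because the multiplier is not legible in the text. The names `Rect`, `center`, and `str_pack_leaves` are illustrative and do not come from the source.

```python
import math
from typing import List, Tuple

# (min_x, min_y, max_x, max_y) bounding rectangle of one spatial object (illustrative type)
Rect = Tuple[float, float, float, float]


def center(r: Rect) -> Tuple[float, float]:
    """Center point of a bounding rectangle."""
    return ((r[0] + r[2]) / 2.0, (r[1] + r[3]) / 2.0)


def str_pack_leaves(rects: List[Rect], fan: int) -> List[List[Rect]]:
    """Group rectangles into leaf nodes holding at most `fan` entries each."""
    if fan <= 0:
        raise ValueError("fan must be positive")
    n = len(rects)
    if n == 0:
        return []
    k = math.ceil(n / fan)             # estimated number of leaf nodes
    slices = math.ceil(math.sqrt(k))   # ASSUMPTION: standard STR slice count
    slice_size = slices * fan          # items per x-sorted group

    # Sort everything by the x value of the bounding-rectangle center.
    by_x = sorted(rects, key=lambda r: center(r)[0])
    leaves: List[List[Rect]] = []
    for i in range(0, n, slice_size):
        # Inside each group, sort by the y value of the center.
        group = sorted(by_x[i:i + slice_size], key=lambda r: center(r)[1])
        for j in range(0, len(group), fan):
            # The last leaf of a group may hold fewer than `fan` entries.
            leaves.append(group[j:j + fan])
    return leaves
```

With 10 rectangles and `fan = 3`, this sketch estimates k = 4, uses groups of 6, and produces 4 leaves.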
Further, the program of the proximity analysis method based on the spatial index, when executed by the processor, further implements the following operations:
The data fragment index includes: the minimum bounding rectangle of the fragment data, the geometric figures enclosed by the minimum bounding rectangle of the fragment data, and the labels corresponding to the geometric figures.
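One possible in-memory layout for such a fragment index is sketched below in Python. It assumes that each geometric figure is represented by a label and a tuple of 2D vertices; the class and function names (`Geometry`, `FragmentIndex`, `build_fragment_index`) are hypothetical and are not defined by the source.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


@dataclass(frozen=True)
class Geometry:
    label: str                   # label identifying the geometric figure
    vertices: Tuple[Point, ...]  # ASSUMPTION: a figure is a sequence of 2D vertices


@dataclass(frozen=True)
class FragmentIndex:
    mbr: Rect                         # minimum bounding rectangle of the fragment
    geometries: Tuple[Geometry, ...]  # figures enclosed by that rectangle


def build_fragment_index(geometries: List[Geometry]) -> FragmentIndex:
    """Compute the minimum bounding rectangle of a fragment and bundle it with its figures."""
    if not geometries or any(not g.vertices for g in geometries):
        raise ValueError("a fragment needs at least one non-empty geometry")
    xs = [x for g in geometries for x, _ in g.vertices]
    ys = [y for g in geometries for _, y in g.vertices]
    return FragmentIndex(mbr=(min(xs), min(ys), max(xs), max(ys)),
                         geometries=tuple(geometries))
```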
Further, the program of the proximity analysis method based on the spatial index, when executed by the processor, further implements the following operations:
Acquiring the data fragment index and the initial position, obtaining the position of the minimum bounding rectangle of the fragment data according to the data fragment index, calculating the distance value between the initial position and the minimum bounding rectangle of the fragment data, establishing a queue according to the distance values and the corresponding fragment data, and arranging the fragment data in the queue in ascending order of distance value.
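The ordering step can be illustrated with a small Python sketch. It assumes Euclidean distance in the plane and uses a binary heap from the standard library as the "queue with a sorting function"; the source does not specify either choice, and the names `point_to_rect_distance` and `build_fragment_queue` are illustrative.

```python
import heapq
import math
from typing import List, Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def point_to_rect_distance(p: Point, r: Rect) -> float:
    """Shortest Euclidean distance from a point to a rectangle (0 if the point is inside)."""
    dx = max(r[0] - p[0], 0.0, p[0] - r[2])
    dy = max(r[1] - p[1], 0.0, p[1] - r[3])
    return math.hypot(dx, dy)


def build_fragment_queue(start: Point, fragment_mbrs: List[Rect]) -> List[Tuple[float, int]]:
    """Return a heap of (distance value, fragment number), smallest distance first."""
    queue = [(point_to_rect_distance(start, mbr), i)
             for i, mbr in enumerate(fragment_mbrs)]
    heapq.heapify(queue)
    return queue
```

Popping entries with `heapq.heappop` then yields the fragments nearest to the initial position first, so distant fragments need not be opened until the nearer ones are exhausted.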
Further, the program of the proximity analysis method based on the spatial index, when executed by the processor, further implements the following operations:
Setting an initial distance and a fragment data quantity threshold, traversing the fragment data in the queue to count it, and, when the quantity of fragment data is greater than the fragment data quantity threshold, comparing the distance value corresponding to the fragment data with the initial distance; when the distance value is not greater than the initial distance, taking the fragment data out of the queue, extracting from it the geometric figures enclosed by its minimum bounding rectangle and the labels corresponding to the geometric figures, and calculating the figure distance from the initial position to each geometric figure.
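How a figure distance might be computed is sketched below in Python. The sketch assumes that a geometric figure is a point or an open polyline given as a list of vertices and that distances are Euclidean; polygon interiors are not handled. The function names are hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def point_to_segment_distance(p: Point, a: Point, b: Point) -> float:
    """Shortest Euclidean distance from point p to the segment a-b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: both endpoints coincide.
        return math.hypot(p[0] - a[0], p[1] - a[1])
    # Project p onto the line through a and b, then clamp to the segment.
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))


def figure_distance(p: Point, vertices: Sequence[Point]) -> float:
    """Distance from p to a figure given as a single point or an open polyline."""
    if not vertices:
        raise ValueError("a figure needs at least one vertex")
    if len(vertices) == 1:
        return math.hypot(p[0] - vertices[0][0], p[1] - vertices[0][1])
    return min(point_to_segment_distance(p, vertices[i], vertices[i + 1])
               for i in range(len(vertices) - 1))
```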
Further, the program of the proximity analysis method based on the spatial index, when executed by the processor, further implements the following operations:
Comparing the figure distance with the initial distance, and storing the figure distance and the label corresponding to the geometric figure in the queue when the figure distance is not greater than the initial distance.
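The comparison and storage step can be sketched in Python as follows. The result queue is assumed here to be a heap of `(distance, label)` pairs kept separately from the fragment queue; that separation and the name `store_candidates` are assumptions made for illustration.

```python
import heapq
from typing import Iterable, List, Tuple


def store_candidates(results: List[Tuple[float, str]],
                     measured: Iterable[Tuple[str, float]],
                     initial_distance: float) -> None:
    """Push (distance, label) onto the result heap when distance <= initial_distance."""
    for label, dist in measured:
        if dist <= initial_distance:  # "not greater than the initial distance"
            heapq.heappush(results, (dist, label))
```

Figures farther away than the current initial distance are simply dropped, so the result heap only grows with candidates that can still matter.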
Further, the program of the proximity analysis method based on the spatial index, when executed by the processor, further implements the following operations:
Setting a label quantity threshold, obtaining the number of labels in the queue in real time, and comparing the number of labels with the label quantity threshold; when the number of labels is not less than the label quantity threshold, extracting the figure distance with the minimum value from the queue to update the initial distance; and when the number of labels is less than the label quantity threshold, selecting new fragment data.
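A minimal Python sketch of this update rule follows. It takes the wording literally: once enough labels are stored, the smallest stored figure distance becomes the new initial distance; otherwise the caller is told to fetch another fragment. The function name, the `(distance, label)` pair layout, and the returned flag are assumptions for illustration.

```python
from typing import List, Tuple


def update_initial_distance(results: List[Tuple[float, str]],
                            label_threshold: int,
                            initial_distance: float) -> Tuple[float, bool]:
    """Return (initial distance to use next, whether a new fragment should be selected)."""
    if results and len(results) >= label_threshold:
        # Enough labels collected: tighten the bound to the smallest stored figure distance.
        return min(dist for dist, _ in results), False
    # Too few labels: keep the current bound and ask for new fragment data.
    return initial_distance, True
```

For example, with stored pairs `[(3.0, "a"), (5.0, "c")]` and a threshold of 2, the sketch returns `(3.0, False)`; with a threshold of 3 it returns the unchanged initial distance and `True`.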
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (10)

1. A proximity analysis method based on spatial index, characterized by comprising the following steps:
S1, establishing an R-Tree algorithm, acquiring spatial data, processing the spatial data through the R-Tree algorithm, establishing an R-Tree index structure according to the processed data, dividing the processed data in the R-Tree index structure into data fragments, and establishing a corresponding data fragment index for each piece of fragment data;
S2, acquiring the data fragment index and the initial position, calculating the distance value between the initial position and each piece of fragment data according to the data fragment index, and establishing a queue according to the distance values and the corresponding fragment data;
S3, setting an initial distance, extracting geometric figures from the fragment data in the queue, calculating the figure distance from the initial position to each geometric figure, comparing the figure distance with the initial distance, and storing the figure distance and the label corresponding to the geometric figure in the queue according to the comparison result;
and S4, setting a label quantity threshold, obtaining the number of labels in the queue in real time, and extracting the figure distance with the minimum value from the queue to update the initial distance when the number of labels in the queue is greater than the label quantity threshold.
2. The spatial index-based proximity analysis method of claim 1, wherein: in step S1, establishing the R-Tree algorithm comprises: letting the number of spatial data items be n and the fan-out size be fan, and estimating the number k of leaf nodes as n/fan; sorting all spatial data by the x value of the center point of their bounding rectangle; dividing the sorted spatial data into groups, the size of each group being x fan, where the last group may not be full; sorting the data in each group by the y value of the center point of their bounding rectangle; and dividing each sorted group again into groups of size fan, each resulting group serving as a leaf node, the number of leaf nodes being nn.
3. The spatial index-based proximity analysis method of claim 1, wherein: in step S1, the data fragment index established for each piece of fragment data includes: the minimum bounding rectangle of the fragment data, the geometric figures enclosed by the minimum bounding rectangle of the fragment data, and the labels corresponding to the geometric figures.
4. The spatial index-based proximity analysis method of claim 2, wherein: step S2 further comprises: acquiring the data fragment index and the initial position, obtaining the position of the minimum bounding rectangle of the fragment data according to the data fragment index, calculating the distance value between the initial position and the minimum bounding rectangle of the fragment data, establishing a queue according to the distance values and the corresponding fragment data, and arranging the fragment data in the queue in ascending order of distance value.
5. The spatial index-based proximity analysis method of claim 3, wherein: in step S3, setting the initial distance, extracting geometric figures from the fragment data in the queue, and calculating the figure distance from the initial position to each geometric figure further comprises: setting an initial distance and a fragment data quantity threshold; traversing the fragment data in the queue to count it; when the quantity of fragment data is greater than the fragment data quantity threshold, comparing the distance value corresponding to the fragment data with the initial distance; when the distance value is not greater than the initial distance, taking the fragment data out of the queue and extracting from it the geometric figures enclosed by its minimum bounding rectangle and the labels corresponding to the geometric figures; and calculating the figure distance from the initial position to each geometric figure.
6. The spatial index-based proximity analysis method of claim 5, wherein: in step S3, comparing the figure distance with the initial distance and storing the figure distance and the label corresponding to the geometric figure in the queue according to the comparison result further comprises: comparing the figure distance with the initial distance, and storing the figure distance and the label corresponding to the geometric figure in the queue when the figure distance is not greater than the initial distance.
7. The spatial index-based proximity analysis method of claim 6, wherein: step S4 further comprises: setting a label quantity threshold, obtaining the number of labels in the queue in real time, and comparing the number of labels with the label quantity threshold; when the number of labels is not less than the label quantity threshold, extracting the figure distance with the minimum value from the queue to update the initial distance; and when the number of labels is less than the label quantity threshold, selecting new fragment data.
8. A spatial index-based proximity analysis apparatus, comprising:
a data fragment index establishing module, configured to establish an R-Tree algorithm, acquire spatial data, process the spatial data through the R-Tree algorithm, establish an R-Tree index structure according to the processed data, divide the processed data in the R-Tree index structure into data fragments, and establish a corresponding data fragment index for each piece of fragment data;
a queue establishing module, configured to acquire the data fragment index and the initial position, calculate the distance value between the initial position and each piece of fragment data according to the data fragment index, and establish a queue according to the distance values and the corresponding fragment data;
a figure distance calculating module, configured to set an initial distance, extract geometric figures from the fragment data in the queue, calculate the figure distance from the initial position to each geometric figure, compare the figure distance with the initial distance, and store the figure distance and the label corresponding to the geometric figure in the queue according to the comparison result;
and an initial distance updating module, configured to set a label quantity threshold, obtain the number of labels in the queue in real time, and extract the figure distance with the minimum value from the queue to update the initial distance when the number of labels in the queue is greater than the label quantity threshold.
9. An apparatus, characterized in that the apparatus comprises: a memory, a processor, and a spatial index based proximity analysis method program stored on the memory and executable on the processor, the spatial index based proximity analysis method program configured to implement the steps of the spatial index based proximity analysis method of any of claims 1 to 7.
10. A medium, characterized in that the medium is a computer medium, on which a spatial index based proximity analysis method program is stored, which when executed by a processor implements the steps of the spatial index based proximity analysis method according to any one of claims 1 to 7.
CN201911131958.3A 2019-11-19 2019-11-19 Proximity analysis method, device, equipment and medium based on spatial index Pending CN110888880A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911131958.3A CN110888880A (en) 2019-11-19 2019-11-19 Proximity analysis method, device, equipment and medium based on spatial index

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911131958.3A CN110888880A (en) 2019-11-19 2019-11-19 Proximity analysis method, device, equipment and medium based on spatial index

Publications (1)

Publication Number Publication Date
CN110888880A true CN110888880A (en) 2020-03-17

Family

ID=69747900

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911131958.3A Pending CN110888880A (en) 2019-11-19 2019-11-19 Proximity analysis method, device, equipment and medium based on spatial index

Country Status (1)

Country Link
CN (1) CN110888880A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040193615A1 (en) * 2003-03-27 2004-09-30 Kothuri Ravi Kanth V. Delayed distance computations for nearest-neighbor queries in an R-tree index
CN102253961A (en) * 2011-05-17 2011-11-23 复旦大学 Method for querying road network k aggregation nearest neighboring node based on Voronoi graph
CN103324642A (en) * 2012-03-23 2013-09-25 日电(中国)有限公司 Data index establishing system and method as well as data query method
CN104573140A (en) * 2013-10-09 2015-04-29 北京军区军事训练模拟仿真研发服务中心 Layered dynamic path planning method applied to virtual simulation
CN106055563A (en) * 2016-05-19 2016-10-26 福建农林大学 Method for parallel space query based on grid division and system of same
CN106933833A (en) * 2015-12-30 2017-07-07 中国科学院沈阳自动化研究所 A kind of positional information method for quickly querying based on Spatial Data Index Technology
CN109871418A (en) * 2019-01-04 2019-06-11 广州市城市规划勘测设计研究院 A kind of space index method and system of space-time data
CN110119408A (en) * 2019-03-22 2019-08-13 西安电子科技大学 Mobile object continuous-query method under geographical space real-time streaming data

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
BALASUBRAMANIAN L 等: "A state-of-art in R-tree variants for spatial indexing", 《INTERNATIONAL JOURNAL OF COMPUTER APPLICATIONS》 *
CHENYQ2008: "R-Tree空间索引算法的研究历程和最新进展分析", 《HTTPS://BLOG.CSDN.NET/CHENYQ2008/ARTICLE/DETAILS/2140477》 *
付伟: "基于R树的空间索引技术的研究与应用", 《中国优秀博硕士学位论文全文数据库(硕士)信息科技辑》 *
何江 等: "一种基于R树空间索引技术的GIS数据索引方法", 《四川大学学报(自然科学版)》 *
何珍文 等: "聚类排序R树三维空间索引算法研究", 《全国数学地球科学与地学信息学术会议》 *
梁珺秀 等: "K近邻近似模式匹配查询", 《小型微型计算机系统》 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112463904A (en) * 2020-11-30 2021-03-09 湖北金拓维信息技术有限公司 Mixed analysis method of distributed space vector data and single-point space data
CN112463904B (en) * 2020-11-30 2022-07-01 湖北金拓维信息技术有限公司 Mixed analysis method of distributed space vector data and single-point space data
CN112632008A (en) * 2020-12-29 2021-04-09 华录光存储研究院(大连)有限公司 Data fragment transmission method and device and computer equipment
CN113240089A (en) * 2021-05-20 2021-08-10 北京百度网讯科技有限公司 Graph neural network model training method and device based on graph retrieval engine
CN113536058A (en) * 2021-08-03 2021-10-22 上海达梦数据库有限公司 Spatial index modification method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200317