CN109446114A - Spatial data caching method and device and storage medium - Google Patents
- Publication number: CN109446114A
- Application number: CN201811191662.6A
- Authority
- CN
- China
- Prior art keywords
- data
- data block
- block
- space
- priority
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0893—Caches characterised by their organisation or structure
- G06F12/0895—Caches characterised by their organisation or structure of parts of caches, e.g. directory or tag array
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/10—Providing a specific technical effect
- G06F2212/1016—Performance improvement
- G06F2212/1021—Hit rate improvement
Abstract
The invention discloses a spatial data caching method, which comprises the following steps: receiving an operation request; the operation request is used for requesting to read a data block or requesting to write the data block; determining a data block corresponding to the operation request, and adding the data block into a cache pre-write queue; determining the priority of the data block, and adding the data block into a cache queue when determining that the data block obtains the right of adding into the cache queue according to the priority of the data block; the priority is related to a degree of association of the data block with a reference cache block in the cache queue. The invention also discloses a spatial data caching device and a computer readable storage medium.
Description
Technical field
The present invention relates to big data storage technologies, and more particularly to a spatial data caching method, a device, and a computer-readable storage medium.
Background technique
As big data processing technology matures, Hadoop has become a popular tool for big data processing. Hadoop stores data dispersed across a cluster of data nodes (DataNodes); when a MapReduce task runs, the node a task is assigned to is determined by the distribution of the data, achieving efficient distributed computing. Hadoop itself is an input/output (I/O) intensive data processing framework, so I/O efficiency is critical to its performance; caching technology can effectively reduce the number of I/O operations, improve I/O performance, and optimize the performance of a Hadoop cluster.
Hadoop provides a built-in caching mechanism that allows a user to configure which files are to be cached: once a file is marked for caching, the DataNodes load the data blocks of that file into off-heap memory, improving access efficiency for that file.
Hadoop can also use index-based caching mechanisms, such as a distributed spatial data index based on a quadtree. The quadtree partitions the spatial region; each leaf node stores a certain amount of spatial data, multiple leaf nodes are stored within a data block, and the data blocks are distributed across the DataNode cluster by Hadoop's own distributed storage mechanism. A two-level cache is designed on this index structure: (a) a cache at the DataNode side, which caches data blocks and manages them with the Least Recently Used (LRU) replacement algorithm according to how frequently users access each block; (b) a cache at the client side, oriented to quadtree leaf nodes, which caches the leaf nodes the client accesses frequently and is likewise managed directly by the LRU algorithm.
The cache granularity of Hadoop itself generally only reaches the file level, and its caching mechanism targets only the data block (Block) itself. However, the data Hadoop handles in real application systems is often correlated — for example, microblog data containing spatial position information, or the running-track data of users of a fitness application. If only Hadoop's own caching mechanism is used, then when a user accesses an adjacent area that is not in memory, the corresponding data must be read from disk, increasing disk I/O, potentially lowering the cache hit rate, and reducing system efficiency.
Summary of the invention
In view of this, the main purpose of the present invention is to provide a spatial data caching method, a device, and a computer-readable storage medium.
In order to achieve the above objectives, the technical scheme of the present invention is realized as follows:
An embodiment of the invention provides a spatial data caching method applied to a data node. The method comprises:
receiving an operation request, the operation request being for requesting to read a data block or requesting to write a data block;
determining the data block corresponding to the operation request, and adding the data block to a cache pre-write queue;
determining the priority of the data block, and adding the data block to a cache queue when it is determined, according to the priority of the data block, that the data block obtains the right to join the cache queue; the priority is related to the degree of association between the data block and a reference cache block in the cache queue.
In the above scheme, determining the priority of the data block comprises:
determining the access frequency of the data block and the spatial distance between the data block and the reference cache block, the reference cache block being the cache block with the highest priority in the cache queue;
obtaining a first weight corresponding to the access frequency and a second weight corresponding to the spatial distance;
determining the priority of the data block according to the access frequency, the first weight, the spatial distance, and the second weight.
In the above scheme, determining, according to the priority of the data block, that the data block obtains the right to join the cache queue comprises:
determining that the data block obtains the right to join the cache queue when the priority of the data block is greater than the priority of at least one cache block in the cache queue.
In the above scheme, for a data block that is requested to be read, determining the access frequency of the data block comprises:
determining the number of accesses to the data block within a preset time period, and a first time interval between the first access time and the last access time within the preset time period;
determining the access frequency of the data block according to the number of accesses and the first time interval.
In the above scheme, for a data block that is requested to be written, determining the access frequency of the data block comprises:
determining the adjacent data blocks of the data block in the cache queue;
determining the number of accesses to each adjacent data block within a preset time period, and a second time interval between the first access time and the last access time within the preset time period;
determining the access frequency of the adjacent data blocks according to the number of accesses and the second time interval;
determining the access frequency of the data block according to the access frequency of the adjacent data blocks and the spatial distance between the adjacent data blocks and the data block.
In the above scheme, determining the spatial distance between the data block and the reference cache block comprises:
determining a first spatial region corresponding to the data block and a second spatial region corresponding to the reference cache block;
determining the first center point coordinates of the first spatial region and the second center point coordinates of the second spatial region;
determining the spatial distance according to the first center point coordinates and the second center point coordinates.
In the above scheme, the method further comprises: when it is determined that the data block does not obtain the right to join the cache queue, processing the data block according to a preset strategy.
Processing the data block according to the preset strategy comprises:
for a data block that is requested to be read, deleting the data block;
for a data block that is requested to be written, writing the data block into a Hadoop pre-write queue, the data blocks in the Hadoop pre-write queue waiting to be written to the data node.
An embodiment of the invention provides a spatial data caching method applied to a client. The method comprises:
after a cache eviction mechanism is invoked, determining first data and second data; the first data denotes data whose access frequency is below a first threshold, and the second data denotes the data with the highest access frequency;
determining a reference spatial distance between the first data and the second data, the reference spatial distance characterizing the degree of association between the first data and the second data;
determining, according to the reference spatial distance, whether to delete the first data, and executing a delete operation on the first data when it is determined that the first data is to be deleted.
In the above scheme, determining the reference spatial distance between the first data and the second data comprises:
determining a third spatial region corresponding to the first data and a fourth spatial region corresponding to the second data;
determining the third center point coordinates of the third spatial region and the fourth center point coordinates of the fourth spatial region;
determining the reference spatial distance according to the third center point coordinates and the fourth center point coordinates.
In the above scheme, determining, according to the reference spatial distance, whether to delete the first data comprises:
judging whether the reference spatial distance exceeds a preset spatial threshold, and determining that the first data is to be deleted when the reference spatial distance exceeds the preset spatial threshold.
An embodiment of the invention provides a spatial data caching device, the device comprising a first processing module, a second processing module, and a third processing module, wherein:
the first processing module is configured to receive an operation request, the operation request being for requesting to read a data block or requesting to write a data block;
the second processing module is configured to determine the data block corresponding to the operation request and add the data block to a cache pre-write queue;
the third processing module is configured to determine the priority of the data block and, when it is determined according to the priority of the data block that the data block obtains the right to join the cache queue, add the data block to the cache queue; the priority is related to the degree of association between the data block and a reference cache block in the cache queue.
In the above scheme, the third processing module is specifically configured to determine the access frequency of the data block and the spatial distance between the data block and the reference cache block, the reference cache block being the cache block with the highest priority in the cache queue;
obtain a first weight corresponding to the access frequency and a second weight corresponding to the spatial distance; and
determine the priority of the data block according to the access frequency, the first weight, the spatial distance, and the second weight.
In the above scheme, the third processing module is specifically configured to determine that the data block obtains the right to join the cache queue when the priority of the data block is greater than the priority of at least one cache block in the cache queue.
In the above scheme, the third processing module is specifically configured to, for a data block that is requested to be read, determine the number of accesses to the data block within a preset time period and a first time interval between the first access time and the last access time within the preset time period, and determine the access frequency of the data block according to the number of accesses and the first time interval.
In the above scheme, the third processing module is specifically configured to, for a data block that is requested to be written, determine the adjacent data blocks of the data block in the cache queue;
determine the number of accesses to each adjacent data block within a preset time period and a second time interval between the first access time and the last access time within the preset time period;
determine the access frequency of the adjacent data blocks according to the number of accesses and the second time interval; and
determine the access frequency of the data block according to the access frequency of the adjacent data blocks and the spatial distance between the adjacent data blocks and the data block.
In the above scheme, the third processing module is specifically configured to determine a first spatial region corresponding to the data block and a second spatial region corresponding to the reference cache block;
determine the first center point coordinates of the first spatial region and the second center point coordinates of the second spatial region; and
determine the spatial distance according to the first center point coordinates and the second center point coordinates.
In the above scheme, the third processing module is further configured to process the data block according to a preset strategy when it is determined that the data block does not obtain the right to join the cache queue.
The third processing module is specifically configured to delete a data block that is requested to be read, and, for a data block that is requested to be written, write the data block into a Hadoop pre-write queue, the data blocks in the Hadoop pre-write queue waiting to be written to the data node.
An embodiment of the invention provides a spatial data caching device, the device comprising a fourth processing module, a fifth processing module, and a sixth processing module, wherein:
the fourth processing module is configured to determine first data and second data after a cache eviction mechanism is invoked; the first data denotes data whose access frequency is below a first threshold, and the second data denotes the data with the highest access frequency;
the fifth processing module is configured to determine a reference spatial distance between the first data and the second data, the reference spatial distance characterizing the degree of association between the first data and the second data;
the sixth processing module is configured to determine, according to the reference spatial distance, whether to delete the first data, and to execute a delete operation on the first data when it is determined that the first data is to be deleted.
In the above scheme, the fifth processing module is specifically configured to determine a third spatial region corresponding to the first data and a fourth spatial region corresponding to the second data;
determine the third center point coordinates of the third spatial region and the fourth center point coordinates of the fourth spatial region; and
determine the reference spatial distance according to the third center point coordinates and the fourth center point coordinates.
In the above scheme, the sixth processing module is specifically configured to judge whether the reference spatial distance exceeds a preset spatial threshold, and to determine that the first data is to be deleted when the reference spatial distance exceeds the preset spatial threshold.
An embodiment of the invention provides a spatial data caching device, the device comprising a first processor and a first memory storing a computer program runnable on the first processor, wherein the first processor, when running the computer program, executes the steps of any spatial data caching method on the data node side.
An embodiment of the invention provides a spatial data caching device, the device comprising a second processor and a second memory storing a computer program runnable on the second processor, wherein the second processor, when running the computer program, executes the steps of any spatial data caching method on the client side.
An embodiment of the invention further provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps of any spatial data caching method on the data node side are implemented, or the steps of any spatial data caching method on the client side are implemented.
With the spatial data caching method, device, and computer-readable storage medium provided by the embodiments of the present invention, an operation request is received, the operation request being for requesting to read a data block or requesting to write a data block; the data block corresponding to the operation request is determined and added to a cache pre-write queue; the priority of the data block is determined, and the data block is added to the cache queue when it is determined according to its priority that it obtains the right to join the cache queue; the priority is related to the degree of association between the data block and a reference cache block in the cache queue. In the embodiments of the present invention, the cache is managed by combining the spatial-distance correlation between data, which improves the cache hit rate and thereby the efficiency of reads and writes.
Detailed description of the invention
Fig. 1 is a flow diagram of a spatial data caching method provided by an embodiment of the present invention;
Fig. 2 is a structural diagram of a quadtree distributed index structure provided by an embodiment of the present invention;
Fig. 3 is a diagram of sub-region spatial distances provided by an embodiment of the present invention;
Fig. 4 is a flow diagram of another spatial data caching method provided by an embodiment of the present invention;
Fig. 5 is a structural diagram of a Hadoop platform with a two-level cache provided by an embodiment of the present invention;
Fig. 6 is a diagram of the cache structure of a data node provided by an embodiment of the present invention;
Fig. 7 is a diagram of the equal partition of a planar space based on a quadtree index provided by an embodiment of the present invention;
Fig. 8 is a flow diagram of the caching method for reading a data block at a data node provided by an embodiment of the present invention;
Fig. 9 is a flow diagram of the caching method for writing a data block at a data node provided by an embodiment of the present invention;
Fig. 10 is a structural diagram of a Map structure provided by an embodiment of the present invention;
Fig. 11 is a flow diagram of the caching method of a client provided by an embodiment of the present invention;
Fig. 12 is the first structural diagram of a spatial data caching device provided by an embodiment of the present invention;
Fig. 13 is the second structural diagram of a spatial data caching device provided by an embodiment of the present invention;
Fig. 14 is the third structural diagram of a spatial data caching device provided by an embodiment of the present invention;
Fig. 15 is the fourth structural diagram of a spatial data caching device provided by an embodiment of the present invention.
Specific embodiment
In various embodiments of the present invention, an operation request is received, the operation request being for requesting to read a data block or requesting to write a data block; the data block corresponding to the operation request is determined and added to a cache pre-write queue; the priority of the data block is determined, and when it is determined according to the priority of the data block that the data block obtains the right to join the cache queue, the data block is added to the cache queue; the priority is related to the degree of association between the data block and a reference cache block in the cache queue.
The present invention is described in further detail below with reference to the embodiments.
Fig. 1 is a flow diagram of a spatial data caching method provided by an embodiment of the present invention; the method can be applied to a data node. As shown in Fig. 1, the method comprises:
Step 101: receive an operation request; the operation request is for requesting to read a data block or requesting to write a data block.
Step 102: determine the data block corresponding to the operation request, and add the data block to a cache pre-write queue.
Here, the data node may maintain:
a cache pre-write queue, for storing data blocks that may be written into the cache queue;
a cache queue (also called a priority queue), which determines, according to the priority of a data block, whether the data block is stored in the cache queue;
a Hadoop pre-write queue, for storing data blocks that will be written to Hadoop.
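The three queues above can be sketched as a small data structure; this is an illustrative outline only (the class and field names are assumptions, not from the patent), showing how every requested block first lands in the cache pre-write queue:

```python
import heapq
from collections import deque


class DataNodeCache:
    """Illustrative sketch of the three queues described above."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pre_write_queue = deque()   # cache pre-write queue: candidate blocks
        self.cache_queue = []            # priority queue: (priority, block_id) min-heap
        self.hadoop_pre_write = deque()  # blocks waiting to be written to Hadoop

    def receive(self, block_id):
        # Step 102: every block named by an operation request first
        # enters the cache pre-write queue for priority evaluation.
        self.pre_write_queue.append(block_id)
```

A `heapq`-backed list serves as the priority (cache) queue, so the lowest-priority resident block is always at index 0 for the admission check in Step 103.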
Step 103: determine the priority of the data block; when it is determined according to the priority of the data block that the data block obtains the right to join the cache queue, add the data block to the cache queue. The priority is related to the degree of association between the data block and a reference cache block in the cache queue.
The priority of the data block takes into account both the access frequency of the data block and the spatial distance between the data block and the hot-spot data block.
Specifically, determining the priority of the data block comprises:
determining the access frequency of the data block and the spatial distance between the data block and the reference cache block, the reference cache block being the cache block with the highest priority in the cache queue;
obtaining a first weight corresponding to the access frequency and a second weight corresponding to the spatial distance;
determining the priority of the data block according to the access frequency, the first weight, the spatial distance, and the second weight.
Here, the priority is the sum of the product of the access frequency and the first weight and the product of the spatial distance and the second weight. The first weight and the second weight can be set and saved in advance by the maintainers of the Hadoop platform.
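The weighted sum just described is a one-liner; the sketch below follows the formula literally. Note the patent does not state the sign of the second weight — a negative value (an assumption here) would make blocks closer to the reference cache block rank higher:

```python
def block_priority(visit_freq, space_dist, w1, w2):
    """priority = access_frequency * w1 + spatial_distance * w2.

    w1 and w2 are the preconfigured first and second weights; choosing
    w2 < 0 so that nearer blocks get higher priority is an assumption.
    """
    return visit_freq * w1 + space_dist * w2
```

For example, with `w1 = 0.7` and `w2 = -0.3`, a block accessed 10 times at distance 2 gets priority `6.4`.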
Here, a cache block refers to a data block that has been added to the cache queue.
Specifically, the data block may be a data block that is requested to be read or a data block that is requested to be written.
Specifically, for a data block that is requested to be read, determining the access frequency of the data block comprises:
determining the number of accesses to the data block within a preset time period, and a first time interval between the first access time and the last access time within the preset time period;
determining the access frequency of the data block according to the number of accesses and the first time interval.
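One plausible reading of this step is accesses per unit time over the observed interval; the patent only says the frequency is determined "according to" both values, so the exact formula below is an assumption:

```python
def read_access_frequency(access_times, first_interval):
    """Access frequency of a read-requested block (formula assumed).

    access_times: number of accesses within the preset time period.
    first_interval: seconds between the first and last access in that period.
    """
    if first_interval <= 0:
        # A single access (or all accesses at one instant): fall back
        # to the raw count so the block still receives a frequency.
        return float(access_times)
    return access_times / first_interval
```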
Specifically, for a data block that is requested to be written, determining the access frequency of the data block comprises:
determining the adjacent data blocks of the data block in the cache queue;
determining the number of accesses to each adjacent data block within a preset time period, and a second time interval between the first access time and the last access time within the preset time period;
determining the access frequency of the adjacent data blocks according to the number of accesses and the second time interval;
determining the access frequency of the data block according to the access frequency of the adjacent data blocks and the spatial distance between the adjacent data blocks and the data block.
Here, the method further comprises determining the spatial distance between each adjacent data block and the data block, specifically: determining the center point coordinates of the spatial region corresponding to the adjacent data block and the center point coordinates of the spatial region corresponding to the data block, and determining the spatial distance according to the two determined center point coordinates.
Specifically, determining the spatial distance between the data block and the reference cache block comprises:
determining a first spatial region corresponding to the data block and a second spatial region corresponding to the reference cache block;
determining the first center point coordinates of the first spatial region and the second center point coordinates of the second spatial region;
determining the spatial distance according to the first center point coordinates and the second center point coordinates.
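The center-point computation can be sketched directly; Euclidean distance between the two centers is an assumption, as the patent only says the distance is derived from the two center coordinates:

```python
import math


def region_center(xmin, ymin, xmax, ymax):
    """Center point of an axis-aligned spatial region."""
    return ((xmin + xmax) / 2.0, (ymin + ymax) / 2.0)


def space_distance(region_a, region_b):
    """Distance between the center points of two regions (metric assumed)."""
    (ax, ay) = region_center(*region_a)
    (bx, by) = region_center(*region_b)
    return math.hypot(ax - bx, ay - by)
```

For instance, regions `(0, 0, 2, 2)` and `(3, 0, 5, 2)` have centers `(1, 1)` and `(4, 1)`, giving a distance of 3.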
Specifically, determining, according to the priority of the data block, that the data block obtains the right to join the cache queue comprises:
determining that the data block obtains the right to join the cache queue when the priority of the data block is greater than the priority of at least one cache block in the cache queue.
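With a min-heap as the cache queue, "greater than at least one cache block" reduces to comparing against the lowest-priority resident block. A minimal sketch (the eviction-on-admit behavior when the queue is full is an assumption):

```python
import heapq


def try_admit(cache_heap, capacity, block_id, priority):
    """Admit block_id if it earns the right to join the cache queue.

    cache_heap is a min-heap of (priority, block_id), so cache_heap[0]
    is always the lowest-priority cached block.
    """
    if len(cache_heap) < capacity:
        heapq.heappush(cache_heap, (priority, block_id))
        return True
    if priority > cache_heap[0][0]:
        # Higher priority than at least one resident block:
        # replace the lowest-priority block in one operation.
        heapq.heapreplace(cache_heap, (priority, block_id))
        return True
    return False
```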
In this embodiment, the method further comprises: when it is determined that the data block does not obtain the right to join the cache queue, processing the data block according to a preset strategy.
Processing the data block according to the preset strategy comprises:
for a data block that is requested to be read, deleting the data block;
for a data block that is requested to be written, writing the data block into a Hadoop pre-write queue, the data blocks in the Hadoop pre-write queue waiting to be written to the data node.
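The preset strategy is a simple dispatch on the operation type; this sketch uses an assumed dict representation of a block with an `"op"` field:

```python
from collections import deque


def handle_rejected_block(block, hadoop_pre_write_queue):
    """Apply the preset strategy to a block denied cache admission.

    Read-requested blocks are simply dropped (not cached); write-requested
    blocks move to the Hadoop pre-write queue to await writing to the
    data node.
    """
    if block["op"] == "read":
        return None  # deleted: the block is not retained anywhere
    hadoop_pre_write_queue.append(block)  # block["op"] == "write"
    return block
```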
It should be noted that the Hadoop of this embodiment uses a caching mechanism based on a quadtree distributed index; Fig. 2 is a structural diagram of a quadtree distributed index structure provided by an embodiment of the present invention. A quadtree can divide the data space into multiple disjoint sub-regions, so a spatial data set can be partitioned with a quadtree. In this embodiment, data is no longer stored only in the leaf nodes of the index; instead, each node of the index is associated with a data block, and the indexed data is stored in the data block. When new data is to be inserted, the node it should be inserted into is found first, and the data block associated with that node is checked for fullness: if it is full, the node is split into four child nodes according to the quadtree splitting rule, the data is inserted into the corresponding child node, and a data block is requested from Hadoop for storage; otherwise, the data is appended directly to the data block associated with that leaf node. In this way each data block belongs to some partitioned sub-region and therefore carries spatial position information.
Fig. 3 is a diagram of sub-region spatial distances provided by an embodiment of the present invention. As shown in Fig. 3, each node can generate four child nodes; A, B, ..., K in the figure are the center points of the child nodes. The spatial distance between child nodes can be determined from the coordinates of these center points, i.e., the spatial distance between the data blocks associated with the child nodes can be calculated.
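The insert-then-split behavior described above can be sketched as a minimal quadtree, with each node's point list standing in for its associated data block (a simplification; capacity, names, and boundary handling are assumptions):

```python
class QuadNode:
    """Minimal quadtree sketch: each node stands for one data block."""

    def __init__(self, xmin, ymin, xmax, ymax, capacity=4):
        self.bounds = (xmin, ymin, xmax, ymax)
        self.capacity = capacity   # "fullness" threshold of the data block
        self.points = []           # contents of the associated data block
        self.children = None

    def insert(self, x, y):
        if self.children is not None:
            # Route the point to the child whose sub-region contains it
            # (points on the max edge are ignored in this sketch).
            for c in self.children:
                cx0, cy0, cx1, cy1 = c.bounds
                if cx0 <= x < cx1 and cy0 <= y < cy1:
                    return c.insert(x, y)
            return False
        if len(self.points) < self.capacity:
            self.points.append((x, y))
            return True
        # Data block full: split into four children and redistribute.
        self._split()
        for p in self.points + [(x, y)]:
            self.insert(*p)
        self.points = []
        return True

    def _split(self):
        x0, y0, x1, y1 = self.bounds
        mx, my = (x0 + x1) / 2.0, (y0 + y1) / 2.0
        self.children = [QuadNode(x0, y0, mx, my, self.capacity),
                         QuadNode(mx, y0, x1, my, self.capacity),
                         QuadNode(x0, my, mx, y1, self.capacity),
                         QuadNode(mx, my, x1, y1, self.capacity)]
```

Each child's sub-region has a well-defined center point, which is exactly what the spatial-distance computation of Fig. 3 operates on.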
Fig. 4 is the flow diagram of another spatial data caching method provided in an embodiment of the present invention;The method is answered
For the client of Hadoop platform, as shown in Figure 4, which comprises
Step 201 after calling caching eliminative mechanism, determines the first data and the second data;First data characterization is visited
Ask that frequency is lower than the data of first threshold, the highest data of the second data characterization visiting frequency.
Here, the visiting frequency of each data can according in preset time period access times and access times it is corresponding
The time interval of initial-access time and the last access time, which calculate, to be obtained.
Step 202: determine the reference spatial distance between the first data and the second data; the reference spatial distance characterizes the degree of correlation between the first data and the second data.
Specifically, determining the reference spatial distance between the first data and the second data comprises: determining a third spatial region corresponding to the first data and a fourth spatial region corresponding to the second data; determining a third center point coordinate of the third spatial region and a fourth center point coordinate of the fourth spatial region; and determining the reference spatial distance according to the third and fourth center point coordinates.
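The center-point distance used here is a plain Euclidean distance between the two region centers. A minimal helper, with the function name chosen for illustration:

```python
import math

def center_distance(center_a, center_b):
    """Reference spatial distance between two regions, computed from their
    center-point coordinates as described in step 202."""
    (xa, ya), (xb, yb) = center_a, center_b
    return math.hypot(xa - xb, ya - yb)
```

For example, regions centered at (0, 0) and (3, 4) are a reference distance of 5 apart.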
Step 203: determine, according to the reference spatial distance, whether to delete the first data, and when it is determined that the first data should be deleted, execute a delete operation on the first data.
Specifically, determining whether to delete the first data according to the reference spatial distance comprises: judging whether the reference spatial distance exceeds a pre-set spatial threshold, and when it does, determining that the first data should be deleted.
Here, the pre-set spatial threshold is configured in advance and saved by maintenance personnel of the Hadoop platform.
Fig. 5 is a structural schematic diagram of the two-level-cache Hadoop platform provided in this embodiment. As shown in Fig. 5, the Hadoop platform includes a client and data nodes (i.e. DataNode1, DataNode2, …, DataNodeN).
The client is equipped with a buffer queue that can store and look up spatial data. If the requested spatial data is found in the client's buffer queue, the client can respond to the user's query directly, without issuing another distributed query to the Hadoop cluster.
Each data node is equipped with a buffer queue for the data blocks it stores and performs cache management according to the method shown in Fig. 1. When a data node receives a user request to read a data block, it first checks the buffer queue; if the block is found there, it is returned to the client directly, with no disk I/O. When a new data block is to be written to the data node, the same method is used to decide whether the block is written directly to disk or added to the buffer queue.
The two-level cache refers to the client's cache of spatial data and the data nodes' cache of data blocks.
The data node cache operates on data blocks. Fig. 6 is a schematic diagram of the buffer structure of a data node provided in an embodiment of the present invention. As shown in Fig. 6, the data node manages cached reads and writes through the following three queues:
A. The buffer queue (also called the priority queue) decides, according to the priority of a data block (which involves the block's visiting frequency and its spatial-position relevance), whether the block is added to the queue. The buffer queue can be kept constantly full, with cache management carried out by the method above.
B. The cache pre-write queue stores data blocks that may be added to the buffer queue. The blocks it receives come from two sources: blocks accessed at a client's request and new blocks about to be written to Hadoop. A block leaves this queue in one of three ways: it is added to the buffer queue, deleted directly, or written directly to the Hadoop platform.
C. The Hadoop pre-write queue stores data blocks about to be written to Hadoop. Specifically, blocks eliminated from the buffer queue that have not yet been written to Hadoop are placed in this queue to await the write to the Hadoop platform.
The calculation of the priority used by the buffer queue is explained below.
Fig. 7 is a schematic diagram of the quadtree-indexed equal partition of a plane space provided in an embodiment of the present invention. As shown in Fig. 7, assume the spatial region is divided into 10 subregions A, B, C, D, E, F, G, H, I, J, each corresponding to one data block in Hadoop; during operation, some of these blocks need to be added to the cache.
In this embodiment, the priority of a data block considers both the block's visiting frequency and the degree of spatial-position association between blocks.
For the visiting frequency: suppose the visiting frequency of data block A is to be determined, and the number of accesses to A is c. The interval between two adjacent accesses can be expressed as t_i − t_{i−1}, where i denotes the access index. The visiting frequency of block A over a period of time is then c / Σ(t_i − t_{i−1}).
For the spatial distance: the distance between data blocks can be calculated from the center point coordinates of their square regions.
Suppose region B holds the highest-priority data block in the buffer queue, and it must now be decided whether data block A should be added; the priority of block A therefore needs to be calculated. Let the center point of region A be (x_i, y_i) and the center point of region B be (x_j, y_j). The distance between the two points, calculated from these coordinates, represents the positional distance between the data blocks.
Combining this with the visiting frequency of block A gives the priority of block A:
where Σ(t_i − t_{i−1}) denotes the time the data block has existed in the cache, i.e. the interval between the first and the last of the counted accesses, the weight term balances the two factors, and h denotes the maximum number of data blocks the buffer queue can store.
It should be noted that this embodiment adds spatial distance to the priority based on the idea that blocks adjacent to a hotspot block also have a high probability of becoming hotspot blocks. Taking both visiting frequency and spatial-position association into account means that a block with few accesses but close to a hotspot block still receives a relatively high priority, so newly accessed blocks near hotspot blocks are also added to the buffer queue. This raises the cache hit rate for hotspot blocks, reduces disk I/O, and thereby optimizes the read/write performance of the whole Hadoop platform. The calculation formula also shows that as cache capacity grows, the priority value leans more toward spatial distance, which loads more blocks adjacent to hotspot blocks into the cache and further raises the hotspot hit rate.
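The exact priority formula appears as an image in the original and is not reproduced here; the sketch below is therefore only one plausible weighting, assumed for illustration, that matches the behaviour described (a larger capacity h shifts the weight toward spatial distance). Every name and the weight w = 1 − 1/h are assumptions, not the patented formula.

```python
def priority(freq, distance_to_hotspot, h):
    """One plausible priority consistent with the description above: a
    weighted combination of visiting frequency and inverse spatial
    distance, where a larger buffer capacity h shifts the weight toward
    distance. (Assumed form; the original formula is not reproduced.)"""
    w = 1.0 - 1.0 / h            # assumed weight: grows with capacity h
    return (1.0 - w) * freq + w / distance_to_hotspot

# A rarely accessed block near the hotspot outranks one far away:
near = priority(freq=0.1, distance_to_hotspot=2.0, h=10)
far = priority(freq=0.1, distance_to_hotspot=20.0, h=10)
```

Under this assumed form, `near` exceeds `far`, matching the stated goal that blocks close to the hotspot gain priority even with few accesses.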
Cache management on a data node involves two situations. The first arises when a client accesses a data block that is not in the cache: after the block is read from disk, the node must decide whether to place the newly read block in the buffer queue or delete it directly. The detailed process is shown in Fig. 8:
Step 301: according to the client's read request, determine the data block on the data node's disk to be read.
Step 302: after the block has been read and returned to the client, place it in the cache pre-write queue.
Step 303: determine the priority of the data block.
Here, step 303 comprises determining the block's visiting frequency and the spatial distance between the block and the reference cache block, then determining the block's priority from the two. The reference cache block is the highest-priority block in the buffer queue, i.e. the hotspot data block.
Step 304: judge whether the data block can be added to the buffer queue; if so, proceed to step 305, otherwise proceed to step 306.
Here, the block is allowed into the buffer queue when its priority is determined to be higher than that of at least one cache block in the queue.
Step 305: dequeue and delete the lowest-priority cache block from the buffer queue, then add the data block to the buffer queue.
Step 306: dequeue and delete the data block.
Here, the block that was read is dequeued from the cache pre-write queue and deleted.
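Steps 303–306 can be sketched as the following admission decision, with the buffer queue kept as a mapping from block id to priority. All names are illustrative assumptions:

```python
def on_read(block_id, block_priority, buffer_queue, capacity):
    """Admit a freshly read block only if it outranks some cached block;
    otherwise discard it from the cache pre-write queue.
    `buffer_queue` maps block_id -> priority."""
    if len(buffer_queue) < capacity:         # queue not yet full
        buffer_queue[block_id] = block_priority
        return "added"
    lowest = min(buffer_queue, key=buffer_queue.get)
    if block_priority > buffer_queue[lowest]:
        del buffer_queue[lowest]             # step 305: evict lowest priority
        buffer_queue[block_id] = block_priority
        return "added"
    return "dropped"                         # step 306: dequeue and delete
```

Comparing against the lowest-priority cached block is equivalent to the "higher than at least one cache block" test, since beating the minimum is the weakest such condition.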
The second situation arises when a new data block is to be written to the data node's disk: the block is first placed in the cache pre-write queue, and the node judges whether it can be written into the buffer queue; if not, it is written directly to the Hadoop pre-write queue. The detailed process is shown in Fig. 9:
Step 401: receive the client's request to write a data block, and send the data block to the corresponding data node.
Step 402: place the data block to be written in the cache pre-write queue.
Step 403: determine the priority of the data block to be written.
As with a read request, the priority must take both the block's visiting frequency and its spatial-position association into account. Because a newly added block has never been requested, this embodiment takes the visiting frequency of the cached data block spatially nearest to the block to be written, divided by the spatial distance between the two, as the initial visiting frequency of the block to be written.
It should be noted that, because a newly written block has not yet been accessed, its visiting frequency cannot be calculated directly. However, a block with a high visiting frequency corresponds to a hotspot access region, and regions near that region also have a high probability of becoming hotspots. Based on this idea, this embodiment derives the new block's visiting frequency from that of the spatially closest block, so that blocks likely to become hotspot data are written to the cache from the start, improving data access efficiency. Dividing by the spatial distance lets distance be reflected in the new block's priority: blocks far away receive a small priority, which prevents every newly written block from entering the buffer queue on an inflated priority value and degrading cache efficiency.
Step 404: judge whether the data block to be written can be added to the buffer queue; if so, proceed to step 405, otherwise proceed to step 406.
Step 405: according to the cache management strategy, eliminate the lowest-priority cache block in the buffer queue (i.e. delete it directly), and add the block to be written to the buffer queue.
Step 406: write the data block directly to the Hadoop platform through Hadoop's own mechanism.
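The initial-frequency design above (nearest cached block's frequency divided by the distance to it) can be sketched as follows; the function name and the dict-of-centers data shape are assumptions for illustration:

```python
import math

def initial_frequency(new_center, cached_blocks):
    """Initial visiting frequency for a never-accessed block: the frequency
    of the spatially nearest cached block divided by the distance to it,
    so blocks far from any hotspot start with a small priority.
    `cached_blocks` maps region center -> visiting frequency."""
    def dist(center):
        return math.hypot(new_center[0] - center[0],
                          new_center[1] - center[1])
    nearest_center, nearest_freq = min(cached_blocks.items(),
                                       key=lambda kv: dist(kv[0]))
    return nearest_freq / dist(nearest_center)
```

For example, a new block centered at (3, 4) with the nearest cached block at (0, 0) and frequency 0.4 starts at 0.4 / 5 = 0.08.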
The client-side cache policy is explained below.
In this embodiment, a Map structure is designed in the client to store the cached spatial data, as shown in Fig. 10. The space coordinate of the spatial data serves as the key of the Map, and the value is a List structure holding all data with the same space coordinate. For each piece of spatial data, in addition to the data itself, the client also stores its visiting frequency, which is used to decide whether the data should be eliminated from the cache.
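In Python terms, the Map-of-List structure described above might look like the sketch below; the class and field names are illustrative, not from the original.

```python
class ClientCache:
    """Client-side cache: a map from a space coordinate (the key) to the
    list of all spatial data sharing that coordinate (the value), with a
    per-coordinate access count feeding the eviction policy."""
    def __init__(self):
        self.entries = {}   # coord -> {"values": [...], "hits": 0}

    def put(self, coord, value):
        slot = self.entries.setdefault(coord, {"values": [], "hits": 0})
        slot["values"].append(value)

    def get(self, coord):
        slot = self.entries.get(coord)
        if slot is None:
            return None                 # miss: fall through to the cluster
        slot["hits"] += 1               # record the access for eviction
        return slot["values"]
```

On a miss, `get` returns `None` and the client would issue the distributed query to the Hadoop cluster as described for Fig. 5.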
Client-side cache management also considers the degree of spatial-position association between data. Elimination depends not only on the visiting frequency of the data but also on the spatial distance between the data to be eliminated and the hotspot data that is retained: if that spatial distance is small, the data is not eliminated this round, and the decision is made again at the next elimination pass.
The detailed process is shown in Fig. 11; the method comprises:
Step 501: invoke the cache elimination mechanism to start elimination, and determine the cached data considered eligible for elimination as the data to be eliminated.
Here, data with a low visiting frequency is determined to be the data to be eliminated.
Step 502: determine the spatial distance between the data to be eliminated and the hotspot data.
Step 503: compare the spatial distance between the data to be eliminated and the hotspot data, and determine from it whether to delete the data to be eliminated; if deletion is decided, proceed to step 504, otherwise proceed to step 505.
Here, the client is configured with a pre-set spatial threshold, and the spatial distance is compared against it: if the distance is greater than the threshold, the data is considered deletable and step 504 follows; otherwise it is considered not deletable and step 505 follows.
Step 504: execute the elimination operation on the cached data.
Step 505: do not eliminate; reprocess when cache elimination is next initiated.
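Steps 501–505 reduce to a distance-gated eviction, sketched below under stated assumptions: `SPACE_THRESHOLD` stands in for the pre-set spatial threshold, and the cache is a plain dict keyed by coordinate.

```python
import math

SPACE_THRESHOLD = 50.0   # assumed pre-set spatial threshold

def try_evict(candidate_coord, hotspot_coord, cache):
    """Evict the low-frequency candidate only if it is spatially far from
    the hotspot data; otherwise keep it until the next elimination pass."""
    d = math.hypot(candidate_coord[0] - hotspot_coord[0],
                   candidate_coord[1] - hotspot_coord[1])
    if d > SPACE_THRESHOLD:          # steps 503/504: far enough, delete
        cache.pop(candidate_coord, None)
        return True
    return False                      # step 505: keep, re-check next round
```

Returning `False` models step 505: the entry survives this round and is reconsidered the next time elimination is initiated.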
Fig. 12 is a structural schematic diagram of spatial data caching device one provided in an embodiment of the present invention. The device can be applied to a data node. As shown in Fig. 12, the device comprises a first processing module 601, a second processing module 602 and a third processing module 603.
The first processing module 601 is configured to receive an operation request; the operation request is for reading a data block or writing a data block.
The second processing module 602 is configured to determine the data block corresponding to the operation request and add the data block to the cache pre-write queue.
The third processing module 603 is configured to determine the priority of the data block and, when it is determined from the priority that the data block obtains permission to be added to the buffer queue, add the data block to the buffer queue; the priority is related to the degree of association between the data block and a reference cache block in the buffer queue.
Specifically, the third processing module 603 is configured to determine the visiting frequency of the data block and the spatial distance between the data block and the reference cache block, the reference cache block characterizing the highest-priority cache block in the buffer queue; obtain a first weight corresponding to the visiting frequency and a second weight corresponding to the spatial distance; and determine the priority of the data block according to the visiting frequency, the first weight, the spatial distance and the second weight.
Specifically, the third processing module 603 is configured to determine that the data block obtains permission to be added to the buffer queue when the priority of the data block is determined to be greater than the priority of at least one cache block in the buffer queue.
Specifically, for a data block that is requested to be read, the third processing module 603 is configured to determine the number of accesses to the data block within a preset time period and a first time interval between the first access time and the last access time within the preset time period, and to determine the visiting frequency of the data block according to the number of accesses and the first time interval.
Specifically, for a data block that is requested to be written, the third processing module 603 is configured to determine the adjacent data block of the data block in the buffer queue; determine the number of accesses to the adjacent data block within a preset time period and a second time interval between the first access time and the last access time within the preset time period; determine the visiting frequency of the adjacent data block according to the number of accesses and the second time interval; and determine the visiting frequency of the data block according to the visiting frequency of the adjacent data block and the spatial distance between the adjacent data block and the data block.
Specifically, the third processing module 603 is configured to determine a first spatial region corresponding to the data block and a second spatial region corresponding to the reference cache block; determine a first center point coordinate of the first spatial region and a second center point coordinate of the second spatial region; and determine the spatial distance according to the first and second center point coordinates.
Specifically, the third processing module 603 is further configured to handle the data block according to a preset strategy when it is determined that the data block does not obtain permission to be added to the buffer queue: for a data block requested to be read, the module deletes the data block; for a data block requested to be written, the module writes the data block to the Hadoop pre-write queue, where it awaits being written to the data node.
It should be noted that when the spatial data caching device provided in the above embodiment performs spatial data caching, the division into the above program modules is only an example; in practical application, the above processing may be distributed among different program modules as needed, i.e. the internal structure of the device may be divided into different program modules to complete all or part of the processing described above. In addition, the spatial data caching device provided in the above embodiment and the embodiment of the spatial data caching method (specifically, the method on the data node side) belong to the same design; for the specific implementation process, see the method embodiment, which is not repeated here.
Fig. 13 is a structural schematic diagram of spatial data caching device two provided in an embodiment of the present invention. The device can be applied to a client. As shown in Fig. 13, the device comprises a fourth processing module 701, a fifth processing module 702 and a sixth processing module 703.
The fourth processing module 701 determines first data and second data after the cache elimination mechanism is invoked; the first data characterizes data whose visiting frequency is lower than a first threshold, and the second data characterizes the data with the highest visiting frequency.
The fifth processing module 702 is configured to determine the reference spatial distance between the first data and the second data; the reference spatial distance characterizes the degree of correlation between the first data and the second data.
The sixth processing module 703 is configured to determine, according to the reference spatial distance, whether to delete the first data and, when deletion of the first data is determined, execute a delete operation on the first data.
Specifically, the fifth processing module 702 is configured to determine a third spatial region corresponding to the first data and a fourth spatial region corresponding to the second data; determine a third center point coordinate of the third spatial region and a fourth center point coordinate of the fourth spatial region; and determine the reference spatial distance according to the third and fourth center point coordinates.
Specifically, the sixth processing module 703 is configured to judge whether the reference spatial distance exceeds a pre-set spatial threshold and, when it does, determine that the first data should be deleted.
It should be noted that when the spatial data caching device provided in the above embodiment performs spatial data caching, the division into the above program modules is only an example; in practical application, the above processing may be distributed among different program modules as needed, i.e. the internal structure of the device may be divided into different program modules to complete all or part of the processing described above. In addition, the spatial data caching device provided in the above embodiment and the embodiment of the spatial data caching method (specifically, the method on the client side) belong to the same design; for the specific implementation process, see the method embodiment, which is not repeated here.
To implement the method of the embodiments of the present invention, an embodiment of the present invention provides a spatial data caching device arranged on a data node. Specifically, as shown in Fig. 14, the device includes a first processor 801 and a first memory 802 for storing a computer program runnable on the first processor. When running the computer program, the first processor 801 executes: receiving an operation request, the operation request being for reading a data block or writing a data block; determining the data block corresponding to the operation request and adding the data block to the cache pre-write queue; and determining the priority of the data block and, when it is determined from the priority that the data block obtains permission to be added to the buffer queue, adding the data block to the buffer queue, the priority being related to the degree of association between the data block and a reference cache block in the buffer queue.
In one embodiment, when running the computer program, the first processor 801 executes: determining the visiting frequency of the data block and the spatial distance between the data block and the reference cache block, the reference cache block characterizing the highest-priority cache block in the buffer queue; obtaining a first weight corresponding to the visiting frequency and a second weight corresponding to the spatial distance; and determining the priority of the data block according to the visiting frequency, the first weight, the spatial distance and the second weight.
In one embodiment, when running the computer program, the first processor 801 executes: determining that the data block obtains permission to be added to the buffer queue when the priority of the data block is determined to be greater than the priority of at least one cache block in the buffer queue.
In one embodiment, when running the computer program, the first processor 801 executes: determining the number of accesses to the data block within a preset time period and a first time interval between the first access time and the last access time within the preset time period; and determining the visiting frequency of the data block according to the number of accesses and the first time interval.
In one embodiment, when running the computer program, the first processor 801 executes: determining the adjacent data block of the data block in the buffer queue; determining the number of accesses to the adjacent data block within a preset time period and a second time interval between the first access time and the last access time within the preset time period; determining the visiting frequency of the adjacent data block according to the number of accesses and the second time interval; and determining the visiting frequency of the data block according to the visiting frequency of the adjacent data block and the spatial distance between the adjacent data block and the data block.
In one embodiment, when running the computer program, the first processor 801 executes: determining a first spatial region corresponding to the data block and a second spatial region corresponding to the reference cache block; determining a first center point coordinate of the first spatial region and a second center point coordinate of the second spatial region; and determining the spatial distance according to the first and second center point coordinates.
In one embodiment, when running the computer program, the first processor 801 executes: handling the data block according to a preset strategy when it is determined that the data block does not obtain permission to be added to the buffer queue. Handling the data block according to the preset strategy comprises: for a data block requested to be read, deleting the data block; for a data block requested to be written, writing the data block to the Hadoop pre-write queue, where it awaits being written to the data node.
It should be noted that the spatial data caching device provided in the above embodiment and the embodiment of the spatial data caching method belong to the same design; for the specific implementation process, see the method embodiment, which is not repeated here.
Of course, in practical application, as shown in Fig. 14, the device 80 may further include at least one network interface 803. The various components of the spatial data caching device 80 are coupled together through a bus system 804. It can be understood that the bus system 804 is used to realize connection and communication between these components. In addition to the data bus, the bus system 804 also includes a power bus, a control bus and a status signal bus. For clarity of illustration, however, the various buses are all labeled as the bus system 804 in Fig. 14. The number of first processors 801 can be at least one. The network interface 803 is used for wired or wireless communication between the spatial data caching device 80 and other equipment.
The first memory 802 in the embodiment of the present invention is used to store various types of data to support the operation of the spatial data caching device 80.
The methods disclosed in the above embodiments of the present invention can be applied in, or realized by, the first processor 801. The first processor 801 may be an integrated circuit chip with signal processing capability. During realization, each step of the above method can be completed through an integrated logic circuit of hardware in the first processor 801 or instructions in software form. The above first processor 801 can be a general-purpose processor, a digital signal processor (DSP, Digital Signal Processor), another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. The first processor 801 can realize or execute each method, step and logic diagram disclosed in the embodiments of the present invention. A general-purpose processor can be a microprocessor or any conventional processor. The steps of the method disclosed in the embodiments of the present invention can be directly embodied as being executed and completed by a hardware decoding processor, or executed and completed by a combination of hardware and software modules in a decoding processor. A software module can be located in a storage medium; the storage medium is located in the first memory 802, and the first processor 801 reads the information in the first memory 802 and completes the steps of the preceding method in combination with its hardware.
In an exemplary embodiment, the spatial data caching device 80 can be realized by one or more application-specific integrated circuits (ASIC, Application Specific Integrated Circuit), DSPs, programmable logic devices (PLD, Programmable Logic Device), complex programmable logic devices (CPLD, Complex Programmable Logic Device), field-programmable gate arrays (FPGA, Field-Programmable Gate Array), general-purpose processors, controllers, microcontrollers (MCU, Micro Controller Unit), microprocessors (Microprocessor) or other electronic components, for executing the preceding method.
An embodiment of the present invention also provides a spatial data caching device arranged on the client. Specifically, as shown in Fig. 15, the device includes a second processor 901 and a second memory 902 for storing a computer program runnable on the second processor. When running the computer program, the second processor 901 executes: after the cache elimination mechanism is invoked, determining first data and second data, the first data characterizing data whose visiting frequency is lower than a first threshold and the second data characterizing the data with the highest visiting frequency; determining the reference spatial distance between the first data and the second data, the reference spatial distance characterizing the degree of correlation between the first data and the second data; and determining, according to the reference spatial distance, whether to delete the first data and, when deletion of the first data is determined, executing a delete operation on the first data.
In one embodiment, when running the computer program, the second processor 901 executes: determining a third spatial region corresponding to the first data and a fourth spatial region corresponding to the second data; determining a third center point coordinate of the third spatial region and a fourth center point coordinate of the fourth spatial region; and determining the reference spatial distance according to the third and fourth center point coordinates.
In one embodiment, when running the computer program, the second processor 901 executes: judging whether the reference spatial distance exceeds a pre-set spatial threshold and, when it does, determining that the first data should be deleted.
It should be noted that the spatial data caching device provided in the above embodiment and the embodiment of the spatial data caching method belong to the same design; for the specific implementation process, see the method embodiment, which is not repeated here.
Of course, in practical application, as shown in Fig. 15, the device 90 may further include at least one network interface 903. The various components of the spatial data caching device 90 are coupled together through a bus system 904. It can be understood that the bus system 904 is used to realize connection and communication between these components. In addition to the data bus, the bus system 904 also includes a power bus, a control bus and a status signal bus. For clarity of illustration, however, the various buses are all labeled as the bus system 904 in Fig. 15. The number of second processors 901 can be at least one. The network interface 903 is used for wired or wireless communication between the spatial data caching device 90 and other equipment.
The second memory 902 in the embodiment of the present invention is used to store various types of data to support the operation of the spatial data caching device 90.
The methods disclosed in the above embodiments of the present invention can be applied in, or realized by, the second processor 901. The second processor 901 may be an integrated circuit chip with signal processing capability. During realization, each step of the above method can be completed through an integrated logic circuit of hardware in the second processor 901 or instructions in software form. The above second processor 901 can be a general-purpose processor, a DSP, another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. The second processor 901 can realize or execute each method, step and logic diagram disclosed in the embodiments of the present invention. A general-purpose processor can be a microprocessor or any conventional processor. The steps of the method disclosed in the embodiments of the present invention can be directly embodied as being executed and completed by a hardware decoding processor, or executed and completed by a combination of hardware and software modules in a decoding processor. A software module can be located in a storage medium; the storage medium is located in the second memory 902, and the second processor 901 reads the information in the second memory 902 and completes the steps of the preceding method in combination with its hardware.
In an exemplary embodiment, the spatial data caching apparatus 90 may be implemented by one or more ASICs, DSPs, PLDs, CPLDs, FPGAs, general-purpose processors, controllers, MCUs, microprocessors (Microprocessor) or other electronic components, for executing the foregoing methods.
It is understood that the memories in the embodiments of the present invention (such as the first memory 802 and the second memory 902) may be volatile memories or non-volatile memories, and may also include both volatile and non-volatile memories. The non-volatile memory may be a read-only memory (ROM, Read-Only Memory), a programmable read-only memory (PROM, Programmable Read-Only Memory), an erasable programmable read-only memory (EPROM, Erasable Programmable Read-Only Memory), an electrically erasable programmable read-only memory (EEPROM, Electrically Erasable Programmable Read-Only Memory), a ferromagnetic random access memory (FRAM, ferromagnetic random access memory), a flash memory (Flash Memory), a magnetic surface memory, an optical disc, or a compact disc read-only memory (CD-ROM, Compact Disc Read-Only Memory); the magnetic surface memory may be a magnetic disk memory or a magnetic tape memory. The volatile memory may be a random access memory (RAM, Random Access Memory), which serves as an external cache. By way of example and not limitation, many forms of RAM are available, such as a static random access memory (SRAM, Static Random Access Memory), a synchronous static random access memory (SSRAM, Synchronous Static Random Access Memory), a dynamic random access memory (DRAM, Dynamic Random Access Memory), a synchronous dynamic random access memory (SDRAM, Synchronous Dynamic Random Access Memory), a double data rate synchronous dynamic random access memory (DDRSDRAM, Double Data Rate Synchronous Dynamic Random Access Memory), an enhanced synchronous dynamic random access memory (ESDRAM, Enhanced Synchronous Dynamic Random Access Memory), a synclink dynamic random access memory (SLDRAM, SyncLink Dynamic Random Access Memory), and a direct rambus random access memory (DRRAM, Direct Rambus Random Access Memory). The memories described in the embodiments of the present invention are intended to include, but are not limited to, these and any other suitable types of memory.
In an exemplary embodiment, an embodiment of the present invention further provides a computer-readable storage medium, for example the first memory 802 including a computer program; the above computer program can be executed by the first processor 801 of the spatial data caching apparatus 80 to complete the steps described in the foregoing methods.
Specifically, an embodiment of the present invention provides a computer-readable storage medium on which a computer program is stored. When the computer program is run by a processor, it executes: receiving an operation request, the operation request being used to request reading of a data block or to request writing of a data block; determining the data block corresponding to the operation request, and adding the data block to a cache pre-write queue; determining the priority of the data block, and, when it is determined according to the priority of the data block that the data block obtains the permission to join the buffer queue, adding the data block to the buffer queue; the priority is related to the degree of correlation between the data block and a reference cache block in the buffer queue.
In an embodiment, when the computer program is run by the processor, it executes: determining the visiting frequency of the data block and the spatial distance between the data block and the reference cache block, the reference cache block characterizing the cache block with the highest priority in the buffer queue; obtaining a first weight corresponding to the visiting frequency and a second weight corresponding to the spatial distance; and determining the priority of the data block according to the visiting frequency, the first weight, the spatial distance and the second weight.
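Read as a formula, the paragraph above suggests a weighted combination in which a higher visiting frequency raises the priority and a greater spatial distance from the reference cache block lowers it. One plausible instantiation is sketched below; the linear form and the sign of the distance term are assumptions, since the patent does not fix the exact combination:

```python
def block_priority(visit_freq, space_dist, w_freq, w_dist):
    """Weighted priority of a data block: visiting frequency contributes
    positively, spatial distance to the highest-priority (reference) cache
    block contributes negatively. The linear combination is an assumed
    instantiation of the rule described in the text."""
    return w_freq * visit_freq - w_dist * space_dist

# A nearby, frequently accessed block outranks a distant, rarely accessed one:
near_hot = block_priority(visit_freq=10.0, space_dist=1.0, w_freq=0.7, w_dist=0.3)
far_cold = block_priority(visit_freq=2.0, space_dist=8.0, w_freq=0.7, w_dist=0.3)
```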
In an embodiment, when the computer program is run by the processor, it executes: when it is determined that the priority of the data block is greater than the priority of at least one cache block in the buffer queue, determining that the data block obtains the permission to join the buffer queue.
In an embodiment, when the computer program is run by the processor, it executes: determining the number of accesses of the data block within a preset time period, and the first time interval between the first access time and the most recent access time within the preset time period; and determining the visiting frequency of the data block according to the number of accesses and the first time interval.
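A simple reading of this rule is accesses per unit time over the interval between the first and most recent access; the division and the guard for a single access in the sketch below are assumptions, not specified by the text:

```python
def visiting_frequency(access_count, first_access, last_access):
    """Visiting frequency from the access count in the preset period and the
    first time interval (first access to most recent access). The
    count/interval form is an assumed concrete reading of the rule."""
    interval = last_access - first_access
    if interval <= 0:                 # only one access: fall back to raw count
        return float(access_count)
    return access_count / interval

# 12 accesses between t=100.0s and t=104.0s -> 3 accesses per second:
freq = visiting_frequency(access_count=12, first_access=100.0, last_access=104.0)
```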
In an embodiment, when the computer program is run by the processor, it executes: determining the adjacent data blocks of the data block in the buffer queue; determining the number of accesses of the adjacent data blocks within a preset time period, and the second time interval between the first access time and the most recent access time within the preset time period; determining the visiting frequency of the adjacent data blocks according to the number of accesses and the second time interval; and determining the visiting frequency of the data block according to the visiting frequency of the adjacent data blocks and the spatial distance between the adjacent data blocks and the data block.
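A block being written has no access history of its own, so the paragraph above estimates its visiting frequency from its neighbours. The inverse-distance weighting below is an assumed concrete form; the text only requires that neighbour frequency and neighbour distance both enter the estimate:

```python
def estimated_frequency(neighbors):
    """Estimate a new block's visiting frequency from adjacent cached blocks,
    weighting each neighbour's frequency by inverse spatial distance.
    `neighbors` is a list of (frequency, distance) pairs; the weighting
    scheme is an illustrative assumption."""
    num = sum(f / (1.0 + d) for f, d in neighbors)
    den = sum(1.0 / (1.0 + d) for _, d in neighbors)
    return num / den if den else 0.0

# A close, hot neighbour (freq 8 at distance 1) dominates a distant, cold
# one (freq 2 at distance 9):
est = estimated_frequency([(8.0, 1.0), (2.0, 9.0)])
```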
In an embodiment, when the computer program is run by the processor, it executes: determining the first spatial region corresponding to the data block and the second spatial region corresponding to the reference cache block; determining the first center point coordinate of the first spatial region and the second center point coordinate of the second spatial region; and determining the spatial distance according to the first center point coordinate and the second center point coordinate.
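Taking the "center point coordinates" to be the centroids of the two blocks' axis-aligned bounding regions and the distance to be Euclidean (both are assumptions; the text does not name a metric), the computation is:

```python
import math

def region_center(min_x, min_y, max_x, max_y):
    """Center point of an axis-aligned spatial region (bounding box)."""
    return ((min_x + max_x) / 2.0, (min_y + max_y) / 2.0)

def space_distance(region_a, region_b):
    """Distance between the center points of two spatial regions; the
    Euclidean metric is an assumed choice."""
    (ax, ay), (bx, by) = region_center(*region_a), region_center(*region_b)
    return math.hypot(ax - bx, ay - by)

# Regions with centers (1, 1) and (4, 4) -> distance 3 * sqrt(2):
dist = space_distance((0, 0, 2, 2), (3, 4, 5, 4))
```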
In an embodiment, when the computer program is run by the processor, it executes: when it is determined that the data block does not obtain the permission to join the buffer queue, processing the data block according to a preset strategy. Processing the data block according to the preset strategy includes: for a data block requested to be read, deleting the data block; for a data block requested to be written, writing the data block into a Hadoop pre-write queue, where the data blocks in the Hadoop pre-write queue wait to be written to a data
In an exemplary embodiment, an embodiment of the present invention further provides a computer-readable storage medium, for example the second memory 902 including a computer program; the above computer program can be executed by the second processor 901 of the spatial data caching apparatus 90 to complete the steps described in the foregoing methods.
Specifically, an embodiment of the present invention provides a computer-readable storage medium on which a computer program is stored. When the computer program is run by a processor, it executes: after a cache elimination mechanism is invoked, determining first data and second data, the first data characterizing data whose visiting frequency is lower than a first threshold, and the second data characterizing the data with the highest visiting frequency; determining a reference spatial distance between the first data and the second data, the reference spatial distance characterizing the degree of correlation between the first data and the second data; and determining, according to the reference spatial distance, whether to delete the first data, and, when it is determined to delete the first data, executing a delete operation on the first data.
In an embodiment, when the computer program is run by the processor, it executes: determining the third spatial region corresponding to the first data and the fourth spatial region corresponding to the second data; determining the third center point coordinate of the third spatial region and the fourth center point coordinate of the fourth spatial region; and determining the reference spatial distance according to the third center point coordinate and the fourth center point coordinate.
In an embodiment, when the computer program is run by the processor, it executes: judging whether the reference spatial distance exceeds a preset spatial threshold, and, when it is determined that the reference spatial distance exceeds the preset spatial threshold, determining to delete the first data.
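Putting the elimination steps together: among cached data, a low-frequency candidate is deleted only if it is also spatially far from (weakly correlated with) the hottest data. The threshold comparison comes from the text; the selection of candidates by minimum and maximum frequency in this sketch is one natural reading, and all names are illustrative:

```python
def evict_candidate(cache, distance, space_threshold, freq_threshold):
    """Sketch of the cache-elimination rule. `cache` maps key -> visiting
    frequency; `distance` is a callable giving the reference spatial distance
    between two keys. The 'first data' is a block whose frequency is below
    `freq_threshold`; the 'second data' is the hottest block. The first data
    is deleted only when its distance to the second data exceeds
    `space_threshold` (i.e. the two are weakly correlated)."""
    cold = [k for k, f in cache.items() if f < freq_threshold]
    if not cold:
        return None
    first = min(cold, key=lambda k: cache[k])       # least-frequent candidate
    second = max(cache, key=lambda k: cache[k])     # highest-frequency data
    if distance(first, second) > space_threshold:   # weakly correlated: evict
        del cache[first]
        return first
    return None                                     # spatially close: keep it

cache = {"a": 9.0, "b": 0.5, "c": 4.0}
evicted = evict_candidate(cache, lambda x, y: 10.0 if x == "b" else 1.0,
                          space_threshold=5.0, freq_threshold=1.0)
```

Keeping a cold block that is spatially close to hot data is the point of the scheme: spatially correlated blocks are likely to be requested together.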
It should be noted that the computer-readable storage medium provided in the embodiments of the present invention may be a memory such as an FRAM, a ROM, a PROM, an EPROM, an EEPROM, a flash memory, a magnetic surface memory, an optical disc or a CD-ROM; it may also be any device including one of the above memories or any combination thereof.
To implement the methods of the embodiments of the present invention, an embodiment of the present invention further provides a spatial data caching system, the system including a client and at least one data node; wherein
the data node is configured to receive an operation request, the operation request being used to request reading of a data block or to request writing of a data block; determine the data block corresponding to the operation request, and add the data block to a cache pre-write queue; determine the priority of the data block, and, when it is determined according to the priority of the data block that the data block obtains the permission to join the buffer queue, add the data block to the buffer queue; the priority is related to the degree of correlation between the data block and a reference cache block in the buffer queue.
The client determines first data and second data after a cache elimination mechanism is invoked, the first data characterizing data whose visiting frequency is lower than a first threshold, and the second data characterizing the data with the highest visiting frequency; determines a reference spatial distance between the first data and the second data, the reference spatial distance characterizing the degree of correlation between the first data and the second data; and determines, according to the reference spatial distance, whether to delete the first data, and, when it is determined to delete the first data, executes a delete operation on the first data.
It should be noted that the specific processing procedures of the client and the data node have been described in detail above and are not repeated here.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatuses and methods may be implemented in other ways. The apparatus embodiments described above are merely illustrative; for example, the division of the units is only a logical functional division, and there may be other division manners in actual implementation, for example: multiple units or components may be combined, or integrated into another system, or some features may be ignored or not executed. In addition, the coupling, direct coupling or communication connection between the components shown or discussed may be through some interfaces, and the indirect coupling or communication connection of devices or units may be electrical, mechanical or in other forms.
The units described above as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units; they may be located in one place or distributed over multiple network units; some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, the functional units in the embodiments of the present invention may all be integrated into one processing unit, or each unit may individually serve as one unit, or two or more units may be integrated into one unit; the integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional units.
Those of ordinary skill in the art will appreciate that all or part of the steps for implementing the above method embodiments may be completed by hardware related to program instructions; the foregoing program may be stored in a computer-readable storage medium, and when the program is executed, the steps of the above method embodiments are executed; the foregoing storage medium includes various media that can store program code, such as a removable storage device, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), a magnetic disk or an optical disc.
Alternatively, if the above integrated unit of the present invention is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium. Based on this understanding, the technical solutions of the embodiments of the present invention, in essence, or the part contributing to the prior art, may be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device or the like) to execute all or part of the methods of the embodiments of the present invention. The foregoing storage medium includes various media that can store program code, such as a removable storage device, a ROM, a RAM, a magnetic disk or an optical disc.
The foregoing is merely a preferred embodiment of the present invention and is not intended to limit the protection scope of the present invention. Any modifications, equivalent replacements, improvements and the like made within the spirit and principles of the present invention shall be included in the protection scope of the present invention.
Claims (23)
1. A spatial data caching method, characterized in that the method is applied to a data node and comprises:
receiving an operation request, the operation request being used to request reading of a data block or to request writing of a data block;
determining the data block corresponding to the operation request, and adding the data block to a cache pre-write queue;
determining the priority of the data block, and, when it is determined according to the priority of the data block that the data block obtains the permission to join a buffer queue, adding the data block to the buffer queue; the priority being related to the degree of correlation between the data block and a reference cache block in the buffer queue.
2. The method according to claim 1, characterized in that determining the priority of the data block comprises:
determining the visiting frequency of the data block and the spatial distance between the data block and the reference cache block, the reference cache block characterizing the cache block with the highest priority in the buffer queue;
obtaining a first weight corresponding to the visiting frequency and a second weight corresponding to the spatial distance;
determining the priority of the data block according to the visiting frequency, the first weight, the spatial distance and the second weight.
3. The method according to claim 1, characterized in that determining, according to the priority of the data block, that the data block obtains the permission to join the buffer queue comprises:
when it is determined that the priority of the data block is greater than the priority of at least one cache block in the buffer queue, determining that the data block obtains the permission to join the buffer queue.
4. The method according to claim 2, characterized in that, for a data block requested to be read, determining the visiting frequency of the data block comprises:
determining the number of accesses of the data block within a preset time period, and the first time interval between the first access time and the most recent access time within the preset time period;
determining the visiting frequency of the data block according to the number of accesses and the first time interval.
5. The method according to claim 2, characterized in that, for a data block requested to be written, determining the visiting frequency of the data block comprises:
determining the adjacent data blocks of the data block in the buffer queue;
determining the number of accesses of the adjacent data blocks within a preset time period, and the second time interval between the first access time and the most recent access time within the preset time period;
determining the visiting frequency of the adjacent data blocks according to the number of accesses and the second time interval;
determining the visiting frequency of the data block according to the visiting frequency of the adjacent data blocks and the spatial distance between the adjacent data blocks and the data block.
6. The method according to claim 2, characterized in that determining the spatial distance between the data block and the reference cache block comprises:
determining the first spatial region corresponding to the data block and the second spatial region corresponding to the reference cache block;
determining the first center point coordinate of the first spatial region and the second center point coordinate of the second spatial region;
determining the spatial distance according to the first center point coordinate and the second center point coordinate.
7. The method according to claim 1, characterized in that the method further comprises: when it is determined that the data block does not obtain the permission to join the buffer queue, processing the data block according to a preset strategy;
processing the data block according to the preset strategy comprising:
for a data block requested to be read, deleting the data block;
for a data block requested to be written, writing the data block into a Hadoop pre-write queue, the data blocks in the Hadoop pre-write queue waiting to be written to the data node.
8. A spatial data caching method, characterized in that the method is applied to a client and comprises:
after a cache elimination mechanism is invoked, determining first data and second data, the first data characterizing data whose visiting frequency is lower than a first threshold, and the second data characterizing the data with the highest visiting frequency;
determining a reference spatial distance between the first data and the second data, the reference spatial distance characterizing the degree of correlation between the first data and the second data;
determining, according to the reference spatial distance, whether to delete the first data, and, when it is determined to delete the first data, executing a delete operation on the first data.
9. The method according to claim 8, characterized in that determining the reference spatial distance between the first data and the second data comprises:
determining the third spatial region corresponding to the first data and the fourth spatial region corresponding to the second data;
determining the third center point coordinate of the third spatial region and the fourth center point coordinate of the fourth spatial region;
determining the reference spatial distance according to the third center point coordinate and the fourth center point coordinate.
10. The method according to claim 8, characterized in that determining, according to the reference spatial distance, whether to delete the first data comprises:
judging whether the reference spatial distance exceeds a preset spatial threshold, and, when it is determined that the reference spatial distance exceeds the preset spatial threshold, determining to delete the first data.
11. A spatial data caching apparatus, characterized in that the apparatus comprises: a first processing module, a second processing module and a third processing module; wherein
the first processing module is configured to receive an operation request, the operation request being used to request reading of a data block or to request writing of a data block;
the second processing module is configured to determine the data block corresponding to the operation request, and add the data block to a cache pre-write queue;
the third processing module is configured to determine the priority of the data block, and, when it is determined according to the priority of the data block that the data block obtains the permission to join a buffer queue, add the data block to the buffer queue; the priority being related to the degree of correlation between the data block and a reference cache block in the buffer queue.
12. The apparatus according to claim 11, characterized in that the third processing module is configured to determine the visiting frequency of the data block and the spatial distance between the data block and the reference cache block, the reference cache block characterizing the cache block with the highest priority in the buffer queue;
obtain a first weight corresponding to the visiting frequency and a second weight corresponding to the spatial distance;
determine the priority of the data block according to the visiting frequency, the first weight, the spatial distance and the second weight.
13. The apparatus according to claim 11, characterized in that the third processing module is configured to, when it is determined that the priority of the data block is greater than the priority of at least one cache block in the buffer queue, determine that the data block obtains the permission to join the buffer queue.
14. The apparatus according to claim 12, characterized in that the third processing module is specifically configured to, for a data block requested to be read, determine the number of accesses of the data block within a preset time period, and the first time interval between the first access time and the most recent access time within the preset time period;
determine the visiting frequency of the data block according to the number of accesses and the first time interval.
15. The apparatus according to claim 12, characterized in that the third processing module is specifically configured to, for a data block requested to be written, determine the adjacent data blocks of the data block in the buffer queue;
determine the number of accesses of the adjacent data blocks within a preset time period, and the second time interval between the first access time and the most recent access time within the preset time period;
determine the visiting frequency of the adjacent data blocks according to the number of accesses and the second time interval;
determine the visiting frequency of the data block according to the visiting frequency of the adjacent data blocks and the spatial distance between the adjacent data blocks and the data block.
16. The apparatus according to claim 12, characterized in that the third processing module is configured to determine the first spatial region corresponding to the data block and the second spatial region corresponding to the reference cache block;
determine the first center point coordinate of the first spatial region and the second center point coordinate of the second spatial region;
determine the spatial distance according to the first center point coordinate and the second center point coordinate.
17. The apparatus according to claim 11, characterized in that the third processing module is further configured to, when it is determined that the data block does not obtain the permission to join the buffer queue, process the data block according to a preset strategy;
the third processing module being specifically configured to, for a data block requested to be read, delete the data block; and, for a data block requested to be written, write the data block into a Hadoop pre-write queue, the data blocks in the Hadoop pre-write queue waiting to be written to a data node.
18. A spatial data caching apparatus, characterized in that the apparatus comprises: a fourth processing module, a fifth processing module and a sixth processing module; wherein
the fourth processing module is configured to determine first data and second data after a cache elimination mechanism is invoked, the first data characterizing data whose visiting frequency is lower than a first threshold, and the second data characterizing the data with the highest visiting frequency;
the fifth processing module is configured to determine a reference spatial distance between the first data and the second data, the reference spatial distance characterizing the degree of correlation between the first data and the second data;
the sixth processing module is configured to determine, according to the reference spatial distance, whether to delete the first data, and, when it is determined to delete the first data, execute a delete operation on the first data.
19. The apparatus according to claim 18, characterized in that the fifth processing module is configured to determine the third spatial region corresponding to the first data and the fourth spatial region corresponding to the second data;
determine the third center point coordinate of the third spatial region and the fourth center point coordinate of the fourth spatial region;
determine the reference spatial distance according to the third center point coordinate and the fourth center point coordinate.
20. The apparatus according to claim 18, characterized in that the sixth processing module is specifically configured to judge whether the reference spatial distance exceeds a preset spatial threshold, and, when it is determined that the reference spatial distance exceeds the preset spatial threshold, determine to delete the first data.
21. A spatial data caching apparatus, characterized in that the apparatus comprises: a first processor, and a first memory for storing a computer program capable of running on the first processor; wherein
the first processor is configured to, when running the computer program, perform the steps of the method according to any one of claims 1 to 7.
22. A spatial data caching apparatus, characterized in that the apparatus comprises: a second processor, and a second memory for storing a computer program capable of running on the second processor; wherein
the second processor is configured to, when running the computer program, perform the steps of the method according to any one of claims 8 to 10.
23. A computer-readable storage medium on which a computer program is stored, characterized in that, when the computer program is executed by a processor, the steps of the method according to any one of claims 1 to 7 are realized; alternatively, when the computer program is executed by a processor, the steps of the method according to any one of claims 8 to 10 are realized.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811191662.6A CN109446114B (en) | 2018-10-12 | 2018-10-12 | Spatial data caching method and device and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109446114A true CN109446114A (en) | 2019-03-08 |
CN109446114B CN109446114B (en) | 2020-12-18 |
Family
ID=65546420
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811191662.6A Active CN109446114B (en) | 2018-10-12 | 2018-10-12 | Spatial data caching method and device and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109446114B (en) |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103019962A (en) * | 2012-12-21 | 2013-04-03 | 华为技术有限公司 | Data cache processing method, device and system |
CN103092775A (en) * | 2013-01-31 | 2013-05-08 | 武汉大学 | Spatial data double cache method and mechanism based on key value structure |
CN103701886A (en) * | 2013-12-19 | 2014-04-02 | 中国信息安全测评中心 | Hierarchic scheduling method for service and resources in cloud computation environment |
CN103942289A (en) * | 2014-04-12 | 2014-07-23 | 广西师范大学 | Memory caching method oriented to range querying on Hadoop |
CN104217019A (en) * | 2014-09-25 | 2014-12-17 | 中国人民解放军信息工程大学 | Content inquiry method and device based on multiple stages of cache modules |
CN104794064A (en) * | 2015-04-21 | 2015-07-22 | 华中科技大学 | Cache management method based on region heat degree |
CN104809179A (en) * | 2015-04-16 | 2015-07-29 | 华为技术有限公司 | Device and method for accessing Hash table |
US20160202935A1 (en) * | 2015-01-13 | 2016-07-14 | Elastifile Ltd. | Distributed file system with speculative writing |
US20170351620A1 (en) * | 2016-06-07 | 2017-12-07 | Qubole Inc | Caching Framework for Big-Data Engines in the Cloud |
CN107644086A (en) * | 2017-09-25 | 2018-01-30 | 咪咕文化科技有限公司 | Spatial data distribution method |
Non-Patent Citations (1)
Title |
---|
ZHANG, Shaojiang: "Hadoop-based Geospatial Big Data Storage and Query Technology", China Master's Theses Full-text Database, Information Science and Technology Series
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110287007A (en) * | 2019-05-20 | 2019-09-27 | 深圳壹账通智能科技有限公司 | Data call response method, server and computer readable storage medium |
CN113407620A (en) * | 2020-03-17 | 2021-09-17 | 北京信息科技大学 | Data block placement method and system based on heterogeneous Hadoop cluster environment |
CN113407620B (en) * | 2020-03-17 | 2023-04-21 | 北京信息科技大学 | Data block placement method and system based on heterogeneous Hadoop cluster environment |
CN112035498A (en) * | 2020-08-31 | 2020-12-04 | 北京奇艺世纪科技有限公司 | Data block scheduling method and device, scheduling layer node and storage layer node |
CN112035498B (en) * | 2020-08-31 | 2023-09-05 | 北京奇艺世纪科技有限公司 | Data block scheduling method and device, scheduling layer node and storage layer node |
CN112260952A (en) * | 2020-10-20 | 2021-01-22 | 四川天邑康和通信股份有限公司 | Wifi6 router rapid data access protection method |
CN113742095A (en) * | 2021-01-14 | 2021-12-03 | 北京沃东天骏信息技术有限公司 | Cache data processing method and device, electronic equipment and storage medium |
CN116383258A (en) * | 2023-05-23 | 2023-07-04 | 菏泽全胜建筑装饰工程有限公司 | Building construction data management method and system based on BIM |
CN116383258B (en) * | 2023-05-23 | 2023-08-11 | 菏泽全胜建筑装饰工程有限公司 | Building construction data management method and system based on BIM |
CN117170590A (en) * | 2023-11-03 | 2023-12-05 | 沈阳卓志创芯科技有限公司 | Computer data storage method and system based on cloud computing |
CN117170590B (en) * | 2023-11-03 | 2024-01-26 | 沈阳卓志创芯科技有限公司 | Computer data storage method and system based on cloud computing |
Also Published As
Publication number | Publication date |
---|---|
CN109446114B (en) | 2020-12-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109446114A (en) | Spatial data caching method and device and storage medium | |
CN100476742C (en) | Load balancing method based on object storage device | |
CN106331153B (en) | A kind of filter method of service request, apparatus and system | |
CN107247675B (en) | A kind of caching selection method and system based on classification prediction | |
CN107003814A (en) | Effective metadata in storage system | |
US10120810B2 (en) | Implementing selective cache injection | |
CN106528451B (en) | The cloud storage frame and construction method prefetched for the L2 cache of small documents | |
US11914894B2 (en) | Using scheduling tags in host compute commands to manage host compute task execution by a storage device in a storage system | |
US10956322B2 (en) | Storage drive dependent track removal in a cache for storage | |
US10831662B1 (en) | Systems and methods for maintaining cache coherency | |
CN104158863A (en) | Cloud storage mechanism based on transaction-level whole-course high-speed buffer | |
CN106991059A (en) | To the access control method of data source | |
CN116560562A (en) | Method and device for reading and writing data | |
CN107133183A (en) | A kind of cache data access method and system based on TCMU Virtual Block Devices | |
US8539135B2 (en) | Route lookup method for reducing overall connection latencies in SAS expanders | |
CN109144431A (en) | Caching method, device, equipment and the storage medium of data block | |
CN109582233A (en) | A kind of caching method and device of data | |
US10686906B2 (en) | Methods for managing multi-level flash storage and devices thereof | |
CN115509437A (en) | Storage system, network card, processor, data access method, device and system | |
CN110209343B (en) | Data storage method, device, server and storage medium | |
CN115794366A (en) | Memory prefetching method and device | |
CN114207602A (en) | Reducing requests using probabilistic data structures | |
CN111859225A (en) | Program file access method, device, computing equipment and medium | |
CN111880900A (en) | Design method of near data processing system for super fusion equipment | |
CN111880739A (en) | Near data processing system for super fusion equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||