CN109446114A - Spatial data caching method and device and storage medium - Google Patents


Info

Publication number
CN109446114A
Authority
CN
China
Prior art keywords
data
data block
block
space
priority
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811191662.6A
Other languages
Chinese (zh)
Other versions
CN109446114B (en)
Inventor
李宗祥
严国友
孙波
孙一波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Migu Cultural Technology Co Ltd
China Mobile Communications Group Co Ltd
Original Assignee
Migu Cultural Technology Co Ltd
China Mobile Communications Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Migu Cultural Technology Co Ltd, China Mobile Communications Group Co Ltd
Priority to CN201811191662.6A
Publication of CN109446114A
Application granted
Publication of CN109446114B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0893 Caches characterised by their organisation or structure
    • G06F12/0895 Caches characterised by their organisation or structure of parts of caches, e.g. directory or tag array
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00 Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/10 Providing a specific technical effect
    • G06F2212/1016 Performance improvement
    • G06F2212/1021 Hit rate improvement

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a spatial data caching method, which comprises the following steps: receiving an operation request; the operation request is used for requesting to read a data block or requesting to write the data block; determining a data block corresponding to the operation request, and adding the data block into a cache pre-write queue; determining the priority of the data block, and adding the data block into a cache queue when determining that the data block obtains the right of adding into the cache queue according to the priority of the data block; the priority is related to a degree of association of the data block with a reference cache block in the cache queue. The invention also discloses a spatial data caching device and a computer readable storage medium.

Description

Spatial data caching method, device and storage medium
Technical field
The present invention relates to big data storage technologies, and in particular to a spatial data caching method, a device, and a computer-readable storage medium.
Background
As big data processing technology matures, Hadoop has become a popular tool for big data processing. Hadoop stores data in a distributed manner across a cluster of data nodes (DataNodes); when running a MapReduce task, it decides which machine each task is assigned to according to the distribution of the data, achieving efficient distributed computing. Hadoop itself is an input/output (I/O) intensive data processing framework, so I/O efficiency is critical to its performance. Caching can effectively reduce the number of I/O operations, improve I/O performance, and optimize the performance of a Hadoop cluster.
Hadoop itself provides a caching mechanism that lets users configure which files to cache. Once a file is set as a file to be cached, the DataNode adds the data blocks belonging to that file to off-heap memory for caching, which improves access efficiency for that file.
Hadoop can also use an index-based caching mechanism, for example a quadtree-based distributed spatial data index. It partitions the spatial area with a quadtree; each leaf node of the quadtree stores a certain amount of spatial data, multiple leaf nodes are stored in one data block, and the data blocks are distributed across the DataNode cluster by Hadoop's own distributed storage mechanism. A two-level cache is designed on top of this index structure: a. The first level is the cache on the DataNode side, which caches data blocks; the cached data is managed with the least recently used (LRU) algorithm, and cached data is evicted according to how frequently users access the data blocks. b. The second level is the client cache, which is built for quadtree leaf nodes: the leaf nodes that the client accesses frequently are cached, and this cache is also managed directly with the LRU algorithm.
The cache granularity of Hadoop itself generally reaches only the file level, and its caching mechanism considers only the data block (Block) itself. However, the data that Hadoop processes in real application systems is often correlated, for example microblog data containing spatial location information, or the running-track data of Migu Shanpao users. If only Hadoop's own caching mechanism is used, then when a user accesses an adjacent area that is not in memory, the corresponding data has to be read from disk, which increases disk I/O, may lower the cache hit rate, and reduces system efficiency.
Summary of the invention
In view of this, the main object of the present invention is to provide a spatial data caching method, a device, and a computer-readable storage medium.
To achieve the above object, the technical solution of the present invention is implemented as follows:
An embodiment of the present invention provides a spatial data caching method applied to a data node, the method comprising:
Receiving an operation request, where the operation request requests to read a data block or to write a data block;
Determining the data block corresponding to the operation request, and adding the data block to a cache pre-write queue;
Determining the priority of the data block, and, when it is determined according to the priority of the data block that the data block has obtained the right to join a cache queue, adding the data block to the cache queue; the priority is related to the degree of association between the data block and a reference cache block in the cache queue.
In the above solution, determining the priority of the data block comprises:
Determining the access frequency of the data block and the spatial distance between the data block and the reference cache block, where the reference cache block is the cache block with the highest priority in the cache queue;
Obtaining a first weight corresponding to the access frequency and a second weight corresponding to the spatial distance;
Determining the priority of the data block according to the access frequency, the first weight, the spatial distance, and the second weight.
In the above solution, determining according to the priority of the data block that the data block has obtained the right to join the cache queue comprises:
When the priority of the data block is determined to be greater than the priority of at least one cache block in the cache queue, determining that the data block has obtained the right to join the cache queue.
In the above solution, for a data block requested to be read, determining the access frequency of the data block comprises:
Determining the number of accesses to the data block within a preset time period, and a first time interval between the first access time and the most recent access time within the preset time period;
Determining the access frequency of the data block according to the number of accesses and the first time interval.
In the above solution, for a data block requested to be written, determining the access frequency of the data block comprises:
Determining an adjacent data block of the data block in the cache queue;
Determining the number of accesses to the adjacent data block within a preset time period, and a second time interval between the first access time and the most recent access time within the preset time period;
Determining the access frequency of the adjacent data block according to the number of accesses and the second time interval;
Determining the access frequency of the data block according to the access frequency of the adjacent data block and the spatial distance between the adjacent data block and the data block.
In the above solution, determining the spatial distance between the data block and the reference cache block comprises:
Determining a first spatial region corresponding to the data block and a second spatial region corresponding to the reference cache block;
Determining a first center point coordinate of the first spatial region and a second center point coordinate of the second spatial region;
Determining the spatial distance according to the first center point coordinate and the second center point coordinate.
In the above solution, the method further comprises: when it is determined that the data block has not obtained the right to join the cache queue, processing the data block according to a preset policy;
Processing the data block according to the preset policy comprises:
For a data block requested to be read, deleting the data block;
For a data block requested to be written, writing the data block to a Hadoop pre-write queue, where data blocks in the Hadoop pre-write queue wait to be written to the data node.
An embodiment of the present invention provides a spatial data caching method applied to a client, the method comprising:
After a cache eviction mechanism is invoked, determining first data and second data, where the first data is data whose access frequency is lower than a first threshold and the second data is the data with the highest access frequency;
Determining a reference spatial distance between the first data and the second data, where the reference spatial distance characterizes the degree of association between the first data and the second data;
Determining, according to the reference spatial distance, whether to delete the first data, and, when it is determined to delete the first data, performing a delete operation on the first data.
In the above solution, determining the reference spatial distance between the first data and the second data comprises:
Determining a third spatial region corresponding to the first data and a fourth spatial region corresponding to the second data;
Determining a third center point coordinate of the third spatial region and a fourth center point coordinate of the fourth spatial region;
Determining the reference spatial distance according to the third center point coordinate and the fourth center point coordinate.
In the above solution, determining according to the reference spatial distance whether to delete the first data comprises:
Judging whether the reference spatial distance exceeds a preset spatial threshold, and, when the reference spatial distance is determined to exceed the preset spatial threshold, determining to delete the first data.
An embodiment of the present invention provides a spatial data caching device, the device comprising a first processing module, a second processing module, and a third processing module, wherein:
The first processing module is configured to receive an operation request, where the operation request requests to read a data block or to write a data block;
The second processing module is configured to determine the data block corresponding to the operation request and add the data block to a cache pre-write queue;
The third processing module is configured to determine the priority of the data block and, when it is determined according to the priority of the data block that the data block has obtained the right to join a cache queue, add the data block to the cache queue; the priority is related to the degree of association between the data block and a reference cache block in the cache queue.
In the above solution, the third processing module is specifically configured to determine the access frequency of the data block and the spatial distance between the data block and the reference cache block, where the reference cache block is the cache block with the highest priority in the cache queue;
Obtain a first weight corresponding to the access frequency and a second weight corresponding to the spatial distance;
Determine the priority of the data block according to the access frequency, the first weight, the spatial distance, and the second weight.
In the above solution, the third processing module is specifically configured to determine that the data block has obtained the right to join the cache queue when the priority of the data block is determined to be greater than the priority of at least one cache block in the cache queue.
In the above solution, the third processing module is specifically configured to, for a data block requested to be read, determine the number of accesses to the data block within a preset time period and a first time interval between the first access time and the most recent access time within the preset time period;
Determine the access frequency of the data block according to the number of accesses and the first time interval.
In the above solution, the third processing module is specifically configured to, for a data block requested to be written, determine an adjacent data block of the data block in the cache queue;
Determine the number of accesses to the adjacent data block within a preset time period, and a second time interval between the first access time and the most recent access time within the preset time period;
Determine the access frequency of the adjacent data block according to the number of accesses and the second time interval;
Determine the access frequency of the data block according to the access frequency of the adjacent data block and the spatial distance between the adjacent data block and the data block.
In the above solution, the third processing module is specifically configured to determine a first spatial region corresponding to the data block and a second spatial region corresponding to the reference cache block;
Determine a first center point coordinate of the first spatial region and a second center point coordinate of the second spatial region;
Determine the spatial distance according to the first center point coordinate and the second center point coordinate.
In the above solution, the third processing module is further configured to process the data block according to a preset policy when it is determined that the data block has not obtained the right to join the cache queue;
The third processing module is specifically configured to delete the data block if it is a data block requested to be read, and, if it is a data block requested to be written, to write the data block to a Hadoop pre-write queue, where data blocks in the Hadoop pre-write queue wait to be written to the data node.
An embodiment of the present invention provides a spatial data caching device, the device comprising a fourth processing module, a fifth processing module, and a sixth processing module, wherein:
The fourth processing module is configured to determine first data and second data after a cache eviction mechanism is invoked, where the first data is data whose access frequency is lower than a first threshold and the second data is the data with the highest access frequency;
The fifth processing module is configured to determine a reference spatial distance between the first data and the second data, where the reference spatial distance characterizes the degree of association between the first data and the second data;
The sixth processing module is configured to determine, according to the reference spatial distance, whether to delete the first data, and, when it is determined to delete the first data, to perform a delete operation on the first data.
In the above solution, the fifth processing module is specifically configured to determine a third spatial region corresponding to the first data and a fourth spatial region corresponding to the second data;
Determine a third center point coordinate of the third spatial region and a fourth center point coordinate of the fourth spatial region;
Determine the reference spatial distance according to the third center point coordinate and the fourth center point coordinate.
In the above solution, the sixth processing module is specifically configured to judge whether the reference spatial distance exceeds a preset spatial threshold and, when the reference spatial distance is determined to exceed the preset spatial threshold, to determine to delete the first data.
An embodiment of the present invention provides a spatial data caching device, the device comprising a first processor and a first memory for storing a computer program that can run on the first processor, wherein:
The first processor is configured to execute, when running the computer program, the steps of any one of the spatial data caching methods on the data node side.
An embodiment of the present invention provides a spatial data caching device, the device comprising a second processor and a second memory for storing a computer program that can run on the second processor, wherein:
The second processor is configured to execute, when running the computer program, the steps of any one of the spatial data caching methods on the client side.
An embodiment of the present invention further provides a computer-readable storage medium on which a computer program is stored. When executed by a processor, the computer program implements the steps of any one of the spatial data caching methods on the data node side, or the steps of any one of the spatial data caching methods on the client side.
With the spatial data caching method, device, and computer-readable storage medium provided by the embodiments of the present invention, an operation request is received, where the operation request requests to read a data block or to write a data block; the data block corresponding to the operation request is determined and added to a cache pre-write queue; the priority of the data block is determined, and, when it is determined according to the priority that the data block has obtained the right to join a cache queue, the data block is added to the cache queue; the priority is related to the degree of association between the data block and a reference cache block in the cache queue. In the embodiments of the present invention, the cache is managed in combination with the spatial-distance association between data, which improves the cache hit rate and thereby the efficiency of reads and writes.
Brief description of the drawings
Fig. 1 is a schematic flowchart of a spatial data caching method provided by an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of a quadtree distributed index structure provided by an embodiment of the present invention;
Fig. 3 is a schematic diagram of the spatial distance between subregions provided by an embodiment of the present invention;
Fig. 4 is a schematic flowchart of another spatial data caching method provided by an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a Hadoop platform with a two-level cache provided by an embodiment of the present invention;
Fig. 6 is a schematic diagram of the cache structure of a data node provided by an embodiment of the present invention;
Fig. 7 is a schematic diagram of the equal division of plane space based on a quadtree index provided by an embodiment of the present invention;
Fig. 8 is a schematic flowchart of the caching method when a data node reads a data block, provided by an embodiment of the present invention;
Fig. 9 is a schematic flowchart of the caching method when a data node writes a data block, provided by an embodiment of the present invention;
Fig. 10 is a schematic structural diagram of a Map structure provided by an embodiment of the present invention;
Fig. 11 is a schematic flowchart of the caching method of a client provided by an embodiment of the present invention;
Fig. 12 is a schematic structural diagram of a first spatial data caching device provided by an embodiment of the present invention;
Fig. 13 is a schematic structural diagram of a second spatial data caching device provided by an embodiment of the present invention;
Fig. 14 is a schematic structural diagram of a third spatial data caching device provided by an embodiment of the present invention;
Fig. 15 is a schematic structural diagram of a fourth spatial data caching device provided by an embodiment of the present invention.
Detailed description of the embodiments
In the various embodiments of the present invention, an operation request is received, where the operation request requests to read a data block or to write a data block; the data block corresponding to the operation request is determined and added to a cache pre-write queue; the priority of the data block is determined, and, when it is determined according to the priority that the data block has obtained the right to join a cache queue, the data block is added to the cache queue; the priority is related to the degree of association between the data block and a reference cache block in the cache queue.
The present invention is described in further detail below with reference to embodiments.
Fig. 1 is a schematic flowchart of a spatial data caching method provided by an embodiment of the present invention. The method can be applied to a data node. As shown in Fig. 1, the method comprises:
Step 101: receive an operation request; the operation request requests to read a data block or to write a data block.
Step 102: determine the data block corresponding to the operation request, and add the data block to the cache pre-write queue.
Here, the data node may include:
A cache pre-write queue, which stores data blocks that may be written to the cache queue.
A cache queue (also called a priority queue), which decides according to the priority of a data block whether to store the data block in the cache queue.
A Hadoop pre-write queue, which stores data blocks that are about to be written to Hadoop.
Step 103: determine the priority of the data block, and, when it is determined according to the priority of the data block that the data block has obtained the right to join the cache queue, add the data block to the cache queue; the priority is related to the degree of association between the data block and the reference cache block in the cache queue.
The priority of the data block takes into account the access frequency of the data block and the spatial distance between the data block and the hot-spot data block.
Specifically, determining the priority of the data block comprises:
Determining the access frequency of the data block and the spatial distance between the data block and the reference cache block, where the reference cache block is the cache block with the highest priority in the cache queue;
Obtaining a first weight corresponding to the access frequency and a second weight corresponding to the spatial distance;
Determining the priority of the data block according to the access frequency, the first weight, the spatial distance, and the second weight.
Here, the priority is the sum of two products: the access frequency multiplied by the first weight, and the spatial distance multiplied by the second weight. The first weight and the second weight can be set and saved in advance by the maintenance personnel of the Hadoop platform.
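The weighted sum described above can be sketched as follows. This is only an illustration: the function and parameter names are hypothetical, and the text does not state the sign or scale of the weights. The example call assumes a negative second weight so that blocks closer to the reference cache block rank higher, which is an assumption rather than something the text specifies.

```python
def block_priority(access_frequency: float, spatial_distance: float,
                   first_weight: float, second_weight: float) -> float:
    """Priority = access_frequency * first_weight + spatial_distance * second_weight."""
    return access_frequency * first_weight + spatial_distance * second_weight


# Hypothetical weights: frequency counts positively, distance counts negatively.
example_priority = block_priority(access_frequency=4.0, spatial_distance=2.0,
                                  first_weight=0.5, second_weight=-0.25)
print(example_priority)
```

Keeping the weights as plain parameters mirrors the statement that they are configured in advance by the platform's maintenance personnel.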
Here, a cache block is a data block that has been added to the cache queue.
Specifically, the data block may be a data block requested to be read or a data block requested to be written.
Specifically, for a data block requested to be read, determining the access frequency of the data block comprises:
Determining the number of accesses to the data block within a preset time period, and a first time interval between the first access time and the most recent access time within the preset time period;
Determining the access frequency of the data block according to the number of accesses and the first time interval.
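The text names the two inputs (the number of accesses and the first-to-last access interval) but not the formula that combines them. A minimal sketch, assuming the frequency is simply the access count divided by the interval; the function name and the handling of a zero-length interval are both assumptions:

```python
def access_frequency(access_count: int, first_access: float, last_access: float) -> float:
    """Accesses per unit time between the first and most recent access.

    Assumption: frequency = count / interval. The source only says the
    frequency is determined "according to" these two quantities.
    """
    if access_count <= 0:
        return 0.0
    interval = last_access - first_access
    if interval <= 0:
        # Single access (or identical timestamps): fall back to the raw count.
        return float(access_count)
    return access_count / interval


print(access_frequency(10, first_access=100.0, last_access=105.0))
```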
Specifically, for a data block requested to be written, determining the access frequency of the data block comprises:
Determining an adjacent data block of the data block in the cache queue;
Determining the number of accesses to the adjacent data block within a preset time period, and a second time interval between the first access time and the most recent access time within the preset time period;
Determining the access frequency of the adjacent data block according to the number of accesses and the second time interval;
Determining the access frequency of the data block according to the access frequency of the adjacent data block and the spatial distance between the adjacent data block and the data block.
Here, the method further comprises determining the spatial distance between the adjacent data block and the data block, which specifically comprises: determining the center point coordinate of the spatial region corresponding to the adjacent data block and the center point coordinate of the spatial region corresponding to the data block, and determining the spatial distance according to the two center point coordinates.
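A newly written block has no access history of its own, so its frequency is derived from a cached neighbor. The text does not give the formula. The sketch below assumes a simple inverse-distance discount, so that a closer neighbor passes on more of its frequency; the function name and the `1 + distance` form are hypothetical choices, not the patent's stated method.

```python
def estimate_write_frequency(neighbor_frequency: float, distance_to_neighbor: float) -> float:
    """Estimate the frequency of a block being written from one adjacent cached block.

    Assumption: discount the neighbor's frequency by (1 + distance), so a
    neighbor at distance 0 passes its frequency on unchanged.
    """
    if distance_to_neighbor < 0:
        raise ValueError("distance must be non-negative")
    return neighbor_frequency / (1.0 + distance_to_neighbor)


print(estimate_write_frequency(neighbor_frequency=6.0, distance_to_neighbor=2.0))
```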
Specifically, determining the spatial distance between the data block and the reference cache block comprises:
Determining a first spatial region corresponding to the data block and a second spatial region corresponding to the reference cache block;
Determining a first center point coordinate of the first spatial region and a second center point coordinate of the second spatial region;
Determining the spatial distance according to the first center point coordinate and the second center point coordinate.
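The center-point computation can be sketched as below, assuming each spatial region is an axis-aligned rectangle and the distance is Euclidean. Neither detail is stated in the text, and the `Region` type is hypothetical.

```python
import math
from typing import NamedTuple, Tuple


class Region(NamedTuple):
    """Axis-aligned rectangle covered by a data block (assumed representation)."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float


def center(region: Region) -> Tuple[float, float]:
    """Center point coordinate of a region."""
    return ((region.min_x + region.max_x) / 2.0,
            (region.min_y + region.max_y) / 2.0)


def spatial_distance(a: Region, b: Region) -> float:
    """Euclidean distance between the center points of two regions."""
    (ax, ay), (bx, by) = center(a), center(b)
    return math.hypot(ax - bx, ay - by)


block_region = Region(0.0, 0.0, 2.0, 2.0)      # center (1, 1)
reference_region = Region(3.0, 3.0, 5.0, 7.0)  # center (4, 5)
print(spatial_distance(block_region, reference_region))
```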
Specifically, determining according to the priority of the data block that the data block has obtained the right to join the cache queue comprises:
When the priority of the data block is determined to be greater than the priority of at least one cache block in the cache queue, determining that the data block has obtained the right to join the cache queue.
In this embodiment, the method further comprises: when it is determined that the data block has not obtained the right to join the cache queue, processing the data block according to a preset policy.
Processing the data block according to the preset policy comprises:
For a data block requested to be read, deleting the data block;
For a data block requested to be written, writing the data block to the Hadoop pre-write queue, where data blocks in the Hadoop pre-write queue wait to be written to the data node.
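The admission rule and the fallback policy can be sketched together. The cache queue is modeled here as a min-heap of `(priority, block_id)` pairs, so "greater than at least one cache block" reduces to a comparison with the heap's smallest entry. Evicting that lowest-priority block on admission is an assumption (the text only says a block with higher priority gains the right to join), and all names are hypothetical.

```python
import heapq
from typing import List, Tuple


def try_admit(cache_heap: List[Tuple[float, str]], block_id: str, priority: float,
              is_read: bool, hadoop_prewrite_queue: List[str]) -> str:
    """Admit a block to a full, non-empty cache queue or apply the preset policy.

    cache_heap must already be a valid min-heap ordered by priority.
    """
    if cache_heap and priority > cache_heap[0][0]:
        # Assumed behavior: the new block replaces the lowest-priority cache block.
        heapq.heapreplace(cache_heap, (priority, block_id))
        return "cached"
    if is_read:
        return "dropped"  # read blocks that lose are simply deleted
    hadoop_prewrite_queue.append(block_id)  # written blocks wait to be written to the data node
    return "queued_for_hadoop"


cache = [(1.0, "a"), (3.0, "b")]
heapq.heapify(cache)
pending_writes: List[str] = []
print(try_admit(cache, "c", 2.0, True, pending_writes))
print(try_admit(cache, "d", 0.5, False, pending_writes))
```

A min-heap keeps the comparison against the weakest cached block at constant cost, which fits a queue that is kept full.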
It should be noted that Hadoop in this embodiment uses a caching mechanism based on a quadtree distributed index. Fig. 2 is a schematic structural diagram of a quadtree distributed index structure provided by an embodiment of the present invention. A quadtree can divide the data space into multiple disjoint subregions, so a spatial data set can be partitioned with a quadtree. In this embodiment, data is no longer stored only in the leaf nodes of the index; instead, each node of the index is associated with a data block, and the data of the index is stored in that data block. When new data is inserted, the node into which the new data is to be inserted is found first, and the data block associated with that node is checked to see whether it is full. If it is full, the node is split into four child nodes according to the quadtree rule, the data is inserted into the corresponding child node, and a data block is requested from Hadoop to store it; otherwise, the data is appended directly to the data block associated with that leaf node. In this way each data block belongs to a particular partitioned subregion and therefore carries spatial location information.
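The insertion rule can be sketched with an in-memory quadtree in which every node keeps its own block, as the paragraph describes. Requesting a real data block from Hadoop is replaced by a plain Python list, and the class name, the `capacity` parameter, and the quadrant ordering are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Point = Tuple[float, float]


@dataclass
class QuadNode:
    min_x: float
    min_y: float
    max_x: float
    max_y: float
    capacity: int = 4                                 # assumed block capacity
    block: List[Point] = field(default_factory=list)  # stand-in for the node's data block
    children: Optional[List["QuadNode"]] = None

    def _mid(self) -> Point:
        return ((self.min_x + self.max_x) / 2.0, (self.min_y + self.max_y) / 2.0)

    def _child_for(self, p: Point) -> "QuadNode":
        cx, cy = self._mid()
        index = (1 if p[0] >= cx else 0) + (2 if p[1] >= cy else 0)
        return self.children[index]

    def _split(self) -> None:
        cx, cy = self._mid()
        self.children = [
            QuadNode(self.min_x, self.min_y, cx, cy, self.capacity),  # 0: low x, low y
            QuadNode(cx, self.min_y, self.max_x, cy, self.capacity),  # 1: high x, low y
            QuadNode(self.min_x, cy, cx, self.max_y, self.capacity),  # 2: low x, high y
            QuadNode(cx, cy, self.max_x, self.max_y, self.capacity),  # 3: high x, high y
        ]

    def insert(self, p: Point) -> None:
        node = self
        while node.children is not None:      # find the node the point belongs to
            node = node._child_for(p)
        if len(node.block) >= node.capacity:  # block full: split, then use the child
            node._split()
            node = node._child_for(p)
        node.block.append(p)                  # the parent's block keeps its data


root = QuadNode(0.0, 0.0, 8.0, 8.0, capacity=2)
for point in [(1.0, 1.0), (2.0, 2.0), (6.0, 6.0), (1.0, 6.0)]:
    root.insert(point)
```

Because interior nodes keep their blocks after a split, every block stays tied to the bounding box of exactly one node, which is what gives each block its spatial location.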
Fig. 3 is a schematic diagram of the spatial distance between subregions provided by an embodiment of the present invention. As shown in Fig. 3, each node can generate four child nodes; A, B, ..., K in the figure are the center points of the child nodes. The spatial distance between child nodes can be determined from the coordinates of their center points, that is, the spatial distance between the data blocks associated with the child nodes can be calculated.
Fig. 4 is a schematic flowchart of another spatial data caching method provided by an embodiment of the present invention. The method is applied to a client of the Hadoop platform. As shown in Fig. 4, the method comprises:
Step 201: after the cache eviction mechanism is invoked, determine first data and second data; the first data is data whose access frequency is lower than a first threshold, and the second data is the data with the highest access frequency.
Here, the access frequency of each piece of data can be calculated from the number of accesses within a preset time period and the time interval between the first access time and the most recent access time of those accesses.
Step 202: determine a reference spatial distance between the first data and the second data; the reference spatial distance characterizes the degree of association between the first data and the second data.
Specifically, determining the reference spatial distance between the first data and the second data comprises:
Determining a third spatial region corresponding to the first data and a fourth spatial region corresponding to the second data; determining a third center point coordinate of the third spatial region and a fourth center point coordinate of the fourth spatial region; and determining the reference spatial distance according to the third center point coordinate and the fourth center point coordinate.
Step 203: determine, according to the reference spatial distance, whether to delete the first data, and, when it is determined to delete the first data, perform a delete operation on the first data.
Specifically, determining according to the reference spatial distance whether to delete the first data comprises:
Judging whether the reference spatial distance exceeds a preset spatial threshold, and, when the reference spatial distance is determined to exceed the preset spatial threshold, determining to delete the first data.
Here, the preset spatial threshold is set and saved in advance by the maintenance personnel of the Hadoop platform.
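Steps 201 to 203 can be sketched as a single selection pass over the client cache. The sketch assumes each cached entry carries an access frequency and the center point of its spatial region, and that the distance is Euclidean; the function name and data layout are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Entry = Tuple[float, Tuple[float, float]]  # (access frequency, region center point)


def select_evictions(entries: Dict[str, Entry], frequency_threshold: float,
                     distance_threshold: float) -> List[str]:
    """Return keys of low-frequency entries that are far from the hottest entry."""
    if not entries:
        return []
    hottest_key = max(entries, key=lambda key: entries[key][0])
    hot_x, hot_y = entries[hottest_key][1]
    victims = []
    for key, (frequency, (x, y)) in entries.items():
        if key == hottest_key or frequency >= frequency_threshold:
            continue  # only low-frequency "first data" are eviction candidates
        if math.hypot(x - hot_x, y - hot_y) > distance_threshold:
            victims.append(key)  # far from the hot spot: delete
    return victims


client_cache = {
    "hot": (9.0, (0.0, 0.0)),
    "near": (1.0, (1.0, 0.0)),
    "far": (1.0, (6.0, 8.0)),
    "warm": (5.0, (30.0, 40.0)),
}
print(select_evictions(client_cache, frequency_threshold=2.0, distance_threshold=5.0))
```

In this example the rarely accessed entry next to the hot spot survives, which is the behavior the spatial threshold is meant to produce.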
Fig. 5 is a schematic structural diagram of the Hadoop platform with a two-level cache provided in this embodiment. As shown in Fig. 5, the Hadoop platform includes a client and data nodes (i.e. DataNode1, DataNode2 ... DataNodeN).
The client is provided with a cache queue that can store and search spatial data. If the spatial data being searched for is in the client's cache queue, the client can respond to the user's query request directly, without issuing a distributed query to the Hadoop cluster.
Each data node is provided with a cache queue that stores data blocks for that data node, and cache management can be carried out according to the method shown in Fig. 1. When the data node receives a user's request to read a data block, it first accesses the cache queue to check whether the data block is in the cache queue; if it is, the data block is returned directly to the client without any disk I/O. When a new data block is to be written to the data node, the same method is used to decide whether the data block to be written is written directly to disk or added to the cache queue.
The two-level cache refers to caching of the client to the caching of spatial data and back end to data block.
The cache of the data node is a cache for data blocks. Fig. 6 is a schematic diagram of the cache structure of a data node provided in an embodiment of the present invention. As shown in Fig. 6, the data node manages cache reads and writes through the following three queues:
A. Cache queue (also called the priority queue): whether a data block is added to the cache queue is determined by the priority of the data block (the priority depends on the access frequency of the data block and on its spatial-position association). The cache queue can be kept full at all times, with cache management performed by the method above.
B. Cache pre-write queue: stores data blocks that may be added to the cache queue. Data blocks enter this queue from two sources: data blocks accessed by client requests, and new data blocks that are about to be written to Hadoop. Data blocks leave it in one of three ways: they are added to the cache queue, deleted directly, or written directly to the Hadoop platform.
C. Hadoop pre-write queue: stores data blocks that are about to be written to Hadoop. Specifically, data blocks that are evicted from the cache queue and have not yet been written to Hadoop are placed in this queue to wait to be written to the Hadoop platform.
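For illustration only, the three queues might be held in a structure such as the one below. The class name, the method names and the `persisted` set (the blocks already written to Hadoop) are hypothetical and are not taken from the embodiment; the cache queue is modelled as a mapping from block identifier to priority.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class DataNodeCache:
    """Hypothetical container for the three queues of one data node."""
    cache_queue: dict = field(default_factory=dict)        # A: block id -> priority
    cache_prewrite: deque = field(default_factory=deque)   # B: candidates for the cache queue
    hadoop_prewrite: deque = field(default_factory=deque)  # C: blocks waiting to be written to Hadoop

    def stage(self, block_id):
        """Put a requested or newly written block into the cache pre-write queue."""
        self.cache_prewrite.append(block_id)

    def evict_lowest(self, persisted):
        """Remove the lowest-priority cached block; queue it for Hadoop if not yet written."""
        victim = min(self.cache_queue, key=self.cache_queue.get)
        del self.cache_queue[victim]
        if victim not in persisted:
            self.hadoop_prewrite.append(victim)
        return victim
```

A plain dictionary is used here for brevity; a heap keyed on priority would be a natural alternative when the cache queue is large.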
The method for calculating the priority used by the cache queue is explained below.
Fig. 7 is a schematic diagram of the equal division of a planar space based on a quadtree index provided in an embodiment of the present invention. As shown in Fig. 7, assume that the spatial region is divided into 10 subregions: A, B, C, D, E, F, G, H, I and J. Each subregion corresponds to one data block in Hadoop, and some of the data blocks need to be added to the cache while Hadoop is running.
In this embodiment, the priority of a data block takes into account both the access frequency of the data block and the degree of spatial-position association between data blocks.
Access frequency: suppose the access frequency of data block A is to be determined. Let the number of accesses to data block A be c, and let the interval between two consecutive accesses to the data block be $t_i - t_{i-1}$, where i indexes the accesses. The access frequency of data block A over a period of time is then $c / \sum_i (t_i - t_{i-1})$.
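The access frequency above can be sketched as follows, purely as an illustration. The function name is hypothetical, and returning 0.0 when fewer than two accesses have been recorded is an assumption made so that the division is always defined.

```python
def access_frequency(access_times):
    """c / sum(t_i - t_{i-1}) for a chronologically sorted list of access timestamps."""
    c = len(access_times)
    if c < 2:
        return 0.0  # assumption: no interval can be formed yet
    # The sum of consecutive intervals equals last access time minus first access time.
    span = sum(later - earlier for earlier, later in zip(access_times, access_times[1:]))
    if span <= 0:
        return 0.0  # assumption: guard against identical timestamps
    return c / span
```

For example, four accesses at times 0, 2, 4 and 10 give 4 / 10 = 0.4 accesses per unit of time.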
Spatial distance: the spatial distance between data blocks can be calculated from the center point coordinates of their square regions.
Assume that region B represents the data block with the highest priority in the cache queue, and that it must now be decided whether data block A should be added to the cache queue; the priority of data block A then has to be calculated. Let the center point coordinate of region A be $(x_i, y_i)$ and the center point coordinate of region B be $(x_j, y_j)$. The distance between the two points, calculated from these coordinates, represents the positional distance between the data blocks.
Combining this with the access frequency of data block A gives the priority of data block A:
[priority formula not reproduced in this text]
Here, $\sum_i (t_i - t_{i-1})$ denotes the time the data block has existed in the cache, i.e. the interval between the first access time and the last access time corresponding to the access count; the weight and its defining expression are likewise not reproduced; and H denotes the maximum number of data blocks that the cache queue can store.
It should be noted that this embodiment builds on the idea that a data block adjacent to a hot data block also has a high probability of becoming a hot block. The spatial distance is therefore included in the priority, so that the priority reflects both the access frequency of a data block and its degree of spatial-position association. A data block that has been accessed only a few times but lies close to a hot data block thus receives a higher priority, so that newly accessed data blocks close to a hot data block are also added to the cache queue. This raises the cache hit rate for hot data blocks and reduces disk I/O, which optimizes the read/write performance of the whole Hadoop platform. As the calculation formula above shows, the larger the cache capacity, the more the priority value leans toward the spatial distance; more data blocks adjacent to hot data blocks are then loaded into the cache, and the hit rate of the hot-data-block cache can rise further.
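The behaviour described above can be illustrated with a simple weighted combination. The exact priority formula is not available in this text, so the form used here (a weighted sum of the access frequency and the reciprocal of the distance) is an assumption chosen only because it lets closer blocks score higher; the function name, the parameter names and the small `eps` term are hypothetical.

```python
def block_priority(frequency, distance, w_freq, w_dist, eps=1e-9):
    """Assumed form: w_freq * frequency + w_dist * (1 / distance).

    The real weighting, including how it depends on the cache capacity H,
    is not reproduced in the source text.
    """
    closeness = 1.0 / (distance + eps)  # eps avoids division by zero for coincident centers
    return w_freq * frequency + w_dist * closeness


# A rarely accessed block next to the hot block can outrank a busier but distant block.
near_but_cold = block_priority(frequency=0.1, distance=1.0, w_freq=0.5, w_dist=0.5)
far_but_warm = block_priority(frequency=0.4, distance=10.0, w_freq=0.5, w_dist=0.5)
```

Raising `w_dist` relative to `w_freq` shifts the ranking toward spatial proximity, which is the tendency the text attributes to larger cache capacities.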
Cache management at the data node involves two cases. In the first case, the client wants to access a data block that is not in the cache; after the data block is read from disk, it must be decided whether the newly read data block is put into the cache queue or deleted directly. The detailed process is shown in Fig. 8:
Step 301: according to the client's read request, determine the data block on the data node's disk that the request asks to read.
Step 302: after the read data block is returned to the client, write it into the cache pre-write queue.
Step 303: determine the priority of the data block.
Here, step 303 comprises: determining the access frequency of the data block and the spatial distance between the data block and a cache block, and determining the priority of the data block from the determined access frequency and spatial distance.
Here, the cache block refers specifically to the cache block with the highest priority in the cache queue, i.e. the hot data block.
Step 304: judge whether the data block can be added to the cache queue; if it can, go to step 305; otherwise go to step 306.
Here, the data block is determined to be able to join the cache queue when its priority is higher than the priority of at least one cache block in the cache queue.
Step 305: dequeue and delete the cache block with the lowest priority in the cache queue, and add the data block to the cache queue.
Step 306: dequeue and delete the data block.
Here, the data block requested to be read is dequeued from the cache pre-write queue and deleted.
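Steps 302 to 306 might be sketched as follows, as an illustration only. The cache queue is modelled as a dictionary from block identifier to priority and the cache pre-write queue as a list; the priority of step 303 is assumed to be computed by the caller. The `capacity` parameter and the branch that admits a block while a free slot exists are assumptions, since the text describes a queue that is kept full.

```python
def handle_read_miss(block_id, block_priority, cache_queue, prewrite_queue, capacity):
    """Return True if the block read from disk ends up in the cache queue.

    capacity is assumed to be at least 1.
    """
    prewrite_queue.append(block_id)                  # step 302: stage the block
    admitted = False
    if len(cache_queue) < capacity:                  # assumption: a free slot admits directly
        admitted = True
    else:
        lowest = min(cache_queue, key=cache_queue.get)
        if block_priority > cache_queue[lowest]:     # step 304: beats at least one cached block
            del cache_queue[lowest]                  # step 305: drop the lowest-priority block
            admitted = True
    prewrite_queue.remove(block_id)                  # the block leaves the pre-write queue either way
    if admitted:
        cache_queue[block_id] = block_priority       # step 305: join the cache queue
    return admitted                                  # False corresponds to step 306
```

Comparing against the lowest-priority cached block is sufficient, because a block that outranks at least one cached block necessarily outranks the lowest one.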
In the second case, a new data block is to be written to the disk of the data node. The data block is first written into the cache pre-write queue, and it is judged whether it can be written into the cache queue; if it cannot, it is written directly into the Hadoop pre-write queue. The detailed process is shown in Fig. 9:
Step 401: receive the client's request to write a data block, and send the data block to be written to the corresponding data node.
Step 402: write the data block to be written into the cache pre-write queue.
Step 403: determine the priority of the data block to be written.
As in the case of a request to read a data block, both the access frequency of the data block and its spatial-position association must be considered when determining its priority. Because a newly added data block has not yet been requested, this embodiment takes, as the initial access frequency of the data block to be written, the access frequency of the data block in the cache queue that is spatially nearest to it, divided by the spatial distance between the two.
It should be noted that the access frequency of a newly written data block cannot be calculated, because the block has not yet been accessed. However, a data block with a high access frequency corresponds to a hot access region, and regions close to that region also have a high probability of becoming hot regions. On this basis, this embodiment calculates the access frequency of a new data block from the access frequency of the data block that is spatially close to it, so that data blocks with a high probability of becoming hot data are written into the cache from the outset, which improves data access efficiency. Dividing the access frequency by the spatial distance lets the distance be reflected in the priority value of the new data block, so that distant data blocks obtain a small priority; this prevents newly written data blocks from all being written into the cache queue because their calculated priority is large, which would reduce the efficiency of the cache.
Step 404: judge whether the data block to be written can be added to the cache queue; if it can, go to step 405; otherwise go to step 406.
Step 405: according to the cache management policy, evict the cache block with the lowest priority in the cache queue (i.e. delete that cache block directly), and add the data block to be written to the cache queue.
Step 406: write the data block to be written to the Hadoop platform directly through Hadoop's own mechanism.
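The initial access frequency used in step 403 might be computed as in the sketch below. It assumes that each cached block is described by its center point and its access frequency, and that distances are Euclidean; the function name, the dictionary layout and the small `eps` term are hypothetical.

```python
import math


def initial_frequency(new_center, cached_blocks, eps=1e-9):
    """Frequency of the spatially nearest cached block divided by the distance to it.

    cached_blocks maps a block id to (center_point, access_frequency) (assumed layout).
    """
    nearest_center, nearest_freq = min(
        cached_blocks.values(),
        key=lambda block: math.dist(new_center, block[0]),
    )
    # eps avoids division by zero if the two centers coincide (assumption).
    return nearest_freq / (math.dist(new_center, nearest_center) + eps)
```

For a new block centered at (2, 0) with cached blocks at (0, 0) (frequency 0.8) and (10, 0) (frequency 0.4), the nearest block is the one at (0, 0) and the initial frequency is about 0.8 / 2 = 0.4; a new block farther away would receive a proportionally smaller value.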
The cache policy of the client is explained below.
In this embodiment, a Map structure is designed at the client to store the cached spatial data, as shown in Fig. 10. The spatial coordinate of the spatial data serves as the key (Key) of the Map, and the value (Value) of the Map is a List structure that stores all data having the same spatial coordinate. Each spatial data record stores, in addition to the information of the spatial data itself, the access frequency of the spatial data; the access frequency is used to measure whether the spatial data should be evicted from the cache queue.
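A minimal sketch of such a Map structure is given below for illustration. The class and field names are hypothetical, and a two-dimensional coordinate tuple is assumed as the key.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class SpatialRecord:
    """One cached item: the spatial data itself plus its access frequency."""
    payload: str
    access_frequency: float = 0.0


class ClientCache:
    """Map from a spatial coordinate (Key) to the List of records at that coordinate (Value)."""

    def __init__(self):
        self._map = defaultdict(list)

    def put(self, coord, record):
        self._map[coord].append(record)

    def get(self, coord):
        # dict.get does not create an empty entry for a missing coordinate.
        return self._map.get(coord, [])
```

Keeping a list per coordinate lets several records that share the same location be found with a single lookup.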
Client cache management also considers the degree of spatial-position association between data. Eviction of cached data depends not only on the access frequency of the data but also on the spatial distance between the data to be evicted and the hot data that are not evicted; if the spatial distance is small, the data are not evicted this time, and whether to evict them is decided again the next time the cache needs eviction. The detailed process is shown in Fig. 11, and the method comprises:
Step 501: invoke the cache eviction mechanism to start cache eviction, and determine the data that are candidates for eviction from the cache as the data to be evicted.
Here, data with a low access frequency are determined as the data to be evicted.
Step 502: determine the spatial distance between the data to be evicted and the hot data.
Step 503: compare the spatial distance between the data to be evicted and the hot data with a threshold, and determine according to the spatial distance whether to delete the data to be evicted; if so, go to step 504; otherwise go to step 505.
Here, the client is provided with a preset spatial threshold. The spatial distance is compared with the preset spatial threshold; if the spatial distance is greater than the preset spatial threshold, the data are considered deletable and the process goes to step 504; otherwise the data are considered not deletable and the process goes to step 505.
Step 504: perform the eviction of the cached data.
Step 505: do not evict; handle the data again when the next cache eviction is initiated.
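One eviction pass covering steps 501 to 505 might look like the sketch below. For brevity the client cache is reduced to a dictionary from coordinate to access frequency, the hot data are taken to be the single entry with the highest frequency, and distances are Euclidean; these simplifications and all names are assumptions.

```python
import math


def eviction_pass(cache, frequency_threshold, space_threshold):
    """Evict low-frequency entries that lie far from the hottest entry.

    cache maps a coordinate tuple to an access frequency (assumed layout).
    Returns (evicted, deferred) lists of coordinates.
    """
    if not cache:
        return [], []
    hot = max(cache, key=cache.get)                           # hot data
    candidates = [coord for coord, freq in cache.items()      # step 501
                  if freq < frequency_threshold and coord != hot]
    evicted, deferred = [], []
    for coord in candidates:
        if math.dist(coord, hot) > space_threshold:           # steps 502 and 503
            del cache[coord]                                  # step 504
            evicted.append(coord)
        else:
            deferred.append(coord)                            # step 505: reconsidered next time
    return evicted, deferred
```

Entries in the deferred list stay in the cache and are examined again when the next eviction is triggered.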
Fig. 12 is a first schematic structural diagram of a spatial data caching device provided in an embodiment of the present invention. The device can be applied to a data node. As shown in Fig. 12, the device includes a first processing module 601, a second processing module 602 and a third processing module 603.
The first processing module 601 is configured to receive an operation request; the operation request is used to request reading of a data block or writing of a data block.
The second processing module 602 is configured to determine the data block corresponding to the operation request and add the data block to the cache pre-write queue.
The third processing module 603 is configured to determine the priority of the data block and, when it is determined from the priority that the data block obtains permission to join the cache queue, add the data block to the cache queue; the priority is related to the degree of association between the data block and a reference cache block in the cache queue.
Specifically, the third processing module 603 is configured to: determine the access frequency of the data block and the spatial distance between the data block and the reference cache block, the reference cache block being the cache block with the highest priority in the cache queue; obtain a first weight corresponding to the access frequency and a second weight corresponding to the spatial distance; and determine the priority of the data block from the access frequency, the first weight, the spatial distance and the second weight.
Specifically, the third processing module 603 is configured to determine that the data block obtains permission to join the cache queue when the priority of the data block is greater than the priority of at least one cache block in the cache queue.
Specifically, for a data block requested to be read, the third processing module 603 is configured to: determine the number of accesses to the data block within a preset time period and a first time interval between the first access time and the most recent access time within the preset time period; and determine the access frequency of the data block from the number of accesses and the first time interval.
Specifically, for a data block requested to be written, the third processing module 603 is configured to: determine the adjacent data block of the data block in the cache queue; determine the number of accesses to the adjacent data block within a preset time period and a second time interval between the first access time and the most recent access time within the preset time period; determine the access frequency of the adjacent data block from the number of accesses and the second time interval; and determine the access frequency of the data block from the access frequency of the adjacent data block and the spatial distance between the adjacent data block and the data block.
Specifically, the third processing module 603 is configured to: determine a first spatial region corresponding to the data block and a second spatial region corresponding to the reference cache block; determine a first center point coordinate of the first spatial region and a second center point coordinate of the second spatial region; and determine the spatial distance from the first center point coordinate and the second center point coordinate.
Specifically, the third processing module 603 is further configured to process the data block according to a preset policy when it is determined that the data block does not obtain permission to join the cache queue.
The third processing module 603 is specifically configured to: delete the data block if it is a data block requested to be read; and write the data block into the Hadoop pre-write queue if it is a data block requested to be written, the data blocks in the Hadoop pre-write queue waiting to be written to the data node.
It should be understood that, when the spatial data caching device provided by the above embodiment performs spatial data caching, the division into the program modules described above is only an example. In practical applications, the processing may be distributed among different program modules as needed; that is, the internal structure of the device may be divided into different program modules to complete all or part of the processing described above. In addition, the spatial data caching device provided by the above embodiment and the embodiment of the spatial data caching method (specifically, the method on the data node side) belong to the same concept; the specific implementation process is described in the method embodiment and is not repeated here.
Fig. 13 is a second schematic structural diagram of a spatial data caching device provided in an embodiment of the present invention. The device can be applied to a client. As shown in Fig. 13, the device includes a fourth processing module 701, a fifth processing module 702 and a sixth processing module 703.
The fourth processing module 701 is configured to determine first data and second data after the cache eviction mechanism is invoked; the first data are data whose access frequency is lower than a first threshold, and the second data are the data with the highest access frequency.
The fifth processing module 702 is configured to determine a reference spatial distance between the first data and the second data; the reference spatial distance characterizes the degree of association between the first data and the second data.
The sixth processing module 703 is configured to determine, according to the reference spatial distance, whether to delete the first data and, when it is determined that the first data are to be deleted, perform a delete operation on the first data.
Specifically, the fifth processing module 702 is configured to: determine a third spatial region corresponding to the first data and a fourth spatial region corresponding to the second data; determine a third center point coordinate of the third spatial region and a fourth center point coordinate of the fourth spatial region; and determine the reference spatial distance from the third center point coordinate and the fourth center point coordinate.
Specifically, the sixth processing module 703 is configured to judge whether the reference spatial distance exceeds a preset spatial threshold, and to determine to delete the first data when the reference spatial distance exceeds the preset spatial threshold.
It should be understood that, when the spatial data caching device provided by the above embodiment performs spatial data caching, the division into the program modules described above is only an example. In practical applications, the processing may be distributed among different program modules as needed; that is, the internal structure of the device may be divided into different program modules to complete all or part of the processing described above. In addition, the spatial data caching device provided by the above embodiment and the embodiment of the spatial data caching method (specifically, the method on the client side) belong to the same concept; the specific implementation process is described in the method embodiment and is not repeated here.
To implement the method of the embodiments of the present invention, an embodiment of the present invention provides a spatial data caching device arranged on a data node. Specifically, as shown in Fig. 14, the device includes a first processor 801 and a first memory 802 for storing a computer program that can run on the first processor. When running the computer program, the first processor 801 performs: receiving an operation request, the operation request being used to request reading of a data block or writing of a data block; determining the data block corresponding to the operation request and adding the data block to the cache pre-write queue; and determining the priority of the data block and, when it is determined from the priority that the data block obtains permission to join the cache queue, adding the data block to the cache queue. The priority is related to the degree of association between the data block and a reference cache block in the cache queue.
In one embodiment, when running the computer program, the first processor 801 performs: determining the access frequency of the data block and the spatial distance between the data block and the reference cache block, the reference cache block being the cache block with the highest priority in the cache queue; obtaining a first weight corresponding to the access frequency and a second weight corresponding to the spatial distance; and determining the priority of the data block from the access frequency, the first weight, the spatial distance and the second weight.
In one embodiment, when running the computer program, the first processor 801 performs: determining that the data block obtains permission to join the cache queue when the priority of the data block is greater than the priority of at least one cache block in the cache queue.
In one embodiment, when running the computer program, the first processor 801 performs: determining the number of accesses to the data block within a preset time period and a first time interval between the first access time and the most recent access time within the preset time period; and determining the access frequency of the data block from the number of accesses and the first time interval.
In one embodiment, when running the computer program, the first processor 801 performs: determining the adjacent data block of the data block in the cache queue; determining the number of accesses to the adjacent data block within a preset time period and a second time interval between the first access time and the most recent access time within the preset time period; determining the access frequency of the adjacent data block from the number of accesses and the second time interval; and determining the access frequency of the data block from the access frequency of the adjacent data block and the spatial distance between the adjacent data block and the data block.
In one embodiment, when running the computer program, the first processor 801 performs: determining a first spatial region corresponding to the data block and a second spatial region corresponding to the reference cache block; determining a first center point coordinate of the first spatial region and a second center point coordinate of the second spatial region; and determining the spatial distance from the first center point coordinate and the second center point coordinate.
In one embodiment, when running the computer program, the first processor 801 performs: processing the data block according to a preset policy when it is determined that the data block does not obtain permission to join the cache queue. Processing the data block according to the preset policy comprises: deleting the data block if it is a data block requested to be read; and writing the data block into the Hadoop pre-write queue if it is a data block requested to be written, the data blocks in the Hadoop pre-write queue waiting to be written to the data node.
It should be understood that the spatial data caching device provided by the above embodiment and the embodiment of the spatial data caching method belong to the same concept; the specific implementation process is described in the method embodiment and is not repeated here.
Of course, in practical applications, as shown in Fig. 14, the device may further include at least one network interface 803. The components of the spatial data caching device 80 are coupled together by a bus system 804. It can be understood that the bus system 804 is used to implement connection and communication between these components. In addition to a data bus, the bus system 804 includes a power bus, a control bus and a status signal bus. For clarity of description, however, the various buses are all labeled as bus system 804 in Fig. 14. The number of first processors 801 may be at least one. The network interface 803 is used for wired or wireless communication between the spatial data caching device 80 and other devices.
The first memory 802 in the embodiment of the present invention is used to store various types of data to support the operation of the spatial data caching device 80.
The method disclosed in the above embodiments of the present invention may be applied to, or implemented by, the first processor 801. The first processor 801 may be an integrated circuit chip with signal processing capability. During implementation, each step of the above method may be completed by an integrated logic circuit of hardware in the first processor 801 or by instructions in the form of software. The first processor 801 may be a general-purpose processor, a digital signal processor (DSP, Digital Signal Processor), another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The first processor 801 may implement or execute the methods, steps and logic block diagrams disclosed in the embodiments of the present invention. The general-purpose processor may be a microprocessor or any conventional processor. The steps of the method disclosed in the embodiments of the present invention may be executed and completed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium; the storage medium is located in the first memory 802, and the first processor 801 reads the information in the first memory 802 and completes the steps of the foregoing method in combination with its hardware.
In an exemplary embodiment, the spatial data caching device 80 may be implemented by one or more application specific integrated circuits (ASIC, Application Specific Integrated Circuit), DSPs, programmable logic devices (PLD, Programmable Logic Device), complex programmable logic devices (CPLD, Complex Programmable Logic Device), field programmable gate arrays (FPGA, Field-Programmable Gate Array), general-purpose processors, controllers, microcontrollers (MCU, Micro Controller Unit), microprocessors (Microprocessor) or other electronic components, for executing the foregoing method.
An embodiment of the present invention further provides a spatial data caching device arranged on a client. Specifically, as shown in Fig. 15, the device includes a second processor 901 and a second memory 902 for storing a computer program that can run on the second processor. When running the computer program, the second processor 901 performs: determining first data and second data after the cache eviction mechanism is invoked, the first data being data whose access frequency is lower than a first threshold and the second data being the data with the highest access frequency; determining a reference spatial distance between the first data and the second data, the reference spatial distance characterizing the degree of association between the first data and the second data; and determining, according to the reference spatial distance, whether to delete the first data and, when it is determined that the first data are to be deleted, performing a delete operation on the first data.
In one embodiment, when running the computer program, the second processor 901 performs: determining a third spatial region corresponding to the first data and a fourth spatial region corresponding to the second data; determining a third center point coordinate of the third spatial region and a fourth center point coordinate of the fourth spatial region; and determining the reference spatial distance from the third center point coordinate and the fourth center point coordinate.
In one embodiment, when running the computer program, the second processor 901 performs: judging whether the reference spatial distance exceeds a preset spatial threshold, and determining to delete the first data when the reference spatial distance exceeds the preset spatial threshold.
It should be understood that the spatial data caching device provided by the above embodiment and the embodiment of the spatial data caching method belong to the same concept; the specific implementation process is described in the method embodiment and is not repeated here.
Of course, in practical applications, as shown in Fig. 15, the device may further include at least one network interface 903. The components of the spatial data caching device 90 are coupled together by a bus system 904. It can be understood that the bus system 904 is used to implement connection and communication between these components. In addition to a data bus, the bus system 904 includes a power bus, a control bus and a status signal bus. For clarity of description, however, the various buses are all labeled as bus system 904 in Fig. 15. The number of second processors 901 may be at least one. The network interface 903 is used for wired or wireless communication between the spatial data caching device 90 and other devices.
The second memory 902 in the embodiment of the present invention is used to store various types of data to support the operation of the spatial data caching device 90.
The method disclosed in the above embodiments of the present invention may be applied to, or implemented by, the second processor 901. The second processor 901 may be an integrated circuit chip with signal processing capability. During implementation, each step of the above method may be completed by an integrated logic circuit of hardware in the second processor 901 or by instructions in the form of software. The second processor 901 may be a general-purpose processor, a DSP, another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The second processor 901 may implement or execute the methods, steps and logic block diagrams disclosed in the embodiments of the present invention. The general-purpose processor may be a microprocessor or any conventional processor. The steps of the method disclosed in the embodiments of the present invention may be executed and completed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium; the storage medium is located in the second memory 902, and the second processor 901 reads the information in the second memory 902 and completes the steps of the foregoing method in combination with its hardware.
In the exemplary embodiment, spatial data buffer storage 90 can by one or more ASIC, DSP, PLD, CPLD, FPGA, general processor, controller, MCU, microprocessor (Microprocessor) or other electronic components are realized, for holding Row preceding method.
It can be understood that the memory in the embodiments of the present invention (such as the first memory 802 and the second memory 902) may be a volatile memory or a non-volatile memory, and may also include both volatile and non-volatile memories. The non-volatile memory may be a read-only memory (ROM, Read Only Memory), a programmable read-only memory (PROM, Programmable Read-Only Memory), an erasable programmable read-only memory (EPROM, Erasable Programmable Read-Only Memory), an electrically erasable programmable read-only memory (EEPROM, Electrically Erasable Programmable Read-Only Memory), a ferromagnetic random access memory (FRAM, ferromagnetic random access memory), a flash memory (Flash Memory), a magnetic surface memory, an optical disc, or a compact disc read-only memory (CD-ROM, Compact Disc Read-Only Memory); the magnetic surface memory may be a magnetic disk memory or a magnetic tape memory. The volatile memory may be a random access memory (RAM, Random Access Memory), which is used as an external cache. By way of example and not limitation, many forms of RAM are available, such as static random access memory (SRAM, Static Random Access Memory), synchronous static random access memory (SSRAM, Synchronous Static Random Access Memory), dynamic random access memory (DRAM, Dynamic Random Access Memory), synchronous dynamic random access memory (SDRAM, Synchronous Dynamic Random Access Memory), double data rate synchronous dynamic random access memory (DDRSDRAM, Double Data Rate Synchronous Dynamic Random Access Memory), enhanced synchronous dynamic random access memory (ESDRAM, Enhanced Synchronous Dynamic Random Access Memory), SyncLink dynamic random access memory (SLDRAM, SyncLink Dynamic Random Access Memory) and direct Rambus random access memory (DRRAM, Direct Rambus Random Access Memory). The memories described in the embodiments of the present invention are intended to include, without being limited to, these and any other suitable types of memory.
In the exemplary embodiment, the embodiment of the invention also provides a kind of computer readable storage medium, for example including The first memory 802 of computer program, above-mentioned computer program can be by the first processors 801 of spatial data buffer storage 80 It executes, to complete step described in preceding method.
Specifically, the embodiment of the invention provides a kind of computer readable storage medium, it is stored thereon with computer program, It when the computer program is run by processor, executes: receiving operation requests;The operation requests are used for requests data reading block Or request writing data blocks;It determines the corresponding data block of the operation requests, caching is added in the data block and prewrites enqueue; The priority for determining the data block determines that the data block obtains according to the priority of the data block and buffer queue is added When permission, the buffer queue is added in the data block;In the priority and the data block and the buffer queue The correlation degree of reference buffer storage block is related.
In one embodiment, when run by a processor, the computer program performs: determining the access frequency of the data block and the spatial distance between the data block and the reference cache block, where the reference cache block is the cache block with the highest priority in the cache queue; obtaining a first weight corresponding to the access frequency and a second weight corresponding to the spatial distance; and determining the priority of the data block according to the access frequency, the first weight, the spatial distance, and the second weight.
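As a rough illustration of the weighted priority described above, the following Python sketch combines the access frequency and the spatial distance into a single score. This passage names the four inputs but does not state how they are combined, so the linear form, the `1 / (1 + distance)` proximity term, the default weights, and the names `block_priority`, `w_freq`, and `w_dist` are all assumptions made for illustration.

```python
def block_priority(access_frequency: float, spatial_distance: float,
                   w_freq: float = 0.6, w_dist: float = 0.4) -> float:
    """Hypothetical weighted priority of a data block.

    Assumption: a higher access frequency and a smaller distance to the
    reference cache block should both raise the priority.
    """
    if access_frequency < 0 or spatial_distance < 0:
        raise ValueError("frequency and distance must be non-negative")
    # Map distance to a proximity score in (0, 1]; it is 1.0 at distance 0.
    proximity = 1.0 / (1.0 + spatial_distance)
    return w_freq * access_frequency + w_dist * proximity
```

With this form, a block that is accessed more often, or that lies closer to the reference cache block, receives a higher score; other monotonic combinations would fit the description equally well.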
In one embodiment, when run by a processor, the computer program performs: when the priority of the data block is determined to be greater than the priority of at least one cache block in the cache queue, determining that the data block obtains permission to join the cache queue.
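The admission rule above can be sketched in a few lines of Python. The function name `may_join_cache_queue` and the representation of the queue as a plain collection of priority values are hypothetical; the handling of an empty queue is not stated in this passage, so the behavior noted in the comment is an assumption.

```python
from typing import Iterable


def may_join_cache_queue(candidate_priority: float,
                         cached_priorities: Iterable[float]) -> bool:
    """Return True if the candidate outranks at least one cached block.

    Assumption: an empty queue yields False here, because there is no
    cached block to outrank; a real system might admit unconditionally.
    """
    return any(candidate_priority > p for p in cached_priorities)
```

A strict comparison is used, so a candidate whose priority only equals the lowest cached priority is not admitted.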
In one embodiment, when run by a processor, the computer program performs: determining the number of accesses to the data block within a preset time period, and a first time interval between the first access time and the most recent access time within the preset time period; and determining the access frequency of the data block according to the number of accesses and the first time interval.
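A minimal Python sketch of this frequency calculation follows. The passage says only that the frequency is determined "according to" the access count and the interval, so the ratio `count / interval` is an assumption, as are the function name `access_frequency`, the timestamp representation, and the fallback used when all accesses share one timestamp.

```python
from typing import Sequence


def access_frequency(access_times: Sequence[float],
                     window_start: float, window_end: float) -> float:
    """Hypothetical access frequency of a block within a preset window."""
    # Keep only the accesses that fall inside the preset time period.
    in_window = sorted(t for t in access_times
                       if window_start <= t <= window_end)
    count = len(in_window)
    if count == 0:
        return 0.0
    # Interval between the first and the most recent access in the window.
    interval = in_window[-1] - in_window[0]
    if interval == 0:
        # Assumption: with a zero-length interval, fall back to the count.
        return float(count)
    return count / interval
```

For example, three accesses at times 0, 5, and 10 inside the window give an interval of 10 and a frequency of 0.3 under this assumed ratio.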
In one embodiment, when run by a processor, the computer program performs: determining the adjacent data blocks of the data block in the cache queue; determining the number of accesses to the adjacent data blocks within a preset time period, and a second time interval between the first access time and the most recent access time within the preset time period; determining the access frequency of the adjacent data blocks according to the number of accesses and the second time interval; and determining the access frequency of the data block according to the access frequency of the adjacent data blocks and the spatial distance between the adjacent data blocks and the data block.
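One way to read this step is that a block with no access history of its own borrows an estimate from its cached neighbors. The Python sketch below uses an inverse-distance weighted average for that purpose. The weighting scheme is an assumption (the passage does not give a formula), and the name `estimated_frequency` and the `(frequency, distance)` pair representation are hypothetical.

```python
from typing import Sequence, Tuple


def estimated_frequency(neighbors: Sequence[Tuple[float, float]]) -> float:
    """Estimate a block's access frequency from its adjacent cached blocks.

    Each neighbor is a pair (access_frequency, spatial_distance_to_block).
    Assumption: closer neighbors contribute more, via weight 1 / (1 + d).
    """
    if not neighbors:
        return 0.0
    if any(d < 0 for _, d in neighbors):
        raise ValueError("distances must be non-negative")
    weights = [1.0 / (1.0 + d) for _, d in neighbors]
    weighted_sum = sum(w * f for w, (f, _) in zip(weights, neighbors))
    return weighted_sum / sum(weights)
```

With a single neighbor the estimate equals that neighbor's frequency; with several, the nearest neighbors dominate the result.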
In one embodiment, when run by a processor, the computer program performs: determining a first spatial region corresponding to the data block and a second spatial region corresponding to the reference cache block; determining a first center point coordinate of the first spatial region and a second center point coordinate of the second spatial region; and determining the spatial distance according to the first center point coordinate and the second center point coordinate.
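The center-point distance can be sketched as follows. This passage does not specify the shape of a spatial region or the distance metric, so the two-dimensional axis-aligned rectangle (`Region`), the Euclidean metric, and the function name `spatial_distance` are assumptions chosen for illustration.

```python
import math
from typing import NamedTuple, Tuple


class Region(NamedTuple):
    """Hypothetical axis-aligned rectangle covering a block's spatial extent."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float

    def center(self) -> Tuple[float, float]:
        # Center point of the rectangle.
        return ((self.min_x + self.max_x) / 2.0,
                (self.min_y + self.max_y) / 2.0)


def spatial_distance(a: Region, b: Region) -> float:
    """Euclidean distance between the center points of two regions."""
    (ax, ay), (bx, by) = a.center(), b.center()
    return math.hypot(ax - bx, ay - by)
```

For instance, regions centered at (1, 1) and (4, 5) are 5 units apart under this metric.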
In one embodiment, when run by a processor, the computer program performs: when it is determined that the data block does not obtain permission to join the cache queue, processing the data block according to a preset policy. Processing the data block according to the preset policy includes: for a data block requested to be read, deleting the data block; for a data block requested to be written, writing the data block to a Hadoop pre-write queue, where data blocks in the Hadoop pre-write queue wait to be written to a data node.
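The preset policy for rejected blocks can be sketched with two in-memory queues. This is not Hadoop code: `hadoop_prewrite` is only a stand-in deque for the Hadoop pre-write queue, and `Block` and `handle_rejected` are hypothetical names. The sketch also assumes that "deleting" a read block means dropping it from the cache pre-write queue, and that a write block leaves that queue when it moves to the Hadoop pre-write queue; neither detail is stated in this passage.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque


@dataclass
class Block:
    block_id: str
    is_write: bool  # True for a write request, False for a read request


def handle_rejected(block: Block, cache_prewrite: Deque[Block],
                    hadoop_prewrite: Deque[Block]) -> None:
    """Apply the preset policy to a block that was refused cache admission."""
    # Assumption: in both cases the block leaves the cache pre-write queue.
    if block in cache_prewrite:
        cache_prewrite.remove(block)
    if block.is_write:
        # Write requests wait in the (stand-in) Hadoop pre-write queue
        # until they are written to a data node.
        hadoop_prewrite.append(block)
    # Read requests are simply dropped.
```

Keeping the two queues as explicit parameters makes the policy easy to test without a running cluster.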
In an exemplary embodiment, an embodiment of the present invention further provides a computer-readable storage medium, for example the second memory 902 storing a computer program, where the computer program can be executed by the second processor 901 of the spatial data caching device 90 to complete the steps of the foregoing method.
Specifically, an embodiment of the present invention provides a computer-readable storage medium on which a computer program is stored. When run by a processor, the computer program performs: after a cache eviction mechanism is invoked, determining first data and second data, where the first data is data whose access frequency is lower than a first threshold and the second data is the data with the highest access frequency; determining a reference spatial distance between the first data and the second data, where the reference spatial distance characterizes the degree of association between the first data and the second data; determining, according to the reference spatial distance, whether to delete the first data; and, when it is determined to delete the first data, performing a delete operation on the first data.
In one embodiment, when run by a processor, the computer program performs: determining a third spatial region corresponding to the first data and a fourth spatial region corresponding to the second data; determining a third center point coordinate of the third spatial region and a fourth center point coordinate of the fourth spatial region; and determining the reference spatial distance according to the third center point coordinate and the fourth center point coordinate.
In one embodiment, when run by a processor, the computer program performs: judging whether the reference spatial distance exceeds a preset spatial threshold, and determining to delete the first data when the reference spatial distance is determined to exceed the preset spatial threshold.
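The eviction decision described in the preceding paragraphs can be sketched as a single pass over the cache. The dictionary layout, the function name `select_evictions`, and the use of a two-dimensional Euclidean distance between center points are assumptions for illustration; the sketch also treats every entry below the frequency threshold as a candidate for "first data", which is one possible reading of the text.

```python
import math
from typing import Dict, List, Tuple

# Hypothetical cache entry: (access_frequency, center point of its region).
Entry = Tuple[float, Tuple[float, float]]


def select_evictions(cache: Dict[str, Entry], freq_threshold: float,
                     distance_threshold: float) -> List[str]:
    """Return keys of cold entries that lie far from the hottest entry."""
    if not cache:
        return []
    # "Second data": the entry with the highest access frequency.
    hottest_key = max(cache, key=lambda k: cache[k][0])
    hot_x, hot_y = cache[hottest_key][1]
    victims = []
    for key, (freq, (x, y)) in cache.items():
        # "First data": entries whose frequency is below the threshold.
        if key == hottest_key or freq >= freq_threshold:
            continue
        # Delete only if the reference distance exceeds the preset threshold.
        if math.hypot(x - hot_x, y - hot_y) > distance_threshold:
            victims.append(key)
    return victims
```

Under this rule a rarely accessed entry survives eviction when it is spatially close to the hottest entry, which reflects the idea that nearby data is likely to be requested next.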
It should be noted that the computer-readable storage medium provided in the embodiments of the present invention may be a memory such as FRAM, ROM, PROM, EPROM, EEPROM, flash memory, magnetic surface memory, an optical disc, or CD-ROM; it may also be any device that includes one or any combination of the above memories.
To implement the method of the embodiments of the present invention, an embodiment of the present invention further provides a spatial data caching system, the system including a client and at least one data node, wherein:
The data node is configured to receive an operation request, where the operation request is used to request reading a data block or writing a data block; determine the data block corresponding to the operation request and add the data block to a cache pre-write queue; determine the priority of the data block and, when it is determined according to the priority of the data block that the data block obtains permission to join the cache queue, add the data block to the cache queue. The priority is related to the degree of association between the data block and a reference cache block in the cache queue.
The client is configured to determine first data and second data after a cache eviction mechanism is invoked, where the first data is data whose access frequency is lower than a first threshold and the second data is the data with the highest access frequency; determine a reference spatial distance between the first data and the second data, where the reference spatial distance characterizes the degree of association between the first data and the second data; determine, according to the reference spatial distance, whether to delete the first data; and, when it is determined to delete the first data, perform a delete operation on the first data.
It should be noted that the specific processing of the client and the data node has been described in detail above and is not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed device and method may be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division of the units is only a division by logical function, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the coupling, direct coupling, or communication connection between the components shown or discussed may be through some interfaces, and the indirect coupling or communication connection between devices or units may be electrical, mechanical, or in other forms.
The units described above as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may all be integrated into one processing unit, or each unit may serve as a separate unit, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional units.
Those of ordinary skill in the art will understand that all or some of the steps of the above method embodiments may be completed by hardware related to program instructions. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The storage medium includes various media that can store program code, such as a removable storage device, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc.
Alternatively, if the integrated unit of the present invention is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium. Based on this understanding, the technical solutions of the embodiments of the present invention, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the methods of the embodiments of the present invention. The storage medium includes various media that can store program code, such as a removable storage device, ROM, RAM, a magnetic disk, or an optical disc.
The foregoing is only a preferred embodiment of the present invention and is not intended to limit the protection scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.

Claims (23)

1. A spatial data caching method, applied to a data node, the method comprising:
receiving an operation request, the operation request being used to request reading a data block or writing a data block;
determining the data block corresponding to the operation request, and adding the data block to a cache pre-write queue;
determining a priority of the data block, and adding the data block to a cache queue when it is determined, according to the priority of the data block, that the data block obtains permission to join the cache queue; wherein the priority is related to a degree of association between the data block and a reference cache block in the cache queue.
2. The method according to claim 1, wherein determining the priority of the data block comprises:
determining an access frequency of the data block and a spatial distance between the data block and the reference cache block, the reference cache block being the cache block with the highest priority in the cache queue;
obtaining a first weight corresponding to the access frequency and a second weight corresponding to the spatial distance;
determining the priority of the data block according to the access frequency, the first weight, the spatial distance, and the second weight.
3. The method according to claim 1, wherein determining, according to the priority of the data block, that the data block obtains permission to join the cache queue comprises:
when the priority of the data block is determined to be greater than the priority of at least one cache block in the cache queue, determining that the data block obtains permission to join the cache queue.
4. The method according to claim 2, wherein, for a data block requested to be read, determining the access frequency of the data block comprises:
determining the number of accesses to the data block within a preset time period, and a first time interval between the first access time and the most recent access time within the preset time period;
determining the access frequency of the data block according to the number of accesses and the first time interval.
5. The method according to claim 2, wherein, for a data block requested to be written, determining the access frequency of the data block comprises:
determining adjacent data blocks of the data block in the cache queue;
determining the number of accesses to the adjacent data blocks within a preset time period, and a second time interval between the first access time and the most recent access time within the preset time period;
determining the access frequency of the adjacent data blocks according to the number of accesses and the second time interval;
determining the access frequency of the data block according to the access frequency of the adjacent data blocks and the spatial distance between the adjacent data blocks and the data block.
6. The method according to claim 2, wherein determining the spatial distance between the data block and the reference cache block comprises:
determining a first spatial region corresponding to the data block and a second spatial region corresponding to the reference cache block;
determining a first center point coordinate of the first spatial region and a second center point coordinate of the second spatial region;
determining the spatial distance according to the first center point coordinate and the second center point coordinate.
7. The method according to claim 1, wherein the method further comprises: when it is determined that the data block does not obtain permission to join the cache queue, processing the data block according to a preset policy;
wherein processing the data block according to the preset policy comprises:
for a data block requested to be read, deleting the data block;
for a data block requested to be written, writing the data block to a Hadoop pre-write queue, wherein data blocks in the Hadoop pre-write queue wait to be written to the data node.
8. A spatial data caching method, applied to a client, the method comprising:
after a cache eviction mechanism is invoked, determining first data and second data, the first data being data whose access frequency is lower than a first threshold, and the second data being the data with the highest access frequency;
determining a reference spatial distance between the first data and the second data, the reference spatial distance characterizing a degree of association between the first data and the second data;
determining, according to the reference spatial distance, whether to delete the first data, and performing a delete operation on the first data when it is determined to delete the first data.
9. The method according to claim 8, wherein determining the reference spatial distance between the first data and the second data comprises:
determining a third spatial region corresponding to the first data and a fourth spatial region corresponding to the second data;
determining a third center point coordinate of the third spatial region and a fourth center point coordinate of the fourth spatial region;
determining the reference spatial distance according to the third center point coordinate and the fourth center point coordinate.
10. The method according to claim 8, wherein determining, according to the reference spatial distance, whether to delete the first data comprises:
judging whether the reference spatial distance exceeds a preset spatial threshold, and determining to delete the first data when the reference spatial distance is determined to exceed the preset spatial threshold.
11. A spatial data caching device, comprising a first processing module, a second processing module, and a third processing module, wherein:
the first processing module is configured to receive an operation request, the operation request being used to request reading a data block or writing a data block;
the second processing module is configured to determine the data block corresponding to the operation request and add the data block to a cache pre-write queue;
the third processing module is configured to determine a priority of the data block and, when it is determined according to the priority of the data block that the data block obtains permission to join a cache queue, add the data block to the cache queue; wherein the priority is related to a degree of association between the data block and a reference cache block in the cache queue.
12. The device according to claim 11, wherein the third processing module is specifically configured to: determine an access frequency of the data block and a spatial distance between the data block and the reference cache block, the reference cache block being the cache block with the highest priority in the cache queue;
obtain a first weight corresponding to the access frequency and a second weight corresponding to the spatial distance;
determine the priority of the data block according to the access frequency, the first weight, the spatial distance, and the second weight.
13. The device according to claim 11, wherein the third processing module is specifically configured to determine, when the priority of the data block is greater than the priority of at least one cache block in the cache queue, that the data block obtains permission to join the cache queue.
14. The device according to claim 12, wherein the third processing module is specifically configured to: for a data block requested to be read, determine the number of accesses to the data block within a preset time period, and a first time interval between the first access time and the most recent access time within the preset time period;
determine the access frequency of the data block according to the number of accesses and the first time interval.
15. The device according to claim 12, wherein the third processing module is specifically configured to: for a data block requested to be written, determine adjacent data blocks of the data block in the cache queue;
determine the number of accesses to the adjacent data blocks within a preset time period, and a second time interval between the first access time and the most recent access time within the preset time period;
determine the access frequency of the adjacent data blocks according to the number of accesses and the second time interval;
determine the access frequency of the data block according to the access frequency of the adjacent data blocks and the spatial distance between the adjacent data blocks and the data block.
16. The device according to claim 12, wherein the third processing module is specifically configured to: determine a first spatial region corresponding to the data block and a second spatial region corresponding to the reference cache block;
determine a first center point coordinate of the first spatial region and a second center point coordinate of the second spatial region;
determine the spatial distance according to the first center point coordinate and the second center point coordinate.
17. The device according to claim 11, wherein the third processing module is further configured to process the data block according to a preset policy when it is determined that the data block does not obtain permission to join the cache queue;
the third processing module is specifically configured to: for a data block requested to be read, delete the data block; for a data block requested to be written, write the data block to a Hadoop pre-write queue, wherein data blocks in the Hadoop pre-write queue wait to be written to a data node.
18. A spatial data caching device, comprising a fourth processing module, a fifth processing module, and a sixth processing module, wherein:
the fourth processing module is configured to determine first data and second data after a cache eviction mechanism is invoked, the first data being data whose access frequency is lower than a first threshold, and the second data being the data with the highest access frequency;
the fifth processing module is configured to determine a reference spatial distance between the first data and the second data, the reference spatial distance characterizing a degree of association between the first data and the second data;
the sixth processing module is configured to determine, according to the reference spatial distance, whether to delete the first data, and to perform a delete operation on the first data when it is determined to delete the first data.
19. The device according to claim 18, wherein the fifth processing module is specifically configured to: determine a third spatial region corresponding to the first data and a fourth spatial region corresponding to the second data;
determine a third center point coordinate of the third spatial region and a fourth center point coordinate of the fourth spatial region;
determine the reference spatial distance according to the third center point coordinate and the fourth center point coordinate.
20. The device according to claim 18, wherein the sixth processing module is specifically configured to judge whether the reference spatial distance exceeds a preset spatial threshold, and to determine to delete the first data when the reference spatial distance is determined to exceed the preset spatial threshold.
21. A spatial data caching device, comprising a first processor and a first memory for storing a computer program capable of running on the first processor, wherein:
the first processor is configured to perform the steps of the method according to any one of claims 1 to 7 when running the computer program.
22. A spatial data caching device, comprising a second processor and a second memory for storing a computer program capable of running on the second processor, wherein:
the second processor is configured to perform the steps of the method according to any one of claims 8 to 10 when running the computer program.
23. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 7; or the computer program, when executed by a processor, implements the steps of the method according to any one of claims 8 to 10.
CN201811191662.6A 2018-10-12 2018-10-12 Spatial data caching method and device and storage medium Active CN109446114B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811191662.6A CN109446114B (en) 2018-10-12 2018-10-12 Spatial data caching method and device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811191662.6A CN109446114B (en) 2018-10-12 2018-10-12 Spatial data caching method and device and storage medium

Publications (2)

Publication Number Publication Date
CN109446114A true CN109446114A (en) 2019-03-08
CN109446114B CN109446114B (en) 2020-12-18

Family

ID=65546420

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811191662.6A Active CN109446114B (en) 2018-10-12 2018-10-12 Spatial data caching method and device and storage medium

Country Status (1)

Country Link
CN (1) CN109446114B (en)

Also Published As

Publication number Publication date
CN109446114B (en) 2020-12-18

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant