CN110471894A - Data prefetching method, device, terminal and storage medium - Google Patents

Data prefetching method, device, terminal and storage medium Download PDF

Info

Publication number
CN110471894A
CN110471894A
Authority
CN
China
Prior art keywords
data
target
stripe
unit
read
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910662766.9A
Other languages
Chinese (zh)
Inventor
葛凯凯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN201910662766.9A priority Critical patent/CN110471894A/en
Publication of CN110471894A publication Critical patent/CN110471894A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/17Details of further file system functions
    • G06F16/172Caching, prefetching or hoarding of files
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/18File system types
    • G06F16/182Distributed file systems


Abstract

This application provides a data prefetching method, device, terminal and storage medium. The method comprises: responding to a first read request for target data, the first read request including read attribute information; determining, based on the read attribute information and the attribute information of object files stored in a local cache, the target object and target stripe unit where the target data is located; if the target data is not found in the target stripe unit, determining the storage capacity of the target stripe unit as the prefetch range corresponding to the target data; determining, based on the attribute information of the target object, the target server corresponding to the target object; sending to the target server a second read request for reading the data within the prefetch range, so that the target server reads the data within the prefetch range and obtains the prefetched data; and receiving the prefetched data sent by the target server. The application can improve the cache hit rate, reduce network latency, and improve the data read access performance of a distributed file system.

Description

Data prefetching method, device, terminal and storage medium
Technical field
The application belongs to the field of computer technology, and in particular relates to a data prefetching method, device, terminal and storage medium.
Background art
At present, the client of a distributed file system uses memory as a read-write cache: when writing data, the data is first written to the cache and then periodically flushed out; when reading data, the cache is read first, but on a cache miss the data must be read from the server. The existing data read process is generally as follows: the metadata information of a file is looked up according to the file's path and name; the file data is divided into object fragments; a lookup is performed in the cache according to the object fragment position of the file data; if the cache is hit, the data is returned; otherwise, the location of the data on the server is computed from the data pool in the metadata through a pseudo-random data distribution algorithm (controlled replication under scalable hashing, CRUSH), and the data is then read from the server. The server performs data prefetching when reading data, usually prefetching the adjacent next data block.
However, the existing prefetching method only prefetches the data block adjacent to the data being read, so the cache hit rate and the data read access performance of the distributed file system cannot be well guaranteed.
Summary of the invention
In order to improve the cache hit rate and reduce network latency, thereby improving the data read access performance of a distributed file system, the application proposes a data prefetching method, device, terminal and storage medium.
In one aspect, the application proposes a data prefetching method, the method comprising:
responding to a first read request for target data, the first read request including read attribute information;
determining, based on the read attribute information and the attribute information of object files stored in a local cache, the target object and target stripe unit where the target data is located;
searching for the target data in the target stripe unit, and if the target data is not found, determining the storage capacity of the target stripe unit as the prefetch range corresponding to the target data;
determining, based on the attribute information of the target object, the target server corresponding to the target object;
sending to the target server a second read request for reading the data within the prefetch range, so that the target server reads the data within the prefetch range based on the second read request and obtains prefetched data;
receiving the prefetched data sent by the target server.
In another aspect, the application proposes a data prefetching device, the device comprising:
a response module, configured to respond to a first read request for target data, the first read request including read attribute information;
a first determining module, configured to determine, based on the read attribute information and the attribute information of object files stored in a local cache, the target object and target stripe unit where the target data is located;
a second determining module, configured to search for the target data in the target stripe unit, and if the target data is not found, determine the storage capacity of the target stripe unit as the prefetch range corresponding to the target data;
a third determining module, configured to determine, based on the attribute information of the target object, the target server corresponding to the target object;
a sending module, configured to send to the target server a second read request for reading the data within the prefetch range, so that the target server reads the data within the prefetch range based on the second read request and obtains prefetched data;
a receiving module, configured to receive the prefetched data sent by the target server.
In another aspect, the application proposes a terminal, the terminal comprising a processor and a memory, the memory storing at least one instruction, at least one program, a code set or an instruction set, which is loaded and executed by the processor to implement the data prefetching method described above.
In another aspect, the application proposes a computer-readable storage medium, the storage medium storing at least one instruction, at least one program, a code set or an instruction set, which is loaded and executed by a processor to implement the data prefetching method described above.
The application proposes a data prefetching method, device, terminal and storage medium. In response to a user-triggered first read request for target data that includes read attribute information, the location of the target data, that is, the target object and target stripe unit where the target data is located, is determined according to the read attribute information and the attribute information of the object files stored locally. If the target data is not found in the target stripe unit, the storage capacity of the target stripe unit is determined as the prefetch range, and the target server corresponding to the target object is determined according to the attribute information of the target object, such as its name. Finally, the target server reads the data within the prefetch range, and the read data is cached at the client. Prefetching based on object striping fragments is thus realized, which makes full use of the object storage characteristics of the distributed file system, improves the cache hit rate, and reduces network latency, thereby improving the data read access performance of the distributed file system.
Brief description of the drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art and their advantages, the accompanying drawings needed in the description of the embodiments or the prior art are briefly introduced below. It is apparent that the accompanying drawings in the following description are only some embodiments of the present application, and for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.
Fig. 1 is a data prefetching system architecture diagram provided by the embodiments of the present application.
Fig. 2 is a flow diagram of a data prefetching method provided by the embodiments of the present application.
Fig. 3 is a schematic diagram of the object striping structure provided by the embodiments of the present application.
Fig. 4 is a schematic diagram of the physical isolation between files when files are striped into objects, provided by the embodiments of the present application.
Fig. 5 is another flow diagram of the data prefetching method provided by the embodiments of the present application.
Fig. 6 is a schematic diagram, provided by the embodiments of the present application, of a read operation whose file logical address is transformed into reading 128KB of content starting at offset 0 of object number 1.
Fig. 7 is a schematic diagram of a prefetch request range falling on multiple objects, provided by the embodiments of the present application.
Fig. 8 is a schematic diagram of a prefetch request range falling on one object, provided by the embodiments of the present application.
Fig. 9 is a schematic diagram of the object area covered by the access range of one fuse request, provided by the embodiments of the present application.
Fig. 10 is a schematic diagram of the cache organization structure provided by the embodiments of the present application.
Fig. 11 is a schematic diagram of cache priority management provided by the embodiments of the present application.
Fig. 12 is a schematic diagram of the cache expansion structure provided by the embodiments of the present application.
Fig. 13 is a schematic diagram of the cache reduction structure provided by the embodiments of the present application.
Fig. 14 is another flow diagram of the data prefetching method provided by the embodiments of the present application.
Fig. 15 is a schematic structural diagram of a data prefetching device provided by the embodiments of the present application.
Fig. 16 is a schematic diagram of a server structure provided by the embodiments of the present application.
Detailed description of the embodiments
In order to enable those skilled in the art to better understand the solution of the present application, the technical solutions in the embodiments of the present application are described below clearly and completely with reference to the accompanying drawings. It is apparent that the described embodiments are only a part of the embodiments of the present application, not all of them. Based on the embodiments in the present application, all other embodiments obtained by those of ordinary skill in the art without creative effort shall fall within the scope of protection of the present application.
It should be noted that the terms "first", "second", etc. in the description, claims and accompanying drawings of the present application are used to distinguish similar objects and are not necessarily used to describe a particular order or sequence. It should be understood that data used in this way are interchangeable under appropriate circumstances, so that the embodiments of the present application described herein can be implemented in an order other than those illustrated or described herein. In addition, the terms "comprising" and "having" and any variations thereof are intended to cover non-exclusive inclusion; for example, a process, method, system, product or server that contains a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units not expressly listed or inherent to such process, method, product or device.
Fig. 1 is a data prefetching system architecture diagram provided by the embodiments of the present application. The system relies on a distributed file system (CephFS), and the data prefetching system can serve as the implementation environment of the data prefetching method. CephFS is a file system that stores data using a Ceph storage cluster, where Ceph is a unified, distributed storage system designed for excellent performance, reliability and scalability. Ceph can be composed of multiple distributed networks, and such a distributed network can be a blockchain network: an application model of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms and encryption algorithms, an end-to-end decentralized network jointly formed by numerous nodes, in which each node is allowed to obtain a complete copy of the database, and the nodes maintain the entire blockchain network based on a set of consensus mechanisms.
As shown in Fig. 1, the data prefetching system can include at least: a client 01, a metadata server (MDS) 02, an object storage device, namely a data server (OSD) 03, and a monitoring server (Monitor) 04. The Monitor can communicate with the MDS, the OSD and the client respectively, the client can communicate with the MDS and the OSD respectively, and the MDS can communicate with the client and the OSD respectively.
Specifically, the MDS may comprise an independently operating server, a distributed server, or a server cluster composed of multiple servers.
Specifically, the OSD may comprise an independently operating server, a distributed server, or a server cluster composed of multiple servers.
Specifically, the MDS is responsible for the management of the file system namespace, that is, the directory tree structure of the file system. The OSD is the storage device for data; from the OSD's point of view, file data and metadata are not distinguished, and generally one OSD corresponds to one disk. The Monitor manages the various information of the cluster, including the metadata server map (MDSMap), the data server map (OSDMap), the placement group map (PGMap), etc., through which functions such as cluster state management and cluster expansion are realized.
Specifically, the client is composed of a client module (Client), a metadata access module (MDSC module), an object cache module (ObjectCacher module), a stripe module (Striper module) and an object module (Objecter module). The functions of these modules are as follows:
(1) Client module: responsible for receiving file request operations sent from applications and entering different processing logic according to the operation type.
(2) MDSC (MDS Client) module: responsible for finding the metadata index node (inode) information of the file named in the request operation, first searching in the client local cache, then, on a miss, searching in the MDS distributed cache, and, if still missed, proceeding to search in the OSD.
(3) Striper module: responsible for striping files into fragments according to the object size and stripe unit in RADOS (Reliable, Autonomic Distributed Object Store), so as to improve concurrent operations on files. RADOS is one of the cores of Ceph; as a sub-project of the Ceph distributed file system, it is designed specifically for the needs of Ceph and can provide, on a dynamically changing and heterogeneous cluster of storage devices, a stable, scalable, high-performance storage system with a single logical object storage interface and adaptive, self-managing nodes.
(4) ObjectCacher module: responsible for caching the data of file reads and writes, with the cached content organized and managed in units of objects.
(5) Objecter module: responsible for computing the target OSD location of data through the CRUSH algorithm on a cache miss; no file metadata table needs to be consulted, and the data distribution can be determined by computation alone.
Access to the CephFS file system is generally divided into two steps. First, the file is opened: the metadata information of the file, such as the inode, is looked up by the file's path and name; the metadata information contains attributes of the file such as creation time, modification time and owner, the most important of which is the location information of the file data. Second, the file data can be read and written through this location information, namely what is commonly called reading and writing through a file handle. The metadata of an ordinary file is held after the first open (the file handle), and subsequent operations are continuous reads and writes of the file data. If read or write operations on file data can be optimized accordingly, the performance of CephFS file data operations can be improved. Based on this, the embodiments of the present application provide a data prefetching method, which can run in the above data prefetching system. Fig. 2 is a flow diagram of a data prefetching method provided by the embodiments of the present application. The present description provides the method operating steps as described in the embodiments or the flow chart, but more or fewer operating steps may be included based on conventional or non-creative labour. The order of steps enumerated in the embodiments is only one of numerous possible execution orders and does not represent the unique execution order. When an actual system or server product executes, it can execute sequentially or in parallel (for example, in an environment of parallel processors or multi-threaded processing) according to the embodiments or the methods shown in the drawings. Specifically, as shown in Fig. 2, the method may include:
S201. The client responds to a first read request for target data, the first read request including read attribute information.
In the embodiments of this specification, when a user needs to read target data, a first read request for the target data can be triggered on the client, where the first read request includes the read attribute information of the target data, and the read attribute information includes but is not limited to the read offset, the read range, etc.
S203. The client determines, based on the read attribute information and the attribute information of the object files stored in the local cache, the target object and target stripe unit where the target data is located.
The data prefetching in the embodiments of the present application is based on object striping fragments, making full use of the design principle of the RADOS object blocks underlying CephFS. The principle of object striping is introduced below:
The RADOS formed by the OSDs and Monitors is the foundation of the Ceph storage cluster. Data in RADOS is stored in units of objects, and the object size can be 4MB. The storage engine stores objects using files in the local file system: object data resides in the file data, and object metadata resides in the file metadata. To distinguish the files provided by CephFS from the files of the local file system, an object file in the local file system is here referred to for short as an object, and a file in CephFS is referred to simply as a file.
In order to increase the throughput and performance of hard disks, Ceph adopts a data striping method similar to RAID0, distributing data across multiple objects after cutting it to increase throughput, and uses a replication policy to solve the RAID0 problem that data cannot be recovered when a single hard disk fails. RAID (Redundant Array of Independent Disks) is a redundant array of independent disks; RAID0 represents the highest storage performance among all RAID levels. The principle by which RAID0 improves storage performance is to distribute continuous data across multiple disks for access, so that data requests can be executed in parallel by multiple disks, each disk executing the part of the data request belonging to itself. Such parallel operation on data can make full use of the bus bandwidth and significantly improve overall disk access performance.
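The RAID0 principle described here can be sketched in a few lines of Python; the function and variable names are illustrative and not part of the patent:

```python
def raid0_layout(data, chunk, disks):
    """RAID0-style striping: consecutive chunks are distributed
    round-robin across disks, so chunk i lands on disk i % disks and
    a large sequential read is served by all disks in parallel."""
    layout = [[] for _ in range(disks)]
    for i in range(0, len(data), chunk):
        layout[(i // chunk) % disks].append(data[i:i + chunk])
    return layout

# Six 2-byte chunks over three disks: each disk serves two chunks.
assert raid0_layout(b"aabbccddeeff", 2, 3) == [
    [b"aa", b"dd"], [b"bb", b"ee"], [b"cc", b"ff"]]
```

Ceph applies the same idea at the object level: the "disks" are object files, and replication (rather than RAID0 itself) provides the redundancy.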
In practical applications, the file a user sees at the CephFS layer is in fact composed of multiple object files; in other words, the data of a file in CephFS is distributed across multiple object files by striping. Fig. 3 is a diagram of object striping, and the basic concepts needed for object striping are introduced below:
(1) su (stripe unit): the stripe unit size, also called the stripe depth (stripe width) or stripe unit capacity, with a minimum of 64KB; it must divide the object size exactly.
(2) sc (stripe count): the number of stripe units read and written concurrently, that is, the number of stripe units that one stripe accommodates.
(3) object: represents an object in RADOS, usually set to 2MB or 4MB, preferably 4MB; a larger object can contain more stripe units.
(4) object set: represents a set of objects; a file may contain one or more object sets.
In practical applications, although Ceph stripes files, it does not stripe the contents of different files into the same object; that is, files are physically isolated at the object level, so that different files are striped into different groups of objects. This is achieved mainly by means of the composition of the object name. In Ceph, an object name consists of two parts, like X.Y, where X is the file's metadata number (inode ID), which is globally unique, and Y is the part number within the object set, which can repeat across the object sets of different files. The data of different files falls into different object sets because of the inode ID, so the data of different files will not fall into the same object file, thereby achieving the physical isolation of files and ensuring the safety of different files. The physical isolation diagram of files is shown in Fig. 4.
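The X.Y naming scheme can be sketched as follows. This is an illustrative Python sketch whose function name is assumed; the hexadecimal layout mirrors the names CephFS data objects typically carry (inode ID in hex, then an eight-digit hex part number):

```python
def object_name(inode_id, objectno):
    """Compose an object name as X.Y: X is the file's globally unique
    inode ID, Y the object's part number within its object sets.
    The layout is illustrative, e.g. '10000000000.00000001'."""
    return "%x.%08x" % (inode_id, objectno)

# Two files never share an object: different inode IDs give different X,
# even when the per-file part number Y repeats.
assert object_name(0x10000000000, 1) == "10000000000.00000001"
assert object_name(0x10000000000, 1) != object_name(0x10000000001, 1)
```

Because X is globally unique per file, every object name, and hence every object, belongs to exactly one file, which is the physical isolation property shown in Fig. 4.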
In practical applications, file data has different address spaces from different perspectives:
Logical address: from the user's point of view, file data is one-dimensional and can be accessed from head to tail by a one-dimensional file offset.
Physical address: from the striping point of view, the physical address of file data is a three-dimensional coordinate (object set, stripe number, object); these three coordinates respectively indicate which object set, which stripe, and which object the data belongs to. For example, (1, 2, 3) means the data belongs to the first object set, the second stripe, and the third object.
It should be noted that, before the target data is accessed, the data of the file in CephFS has already been distributed into multiple object files by striping. When the user triggers a read request for the target data, the client can determine the position of the target data, that is, convert the logical address into a physical address, according to the striping fragmentation principle described above and the attribute information of the object files stored in the local cache. Specifically, the client determining, based on the read attribute information and the attribute information of the object files stored in the local cache, the target object and target stripe unit where the target data is located, as shown in Fig. 5, may include:
S2031. The client determines, based on the read offset and the storage capacity of a stripe unit, the number of the stripe unit where the target data is located.
Specifically, the calculation formula of the stripe unit number suno is as follows:
suno = offset / su = 1MB / 1MB = 1, where offset is the read offset.
S2033. The client determines, based on the stripe unit number and the number of stripe units accommodated by each stripe, the number of the stripe where the target data is located and the intra-stripe offset of the target data.
Specifically, the calculation formula of the stripe number stripeno is as follows:
stripeno = suno / stripe_count = 1 / 3 = 0,
where 1 and 3 are integers and the result of 1/3 also takes the integer part, so the calculated result of stripeno is 0.
The calculation formula of the intra-stripe offset (that is, which object within the object set) is as follows:
stripepos = suno % stripe_count = 1 % 3 = 1,
where % denotes taking the remainder; the quotient of 1 % 3 is 0 and the remainder is 1, so the calculated result of stripepos is 1.
S2035. The client determines, based on the stripe number and the number of stripe units contained in each object, the number of the object set where the target data is located.
Specifically, the calculation formula of the object set number objectset is as follows:
objectset = stripeno / stripes_per_object = 0 / 3 = 0, where stripes_per_object is the number of stripe units contained in each object.
S2037. The client determines, based on the object set number, the number of stripe units accommodated by each stripe, and the intra-stripe offset, the number of the object where the target data is located.
Specifically, the calculation formula of the object number objectno is as follows:
objectno = objectset * stripe_count + stripepos = 0 * 3 + 1 = 1.
S2039. The client determines, based on the object set number, the object number, the stripe unit number and the stripe number, the target object and target stripe unit where the target data is located.
In the embodiments of this specification, after suno, stripeno, stripepos, objectset and objectno are obtained, the target object and target stripe unit where the target data is located can be found; that is, the offset has been converted from a one-dimensional coordinate into a three-dimensional coordinate. The file read range is then converted into a range within the object. Specifically, after the target object and target stripe unit where the target data is located are determined, the method may further include:
S301. Determining, based on the stripe number, the number of stripe units contained in each object, and the storage capacity of a stripe unit, the first offset of the read offset within the target object.
Specifically, the calculation formula of the first offset block_start is as follows:
block_start = (stripeno % stripes_per_object) * su = (0 % 3) * 1MB = 0.
S303. Determining, based on the read offset and the storage capacity of a stripe unit, the second offset of the read offset within the target stripe unit.
Specifically, the calculation formula of the second offset block_off is as follows:
block_off = offset % su = 1MB % 1MB = 0.
S305. Determining, based on the first offset and the second offset, the offset address of the target stripe unit.
Specifically, the calculation formula of the offset address x_offset of the object fragment is as follows:
x_offset = block_start + block_off = 0 + 0 = 0.
S307. Determining, based on the storage capacity of the target stripe unit and the second offset, the remaining capacity of the target stripe unit.
Specifically, the remaining capacity is the maximum capacity max still remaining in the object fragment, and the calculation formula of max is as follows:
max = su - block_off = 1MB - 0 = 1MB.
S309. Determining, based on the read range and the remaining capacity, the target read range of the target data.
Specifically, the calculation formula of the target read range x_len is as follows:
x_len = min(len, max) = 128KB.
The specific implementation process of S2031-S2039 and S301-S309 is illustrated below:
Suppose the object is 3MB, su is 1MB, and sc is 3; stripes_per_object, the number of stripe units contained in each object, is then 3. A certain file is mapped to one object set and occupies 9 stripe units; offset is the read offset, and len is the length of the content to read. Now let offset = 1MB and len = 128KB, that is, read 128KB of content from the 1MB position of the file. After S2031-S2039, offset is converted from a one-dimensional coordinate to a three-dimensional coordinate: (offset = 1MB) → (objectset = 0, stripeno = 0, stripepos = 1). After S301-S309, the file logical address of the read operation is transformed into reading 128KB of content starting at offset 0 of object number 1: (offset = 1MB, len = 128KB) → (objectno = 1, x_offset = 0, x_len = 128KB), as shown in Fig. 6. From the above mapping of the logical address to the physical address, it can be seen that the data within one object stripe unit is continuous, and from the physical object isolation of files it is known that the data in one object belongs to the same file.
S205. The client searches for the target data in the target stripe unit.
In the embodiments of this specification, after the target stripe unit is determined, the client can search for the target data in the target stripe unit. If the cache is hit, S2051 is performed; if the cache is not hit, S2053 is performed.
S2051. If the client finds the target data in the target stripe unit, the client reads the data in the target read range starting from the offset address.
Specifically, the data within x_len can be read starting from x_offset.
S2053. If the client does not find the target data in the target stripe unit, the client determines the storage capacity of the target stripe unit as the prefetch range corresponding to the target data.
In practical applications, since file data striping is into different objects in Ceph, object is according to hash function It is mapped to different placement groups (Placement Group, PG), PG is by Crush Algorithm mapping to the OSD of different hosts. Based on this data distribution principle, if prefetching range involves several objects, and these objects are mapped to different OSD, According to striping thought, concurrency is can be improved in different OSD parallel processings.But the range that band prefetches is bigger, and concurrency can be got over Height, prefetching range more greatly will lead to the prefetching content frequently replaced in caching and caching jitter phenomenon occurs, concurrently concurrent It accesses number and is less than stripe cell number Shi Caiqi obvious effect, however prefetch request across OSD for what a large amount of clients were initiated Operation can cause independence to lack, such as three clients send three requests, each request prefetch range spans two it is right As total to prefetch distribution as shown in Figure 7, it can be seen that flow increases, reads to put between the prefetching of independence missing will lead to OSD cluster The problems such as big.On the contrary, only being fallen on an object if each prefetching request range, as shown in figure 8, it is only for prefetching between request Vertical, then it can be to avoid these problems.Therefore, the optimal mode of range that prefetches of the application is fallen on an object.
In practical applications, if the read range of the target data is large and exceeds the capacity of a single stripe unit, the prefetch range can fall on multiple stripe units within one object.
In practical applications, CephFS generally has two mount modes: a user-space file system (filesystem in userspace, FUSE) mode and a kernel-space mode. Although the kernel-space mode has better performance, upgrading and modifying it requires kernel support, so the user-space FUSE mode is generally used. The FUSE user-space file system cuts user read/write requests into pieces of at most 128 KB. Assuming the object size is set to 2 MB, su is set to 256 KB, and sc is set to 4, Figure 9 shows the object area that one FUSE request range falls in: the target object is object 2, the target stripe unit is stripe unit 2, and the prefetch range is the size (capacity) of stripe unit 2.
In practical applications, if the above 128 KB read request sent by FUSE misses in the client's ObjectCacher, the data must be read from the object file on the OSD. If the whole su where the file fragment resides is prefetched, i.e., the storage capacity of stripe unit 2 in Figure 9 is taken as the prefetch range, the amount prefetched is between 128 KB and 256 KB, or 1 to 2 times one FUSE read request. This prefetch amount is moderate: generally only 1 to 2 further requests are needed after prefetching the range covering this access, and the content of an object's su range is contiguous in the file. Therefore, this application proposes prefetching the content of an object's su range based on striping; su-based prefetching avoids the lack of independence and thus reduces the prefetch cost.
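As a concrete sketch of the su-based prefetch range under the Figure 9 parameters (object size 2 MB, su = 256 KB, sc = 4 are the example's assumed values), the following maps a flat file offset to the su-aligned range inside a single object. The function is an illustrative reconstruction of the striping arithmetic, not Ceph's actual code.

```python
SU = 256 * 1024               # stripe unit size (example value from Figure 9)
SC = 4                        # stripe count (objects per stripe)
OBJ_SIZE = 2 * 1024 * 1024    # object size

def su_prefetch_range(read_off: int):
    """Map a flat file offset to (object number, su index within that
    object, start, end) of the su-aligned prefetch range covering it."""
    blockno = read_off // SU             # global stripe-unit number
    stripeno = blockno // SC             # stripe number
    stripepos = blockno % SC             # object position within the stripe
    sus_per_object = OBJ_SIZE // SU      # 8 stripe units per object
    objectsetno = stripeno // sus_per_object
    objectno = objectsetno * SC + stripepos
    su_in_object = stripeno % sus_per_object
    start = su_in_object * SU
    return objectno, su_in_object, start, start + SU

# A 128 KB FUSE read that misses triggers a prefetch of the whole
# 256 KB su containing it -- never crossing an object boundary.
print(su_prefetch_range(0))             # -> (0, 0, 0, 262144)
print(su_prefetch_range(9 * SU + 10))   # -> (1, 2, 524288, 786432)
```

Because the returned range is exactly one su of one object, each prefetch request touches a single OSD, which is the independence property argued for above.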
S207. The client determines, based on the attribute information of the target object, the target server corresponding to the target object.
In this embodiment of the application, the client can calculate the PG storing the target object from the object name of the target object by a hash algorithm, and then determine the target OSD corresponding to the PG, and the data distribution, through the CRUSH algorithm. It should be noted that when the client writes the object, the PG has already been created, and the mapping between the object and the PG and the mapping between the PG and the target OSD have already been determined.
S209. The client sends to the target server a second read request for reading the data in the prefetch range.
S2011. The target server reads the data in the prefetch range based on the second read request to obtain the prefetch data.
In this embodiment of the specification, after the client determines the prefetch range, the target OSD needs to read the data in that range. Specifically, the target OSD locates the data in the prefetch range according to the identification information of the data in the prefetch range, and then reads the data in the prefetch range to obtain the prefetch data.
S2013. The target server sends the prefetch data to the client.
In this embodiment of the specification, the target OSD can send the prefetch data to the client, which caches it in the ObjectCacher. The client can then reissue the first read request for the target data, which can now hit in the cache and return the data to the user.
In a feasible embodiment, the method further includes managing the local cache. Managing the local cache may include:
S401. The client divides the local cache into a first cache and a second cache, where the access frequency of the first cache is higher than the access frequency of the second cache. The first cache includes at least one first stripe unit, and each first stripe unit includes history prefetch data and first cached data; the second cache includes at least one second stripe unit, and each second stripe unit includes second cached data and empty data. Here, the history prefetch data characterizes data that was prefetched within a preset time before the current prefetch and has not been accessed; the first cached data and the second cached data characterize data that has been accessed; and the empty data characterizes data whose content is empty.
S403. The client arranges the first stripe units in descending order of priority according to access frequency to obtain a first priority sequence.
S405. The client arranges the second stripe units in descending order of priority according to access frequency to obtain a second priority sequence.
In practical applications, the smallest cache granularity in ObjectCacher is the cache unit (buffer head, bh), which corresponds to a range within an object. Its size is not fixed but does not exceed the size of the object, as shown in part A of Figure 10. To effectively manage and reasonably evict the data prefetched by stripe, this embodiment of the specification redesigns the cache organization: the management unit is the su, while the minimal cache granularity is still the bh, but the maximum size of a bh does not exceed the su and a bh cannot cross su boundaries, as shown in part B of Figure 10.
In this embodiment of the specification, the cache in S401 is divided into two priority ranks (Two Rank LRU, TRL): a first cache with high priority (Up) and a second cache with low priority (Bottom), where priority refers to access frequency, as shown in Figure 11. Up includes at least one first stripe unit, and Bottom includes at least one second stripe unit.
Continuing with Figure 11, three kinds of bh are included: the prefetch-data cache unit (prefetch bh), the cached-data cache unit (cached bh), and the empty cache unit (empty bh). A prefetch bh contains prefetch data (prefetch data); a cached bh contains only cached data (cached data); and an empty bh is empty, serving as a logical placeholder used to determine the data range during striping calculations. Prefetch data is data that has been prefetched and marked but not yet accessed, while cached data is data that has been accessed. There is a conversion relationship between them: prefetch data is converted into cached data after it is hit by an access for the first time.
In this embodiment of the specification, each first stripe unit in the higher-rank Up includes a prefetch bh and a cached bh, i.e., one complete su, while each second stripe unit in Bottom includes a cached bh and an empty bh, where the empty bh is empty; that is, each second stripe unit contains only cached data. Dividing the cache into two ranks mainly limits the amount of prefetched data: Up is equivalent to a prefetch window, and prefetch data beyond this window is evicted.
In practical applications, to distinguish it from the data currently prefetched in S2011, the prefetch bh in a first stripe unit located in Up can be defined as history prefetch data, i.e., data prefetched within a preset time before the current prefetch and not yet accessed; the cached bh in a first stripe unit located in Up is defined as first cached data; the cached bh in a second stripe unit located in Bottom is defined as second cached data; and the empty bh in a second stripe unit located in Bottom is defined as empty data.
In practical applications, in S403 each first stripe unit in Up can be arranged in descending order of priority according to its access frequency to obtain the first priority sequence: the closer a first stripe unit is to the front, the higher its access frequency and the higher its priority. Similarly, in S405 each second stripe unit in Bottom can be arranged in descending order of priority according to its access frequency to obtain the second priority sequence: the closer a second stripe unit is to the front, the higher its access frequency and the higher its priority.
In summary, the cache management in this embodiment of the application is mainly based on data access frequency: data with high access frequency is retained as far as possible and data with low access frequency is evicted, so as to maximize the data hit rate. The cache granularity is the bh; the design is based on stripe prefetching, with the su as the cache management unit. The priority of an su is measured by access frequency, and the su accessed most recently has the highest priority. Across the entire cache, the priority of sus decreases from left to right, i.e., the global priority of Up is higher than that of Bottom, and Up and Bottom are each managed by least-recently-used (Least Recently Used, LRU) replacement.
In a feasible embodiment, as shown in Figure 12, in S2051, if the client finds the target data in the target stripe unit, the priority of the target stripe unit can be raised after the cache hit. Specifically, the following cases can be distinguished:
A: If the target data is the history prefetch data and the target stripe unit is not the first stripe unit of the first priority sequence, the client migrates the target stripe unit to the head of the first priority sequence and marks the history prefetch data as first cached data.
Specifically, if a prefetch bh in Up is hit, the su containing this bh is moved to the MRU end of Up, and this bh is marked as a cached bh. Here, the MRU end is the end with the highest priority.
B: If the target data is the first cached data and the target stripe unit is not the first stripe unit of the first priority sequence, the client migrates the target stripe unit to the head of the first priority sequence.
Specifically, if a cached bh in Up is hit, the su containing this bh is moved to the MRU end of Up.
C: If the target data is the second cached data and the target stripe unit is not the first stripe unit of the second priority sequence, the client migrates the target stripe unit to the head of the second priority sequence.
Specifically, if a cached bh in Bottom is hit, the su containing this bh is moved to the MRU end of Bottom.
In a feasible embodiment, continuing with Figure 12, in S2053, if the client does not find the target data in the target stripe unit, it can prefetch from the target OSD, and the priority of the corresponding stripe unit can also be raised when prefetching. Specifically, the following cases can be distinguished:
D: If the target data is the empty data, the client receives the current prefetch data sent by the target server, writes the current prefetch data into the target stripe unit, and migrates the target stripe unit to the head of the first priority sequence.
Specifically, if an empty bh in Bottom is hit, this indicates a miss in the client cache; the su is then prefetched from the OSD, the contents corresponding to all empty bhs in the returned su are added to the cache, and the su is moved to the MRU end of Up.
E: If the target data is none of the history prefetch data, the first cached data, the second cached data, or the empty data, the client receives the current prefetch data sent by the target server, creates a prefetch stripe unit corresponding to the current prefetch data, and migrates the prefetch stripe unit to the head of the first priority sequence.
Specifically, if no bh in Up or Bottom is hit, the su is prefetched from the OSD, and a new su is created and added to the MRU end of Up.
In a feasible embodiment, as the amount of prefetched data grows, demotion is needed to guarantee the cache hit rate and memory utilization. As shown in Figure 13, two cases are distinguished:
A: If the access frequency of a first stripe unit or a second stripe unit is lower than a first threshold, the client lowers the priority of the first stripe unit or second stripe unit whose access frequency is lower than the first threshold.
Specifically, inside Up and Bottom, an su that has missed for a long time is forced downward by the other queue elements according to the LRU property.
B: If the capacity of the first cache is greater than a second threshold, the client migrates the first stripe units whose access frequency is lower than a third threshold to the head of the second priority sequence, and deletes the history prefetch data in those first stripe units.
Specifically, the window capacity of Up is limited. Once there is too much prefetched data and the maximum capacity of Up is exceeded, the eviction/demotion operation is started: the sus that overflow at the tail of Up are moved to the MRU end of Bottom and their prefetch bhs are deleted, thereby evicting the prefetched data.
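The hit-promotion cases (A-E) and the Up-window demotion described above can be sketched with two LRU queues. The class below is an illustrative model under assumed names (it is not the actual ObjectCacher implementation), with each su's bh contents reduced to a small dict.

```python
from collections import OrderedDict

class TwoRankLRU:
    """Sketch of the two-rank (Up/Bottom) su cache described above.
    The head of each OrderedDict plays the role of the MRU end."""

    def __init__(self, up_capacity: int):
        self.up = OrderedDict()      # high-priority rank: prefetch window
        self.bottom = OrderedDict()  # low-priority rank: accessed-only sus
        self.up_capacity = up_capacity

    def _touch(self, rank: OrderedDict, su_id):
        rank.move_to_end(su_id, last=False)   # move su to the MRU end

    def access(self, su_id) -> bool:
        """Hit handling: promote the su inside its rank (cases A-C)."""
        if su_id in self.up:
            bh = self.up[su_id]
            if bh.get("prefetch") is not None:        # case A: prefetch bh hit
                bh["cached"] = bh.pop("prefetch")     # prefetch -> cached
            self._touch(self.up, su_id)               # cases A/B
            return True
        if su_id in self.bottom:                      # case C: cached bh hit
            self._touch(self.bottom, su_id)
            return True
        return False

    def insert_prefetched(self, su_id, data):
        """Miss handling (cases D/E): add a freshly prefetched su to the
        MRU end of Up, then demote Up's tail su to Bottom if the window
        overflows, dropping its unaccessed prefetch bh."""
        self.up[su_id] = {"prefetch": data}
        self._touch(self.up, su_id)
        while len(self.up) > self.up_capacity:
            victim, bh = self.up.popitem(last=True)   # LRU end of Up
            bh.pop("prefetch", None)                  # delete prefetch bh
            self.bottom[victim] = bh
            self._touch(self.bottom, victim)
```

Capping only Up mirrors the design choice above: the prefetch window bounds the unaccessed prefetch data, while data that has actually been accessed survives in Bottom.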
Hereinafter, a data prefetching method of this application is introduced with the client as the executing entity. As shown in Figure 14, the method may include:
S501. Responding to a first read request for target data, where the first read request includes read attribute information.
S503. Determining, based on the read attribute information and the attribute information of the object file stored in the local cache, the target object and the target stripe unit where the target data is located.
S505. Looking up the target data in the target stripe unit; if the target data is not found, determining the storage capacity of the target stripe unit as the prefetch range corresponding to the target data.
If the target data is found, returning the target data to the user.
S507. Determining, based on the attribute information of the target object, the target server corresponding to the target object.
S509. Sending to the target server a second read request for reading the data in the prefetch range, so that the target server reads the data in the prefetch range based on the second read request to obtain prefetch data.
S5011. Receiving the prefetch data sent by the target server.
S5013. Caching the prefetch data.
Hereinafter, a data prefetching method of this application is introduced with the target server as the executing entity. The method may include:
Receiving a second read request, sent by a client, for reading the data in a prefetch range; where the prefetch range is determined by the client based on the storage capacity of a target stripe unit when the target data is not found in the target stripe unit, the target stripe unit is determined by the client based on the read attribute information included in a first read request and the attribute information of the object file stored in the local cache, and the first read request is a request for the target data responded to by the client.
Reading the data in the prefetch range based on the second read request to obtain prefetch data.
Sending the prefetch data to the client, so that the client stores the prefetch data.
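The server side of this exchange reduces to locating the object that holds the prefetch range and returning that range's bytes. The sketch below models the OSD's object store as a plain dict; all names and the 256 KB object size are hypothetical.

```python
def osd_read(object_store: dict, object_no: int, start: int, length: int) -> bytes:
    """Handle the 'second read request': look up the object containing
    the prefetch range and return that range's bytes as prefetch data."""
    obj = object_store[object_no]
    return obj[start:start + length]

# Target object 2 holds a 256 KB object file; the client asked to
# prefetch the 64 KB range starting at offset 131072.
store = {2: bytes(256 * 1024)}
prefetch_data = osd_read(store, 2, 131072, 64 * 1024)
print(len(prefetch_data))   # -> 65536
```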
In this embodiment of the application, to test the cache hit rate after applying the data prefetching method provided by the embodiments of this application, the number of requests sent from the OSDC to the OSDs during file access was counted. A statistical counter was added to the network send function of the Objecter module in the OSDC, and the number of network transmissions was measured during read-access runs of 100 s, 300 s, and 600 s. The test results are shown in Table 1.
Table 1. Number of network transmissions
            100s          300s          600s
origin      31633 times   80813 times   138968 times
prefetch    28987 times   73867 times   133149 times
Here, origin is the test data of native CephFS, and prefetch is the test data after using the method provided by the embodiments of this application. It can be seen from Table 1 that after prefetching and cache optimization, the number of network transmissions is significantly reduced, indicating that more of the accessed data is hit in the client cache; adaptive prefetching noticeably improves the cache hit rate, and the number of network transmissions is reduced by about 10.3%.
As shown in Figure 15, an embodiment of this application further provides a data prefetching apparatus, which may include:
A response module 601, configured to respond to a first read request for target data, where the first read request includes read attribute information.
A first determining module 603, configured to determine, based on the read attribute information and the attribute information of the object file stored in the local cache, the target object and the target stripe unit where the target data is located.
In a feasible embodiment, the read attribute information includes a read offset, the object file includes at least one object set, each object set includes at least one object, and each object includes at least one stripe unit. The first determining module 603 may then further include:
A stripe unit number determining unit, configured to determine the stripe unit number where the target data is located based on the read offset and the storage capacity of a stripe unit.
A stripe number and intra-stripe offset determining unit, configured to determine, based on the stripe unit number and the number of stripe units accommodated by each stripe, the stripe number where the target data is located and the intra-stripe offset of the target data.
An object set number determining unit, configured to determine the object set number where the target data is located based on the stripe number and the number of stripe units included in each object.
An object number determining unit, configured to determine the object number where the target data is located based on the object set number, the number of stripe units accommodated by each stripe, and the intra-stripe offset.
A target object and target stripe unit determining unit, configured to determine the target object and the target stripe unit where the target data is located based on the object set number, the object number, the stripe unit number, and the stripe number.
In a feasible embodiment, the read attribute information includes a read range, and the apparatus may further include:
A first offset determining module, configured to determine the first offset of the read offset within the target object based on the stripe number, the number of stripe units included in each object, and the storage capacity of the stripe unit.
A second offset determining module, configured to determine the second offset of the read offset within the target stripe unit based on the read offset and the storage capacity of the stripe unit.
An offset address determining module, configured to determine the offset address of the target stripe unit based on the first offset and the second offset.
A remaining capacity determining module, configured to determine the remaining capacity of the target stripe unit based on the storage capacity of the target stripe unit and the second offset.
A target read range determining module, configured to determine the target read range of the target data based on the read range and the remaining capacity.
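The offset modules above amount to a short piece of arithmetic: compute the read offset's position inside the object and inside the stripe unit, then clamp the read to the su's remaining capacity. The sketch below is a reconstruction under assumed parameter values (su = 64 KB, sc = 4, object size = 256 KB), not the patented implementation itself.

```python
def read_plan(read_off: int, read_len: int,
              su: int = 65536, sc: int = 4, obj_size: int = 262144):
    """Return (offset_address, target_read_length) for the target stripe
    unit containing read_off, per the first/second offset modules."""
    sus_per_object = obj_size // su
    stripeno = (read_off // su) // sc              # stripe number
    first_off = (stripeno % sus_per_object) * su   # su's offset in the object
    second_off = read_off % su                     # offset inside the su
    offset_addr = first_off + second_off           # where reading starts
    remaining = su - second_off                    # su's remaining capacity
    return offset_addr, min(read_len, remaining)   # clamp to this su

print(read_plan(0, 100))            # -> (0, 100)
print(read_plan(65536 - 10, 1000))  # -> (65526, 10): only 10 bytes left in the su
```

The clamp in the last line is what forces a large read to be served su by su, matching the preference above for prefetch ranges that never cross object boundaries.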
A second determining module 605, configured to look up the target data in the target stripe unit, and if the target data is not found, determine the storage capacity of the target stripe unit as the prefetch range corresponding to the target data.
A third determining module 606, configured to determine, based on the attribute information of the target object, the target server corresponding to the target object.
A sending module 609, configured to send to the target server a second read request for reading the data in the prefetch range, so that the target server reads the data in the prefetch range based on the second read request to obtain prefetch data.
A receiving module 6011, configured to receive the prefetch data sent by the target server.
In a feasible embodiment, the apparatus may further include a cache management module for managing the cache, and the cache management module may include:
A cache dividing unit, configured to divide the local cache into a first cache and a second cache, where the access frequency of the first cache is higher than the access frequency of the second cache; the first cache includes at least one first stripe unit, each first stripe unit includes history prefetch data and first cached data, the second cache includes at least one second stripe unit, and each second stripe unit includes second cached data and empty data; the history prefetch data characterizes data prefetched within a preset time before the current prefetch and not yet accessed, the first cached data and the second cached data characterize data that has been accessed, and the empty data characterizes data whose content is empty.
A first priority sequence determining unit, configured to arrange the first stripe units in descending order of priority according to access frequency to obtain a first priority sequence.
A second priority sequence determining unit, configured to arrange the second stripe units in descending order of priority according to access frequency to obtain a second priority sequence.
In a feasible embodiment, the cache management module may further include:
A first migration unit, configured to: if the target data is the history prefetch data and the target stripe unit is not the first stripe unit of the first priority sequence, migrate the target stripe unit to the head of the first priority sequence and mark the history prefetch data as first cached data.
A second migration unit, configured to: if the target data is the first cached data and the target stripe unit is not the first stripe unit of the first priority sequence, migrate the target stripe unit to the head of the first priority sequence.
A third migration unit, configured to: if the target data is the second cached data and the target stripe unit is not the first stripe unit of the second priority sequence, migrate the target stripe unit to the head of the second priority sequence.
In a feasible embodiment, the cache management module may further include:
A fourth migration unit, configured to: if the target data is the empty data, receive the current prefetch data sent by the target server, write the current prefetch data into the target stripe unit, and migrate the target stripe unit to the head of the first priority sequence.
A fifth migration unit, configured to: if the target data is none of the history prefetch data, the first cached data, the second cached data, or the empty data, receive the current prefetch data sent by the target server, create a prefetch stripe unit corresponding to the current prefetch data, and migrate the prefetch stripe unit to the head of the first priority sequence.
In a feasible embodiment, the cache management module may further include:
A priority lowering unit, configured to: if the access frequency of a first stripe unit or a second stripe unit is lower than a first threshold, lower the priority of the first stripe unit or second stripe unit whose access frequency is lower than the first threshold.
A sixth migration unit, configured to: if the capacity of the first cache is greater than a second threshold, migrate the first stripe units whose access frequency is lower than a third threshold to the head of the second priority sequence, and delete the history prefetch data in those first stripe units.
It should be noted that the data prefetching apparatus in the embodiments of this application belongs to the same inventive concept as the data prefetching method described above.
The embodiments of this application further provide a data prefetching terminal, which includes a processor and a memory. The memory stores at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by the processor to implement the data prefetching method provided by the above method embodiments.
The embodiments of this application further provide a storage medium, which may be disposed in a terminal to store at least one instruction, at least one program, a code set, or an instruction set related to implementing the data prefetching method in the method embodiments. The at least one instruction, at least one program, code set, or instruction set is loaded and executed by the processor to implement the data prefetching method provided by the above method embodiments.
With the data prefetching method, apparatus, terminal, and storage medium provided by the embodiments of this application, on the one hand, in response to a user-triggered first read request for target data that includes a read offset and a read range, the read offset is converted from a flat address into a three-dimensional address according to the read attribute information and the attribute information of the locally stored object file, thereby determining the target object and the target stripe unit where the target data is located. If the target data is not found in the target stripe unit, indicating a cache miss, the storage capacity of the target stripe unit is determined as the prefetch range, and the target server corresponding to the target object is determined according to the name of the target object and the CRUSH algorithm; finally the target server reads the data in the prefetch range, and the read data is cached at the client. This realizes prefetching of fragments based on object striping, makes full use of the object-storage characteristics of the distributed file system, improves the cache hit rate, and reduces network latency, thereby improving the data read-access performance of the distributed file system. On the other hand, with the su as the cache management unit, the local cache is divided into two priority ranks: the priority of an su is raised on a cache hit or a prefetch, while inside Up and Bottom an su that misses for a long time is forcibly demoted by the other queue elements according to the LRU property, and sus exceeding the window capacity of Up are moved to the MRU end of Bottom with their prefetch bhs deleted, thereby evicting the prefetched data. In this way, the data prefetched by stripe is effectively managed and reasonably evicted, further improving the data read-access performance of the distributed file system.
Optionally, in this embodiment of the specification, the storage medium may be located in at least one of multiple network servers of a computer network. Optionally, in this embodiment, the storage medium may include, but is not limited to, various media that can store program code, such as a USB flash drive, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a removable hard disk, a magnetic disk, or an optical disc.
The memory described in this embodiment of the specification may be used to store software programs and modules; the processor executes various functional applications and performs data processing by running the software programs and modules stored in the memory. The memory may mainly include a program storage area and a data storage area, where the program storage area may store the operating system, the application programs required by functions, and the like, and the data storage area may store data created through use of the device, and the like. In addition, the memory may include a high-speed random access memory and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another solid-state storage device. Correspondingly, the memory may further include a memory controller to provide the processor with access to the memory.
The data prefetching method embodiments provided by the embodiments of this application can be executed on a mobile terminal, a computer terminal, a server, or a similar computing device. Taking execution on a server as an example, Figure 16 is a hardware block diagram of a server for a data prefetching method provided by the embodiments of this application. As shown in Figure 16, the server 700 may vary considerably due to differences in configuration or performance, and may include one or more central processing units (Central Processing Units, CPU) 710 (the processor 710 may include, but is not limited to, a processing device such as a microprocessor MCU or a programmable logic device FPGA), a memory 730 for storing data, and one or more storage media 720 (such as one or more mass storage devices) storing application programs 723 or data 722. The memory 730 and the storage medium 720 may provide transient or persistent storage. The program stored in the storage medium 720 may include one or more modules, and each module may include a series of instruction operations on the server. Further, the central processing unit 710 may be configured to communicate with the storage medium 720 and execute, on the server 700, the series of instruction operations in the storage medium 720. The server 700 may further include one or more power supplies 760, one or more wired or wireless network interfaces 750, one or more input/output interfaces 740, and/or one or more operating systems 721, such as Windows ServerTM, Mac OS XTM, UnixTM, LinuxTM, FreeBSDTM, and the like.
The input/output interface 740 can be used to receive or send data via a network. A specific example of the above network may include a wireless network provided by the communication provider of the server 700. In one example, the input/output interface 740 includes a network adapter (Network Interface Controller, NIC), which can be connected to other network devices through a base station so as to communicate with the Internet. In another example, the input/output interface 740 can be a radio frequency (Radio Frequency, RF) module, which is used to communicate with the Internet wirelessly.
Those skilled in the art can understand that the structure shown in Figure 16 is merely illustrative and does not limit the structure of the above electronic device. For example, the server 700 may include more or fewer components than shown in Figure 16, or have a configuration different from that shown in Figure 16.
It should be understood that the above ordering of the embodiments of this application is for description only and does not represent the superiority or inferiority of the embodiments. The specific embodiments of this specification have been described above; other embodiments fall within the scope of the appended claims. In some cases, the actions or steps recited in the claims can be performed in an order different from that in the embodiments and still achieve the desired results. In addition, the processes depicted in the drawings do not necessarily require the particular order shown, or a sequential order, to achieve the desired results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
The embodiments in this specification are described in a progressive manner; the same or similar parts of the embodiments may be referred to mutually, and each embodiment focuses on its differences from the other embodiments. In particular, since the apparatus and server embodiments are substantially similar to the method embodiments, they are described relatively simply, and for relevant details reference may be made to the description of the method embodiments.
Those of ordinary skill in the art can understand that all or part of the steps of the above embodiments can be completed by hardware, or by a program instructing the relevant hardware. The program can be stored in a computer-readable storage medium, and the storage medium mentioned above can be a read-only memory, a magnetic disk, an optical disc, or the like.
The above are merely preferred embodiments of this application and are not intended to limit this application. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of this application shall be included within the scope of protection of this application.

Claims (10)

1. A data prefetching method, characterized in that the method comprises:
responding to a first read request for target data, the first read request comprising read attribute information;
determining, based on the read attribute information and attribute information of an object file stored in a local cache, a target object and a target stripe unit where the target data is located;
searching for the target data in the target stripe unit, and if the target data is not found, determining the storage capacity of the target stripe unit as a prefetch range corresponding to the target data;
determining, based on the attribute information of the target object, a target server corresponding to the target object;
sending, to the target server, a second read request for reading the data in the prefetch range, so that the target server reads the data in the prefetch range based on the second read request to obtain prefetched data; and
receiving the prefetched data sent by the target server.
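As an illustrative sketch only (the claim defines no concrete API; all class and method names below are hypothetical stand-ins), the claim-1 flow of a local lookup followed by a whole-stripe-unit prefetch on a miss might look like:

```python
class StripeUnit:
    def __init__(self, capacity):
        self.capacity = capacity   # storage capacity = prefetch range on a miss
        self.data = {}             # key -> bytes currently held in this unit

    def find(self, key):
        return self.data.get(key)

class TargetServer:
    def __init__(self, backing):
        self.backing = backing     # authoritative key -> bytes store

    def read_range(self, keys):
        # serve the "second read request": return all data in the prefetch range
        return {k: self.backing[k] for k in keys if k in self.backing}

def handle_read(key, unit, server, range_keys):
    data = unit.find(key)          # search the target stripe unit
    if data is not None:
        return data                # found locally, no prefetch needed
    # not found: the unit's whole capacity becomes the prefetch range
    unit.data.update(server.read_range(range_keys))
    return unit.find(key)

unit = StripeUnit(capacity=4 * 2**20)
server = TargetServer({"a": b"1", "b": b"2"})
print(handle_read("a", unit, server, ["a", "b"]))  # prefetches "b" alongside "a"
```

The point of the miss path is that the second read request fetches the full stripe-unit range, so a later read of neighbouring data ("b" above) is served from the local cache.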
2. The method according to claim 1, characterized in that the read attribute information comprises a read offset, the object file comprises at least one object set, each object set comprises at least one object, and each object comprises at least one stripe unit; and the determining, based on the read attribute information and the attribute information of the object file stored in the local cache, the target object and the target stripe unit where the target data is located comprises:
determining, based on the read offset and the storage capacity of a stripe unit, a stripe unit number of the stripe unit where the target data is located;
determining, based on the stripe unit number and the number of stripe units accommodated by each stripe, a stripe number of the stripe where the target data is located and an intra-stripe offset of the target data;
determining, based on the stripe number and the number of stripe units comprised by each object, an object set number of the object set where the target data is located;
determining, based on the object set number, the number of stripe units accommodated by each stripe, and the intra-stripe offset, an object number of the object where the target data is located; and
determining, based on the object set number, the object number, the stripe unit number, and the stripe number, the target object and the target stripe unit where the target data is located.
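The addressing chain of claim 2 can be made concrete with a small worked example. The sketch below assumes a Ceph-style striping layout (objects grouped into object sets, with the stripe units of each stripe rotated across the objects of a set); the parameter names `su_size`, `su_per_stripe`, and `su_per_object` are illustrative, not taken from the patent.

```python
def locate_target(read_offset, su_size, su_per_stripe, su_per_object):
    """Map a byte offset to (object set, object, stripe, stripe unit) numbers."""
    su_no = read_offset // su_size              # stripe unit number
    stripe_no = su_no // su_per_stripe          # stripe number
    intra_stripe = su_no % su_per_stripe        # intra-stripe offset
    object_set_no = stripe_no // su_per_object  # object set number
    object_no = object_set_no * su_per_stripe + intra_stripe  # object number
    return object_set_no, object_no, stripe_no, su_no

# Example: 4 MiB stripe units, 4 units per stripe, 16 units per object.
print(locate_target(read_offset=100 * 2**20, su_size=4 * 2**20,
                    su_per_stripe=4, su_per_object=16))  # → (0, 1, 6, 25)
```

An offset of 100 MiB falls in stripe unit 25, which lies in stripe 6 at position 1, hence object 1 of object set 0 under these assumed parameters.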
3. The method according to claim 2, characterized in that the read attribute information comprises a read range; and after the determining the target object and the target stripe unit where the target data is located, the method further comprises:
determining, based on the stripe number, the number of stripe units comprised by each object, and the storage capacity of the stripe unit, a first offset of the read offset within the target object;
determining, based on the read offset and the storage capacity of the stripe unit, a second offset of the read offset within the target stripe unit;
determining, based on the first offset and the second offset, an offset address of the target stripe unit;
determining, based on the storage capacity of the target stripe unit and the second offset, a remaining capacity of the target stripe unit; and
determining, based on the read range and the remaining capacity, a target read range of the target data.
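The offset arithmetic of claim 3 can be sketched similarly. The claim does not spell out the formulas; the version below follows the same assumed layout as above and clamps the read to the remaining capacity of the stripe unit, so a single read never crosses a unit boundary. All names are illustrative.

```python
def read_window(read_offset, read_range, stripe_no, su_size, su_per_object):
    first_offset = (stripe_no % su_per_object) * su_size  # first offset: unit's place in the object
    second_offset = read_offset % su_size                 # second offset: place inside the unit
    offset_address = first_offset + second_offset         # address where the read starts
    remaining = su_size - second_offset                   # remaining capacity of the unit
    target_read_range = min(read_range, remaining)        # clamp the read to this unit
    return offset_address, target_read_range

# Continuing the example above: stripe 6, 16 units per object, 4 MiB units,
# reading 1 MiB from offset 100 MiB starts 24 MiB into the target object.
print(read_window(read_offset=100 * 2**20, read_range=2**20,
                  stripe_no=6, su_size=4 * 2**20, su_per_object=16))
```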
4. The method according to claim 3, characterized in that after the searching for the target data in the target stripe unit, the method further comprises:
if the target data is found, reading the data in the target read range starting from the offset address.
5. The method according to claim 4, characterized in that the method further comprises the step of managing the local cache, and the managing the local cache comprises:
dividing the local cache into a first cache and a second cache, an access frequency of the first cache being higher than an access frequency of the second cache; the first cache comprises at least one first stripe unit, each first stripe unit comprising history prefetched data and first cached data, and the second cache comprises at least one second stripe unit, each second stripe unit comprising second cached data and empty data; wherein the history prefetched data represents data that was prefetched within a preset time period and has not been accessed, the first cached data and the second cached data represent data that has been accessed, and the empty data represents data whose content is empty;
arranging the first stripe units in descending order of priority according to access frequency, to obtain a first priority sequence; and
arranging the second stripe units in descending order of priority according to access frequency, to obtain a second priority sequence.
6. The method according to claim 5, characterized in that:
after the target data is found, the method further comprises:
if the target data is the history prefetched data and the target stripe unit is not the first stripe unit at the head of the first priority sequence, migrating the target stripe unit to the head of the first priority sequence, and marking the history prefetched data as first cached data;
if the target data is the first cached data and the target stripe unit is not the first stripe unit at the head of the first priority sequence, migrating the target stripe unit to the head of the first priority sequence; and
if the target data is the second cached data and the target stripe unit is not the stripe unit at the head of the second priority sequence, migrating the target stripe unit to the head of the second priority sequence;
and after the target data is not found, the method further comprises:
if the target data is the empty data, receiving currently prefetched data sent by the target server, writing the currently prefetched data into the target stripe unit, and migrating the target stripe unit to the head of the first priority sequence; and
if the target data is not any of the history prefetched data, the first cached data, the second cached data, or the empty data, receiving the currently prefetched data sent by the target server, creating a prefetch stripe unit corresponding to the currently prefetched data, and migrating the prefetch stripe unit to the head of the first priority sequence.
7. The method according to claim 5, characterized in that if the access frequency of a first stripe unit or a second stripe unit is lower than a first threshold, the priority of the first stripe unit or the second stripe unit whose access frequency is lower than the first threshold is reduced; and
if the capacity of the first cache is greater than a second threshold, the first stripe units whose access frequency is lower than a third threshold are migrated to the head of the second priority sequence, and the history prefetched data in the first stripe units whose access frequency is lower than the third threshold is deleted.
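Claims 5 through 7 together describe a two-tier (2Q-like) cache policy: hot units sit in a first priority sequence, cold units in a second, a hit migrates a unit to the head of its sequence (re-marking history prefetched data as cached), and over-capacity first-cache units are demoted with their history-prefetched contents dropped. A minimal sketch, with all data structures, names, and thresholds assumed (the patent fixes none of them):

```python
HISTORY, CACHED, EMPTY = "history", "cached", "empty"  # data kinds from claim 5

class Unit:
    def __init__(self, kind):
        self.kind = kind
        self.freq = 0          # access frequency

class TwoTierCache:
    def __init__(self, first_cap, demote_thresh):
        self.first = []                     # first priority sequence (hot)
        self.second = []                    # second priority sequence (cold)
        self.first_cap = first_cap          # stands in for the "second threshold"
        self.demote_thresh = demote_thresh  # stands in for the "third threshold"

    def hit(self, unit):
        unit.freq += 1
        seq = self.first if unit in self.first else self.second
        seq.remove(unit)
        seq.insert(0, unit)        # migrate to the head of its own sequence
        if unit.kind == HISTORY:   # claim 6: re-mark history prefetch as cached
            unit.kind = CACHED

    def demote(self):
        # claim 7: when the first cache exceeds capacity, move low-frequency
        # units to the second sequence and drop their history-prefetched data
        if len(self.first) > self.first_cap:
            cold = [u for u in self.first if u.freq < self.demote_thresh]
            for u in cold:
                self.first.remove(u)
                if u.kind == HISTORY:
                    u.kind = EMPTY        # delete history prefetched data
                self.second.insert(0, u)
```

The split mirrors 2Q-style designs: speculatively prefetched data that is never touched ages out of the hot tier cheaply, while data that is actually accessed is promoted and protected.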
8. A data prefetching device, characterized in that the device comprises:
a response module, configured to respond to a first read request for target data, the first read request comprising read attribute information;
a first determining module, configured to determine, based on the read attribute information and attribute information of an object file stored in a local cache, a target object and a target stripe unit where the target data is located;
a second determining module, configured to search for the target data in the target stripe unit, and if the target data is not found, determine the storage capacity of the target stripe unit as a prefetch range corresponding to the target data;
a third determining module, configured to determine, based on the attribute information of the target object, a target server corresponding to the target object;
a sending module, configured to send, to the target server, a second read request for reading the data in the prefetch range, so that the target server reads the data in the prefetch range based on the second read request to obtain prefetched data; and
a receiving module, configured to receive the prefetched data sent by the target server.
9. A terminal, characterized in that the terminal comprises a processor and a memory, the memory storing at least one instruction, at least one program, a code set, or an instruction set, and the at least one instruction, the at least one program, the code set, or the instruction set being loaded and executed by the processor to implement the data prefetching method according to claim 1.
10. A computer-readable storage medium, characterized in that the storage medium stores at least one instruction, at least one program, a code set, or an instruction set, and the at least one instruction, the at least one program, the code set, or the instruction set is loaded and executed by a processor to implement the data prefetching method according to claim 1.
CN201910662766.9A 2019-07-22 2019-07-22 A kind of data prefetching method, device, terminal and storage medium Pending CN110471894A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910662766.9A CN110471894A (en) 2019-07-22 2019-07-22 A kind of data prefetching method, device, terminal and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910662766.9A CN110471894A (en) 2019-07-22 2019-07-22 A kind of data prefetching method, device, terminal and storage medium

Publications (1)

Publication Number Publication Date
CN110471894A true CN110471894A (en) 2019-11-19

Family

ID=68508199

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910662766.9A Pending CN110471894A (en) 2019-07-22 2019-07-22 A kind of data prefetching method, device, terminal and storage medium

Country Status (1)

Country Link
CN (1) CN110471894A (en)

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111209504A (en) * 2020-01-06 2020-05-29 北京百度网讯科技有限公司 Method and apparatus for accessing map data
CN111625502A (en) * 2020-05-28 2020-09-04 浙江大华技术股份有限公司 Data reading method and device, storage medium and electronic device
CN111736771A (en) * 2020-06-12 2020-10-02 广东浪潮大数据研究有限公司 Data migration method, device and equipment and computer readable storage medium
CN111752960A (en) * 2020-06-28 2020-10-09 北京百度网讯科技有限公司 Data processing method and device
CN112181916A (en) * 2020-09-14 2021-01-05 星辰天合(北京)数据科技有限公司 File pre-reading method and device based on user space file system FUSE, and electronic equipment
CN112417350A (en) * 2020-09-17 2021-02-26 上海哔哩哔哩科技有限公司 Data storage adjusting method and device and computer equipment
CN112559574A (en) * 2020-12-25 2021-03-26 北京百度网讯科技有限公司 Data processing method and device, electronic equipment and readable storage medium
CN113268519A (en) * 2020-12-28 2021-08-17 上海能链众合科技有限公司 Data sharing method based on block chain
CN113268201A (en) * 2021-05-13 2021-08-17 三星(中国)半导体有限公司 Cache management method and device based on file attributes
CN113296692A (en) * 2020-09-29 2021-08-24 阿里云计算有限公司 Data reading method and device
CN113970998A (en) * 2020-07-24 2022-01-25 中移(苏州)软件技术有限公司 Information processing method, device, terminal and storage medium
CN114065947A (en) * 2021-11-15 2022-02-18 深圳大学 Data access speculation method and device, storage medium and electronic equipment
CN116541365A (en) * 2023-07-06 2023-08-04 成都泛联智存科技有限公司 File storage method, device, storage medium and client
CN116955223A (en) * 2023-09-18 2023-10-27 浪潮电子信息产业股份有限公司 Data prefetching method, system, electronic equipment and computer storage medium
WO2024001413A1 (en) * 2022-06-28 2024-01-04 华为技术有限公司 Data reading method, data loading apparatus, and communication system

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101388824A (en) * 2008-10-15 2009-03-18 中国科学院计算技术研究所 File reading method and system under sliced memory mode in cluster system
CN105653684A (en) * 2015-12-29 2016-06-08 曙光云计算技术有限公司 Pre-reading method and device of distributed file system
CN107491545A (en) * 2017-08-25 2017-12-19 郑州云海信息技术有限公司 The catalogue read method and client of a kind of distributed memory system
CN108920600A (en) * 2018-06-27 2018-11-30 中国科学技术大学 A kind of metadata of distributed type file system forecasting method based on data correlation
US20190102305A1 (en) * 2016-12-21 2019-04-04 EMC IP Holding Company LLC Method and electronic device for accessing data

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101388824A (en) * 2008-10-15 2009-03-18 中国科学院计算技术研究所 File reading method and system under sliced memory mode in cluster system
CN105653684A (en) * 2015-12-29 2016-06-08 曙光云计算技术有限公司 Pre-reading method and device of distributed file system
US20190102305A1 (en) * 2016-12-21 2019-04-04 EMC IP Holding Company LLC Method and electronic device for accessing data
CN107491545A (en) * 2017-08-25 2017-12-19 郑州云海信息技术有限公司 The catalogue read method and client of a kind of distributed memory system
CN108920600A (en) * 2018-06-27 2018-11-30 中国科学技术大学 A kind of metadata of distributed type file system forecasting method based on data correlation

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
TANG MI: "Research on Read-Latency Optimization Strategies for the Ceph File System Based on Client-Side Caching and Request Scheduling", China Masters' Theses Full-text Database, Information Science and Technology *

Cited By (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111209504A (en) * 2020-01-06 2020-05-29 北京百度网讯科技有限公司 Method and apparatus for accessing map data
CN111209504B (en) * 2020-01-06 2023-09-22 北京百度网讯科技有限公司 Method and apparatus for accessing map data
CN111625502A (en) * 2020-05-28 2020-09-04 浙江大华技术股份有限公司 Data reading method and device, storage medium and electronic device
CN111625502B (en) * 2020-05-28 2024-02-23 浙江大华技术股份有限公司 Data reading method and device, storage medium and electronic device
CN111736771A (en) * 2020-06-12 2020-10-02 广东浪潮大数据研究有限公司 Data migration method, device and equipment and computer readable storage medium
CN111736771B (en) * 2020-06-12 2024-02-23 广东浪潮大数据研究有限公司 Data migration method, device, equipment and computer readable storage medium
CN111752960A (en) * 2020-06-28 2020-10-09 北京百度网讯科技有限公司 Data processing method and device
CN111752960B (en) * 2020-06-28 2023-07-28 北京百度网讯科技有限公司 Data processing method and device
CN113970998A (en) * 2020-07-24 2022-01-25 中移(苏州)软件技术有限公司 Information processing method, device, terminal and storage medium
CN112181916B (en) * 2020-09-14 2024-04-09 北京星辰天合科技股份有限公司 File pre-reading method and device based on user space file system FUSE, and electronic equipment
CN112181916A (en) * 2020-09-14 2021-01-05 星辰天合(北京)数据科技有限公司 File pre-reading method and device based on user space file system FUSE, and electronic equipment
CN112417350A (en) * 2020-09-17 2021-02-26 上海哔哩哔哩科技有限公司 Data storage adjusting method and device and computer equipment
CN112417350B (en) * 2020-09-17 2023-03-24 上海哔哩哔哩科技有限公司 Data storage adjusting method and device and computer equipment
CN113296692A (en) * 2020-09-29 2021-08-24 阿里云计算有限公司 Data reading method and device
CN113296692B (en) * 2020-09-29 2022-08-16 阿里云计算有限公司 Data reading method and device
CN112559574A (en) * 2020-12-25 2021-03-26 北京百度网讯科技有限公司 Data processing method and device, electronic equipment and readable storage medium
CN112559574B (en) * 2020-12-25 2023-10-13 北京百度网讯科技有限公司 Data processing method, device, electronic equipment and readable storage medium
CN113268519A (en) * 2020-12-28 2021-08-17 上海能链众合科技有限公司 Data sharing method based on block chain
CN113268201A (en) * 2021-05-13 2021-08-17 三星(中国)半导体有限公司 Cache management method and device based on file attributes
US11977485B2 (en) 2021-05-13 2024-05-07 Samsung Electronics Co., Ltd. Method of cache management based on file attributes, and cache management device operating based on file attributes
CN114065947B (en) * 2021-11-15 2022-07-22 深圳大学 Data access speculation method and device, storage medium and electronic equipment
CN114065947A (en) * 2021-11-15 2022-02-18 深圳大学 Data access speculation method and device, storage medium and electronic equipment
WO2024001413A1 (en) * 2022-06-28 2024-01-04 华为技术有限公司 Data reading method, data loading apparatus, and communication system
CN116541365A (en) * 2023-07-06 2023-08-04 成都泛联智存科技有限公司 File storage method, device, storage medium and client
CN116541365B (en) * 2023-07-06 2023-09-15 成都泛联智存科技有限公司 File storage method, device, storage medium and client
CN116955223A (en) * 2023-09-18 2023-10-27 浪潮电子信息产业股份有限公司 Data prefetching method, system, electronic equipment and computer storage medium
CN116955223B (en) * 2023-09-18 2024-01-23 浪潮电子信息产业股份有限公司 Data prefetching method, system, electronic equipment and computer storage medium

Similar Documents

Publication Publication Date Title
CN110471894A (en) A kind of data prefetching method, device, terminal and storage medium
US7818498B2 (en) Allocating files in a file system integrated with a RAID disk sub-system
US10235044B2 (en) System and methods for storage data deduplication
CN101556557B (en) Object file organization method based on object storage device
CN107423422B (en) Spatial data distributed storage and search method and system based on grid
KR20190111124A (en) KVS Tree
KR20190119080A (en) Stream Selection for Multi-Stream Storage
CN114860163B (en) Storage system, memory management method and management node
CN105683898A (en) Set-associative hash table organization for efficient storage and retrieval of data in a storage system
CN110321301A (en) A kind of method and device of data processing
CN110268391A (en) For data cached system and method
O'Neil The sb-tree an index-sequential structure for high-performance sequential access
JP6402647B2 (en) Data arrangement program, data arrangement apparatus, and data arrangement method
Yoon et al. Mutant: Balancing storage cost and latency in lsm-tree data stores
US11971859B2 (en) Defragmentation for log structured merge tree to improve read and write amplification
JP6394231B2 (en) Data arrangement control program, data arrangement control apparatus, and data arrangement control method
Fevgas et al. LB-Grid: An SSD efficient grid file
US10416901B1 (en) Storage element cloning in presence of data storage pre-mapper with multiple simultaneous instances of volume address using virtual copies
CN117573676A (en) Address processing method and device based on storage system, storage system and medium
CN108804571B (en) Data storage method, device and equipment
CN111338569A (en) Object storage back-end optimization method based on direct mapping
CN109086002A (en) Space management, device, computer installation and the storage medium of storage object
US11836090B2 (en) Cache management for search optimization
US11775433B2 (en) Cache management for search optimization
KR20230096359A (en) Ssd device and operating method of the same using ftl based on lsm-tree and approximate indexing

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20191119

RJ01 Rejection of invention patent application after publication