CN102033769B - Virtualized-software flow type loading-oriented prefetching method and system - Google Patents

Virtualized-software flow type loading-oriented prefetching method and system

Info

Publication number
CN102033769B
Authority
CN
China
Prior art keywords
file
software
sequence string
prefetch rules
files
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN 201010592125
Other languages
Chinese (zh)
Other versions
CN102033769A (en)
Inventor
沃天宇
李建欣
郑海兵
钟亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beihang University
Original Assignee
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beihang University filed Critical Beihang University
Priority to CN 201010592125 priority Critical patent/CN102033769B/en
Publication of CN102033769A publication Critical patent/CN102033769A/en
Application granted granted Critical
Publication of CN102033769B publication Critical patent/CN102033769B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a virtualized-software flow type loading-oriented prefetching method and system. The method comprises the following steps: obtaining log information corresponding to file accesses during use of the virtualized software; carrying out data mining on the log file to obtain the prefetch rule table of the virtualized software, wherein the prefetch rule table comprises a plurality of prefetch rules respectively corresponding to sequence strings in the log file whose length is less than or equal to a preset length; and, when an end user requests access to a file in the virtualized software, obtaining a target prefetch rule from the prefetch rule table and downloading all files of the sequence string in the target prefetch rule from the software server to the local machine. The invention makes software run more smoothly and improves the user experience.

Description

Prefetching method and system for streaming loading of virtualized software
Technical field
The present invention relates to software service technology, and in particular to a prefetching method and system for the streaming loading of virtualized software.
Background art
With the rise of large-scale, scalable cloud computing environments, the technology of providing software as a service over the network (Software as a Service, SaaS) has attracted wide attention from both industry and academia. SaaS is a delivery model in which software is provided to end users over the Internet. Under the SaaS delivery model, a user does not need to purchase software; instead, the user obtains the required software from a software operator and pays according to usage. The user does not need to maintain the software, since the service provider fully administers and maintains it, which eliminates the need for an enterprise to purchase, build, and maintain infrastructure and applications.
The virtualized-software streaming loading technique built on SaaS supports software services by means of virtualization. In a virtualized software execution system, each application runs in a protected, independent virtual execution environment; applications are isolated from one another and from the underlying operating system. Execution uses a streaming loading mode in which data are transferred from the network and loaded on demand, so a user can start using an application after downloading only part of its modules, without installing or configuring it. This gives software applications the following advantages. First, software can be managed centrally, reducing management and maintenance cost. Second, because on-demand deployment is solved, system migration is simplified. Third, because a virtualized execution environment is built and software instances are isolated from each other, program conflicts can be largely eliminated. Finally, streaming loading and on-demand deployment accelerate application deployment.
However, the prior-art virtualized-software streaming loading technique still has the following shortcoming: during use of the software, when the user needs a new function, a request must be sent to the software server to download the files related to that function before use can continue. In this process, because loading program data on demand from the network with streaming loading is slower than loading the program directly from the local hard disk, the running program stalls and the user experience suffers.
Summary of the invention
The object of the present invention is to provide a prefetching method and system for the streaming loading of virtualized software, so as to solve the stalling problem during software use and make the software run more smoothly.
The present invention provides a prefetching method for the streaming loading of virtualized software, comprising:
obtaining log information corresponding to file accesses during use of the virtualized software, the log information comprising the file path name of each accessed file, wherein the plurality of log records obtained during use of the virtualized software form a log file;
processing a plurality of the log files into the form of sequence strings, counting the number of occurrences of each sequence string in the log files, and obtaining the probability value of each sequence string according to the frequency-substitution and conditional-probability principles, so as to obtain the prefetch rule table of the virtualized software, the prefetch rule table comprising a plurality of prefetch rules respectively corresponding to the sequence strings in the log files whose length is less than or equal to a preset length, each prefetch rule comprising a sequence string and its probability value, the sequence string comprising the file indexes of a plurality of consecutive files, wherein processing a plurality of the log files into the form of sequence strings comprises: mapping, according to a file index table, the file path names in the log records of the log file to numeric identifiers, the file index table comprising all files in the virtualized software and the numeric identifier corresponding to each file; and, after the mapping is completed, removing duplicate numeric identifiers from the log file;
when an end user requests access to a file in the virtualized software, obtaining a target prefetch rule from the prefetch rule table and downloading all files of the sequence string in the target prefetch rule from the software server to the local machine; the file corresponding to the first element of the sequence string of the target prefetch rule is the file being accessed, and the length and probability value of the sequence string of the target prefetch rule are maximal.
The present invention also provides a prefetching system for the streaming loading of virtualized software, comprising a prefetch client and a prefetch server; the prefetch client comprises a data collection module and a data prefetching module, and the prefetch server comprises a data mining module and a storage module.
The data collection module is configured to obtain log information corresponding to file accesses during use of the virtualized software, the log information comprising the file path name of each accessed file, wherein the plurality of log records obtained during use of the virtualized software form a log file.
The data mining module is configured to process a plurality of the log files into the form of sequence strings, count the number of occurrences of each sequence string in the log files, and obtain the probability value of each sequence string according to the frequency-substitution and conditional-probability principles, so as to obtain the prefetch rule table of the virtualized software, the prefetch rule table comprising a plurality of prefetch rules respectively corresponding to the sequence strings in the log files whose length is less than or equal to a preset length, each prefetch rule comprising a sequence string and its probability value, the sequence string comprising the file indexes of a plurality of consecutive files, wherein processing a plurality of the log files into the form of sequence strings comprises: mapping, according to a file index table, the file path names in the log records of the log file to numeric identifiers, the file index table comprising all files in the virtualized software and the numeric identifier corresponding to each file, and, after the mapping is completed, removing duplicate numeric identifiers from the log file.
The storage module is configured to store the prefetch rule table.
The data prefetching module is configured to, when an end user requests access to a file in the virtualized software, obtain a target prefetch rule from the prefetch rule table and download all files of the sequence string in the target prefetch rule from the software server to the local machine; the file corresponding to the first element of the sequence string of the target prefetch rule is the file being accessed, and the length and probability value of the sequence string of the target prefetch rule are maximal.
With the prefetching method and system for the streaming loading of virtualized software of the present invention, file use is predicted according to the prefetch rule table of the virtualized software and files are downloaded in advance before they are accessed, which solves the stalling problem during software use, makes the software run more smoothly, and improves the user experience.
Brief description of the drawings
In order to illustrate the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed in the description of the embodiments or of the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flow chart of an embodiment of the prefetching method for the streaming loading of virtualized software according to the present invention;
Fig. 2 is a flow chart of the preprocessing of the log file in Fig. 1;
Fig. 3 is a flow chart of the file prefetching in Fig. 1;
Fig. 4 is a structural diagram of an embodiment of the prefetching system for the streaming loading of virtualized software according to the present invention.
Detailed description of the embodiments
To make the objects, technical solutions and advantages of the present invention clearer, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
The main technical solution of the present invention is to provide a prefetching method for the streaming loading of virtualized software that reasonably predicts the data the software may request in the future and downloads and caches them to the local disk before they are actually accessed. The method obtains log information during use of the virtualized software, performs data mining on the log information, and obtains the prefetch rule table of the virtualized software; according to the prefetch rule table, before an end user accesses a file in the virtualized software, all files of the target prefetch rule associated with that file are downloaded from the software server to the local machine.
The technical solution of the present invention is described in further detail below with reference to the drawings and specific embodiments.
Fig. 1 is a flow chart of an embodiment of the prefetching method for the streaming loading of virtualized software according to the present invention. As shown in Fig. 1, the prefetching method of this embodiment may comprise the following steps:
Step 101: obtain log information during use of the virtualized software, the log information comprising the file path name of each accessed file.
This step is data collection and mainly prepares the data set for the subsequent modeling and data mining. While the virtualized software runs, whenever a file access related to a function of the virtualized software occurs, the corresponding log record is written; that is, one log record is generated per file access. The format of the log record can be defined as in Table 1 below.
Table 1
date time event filename
Here, date is the date of the file access, time is the specific access time, event is the operation performed on the file, such as OPEN/CLOSE/DELETE, and filename is the file path name. When filename is a directory, a remark can be added in the event field to indicate that this log record describes a directory event.
While the virtualized software runs, files may be accessed many times, and each file access is recorded as one corresponding log record, so when the virtualized software finishes running many log records have been collected. These log records form one log file, and the log file corresponds to that virtualized software.
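Purely as an illustration (not part of the patent text), the sketch below shows one way a log record in the Table 1 format could be represented and parsed; the Python names, the whitespace-separated layout, and the example path are assumptions.

```python
from dataclasses import dataclass

@dataclass
class LogRecord:
    date: str      # access date, e.g. "2010-12-08"
    time: str      # specific access time point, e.g. "14:03:21"
    event: str     # operation performed, e.g. OPEN / CLOSE / DELETE
    filename: str  # file path name (may denote a directory, marked in the event field)

def parse_log_line(line: str) -> LogRecord:
    # Assumes one whitespace-separated record per line: date time event filename.
    date, time, event, filename = line.split(maxsplit=3)
    return LogRecord(date, time, event, filename)

# One record is generated per file access; the records of one run form one log file.
record = parse_log_line("2010-12-08 14:03:21 OPEN /app/bin/module1.dll")
```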
Step 102: preprocess the above log file.
To simplify the subsequent modeling steps, the log file can be preprocessed before data mining. The preprocessing may comprise: mapping, according to a file index table, the file path names in the log records of the log file obtained in step 101 to numeric identifiers, and, after the mapping is completed, removing duplicate numeric identifiers from the log file.
The file index table comprises all files in the virtualized software corresponding to the log file and the numeric identifier of each file; that is, the file index table records the correspondence between all files in the virtualized software and their numeric identifiers.
Specifically, the detailed preprocessing of the log file can be seen in Fig. 2, which is a flow chart of the preprocessing of the log file in Fig. 1. The preprocessing may proceed as follows:
Step 201: create a sequence string and initialize it to empty.
Step 202: determine whether the log file has been fully processed; if it has not, go to step 203; otherwise, jump to step 207.
Step 203: extract the next unprocessed log record (line) from the log file.
Step 204: parse the record to obtain the file path name.
Scan each field of the log record and parse out its file path name (filename). If scanning finds a remark in the event field indicating that the file path name is a directory, do not process this record; if the file path name is not a directory, continue with step 205.
Step 205: map the file path name to a fileid using the file index table.
The file index table is built for each virtualized software, and each numeric identifier corresponds to one file of that software. For example, if the software has 200 function files, the identifiers 1, 2, 3, ..., 200 can be assigned to these 200 files respectively. According to the file index table, each file path name obtained in step 204 is mapped to a fileid, that is, a numeric identifier.
Step 206: append the fileid obtained in step 205 to the tail of the sequence string, and jump back to step 202.
Step 207: once the log file has been fully processed, remove duplicate fileids from the sequence string.
Step 208: store the sequence string in the database as one record.
Step 209: mark the record stored in step 208, for example as "dirty" data; the resources can then be released and the preprocessing ends.
After this preprocessing, each log file becomes one sequence string, for example [111, 15, 246, 212, 110, 210, 123, 147, 5, ...], in which each number is a file index corresponding to a file in the file index table of the virtualized software.
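As an illustrative sketch only, the preprocessing of steps 201-209 could look roughly as follows; the helper names, the "DIR" remark convention for directory records, and the reuse of parse_log_line from the sketch above are assumptions, not part of the patent.

```python
def preprocess_log(log_lines, file_index_table):
    """Turn one log file into a deduplicated sequence string of fileids (steps 201-209):
    map file path names to numeric identifiers and remove repeated identifiers."""
    sequence = []                                   # step 201: empty sequence string
    for line in log_lines:                          # steps 202-203: next log record
        record = parse_log_line(line)               # step 204: parse the path name
        if "DIR" in record.event:                   # assumed remark for directory events
            continue                                # directory records are skipped
        fileid = file_index_table[record.filename]  # step 205: path name -> fileid
        sequence.append(fileid)                     # step 206: append to the tail
    # Step 207: remove duplicate fileids (here keeping the first occurrence order).
    seen, deduped = set(), []
    for fid in sequence:
        if fid not in seen:
            seen.add(fid)
            deduped.append(fid)
    return deduped                                  # step 208: stored as one "dirty" record

# Example file index table: every file of the virtualized software gets a numeric id.
file_index_table = {"/app/bin/module1.dll": 1, "/app/data/res.pak": 2}
```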
Step 103: perform data mining on the log files to obtain the prefetch rule table of the virtualized software.
In this step, data mining can be performed on the log files preprocessed in step 102. The prefetch rule table formed by the mining mainly comprises a plurality of prefetch rules respectively corresponding to the sequence strings in the log files whose length is less than or equal to a preset length; each prefetch rule comprises a sequence string and its probability value, and the sequence string comprises a plurality of consecutive files.
The data mining algorithm of this prefetch mechanism works mainly as follows: scan the sequence strings generated by preprocessing and count the number of occurrences of every substring whose length is less than or equal to N; then use the conditional probability formula and the frequency-substitution principle to compute the conditional probability of each sequence (sequence.len ≤ N), and store the sequence in the database as a prefetch rule. In other words, the probability value of each sequence string is obtained from the number of times it occurs in the log files, according to the frequency-substitution and conditional-probability principles. Note that the data mining in this step is performed uniformly over a plurality of sequence strings. For example, multiple runs of a virtualized software A generate multiple log files; the runs may be repeated runs on the same terminal, or one run on each of several terminals, and the sequence strings generated by preprocessing these log files may differ.
The construction of the prefetch rules is illustrated below. Suppose the sequence strings obtained by preprocessing several log files are {a, b, c}, {a, b, c, d, e, f, g}, {a, b, c, d, e}, and so on, and that the maximum substring length N used in the mining below is 8.
First, determine the sequence strings to be computed, namely the substrings of each sequence string, Sequence = S1 S2 ... Sk, with 2 ≤ k ≤ 8. Every substring of the sequence string with length 1 to N needs to be computed; for example, substrings of length 2 include {a, b}, {b, c}, {e, f}, and so on, substrings of length 3 include {a, b, c}, {b, c, d}, {e, f, g}, and so on, and the remaining substrings are analogous.
Next, the conditional probability p(Sk | S1 S2 ... Sk-1) of each such substring can be computed with the following frequency-substitution formula, where Sk denotes the fileid of a file:
P(Sk | S1 S2 ... Sk-1) = Count(S1 S2 ... Sk) / Count(S1 S2 ... Sk-1)    (1)
For example, for the substring {a, b, c} of length 3, P(b|a) = Count(ab) / Count(a): count the number of times ab occurs in the sequence strings and the number of times a occurs, and P(b|a) follows from formula (1); P(c|ab) can be computed in the same way.
Then, P(abc) is computed with the chain rule below. In formula (2), for p(Sk | S1 S2 ... Sk-1), S1 S2 ... Sk-1 is the history. For ease of computation the history should not be too long; generally only the history consisting of the preceding N-1 files is considered, i.e. p(Sk | Sk-N+1 ... Sk-1), so the number of history files in the conditional probability is the preset length minus 1. A model whose history consists of the preceding N-1 items is an N-gram model.
P(Sequence) = p(S1) · p(S2 | S1) · p(S3 | S1 S2) · ... · p(Sk | S1 S2 ... Sk-1)    (2)
P(abc) = P(a) · P(b|a) · P(c|ab). The substring {a, b, c} of length 3 and its probability value are thus obtained; the other substrings are computed in the same way and are not repeated here.
Through this computation, every sequence string of length at most N in the log files, together with its probability value, is obtained. Each sequence string and its probability value form one prefetch rule, and all the prefetch rules form the prefetch rule table of the virtualized software.
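The following sketch, offered only as an illustration, applies formulas (1) and (2) to a set of preprocessed sequence strings; the choice to estimate p(S1) as its relative frequency among all positions, the default N of 8, and the function names are assumptions.

```python
from collections import Counter

def mine_prefetch_rules(sequences, max_len=8):
    """Build the prefetch rule table: every contiguous substring of length <= max_len,
    with a probability from formulas (1) and (2) (frequency substitution + chain rule)."""
    counts = Counter()
    total_positions = 0
    for seq in sequences:
        total_positions += len(seq)
        for i in range(len(seq)):
            for j in range(i + 1, min(i + max_len, len(seq)) + 1):
                counts[tuple(seq[i:j])] += 1            # occurrences of each substring

    rules = {}
    for sub in counts:
        # Chain rule (2): P(S1..Sk) = p(S1) * prod_k Count(S1..Sk) / Count(S1..Sk-1).
        prob = counts[sub[:1]] / total_positions        # p(S1) as a relative frequency
        for k in range(2, len(sub) + 1):
            prob *= counts[sub[:k]] / counts[sub[:k - 1]]   # formula (1)
        rules[sub] = prob                               # one prefetch rule per substring
    return rules

# Example with the sequence strings {a, b, c}, {a, b, c, d, e, f, g}, {a, b, c, d, e}
# encoded as fileids:
rules = mine_prefetch_rules([[1, 2, 3], [1, 2, 3, 4, 5, 6, 7], [1, 2, 3, 4, 5]])
```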
Furthermore, the data mining in this step can be updated at any time according to actual usage: whenever new log files are obtained for the virtualized software, the existing prefetch rules of that software can be updated and replaced. Considering that performing data mining immediately whenever a new log file is submitted would impose a heavy system overhead, a threshold can be set: only when the number of log files obtained for the same virtualized software exceeds the threshold are those log files mined and the prefetch rule table of the software updated.
For example, after a newly submitted log has been serialized by the preprocessing, the serialized result, i.e. the sequence string, is stored in the database, and this newly added record is marked as "dirty" data (data not yet used by data mining), while data that have already gone through the statistical computation are marked as "processed". Only when the number of "dirty" records in the database exceeds a threshold is the data mining engine triggered to mine the "dirty" data and update the corresponding prefetch rules; when the number of "dirty" records is below the threshold, mining is not triggered. This improves the efficiency of the algorithm and greatly reduces system overhead.
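A minimal sketch of this threshold-triggered update, reusing mine_prefetch_rules from the sketch above; the threshold value and the class layout are assumptions rather than anything prescribed by the patent.

```python
DIRTY_THRESHOLD = 50   # assumed value; the patent only requires "a threshold"

class RuleDatabase:
    """New sequence strings are stored as 'dirty' records; mining runs only
    when enough of them accumulate, which limits the data mining overhead."""
    def __init__(self):
        self.dirty = []        # sequence strings not yet used by data mining
        self.processed = []    # sequence strings already reflected in the rules
        self.rules = {}

    def submit(self, sequence):
        self.dirty.append(sequence)                       # mark the new record as 'dirty'
        if len(self.dirty) > DIRTY_THRESHOLD:             # mine only above the threshold
            self.rules = mine_prefetch_rules(self.processed + self.dirty)
            self.processed += self.dirty                  # re-mark as 'processed'
            self.dirty = []
        return self.rules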
Step 104: according to the prefetch rule table, download all files of the sequence string in the target prefetch rule from the software server to the local machine.
When an end user requests access to a file in the virtualized software, the target prefetch rule can be obtained from the prefetch rule table of that software, and before the end user accesses the file, all files of the sequence string in the target prefetch rule are downloaded from the software server to the local machine. The file corresponding to the first element of the sequence string of the target prefetch rule is the file being accessed, and the length and probability value of the sequence string of the target prefetch rule are maximal.
The detailed prefetching flow can be seen in Fig. 3, which is a flow chart of the file prefetching in Fig. 1. The prefetching may comprise the following steps:
Step 301: the user requests to run the virtualized software.
Step 302: obtain the prefetch rule table of the virtualized software. For example, after detecting the user's run request, the client of the prefetch mechanism may request and obtain the prefetch rule table of the software from the server.
Step 303: build a hash table from the prefetch rule table obtained in step 302; the hash table speeds up the subsequent lookups.
Step 304: the virtualized software starts running.
Step 305: determine whether the file the virtualized software currently wants to access exists locally; if it does not, go to step 306; otherwise, jump to step 309.
Step 306: determine whether the hash table contains a prefetch rule for the file.
Specifically, for example, when the end user wants to access file a of the virtualized software, the prefetch rule base of the software can be queried to obtain the substrings whose first accessed file is a (for example {a, b, c}) and the probability value of each such substring. If such rules exist, go to step 307; otherwise, jump to step 308.
Step 307: download the files of the target prefetch rule that do not exist locally, and jump to step 309.
Step 306 may yield several substrings whose first accessed file is a, for example the substrings of length 6 {a, b, c, d, e, f}, {a, d, e, b, c, f}, and so on, and the substrings of length 4 {a, b, c, d}, {a, d, e, b}, and so on. The query first selects the substrings of maximum length, and among those the substring with the maximum probability value; this gives the target prefetch rule, whose sequence string has maximal length and probability value (a sketch of this selection follows the step list below).
For example, suppose {a, b, c, d, e, f} is the target prefetch rule. Although the end user only requests access to file a at this moment, in this step all files in {a, b, c, d, e, f}, namely a, b, c, d, e and f, are downloaded from the software server to the local machine. This amounts to predicting, according to the prefetch rule table, the data that may be used later while the virtualized software runs, and downloading them from the server to the local cache before they are accessed, which to some extent hides the stalls caused by loading over the network and gives the user a smoother interactive experience.
Step 308: download the requested file from the server to the local machine, and jump to step 309.
Step 309: determine whether the virtualized software has finished running; if it has, go to step 310; otherwise, jump to step 305.
Step 310: release the occupied resources and end the run.
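As announced in the discussion of step 307 above, here is an illustrative sketch of selecting the target prefetch rule (maximum length, then maximum probability) and prefetching its files; the fallback to downloading only the requested file and the download() hook are assumptions.

```python
def select_target_rule(rules, first_fileid):
    """Among all rules whose sequence string starts with the requested file,
    pick the one of maximum length and, among those, maximum probability value."""
    candidates = [(seq, p) for seq, p in rules.items() if seq and seq[0] == first_fileid]
    if not candidates:
        return None
    return max(candidates, key=lambda item: (len(item[0]), item[1]))[0]

def on_file_access(fileid, rules, local_cache, download):
    """Steps 305-308: if the file is not local, prefetch every file of the target rule."""
    if fileid in local_cache:
        return
    target = select_target_rule(rules, fileid)
    to_fetch = target if target is not None else (fileid,)   # step 308 fallback: single file
    for fid in to_fetch:
        if fid not in local_cache:
            download(fid)            # download() is an assumed transport hook to the server
            local_cache.add(fid)
```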
With the prefetching method for the streaming loading of virtualized software of this embodiment, file use is predicted according to the prefetch rule table of the virtualized software and files are downloaded in advance before they are accessed, which solves the stalling problem during software use, makes the software run more smoothly, and improves the user experience. The method also has the following advantages: high availability, since adopting the prefetching method of this embodiment lets the virtualized software reach a high data hit rate during on-demand streaming loading; transparency, since data prefetching is handled automatically by the prefetch mechanism according to this method, without user intervention; dynamically updated prefetch rules, since the feedback produced by the prefetch mechanism about the software files accessed while the user uses the virtualized software is used to dynamically update, continuously learn, and continuously refine the prefetch rules, so that the most reasonable prefetch rules are found for prefetching; and load balancing, since the dynamic update of the prefetch rules is built on a threshold and an update is triggered only when the amount of feedback exceeds the threshold, which reduces unnecessary computational overhead of the system.
Fig. 4 is a structural diagram of an embodiment of the prefetching system for the streaming loading of virtualized software according to the present invention. This prefetching system can be used to perform the prefetching method described in the method embodiment above. Only the structure of the prefetching system of this embodiment is described briefly here; its usage can be found in the method embodiment of the present invention.
As shown in Fig. 4, the prefetching system may comprise a prefetch client 41 and a prefetch server 42; there may be multiple prefetch clients 41, each connected to the prefetch server 42. The prefetch client 41 and the prefetch server 42 can be deployed as separate entities and communicate through sockets. The prefetch client 41 is mainly used to report log files to the prefetch server 42 and to request prefetch rules from the prefetch server 42; the prefetch server 42 receives and processes the log files sent by the prefetch clients 41 and responds to the rule requests of the prefetch clients 41.
Specifically, the prefetch client 41 comprises a data collection module 43 and a data prefetching module 44, and the prefetch server 42 may comprise a data mining module 45 and a storage module 46.
The data collection module 43 can monitor, through a daemon, the sequence of files accessed while the software runs, obtaining the log information corresponding to the file accesses during use of the virtualized software; the log information comprises the file path name of each accessed file, and the log records obtained during use of the virtualized software form one log file. The data collection module 43 can send the recorded log file to the data mining module 45 after the virtualized software finishes running; when there are multiple clients or the virtualized software is run multiple times, the data mining module 45 can receive multiple log files for that software.
The data mining module 45 can perform data mining on the results reported by the data collection modules 43 of the prefetch clients 41 according to the modeling method, dynamically generating and updating the prefetch rules of each virtualized software for the prefetch clients 41 to query. It can mine the multiple log files of the same virtualized software sent by the data collection modules 43 and obtain the prefetch rule table of that software, the prefetch rule table comprising a plurality of prefetch rules respectively corresponding to the sequence strings in the log files whose length is less than or equal to a preset length, each prefetch rule comprising a sequence string and its probability value, and the sequence string comprising a plurality of consecutive files.
The storage module 46 can store the prefetch rule tables obtained by the data mining module 45; for example, referring to Fig. 4, the storage module 46 stores the prefetch rule tables of several virtualized softwares, to be queried when an end user accesses the corresponding virtualized software.
The data prefetching module 44 can, when an end user requests access to a file in the virtualized software, obtain the prefetch rule table of that software from the storage module 46 at the server end, obtain the target prefetch rule from the prefetch rule table, and download all files of the sequence string in the target prefetch rule from the software server to the local machine; the file corresponding to the first element of the sequence string of the target prefetch rule is the file being accessed, and the length and probability value of the sequence string of the target prefetch rule are maximal.
Furthermore, the data mining module 45 may comprise a preprocessing unit 47. The preprocessing unit 47 can map the file path names in the log records of the log file to numeric identifiers according to the file index table, the file index table comprising all files in the virtualized software and the numeric identifier corresponding to each file, and, after the mapping is completed, remove duplicate numeric identifiers from the log file. The file index table can be stored in the data mining module 45.
Furthermore, the data mining module 45 may also comprise an updating unit 48. The updating unit 48 can, when the number of log files obtained for the same virtualized software exceeds a set threshold, perform data mining on those log files and update the prefetch rule table of that software. In addition, the data prefetching module 44 can build a hash table from the prefetch rule table to facilitate subsequent lookups.
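Purely for orientation, the sketch below arranges the four modules of Fig. 4 into a prefetch client and a prefetch server, reusing RuleDatabase and on_file_access from the earlier sketches; the method names and the in-process call interface (standing in for the socket protocol mentioned above) are assumptions.

```python
class PrefetchServer:
    """Prefetch server 42: data mining module 45 plus storage module 46."""
    def __init__(self):
        self.storage = {}                          # software name -> RuleDatabase

    def report_log(self, software, sequence):      # a client uploads one preprocessed log
        db = self.storage.setdefault(software, RuleDatabase())
        db.submit(sequence)                        # threshold-based update, as above

    def get_rules(self, software):                 # answers a client's rule request
        db = self.storage.get(software)
        return dict(db.rules) if db else {}

class PrefetchClient:
    """Prefetch client 41: data collection module 43 plus data prefetching module 44."""
    def __init__(self, server):
        self.server = server
        self.local_cache = set()

    def run_software(self, software, accesses, download):
        rules = self.server.get_rules(software)    # step 302: fetch the rule table
        for fileid in accesses:                    # simulated file accesses during the run
            on_file_access(fileid, rules, self.local_cache, download)
```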
With the prefetching system for the streaming loading of virtualized software of this embodiment, by providing the data mining module, the data prefetching module, and the other modules, file use is predicted according to the prefetch rule table of the virtualized software and files are downloaded in advance before they are accessed, which solves the stalling problem during software use, makes the software run more smoothly, and improves the user experience.
Those of ordinary skill in the art will understand that all or part of the steps of the above method embodiment can be completed by hardware related to program instructions. The program can be stored in a computer-readable storage medium, and when executed it performs the steps of the above method embodiment; the storage medium includes various media that can store program code, such as ROM, RAM, a magnetic disk, or an optical disc.
Finally, it should be noted that the above embodiments are only intended to illustrate, not to limit, the technical solution of the present invention. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they can still modify the technical solutions recorded in the foregoing embodiments or make equivalent replacements of some of the technical features, and such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (6)

1. A prefetching method for streaming loading of virtualized software, characterized by comprising:
obtaining log information corresponding to file accesses during use of the virtualized software, the log information comprising the file path name of each accessed file, wherein the plurality of log records obtained during use of the virtualized software form a log file;
processing a plurality of the log files into the form of sequence strings, counting the number of occurrences of each sequence string in the log files, and obtaining the probability value of each sequence string according to the frequency-substitution and conditional-probability principles, so as to obtain the prefetch rule table of the virtualized software, the prefetch rule table comprising a plurality of prefetch rules respectively corresponding to the sequence strings in the log files whose length is less than or equal to a preset length, each prefetch rule comprising a sequence string and its probability value; the sequence string comprising the file indexes of a plurality of consecutive files, wherein processing a plurality of the log files into the form of sequence strings comprises: mapping, according to a file index table, the file path names in the log records of the log file to numeric identifiers, the file index table comprising all files in the virtualized software and the numeric identifier corresponding to each file; and, after the mapping is completed, removing duplicate numeric identifiers from the log file;
when an end user requests access to a file in the virtualized software, obtaining a target prefetch rule from the prefetch rule table, and downloading all files of the sequence string in the target prefetch rule from the software server to the local machine; wherein the file corresponding to the first element of the sequence string of the target prefetch rule is the file being accessed, and the length and probability value of the sequence string of the target prefetch rule are maximal.
2. The prefetching method for streaming loading of virtualized software according to claim 1, characterized in that the number of files forming the history in the conditional-probability principle is the preset length minus 1.
3. The prefetching method for streaming loading of virtualized software according to claim 1, characterized in that, before downloading all files of the sequence string in the target prefetch rule from the software server to the local machine, the method further comprises:
building a hash table according to the prefetch rule table.
4. A prefetching system for streaming loading of virtualized software, characterized by comprising a prefetch client and a prefetch server; the prefetch client comprises a data collection module and a data prefetching module, and the prefetch server comprises a data mining module and a storage module;
the data collection module is configured to obtain log information corresponding to file accesses during use of the virtualized software, the log information comprising the file path name of each accessed file, wherein the plurality of log records obtained during use of the virtualized software form a log file;
the data mining module is configured to process a plurality of the log files into the form of sequence strings, count the number of occurrences of each sequence string in the log files, and obtain the probability value of each sequence string according to the frequency-substitution and conditional-probability principles, so as to obtain the prefetch rule table of the virtualized software, the prefetch rule table comprising a plurality of prefetch rules respectively corresponding to the sequence strings in the log files whose length is less than or equal to a preset length, each prefetch rule comprising a sequence string and its probability value, the sequence string comprising the file indexes of a plurality of consecutive files, wherein processing a plurality of the log files into the form of sequence strings comprises: mapping, according to a file index table, the file path names in the log records of the log file to numeric identifiers, the file index table comprising all files in the virtualized software and the numeric identifier corresponding to each file, and, after the mapping is completed, removing duplicate numeric identifiers from the log file;
the storage module is configured to store the prefetch rule table;
the data prefetching module is configured to, when an end user requests access to a file in the virtualized software, obtain a target prefetch rule from the prefetch rule table and download all files of the sequence string in the target prefetch rule from the software server to the local machine; wherein the file corresponding to the first element of the sequence string of the target prefetch rule is the file being accessed, and the length and probability value of the sequence string of the target prefetch rule are maximal.
5. The prefetching system for streaming loading of virtualized software according to claim 4, characterized in that the data mining module further comprises:
an updating unit, configured to, when the number of log files obtained for the same virtualized software exceeds a set threshold, perform data mining on those log files and update the prefetch rule table of the virtualized software.
6. The prefetching system for streaming loading of virtualized software according to claim 4, characterized in that the data prefetching module is further configured to build a hash table according to the prefetch rule table.
CN 201010592125 2010-12-08 2010-12-08 Virtualized-software flow type loading-oriented prefetching method and system Expired - Fee Related CN102033769B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 201010592125 CN102033769B (en) 2010-12-08 2010-12-08 Virtualized-software flow type loading-oriented prefetching method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 201010592125 CN102033769B (en) 2010-12-08 2010-12-08 Virtualized-software flow type loading-oriented prefetching method and system

Publications (2)

Publication Number Publication Date
CN102033769A CN102033769A (en) 2011-04-27
CN102033769B (en) 2013-05-22

Family

ID=43886703

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 201010592125 Expired - Fee Related CN102033769B (en) 2010-12-08 2010-12-08 Virtualized-software flow type loading-oriented prefetching method and system

Country Status (1)

Country Link
CN (1) CN102033769B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102509171B (en) * 2011-10-24 2014-11-12 浙江大学 Flow mining method facing to rule execution log
CN103425564B (en) * 2013-08-22 2016-08-10 安徽融数信息科技有限责任公司 A kind of smartphone software uses Forecasting Methodology
CN104778010A (en) * 2014-01-13 2015-07-15 内蒙古近远信息技术有限责任公司 Efficient access prefetching method of media data on the basis of cloud storage platform
CN104021226B (en) * 2014-06-25 2018-01-02 华为技术有限公司 Prefetch the update method and device of rule
CN106250064B (en) * 2016-08-19 2020-05-12 深圳大普微电子科技有限公司 Solid state disk control device and solid state disk data access method based on learning
CN110188050A (en) * 2019-05-29 2019-08-30 中南大学 A kind of multichannel based on N-gram algorithm prefetches design method on demand

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6088718A (en) * 1998-01-15 2000-07-11 Microsoft Corporation Methods and apparatus for using resource transition probability models for pre-fetching resources
CN1700696A (en) * 2005-06-15 2005-11-23 深圳Tcl工业研究院有限公司 3C oriented digital household middleware engine
CN101674303A (en) * 2009-07-31 2010-03-17 厦门敏讯信息技术股份有限公司 Embedded network product programming equipment and method thereof
CN101771699A (en) * 2010-01-06 2010-07-07 华南理工大学 Method and system for improving SaaS application security

Also Published As

Publication number Publication date
CN102033769A (en) 2011-04-27

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20130522

Termination date: 20171208
