CN114780889B - Cache replacement system and method based on imitation learning - Google Patents

Cache replacement system and method based on imitation learning

Info

Publication number
CN114780889B
CN114780889B, CN202210491621.9A, CN202210491621A
Authority
CN
China
Prior art keywords
cache
eviction
neural network
current
access request
Prior art date
Legal status
Active
Application number
CN202210491621.9A
Other languages
Chinese (zh)
Other versions
CN114780889A (en)
Inventor
范琪琳
丘淑婷
蔡茂林
马浩
李秀华
熊庆宇
文俊浩
Current Assignee
Chongqing University
Original Assignee
Chongqing University
Priority date
Filing date
Publication date
Application filed by Chongqing University
Priority to CN202210491621.9A
Publication of CN114780889A
Application granted
Publication of CN114780889B

Classifications

    • G06F16/9574 Browsing optimisation of access to content, e.g. by caching (under G06F16/00 Information retrieval; G06F16/95 Retrieval from the web; G06F16/957 Browsing optimisation)
    • G06N3/045 Combinations of networks (under G06N3/00 Computing arrangements based on biological models; G06N3/02 Neural networks; G06N3/04 Architecture, e.g. interconnection topology)
    • G06N3/048 Activation functions
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G06N3/08 Learning methods


Abstract

The invention discloses a cache replacement system and method based on imitation learning. The system comprises an access data acquisition module, a cache replacement prediction module, a cache replacement module and a database. The method comprises the following steps: 1) acquire a cache access request sequence; 2) input the cache access request sequence into a trained cache replacement prediction neural network, compute the cache line eviction probabilities, and transmit them to the cache replacement module; 3) the cache replacement module performs cache line eviction according to the cache line eviction policy, so that the current cache access request is stored in the cache. The invention builds the neural network model from sequence-feature extractors such as a temporal convolutional neural network and an attention mechanism, fully extracting the history information of the access sequence and the context information of the cache lines in the cache, and thereby improving the prediction performance of the model.

Description

Cache replacement system and method based on imitation learning
Technical Field
The invention relates to the field of network caching, in particular to a cache replacement system and method based on imitation learning.
Background
A cache is a temporary, high-speed data-exchange memory that ensures frequently accessed content can be retrieved quickly. Cache replacement is the strategy for updating the contents of a cache memory when the cache is full: it aims to replace content that is accessed at low frequency with content that is accessed at high frequency, so that a cache of limited size is used as efficiently as possible. The quality of the cache replacement model therefore directly determines the performance of a computer system or network server.
A cache replacement system based on imitation learning is built around a content delivery network (Content Delivery Network, CDN) as the core of content management and network traffic management. Its fundamental goal is to ensure that edge cache servers provide more efficient service to users, thereby building a high-quality, high-efficiency network application service with a well-ordered network.
In the network caching field, the quality of the cache replacement model determines whether cache resources can be used fully and efficiently. In recent years, as demands for Internet data streaming and network content access keep growing, cache replacement algorithms for network edge caches have attracted increasing attention from researchers. The cache replacement models in common use today fall into two classes. The first is the cache replacement model based on heuristic rules and its variants, which models cache replacement with hand-crafted heuristic rules and is comparatively simple to implement. The second is the machine-learning-based cache replacement model, whose idea is to continuously adapt the model through supervised learning to fit different cache access patterns.
The first, heuristic class generalizes poorly: its performance varies widely across cache access patterns, is strongly affected by the quality of the cache access sequence, and degrades in network caches whose content accesses change frequently. The second, machine-learning class improves generalization considerably, but it requires a large number of training samples and suffers from feedback delay, which can make it respond slowly in highly dynamic environments; it also fails to exploit historical access information effectively, reducing the cache replacement hit rate.
Disclosure of Invention
The invention aims to provide a cache replacement system based on imitation learning, which comprises an access data acquisition module, a cache replacement prediction module, a cache replacement module and a database.
The access data acquisition module acquires a cache access request sequence S = {s_{t-H+1}, s_{t-H+2}, ..., s_{t-1}, s_t} and transmits it to the cache replacement prediction module; s_t denotes the cache access at time t, and H is the number of historical cache accesses.
The cache replacement prediction module stores a cache eviction prediction neural network.
The cache replacement prediction module predicts the cache line eviction probabilities with the cache eviction prediction neural network and transmits them to the cache replacement module.
The cache replacement module stores the current cache access request into the cache according to the cache line eviction probabilities.
The database stores the data of the access data acquisition module, the cache replacement prediction module and the cache replacement module.
Further, the cache eviction prediction neural network comprises a temporal convolutional neural network, an attention mechanism neural network, a socket layer and an output layer.
Further, the temporal convolutional neural network comprises an input layer, hidden layers and an output layer.
The input of the input layer of the temporal convolutional neural network is the cache access request sequence S = {s_{t-H+1}, s_{t-H+2}, ..., s_{t-1}, s_t}; its output is the embedding e(s) = {e(s_{t-H+1}), e(s_{t-H+2}), ..., e(s_{t-1}), e(s_t)} of the cache access request sequence, where e(s_t) denotes the embedding of the cache access at time t.
The input of the first hidden layer of the temporal convolutional neural network is the output e(s) of the input layer.
The input of the (j+1)-th hidden layer is the output hidden^j of the j-th hidden layer, and its output hidden^{j+1} is:
hidden^{j+1} = Activation(hidden^j + F(s_t))    (1)
where F(s_t) denotes the output of the dilated convolution at time t and Activation is an activation function.
The output F(s_t) of the dilated convolution of the temporal convolutional neural network at time t is:
F(s_t) = Σ_{i=0}^{k-1} f(i) · hidden^j_{t-d·i}    (2)
where hidden^j_{t-d·i} is the j-th hidden layer's information for the access data at time t-d·i; f(i) is the convolution filter; k is the convolution kernel size; and d is the dilation coefficient, with d = 2^j for the j-th layer.
The output of the output layer of the temporal convolutional neural network is the output of the last hidden layer, expressed as the history information h(s) = {h(s_{t-H+1}), h(s_{t-H+2}), ..., h(s_{t-1}), h(s_t)} of the cache access request sequence, where h(s_t) denotes the history information of the cache access content at time t.
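The dilated causal convolution with the residual connection of equations (1) and (2) can be sketched in PyTorch as follows; the channel width, kernel size, and number of layers are illustrative assumptions rather than values given by the patent.

```python
# A minimal sketch of the temporal convolutional network described above.
import torch
import torch.nn as nn

class DilatedCausalBlock(nn.Module):
    """One hidden layer of the TCN: dilated causal conv + residual, Eq. (1)-(2)."""
    def __init__(self, channels: int, kernel_size: int, dilation: int):
        super().__init__()
        # Left-pad so the convolution at time t only sees t, t-d, ..., t-d*(k-1).
        self.pad = (kernel_size - 1) * dilation
        self.conv = nn.Conv1d(channels, channels, kernel_size, dilation=dilation)
        self.act = nn.ReLU()

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        # hidden: (batch, channels, H) -- output of the previous hidden layer
        f = self.conv(nn.functional.pad(hidden, (self.pad, 0)))  # F(s_t), Eq. (2)
        return self.act(hidden + f)                              # residual, Eq. (1)

class TCN(nn.Module):
    """Stack of blocks with dilation d = 2^j, mapping e(s) to h(s)."""
    def __init__(self, channels: int = 64, kernel_size: int = 2, layers: int = 4):
        super().__init__()
        self.blocks = nn.Sequential(*[
            DilatedCausalBlock(channels, kernel_size, dilation=2 ** j)
            for j in range(layers)
        ])

    def forward(self, e_s: torch.Tensor) -> torch.Tensor:
        return self.blocks(e_s)  # history information h(s), same shape as e(s)
```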
Further, the inputs of the attention mechanism neural network are the history information h(s) = {h(s_{t-H+1}), h(s_{t-H+2}), ..., h(s_{t-1}), h(s_t)} and the embedding e(l) = {e(l_1), e(l_2), e(l_3), ..., e(l_W)} of the cache line contents at the current time step; the output is a context vector for each cache line, G = {g_1, g_2, ..., g_W}. Here l_w denotes the w-th cache line in the cache, W the number of cache lines in the cache, g_w the context of the w-th cache line, and w = 1, 2, ..., W.
The context vector g_w is:
g_w = Σ_{i=1}^{H} α_i · h(s_{t-H+i})    (3)
where the coefficient α_i is:
α_i = softmax(e(l_w)^T · W_e · h(s_{t-H+i}))    (4)
and W_e is a trainable parameter.
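Equations (3) and (4) amount to dot-product attention between the cache-line embeddings and the access history; a minimal sketch, in which the bias-free linear map standing in for the trainable parameter W_e and the tensor shapes are assumptions:

```python
import torch
import torch.nn as nn

class CacheLineAttention(nn.Module):
    """Context vector g_w for each cache line over the access history, Eq. (3)-(4)."""
    def __init__(self, dim: int):
        super().__init__()
        self.W_e = nn.Linear(dim, dim, bias=False)  # trainable parameter W_e

    def forward(self, h_s: torch.Tensor, e_l: torch.Tensor) -> torch.Tensor:
        # h_s: (H, dim) history h(s); e_l: (W, dim) cache-line embeddings e(l)
        scores = e_l @ self.W_e(h_s).T         # e(l_w)^T W_e h(s_{t-H+i}), (W, H)
        alpha = torch.softmax(scores, dim=-1)  # attention coefficients alpha_i
        return alpha @ h_s                     # g_w = sum_i alpha_i h(s_{t-H+i})
```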
Further, the input of the socket layer is the context vector g_w, and the output is the reuse distance dis = {dis_1, dis_2, ..., dis_W} of each cache line, where dis_w denotes the reuse distance of the w-th cache line.
The reuse distance dis of each cache line is:
dis = Linear(Relu(Linear(g_w)))    (5)
where Relu is an activation function and Linear is a linear function.
The input of the output layer is the reuse distance dis of each cache line, and the output is the eviction probability prob = {p_1, p_2, ..., p_W}, where p_w denotes the eviction probability of the w-th cache line in the cache.
The eviction probability prob of each cache line is:
prob = Sigmoid(dis)    (6)
where Sigmoid is an activation function.
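A minimal sketch of the socket layer of equation (5) and the output layer of equation (6); the hidden width of the two-layer perceptron is an assumed value:

```python
import torch
import torch.nn as nn

class EvictionHead(nn.Module):
    """Socket layer (Eq. 5) and output layer (Eq. 6): g_w -> reuse distance -> prob."""
    def __init__(self, dim: int, hidden: int = 128):
        super().__init__()
        self.socket = nn.Sequential(
            nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, 1),
        )  # dis = Linear(Relu(Linear(g_w)))

    def forward(self, g: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
        dis = self.socket(g).squeeze(-1)  # predicted reuse distance per line, (W,)
        prob = torch.sigmoid(dis)         # eviction probability, Eq. (6)
        return dis, prob
```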
Further, the system also comprises a neural network training module;
the neural network training module is used for training the cache eviction prediction neural network of the cache replacement prediction module.
Further, the steps of training the cache eviction prediction neural network are:
1) Judge whether the current cache is not full or whether the current cache access request is already in the cache; if so, put the current cache access request directly into the cache or fetch it from the cache; otherwise, go to step 2);
2) Establish the cache eviction prediction neural network;
3) Compute the true reuse distance dis^true = {dis^true_1, ..., dis^true_n} of each cache access request with the Belady algorithm, where n is the total number of accesses and dis^true_n denotes the true reuse distance of the n-th cache access request;
4) Generate the eviction probability prob = {p_1, p_2, ..., p_W} of each cache line with the cache eviction prediction neural network;
5) Compute the error between the cache line eviction probabilities output at time t by the cache eviction prediction network and by the Belady algorithm:
loss = bceloss(prob_t, prob^Belady_t)    (7)
where prob_t = {p_{t,1}, ..., p_{t,W}} and prob^Belady_t = {p̄_{t,1}, ..., p̄_{t,W}} are the cache line eviction probabilities at time t of the cache eviction prediction network and of the Belady algorithm, respectively; p_{t,i} and p̄_{t,i} are the probabilities of evicting the i-th cache line at time t; and W is the cache size.
The Belady cache line eviction probability p̄_{t,i} satisfies:
p̄_{t,i} = 1 if cache line i is among the topk lines with the largest true reuse distance, and p̄_{t,i} = 0 otherwise    (8), (9)
where the parameter topk and the sorted sequence index^true are given by:
topk = int(W * α)    (10)
index^true = decend_sort(dis^true)    (11)
where α is the eviction percentage, int() is the round-down function, and decend_sort() is the descending-order sorting function;
6) Update the neural network parameter ω using the gradient of the cache eviction prediction network; judge whether the loss function value has reached its minimum or the number of training rounds has reached the preset value; if so, end the training of the cache eviction prediction neural network; otherwise, go to step 7);
The parameter ω of the cache eviction prediction neural network is updated as:
ω' = ω - ε_t · ∇_ω bceloss    (12)
where ω is the current parameter of the network; ω' is the updated parameter; ε_t is the current learning rate; ∇ is the gradient operator; and bceloss is the binary cross-entropy loss function (see the sketch after this list).
7) The cache replacement module selects a cache line for eviction and stores the current access into a cache;
8) Acquire a new cache access request and return to step 1).
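A minimal sketch of steps 3) to 6): the Belady oracle labels of equations (8)-(11) and the update of equation (12), expressed with PyTorch's built-in binary cross-entropy. The `model` wrapper, its call signature, and the linear-scan reuse-distance computation are simplifying assumptions:

```python
import torch

def belady_reuse_distance(accesses: list, t: int, lines: list) -> torch.Tensor:
    """True reuse distance of each cached line at time t (Belady oracle).

    A simple O(n) forward scan per line; a production version would
    precompute next-use indices in one pass over the trace.
    """
    dists = []
    for line in lines:
        future = accesses[t + 1:]
        dists.append(future.index(line) + 1 if line in future else len(future) + 1)
    return torch.tensor(dists, dtype=torch.float)

def belady_labels(true_dis: torch.Tensor, alpha: float = 0.2) -> torch.Tensor:
    """Eq. (8)-(11): mark the topk lines of largest true reuse distance evictable."""
    W = true_dis.numel()
    topk = int(W * alpha)                              # topk = int(W * alpha)
    order = torch.argsort(true_dis, descending=True)   # decend_sort(dis^true)
    labels = torch.zeros(W)
    labels[order[:max(topk, 1)]] = 1.0                 # p_bar_i = 1 for evictable lines
    return labels

def train_step(model, optimizer, history, lines_emb, labels) -> float:
    """One imitation-learning update, Eq. (7) and (12)."""
    prob = model(history, lines_emb)  # prob_t from the network (assumed interface)
    loss = torch.nn.functional.binary_cross_entropy(prob, labels)  # bceloss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()                  # omega' = omega - eps_t * grad(bceloss)
    return loss.item()
```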
Further, the step of storing the current cache access request into the cache by the cache replacement module includes:
1) Judging whether the current cache is not full or whether the current cache access request is in the cache; if yes, directly putting the current cache access request into the cache or acquiring it from the cache; otherwise, entering step 2);
2) The cache replacement module invokes the cache line eviction probability from the cache replacement prediction module and generates a cache line eviction policy;
3) And the cache replacement module performs cache line eviction according to a cache line eviction policy, so that the current cache access request is stored in a cache.
Further, the cache line eviction policy is as follows (a sketch follows):
index = decend_sort(prob)    (13)
evict_index = index(0)    (14)
where prob denotes the cache line eviction probability vector, decend_sort() is the descending-order sorting function, index is the resulting index sequence, index(0) denotes the cache line with the highest eviction probability, and evict_index is the eviction decision.
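Equations (13) and (14) reduce to an argsort; a sketch assuming the probabilities arrive as a torch tensor:

```python
import torch

def select_eviction(prob: torch.Tensor) -> int:
    """Eq. (13)-(14): evict the line with the highest eviction probability."""
    index = torch.argsort(prob, descending=True)  # index = decend_sort(prob)
    return int(index[0])                          # evict_index = index(0)
```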
A method of using the cache replacement system based on imitation learning, comprising the following steps (a minimal end-to-end sketch follows the list):
1) Construct the cache replacement system based on imitation learning;
2) Acquire a cache access request sequence;
3) Judge whether the current cache is not full or whether the current access is already in the cache; if so, put the current access directly into the cache or fetch it from the cache and go to step 9); otherwise, go to step 4);
4) Train the cache replacement prediction neural network of the cache replacement prediction module;
5) Input the cache access request sequence into the trained cache replacement prediction neural network, compute the cache line eviction probabilities, and transmit them to the cache replacement module;
6) The cache replacement module judges whether the current cache is not full or whether the current cache access request is already in the cache; if so, the current cache access request is put directly into the cache or fetched from the cache; otherwise, go to step 7);
7) The cache replacement module invokes the cache line eviction probabilities from the cache replacement prediction module and generates a cache line eviction policy;
8) The cache replacement module performs cache line eviction according to the cache line eviction policy, so that the current cache access request is stored in the cache;
9) Acquire a new cache access request sequence and return to step 3).
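A minimal sketch of this serving loop, including the hit and not-full short-circuits of steps 3) and 6); the `predict_probs(history, cache)` wrapper around the trained network is an assumed interface:

```python
import torch

def serve(accesses, cache_size, predict_probs, H=64):
    cache, hits = [], 0
    for t, s in enumerate(accesses):
        if s in cache:                       # request already cached: a hit
            hits += 1
            continue
        if len(cache) < cache_size:          # cache not full: insert directly
            cache.append(s)
            continue
        history = accesses[max(0, t - H + 1): t + 1]  # last H accesses
        prob = predict_probs(history, cache)          # eviction probabilities
        victim = int(torch.argmax(prob))              # highest-probability line
        cache[victim] = s                             # evict it, store the request
    return hits / max(len(accesses), 1)               # cache hit rate
```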
The technical effect of the invention is that it builds the neural network model from sequence-feature extractors such as the temporal convolutional neural network and the attention mechanism, fully extracts the history information of the access sequence and the context information of the cache lines in the cache, and improves the prediction performance of the model.
The invention further imitates the behavior of the Belady expert through imitation learning, training the neural network model on the difference between the output of the cache replacement prediction module and the output of the Belady expert, which improves the hit rate and widens the applicable range of the model.
Drawings
FIG. 1 is the overall algorithm flow diagram of the cache replacement method based on imitation learning;
FIG. 2 is the neural network model structure diagram of the cache replacement method based on imitation learning;
FIG. 3 is the neural network training flow chart of the cache replacement method based on imitation learning.
Detailed Description
The present invention is further described below with reference to examples, but the scope of the above subject matter of the invention should not be construed as limited to the following examples. Various substitutions and alterations made according to ordinary skill and familiar means of the art, without departing from the technical spirit of the invention, are all intended to be included in the scope of the invention.
Example 1:
Referring to FIGS. 1 to 3, a cache replacement system based on imitation learning includes an access data acquisition module, a cache replacement prediction module, a cache replacement module and a database.
The access data acquisition module acquires a cache access request sequence S = {s_{t-H+1}, s_{t-H+2}, ..., s_{t-1}, s_t} and transmits it to the cache replacement prediction module; s_t denotes the cache access at time t, and H is the number of historical cache accesses.
The cache replacement prediction module stores a cache eviction prediction neural network.
The cache replacement prediction module predicts the cache line eviction probabilities with the cache eviction prediction neural network and transmits them to the cache replacement module.
The cache replacement module stores the current cache access request into the cache according to the cache line eviction probabilities.
The database stores the data of the access data acquisition module, the cache replacement prediction module and the cache replacement module.
The cache eviction prediction neural network comprises a temporal convolutional neural network, an attention mechanism neural network, a socket layer and an output layer.
The temporal convolutional neural network comprises an input layer, hidden layers and an output layer.
The input of the input layer of the temporal convolutional neural network is the cache access request sequence S = {s_{t-H+1}, s_{t-H+2}, ..., s_{t-1}, s_t}; its output is the embedding e(s) = {e(s_{t-H+1}), e(s_{t-H+2}), ..., e(s_{t-1}), e(s_t)} of the cache access request sequence, where e(s_t) denotes the embedding of the cache access at time t.
The input of the first hidden layer of the temporal convolutional neural network is the output e(s) of the input layer.
The input of the (j+1)-th hidden layer is the output hidden^j of the j-th hidden layer, and its output hidden^{j+1} is:
hidden^{j+1} = Activation(hidden^j + F(s_t))    (1)
where F(s_t) denotes the output of the dilated convolution at time t and Activation is an activation function.
The output F(s_t) of the dilated convolution of the temporal convolutional neural network at time t is:
F(s_t) = Σ_{i=0}^{k-1} f(i) · hidden^j_{t-d·i}    (2)
where hidden^j_{t-d·i} is the j-th hidden layer's information for the access data at time t-d·i; f(i) is the convolution filter; k is the convolution kernel size; and d is the dilation coefficient, with d = 2^j for the j-th layer.
The output of the output layer of the temporal convolutional neural network is the output of the last hidden layer, expressed as the history information h(s) = {h(s_{t-H+1}), h(s_{t-H+2}), ..., h(s_{t-1}), h(s_t)} of the cache access request sequence, where h(s_t) denotes the history information of the cache access content at time t.
F(s_t) in equation (2) is the dilated-convolution output of the temporal convolutional neural network at time t; hidden^{j+1} in equation (1) is the output obtained after F(s_t) is residually connected with the input hidden^j at time t. F(s_t) can thus be seen as an intermediate output of the network at time t, and the output after the residual connection with the current input is the actual output of the network at the current time. Accordingly, the output h(s) of the output layer is the actual output of the last layer of the temporal convolutional network.
The inputs of the attention mechanism neural network are the history information h(s) = {h(s_{t-H+1}), h(s_{t-H+2}), ..., h(s_{t-1}), h(s_t)} of the current cache access request sequence and the embedding e(l) = {e(l_1), e(l_2), e(l_3), ..., e(l_W)} of the cache line contents at the current time step; the output is a context vector for each cache line, G = {g_1, g_2, ..., g_W}. Here l_w denotes the w-th cache line in the cache, W the number of cache lines in the cache, g_w the context of the w-th cache line, and w = 1, 2, ..., W.
The context vector g_w is:
g_w = Σ_{i=1}^{H} α_i · h(s_{t-H+i})    (3)
where the coefficient α_i is:
α_i = softmax(e(l_w)^T · W_e · h(s_{t-H+i}))    (4)
and W_e is a trainable parameter.
The input of the socket layer is the context vector g_w, and the output is the reuse distance dis = {dis_1, dis_2, ..., dis_W} of each cache line, where dis_w denotes the reuse distance of the w-th cache line.
The reuse distance dis of each cache line is:
dis = Linear(Relu(Linear(g_w)))    (5)
where Relu is an activation function and Linear is a linear function.
The input of the output layer is the reuse distance dis of each cache line, and the output is the eviction probability prob = {p_1, p_2, ..., p_W} of each cache line, where p_w denotes the eviction probability of the w-th cache line in the cache.
The eviction probability prob of each cache line is:
prob = Sigmoid(dis)    (6)
where Sigmoid is an activation function.
The system also comprises a neural network training module;
the neural network training module is used for training the cache eviction prediction neural network of the cache replacement prediction module.
The steps of training the cache eviction prediction neural network are:
1) Judge whether the current cache is not full or whether the current cache access request is already in the cache; if so, put the current cache access request directly into the cache or fetch it from the cache; otherwise, go to step 2);
2) Establish the cache eviction prediction neural network;
3) Compute the true reuse distance dis^true = {dis^true_1, ..., dis^true_n} of each cache access request with the Belady algorithm, where n is the total number of accesses and dis^true_n denotes the true reuse distance of the n-th cache access request;
4) Generate the eviction probability prob = {p_1, p_2, ..., p_W} of each cache line with the cache eviction prediction neural network;
5) Compute the error between the cache line eviction probabilities output at time t by the cache eviction prediction network and by the Belady algorithm:
loss = bceloss(prob_t, prob^Belady_t)    (7)
where prob_t and prob^Belady_t are the cache line eviction probabilities at time t of the cache eviction prediction network and of the Belady algorithm, respectively; p_{t,i} and p̄_{t,i} are the probabilities of evicting the i-th cache line at time t; and W is the cache size.
The Belady cache line eviction probability p̄_{t,i} equals 1 if cache line i is among the topk lines with the largest true reuse distance and 0 otherwise, with:
topk = int(W * α)    (8)
index^true = decend_sort(dis^true)    (9)
where α is the eviction percentage, int() is the round-down function, and decend_sort() is the descending-order sorting function;
6) Update the neural network parameter ω using the gradient of the cache eviction prediction network; judge whether the loss function value has reached its minimum or the number of training rounds has reached the preset value; if so, end the training of the cache eviction prediction neural network; otherwise, go to step 7);
The parameter ω of the cache eviction prediction neural network is updated as:
ω' = ω - ε_t · ∇_ω bceloss
where ω is the current parameter of the network; ω' is the updated parameter; ε_t is the current learning rate; ∇ is the gradient operator; and bceloss is the binary cross-entropy loss function;
7) The cache replacement module selects a cache line for eviction and stores the current access into the cache;
8) Acquire a new cache access request and return to step 1).
The steps by which the cache replacement module stores the current cache access request into the cache are:
1) Judge whether the current cache is not full or whether the current cache access request is already in the cache; if so, put the current cache access request directly into the cache or fetch it from the cache; otherwise, go to step 2);
2) The cache replacement module invokes the cache line eviction probabilities from the cache replacement prediction module and generates a cache line eviction policy;
3) The cache replacement module performs cache line eviction according to the cache line eviction policy, so that the current cache access request is stored in the cache.
The cache line eviction policy is as follows:
index = decend_sort(prob)    (13)
evict_index = index(0)    (14)
where prob denotes the cache line eviction probability vector, decend_sort() is the descending-order sorting function, index is the resulting index sequence, index(0) denotes the cache line with the highest eviction probability, and evict_index is the eviction decision.
A method of using the cache replacement system based on imitation learning, comprising the following steps:
1) Construct the cache replacement system based on imitation learning;
2) Acquire a cache access request sequence;
3) Judge whether the current cache is not full or whether the current access is already in the cache; if so, put the current access directly into the cache or fetch it from the cache and go to step 9); otherwise, go to step 4);
4) Train the cache replacement prediction neural network of the cache replacement prediction module;
5) Input the cache access request sequence into the trained cache replacement prediction neural network, compute the cache line eviction probabilities, and transmit them to the cache replacement module;
6) The cache replacement module judges whether the current cache is not full or whether the current cache access request is already in the cache; if so, the current cache access request is put directly into the cache or fetched from the cache; otherwise, go to step 7);
7) The cache replacement module invokes the cache line eviction probabilities from the cache replacement prediction module and generates a cache line eviction policy;
8) The cache replacement module performs cache line eviction according to the cache line eviction policy, so that the current cache access request is stored in the cache;
9) Acquire a new cache access request sequence and return to step 3).
Example 2:
The cache replacement system based on imitation learning comprises an access data acquisition module, a cache replacement prediction module, a cache replacement module and a database.
The access data acquisition module acquires the access sequence to the cache, S = {s_{t-H+1}, s_{t-H+2}, ..., s_{t-1}, s_t}, where s_t denotes the cache access at time t and H is the number of historical cache accesses in the cache access sequence. It judges whether the current cache is not full or whether the currently accessed content is already in the cache; if so, the current access is put directly into the cache or fetched from the cache; otherwise, the access sequence is sent to the cache replacement prediction module.
The cache replacement prediction module comprises a temporal convolutional neural network, an attention mechanism neural network, a socket layer and an output layer, and sends the output cache line eviction probabilities to the cache replacement module.
The temporal convolutional neural network and the attention mechanism neural network are applied in sequence: the temporal convolutional network extracts, in real time, the history information of each cache access in the access sequence, and the attention mechanism network combines the history information of the access sequence with the embeddings of the cache lines currently in the cache to generate the context of each cache line. Together with the socket layer and the output layer, they form the neural network model, whose structure comprises:
I) extracting the access sequence history information with the temporal convolutional neural network;
II) computing the context vectors with the attention mechanism network;
III) outputting the reuse distance prediction of each cache line through the socket layer;
IV) outputting the eviction probability prediction of each cache line through the output layer.
The temporal convolutional neural network comprises an input layer, hidden layers and an output layer.
The input of the input layer of the temporal convolutional neural network is the cache access request sequence S = {s_{t-H+1}, s_{t-H+2}, ..., s_{t-1}, s_t}; its output is the embedding e(s) = {e(s_{t-H+1}), e(s_{t-H+2}), ..., e(s_{t-1}), e(s_t)} of the cache access sequence, where e(s_t) denotes the embedding of the cache access at time t. The output of the input layer serves as the input of the first hidden layer.
The input of hidden layer hidden^{j+1} of the temporal convolutional network is the output hidden^j of the previous hidden layer, computed as:
hidden^{j+1} = Activation(hidden^j + F(s_t))    (1)
where F(s_t) denotes the output of the dilated convolution and Activation is an activation function. So that the gradient does not vanish, the temporal convolutional network uses a residual connection at each layer to obtain the output hidden^{j+1} of the next layer. The dilated convolution is:
F(s_t) = Σ_{i=0}^{k-1} f(i) · hidden^j_{t-d·i}    (2)
where hidden^j_{t-d·i} is the j-th hidden layer's information for the access data at time t-d·i; f: {0, ..., k-1} → R is a convolution filter, k is the convolution kernel size, d is the dilation coefficient, and d = 2^j for the j-th layer.
The output layer of the temporal convolutional neural network outputs the information of the last hidden layer as the history information h(s) = {h(s_{t-H+1}), h(s_{t-H+2}), ..., h(s_{t-1}), h(s_t)} of each cache access, where h(s_t) denotes the history information of the cache access content at time t.
Attention network: for any time step t, the inputs of the attention network are the history information h(s) = {h(s_{t-H+1}), h(s_{t-H+2}), ..., h(s_{t-1}), h(s_t)} of the current cache access request sequence and the embedding e(l) = {e(l_1), e(l_2), e(l_3), ..., e(l_W)} of the cache line contents at the current time step, where l_w denotes the w-th cache line in the cache and W denotes the number of cache lines in the cache. The output is a context vector for each cache line, G = {g_1, g_2, ..., g_W}, where g_w denotes the context content of the w-th cache line.
The coefficient α_i is:
α_i = softmax(e(l_w)^T · W_e · h(s_{t-H+i}))    (3)
and the context g_w is:
g_w = Σ_{i=1}^{H} α_i · h(s_{t-H+i})    (4)
where W_e is a trainable parameter.
Socket layer: the input of the socket layer is the context vector g_w, and the output is the reuse distance dis = {dis_1, dis_2, ..., dis_W} of each cache line, where dis_w denotes the reuse distance of the w-th cache line:
dis = Linear(Relu(Linear(g_w)))    (5)
The activation function of the socket layer is Relu: f(x) = max(0, x), and the linear function is Linear: y = x·A^T + b.
Output layer: the input of the output layer is the reuse distance dis of each cache line, and the output is the eviction probability prob = {p_1, p_2, ..., p_W} of each cache line, where p_w denotes the eviction probability of the w-th cache line in the cache:
prob = Sigmoid(dis)    (6)
The activation function of the output layer is Sigmoid: σ(x) = 1 / (1 + e^{-x}).
The cache replacement module selects the cache line to be evicted according to the eviction probability of each cache line in the cache, and stores the current access data.
The eviction selection of a cache line is as follows:
index = decend_sort(prob)    (7)
evict_index = index(0)    (8)
where prob denotes the cache line eviction probability vector, decend_sort() is the descending-order sorting function, and index(0) denotes the cache line with the highest eviction probability.
The neural network training module trains the neural network model to obtain a trained neural network model.
The step of training the neural network model comprises:
1) Judge whether the current cache is not full or whether the currently accessed content is already in the cache; if so, put the current access directly into the cache or fetch it from the cache and go to step 7); otherwise, go to step 2).
2) The Belady expert policy computes the true reuse distance dis^true = {dis^true_1, ..., dis^true_n} of each access from all the access data, where n is the total number of accesses.
3) Establish the cache eviction prediction network, which comprises the cache replacement prediction module and the cache eviction module.
During training, the cache eviction prediction network model, i.e. the neural network model described in I) II) III) IV) above, is built. Training the neural network differs from generating the actual cache replacement policy as follows: during training, the Belady expert generates the cache eviction content in real time, the error against the cache line eviction probabilities generated by the model is computed, and the parameters are updated; in the actual cache replacement policy, only the trained model is used to compute the cache line eviction probabilities.
4) The cache replacement prediction module generates the eviction probability prob = {p_1, p_2, ..., p_W} of each cache line.
5) Compute the error between the cache line eviction probabilities output by the cache eviction prediction network and those of the Belady expert policy:
loss = bceloss(prob_t, prob^Belady_t)    (9)
where prob_t and prob^Belady_t are the probability vectors for evicting cache contents at time t of the model and of the Belady algorithm, respectively; p_{t,i} and p̄_{t,i} are the probabilities of evicting the i-th cache line at time t by the model and by the Belady algorithm; and W is the cache size.
prob^Belady_t is computed as:
topk = int(W * α)    (10)
index^true = decend_sort(dis^true), with p̄_{t,i} = 1 for the topk lines of largest true reuse distance and p̄_{t,i} = 0 otherwise    (11)
where α is the eviction percentage, int() is the round-down function, and decend_sort() is the descending-order sorting function.
The cache eviction prediction network parameter ω is updated as:
ω' = ω - ε_t · ∇_ω bceloss    (12)
where ω is the current parameter of the network, ω' is the updated parameter, ε_t is the current learning rate, and ∇ is the gradient operator.
6) Update the global neural network parameters using the gradient of the cache eviction prediction network.
7) The cache replacement module selects a cache line for eviction and stores the current access in the cache.
8) Acquire a new cache access sequence and return to step 1).
The cache replacement module operates as follows:
1) Judge whether the current cache is not full or whether the currently accessed content is already in the cache; if so, put the current access directly into the cache or fetch it from the cache; otherwise, go to step 2).
2) Extract the reuse distance of each cached content in the current cache with the trained neural network:
2.1) With the last hidden layer of the temporal convolutional network and the cache line embeddings as inputs, obtain the context information g_w of each cache line.
2.2) Pass the context information g_w of each cache line through the socket layer to output the reuse distance dis of the cache line.
3) From the extracted reuse distances, output the eviction probability of each cache line through the output layer of the trained neural network.
4) Replace the cache content with the cache replacement module.
The database stores the data of the access data acquisition module, the cache replacement prediction module, the cache replacement module and the neural network training module.
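Tying modules I) to IV) together, the following sketch assembles the full cache eviction prediction network from the TCN, CacheLineAttention, and EvictionHead classes sketched earlier; the vocabulary size and embedding width are assumed values:

```python
# A sketch of the assembled cache eviction prediction network, reusing the
# illustrative TCN, CacheLineAttention, and EvictionHead classes defined in
# the earlier sketches.
import torch
import torch.nn as nn

class CacheEvictionPredictor(nn.Module):
    def __init__(self, vocab: int = 10000, dim: int = 64):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)   # shared embeddings e(s), e(l)
        self.tcn = TCN(channels=dim)            # I) history extractor h(s)
        self.attn = CacheLineAttention(dim)     # II) context vectors g_w
        self.head = EvictionHead(dim)           # III)-IV) dis and prob

    def forward(self, access_ids: torch.Tensor, line_ids: torch.Tensor):
        # access_ids: (H,) recent access sequence; line_ids: (W,) cached lines
        e_s = self.embed(access_ids).T.unsqueeze(0)  # (1, dim, H) for Conv1d
        h_s = self.tcn(e_s).squeeze(0).T             # (H, dim) history h(s)
        g = self.attn(h_s, self.embed(line_ids))     # (W, dim) contexts g_w
        dis, prob = self.head(g)                     # reuse distances, eviction probs
        return prob
```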
Example 3:
Referring to FIGS. 1 to 3, the cache replacement method based on imitation learning includes the following steps:
1) Acquire the access sequence to the cache.
The access data acquisition module acquires the access sequence to the cache, S = {s_{t-H+1}, s_{t-H+2}, ..., s_{t-1}, s_t}, where s_t denotes the cache access at time t and H is the number of historical cache accesses in the cache access sequence. Judge whether the current cache is not full or whether the currently accessed content is already in the cache; if so, put the current access directly into the cache or fetch it from the cache; otherwise, send the access sequence to the cache replacement prediction module.
2) Establish the cache replacement prediction model and send the output cache line eviction probabilities to the cache replacement module.
The cache replacement prediction module comprises a temporal convolutional neural network, an attention mechanism neural network, a socket layer and an output layer, and sends the output cache line eviction probabilities to the cache replacement module.
The temporal convolutional neural network includes an input layer, hidden layers and an output layer.
The input of the input layer of the temporal convolutional neural network is the cache access request sequence S = {s_{t-H+1}, s_{t-H+2}, ..., s_{t-1}, s_t}; its output is the embedding e(s) = {e(s_{t-H+1}), e(s_{t-H+2}), ..., e(s_{t-1}), e(s_t)} of the cache access sequence, where e(s_t) denotes the embedding of the cache access at time t. The output of the input layer serves as the input of the first hidden layer.
The input of hidden layer hidden^{j+1} of the temporal convolutional network is the output hidden^j of the previous hidden layer, computed as:
hidden^{j+1} = Activation(hidden^j + F(s_t))    (1)
where F(s_t) denotes the output of the dilated convolution and Activation is an activation function. So that the gradient does not vanish, the temporal convolutional network uses a residual connection at each layer to obtain the output hidden^{j+1} of the next layer. The dilated convolution is:
F(s_t) = Σ_{i=0}^{k-1} f(i) · hidden^j_{t-d·i}    (2)
where hidden^j_{t-d·i} is the j-th hidden layer's information for the access data at time t-d·i; f: {0, ..., k-1} → R is a convolution filter, k is the convolution kernel size, d is the dilation coefficient, and d = 2^j for the j-th layer.
The output layer of the temporal convolutional neural network outputs the information of the last hidden layer as the history information h(s) = {h(s_{t-H+1}), h(s_{t-H+2}), ..., h(s_{t-1}), h(s_t)} of each cache access, where h(s_t) denotes the history information of the cache access content at time t.
For any time step t, the attention network takes as inputs the output h(s) = {h(s_{t-H+1}), h(s_{t-H+2}), ..., h(s_{t-1}), h(s_t)} of the output layer of the temporal convolutional network and the embedding e(l) = {e(l_1), e(l_2), e(l_3), ..., e(l_W)} of the cache line contents at the current time step, where l_w denotes the w-th cache line in the cache and W denotes the number of cache lines in the cache. The output is a context vector for each cache line, G = {g_1, g_2, ..., g_W}, where g_w denotes the context content of the w-th cache line.
The coefficient α_i is:
α_i = softmax(e(l_w)^T · W_e · h(s_{t-H+i}))    (3)
and the context g_w is:
g_w = Σ_{i=1}^{H} α_i · h(s_{t-H+i})    (4)
where W_e is a trainable parameter.
The input of the socket layer is the context vector g_w, and the output is the reuse distance dis = {dis_1, dis_2, ..., dis_W} of each cache line, where dis_w denotes the reuse distance of the w-th cache line:
dis = Linear(Relu(Linear(g_w)))    (5)
The activation function of the socket layer is Relu: f(x) = max(0, x), and the linear function is Linear: y = x·A^T + b.
The input of the output layer is the reuse distance dis of each cache line, and the output is the eviction probability prob = {p_1, p_2, ..., p_W} of each cache line, where p_w denotes the eviction probability of the w-th cache line in the cache:
prob = Sigmoid(dis)    (6)
The activation function of the output layer is Sigmoid: σ(x) = 1 / (1 + e^{-x}).
3) Input the output of the cache replacement prediction module into the cache replacement module, so that the current cache access replaces a certain cache line in the cache.
The cache replacement module selects the cache line to be evicted according to the eviction probability of each cache line in the cache, and stores the current access data.
The eviction selection of a cache line is as follows:
index = decend_sort(prob)    (7)
evict_index = index(0)    (8)
where prob denotes the cache line eviction probability vector, decend_sort() is the descending-order sorting function, and index(0) denotes the cache line with the highest eviction probability.
4) Build the neural network training module and train the neural network in the cache replacement prediction model to obtain the trained neural network model.
The steps of training the neural network model are:
4.1) Judge whether the current cache is not full or whether the currently accessed content is already in the cache; if so, put the current access directly into the cache or fetch it from the cache and go to step 4.7); otherwise, go to step 4.2).
4.2) The Belady expert policy computes the true reuse distance dis^true = {dis^true_1, ..., dis^true_n} of each access from all the access data, where n is the total number of accesses.
4.3) Establish the cache eviction prediction network, which comprises the cache replacement prediction module and the cache eviction module.
4.4) The cache replacement prediction module generates the eviction probability prob = {p_1, p_2, ..., p_W} of each cache line.
4.5) Compute the error between the cache line eviction probabilities output by the cache eviction prediction network and those of the Belady expert policy:
loss = bceloss(prob_t, prob^Belady_t)    (9)
where prob_t and prob^Belady_t are the probability vectors for evicting cache contents at time t of the model and of the Belady algorithm, respectively; p_{t,i} and p̄_{t,i} are the probabilities of evicting the i-th cache line at time t by the model and by the Belady algorithm; and W is the cache size.
prob^Belady_t is computed as:
topk = int(W * α)    (10)
index^true = decend_sort(dis^true), with p̄_{t,i} = 1 for the topk lines of largest true reuse distance and p̄_{t,i} = 0 otherwise    (11)
where α is the eviction percentage, int() is the round-down function, and decend_sort() is the descending-order sorting function.
The cache eviction prediction network parameter ω is updated as:
ω' = ω - ε_t · ∇_ω bceloss    (12)
where ω is the current parameter of the network, ω' is the updated parameter, ε_t is the current learning rate, and ∇ is the gradient operator.
4.6) Update the global neural network parameters using the gradient of the cache eviction prediction network.
4.7) The cache replacement module selects a cache line for eviction and stores the current access in the cache.
4.8) Acquire a new cache access sequence and return to step 4.1).
5) Input the access sequence into the trained neural network model and complete the cache replacement operation with the cache replacement module.
Cache replacement prediction and cache replacement are performed on the access data with the trained neural network as follows:
5.1) Judge whether the current cache is not full or whether the currently accessed content is already in the cache; if so, put the current access directly into the cache or fetch it from the cache; otherwise, go to step 5.2).
5.2) Extract the reuse distance of each cached content in the current cache with the trained neural network:
5.2.1) With the last hidden layer of the temporal convolutional network and the cache line embeddings as inputs, obtain the context information g_w of each cache line.
5.2.2) Pass the context information g_w of each cache line through the socket layer to output the reuse distance dis of the cache line.
5.3) From the extracted reuse distances, output the eviction probability of each cache line through the output layer of the trained neural network.
5.4) Replace the cache content with the trained cache replacement module.

Claims (7)

1. A cache replacement system based on imitation learning, characterized in that: the system comprises an access data acquisition module, a cache replacement prediction module, a cache replacement module and a database;
the access data acquisition module acquires a cache access request sequence S = {s_{t-H+1}, s_{t-H+2}, ..., s_{t-1}, s_t} and transmits it to the cache replacement prediction module; s_t denotes the cache access at time t, and H is the number of historical cache accesses;
the cache replacement prediction module stores a cache eviction prediction neural network;
the cache replacement prediction module predicts the cache line eviction probabilities with the cache eviction prediction neural network and transmits them to the cache replacement module;
the cache replacement module stores the current cache access request into the cache according to the cache line eviction probabilities;
the database stores data of the access data acquisition module, the cache replacement prediction module and the cache replacement module;
the cache eviction prediction neural network comprises a temporal convolutional neural network, an attention mechanism neural network, a socket layer and an output layer;
the temporal convolutional neural network comprises an input layer, hidden layers and an output layer;
the input of the input layer of the temporal convolutional neural network is the cache access request sequence S = {s_{t-H+1}, s_{t-H+2}, ..., s_{t-1}, s_t}; its output is the embedding e(s) = {e(s_{t-H+1}), e(s_{t-H+2}), ..., e(s_{t-1}), e(s_t)} of the cache access request sequence, where e(s_t) denotes the embedding of the cache access at time t;
the input of the first hidden layer of the temporal convolutional neural network is the output e(s) of the input layer;
the input of the (j+1)-th hidden layer is the output hidden^j of the j-th hidden layer, and its output hidden^{j+1} is:
hidden^{j+1} = Activation(hidden^j + F(s_t))    (1)
where F(s_t) denotes the output of the dilated convolution at time t and Activation is an activation function;
the dilated convolution F(s_t) at time t is:
F(s_t) = Σ_{i=0}^{k-1} f(i) · hidden^j_{t-d·i}    (2)
where hidden^j_{t-d·i} is the j-th hidden layer's information for the access data at time t-d·i; f(i) is the convolution filter; k is the convolution kernel size; and d is the dilation coefficient, with d = 2^j for the j-th layer;
the output of the output layer of the temporal convolutional neural network is the output of the last hidden layer, expressed as the history information h(s) = {h(s_{t-H+1}), h(s_{t-H+2}), ..., h(s_{t-1}), h(s_t)} of the cache access request sequence, where h(s_t) denotes the history information of the cache access content at time t;
the step of training the cache eviction prediction neural network comprises:
1) Judging whether the current cache is not full or whether the current cache access request is in the cache; if yes, directly putting the current cache access request into the cache or acquiring it from the cache; otherwise, entering step 2);
2) Establishing a cache eviction prediction neural network;
3) Calculating the true reuse distance dis^true = {dis^true_1, ..., dis^true_n} of each cache access request by the Belady algorithm, wherein n is the total number of accesses and dis^true_n denotes the true reuse distance of the n-th cache access request;
4) Generating an eviction probability prob = {p_1, p_2, ..., p_W} for each cache line using the cache eviction prediction neural network;
5) Calculating the error between the cache line eviction probabilities output at time t by the cache eviction prediction network and by the Belady algorithm:
loss = bceloss(prob_t, prob^Belady_t)    (7)
wherein prob_t = {p_{t,1}, ..., p_{t,W}} and prob^Belady_t = {p̄_{t,1}, ..., p̄_{t,W}} are the cache line eviction probabilities at time t of the cache eviction prediction network and of the Belady algorithm, respectively; p_{t,i} and p̄_{t,i} are the probabilities of evicting the i-th cache line at time t; and W is the cache size;
wherein the Belady cache line eviction probability p̄_{t,i} satisfies p̄_{t,i} = 1 if cache line i is among the topk cache lines with the largest true reuse distance, and p̄_{t,i} = 0 otherwise    (8), (9)
wherein the parameter topk and the sorted sequence index^true are given by:
topk = int(W * α)    (10)
index^true = decend_sort(dis^true)    (11)
where α is the eviction percentage, int() is the round-down function, and decend_sort() is the descending-order sorting function;
6) Updating the neural network parameter ω using the gradient of the cache eviction prediction network; judging whether the loss function value has reached the minimum or the number of training rounds has reached the preset value; if yes, ending the training of the cache eviction prediction neural network; otherwise, entering step 7);
wherein the parameter ω of the cache eviction prediction neural network is updated as:
ω' = ω - ε_t · ∇_ω bceloss    (12)
wherein ω is the current parameter of the network; ω' is the updated parameter; ε_t is the current learning rate; ∇ is the gradient operator; and bceloss is the binary cross-entropy loss function;
7) The cache replacement module selecting a cache line for eviction and storing the current access into the cache;
8) Acquiring a new cache access request and returning to step 1).
2. The cache replacement system based on imitation learning according to claim 1, wherein the inputs of the attention mechanism neural network are the history information h(s) = {h(s_{t-H+1}), h(s_{t-H+2}), ..., h(s_{t-1}), h(s_t)} and the embedding e(l) = {e(l_1), e(l_2), e(l_3), ..., e(l_W)} of the cache line contents at the current time step, and the output is a context vector for each cache line, G = {g_1, g_2, ..., g_W}; where l_w denotes the w-th cache line in the cache; W denotes the number of cache lines in the cache; g_w denotes the context of the w-th cache line; and w = 1, 2, ..., W;
wherein the context vector g_w is:
g_w = Σ_{i=1}^{H} α_i · h(s_{t-H+i})    (3)
wherein the coefficient α_i is:
α_i = softmax(e(l_w)^T · W_e · h(s_{t-H+i}))    (4)
where W_e is a trainable parameter.
3. The cache replacement system based on imitation learning of claim 2, wherein the input of the socket layer is the context vector g_w and the output is the reuse distance dis = {dis_1, dis_2, ..., dis_W} of each cache line; wherein dis_w denotes the reuse distance of the w-th cache line;
wherein the reuse distance dis of each cache line is:
dis = Linear(Relu(Linear(g_w)))    (5)
wherein Relu is an activation function and Linear is a linear function;
the input of the output layer is the reuse distance dis of each cache line, and the output is the eviction probability prob = {p_1, p_2, ..., p_W}; where p_w denotes the eviction probability of the w-th cache line in the cache;
wherein the eviction probability prob of each cache line is:
prob = Sigmoid(dis)    (6)
wherein Sigmoid is an activation function.
4. A cache replacement system based on imitation learning as claimed in claim 3, wherein: the system also comprises a neural network training module;
the neural network training module is used for training the cache eviction prediction neural network of the cache replacement prediction module.
5. The cache replacement system based on imitation learning of claim 1, wherein the step of the cache replacement module storing the current cache access request in the cache comprises:
a1) judging whether the current cache is not full or whether the current cache access request is in the cache; if yes, directly putting the current cache access request into the cache or acquiring it from the cache; otherwise, entering step a2);
a2) the cache replacement module invoking the cache line eviction probability from the cache replacement prediction module and generating a cache line eviction policy;
a3) the cache replacement module performing cache line eviction according to the cache line eviction policy, so that the current cache access request is stored in the cache.
6. The imitation-learning-based cache replacement system of claim 5, wherein the cache line eviction policy is as follows:
index = decend_sort(prob)    (13)
evict_index = index(0)    (14)
where prob denotes the cache line eviction probability vector, decend_sort() is the descending-order sorting function, index is the resulting index sequence, index(0) denotes the cache line with the highest eviction probability, and evict_index is the eviction decision.
7. A method of using the cache replacement system according to any one of claims 1 to 6, comprising the following steps:
s1) building a cache replacement system based on imitation learning;
s2) obtaining a cache access request sequence;
s3) judging whether the current cache is not full or whether the current access is already in the cache; if yes, directly putting the current access into the cache or obtaining it from the cache and entering step s9); otherwise, entering step s4);
s4) training the cache replacement prediction neural network of the cache replacement prediction module;
s5) inputting the cache access request sequence into the trained cache replacement prediction neural network, computing the cache line eviction probabilities, and transmitting them to the cache replacement module;
s6) the cache replacement module judging whether the current cache is not full or whether the current cache access request is already in the cache; if yes, directly putting the current cache access request into the cache or obtaining it from the cache; otherwise, entering step s7);
s7) the cache replacement module invoking the cache line eviction probabilities from the cache replacement prediction module and generating a cache line eviction policy;
s8) the cache replacement module performing cache line eviction according to the cache line eviction policy, so that the current cache access request is stored in the cache;
s9) obtaining a new cache access request sequence and returning to step s3).
CN202210491621.9A 2022-05-07 2022-05-07 Cache replacement system and method based on imitation learning Active CN114780889B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210491621.9A CN114780889B (en) 2022-05-07 2022-05-07 Cache replacement system and method based on imitation learning

Publications (2)

Publication Number Publication Date
CN114780889A (en) 2022-07-22
CN114780889B (en) 2024-06-25

Family

Family ID: 82435100

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210491621.9A Active CN114780889B (en) 2022-05-07 2022-05-07 Cache replacement system and method based on imitation learning

Country Status (1)

Country Link
CN (1) CN114780889B (en)

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9779029B2 (en) * 2012-11-06 2017-10-03 Facebook, Inc. Cache replacement policy for data with strong temporal locality
CN112862060B (en) * 2019-11-28 2024-02-13 南京大学 Content caching method based on deep learning
CN112752308B (en) * 2020-12-31 2022-08-05 厦门越人健康技术研发有限公司 Mobile prediction wireless edge caching method based on deep reinforcement learning
CN113225380B (en) * 2021-04-02 2022-06-28 中国科学院计算技术研究所 Content distribution network caching method and system based on spectral clustering

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Shuting Qiu et al., "OA-Cache: Oracle Approximation-Based Cache Replacement at the Network Edge," IEEE Transactions on Network and Service Management, vol. 20, no. 3, pp. 3177-3189, 25 Jan. 2023. *

Also Published As

Publication number Publication date
CN114780889A (en) 2022-07-22


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant