CN114780889B - Cache replacement system and method based on imitation learning - Google Patents

Cache replacement system and method based on imitation learning

Info

Publication number
CN114780889B
CN114780889B, CN202210491621.9A, CN202210491621A
Authority
CN
China
Prior art keywords
cache
eviction
neural network
current
access request
Prior art date
Legal status
Active
Application number
CN202210491621.9A
Other languages
Chinese (zh)
Other versions
CN114780889A (en)
Inventor
范琪琳
丘淑婷
蔡茂林
马浩
李秀华
熊庆宇
文俊浩
Current Assignee
Chongqing University
Original Assignee
Chongqing University
Priority date
Filing date
Publication date
Application filed by Chongqing University
Priority to CN202210491621.9A
Publication of CN114780889A
Application granted
Publication of CN114780889B

Classifications

    • G06F16/9574 Browsing optimisation of access to content, e.g. by caching (under G06F16/00 Information retrieval; G06F16/95 Retrieval from the web; G06F16/957 Browsing optimisation)
    • G06N3/045 Combinations of networks (under G06N3/00 Computing arrangements based on biological models; G06N3/02 Neural networks; G06N3/04 Architecture, e.g. interconnection topology)
    • G06N3/048 Activation functions
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G06N3/08 Learning methods


Abstract

The invention discloses a cache replacement system and method based on imitation learning. The system comprises an access data acquisition module, a cache replacement prediction module, a cache replacement module and a database. The method comprises the following steps: 1) acquire a cache access request sequence; 2) input the cache access request sequence into a trained cache replacement prediction neural network, compute the cache line eviction probabilities, and transmit them to the cache replacement module; 3) the cache replacement module performs cache line eviction according to the cache line eviction policy, so that the current cache access request is stored in the cache. The invention builds the neural network model from sequence-feature extractors such as a temporal convolutional neural network and an attention mechanism, fully extracting the history information of the access sequence and the context information of the cache lines in the cache, and thereby improving the prediction performance of the model.

Description

Cache replacement system and method based on imitation learning
Technical Field
The invention relates to the field of network caching, in particular to a cache replacement system and method based on imitation learning.
Background
A cache is a temporary, high-speed data-exchange memory that ensures frequently accessed content can be retrieved quickly. Cache replacement is the strategy for updating the contents of a cache memory when the cache is full: it aims to replace content that is accessed at low frequency with content that is accessed at high frequency, so that a cache of limited size is used as efficiently as possible. The quality of the cache replacement model therefore directly determines the performance of a computer system or network server.
A cache replacement system based on imitation learning is built around a content delivery network (Content Delivery Network, CDN) as the core of content management and network traffic management. Its fundamental goal is to ensure that edge cache servers provide more efficient service to users, thereby building a high-quality, high-efficiency network application service with a well-ordered network.
In the network caching field, the quality of the cache replacement model determines whether cache resources can be used fully and efficiently. In recent years, as demands for Internet data streaming and network content access keep growing, cache replacement algorithms for network edge caches have attracted increasing attention from researchers. The cache replacement models in common use today fall into two classes. The first is the cache replacement model based on heuristic rules and its variants, which models cache replacement with hand-crafted heuristic rules and is comparatively simple to implement. The second is the machine-learning-based cache replacement model, whose idea is to continuously adapt the model through supervised learning to fit different cache access patterns.
The first, heuristic class generalizes poorly: its performance varies widely across cache access patterns, is strongly affected by the quality of the cache access sequence, and degrades in network caches whose content accesses change frequently. The second, machine-learning class improves generalization considerably, but it requires a large number of training samples and suffers from feedback delay, which can make it respond slowly in highly dynamic environments; it also fails to exploit historical access information effectively, reducing the cache replacement hit rate.
Disclosure of Invention
The invention aims to provide a cache replacement system based on imitation learning, which comprises an access data acquisition module, a cache replacement prediction module, a cache replacement module and a database.
The access data acquisition module acquires a cache access request sequence S = {s_{t-H+1}, s_{t-H+2}, ..., s_{t-1}, s_t} and transmits it to the cache replacement prediction module; s_t denotes the cache access at time t, and H is the number of historical cache accesses.
The cache replacement prediction module stores a cache eviction prediction neural network.
The cache replacement prediction module predicts the cache line eviction probabilities with the cache eviction prediction neural network and transmits them to the cache replacement module.
The cache replacement module stores the current cache access request into the cache according to the cache line eviction probabilities.
The database stores the data of the access data acquisition module, the cache replacement prediction module and the cache replacement module.
Further, the cache eviction prediction neural network comprises a temporal convolutional neural network, an attention mechanism neural network, a socket layer and an output layer.
Further, the temporal convolutional neural network comprises an input layer, hidden layers and an output layer.
The input of the input layer of the temporal convolutional neural network is the cache access request sequence S = {s_{t-H+1}, s_{t-H+2}, ..., s_{t-1}, s_t}; its output is the embedding e(s) = {e(s_{t-H+1}), e(s_{t-H+2}), ..., e(s_{t-1}), e(s_t)} of the cache access request sequence, where e(s_t) denotes the embedding of the cache access at time t.
The input of the first hidden layer of the temporal convolutional neural network is the output e(s) of the input layer.
The input of the (j+1)-th hidden layer is the output hidden^j of the j-th hidden layer, and its output hidden^{j+1} is:
hidden^{j+1} = Activation(hidden^j + F(s_t))    (1)
where F(s_t) denotes the output of the dilated convolution at time t and Activation is an activation function.
The output F(s_t) of the dilated convolution of the temporal convolutional neural network at time t is:
F(s_t) = Σ_{i=0}^{k-1} f(i) · hidden^j_{t-d·i}    (2)
where hidden^j_{t-d·i} is the j-th hidden layer's information for the access data at time t-d·i; f(i) is the convolution filter; k is the convolution kernel size; and d is the dilation coefficient, with d = 2^j for the j-th layer.
The output of the output layer of the temporal convolutional neural network is the output of the last hidden layer, expressed as the history information h(s) = {h(s_{t-H+1}), h(s_{t-H+2}), ..., h(s_{t-1}), h(s_t)} of the cache access request sequence, where h(s_t) denotes the history information of the cache access content at time t.
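The dilated causal convolution with the residual connection of equations (1) and (2) can be sketched in PyTorch as follows; the channel width, kernel size, and number of layers are illustrative assumptions rather than values given by the patent.

```python
# A minimal sketch of the temporal convolutional network described above.
import torch
import torch.nn as nn

class DilatedCausalBlock(nn.Module):
    """One hidden layer of the TCN: dilated causal conv + residual, Eq. (1)-(2)."""
    def __init__(self, channels: int, kernel_size: int, dilation: int):
        super().__init__()
        # Left-pad so the convolution at time t only sees t, t-d, ..., t-d*(k-1).
        self.pad = (kernel_size - 1) * dilation
        self.conv = nn.Conv1d(channels, channels, kernel_size, dilation=dilation)
        self.act = nn.ReLU()

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        # hidden: (batch, channels, H) -- output of the previous hidden layer
        f = self.conv(nn.functional.pad(hidden, (self.pad, 0)))  # F(s_t), Eq. (2)
        return self.act(hidden + f)                              # residual, Eq. (1)

class TCN(nn.Module):
    """Stack of blocks with dilation d = 2^j, mapping e(s) to h(s)."""
    def __init__(self, channels: int = 64, kernel_size: int = 2, layers: int = 4):
        super().__init__()
        self.blocks = nn.Sequential(*[
            DilatedCausalBlock(channels, kernel_size, dilation=2 ** j)
            for j in range(layers)
        ])

    def forward(self, e_s: torch.Tensor) -> torch.Tensor:
        return self.blocks(e_s)  # history information h(s), same shape as e(s)
```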
Further, the inputs of the attention mechanism neural network are the history information h(s) = {h(s_{t-H+1}), h(s_{t-H+2}), ..., h(s_{t-1}), h(s_t)} and the embedding e(l) = {e(l_1), e(l_2), e(l_3), ..., e(l_W)} of the cache line contents at the current time step; the output is a context vector for each cache line, G = {g_1, g_2, ..., g_W}. Here l_w denotes the w-th cache line in the cache, W the number of cache lines in the cache, g_w the context of the w-th cache line, and w = 1, 2, ..., W.
The context vector g_w is:
g_w = Σ_{i=1}^{H} α_i · h(s_{t-H+i})    (3)
where the coefficient α_i is:
α_i = softmax(e(l_w)^T · W_e · h(s_{t-H+i}))    (4)
and W_e is a trainable parameter.
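Equations (3) and (4) amount to dot-product attention between the cache-line embeddings and the access history; a minimal sketch, in which the bias-free linear map standing in for the trainable parameter W_e and the tensor shapes are assumptions:

```python
import torch
import torch.nn as nn

class CacheLineAttention(nn.Module):
    """Context vector g_w for each cache line over the access history, Eq. (3)-(4)."""
    def __init__(self, dim: int):
        super().__init__()
        self.W_e = nn.Linear(dim, dim, bias=False)  # trainable parameter W_e

    def forward(self, h_s: torch.Tensor, e_l: torch.Tensor) -> torch.Tensor:
        # h_s: (H, dim) history h(s); e_l: (W, dim) cache-line embeddings e(l)
        scores = e_l @ self.W_e(h_s).T         # e(l_w)^T W_e h(s_{t-H+i}), (W, H)
        alpha = torch.softmax(scores, dim=-1)  # attention coefficients alpha_i
        return alpha @ h_s                     # g_w = sum_i alpha_i h(s_{t-H+i})
```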
Further, the input of the socket layer is the context vector g_w, and the output is the reuse distance dis = {dis_1, dis_2, ..., dis_W} of each cache line, where dis_w denotes the reuse distance of the w-th cache line.
The reuse distance dis of each cache line is:
dis = Linear(Relu(Linear(g_w)))    (5)
where Relu is an activation function and Linear is a linear function.
The input of the output layer is the reuse distance dis of each cache line, and the output is the eviction probability prob = {p_1, p_2, ..., p_W}, where p_w denotes the eviction probability of the w-th cache line in the cache.
The eviction probability prob of each cache line is:
prob = Sigmoid(dis)    (6)
where Sigmoid is an activation function.
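A minimal sketch of the socket layer of equation (5) and the output layer of equation (6); the hidden width of the two-layer perceptron is an assumed value:

```python
import torch
import torch.nn as nn

class EvictionHead(nn.Module):
    """Socket layer (Eq. 5) and output layer (Eq. 6): g_w -> reuse distance -> prob."""
    def __init__(self, dim: int, hidden: int = 128):
        super().__init__()
        self.socket = nn.Sequential(
            nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, 1),
        )  # dis = Linear(Relu(Linear(g_w)))

    def forward(self, g: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
        dis = self.socket(g).squeeze(-1)  # predicted reuse distance per line, (W,)
        prob = torch.sigmoid(dis)         # eviction probability, Eq. (6)
        return dis, prob
```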
Further, the system also comprises a neural network training module;
the neural network training module is used for training the cache eviction prediction neural network of the cache replacement prediction module.
Further, the steps of training the cache eviction prediction neural network are:
1) Judge whether the current cache is not full or whether the current cache access request is already in the cache; if so, put the current cache access request directly into the cache or fetch it from the cache; otherwise, go to step 2);
2) Establish the cache eviction prediction neural network;
3) Compute the true reuse distance dis^true = {dis^true_1, ..., dis^true_n} of each cache access request with the Belady algorithm, where n is the total number of accesses and dis^true_n denotes the true reuse distance of the n-th cache access request;
4) Generate the eviction probability prob = {p_1, p_2, ..., p_W} of each cache line with the cache eviction prediction neural network;
5) Compute the error between the cache line eviction probabilities output at time t by the cache eviction prediction network and by the Belady algorithm:
loss = bceloss(prob_t, prob^Belady_t)    (7)
where prob_t = {p_{t,1}, ..., p_{t,W}} and prob^Belady_t = {p̄_{t,1}, ..., p̄_{t,W}} are the cache line eviction probabilities at time t of the cache eviction prediction network and of the Belady algorithm, respectively; p_{t,i} and p̄_{t,i} are the probabilities of evicting the i-th cache line at time t; and W is the cache size.
The Belady cache line eviction probability p̄_{t,i} satisfies:
p̄_{t,i} = 1 if cache line i is among the topk lines with the largest true reuse distance, and p̄_{t,i} = 0 otherwise    (8), (9)
where the parameter topk and the sorted sequence index^true are given by:
topk = int(W * α)    (10)
index^true = decend_sort(dis^true)    (11)
where α is the eviction percentage, int() is the round-down function, and decend_sort() is the descending-order sorting function;
6) Update the neural network parameter ω using the gradient of the cache eviction prediction network; judge whether the loss function value has reached its minimum or the number of training rounds has reached the preset value; if so, end the training of the cache eviction prediction neural network; otherwise, go to step 7);
The parameter ω of the cache eviction prediction neural network is updated as:
ω' = ω - ε_t · ∇_ω bceloss    (12)
where ω is the current parameter of the network; ω' is the updated parameter; ε_t is the current learning rate; ∇ is the gradient operator; and bceloss is the binary cross-entropy loss function (see the sketch after this list).
7) The cache replacement module selects a cache line for eviction and stores the current access into a cache;
8) Acquire a new cache access request and return to step 1).
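A minimal sketch of steps 3) to 6): the Belady oracle labels of equations (8)-(11) and the update of equation (12), expressed with PyTorch's built-in binary cross-entropy. The `model` wrapper, its call signature, and the linear-scan reuse-distance computation are simplifying assumptions:

```python
import torch

def belady_reuse_distance(accesses: list, t: int, lines: list) -> torch.Tensor:
    """True reuse distance of each cached line at time t (Belady oracle).

    A simple O(n) forward scan per line; a production version would
    precompute next-use indices in one pass over the trace.
    """
    dists = []
    for line in lines:
        future = accesses[t + 1:]
        dists.append(future.index(line) + 1 if line in future else len(future) + 1)
    return torch.tensor(dists, dtype=torch.float)

def belady_labels(true_dis: torch.Tensor, alpha: float = 0.2) -> torch.Tensor:
    """Eq. (8)-(11): mark the topk lines of largest true reuse distance evictable."""
    W = true_dis.numel()
    topk = int(W * alpha)                              # topk = int(W * alpha)
    order = torch.argsort(true_dis, descending=True)   # decend_sort(dis^true)
    labels = torch.zeros(W)
    labels[order[:max(topk, 1)]] = 1.0                 # p_bar_i = 1 for evictable lines
    return labels

def train_step(model, optimizer, history, lines_emb, labels) -> float:
    """One imitation-learning update, Eq. (7) and (12)."""
    prob = model(history, lines_emb)  # prob_t from the network (assumed interface)
    loss = torch.nn.functional.binary_cross_entropy(prob, labels)  # bceloss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()                  # omega' = omega - eps_t * grad(bceloss)
    return loss.item()
```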
Further, the step of storing the current cache access request into the cache by the cache replacement module includes:
1) Judging whether the current cache is not full or whether the current cache access request is in the cache; if yes, directly putting the current cache access request into the cache or acquiring it from the cache; otherwise, entering step 2);
2) The cache replacement module invokes the cache line eviction probability from the cache replacement prediction module and generates a cache line eviction policy;
3) And the cache replacement module performs cache line eviction according to a cache line eviction policy, so that the current cache access request is stored in a cache.
Further, the cache line eviction policy is as follows (a sketch follows):
index = decend_sort(prob)    (13)
evict_index = index(0)    (14)
where prob denotes the cache line eviction probability vector, decend_sort() is the descending-order sorting function, index is the resulting index sequence, index(0) denotes the cache line with the highest eviction probability, and evict_index is the eviction decision.
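Equations (13) and (14) reduce to an argsort; a sketch assuming the probabilities arrive as a torch tensor:

```python
import torch

def select_eviction(prob: torch.Tensor) -> int:
    """Eq. (13)-(14): evict the line with the highest eviction probability."""
    index = torch.argsort(prob, descending=True)  # index = decend_sort(prob)
    return int(index[0])                          # evict_index = index(0)
```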
A method of using the cache replacement system based on imitation learning, comprising the following steps (a minimal end-to-end sketch follows the list):
1) Construct the cache replacement system based on imitation learning;
2) Acquire a cache access request sequence;
3) Judge whether the current cache is not full or whether the current access is already in the cache; if so, put the current access directly into the cache or fetch it from the cache and go to step 9); otherwise, go to step 4);
4) Train the cache replacement prediction neural network of the cache replacement prediction module;
5) Input the cache access request sequence into the trained cache replacement prediction neural network, compute the cache line eviction probabilities, and transmit them to the cache replacement module;
6) The cache replacement module judges whether the current cache is not full or whether the current cache access request is already in the cache; if so, the current cache access request is put directly into the cache or fetched from the cache; otherwise, go to step 7);
7) The cache replacement module invokes the cache line eviction probabilities from the cache replacement prediction module and generates a cache line eviction policy;
8) The cache replacement module performs cache line eviction according to the cache line eviction policy, so that the current cache access request is stored in the cache;
9) Acquire a new cache access request sequence and return to step 3).
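A minimal sketch of this serving loop, including the hit and not-full short-circuits of steps 3) and 6); the `predict_probs(history, cache)` wrapper around the trained network is an assumed interface:

```python
import torch

def serve(accesses, cache_size, predict_probs, H=64):
    cache, hits = [], 0
    for t, s in enumerate(accesses):
        if s in cache:                       # request already cached: a hit
            hits += 1
            continue
        if len(cache) < cache_size:          # cache not full: insert directly
            cache.append(s)
            continue
        history = accesses[max(0, t - H + 1): t + 1]  # last H accesses
        prob = predict_probs(history, cache)          # eviction probabilities
        victim = int(torch.argmax(prob))              # highest-probability line
        cache[victim] = s                             # evict it, store the request
    return hits / max(len(accesses), 1)               # cache hit rate
```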
The technical effect of the invention is that it builds the neural network model from sequence-feature extractors such as the temporal convolutional neural network and the attention mechanism, fully extracts the history information of the access sequence and the context information of the cache lines in the cache, and improves the prediction performance of the model.
The invention further imitates the behavior of the Belady expert through imitation learning, training the neural network model on the difference between the output of the cache replacement prediction module and the output of the Belady expert, which improves the hit rate and widens the applicable range of the model.
Drawings
FIG. 1 is the overall algorithm flow diagram of the cache replacement method based on imitation learning;
FIG. 2 is the neural network model structure diagram of the cache replacement method based on imitation learning;
FIG. 3 is the neural network training flow chart of the cache replacement method based on imitation learning.
Detailed Description
The present invention is further described below with reference to examples, but the scope of the above subject matter of the invention should not be construed as limited to the following examples. Various substitutions and alterations made according to ordinary skill and familiar means of the art, without departing from the technical spirit of the invention, are all intended to be included in the scope of the invention.
Example 1:
Referring to FIGS. 1 to 3, a cache replacement system based on imitation learning includes an access data acquisition module, a cache replacement prediction module, a cache replacement module and a database.
The access data acquisition module acquires a cache access request sequence S = {s_{t-H+1}, s_{t-H+2}, ..., s_{t-1}, s_t} and transmits it to the cache replacement prediction module; s_t denotes the cache access at time t, and H is the number of historical cache accesses.
The cache replacement prediction module stores a cache eviction prediction neural network.
The cache replacement prediction module predicts the cache line eviction probabilities with the cache eviction prediction neural network and transmits them to the cache replacement module.
The cache replacement module stores the current cache access request into the cache according to the cache line eviction probabilities.
The database stores the data of the access data acquisition module, the cache replacement prediction module and the cache replacement module.
The cache eviction prediction neural network comprises a temporal convolutional neural network, an attention mechanism neural network, a socket layer and an output layer.
The temporal convolutional neural network comprises an input layer, hidden layers and an output layer.
The input of the input layer of the temporal convolutional neural network is the cache access request sequence S = {s_{t-H+1}, s_{t-H+2}, ..., s_{t-1}, s_t}; its output is the embedding e(s) = {e(s_{t-H+1}), e(s_{t-H+2}), ..., e(s_{t-1}), e(s_t)} of the cache access request sequence, where e(s_t) denotes the embedding of the cache access at time t.
The input of the first hidden layer of the temporal convolutional neural network is the output e(s) of the input layer.
The input of the (j+1)-th hidden layer is the output hidden^j of the j-th hidden layer, and its output hidden^{j+1} is:
hidden^{j+1} = Activation(hidden^j + F(s_t))    (1)
where F(s_t) denotes the output of the dilated convolution at time t and Activation is an activation function.
The output F(s_t) of the dilated convolution of the temporal convolutional neural network at time t is:
F(s_t) = Σ_{i=0}^{k-1} f(i) · hidden^j_{t-d·i}    (2)
where hidden^j_{t-d·i} is the j-th hidden layer's information for the access data at time t-d·i; f(i) is the convolution filter; k is the convolution kernel size; and d is the dilation coefficient, with d = 2^j for the j-th layer.
The output of the output layer of the temporal convolutional neural network is the output of the last hidden layer, expressed as the history information h(s) = {h(s_{t-H+1}), h(s_{t-H+2}), ..., h(s_{t-1}), h(s_t)} of the cache access request sequence, where h(s_t) denotes the history information of the cache access content at time t.
F(s_t) in equation (2) is the dilated-convolution output of the temporal convolutional neural network at time t; hidden^{j+1} in equation (1) is the output obtained after F(s_t) is residually connected with the input hidden^j at time t. F(s_t) can thus be seen as an intermediate output of the network at time t, and the output after the residual connection with the current input is the actual output of the network at the current time. Accordingly, the output h(s) of the output layer is the actual output of the last layer of the temporal convolutional network.
The inputs of the attention mechanism neural network are the history information h(s) = {h(s_{t-H+1}), h(s_{t-H+2}), ..., h(s_{t-1}), h(s_t)} of the current cache access request sequence and the embedding e(l) = {e(l_1), e(l_2), e(l_3), ..., e(l_W)} of the cache line contents at the current time step; the output is a context vector for each cache line, G = {g_1, g_2, ..., g_W}. Here l_w denotes the w-th cache line in the cache, W the number of cache lines in the cache, g_w the context of the w-th cache line, and w = 1, 2, ..., W.
The context vector g_w is:
g_w = Σ_{i=1}^{H} α_i · h(s_{t-H+i})    (3)
where the coefficient α_i is:
α_i = softmax(e(l_w)^T · W_e · h(s_{t-H+i}))    (4)
and W_e is a trainable parameter.
The input of the socket layer is the context vector g_w, and the output is the reuse distance dis = {dis_1, dis_2, ..., dis_W} of each cache line, where dis_w denotes the reuse distance of the w-th cache line.
The reuse distance dis of each cache line is:
dis = Linear(Relu(Linear(g_w)))    (5)
where Relu is an activation function and Linear is a linear function.
The input of the output layer is the reuse distance dis of each cache line, and the output is the eviction probability prob = {p_1, p_2, ..., p_W} of each cache line, where p_w denotes the eviction probability of the w-th cache line in the cache.
The eviction probability prob of each cache line is:
prob = Sigmoid(dis)    (6)
where Sigmoid is an activation function.
The system also comprises a neural network training module;
the neural network training module is used for training the cache eviction prediction neural network of the cache replacement prediction module.
The steps of training the cache eviction prediction neural network are:
1) Judge whether the current cache is not full or whether the current cache access request is already in the cache; if so, put the current cache access request directly into the cache or fetch it from the cache; otherwise, go to step 2);
2) Establish the cache eviction prediction neural network;
3) Compute the true reuse distance dis^true = {dis^true_1, ..., dis^true_n} of each cache access request with the Belady algorithm, where n is the total number of accesses and dis^true_n denotes the true reuse distance of the n-th cache access request;
4) Generate the eviction probability prob = {p_1, p_2, ..., p_W} of each cache line with the cache eviction prediction neural network;
5) Compute the error between the cache line eviction probabilities output at time t by the cache eviction prediction network and by the Belady algorithm:
loss = bceloss(prob_t, prob^Belady_t)    (7)
where prob_t and prob^Belady_t are the cache line eviction probabilities at time t of the cache eviction prediction network and of the Belady algorithm, respectively; p_{t,i} and p̄_{t,i} are the probabilities of evicting the i-th cache line at time t; and W is the cache size.
The Belady cache line eviction probability p̄_{t,i} equals 1 if cache line i is among the topk lines with the largest true reuse distance and 0 otherwise, with:
topk = int(W * α)    (8)
index^true = decend_sort(dis^true)    (9)
where α is the eviction percentage, int() is the round-down function, and decend_sort() is the descending-order sorting function;
6) Update the neural network parameter ω using the gradient of the cache eviction prediction network; judge whether the loss function value has reached its minimum or the number of training rounds has reached the preset value; if so, end the training of the cache eviction prediction neural network; otherwise, go to step 7);
The parameter ω of the cache eviction prediction neural network is updated as:
ω' = ω - ε_t · ∇_ω bceloss
where ω is the current parameter of the network; ω' is the updated parameter; ε_t is the current learning rate; ∇ is the gradient operator; and bceloss is the binary cross-entropy loss function;
7) The cache replacement module selects a cache line for eviction and stores the current access into the cache;
8) Acquire a new cache access request and return to step 1).
The steps by which the cache replacement module stores the current cache access request into the cache are:
1) Judge whether the current cache is not full or whether the current cache access request is already in the cache; if so, put the current cache access request directly into the cache or fetch it from the cache; otherwise, go to step 2);
2) The cache replacement module invokes the cache line eviction probabilities from the cache replacement prediction module and generates a cache line eviction policy;
3) The cache replacement module performs cache line eviction according to the cache line eviction policy, so that the current cache access request is stored in the cache.
The cache line eviction policy is as follows:
index = decend_sort(prob)    (13)
evict_index = index(0)    (14)
where prob denotes the cache line eviction probability vector, decend_sort() is the descending-order sorting function, index is the resulting index sequence, index(0) denotes the cache line with the highest eviction probability, and evict_index is the eviction decision.
A method of using the cache replacement system based on imitation learning, comprising the following steps:
1) Construct the cache replacement system based on imitation learning;
2) Acquire a cache access request sequence;
3) Judge whether the current cache is not full or whether the current access is already in the cache; if so, put the current access directly into the cache or fetch it from the cache and go to step 9); otherwise, go to step 4);
4) Train the cache replacement prediction neural network of the cache replacement prediction module;
5) Input the cache access request sequence into the trained cache replacement prediction neural network, compute the cache line eviction probabilities, and transmit them to the cache replacement module;
6) The cache replacement module judges whether the current cache is not full or whether the current cache access request is already in the cache; if so, the current cache access request is put directly into the cache or fetched from the cache; otherwise, go to step 7);
7) The cache replacement module invokes the cache line eviction probabilities from the cache replacement prediction module and generates a cache line eviction policy;
8) The cache replacement module performs cache line eviction according to the cache line eviction policy, so that the current cache access request is stored in the cache;
9) Acquire a new cache access request sequence and return to step 3).
Example 2:
The cache replacement system based on imitation learning comprises an access data acquisition module, a cache replacement prediction module, a cache replacement module and a database.
The access data acquisition module acquires the access sequence to the cache, S = {s_{t-H+1}, s_{t-H+2}, ..., s_{t-1}, s_t}, where s_t denotes the cache access at time t and H is the number of historical cache accesses in the cache access sequence. It judges whether the current cache is not full or whether the currently accessed content is already in the cache; if so, the current access is put directly into the cache or fetched from the cache; otherwise, the access sequence is sent to the cache replacement prediction module.
The cache replacement prediction module comprises a temporal convolutional neural network, an attention mechanism neural network, a socket layer and an output layer, and sends the output cache line eviction probabilities to the cache replacement module.
The temporal convolutional neural network and the attention mechanism neural network are applied in sequence: the temporal convolutional network extracts, in real time, the history information of each cache access in the access sequence, and the attention mechanism network combines the history information of the access sequence with the embeddings of the cache lines currently in the cache to generate the context of each cache line. Together with the socket layer and the output layer, they form the neural network model, whose structure comprises:
I) extracting the access sequence history information with the temporal convolutional neural network;
II) computing the context vectors with the attention mechanism network;
III) outputting the reuse distance prediction of each cache line through the socket layer;
IV) outputting the eviction probability prediction of each cache line through the output layer.
The temporal convolutional neural network comprises an input layer, hidden layers and an output layer.
The input of the input layer of the temporal convolutional neural network is the cache access request sequence S = {s_{t-H+1}, s_{t-H+2}, ..., s_{t-1}, s_t}; its output is the embedding e(s) = {e(s_{t-H+1}), e(s_{t-H+2}), ..., e(s_{t-1}), e(s_t)} of the cache access sequence, where e(s_t) denotes the embedding of the cache access at time t. The output of the input layer serves as the input of the first hidden layer.
The input of hidden layer hidden^{j+1} of the temporal convolutional network is the output hidden^j of the previous hidden layer, computed as:
hidden^{j+1} = Activation(hidden^j + F(s_t))    (1)
where F(s_t) denotes the output of the dilated convolution and Activation is an activation function. So that the gradient does not vanish, the temporal convolutional network uses a residual connection at each layer to obtain the output hidden^{j+1} of the next layer. The dilated convolution is:
F(s_t) = Σ_{i=0}^{k-1} f(i) · hidden^j_{t-d·i}    (2)
where hidden^j_{t-d·i} is the j-th hidden layer's information for the access data at time t-d·i; f: {0, ..., k-1} → R is a convolution filter, k is the convolution kernel size, d is the dilation coefficient, and d = 2^j for the j-th layer.
The output layer of the temporal convolutional neural network outputs the information of the last hidden layer as the history information h(s) = {h(s_{t-H+1}), h(s_{t-H+2}), ..., h(s_{t-1}), h(s_t)} of each cache access, where h(s_t) denotes the history information of the cache access content at time t.
Attention network: for any time step t, the inputs of the attention network are the history information h(s) = {h(s_{t-H+1}), h(s_{t-H+2}), ..., h(s_{t-1}), h(s_t)} of the current cache access request sequence and the embedding e(l) = {e(l_1), e(l_2), e(l_3), ..., e(l_W)} of the cache line contents at the current time step, where l_w denotes the w-th cache line in the cache and W denotes the number of cache lines in the cache. The output is a context vector for each cache line, G = {g_1, g_2, ..., g_W}, where g_w denotes the context content of the w-th cache line.
The coefficient α_i is:
α_i = softmax(e(l_w)^T · W_e · h(s_{t-H+i}))    (3)
and the context g_w is:
g_w = Σ_{i=1}^{H} α_i · h(s_{t-H+i})    (4)
where W_e is a trainable parameter.
Socket layer: the input of the socket layer is the context vector g_w, and the output is the reuse distance dis = {dis_1, dis_2, ..., dis_W} of each cache line, where dis_w denotes the reuse distance of the w-th cache line:
dis = Linear(Relu(Linear(g_w)))    (5)
The activation function of the socket layer is Relu: f(x) = max(0, x), and the linear function is Linear: y = x·A^T + b.
Output layer: the input of the output layer is the reuse distance dis of each cache line, and the output is the eviction probability prob = {p_1, p_2, ..., p_W} of each cache line, where p_w denotes the eviction probability of the w-th cache line in the cache:
prob = Sigmoid(dis)    (6)
The activation function of the output layer is Sigmoid: σ(x) = 1 / (1 + e^{-x}).
The cache replacement module selects the cache line to be evicted according to the eviction probability of each cache line in the cache, and stores the current access data.
The eviction selection of a cache line is as follows:
index = decend_sort(prob)    (7)
evict_index = index(0)    (8)
where prob denotes the cache line eviction probability vector, decend_sort() is the descending-order sorting function, and index(0) denotes the cache line with the highest eviction probability.
The neural network training module trains the neural network model to obtain a trained neural network model.
The step of training the neural network model comprises:
1) Judge whether the current cache is not full or whether the currently accessed content is already in the cache; if so, put the current access directly into the cache or fetch it from the cache and go to step 7); otherwise, go to step 2).
2) The Belady expert policy computes the true reuse distance dis^true = {dis^true_1, ..., dis^true_n} of each access from all the access data, where n is the total number of accesses.
3) Establish the cache eviction prediction network, which comprises the cache replacement prediction module and the cache eviction module.
During training, the cache eviction prediction network model, i.e. the neural network model described in I) II) III) IV) above, is built. Training the neural network differs from generating the actual cache replacement policy as follows: during training, the Belady expert generates the cache eviction content in real time, the error against the cache line eviction probabilities generated by the model is computed, and the parameters are updated; in the actual cache replacement policy, only the trained model is used to compute the cache line eviction probabilities.
4) The cache replacement prediction module generates the eviction probability prob = {p_1, p_2, ..., p_W} of each cache line.
5) Compute the error between the cache line eviction probabilities output by the cache eviction prediction network and those of the Belady expert policy:
loss = bceloss(prob_t, prob^Belady_t)    (9)
where prob_t and prob^Belady_t are the probability vectors for evicting cache contents at time t of the model and of the Belady algorithm, respectively; p_{t,i} and p̄_{t,i} are the probabilities of evicting the i-th cache line at time t by the model and by the Belady algorithm; and W is the cache size.
prob^Belady_t is computed as:
topk = int(W * α)    (10)
index^true = decend_sort(dis^true), with p̄_{t,i} = 1 for the topk lines of largest true reuse distance and p̄_{t,i} = 0 otherwise    (11)
where α is the eviction percentage, int() is the round-down function, and decend_sort() is the descending-order sorting function.
The cache eviction prediction network parameter ω is updated as:
ω' = ω - ε_t · ∇_ω bceloss    (12)
where ω is the current parameter of the network, ω' is the updated parameter, ε_t is the current learning rate, and ∇ is the gradient operator.
6) Update the global neural network parameters using the gradient of the cache eviction prediction network.
7) The cache replacement module selects a cache line for eviction and stores the current access in the cache.
8) Acquire a new cache access sequence and return to step 1).
The cache replacement module operates as follows:
1) Judge whether the current cache is not full or whether the currently accessed content is already in the cache; if so, put the current access directly into the cache or fetch it from the cache; otherwise, go to step 2).
2) Extract the reuse distance of each cached content in the current cache with the trained neural network:
2.1) With the last hidden layer of the temporal convolutional network and the cache line embeddings as inputs, obtain the context information g_w of each cache line.
2.2) Pass the context information g_w of each cache line through the socket layer to output the reuse distance dis of the cache line.
3) From the extracted reuse distances, output the eviction probability of each cache line through the output layer of the trained neural network.
4) Replace the cache content with the cache replacement module.
The database stores the data of the access data acquisition module, the cache replacement prediction module, the cache replacement module and the neural network training module.
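Tying modules I) to IV) together, the following sketch assembles the full cache eviction prediction network from the TCN, CacheLineAttention, and EvictionHead classes sketched earlier; the vocabulary size and embedding width are assumed values:

```python
# A sketch of the assembled cache eviction prediction network, reusing the
# illustrative TCN, CacheLineAttention, and EvictionHead classes defined in
# the earlier sketches.
import torch
import torch.nn as nn

class CacheEvictionPredictor(nn.Module):
    def __init__(self, vocab: int = 10000, dim: int = 64):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)   # shared embeddings e(s), e(l)
        self.tcn = TCN(channels=dim)            # I) history extractor h(s)
        self.attn = CacheLineAttention(dim)     # II) context vectors g_w
        self.head = EvictionHead(dim)           # III)-IV) dis and prob

    def forward(self, access_ids: torch.Tensor, line_ids: torch.Tensor):
        # access_ids: (H,) recent access sequence; line_ids: (W,) cached lines
        e_s = self.embed(access_ids).T.unsqueeze(0)  # (1, dim, H) for Conv1d
        h_s = self.tcn(e_s).squeeze(0).T             # (H, dim) history h(s)
        g = self.attn(h_s, self.embed(line_ids))     # (W, dim) contexts g_w
        dis, prob = self.head(g)                     # reuse distances, eviction probs
        return prob
```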
Example 3:
Referring to FIGS. 1 to 3, the cache replacement method based on imitation learning includes the following steps:
1) Acquire the access sequence to the cache.
The access data acquisition module acquires the access sequence to the cache, S = {s_{t-H+1}, s_{t-H+2}, ..., s_{t-1}, s_t}, where s_t denotes the cache access at time t and H is the number of historical cache accesses in the cache access sequence. Judge whether the current cache is not full or whether the currently accessed content is already in the cache; if so, put the current access directly into the cache or fetch it from the cache; otherwise, send the access sequence to the cache replacement prediction module.
2) Establish the cache replacement prediction model and send the output cache line eviction probabilities to the cache replacement module.
The cache replacement prediction module comprises a temporal convolutional neural network, an attention mechanism neural network, a socket layer and an output layer, and sends the output cache line eviction probabilities to the cache replacement module.
The temporal convolutional neural network includes an input layer, hidden layers and an output layer.
The input of the input layer of the temporal convolutional neural network is the cache access request sequence S = {s_{t-H+1}, s_{t-H+2}, ..., s_{t-1}, s_t}; its output is the embedding e(s) = {e(s_{t-H+1}), e(s_{t-H+2}), ..., e(s_{t-1}), e(s_t)} of the cache access sequence, where e(s_t) denotes the embedding of the cache access at time t. The output of the input layer serves as the input of the first hidden layer.
The input of hidden layer hidden^{j+1} of the temporal convolutional network is the output hidden^j of the previous hidden layer, computed as:
hidden^{j+1} = Activation(hidden^j + F(s_t))    (1)
where F(s_t) denotes the output of the dilated convolution and Activation is an activation function. So that the gradient does not vanish, the temporal convolutional network uses a residual connection at each layer to obtain the output hidden^{j+1} of the next layer. The dilated convolution is:
F(s_t) = Σ_{i=0}^{k-1} f(i) · hidden^j_{t-d·i}    (2)
where hidden^j_{t-d·i} is the j-th hidden layer's information for the access data at time t-d·i; f: {0, ..., k-1} → R is a convolution filter, k is the convolution kernel size, d is the dilation coefficient, and d = 2^j for the j-th layer.
The output layer of the temporal convolutional neural network outputs the information of the last hidden layer as the history information h(s) = {h(s_{t-H+1}), h(s_{t-H+2}), ..., h(s_{t-1}), h(s_t)} of each cache access, where h(s_t) denotes the history information of the cache access content at time t.
For any time step t, the attention network takes as inputs the output h(s) = {h(s_{t-H+1}), h(s_{t-H+2}), ..., h(s_{t-1}), h(s_t)} of the output layer of the temporal convolutional network and the embedding e(l) = {e(l_1), e(l_2), e(l_3), ..., e(l_W)} of the cache line contents at the current time step, where l_w denotes the w-th cache line in the cache and W denotes the number of cache lines in the cache. The output is a context vector for each cache line, G = {g_1, g_2, ..., g_W}, where g_w denotes the context content of the w-th cache line.
The coefficient α_i is:
α_i = softmax(e(l_w)^T · W_e · h(s_{t-H+i}))    (3)
and the context g_w is:
g_w = Σ_{i=1}^{H} α_i · h(s_{t-H+i})    (4)
where W_e is a trainable parameter.
The input of the socket layer is the context vector g_w, and the output is the reuse distance dis = {dis_1, dis_2, ..., dis_W} of each cache line, where dis_w denotes the reuse distance of the w-th cache line:
dis = Linear(Relu(Linear(g_w)))    (5)
The activation function of the socket layer is Relu: f(x) = max(0, x), and the linear function is Linear: y = x·A^T + b.
The input of the output layer is the reuse distance dis of each cache line, and the output is the eviction probability prob = {p_1, p_2, ..., p_W} of each cache line, where p_w denotes the eviction probability of the w-th cache line in the cache:
prob = Sigmoid(dis)    (6)
The activation function of the output layer is Sigmoid: σ(x) = 1 / (1 + e^{-x}).
3) Input the output of the cache replacement prediction module into the cache replacement module, so that the current cache access replaces a certain cache line in the cache.
The cache replacement module selects the cache line to be evicted according to the eviction probability of each cache line in the cache, and stores the current access data.
The eviction selection of a cache line is as follows:
index = decend_sort(prob)    (7)
evict_index = index(0)    (8)
where prob denotes the cache line eviction probability vector, decend_sort() is the descending-order sorting function, and index(0) denotes the cache line with the highest eviction probability.
4) Build the neural network training module and train the neural network in the cache replacement prediction model to obtain the trained neural network model.
The steps of training the neural network model are:
4.1) Judge whether the current cache is not full or whether the currently accessed content is already in the cache; if so, put the current access directly into the cache or fetch it from the cache and go to step 4.7); otherwise, go to step 4.2).
4.2) The Belady expert policy computes the true reuse distance dis^true = {dis^true_1, ..., dis^true_n} of each access from all the access data, where n is the total number of accesses.
4.3) Establish the cache eviction prediction network, which comprises the cache replacement prediction module and the cache eviction module.
4.4) The cache replacement prediction module generates the eviction probability prob = {p_1, p_2, ..., p_W} of each cache line.
4.5) Compute the error between the cache line eviction probabilities output by the cache eviction prediction network and those of the Belady expert policy:
loss = bceloss(prob_t, prob^Belady_t)    (9)
where prob_t and prob^Belady_t are the probability vectors for evicting cache contents at time t of the model and of the Belady algorithm, respectively; p_{t,i} and p̄_{t,i} are the probabilities of evicting the i-th cache line at time t by the model and by the Belady algorithm; and W is the cache size.
prob^Belady_t is computed as:
topk = int(W * α)    (10)
index^true = decend_sort(dis^true), with p̄_{t,i} = 1 for the topk lines of largest true reuse distance and p̄_{t,i} = 0 otherwise    (11)
where α is the eviction percentage, int() is the round-down function, and decend_sort() is the descending-order sorting function.
The cache eviction prediction network parameter ω is updated as:
ω' = ω - ε_t · ∇_ω bceloss    (12)
where ω is the current parameter of the network, ω' is the updated parameter, ε_t is the current learning rate, and ∇ is the gradient operator.
4.6) Update the global neural network parameters using the gradient of the cache eviction prediction network.
4.7) The cache replacement module selects a cache line for eviction and stores the current access in the cache.
4.8) Acquire a new cache access sequence and return to step 4.1).
5) Input the access sequence into the trained neural network model and complete the cache replacement operation with the cache replacement module.
Cache replacement prediction and cache replacement are performed on the access data with the trained neural network as follows:
5.1) Judge whether the current cache is not full or whether the currently accessed content is already in the cache; if so, put the current access directly into the cache or fetch it from the cache; otherwise, go to step 5.2).
5.2) Extract the reuse distance of each cached content in the current cache with the trained neural network:
5.2.1) With the last hidden layer of the temporal convolutional network and the cache line embeddings as inputs, obtain the context information g_w of each cache line.
5.2.2) Pass the context information g_w of each cache line through the socket layer to output the reuse distance dis of the cache line.
5.3) From the extracted reuse distances, output the eviction probability of each cache line through the output layer of the trained neural network.
5.4) Replace the cache content with the trained cache replacement module.

Claims (7)

1. A cache replacement system based on imitation learning, characterized in that: the system comprises an access data acquisition module, a cache replacement prediction module, a cache replacement module and a database;
the access data acquisition module acquires a cache access request sequence S = {s_{t-H+1}, s_{t-H+2}, ..., s_{t-1}, s_t} and transmits it to the cache replacement prediction module; s_t denotes the cache access at time t, and H is the number of historical cache accesses;
the cache replacement prediction module stores a cache eviction prediction neural network;
the cache replacement prediction module predicts the cache line eviction probabilities with the cache eviction prediction neural network and transmits them to the cache replacement module;
the cache replacement module stores the current cache access request into the cache according to the cache line eviction probabilities;
the database stores data of the access data acquisition module, the cache replacement prediction module and the cache replacement module;
the cache eviction prediction neural network comprises a temporal convolutional neural network, an attention mechanism neural network, a socket layer and an output layer;
the temporal convolutional neural network comprises an input layer, hidden layers and an output layer;
the input of the input layer of the temporal convolutional neural network is the cache access request sequence S = {s_{t-H+1}, s_{t-H+2}, ..., s_{t-1}, s_t}; its output is the embedding e(s) = {e(s_{t-H+1}), e(s_{t-H+2}), ..., e(s_{t-1}), e(s_t)} of the cache access request sequence, where e(s_t) denotes the embedding of the cache access at time t;
the input of the first hidden layer of the temporal convolutional neural network is the output e(s) of the input layer;
the input of the (j+1)-th hidden layer is the output hidden^j of the j-th hidden layer, and its output hidden^{j+1} is:
hidden^{j+1} = Activation(hidden^j + F(s_t))    (1)
where F(s_t) denotes the output of the dilated convolution at time t and Activation is an activation function;
the dilated convolution F(s_t) at time t is:
F(s_t) = Σ_{i=0}^{k-1} f(i) · hidden^j_{t-d·i}    (2)
where hidden^j_{t-d·i} is the j-th hidden layer's information for the access data at time t-d·i; f(i) is the convolution filter; k is the convolution kernel size; and d is the dilation coefficient, with d = 2^j for the j-th layer;
the output of the output layer of the temporal convolutional neural network is the output of the last hidden layer, expressed as the history information h(s) = {h(s_{t-H+1}), h(s_{t-H+2}), ..., h(s_{t-1}), h(s_t)} of the cache access request sequence, where h(s_t) denotes the history information of the cache access content at time t;
the step of training the cache eviction prediction neural network comprises:
1) Judging whether the current cache is not full or whether the current cache access request is in the cache; if yes, directly putting the current cache access request into the cache or acquiring it from the cache; otherwise, entering step 2);
2) Establishing a cache eviction prediction neural network;
3) Calculating the true reuse distance dis^true = {dis^true_1, ..., dis^true_n} of each cache access request by the Belady algorithm, wherein n is the total number of accesses and dis^true_n denotes the true reuse distance of the n-th cache access request;
4) Generating an eviction probability prob = {p_1, p_2, ..., p_W} for each cache line using the cache eviction prediction neural network;
5) Calculating the error between the cache line eviction probabilities output at time t by the cache eviction prediction network and by the Belady algorithm:
loss = bceloss(prob_t, prob^Belady_t)    (7)
wherein prob_t = {p_{t,1}, ..., p_{t,W}} and prob^Belady_t = {p̄_{t,1}, ..., p̄_{t,W}} are the cache line eviction probabilities at time t of the cache eviction prediction network and of the Belady algorithm, respectively; p_{t,i} and p̄_{t,i} are the probabilities of evicting the i-th cache line at time t; and W is the cache size;
wherein the Belady cache line eviction probability p̄_{t,i} satisfies p̄_{t,i} = 1 if cache line i is among the topk cache lines with the largest true reuse distance, and p̄_{t,i} = 0 otherwise    (8), (9)
wherein the parameter topk and the sorted sequence index^true are given by:
topk = int(W * α)    (10)
index^true = decend_sort(dis^true)    (11)
where α is the eviction percentage, int() is the round-down function, and decend_sort() is the descending-order sorting function;
6) Updating the neural network parameter ω using the gradient of the cache eviction prediction network; judging whether the loss function value has reached the minimum or the number of training rounds has reached the preset value; if yes, ending the training of the cache eviction prediction neural network; otherwise, entering step 7);
wherein the parameter ω of the cache eviction prediction neural network is updated as:
ω' = ω - ε_t · ∇_ω bceloss    (12)
wherein ω is the current parameter of the network; ω' is the updated parameter; ε_t is the current learning rate; ∇ is the gradient operator; and bceloss is the binary cross-entropy loss function;
7) The cache replacement module selecting a cache line for eviction and storing the current access into the cache;
8) Acquiring a new cache access request and returning to step 1).
2. The cache replacement system based on imitation learning according to claim 1, wherein the inputs of the attention mechanism neural network are the history information h(s) = {h(s_{t-H+1}), h(s_{t-H+2}), ..., h(s_{t-1}), h(s_t)} and the embedding e(l) = {e(l_1), e(l_2), e(l_3), ..., e(l_W)} of the cache line contents at the current time step, and the output is a context vector for each cache line, G = {g_1, g_2, ..., g_W}; where l_w denotes the w-th cache line in the cache; W denotes the number of cache lines in the cache; g_w denotes the context of the w-th cache line; and w = 1, 2, ..., W;
wherein the context vector g_w is:
g_w = Σ_{i=1}^{H} α_i · h(s_{t-H+i})    (3)
wherein the coefficient α_i is:
α_i = softmax(e(l_w)^T · W_e · h(s_{t-H+i}))    (4)
where W_e is a trainable parameter.
3. The cache replacement system based on imitation learning of claim 2, wherein the input of the socket layer is the context vector g_w and the output is the reuse distance dis = {dis_1, dis_2, ..., dis_W} of each cache line; wherein dis_w denotes the reuse distance of the w-th cache line;
wherein the reuse distance dis of each cache line is:
dis = Linear(Relu(Linear(g_w)))    (5)
wherein Relu is an activation function and Linear is a linear function;
the input of the output layer is the reuse distance dis of each cache line, and the output is the eviction probability prob = {p_1, p_2, ..., p_W}; where p_w denotes the eviction probability of the w-th cache line in the cache;
wherein the eviction probability prob of each cache line is:
prob = Sigmoid(dis)    (6)
wherein Sigmoid is an activation function.
4. A cache replacement system based on imitation learning as claimed in claim 3, wherein: the system also comprises a neural network training module;
the neural network training module is used for training the cache eviction prediction neural network of the cache replacement prediction module.
5. The cache replacement system based on imitation learning of claim 1, wherein the step of the cache replacement module storing the current cache access request in the cache comprises:
a1) judging whether the current cache is not full or whether the current cache access request is in the cache; if yes, directly putting the current cache access request into the cache or acquiring it from the cache; otherwise, entering step a2);
a2) the cache replacement module invoking the cache line eviction probability from the cache replacement prediction module and generating a cache line eviction policy;
a3) the cache replacement module performing cache line eviction according to the cache line eviction policy, so that the current cache access request is stored in the cache.
6. The imitation-learning-based cache replacement system of claim 5, wherein the cache line eviction policy is as follows:
index = decend_sort(prob)    (13)
evict_index = index(0)    (14)
where prob denotes the cache line eviction probability vector, decend_sort() is the descending-order sorting function, index is the resulting index sequence, index(0) denotes the cache line with the highest eviction probability, and evict_index is the eviction decision.
7. A method of using the cache replacement system according to any one of claims 1 to 6, comprising the following steps:
s1) building a cache replacement system based on imitation learning;
s2) obtaining a cache access request sequence;
s3) judging whether the current cache is not full or whether the current access is already in the cache; if yes, directly putting the current access into the cache or obtaining it from the cache and entering step s9); otherwise, entering step s4);
s4) training the cache replacement prediction neural network of the cache replacement prediction module;
s5) inputting the cache access request sequence into the trained cache replacement prediction neural network, computing the cache line eviction probabilities, and transmitting them to the cache replacement module;
s6) the cache replacement module judging whether the current cache is not full or whether the current cache access request is already in the cache; if yes, directly putting the current cache access request into the cache or obtaining it from the cache; otherwise, entering step s7);
s7) the cache replacement module invoking the cache line eviction probabilities from the cache replacement prediction module and generating a cache line eviction policy;
s8) the cache replacement module performing cache line eviction according to the cache line eviction policy, so that the current cache access request is stored in the cache;
s9) obtaining a new cache access request sequence and returning to step s3).
CN202210491621.9A 2022-05-07 2022-05-07 Cache replacement system and method based on imitation learning Active CN114780889B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210491621.9A CN114780889B (en) 2022-05-07 2022-05-07 Cache replacement system and method based on imitation learning

Publications (2)

Publication Number Publication Date
CN114780889A (en) 2022-07-22
CN114780889B (en) 2024-06-25

Family

Family ID: 82435100

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210491621.9A Active CN114780889B (en) 2022-05-07 2022-05-07 Cache replacement system and method based on imitation learning

Country Status (1)

Country Link
CN (1) CN114780889B (en)

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9779029B2 (en) * 2012-11-06 2017-10-03 Facebook, Inc. Cache replacement policy for data with strong temporal locality
CN112862060B (en) * 2019-11-28 2024-02-13 南京大学 Content caching method based on deep learning
CN112752308B (en) * 2020-12-31 2022-08-05 厦门越人健康技术研发有限公司 Mobile prediction wireless edge caching method based on deep reinforcement learning
CN113225380B (en) * 2021-04-02 2022-06-28 中国科学院计算技术研究所 Content distribution network caching method and system based on spectral clustering

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Shuting Qiu et al., "OA-Cache: Oracle Approximation-Based Cache Replacement at the Network Edge," IEEE Transactions on Network and Service Management, vol. 20, no. 3, pp. 3177-3189, 25 Jan. 2023. *

Also Published As

Publication number Publication date
CN114780889A (en) 2022-07-22


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant