CN116827515A - Fog computing system performance optimization algorithm based on blockchain and reinforcement learning

Publication number
CN116827515A
Authority
CN
China
Prior art keywords
blockchain
computing
fog
throughput
follows
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202310775693.0A
Other languages
Chinese (zh)
Inventor
孔令和
朱斌
葛威
吴世伟
刘伟
刘恒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Zhongxi Biological Information Co ltd
Original Assignee
Suzhou Zhongxi Biological Information Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Zhongxi Biological Information Co ltd
Priority to CN202310775693.0A
Publication of CN116827515A
Legal status: Withdrawn

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

Fog computing has attracted wide attention because its distributed nature alleviates, to some extent, the latency and performance problems of the traditional Internet of Things and cloud computing. However, fog nodes have limited computing resources and therefore cannot guarantee the security and privacy of data. The distributed trust mechanism of a blockchain combines well with fog computing to improve security and privacy. Blockchain technology as currently applied to fog computing nevertheless has limitations, one of which is low throughput. To satisfy data security and throughput requirements at the same time, we propose a fog computing framework combined with blockchain technology that can be adjusted dynamically according to the system state and that uses a Dueling deep reinforcement learning algorithm to obtain the optimal configuration. Simulations show that the scheme effectively improves the throughput of the blockchain-fog computing system while preserving security.

Description

Fog computing system performance optimization algorithm based on blockchain and reinforcement learning
Technical Field
We propose a blockchain-enabled fog computing system framework and a performance optimization algorithm that not only guarantee the security and privacy of data but also dynamically adjust blockchain throughput and delay by analyzing the system state and the users' QoS requirements, thereby improving overall system performance. By analyzing the resource state and system configuration of the blockchain in a complex, dynamic fog computing environment, we formulate the problem as a Markov decision process and optimize the system with an improved Dueling deep reinforcement learning algorithm. The algorithm dynamically adjusts the block producer, the block size, and the network bandwidth allocation to meet the QoS requirements of different users, increase throughput, and reduce delay. Simulation results show that, under different system parameters, the proposed blockchain-fog computing architecture and optimization algorithm improve system throughput and outperform existing schemes.
Background
In recent years, the development of internet of things has attracted a wide range of attention worldwide. The application range and scale of the internet of things are also expanding, including various fields such as home automation, smart cities, smart medical treatment, smart factories and the like. However, due to the limited resources of the internet of things device itself, the large data processing and storage requirements often require additional computing, storage, and network bandwidth resources. Although the traditional cloud computing mode can process data in a centralized manner, the computing resources of the internet of things equipment cannot be fully utilized, and the traditional cloud computing mode depends on the internet to transmit the data, so that the service requirements of low delay and high quality cannot be met.
To solve these problems, fog computing techniques have been developed. Fog computing is a novel computing architecture, and the main idea is to push computing, storage and network resources to the edge of the internet of things, so that a programmable and manageable computing platform is formed between edge equipment and a cloud data center. Therefore, faster, more reliable and safer computing service can be provided at the edge of the Internet of things, and meanwhile, the computing resources of the edge equipment can be fully utilized, so that the data transmission cost and delay are reduced. Through fog computing technology, the internet of things equipment can process data more intelligently and efficiently, and richer and diversified application scenes are realized. Fog computing is also considered as one of the important directions of the development of the internet of things in the future, and the fog computing promotes the development and application of the internet of things technology, so that more business opportunities and social values are brought.
However, fog computing also has drawbacks that remain to be addressed, the most significant of which concern data security and privacy. Fog computing involves a large amount of data transmission and storage, so the data must be secured. Because the edge devices used in fog computing have limited computing resources, their security is generally weak, and they face problems such as data leakage, data loss, and data tampering. In addition, frequent data transmission is required between the edge devices and the cloud data center, and a malicious attacker can attack the transmission channel to steal data or interfere with its transmission. Furthermore, edge devices typically collect and process large amounts of personal information and sensitive data, which can seriously affect personal privacy if leaked, and the involvement of multiple parties, such as edge devices and cloud data centers, further increases the risk of privacy leakage.
Blockchain is considered one of the most suitable technologies for addressing these challenges, because its distributed trust mechanism complements the distributed processing characteristics of fog computing. Blockchain consensus protocols and encryption techniques also help ensure the secure transmission of data in fog computing architectures. However, problems remain when blockchain is combined with fog computing: the throughput and delay of the blockchain may not fully meet the performance requirements of fog computing. Performance bottlenecks may therefore occur in practical applications and reduce the overall efficiency of the system.
Therefore, research on how to increase the throughput of blockchain and reduce latency in order to better accommodate fog computing environments has become an important point of academic concern.
To solve the above problems, a blockchain-fog computing performance optimization algorithm based on deep reinforcement learning is proposed herein. The algorithm adopts Dueling deep reinforcement learning so that the blockchain-fog computing system can adjust itself in a complex and changing environment. By monitoring the state of the fog computing nodes, the network bandwidth, and the QoS requirements of the users, the algorithm dynamically selects an appropriate block producer, block size, and network bandwidth allocation. This adaptive method increases the throughput and reduces the delay of the blockchain-fog computing system according to the users' QoS requirements, and provides an effective way to improve the scalability of the system.
Disclosure of Invention
1. For a fog computing system architecture combined with blockchain technology, an appropriate block producer, block size, and network bandwidth allocation are selected dynamically according to the fog computing node states, the network bandwidth, and the users' QoS requirements, so as to optimize system performance.
The method comprises the following steps:
(1) Design a blockchain-based fog computing system architecture divided into three layers: an Internet of Things layer, a fog computing-blockchain layer, and a cloud service layer. Each layer is defined as follows:
(1) In the Internet of Things layer, a large number of intelligent IoT devices, such as smart cars, smart environmental monitoring devices, and smart home devices, work cooperatively. These devices are connected to the fog computing nodes through efficient communication links and generate massive amounts of data, which are processed by the fog computing-blockchain layer.
(2) In the fog computing-blockchain layer, the fog computing devices cooperate closely with the intelligent devices of the Internet of Things layer. The fog computing devices first collect data from the intelligent devices and, after preliminary processing, package the data and transmit it to the blockchain system. In the blockchain system, the participants reach agreement through a consensus algorithm to ensure the authenticity and integrity of the data. When a local fog node does not have enough computing power to process the current task, it can offload part of the computation to a remote cloud server. This offloading strategy improves overall computing efficiency and helps ensure that tasks are completed in time. The blockchain system also ensures the integrity, traceability, and tamper resistance of the data transmitted among fog nodes and between the fog nodes and the cloud service layer. In addition, the consensus process does not consume the resources of the fog computing devices. Instead, the local consensus nodes and the remote cloud computing servers jointly support the whole consensus process, realizing distributed consensus decisions.
(3) In the cloud service layer, virtual machines running the blockchain system are deployed to handle the consensus demands of the fog computing-blockchain layer efficiently. Through these virtual machines, the blockchain system can reach distributed consensus over a very wide range and ensure the security, reliability, and tamper resistance of the data. The cloud service layer is also responsible for processing the complex computing tasks sent by the fog nodes: when a fog computing device lacks the computing capacity for a specific task, the task can be offloaded to the cloud service layer, whose servers have more computing capacity and resources and can process it quickly.
(2) Design a system resource model and a QoS model based on the blockchain-fog computing system architecture, defined as follows:
A fog node acting as block producer requires substantial computing resources, but because the fog computing-blockchain layer must interact with the cloud service layer, the computing power of a node in the next time slot is difficult to know exactly. We therefore model the computing power of fog node n as a random variable $c_n$ and assume that it can be divided into P discrete intervals, denoted $P = \{P_0, P_1, \ldots, P_{P-1}\}$. The computing power of fog node n in time slot t is denoted $c_n(t)$, and the state of this random variable is modeled by a Markov chain.
The $P \times P$ state transition probability matrix $R_n(t)$ is expressed as follows:
$$R_n(t) = \left[\Pr\left(c_n(t+1) = l_s \mid c_n(t) = w_s\right)\right]_{P \times P}, \quad l_s, w_s \in P \qquad (1)$$
In addition, the large volume of data transmitted in the fog computing-blockchain system occupies network bandwidth resources, but it is difficult to know exactly how much bandwidth will be available in the next time slot. We therefore assume that the network bandwidth resource is B and model it as a random variable $w_b$ that can be divided into X discrete intervals, denoted $X = \{X_0, X_1, \ldots, X_{X-1}\}$. The bandwidth resource available in time slot t is denoted $w_b(t)$, and the state of this random variable is likewise modeled by a Markov chain.
The $X \times X$ state transition probability matrix $O_b(t)$ is expressed as follows:
$$O_b(t) = \left[\Pr\left(w_b(t+1) = q_s \mid w_b(t) = m_s\right)\right]_{X \times X}, \quad q_s, m_s \in X \qquad (2)$$
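The two resource models above share one mechanism: a discrete-state Markov chain whose next state is drawn from the row of the transition matrix that corresponds to the current state. The following is a minimal sketch of that mechanism, not the patent's implementation. The level values, the matrix entries, and the function names `step_markov_chain` and `simulate` are hypothetical and chosen only for illustration.

```python
import random


def step_markov_chain(state, transition, rng):
    """Draw the next state index from row `state` of a row-stochastic matrix."""
    row = transition[state]
    return rng.choices(range(len(row)), weights=row, k=1)[0]


def simulate(levels, transition, start, slots, seed=0):
    """Return the resource level observed in each of `slots` time slots."""
    rng = random.Random(seed)
    state = start
    trace = [levels[state]]
    for _ in range(slots - 1):
        state = step_markov_chain(state, transition, rng)
        trace.append(levels[state])
    return trace


# Hypothetical example: three computing-power levels in arbitrary units.
LEVELS = [10, 20, 30]
# Assumed transition matrix R_n; each row sums to 1.
R_N = [
    [0.6, 0.3, 0.1],
    [0.2, 0.6, 0.2],
    [0.1, 0.3, 0.6],
]

trace = simulate(LEVELS, R_N, start=0, slots=5)
```

The same two functions can be reused for the bandwidth variable by passing bandwidth levels and a matrix playing the role of $O_b(t)$.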
In a blockchain-based fog computing system, quality of service (QoS) requirements may vary significantly across application scenarios. For example, some applications require low latency to respond quickly, while others need very high throughput to process large amounts of data. The blockchain configuration must therefore be adjusted dynamically to optimize system performance for different QoS requirements. To evaluate and adjust the quality of service of the fog computing-blockchain system, we measure its performance with two key metrics, throughput and delay. To represent them we introduce a vector Q, whose first component is the throughput criterion and whose second component is the latency (delay) criterion:
$$Q = [q_T, q_L] \qquad (3)$$
The throughput criterion $q_T$ is given by Equation (4) (not reproduced in this text), where $\delta_{target}$ is the throughput calculated under the current blockchain configuration and $\delta_{avg}$ is the throughput calculated under a standard blockchain configuration.
The delay criterion $q_L$ is given by Equation (5) (not reproduced in this text), where $\theta_{target}$ is the delay calculated under the current blockchain configuration and $\theta_{avg}$ is the delay calculated under a standard blockchain configuration.
We also introduce a user preference array $D = \{d_1, d_2\}$, where $d_1$ and $d_2$ are the user's preference weights for throughput and delay, respectively. Because the user's preference in a given time slot is uncertain, we model the throughput preference and the delay preference as random variables $D_1$ and $D_2$, each described by a Markov chain with state transition probability matrices $D_1(t)$ and $D_2(t)$, where $t \in \{0, 1, 2, \ldots, T-1\}$ denotes a time slot.
From the vectors Q and D, the QoS value of the adopted blockchain configuration is obtained as follows:
$$QoS = d_1 \cdot q_T + d_2 \cdot q_L \qquad (6)$$
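Equation (6) is a preference-weighted sum, so the same pair of criteria can rank configurations differently for different users. The sketch below illustrates this under stated assumptions: the configuration names and criterion values are invented, and $q_L$ is assumed to be oriented so that a larger value is better, since Equations (4) and (5) are not available in this text. The helper names `qos_value` and `best_config` are hypothetical.

```python
def qos_value(d1, d2, q_t, q_l):
    """Weighted QoS score of Equation (6): d1 * q_T + d2 * q_L."""
    return d1 * q_t + d2 * q_l


def best_config(candidates, d1, d2):
    """Return the name of the candidate with the highest QoS score.

    `candidates` maps a configuration name to its (q_T, q_L) pair.
    """
    return max(candidates, key=lambda name: qos_value(d1, d2, *candidates[name]))


# Hypothetical criteria; assumes larger q_L means better delay behaviour.
CANDIDATES = {
    "large_block": (1.4, 0.7),
    "small_block": (0.8, 1.3),
}

throughput_user = best_config(CANDIDATES, d1=0.9, d2=0.1)
latency_user = best_config(CANDIDATES, d1=0.1, d2=0.9)
```

With these assumed numbers, a user who weights throughput heavily is matched to the large-block configuration and a user who weights delay heavily is matched to the small-block one, which is the behaviour the preference array D is meant to capture.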
(3) Design the throughput and delay optimization algorithm based on Dueling deep reinforcement learning by establishing a Markov decision process with a state space, an action space, and a reward function.
(1) The state space is defined by Equation (7) (not reproduced in this text), where $d_1(t)$ and $d_2(t)$ are the user preference weights, $c_N(t)$ is the computing power of node N in time slot t, and $w_B(t)$ is the network bandwidth resource available in time slot t.
(2) The action space is defined as follows:
$$A(t) = \{a_n(t), a_b(t), a_s(t)\} \qquad (8)$$
where $a_n(t) \in \{1, 2, \ldots, n, \ldots, N\}$ indicates which node is selected as the block producer, $a_b(t) \in \{1, 2, \ldots, b, \ldots, B\}$ indicates the bandwidth resource selected from those available, and $a_s(t) \in \{1, 2, \ldots, s, \ldots, S\}$ indicates the block size level.
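A Q-network normally emits one value per discrete action, so the composite action of Equation (8) has to be mapped to a single index. The source does not say how this is done; the following sketch shows one common approach, a mixed-radix encoding, with the hypothetical helpers `encode_action` and `decode_action`. The 1-based ranges follow the definition of the action space above.

```python
def encode_action(a_n, a_b, a_s, n_bw, n_size):
    """Map a 1-based (node, bandwidth, size) triple to a 0-based flat index."""
    return ((a_n - 1) * n_bw + (a_b - 1)) * n_size + (a_s - 1)


def decode_action(index, n_bw, n_size):
    """Invert encode_action and return the 1-based (a_n, a_b, a_s) triple."""
    rest, s = divmod(index, n_size)
    n, b = divmod(rest, n_bw)
    return n + 1, b + 1, s + 1


# Hypothetical sizes: N = 4 nodes, B = 3 bandwidth levels, S = 5 size levels.
N, B, S = 4, 3, 5
num_actions = N * B * S  # width of the network's output layer
index = encode_action(2, 3, 4, n_bw=B, n_size=S)
triple = decode_action(index, n_bw=B, n_size=S)
```

The output layer then has N·B·S units, and the greedy action is recovered by decoding the index of the largest Q-value.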
(3) The reward function is defined by Equation (9) (not reproduced in this text), where $\delta_{target}$ and $\theta_{target}$ are the throughput and delay of the selected configuration, calculated by Equations (10) and (11) (not reproduced in this text). In those equations, $c_n(t)$ is the computing power of the selected block producer, $b_s$ is the number of transactions corresponding to the selected block size level, $w_b(t)$ is the selected bandwidth resource, and $\delta_{avg}$ and $\theta_{avg}$ are the standard throughput and delay obtained by averaging over all states.
(4) Design the Dueling deep reinforcement learning algorithm for throughput and delay optimization, which adjusts the configuration dynamically according to the state space, action space, and reward function.
In the deep reinforcement learning algorithm, the action-value function $Q(s, a)$ is expressed as follows:
$$Q(s, a) = E_\pi\left[\sum_{k=0}^{\infty} \gamma^k r_{t+k+1} \,\middle|\, s_t = s, a_t = a\right] \qquad (12)$$
where $E_\pi$ denotes the mathematical expectation under policy $\pi$, $\gamma \in (0, 1)$ is the discount factor that balances immediate and future returns, and $r_{t+k+1}$ is the immediate reward at time step $t+k+1$.
The network is trained by minimizing the loss function $L(\theta)$ so that its output approximates $Q(s, a)$:
$$L(\theta) = E\left[\left(r + \gamma \max_{a'} Q(s', a', \theta^-) - Q(s, a, \theta)\right)^2\right] \qquad (13)$$
where $\theta$ denotes the weights and biases of the evaluation network and $\theta^-$ denotes the weights and biases of the target network.
The output $Q_{eval}(s, a, \theta)$ of the evaluation network is a combination of $V(s)$ and $A(s, a)$, defined as follows:
$$Q_{eval}(s, a, \theta) = V(s, \theta, \delta) + A(s, a, \theta, \epsilon) \qquad (14)$$
where $\delta$ and $\epsilon$ are the parameters of the two independent streams.
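The combination in Equation (14) can be illustrated numerically. The sketch below takes the outputs of the two streams as plain numbers and adds them as in Equation (14). It also shows the mean-centred variant that is widely used for Dueling networks to make the split between V and A identifiable. That variant is not stated in this text and is included only for comparison. Both function names are hypothetical.

```python
def dueling_q(value, advantages):
    """Equation (14): Q(s, a) = V(s) + A(s, a) for every action a."""
    return [value + a for a in advantages]


def dueling_q_mean_centered(value, advantages):
    """Common variant (an assumption here): subtract the mean advantage."""
    mean_a = sum(advantages) / len(advantages)
    return [value + a - mean_a for a in advantages]


def argmax(values):
    """Index of the largest entry (first one on ties)."""
    return max(range(len(values)), key=values.__getitem__)


# Hypothetical stream outputs for one state with three actions.
v_s = 2.0
a_s = [1.0, -1.0, 0.5]

q_plain = dueling_q(v_s, a_s)
q_centered = dueling_q_mean_centered(v_s, a_s)
```

Both forms select the same greedy action, because subtracting the mean shifts every Q-value of a state by the same constant.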
The algorithm flow is as follows:
(1) Initialization:
  (1) initialize the evaluation network with weights and biases $\theta$;
  (2) initialize the target network with weights and biases $\theta^-$;
  (3) initialize the replay memory size M, the batch size B, and the greedy coefficient $\epsilon$.
(2) Train the evaluation network for K iterations, each of which comprises the following steps:
  (1) reset the environment to a random initial state $s_{init}$, i.e. $s(t) = s_{init}$;
  (2) while the state is not the terminal state, i.e. $s(t) \neq s_{terminal}$, repeat the following steps:
    (1) with probability $\epsilon$, select an action $a(t)$ at random;
    (2) otherwise, set $a(t) = \arg\max_a Q(s(t), a, \theta)$;
    (3) obtain the immediate reward $r(t)$ and the next state $s(t+1)$;
    (4) store the experience $(s(t), a(t), r(t), s(t+1))$ in the experience replay memory;
    (5) randomly sample experiences $(s(i), a(i), r(i), s(i+1))$ from the experience replay memory;
    (6) compute the two streams of the evaluation network, $V(s, \theta, \delta)$ and $A(s, a, \theta, \epsilon)$, and combine them into $Q_{eval}(s, a, \theta)$;
    (7) compute the target $Q_{target}$ with the target network;
    (8) if the next state is the terminal state, i.e. $s(t+1) = s_{terminal}$, then $Q_{target} = r(t)$; otherwise $Q_{target} = r(t) + \gamma \max_{a'} Q(s(t+1), a', \theta^-)$;
    (9) train the evaluation network to minimize the loss function $L(\theta) = E[(Q_{target} - Q_{eval})^2]$;
    (10) update the target network from the evaluation network after the latter has been trained for a period of time;
    (11) advance to the next state, $s(t) \leftarrow s(t+1)$.
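Three building blocks of the flow above can be written without any neural network library: ε-greedy action selection, the experience replay memory, and the target value computed with the target network's outputs. The sketch below is a minimal illustration under stated assumptions. The class and function names are hypothetical, Q-values are passed in as plain lists rather than produced by a network, and the terminal check is supplied by the caller as a `done` flag.

```python
import random
from collections import deque


class ReplayMemory:
    """Fixed-capacity store of (s, a, r, s_next) experiences."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)  # oldest entries are dropped first

    def store(self, s, a, r, s_next):
        self.buffer.append((s, a, r, s_next))

    def sample(self, batch_size, rng):
        k = min(batch_size, len(self.buffer))
        return rng.sample(list(self.buffer), k)


def select_action(q_values, epsilon, rng):
    """Random action with probability epsilon, otherwise the greedy action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)


def td_target(reward, next_q_values, gamma, done):
    """Q_target = r if the next state is terminal, else r + gamma * max Q."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)


rng = random.Random(0)
memory = ReplayMemory(capacity=2)
for step in range(3):
    memory.store(step, 0, 1.0, step + 1)  # the first experience is evicted

greedy = select_action([0.1, 0.9, 0.3], epsilon=0.0, rng=rng)
target = td_target(1.0, [0.5, 2.0], gamma=0.9, done=False)
```

In a full implementation `next_q_values` would come from the target network evaluated at s(t+1), and the sampled batch would feed a gradient step on the evaluation network.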
The inventive contribution is as follows:
The present invention proposes a solution for highly dynamic and complex fog computing environments. In such environments, conventional optimization methods adapt poorly to change and cannot adjust the system dynamically. To solve this problem, the invention exploits the adaptive, self-learning capability of Dueling deep reinforcement learning: it designs a blockchain-based fog computing system framework and uses the Dueling deep reinforcement learning algorithm to adjust the blockchain configuration dynamically according to the QoS requirements of different users. The method optimizes the throughput and delay of the blockchain-fog computing system and improves its overall performance.
Drawings
FIG. 1 is a block diagram of the blockchain-based fog computing system model of the present invention.
FIG. 2 compares the convergence performance of the algorithm-optimized architecture of the present invention with that of a fixed-configuration architecture.
FIG. 3 shows the QoS values of the algorithm-optimized architecture of the present invention and of a fixed-configuration architecture for different block sizes.
Detailed Description
For a fog computing system architecture combined with blockchain technology, an appropriate block producer, block size, and network bandwidth allocation are selected dynamically according to the fog computing node states, the network bandwidth, and the users' QoS requirements, so as to optimize system performance. The implementation comprises the following steps:
(1) Configure a blockchain-based fog computing system architecture divided into three layers: an Internet of Things layer, a fog computing-blockchain layer, and a cloud service layer.
In the Internet of Things layer, various intelligent IoT devices, such as smart cars, smart environmental monitoring devices, and smart home devices, work together. These devices are connected to the fog computing nodes through efficient communication links, and the large amounts of data they generate are passed to the fog computing-blockchain layer for processing and analysis.
In the fog computing-blockchain layer, the fog computing devices cooperate closely with the intelligent devices of the Internet of Things layer. The fog computing devices first collect data from the intelligent devices, package the data after preprocessing, and transmit it to the blockchain system. In the blockchain system, the participants reach agreement through a consensus algorithm to ensure the authenticity and integrity of the data. When the computing power of a local fog node is insufficient for the current task, part of the computation can be offloaded to a remote cloud server. This offloading strategy improves overall computing efficiency and helps ensure that tasks are completed on time. The blockchain system also ensures the integrity, traceability, and tamper resistance of the data transmitted between the fog nodes and the cloud service layer. In this layer, the consensus process does not consume the resources of the fog computing devices. Instead, the local consensus nodes and the remote cloud computing servers jointly support the whole consensus process, realizing distributed consensus decisions.
In the cloud service layer, virtual machines running the blockchain system are deployed to handle the consensus demands of the fog computing-blockchain layer efficiently. Through these virtual machines, the blockchain system can reach distributed consensus over a very wide range and ensure the security, reliability, and tamper resistance of the data. The cloud service layer is also responsible for processing the complex computing tasks sent by the fog nodes: when a fog computing device lacks the computing power for a specific task, the task can be offloaded to the cloud service layer, whose servers have more computing power and resources and can process it quickly.
(2) Configure the system resource model and the QoS model based on the blockchain-fog computing system architecture as follows:
A fog node acting as block producer requires substantial computing resources, but because the fog computing-blockchain layer must interact with the cloud service layer, the computing power of a node in the next time slot is difficult to know exactly. We therefore model the computing power of fog node n as a random variable $c_n$ and assume that it can be divided into P discrete intervals, denoted $P = \{P_0, P_1, \ldots, P_{P-1}\}$. The computing power of fog node n in time slot t is denoted $c_n(t)$, and the state of this random variable is modeled by a Markov chain.
The $P \times P$ state transition probability matrix $R_n(t)$ is expressed as follows:
$$R_n(t) = \left[\Pr\left(c_n(t+1) = l_s \mid c_n(t) = w_s\right)\right]_{P \times P}, \quad l_s, w_s \in P \qquad (1)$$
In addition, the large volume of data transmitted in the fog computing-blockchain system occupies network bandwidth resources, but it is difficult to know exactly how much bandwidth will be available in the next time slot. We therefore assume that the network bandwidth resource is B and model it as a random variable $w_b$ that can be divided into X discrete intervals, denoted $X = \{X_0, X_1, \ldots, X_{X-1}\}$. The bandwidth resource available in time slot t is denoted $w_b(t)$, and the state of this random variable is likewise modeled by a Markov chain.
The $X \times X$ state transition probability matrix $O_b(t)$ is expressed as follows:
$$O_b(t) = \left[\Pr\left(w_b(t+1) = q_s \mid w_b(t) = m_s\right)\right]_{X \times X}, \quad q_s, m_s \in X \qquad (2)$$
In a blockchain-based fog computing system, quality of service (QoS) requirements may vary significantly across application scenarios. For example, some applications require low latency to respond quickly, while others need very high throughput to process large amounts of data. The blockchain configuration must therefore be adjusted dynamically to optimize system performance for different QoS requirements. To evaluate and adjust the quality of service of the fog computing-blockchain system, we measure its performance with two key metrics, throughput and delay. To represent them we introduce a vector Q, whose first component is the throughput criterion and whose second component is the latency (delay) criterion:
$$Q = [q_T, q_L] \qquad (3)$$
The throughput criterion $q_T$ is given by Equation (4) (not reproduced in this text), where $\delta_{target}$ is the throughput calculated under the current blockchain configuration and $\delta_{avg}$ is the throughput calculated under a standard blockchain configuration.
The delay criterion $q_L$ is given by Equation (5) (not reproduced in this text), where $\theta_{target}$ is the delay calculated under the current blockchain configuration and $\theta_{avg}$ is the delay calculated under a standard blockchain configuration.
We also introduce a user preference array $D = \{d_1, d_2\}$, where $d_1$ and $d_2$ are the user's preference weights for throughput and delay, respectively. Because the user's preference in a given time slot is uncertain, we model the throughput preference and the delay preference as random variables $D_1$ and $D_2$, each described by a Markov chain with state transition probability matrices $D_1(t)$ and $D_2(t)$, where $t \in \{0, 1, 2, \ldots, T-1\}$ denotes a time slot.
From the vectors Q and D, the QoS value of the adopted blockchain configuration is obtained as follows:
$$QoS = d_1 \cdot q_T + d_2 \cdot q_L \qquad (6)$$
(3) Configure the throughput and delay optimization algorithm based on Dueling deep reinforcement learning by establishing a Markov decision process with a state space, an action space, and a reward function.
(1) The state space is defined by Equation (7) (not reproduced in this text), where $d_1(t)$ and $d_2(t)$ are the user preference weights, $c_N(t)$ is the computing power of node N in time slot t, and $w_B(t)$ is the network bandwidth resource available in time slot t.
(2) The action space is defined as follows:
$$A(t) = \{a_n(t), a_b(t), a_s(t)\} \qquad (8)$$
where $a_n(t) \in \{1, 2, \ldots, n, \ldots, N\}$ indicates which node is selected as the block producer, $a_b(t) \in \{1, 2, \ldots, b, \ldots, B\}$ indicates the bandwidth resource selected from those available, and $a_s(t) \in \{1, 2, \ldots, s, \ldots, S\}$ indicates the block size level.
(3) The reward function is defined by Equation (9) (not reproduced in this text), where $\delta_{target}$ and $\theta_{target}$ are the throughput and delay of the selected configuration, calculated by Equations (10) and (11) (not reproduced in this text). In those equations, $c_n(t)$ is the computing power of the selected block producer, $b_s$ is the number of transactions corresponding to the selected block size level, $w_b(t)$ is the selected bandwidth resource, and $\delta_{avg}$ and $\theta_{avg}$ are the standard throughput and delay obtained by averaging over all states.
(4) Throughput and delay optimization algorithms based on the lasting deep reinforcement learning are configured and dynamically adjusted according to state space, action space and rewarding function.
In the deep reinforcement learning algorithm, the action-state value function Q(s, a) is expressed as follows:
$$Q^{\pi}(s, a) = E_{\pi}\left[\sum_{k=0}^{\infty} \gamma^{k} r_{t+k+1} \,\middle|\, s_t = s,\ a_t = a\right] \tag{12}$$
where $E_{\pi}$ denotes the mathematical expectation under the policy, $\gamma \in (0, 1)$ is the discount factor that balances immediate and future rewards, and $r_{t+k+1}$ is the immediate reward obtained in time step t+k+1 under the policy.
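The network is trained by minimizing the loss function $L(\theta)$ so that its output approximates the Q(s, a) function; the loss is expressed as follows:
$$L(\theta) = E\left[\left(r + \gamma \max_{a'} Q(s', a', \theta^{-}) - Q(s, a, \theta)\right)^{2}\right] \tag{13}$$
where $\theta$ denotes the weights and biases of the evaluated network and $\theta^{-}$ denotes the weights and biases of the target network.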
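The loss is the mean squared difference between the target value Q_target and the evaluated value Q_eval, as used in the training steps of the algorithm flow. A minimal batch version might look like the sketch below; the function names and the list-based data layout are assumptions, and the Q values are passed in as plain numbers rather than produced by a real network.

```python
from typing import List, Sequence


def td_targets(rewards: Sequence[float],
               next_q_target: Sequence[Sequence[float]],
               terminal: Sequence[bool],
               gamma: float) -> List[float]:
    """Q_target = r for terminal transitions, else r + gamma * max_a' Q(s', a')."""
    targets = []
    for r, q_next, done in zip(rewards, next_q_target, terminal):
        targets.append(r if done else r + gamma * max(q_next))
    return targets


def td_loss(q_eval_taken: Sequence[float], targets: Sequence[float]) -> float:
    """Batch estimate of L(theta) = E[(Q_target - Q_eval)^2]."""
    errors = [(y - q) ** 2 for y, q in zip(targets, q_eval_taken)]
    return sum(errors) / len(errors)


# Hypothetical batch of two transitions; the second one ends the episode.
targets = td_targets([1.0, 2.0], [[0.5, 1.5], [3.0, 0.0]], [False, True], gamma=0.9)
print(td_loss([2.0, 1.0], targets))
```

Here `next_q_target` stands for the target network's outputs for the next states, so only the evaluated network's outputs would carry gradients in a real training step.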
The output $Q_{eval}(s, a, \theta)$ of the evaluated network is a combination of V(s) and A(s, a), defined as follows:
$$Q_{eval}(s, a, \theta) = V(s, \theta, \delta) + A(s, a, \theta, \varepsilon) \tag{14}$$
where $\delta$ and $\varepsilon$ are the parameters of the two independent streams.
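The combination in equation (14) can be illustrated with plain numbers. The sketch below shows the sum as written in (14) and, for comparison, the mean-centered variant used in the original Dueling DQN paper, which subtracts the average advantage so that V and A are uniquely determined. Whether this document's algorithm applies such centering is not stated, so the second function is an assumption shown only for contrast; both function names are hypothetical.

```python
from typing import List


def dueling_q(value: float, advantages: List[float]) -> List[float]:
    """Equation (14): Q(s, a) = V(s) + A(s, a) for every action a."""
    return [value + adv for adv in advantages]


def dueling_q_mean_centered(value: float, advantages: List[float]) -> List[float]:
    """Common variant: Q(s, a) = V(s) + A(s, a) - mean over a' of A(s, a')."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + adv - mean_adv for adv in advantages]


# Hypothetical outputs of the two streams for one state with three actions.
v, adv = 2.0, [1.0, -1.0, 0.5]
print(dueling_q(v, adv))
print(dueling_q_mean_centered(v, adv))
```

Both forms rank the actions identically for a given state, so the greedy action is the same; they differ only in how the value is split between the two streams.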
The algorithm flow is as follows:
(1) Initialization:
a) initialize the evaluated network with weights and biases $\theta$;
b) initialize the target network with weights and biases $\theta^{-}$;
c) initialize the memory size M, the batch size B and the greedy coefficient $\epsilon$.
(2) The evaluated network is trained for K iterations, each of which comprises the following steps:
a) reset the environment with a random initial state $s_{init}$, i.e. $s(t) = s_{init}$;
b) while the state is not the terminal state, i.e. $s(t) \neq s_{terminal}$, repeat the following steps:
1) with probability $\epsilon$, select an action a(t) at random;
2) otherwise, set $a(t) = \arg\max_a Q(s(t), a, \theta)$;
3) obtain the immediate reward r(t) and the next state s(t+1);
4) store the experience (s(t), a(t), r(t), s(t+1)) in the experience replay memory;
5) randomly sample experiences (s(i), a(i), r(i), s(i+1)) from the experience replay memory;
6) compute the two streams of the evaluated network, $V(s, \theta, \delta)$ and $A(s, a, \theta, \varepsilon)$, and combine them into $Q_{eval}(s, a, \theta)$;
7) compute the target value $Q_{target}$ with the target network;
8) if the next state is the terminal state, i.e. $s(t+1) = s_{terminal}$, then $Q_{target} = r(t)$; otherwise, $Q_{target} = r(t) + \gamma \max_{a'} Q(s(t+1), a', \theta^{-})$;
9) train the evaluated network to minimize the loss function $L(\theta) = E[(Q_{target} - Q_{eval})^2]$;
10) update the target network from the evaluated network after it has been trained for a period of time, and advance to the next state, $s(t) \leftarrow s(t+1)$.
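The control flow above can be sketched in a few lines of standard-library Python. To keep the example self-contained, the two neural networks are replaced by lookup tables (`q_eval` and `q_target`), so the gradient step on the squared loss reduces to a tabular update, and the environment is a hypothetical five-state chain rather than the fog computing-blockchain system. All names, hyperparameters and the environment are assumptions; the sketch illustrates the ε-greedy choice, the replay memory, the target computation and the periodic target update, not the dueling architecture itself.

```python
import random
from collections import deque


def step(state, action, n_states=5):
    """Hypothetical toy environment: a chain whose last state is terminal."""
    nxt = max(0, state - 1) if action == 0 else state + 1
    done = nxt == n_states - 1
    return (1.0 if done else 0.0), nxt, done


def greedy(q_row, rng):
    """argmax over actions, breaking ties at random."""
    best = max(q_row)
    return rng.choice([a for a, q in enumerate(q_row) if q == best])


def train(episodes=500, n_states=5, n_actions=2, gamma=0.9, epsilon=0.2,
          lr=0.1, memory_size=500, batch_size=16, sync_every=20, seed=0):
    rng = random.Random(seed)
    q_eval = [[0.0] * n_actions for _ in range(n_states)]  # stands in for theta
    q_target = [row[:] for row in q_eval]                  # stands in for theta^-
    memory = deque(maxlen=memory_size)                     # experience replay
    updates = 0
    for _ in range(episodes):
        s = rng.randrange(n_states - 1)                    # random non-terminal start
        done = False
        while not done:
            # Steps 1-2: epsilon-greedy action selection.
            if rng.random() < epsilon:
                a = rng.randrange(n_actions)
            else:
                a = greedy(q_eval[s], rng)
            # Steps 3-4: observe the reward and next state, store the experience.
            r, s_next, done = step(s, a, n_states)
            memory.append((s, a, r, s_next, done))
            # Steps 5-9: sample a batch and move Q_eval toward Q_target.
            batch = rng.sample(list(memory), min(batch_size, len(memory)))
            for bs, ba, br, bs_next, bdone in batch:
                target = br if bdone else br + gamma * max(q_target[bs_next])
                q_eval[bs][ba] += lr * (target - q_eval[bs][ba])
            # Step 10: periodically copy the evaluated table into the target table.
            updates += 1
            if updates % sync_every == 0:
                q_target = [row[:] for row in q_eval]
            s = s_next
    return q_eval


q = train()
print([max(range(2), key=lambda a: q[s][a]) for s in range(4)])
```

In a full implementation the tables would be replaced by the evaluated and target networks with the two-stream head of equation (14), and the tabular update by an optimizer step on L(θ).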
We assume that 9 fog nodes can serve as consensus nodes for blockchain consensus, that 4 bandwidth resources are available, and that the block size is divided into 4 levels. The computing power of a fog node is divided into three levels (high, medium and low), and its state transition probability matrix is as follows:
Similarly, the state of the available bandwidth resources is divided into three levels (good, medium and bad), and its state transition probability matrix is set as follows:
Likewise, the user's throughput preference and delay preference each take the levels high, medium and low, and the user's state transition probability matrix is set as follows:
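A three-level Markov chain of the kind described above can be simulated as in the following sketch. The transition probabilities in `TRANSITIONS` are hypothetical placeholders chosen only so that each row sums to 1; they are not the values used in the invention, and the function names are likewise assumptions.

```python
import random
from typing import List

LEVELS = ["high", "medium", "low"]

# Hypothetical row-stochastic matrix: row i gives Pr(next level | current level i).
TRANSITIONS = [
    [0.6, 0.3, 0.1],
    [0.2, 0.6, 0.2],
    [0.1, 0.3, 0.6],
]


def next_state(current: int, matrix: List[List[float]], rng: random.Random) -> int:
    """Draw the next state index from the row of the current state."""
    return rng.choices(range(len(matrix)), weights=matrix[current], k=1)[0]


def simulate(start: int, steps: int, matrix: List[List[float]], seed: int = 0) -> List[int]:
    """Return the visited state indices, beginning with the start state."""
    rng = random.Random(seed)
    path = [start]
    for _ in range(steps):
        path.append(next_state(path[-1], matrix, rng))
    return path


path = simulate(start=0, steps=10, matrix=TRANSITIONS)
print([LEVELS[i] for i in path])
```

The same routine can drive the computing power of each fog node, each bandwidth resource and each user preference weight, with one matrix per quantity.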
FIG. 2 compares the convergence performance of the architecture optimized by the algorithm of the present invention with that of a fixed-configuration architecture, where the X axis is the number of iterations and the Y axis is the convergence performance expressed as the QoS value. The proposed scheme attains a higher QoS, nearly twice that of the fixed-configuration scheme, which demonstrates its effectiveness. FIG. 3 shows the QoS values of the two architectures under different block sizes. The proposed scheme likewise performs best at every block size, far exceeding the fixed-configuration scheme, and it can dynamically adjust the block size for different systems to obtain better performance.

Claims (6)

1. A system performance optimization algorithm for a fog computing system architecture combined with blockchain technology, which dynamically selects a suitable block producer and block size and allocates network bandwidth according to the state of the fog computing nodes, the network bandwidth and the QoS requirements of users, characterized in that:
first, the blockchain-based fog computing system architecture is divided into three layers: an Internet of Things layer, a fog computing-blockchain layer and a cloud service layer. In the Internet of Things layer, a large number of intelligent Internet of Things devices work cooperatively and are connected to the fog computing nodes through efficient communication links; the mass data generated in this process is transmitted to the fog computing-blockchain layer for processing. In the fog computing-blockchain layer, the fog computing devices cooperate closely with the intelligent devices of the Internet of Things layer, collect data from them and, after preliminary processing, package the data and transmit it to the blockchain system, in which all participants reach agreement through a consensus algorithm to ensure the authenticity and integrity of the data. Finally, virtual machines running the blockchain system are deployed in the cloud service layer and handle the consensus requirements from the fog computing-blockchain layer. Second, based on the blockchain-fog computing system architecture, the system state is evaluated using a system resource model and a QoS model; a Markov decision process comprising a state space, an action space and a reward function is established for the throughput and delay optimization algorithm based on Dueling deep reinforcement learning; and finally the system is dynamically adjusted by the designed algorithm according to the state space, the action space and the reward function.
2. The blockchain-fog computing system performance optimization algorithm of claim 1, wherein:
when the computing capacity of a local fog node device is insufficient for the current task, the device can offload part of the computing task to a remote cloud server; this offloading strategy improves the overall computing efficiency of the system so that tasks are completed in a reasonable time. Meanwhile, the blockchain system ensures the integrity, traceability and tamper resistance of the data transmitted among the fog nodes and between the fog nodes and the cloud server layer. In addition, in the fog computing-blockchain system, the consensus process does not consume the resources of the fog computing devices; instead, the local consensus nodes and the remote cloud computing servers jointly support the whole consensus process, thereby realizing distributed consensus decision-making.
3. The blockchain-fog computing system performance optimization algorithm of claim 1, wherein:
through the virtual machines in the cloud server layer, the blockchain system can realize distributed consensus over a very large range and ensure the security, reliability and tamper resistance of the data. In addition, the cloud service layer is responsible for processing complex computing tasks sent by the fog node devices: when the computing capacity $c_n$ of a fog computing device is insufficient to process a specific task, the task is offloaded to the cloud service layer, whose cloud servers have more powerful computing capacity and resources and can process such complex tasks quickly.
4. The blockchain-fog computing system performance optimization algorithm of claim 1, wherein:
a fog node acting as a block producer requires substantial computing resources, but because the fog computing-blockchain layer needs to interact with the cloud server layer, it is difficult to know accurately the computing power of a node in the next time slot. Therefore, the computing power of fog node n is modeled as a random variable $c_n$, which is assumed to take values in P discrete intervals, denoted $P = \{P_0, P_1, \ldots, P_{P-1}\}$. The computing power of fog node n in time slot t is denoted $c_n(t)$, and the state of this random variable is modeled by a Markov chain.
The P × P state transition probability matrix $R_n(t)$ is expressed as follows:
$$R_n(t) = \left[\Pr\left(c_n(t+1) = l_s \mid c_n(t) = w_s\right)\right]_{P \times P}, \quad l_s, w_s \in P \tag{1}$$
In addition, transmitting large amounts of data in the fog computing-blockchain system requires network bandwidth resources, but it is difficult to know exactly how much network bandwidth will be available in the next time slot. We therefore assume that there are B network bandwidth resources and model each as a random variable $w_b$ that takes values in X discrete intervals, denoted $X = \{X_0, X_1, \ldots, X_{X-1}\}$. $w_b(t)$ is the bandwidth resource available in time slot t, and the state of this random variable is likewise modeled by a Markov chain.
The X × X state transition probability matrix $O_b(t)$ is expressed as follows:
$$O_b(t) = \left[\Pr\left(w_b(t+1) = q_s \mid w_b(t) = m_s\right)\right]_{X \times X}, \quad q_s, m_s \in X \tag{2}$$
In a blockchain-based fog computing system, quality of service (QoS) requirements may differ significantly between application scenarios. For example, some applications require low latency for fast response, while others need very high throughput to process large amounts of data. The blockchain configuration must therefore be adjusted dynamically to optimize system performance for the QoS requirements at hand. To evaluate and adjust the quality of service of the fog computing-blockchain system, we measure its performance with two key metrics, throughput and delay. These are represented by a vector Q, whose first component is the throughput criterion and whose second component is the delay criterion:
$$Q = [q_T, q_L] \tag{3}$$
where the throughput criterion is calculated as follows:
where $\delta_{target}$ is the throughput calculated using the current blockchain configuration and $\delta_{avg}$ is the throughput calculated using a standard blockchain configuration.
where $\theta_{target}$ is the delay calculated using the current blockchain configuration and $\theta_{avg}$ is the delay calculated under a standard blockchain configuration.
At the same time, we introduce a user preference array $D = \{d_1, d_2\}$, where $d_1$ and $d_2$ are the user's preference weights for throughput and delay, respectively. Because the user's preference in a given time slot is uncertain, we model the throughput preference and the delay preference as random variables $d_1$ and $d_2$ and describe each with a Markov chain, obtaining the state transition probability matrices $D_1(t)$ and $D_2(t)$, where $t \in \{0, 1, 2, \ldots, T-1\}$ denotes the time slot.
From the vectors Q and D introduced above, the QoS value of the adopted blockchain configuration is obtained as follows:
$$\mathrm{QoS} = d_1 \cdot q_T + d_2 \cdot q_L \tag{6}$$
5. The blockchain-fog computing system performance optimization algorithm of claim 1, wherein:
(1) The state space is defined as follows:
where $d_1(t)$ and $d_2(t)$ are the user preference weights, $c_N(t)$ is the computing power of node N in time slot t, and $w_B(t)$ is the network bandwidth resource available in time slot t.
(2) The action space is defined as follows:
$$A(t) = \{a_n(t), a_b(t), a_s(t)\} \tag{8}$$
where $a_n(t) \in \{1, 2, \ldots, n, \ldots, N\}$ indicates which node is selected as the block generator, $a_b(t) \in \{1, 2, \ldots, b, \ldots, B\}$ indicates which available bandwidth resource is selected, and $a_s(t) \in \{1, 2, \ldots, S\}$ indicates the level of the data block size.
(3) The reward function is defined as follows:
where $\delta_{target}$ and $\theta_{target}$ are the throughput and delay of the selected configuration, calculated as follows:
where $c_n(t)$ is the computing power of the selected block generator, $b_s$ is the number of transactions corresponding to the selected block size level, $w_b(t)$ is the selected bandwidth resource, and $\delta_{avg}$ and $\theta_{avg}$ are the standard throughput and delay calculated by averaging over all states.
6. The blockchain-fog computing system performance optimization algorithm of claim 1, wherein:
in the deep reinforcement learning algorithm, the action-state value function Q(s, a) is expressed as follows:
$$Q^{\pi}(s, a) = E_{\pi}\left[\sum_{k=0}^{\infty} \gamma^{k} r_{t+k+1} \,\middle|\, s_t = s,\ a_t = a\right] \tag{12}$$
where $E_{\pi}$ denotes the mathematical expectation under the policy, $\gamma \in (0, 1)$ is the discount factor that balances immediate and future rewards, and $r_{t+k+1}$ is the immediate reward obtained in time step t+k+1 under the policy.
The network is trained by minimizing the loss function $L(\theta)$ so that its output approximates the Q(s, a) function; the loss is expressed as follows:
$$L(\theta) = E\left[\left(r + \gamma \max_{a'} Q(s', a', \theta^{-}) - Q(s, a, \theta)\right)^{2}\right] \tag{13}$$
where $\theta$ denotes the weights and biases of the evaluated network and $\theta^{-}$ denotes the weights and biases of the target network.
The output $Q_{eval}(s, a, \theta)$ of the evaluated network is a combination of V(s) and A(s, a), defined as follows:
$$Q_{eval}(s, a, \theta) = V(s, \theta, \delta) + A(s, a, \theta, \varepsilon) \tag{14}$$
where $\delta$ and $\varepsilon$ are the parameters of the two independent streams.
The algorithm flow is as follows:
(1) Initialization:
a) initialize the evaluated network with weights and biases $\theta$;
b) initialize the target network with weights and biases $\theta^{-}$;
c) initialize the memory size M, the batch size B and the greedy coefficient $\epsilon$.
(2) The evaluated network is trained for K iterations, each of which comprises the following steps:
a) reset the environment with a random initial state $s_{init}$, i.e. $s(t) = s_{init}$;
b) while the state is not the terminal state, i.e. $s(t) \neq s_{terminal}$, repeat the following steps:
1) with probability $\epsilon$, select an action a(t) at random;
2) otherwise, set $a(t) = \arg\max_a Q(s(t), a, \theta)$;
3) obtain the immediate reward r(t) and the next state s(t+1);
4) store the experience (s(t), a(t), r(t), s(t+1)) in the experience replay memory;
5) randomly sample experiences (s(i), a(i), r(i), s(i+1)) from the experience replay memory;
6) compute the two streams of the evaluated network, $V(s, \theta, \delta)$ and $A(s, a, \theta, \varepsilon)$, and combine them into $Q_{eval}(s, a, \theta)$;
7) compute the target value $Q_{target}$ with the target network;
8) if the next state is the terminal state, i.e. $s(t+1) = s_{terminal}$, then $Q_{target} = r(t)$; otherwise, $Q_{target} = r(t) + \gamma \max_{a'} Q(s(t+1), a', \theta^{-})$;
9) train the evaluated network to minimize the loss function $L(\theta) = E[(Q_{target} - Q_{eval})^2]$;
10) update the target network from the evaluated network after it has been trained for a period of time, and advance to the next state, $s(t) \leftarrow s(t+1)$.
CN202310775693.0A 2023-06-28 2023-06-28 Fog computing system performance optimization algorithm based on blockchain and reinforcement learning Withdrawn CN116827515A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310775693.0A CN116827515A (en) 2023-06-28 2023-06-28 Fog computing system performance optimization algorithm based on blockchain and reinforcement learning


Publications (1)

Publication Number Publication Date
CN116827515A (en) 2023-09-29

Family

ID=88114172


Country Status (1)

Country Link
CN (1) CN116827515A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117938886A (en) * 2024-03-25 2024-04-26 武汉烽火信息集成技术有限公司 Cross-chain block multi-source selection storage method and system based on reinforcement learning

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019179471A1 (en) * 2018-03-21 2019-09-26 南京邮电大学 Fog computing architecture based on internet of things environment
CN112364317A (en) * 2020-11-17 2021-02-12 中国传媒大学 Internet of things fog environment management architecture and method based on block chain technology
CN114020079A (en) * 2021-11-03 2022-02-08 北京邮电大学 Indoor space temperature and humidity regulation and control method and device
CN114143062A (en) * 2021-11-25 2022-03-04 中南财经政法大学 Block chain-based security authentication system, method, terminal and medium for fog computing environment
CN115412157A (en) * 2022-08-22 2022-11-29 北京鹏鹄物宇科技发展有限公司 Emergency rescue oriented satellite energy-carrying Internet of things resource optimal allocation method


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
DAI MEILING 等: "Blockchain-Based Reliable Fog-Cloud Service Solution for IIoT", CHINESE JOURNAL OF ELECTRONICS, 31 March 2021 (2021-03-31) *
YIHE ZHANG 等: "Performance Optimization Blockchain-Enabled Fog Computing with Deep Reinforcement Learning", ICCNS \'22: PROCEEDINGS OF THE 2022 12TH INTERNATIONAL CONFERENCE ON COMMUNICATION AND NETWORK SECURITY, 3 December 2022 (2022-12-03), pages 3 - 6 *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20230929
