CN112637032A

CN112637032A - Service function chain deployment method and device

Info

Publication number: CN112637032A
Application number: CN202011377655.2A
Authority: CN
Inventors: 董秋丽; 李福昌; 钟志刚; 冯毅; 王筱斐; 张天魁
Original assignee: China United Network Communications Group Co Ltd
Current assignee: China United Network Communications Group Co Ltd
Priority date: 2020-11-30
Filing date: 2020-11-30
Publication date: 2021-04-09
Anticipated expiration: 2040-11-30
Also published as: CN112637032B

Abstract

The embodiment of the application provides a service function chain deployment method and device, relates to the technical field of communication, and solves the technical problems that the service function chain deployment flexibility is poor and a good deployment effect is difficult to obtain in the prior art. The deployment method of the service function chain comprises the following steps: randomly acquiring sample data of an experience replay buffer; updating the actor network and the target actor network according to the sample data; under the condition that the return value of the return function of the sample data is converged, outputting the parameters of the target actor network; and determining a physical node according to the parameters of the target actor network, and deploying the service function chain through the determined physical node.

Description

Service function chain deployment method and device

Technical Field

The present application relates to the field of communications technologies, and in particular, to a method and an apparatus for deploying a service function chain.

Background

In the service function chain deployment process, each Virtual Network Function (VNF) needs an infrastructure network to provide it with physical network resources. In actual deployment, however, the physical network resources (such as the amount of computing resources) are usually assumed to be a fixed value, and when that fixed value is set unreasonably, the network delay is large. Service function chain deployment in the prior art is therefore inflexible, and it is difficult to obtain a good deployment effect.

Disclosure of Invention

The application provides a service function chain deployment method and device, and solves the technical problems that in the prior art, the service function chain deployment flexibility is poor, and a good deployment effect is difficult to obtain.

In order to achieve the purpose, the technical scheme is as follows:

In a first aspect, a method for deploying a service function chain is provided, including: randomly acquiring sample data of the experience replay buffer; updating the actor network and the target actor network according to the sample data; under the condition that the return value of the return function of the sample data is converged, outputting the parameters of the target actor network; and determining a physical node according to the parameters of the target actor network, and deploying the service function chain through the determined physical node.

In the embodiment of the application, the sample data of the experience replay buffer can be randomly acquired; updating the actor network and the target actor network according to the sample data; under the condition that the return value of the return function of the sample data is converged, outputting the parameters of the target actor network; and finally, determining a physical node according to the parameters of the target actor network, and deploying a service function chain through the determined physical node. By the scheme, the actor network and the target actor network can be continuously updated according to the sample data until the return value is converged, so that the parameters of the target actor network for realizing optimal deployment can be output, the optimal deployment of a service function chain is realized, and the system cost and the network delay are further reduced.

In a second aspect, a device for deploying a service function chain is provided, which includes an acquisition unit and a processing unit. The acquisition unit is used for randomly acquiring the sample data of the experience replay buffer. The processing unit is used for updating the actor network and the target actor network according to the sample data; under the condition that the return value of the return function of the sample data is converged, outputting the parameters of the target actor network; and determining a physical node according to the parameters of the target actor network, and deploying the service function chain through the determined physical node.

In a third aspect, a deployment apparatus for a service function chain is provided that includes a memory and a processor. The memory is used for storing computer execution instructions, and the processor is connected with the memory through a bus. When the deployment device of the service function chain is running, the processor executes the computer execution instructions stored in the memory to make the deployment device of the service function chain execute the deployment method of the service function chain provided in the first aspect.

In a fourth aspect, a computer-readable storage medium is provided, which comprises computer-executable instructions, which, when executed on a computer, cause the computer to perform the method for deploying a service function chain provided in the first aspect.

In a fifth aspect, a computer program product is provided, which comprises computer instructions that, when run on a computer, cause the computer to perform the method of deploying a service function chain as provided in the first aspect and its various possible implementations.

It should be noted that all or part of the computer instructions may be stored on the computer readable storage medium. The computer readable storage medium may be packaged with the processor of the deployment device of the service function chain, or may be packaged separately from the processor of the deployment device of the service function chain, which is not limited in this application.

In the description of the second aspect, the third aspect, the fourth aspect, and the fifth aspect in the present application, reference may be made to the detailed description of the first aspect, which is not repeated herein; in addition, for the beneficial effects described in the second aspect, the third aspect, the fourth aspect and the fifth aspect, reference may be made to the beneficial effect analysis of the first aspect, and details are not repeated here.

In the present application, the names of the above-mentioned deployment means of the service function chain do not constitute a limitation on the devices or function modules themselves, which may appear under other names in an actual implementation. Insofar as the functions of the respective devices or functional blocks are similar to those of the present application, they come within the scope of the appended claims and their equivalents.

These and other aspects of the present application will be more readily apparent from the following description.

Drawings

Fig. 1 is a schematic structural diagram of a network system according to an embodiment of the present application;

Fig. 2 is a flowchart illustrating a method for deploying a service function chain according to an embodiment of the present application;

Fig. 3 is a schematic structural diagram of a service function chain deployment device according to an embodiment of the present application;

Fig. 4 is a schematic hardware structure diagram of a service function chain deployment apparatus according to an embodiment of the present application;

Fig. 5 is a second hardware structure diagram of a service function chain deployment apparatus according to an embodiment of the present application.

Detailed Description

The technical solutions in the embodiments of the present application will be described clearly and completely with reference to the drawings in the embodiments of the present application, and it should be understood that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

It should be noted that in the embodiments of the present application, words such as "exemplary" or "for example" are used to indicate examples, illustrations or explanations. Any embodiment or design described herein as "exemplary" or "for example" is not necessarily to be construed as preferred or advantageous over other embodiments or designs. Rather, use of the word "exemplary" or "for example" is intended to present related concepts in a concrete fashion.

For the convenience of clearly describing the technical solutions of the embodiments of the present application, in the embodiments of the present application, the terms "first" and "second" are used to distinguish the same items or similar items with basically the same functions and actions, and those skilled in the art can understand that the terms "first" and "second" are not limited in number and execution order.

In order to improve the flexibility and the expandability of the network, the mobile core network takes a virtualization technology as a key technology for constructing a future mobile communication network. Virtualization technology isolates the logical implementation and the physical implementation of networks, so that various customized virtual networks can coexist in the same physical network in an isolated manner. Based on the virtualization technology, the mobile core network may be implemented by using a Service Based Architecture (SBA), and virtualizes each core network element function as a VNF.

Fig. 1 is a schematic diagram of a fifth generation mobile communication technology (5th generation mobile networks, 5G) core network system based on a virtualization technology. The whole network system mainly includes three different types of services: a session management Service Function Chain (SFC) SFC1, a user plane function service function chain SFC2, and a service application service function chain SFC3. The VNFs in SFC1 may include a Session Management Function (SMF), a Policy Control Function (PCF), and an Application Function (AF); the VNFs in SFC2 may include a User Plane Function (UPF). Core network element functions of various types are chained together to form service function chains, which represent network services with different requirements.

In order to optimize the service function chain deployment in the core network system, an embodiment of the present application provides a method for deploying a service function chain, and details of the method for deploying a service function chain provided in the embodiment of the present application are described below.

As shown in fig. 2, an embodiment of the present application provides a method for deploying a service function chain, where the method for deploying the service function chain may be applied to a device for deploying the service function chain, and the method for deploying the service function chain may include the following steps S201 to S204.

S201, the deployment device of the service function chain randomly obtains sample data of the experience replay buffer.

Through interaction with the 5G SBA infrastructure network environment, the service function chain deployment device can continuously obtain updated historical information, including states, actions, and rewards, and can initialize the experience replay buffer using this historical information.

The amount of resources allocated in the physical network has a great influence on service performance (such as delay performance). The impact of the amount of computing resources allocated to a VNF on the delay characteristics may therefore be considered when formulating the service function chain deployment problem. The deployment device of the service function chain may determine the remaining amount of computing resources of all physical nodes of the service function chain. Specifically, the service function chain deployment device may establish a service function chain deployment and computing resource allocation model. The underlying physical network may be represented by a weighted undirected graph $G = (N, L)$, where $N$ represents the physical nodes in the network and $L$ represents the links between the physical nodes. A service function chain in the virtual network is composed of VNFs in sequence and can be represented by a weighted directed graph $G^V = (N^V, L^V)$, where $N^V$ represents the virtual nodes and $L^V$ represents the virtual links. The virtual network operator requests a set of service function chains.

The heterogeneous set of VNFs that may be deployed in the physical network is $\{f_x \mid x = 1, 2, \ldots, X\}$. The k-th service function chain of the system consists of $M$ VNFs in sequence, where $M$ represents the total number of VNFs in this service function chain.

The deployment device of the service function chain may cause all VNFs to observe the initial environment state. The environment state observed by each VNF of service function chain k is the remaining amount of computing resources of all physical nodes in the infrastructure. The remaining amount of computing resources of the n-th physical node can be recorded as $V_n$, and the state at a certain time can be recorded as $s_{k,m} = [V_1, V_2, \ldots, V_N]$.

Then, according to the remaining amount of computing resources $V_n$, the deployment device of the service function chain can select an action, namely the target physical node and the amount of computing resources to allocate, and execute the corresponding operation to obtain a return value of the return function. Specifically, there are N physical nodes in the infrastructure network, and the amount of allocable computing resources can be discretized into W optional levels. Each action in the action set can be denoted as $a = (n, \phi)$, where $n \in \{1, 2, 3, \ldots, N\}$ identifies the target physical node and $\phi$ is one of the W optional levels of computing resource allocation.
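The discretized action set described above can be enumerated directly. The following is a minimal sketch only: the source does not say how the W levels are spaced, so even spacing between a minimum and a maximum allocation is an assumption, and the names `build_action_set`, `phi_min` and `phi_max` are hypothetical.

```python
from itertools import product


def build_action_set(num_nodes, num_levels, phi_min, phi_max):
    """Enumerate actions a = (n, phi): target node index and allocation level.

    Assumption (not from the source): the W levels are evenly spaced
    between phi_min and phi_max, inclusive.
    """
    if num_levels < 2:
        levels = [phi_min]
    else:
        step = (phi_max - phi_min) / (num_levels - 1)
        levels = [phi_min + w * step for w in range(num_levels)]
    # Node indices run from 1 to N, matching the numbering in the text.
    return list(product(range(1, num_nodes + 1), levels))


# Hypothetical example: N = 3 nodes, W = 4 levels, giving N * W actions.
actions = build_action_set(num_nodes=3, num_levels=4, phi_min=1.0, phi_max=4.0)
```

With N nodes and W levels the action set has N × W entries, which is what makes a discrete policy output feasible for each VNF agent.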

All VNFs in service function chain k can select an action $a_t$ based on the policy that currently maximizes the return function, obtain a return value $r_t$, and observe the next state $s_{t+1}$.
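The transition from the current state to the next state can be illustrated as follows. This is a sketch under stated assumptions: the state is the list of remaining computing resources per node, and executing an action simply subtracts the allocated amount from the chosen node. The function name `apply_action` is hypothetical and the source does not spell out the transition rule.

```python
def apply_action(state, action):
    """Return the next state after allocating phi units on node n.

    state: remaining computing resources [V_1, ..., V_N].
    action: (n, phi) with n numbered from 1.
    Assumption: allocation reduces the remaining resources of node n by phi.
    """
    node, phi = action
    if phi > state[node - 1]:
        raise ValueError("allocation exceeds remaining resources of the node")
    next_state = list(state)  # copy so the stored s_t is not modified
    next_state[node - 1] -= phi
    return next_state


s_t = [10.0, 8.0, 6.0]
s_next = apply_action(s_t, (2, 3.0))
```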

Finally, the deployment device of the service function chain can store the tuple $(s_t, a_t, r_t, s_{t+1})$, formed by the remaining amount of computing resources, the target physical node, the return value, and the remaining amount of computing resources after the corresponding operation is executed, in the experience replay buffer. The experience replay buffer may store a plurality of tuples, and the deployment device of the service function chain may randomly obtain a small batch of sample data from the experience replay buffer.
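The storing and random sampling of tuples described above can be sketched as a small buffer class. This is an illustrative sketch, not the patent's implementation: the class name `ReplayBuffer`, the fixed capacity, and the drop-oldest eviction policy are assumptions.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of (s_t, a_t, r_t, s_{t+1}) tuples."""

    def __init__(self, capacity):
        # Assumption: the oldest tuples are dropped once capacity is reached;
        # the source does not describe an eviction policy.
        self._buffer = deque(maxlen=capacity)

    def store(self, state, action, reward, next_state):
        self._buffer.append((tuple(state), action, reward, tuple(next_state)))

    def sample(self, batch_size):
        # Uniform random minibatch, drawn without replacement.
        size = min(batch_size, len(self._buffer))
        return random.sample(list(self._buffer), size)

    def __len__(self):
        return len(self._buffer)


buffer = ReplayBuffer(capacity=1000)
# Hypothetical transition: place a VNF on node 2 and allocate 3.0 units.
buffer.store([10.0, 8.0, 6.0], (2, 3.0), -1.5, [10.0, 5.0, 6.0])
batch = buffer.sample(batch_size=32)
```

Sampling uniformly at random breaks the correlation between consecutive transitions, which is the usual reason for using a replay buffer in this kind of training.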

Optionally, the service function chain deployment apparatus may determine the end-to-end delay and the deployment cost of the service function chain, construct the objective of minimizing the weighted sum of the deployment cost and the delay, and define the reward function according to this minimized weighted sum.

For example, a VNF may be deployed on any general-purpose server node in the infrastructure, and the various types of resources on the node are collectively considered as computing resources. The dependency between the amount of computing resources allocated to a VNF and its processing delay is defined as a linear relationship: the processing delay generated by the m-th VNF in the k-th service function chain can be defined as a linear function of the computing resources allocated to that VNF, with coefficients $a_m$ and $b_m$. The coefficients $a_m$ and $b_m$ are expressed in terms of four quantities: the minimum processing latency of VNF m, the maximum processing latency of VNF m, the maximum computing resource allocation of VNF m, and the corresponding minimum computing resource allocation of VNF m.
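One linear form consistent with this description is the straight line through the two end points (minimum allocation, maximum latency) and (maximum allocation, minimum latency). The sketch below computes the coefficients under that assumption. The pairing of the end points and the names `d_min`, `d_max`, `phi_min` and `phi_max` are assumptions, because the source formulas are not reproduced in this text.

```python
def linear_delay_coefficients(d_min, d_max, phi_min, phi_max):
    """Return (a_m, b_m) such that delay = a_m * phi + b_m.

    Assumption: the line passes through (phi_min, d_max) and (phi_max, d_min),
    i.e. more allocated computing resources means less processing delay.
    """
    a_m = (d_min - d_max) / (phi_max - phi_min)
    b_m = d_max - a_m * phi_min
    return a_m, b_m


def processing_delay(phi, d_min, d_max, phi_min, phi_max):
    """Processing delay of a VNF that is allocated phi computing resources."""
    a_m, b_m = linear_delay_coefficients(d_min, d_max, phi_min, phi_max)
    return a_m * phi + b_m


# Hypothetical VNF: latency between 2 and 10, allocation between 1 and 5.
a_m, b_m = linear_delay_coefficients(2.0, 10.0, 1.0, 5.0)
```

Under this form the slope $a_m$ is negative, so each additional unit of computing resources reduces the processing delay by a constant amount.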

Further, the processing delay generated by VNF m of the k-th service function chain when it is mapped onto physical node n is defined using a coefficient $\rho_n$, which represents the physical node correlation coefficient.

The end-to-end delay of a service function chain consists of two parts, namely the processing delay and the transmission delay. The transmission delay is expressed in terms of $h_{n,n'}$, the number of hops between processor node n and processor node n', and $d_{n,n'}$, the propagation delay between processor nodes n and n'. The end-to-end delay of service function chain k is thus the sum of its processing delay and its transmission delay.

The deployment cost of service function chain k is expressed in terms of $\delta_n$, the unit price of a computing resource on processor node n.
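The delay and cost quantities just described can be sketched as follows. This is only an illustration under stated assumptions, since the source formulas are not reproduced here: the transmission delay between consecutive VNFs is taken as hop count times propagation delay, co-located VNFs are assumed to add no transmission delay, and the cost is taken as unit price times allocated amount. All function and parameter names are hypothetical.

```python
def end_to_end_delay(processing_delays, path, hops, prop_delay):
    """Processing delay of every VNF plus transmission delay between hosts.

    path: node ids hosting the consecutive VNFs of the chain.
    hops, prop_delay: dicts keyed by (n, n_next), standing in for h_{n,n'}
    and d_{n,n'}.
    """
    transmission = 0.0
    for n, n_next in zip(path, path[1:]):
        if n != n_next:  # assumption: same-node VNFs add no transmission delay
            transmission += hops[(n, n_next)] * prop_delay[(n, n_next)]
    return sum(processing_delays) + transmission


def deployment_cost(allocations, unit_price):
    """allocations: list of (node_id, phi); unit_price: node_id -> delta_n."""
    return sum(unit_price[node] * phi for node, phi in allocations)


# Hypothetical chain of three VNFs placed on nodes 1, 1 and 2.
delay = end_to_end_delay([2.0, 3.0, 1.0], [1, 1, 2], {(1, 2): 2}, {(1, 2): 0.5})
cost = deployment_cost([(1, 2.0), (1, 3.0), (2, 1.0)], {1: 1.5, 2: 4.0})
```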

Finally, the problem of minimizing the weighted sum of the network cost and the delay is constructed, subject to constraints C1 to C5, where C5 is $\alpha > 0$, $\beta > 0$. Here, $\eta$ represents the VNF mapping in the service function chain; $\phi$ denotes the allocation of computing resources; $C^k$ represents the deployment cost of service function chain k; $D^k$ represents the end-to-end delay of service function chain k; and $\alpha$ and $\beta$ represent the balance coefficients of the optimization target. The constraints are as follows:

C1 indicates that each VNF in a service function chain can only be deployed on one physical node;

C2 indicates that each VNF should meet its minimum/maximum computing resource allocation limits;

C3 indicates that the computing resources allocated to the VNFs must not exceed the total amount of computing resources of the physical node;

C4 indicates that the end-to-end delay of the service function chain must not exceed its tolerable delay limit;

C5 indicates that the values of the two trade-off coefficients are both positive numbers.
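A feasibility check for constraints C1 to C5 could look like the sketch below. The data layout and the name `satisfies_constraints` are hypothetical, and C3 is interpreted here as a per-node sum of allocations, which is an assumption about the original formula.

```python
def satisfies_constraints(placements, phi_limits, node_capacity,
                          chain_delay, max_delay, alpha, beta):
    """Check C1-C5 for one service function chain.

    placements: list of (vnf_id, node_id, phi) entries.
    phi_limits: dict vnf_id -> (phi_min, phi_max).
    node_capacity: dict node_id -> total computing resources of the node.
    """
    # C1: every VNF appears exactly once, i.e. on a single physical node.
    vnf_ids = [vnf for vnf, _, _ in placements]
    if len(vnf_ids) != len(set(vnf_ids)):
        return False
    used = {}
    for vnf, node, phi in placements:
        lo, hi = phi_limits[vnf]
        # C2: the allocation lies within the VNF's min/max limits.
        if not lo <= phi <= hi:
            return False
        used[node] = used.get(node, 0.0) + phi
    # C3 (assumed per-node sum): allocations fit within each node's total.
    if any(total > node_capacity[node] for node, total in used.items()):
        return False
    # C4: the end-to-end delay stays within the tolerable limit.
    if chain_delay > max_delay:
        return False
    # C5: both trade-off coefficients are positive.
    return alpha > 0 and beta > 0


placements = [("smf", 1, 2.0), ("pcf", 1, 3.0), ("af", 2, 1.0)]
limits = {"smf": (1.0, 4.0), "pcf": (1.0, 4.0), "af": (1.0, 4.0)}
ok = satisfies_constraints(placements, limits, {1: 5.0, 2: 2.0},
                           chain_delay=8.0, max_delay=10.0,
                           alpha=0.5, beta=0.5)
```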

S202, the service function chain deployment device updates the actor network and the target actor network according to the sample data.

Take the service function chain k as an example. Since the network parameters are also used to calculate the target values while they are being updated, which may cause some instability, the actor (Actor) network and critic (Critic) network of each VNF each use two identical networks, namely a learning network and a target network, where the target network is used to calculate the target values. Specifically, each VNF of the service function chain k is used as an agent. When the centralized training is started, the deployment device of the service function chain can randomly initialize the Actor network parameters $\theta^{\mu}$ and the Critic network parameters $\theta^{Q}$, and initialize the target Actor network parameters $\theta^{\mu'} \leftarrow \theta^{\mu}$ and the target Critic network parameters $\theta^{Q'} \leftarrow \theta^{Q}$.

After randomly acquiring sample data of the experience replay buffer, the deployment device of the service function chain can update the actor network and the target actor network according to the sample data.

Taking VNF m as an example, the service function chain deployment device may update the Critic network parameters $\theta^{Q}$ according to the action-value function $Q_m$ and a target value, in which $\mu(s')$ represents the deterministic policies of all agents and $\gamma$ represents the discount factor. The state $s_t$, action $a_t$, and return $r_t$ at the current moment can be abbreviated as $s$, $a$, and $r$; the state $s_{t+1}$ at the next moment can be abbreviated as $s'$, and the action $a_{t+1}$ as $a'$. The service function chain deployment device can then update the Actor network parameters $\theta^{\mu}$ according to $J_m$, which represents the expected reward of VNF m.
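The Critic update can be illustrated with the temporal-difference target commonly used in deterministic actor-critic methods, $y = r + \gamma Q'(s', a')$ with $a'$ given by the target Actor, and a mean squared error loss. That this matches the patent's exact formula is an assumption, since the formula is not reproduced in this text. The sketch works on plain lists of already-evaluated Q-values and the function names are hypothetical.

```python
def td_targets(rewards, next_q_values, gamma):
    """y = r + gamma * Q'(s', a'), one target per sampled transition.

    next_q_values are assumed to come from the target Critic evaluated at
    the actions proposed by the target Actor.
    """
    return [r + gamma * q_next for r, q_next in zip(rewards, next_q_values)]


def critic_loss(q_values, targets):
    """Mean squared difference between Q(s, a) and the targets."""
    errors = [(q - y) ** 2 for q, y in zip(q_values, targets)]
    return sum(errors) / len(errors)


# Hypothetical minibatch of two transitions.
y = td_targets(rewards=[-1.0, -2.0], next_q_values=[-4.0, -2.0], gamma=0.5)
loss = critic_loss(q_values=[-2.0, -3.0], targets=y)
```

Computing the targets from the target networks rather than from the networks being trained is what the text refers to when it says the target network reduces instability.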

The deployment device of the service function chain can then update the target Actor network by the formula $\theta^{\mu'} \leftarrow \tau\theta^{\mu} + (1-\tau)\theta^{\mu'}$ and update the target Critic network by the formula $\theta^{Q'} \leftarrow \tau\theta^{Q} + (1-\tau)\theta^{Q'}$, where $\tau$ represents a soft update factor.
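The soft update rule above can be written out for a flat list of parameters. This is a minimal sketch: real networks hold tensors rather than Python floats, and the names `soft_update`, `actor_params` and `target_actor_params` are hypothetical.

```python
import copy


def soft_update(target_params, learned_params, tau):
    """In-place update: theta' <- tau * theta + (1 - tau) * theta'."""
    for i, theta in enumerate(learned_params):
        target_params[i] = tau * theta + (1.0 - tau) * target_params[i]
    return target_params


actor_params = [0.5, -1.0, 2.0]
# Initialization theta' <- theta: the target starts as an exact copy.
target_actor_params = copy.deepcopy(actor_params)
# Stand-in for one gradient step on the learning network.
actor_params = [1.5, 0.0, 2.0]
soft_update(target_actor_params, actor_params, tau=0.5)
```

A small value of tau makes the target network track the learning network slowly, while tau equal to 1 reduces the rule to a hard copy.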

It should be noted that the actor-critic algorithm is an algorithm in the field of reinforcement learning and is composed of two networks, namely an actor and a critic. The actor selects an action based on the observed environment state, and the critic evaluates how good the selected action is; both parts can be fitted with deep neural networks. The input of the actor network is the state information, and its output is an action; the input of the critic network is the state information and the selected action, and its output is the temporal-difference error, which drives the learning of both the actor network and the critic network.

S203, the deployment apparatus of the service function chain outputs the parameters of the target actor network when the return value of the return function of the sample data converges.

As the Actor network parameters and the target Actor network parameters are updated, the deployment device of the service function chain can obtain the policy with the maximum return value. When the return value of the return function of the sample data converges, the deployment device of the service function chain can return and output the target Actor network parameters $\theta^{\mu'}$ of all VNFs.

Optionally, in a case that the reward value of the reward function is not converged, the deployment device of the service function chain may perform the above S202 in a loop, that is, reselect the target physical node according to the remaining amount of the computing resource and perform a corresponding operation.
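The source does not define how convergence of the return value is detected. One possible criterion, shown purely as an assumption, compares the mean return over the most recent window of episodes with the mean over the window before it. The name `has_converged` and the parameters `window` and `tol` are hypothetical.

```python
def has_converged(returns, window=10, tol=1e-3):
    """Heuristic convergence test on a history of return values.

    Assumption: training is considered converged when the average return of
    the last `window` entries differs from the average of the preceding
    `window` entries by at most `tol`.
    """
    if len(returns) < 2 * window:
        return False
    recent = sum(returns[-window:]) / window
    previous = sum(returns[-2 * window:-window]) / window
    return abs(recent - previous) <= tol


steady = has_converged([-5.0] * 20)
improving = has_converged([float(i) for i in range(20)])
```

While this test returns false, the training loop would keep collecting transitions and updating the networks; once it returns true, the target Actor parameters are output.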

It should be noted that, when the action selection satisfies the constraint conditions C1-C4, the reward value of each VNF is defined as the negative of the weighted sum of its own cost and delay, i.e., $-\alpha C_{k,m} - \beta D_{k,m}$; otherwise, the return of each VNF is defined as a negative maximum value $r_{neg}$.
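This reward definition can be sketched as a small function. The default value chosen for `r_neg` and the boolean `feasible` flag, which stands for "constraints C1-C4 are satisfied", are assumptions made for illustration.

```python
def vnf_reward(cost, delay, alpha, beta, feasible, r_neg=-1e6):
    """Reward of one VNF.

    Returns -(alpha * C + beta * D) when the action is feasible, and the
    penalty r_neg otherwise. The default r_neg is a hypothetical value.
    """
    if not feasible:
        return r_neg
    return -(alpha * cost + beta * delay)


good = vnf_reward(cost=2.0, delay=4.0, alpha=0.5, beta=0.25, feasible=True)
bad = vnf_reward(cost=2.0, delay=4.0, alpha=0.5, beta=0.25, feasible=False)
```

Because the reward is the negated weighted sum, maximizing the return is equivalent to minimizing cost and delay, and the large penalty steers the agents away from infeasible placements.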

S204, the service function chain deployment device determines a physical node according to the parameters of the target actor network, and deploys the service function chain through the determined physical node.

In the testing stage, each VNF selects actions through the Actor network only: the environment state is input into the Actor network, and the output is the VNF's own action selection, which the VNF then executes. By selecting and executing actions in this way, a system return value is obtained, thereby determining the optimal service function chain deployment and computing resource allocation scheme.

The embodiment of the application provides a deployment method of a service function chain, and as the actor network and the target actor network can be continuously updated according to sample data until the return value is converged, parameters of the target actor network for realizing optimal deployment can be output, so that the optimal deployment of the service function chain is realized, and the system cost and the network delay are further reduced.

The scheme provided by the embodiment of the application is mainly introduced from the perspective of a method. To implement the above functions, the deployment device includes hardware structures and/or software modules for performing the respective functions. Those of skill in the art will readily appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as hardware or as a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the particular application and the design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.

In the deployment method of the service function chain provided in the embodiment of the present application, the execution subject may be a deployment device of the service function chain, or a control module for deploying the service function chain in the deployment device of the service function chain. In the embodiment of the present application, the deployment device of the service function chain executing the deployment method of the service function chain is taken as an example to describe the deployment device of the service function chain provided in the embodiment of the present application.

It should be noted that, in the embodiment of the present application, the functional modules may be divided according to the above method example for the deployment apparatus of the service function chain, for example, each functional module may be divided corresponding to each function, or two or more functions may be integrated into one processing module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. Optionally, the division of the modules in the embodiment of the present application is schematic, and is only a logic function division, and there may be another division manner in actual implementation.

As shown in fig. 3, an embodiment of the present application provides a service function chain deployment apparatus. The service function chain deployment apparatus 300 may include an acquisition unit 301 and a processing unit 302. The acquisition unit 301 may be configured to randomly obtain sample data of the experience replay buffer. The processing unit 302 may be configured to update the actor network and the target actor network according to the sample data; under the condition that the return value of the return function of the sample data is converged, output the parameters of the target actor network; and determine a physical node according to the parameters of the target actor network, and deploy the service function chain through the determined physical node. For example, in conjunction with fig. 2, the acquisition unit 301 may be configured to execute S201, and the processing unit 302 may be used to perform S202, S203 and S204.

Optionally, the processing unit 302 may be further configured to determine the remaining amount of computing resources of all physical nodes of the service function chain; selecting a target physical node according to the computing resource residual quantity, and obtaining a return value of a return function by executing corresponding operation; and storing the computing resource residual amount, the target physical node, the return value and the computing resource residual amount after the corresponding operation is executed in an experience replay buffer.

Optionally, the processing unit 302 may be further configured to reselect a target physical node according to the remaining amount of the computing resource and execute a corresponding operation when the reward value of the reward function is not converged.

Optionally, the processing unit 302 may be further configured to determine an end-to-end delay and a deployment cost of the service function chain; constructing a minimum deployment cost and a time delay weighted sum; and defining the reward function according to the weighted sum of the minimized deployment cost and the time delay.

Of course, the deployment apparatus 300 of the service function chain provided in the embodiment of the present application includes, but is not limited to, the above modules.

The embodiment of the present application further provides a service function chain deployment apparatus as shown in fig. 4, where the service function chain deployment apparatus includes a processor 11, a memory 12, a communication interface 13, and a bus 14. The processor 11, the memory 12 and the communication interface 13 may be connected by a bus 14.

The processor 11 is a control center of a deployment device of the service function chain, and may be a single processor or a collective term for a plurality of processing elements. For example, the processor 11 may be a general-purpose Central Processing Unit (CPU), or may be another general-purpose processor. Wherein a general purpose processor may be a microprocessor or any conventional processor or the like.

For one embodiment, processor 11 may include one or more CPUs, such as CPU 0 and CPU 1 shown in FIG. 4.

The memory 12 may be, but is not limited to, a read-only memory (ROM) or other type of static storage device that may store static information and instructions, a Random Access Memory (RAM) or other type of dynamic storage device that may store information and instructions, an electrically erasable programmable read-only memory (EEPROM), a magnetic disk storage medium or other magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer.

In a possible implementation, the memory 12 may exist separately from the processor 11, and the memory 12 may be connected to the processor 11 via a bus 14 for storing instructions or program code. The deployment method of the service function chain provided by the embodiment of the present application can be implemented when the processor 11 calls and executes the instructions or program codes stored in the memory 12.

In another possible implementation, the memory 12 may also be integrated with the processor 11.

The communication interface 13 is used for connecting with other devices through a communication network. The communication network may be an Ethernet network, a radio access network, a Wireless Local Area Network (WLAN), or the like. The communication interface 13 may comprise a receiving unit for receiving data and a transmitting unit for transmitting data.

The bus 14 may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended ISA (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown in FIG. 4, but this does not indicate only one bus or one type of bus.

It is noted that the structure shown in fig. 4 does not constitute a definition of the deployment means of the service function chain. In addition to the components shown in FIG. 4, the service function chain deployment apparatus may include more or fewer components than shown, or some components may be combined, or a different arrangement of components.

Fig. 5 shows another hardware structure of a deployment apparatus of a service function chain in the embodiment of the present application. As shown in fig. 5, the deployment means of the service function chain may comprise a processor 21 and a communication interface 22. The processor 21 is coupled to a communication interface 22.

For the function of the processor 21, reference may be made to the description of the processor 11 above. The processor 21 also has a memory function, for which reference may be made to the function of the memory 12.

The communication interface 22 is used to provide data to the processor 21. The communication interface 22 may be an internal interface of the service function chain deployment device, or may be an external interface of the service function chain deployment device (corresponding to the communication interface 13).

It should be noted that the structure shown in fig. 4 (or fig. 5) does not constitute a definition of a deployment apparatus for a service function chain, which may include more or less components than those shown in fig. 4 (or fig. 5), or may combine certain components, or a different arrangement of components, in addition to the components shown in fig. 4 (or fig. 5).

Embodiments of the present application also provide a computer-readable storage medium, which includes computer-executable instructions. When the computer-executable instructions run on a computer, the computer is caused to execute the steps executed by the service function chain deployment device in the service function chain deployment method provided by the above embodiment.

The embodiment of the present application further provides a computer program product, where the computer program product is directly loadable into a memory and contains a software code, and the computer program product is loaded and executed by a computer, so as to implement the steps executed by the service function chain deployment apparatus in the service function chain deployment method provided in the foregoing embodiment.

The above embodiments may be implemented wholly or partially by software, hardware, firmware, or any combination thereof. When implemented using a software program, the embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer-executable instructions are loaded and executed on a computer, the processes or functions according to the embodiments of the present application are generated in whole or in part. The computer may be a general purpose computer, a special purpose computer, a network of computers, or another programmable device. The computer instructions may be stored on a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wireless means (e.g., infrared, radio, microwave). The computer-readable storage medium can be any available medium that can be accessed by a computer, or a data storage device, such as a server or a data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)), among others.

From the description of the above embodiments, it is clear to those skilled in the art that, for convenience and simplicity of description, the above division into functional modules is only an example. In practical applications, the above functions may be allocated to different functional modules as needed; that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above.

In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above-described device embodiments are merely illustrative. For example, the division into the above modules or units is only a logical function division, and other divisions are possible in actual implementation: various units or components may be combined or integrated into another device, or some features may be omitted or not implemented. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or of another form. Units described as separate parts may or may not be physically separate, and parts displayed as units may be one physical unit or a plurality of physical units, located in one place or distributed across a plurality of different places. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.

In addition, functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit can be realized in the form of hardware or in the form of a software functional unit. The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a readable storage medium. Based on such understanding, the technical solutions of the embodiments of the present application, in essence, or the part that contributes to the prior art, or all or part of the technical solutions, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions that enable a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disk.

The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present application are included in the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims

1. A method for deploying a service function chain, comprising:

randomly acquiring sample data of an experience replay buffer;

updating an actor network and a target actor network according to the sample data;

when the reward value of the reward function for the sample data converges, outputting the parameters of the target actor network;

and determining a physical node according to the parameters of the target actor network, and deploying a service function chain through the determined physical node.

2. The method for deploying service function chains according to claim 1, wherein before randomly acquiring sample data of the experience replay buffer, the method further comprises:

determining the remaining amount of computing resources of all physical nodes of the service function chain;

selecting a target physical node according to the remaining amount of computing resources, and obtaining a reward value of a reward function by executing a corresponding operation;

and storing, in the experience replay buffer, the remaining amount of computing resources, the target physical node, the reward value, and the remaining amount of computing resources after the corresponding operation is executed.

3. The method for deploying a service function chain according to claim 2, wherein the method further comprises:

when the reward value of the reward function does not converge, reselecting a target physical node according to the remaining amount of computing resources and executing the corresponding operation.

4. The method for deploying service function chains according to any of claims 1-3, wherein the method further comprises:

determining end-to-end time delay and deployment cost of a service function chain;

constructing an objective of minimizing a weighted sum of the deployment cost and the time delay;

and defining the reward function according to the objective of minimizing the weighted sum of the deployment cost and the time delay.

5. A service function chain deployment apparatus, comprising: an acquisition unit and a processing unit;

the acquisition unit is configured to randomly acquire sample data of an experience replay buffer;

the processing unit is configured to: update an actor network and a target actor network according to the sample data; when the reward value of the reward function for the sample data converges, output the parameters of the target actor network; and determine a physical node according to the parameters of the target actor network, and deploy a service function chain through the determined physical node.

6. The service function chain deployment apparatus according to claim 5, wherein the processing unit is further configured to: determine the remaining amount of computing resources of all physical nodes of the service function chain; select a target physical node according to the remaining amount of computing resources, and obtain a reward value of a reward function by executing a corresponding operation; and store, in the experience replay buffer, the remaining amount of computing resources, the target physical node, the reward value, and the remaining amount of computing resources after the corresponding operation is executed.

7. The service function chain deployment apparatus according to claim 6, wherein the processing unit is further configured to, when the reward value of the reward function does not converge, reselect a target physical node according to the remaining amount of computing resources and execute the corresponding operation.

8. The service function chain deployment apparatus according to any of claims 5-7, wherein the processing unit is further configured to: determine an end-to-end time delay and a deployment cost of a service function chain; construct an objective of minimizing a weighted sum of the deployment cost and the time delay; and define the reward function according to that objective.

9. A service function chain deployment apparatus comprising a memory and a processor; the memory is configured to store computer-executable instructions, and the processor is connected with the memory through a bus;

when the service function chain deployment apparatus is running, the processor executes the computer-executable instructions stored in the memory to cause the service function chain deployment apparatus to perform the method for deploying a service function chain as recited in any one of claims 1-4.

10. A computer-readable storage medium comprising computer-executable instructions that, when executed on a computer, cause the computer to perform the method for deploying a service function chain according to any of claims 1-4.
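
For illustration only, the steps recited in claims 1-4 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation of the embodiments: the function and parameter names are hypothetical, the "corresponding operation" is assumed to be placing one virtual network function with a given computing demand on the selected node, the reward is assumed to be the negative of the weighted sum so that a larger reward means a lower cost and delay, the target actor network is assumed to track the actor network by a soft (Polyak) update, and convergence is assumed to be judged by comparing moving averages of the reward. Network parameters are represented as plain lists of floats, and the gradient step on the actor network is omitted.

```python
import random
from collections import deque, namedtuple

# One stored experience: state, action, reward, next state (claim 2).
Transition = namedtuple("Transition", ["state", "action", "reward", "next_state"])


def reward_value(cost, delay, cost_weight=0.5, delay_weight=0.5):
    """Reward from the weighted sum of deployment cost and delay (claim 4).

    Assumption: the sum is negated so that minimizing it maximizes the reward.
    """
    return -(cost_weight * cost + delay_weight * delay)


def store_transition(buffer, remaining, node, demand, cost, delay):
    """Apply the (assumed) placement on `node` and store the experience."""
    next_remaining = list(remaining)
    next_remaining[node] -= demand  # computing resources consumed on the node
    reward = reward_value(cost, delay)
    buffer.append(Transition(tuple(remaining), node, reward, tuple(next_remaining)))
    return next_remaining


def sample_batch(buffer, batch_size, rng=random):
    """Randomly acquire sample data from the experience replay buffer (claim 1)."""
    return rng.sample(list(buffer), min(batch_size, len(buffer)))


def soft_update(target_params, actor_params, tau=0.01):
    """Move target actor parameters a fraction `tau` toward the actor (assumed rule)."""
    return [(1.0 - tau) * t + tau * a for t, a in zip(target_params, actor_params)]


def has_converged(rewards, window=10, tol=1e-3):
    """Assumed criterion: the last two moving averages of the reward differ by < tol."""
    if len(rewards) < 2 * window:
        return False
    recent = sum(rewards[-window:]) / window
    previous = sum(rewards[-2 * window:-window]) / window
    return abs(recent - previous) < tol
```

In a training loop built on these helpers, transitions would be stored until the buffer is large enough, batches would be sampled to update the actor and target actor parameters, and the target actor parameters would be output once `has_converged` returns true; otherwise a target physical node would be reselected as in claim 3.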