CN106850289B - Service combination method combining Gaussian process and reinforcement learning - Google Patents


Info

Publication number
CN106850289B
Authority
CN
China
Prior art keywords
state
value
service
gaussian
learning
Prior art date
Legal status
Active
Application number
CN201710055817.2A
Other languages
Chinese (zh)
Other versions
CN106850289A (en)
Inventor
王红兵
李佳杰
Current Assignee
Southeast University
Original Assignee
Southeast University
Priority date
Filing date
Publication date
Application filed by Southeast University filed Critical Southeast University
Priority to CN201710055817.2A priority Critical patent/CN106850289B/en
Publication of CN106850289A publication Critical patent/CN106850289A/en
Application granted granted Critical
Publication of CN106850289B publication Critical patent/CN106850289B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14: Network analysis or design
    • H04L41/145: Network analysis or design involving simulating, designing, planning or modelling of a network
    • H04L41/50: Network service management, e.g. ensuring proper service fulfilment according to agreements
    • H04L67/00: Network arrangements or protocols for supporting network services or applications
    • H04L67/01: Protocols
    • H04L67/02: Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a service combination method combining a Gaussian process with reinforcement learning, which comprises the following steps: 1. modeling the service combination problem as a four-tuple Markov decision process; 2. solving the four-tuple Markov decision process with a Q-learning-based reinforcement learning method to obtain an optimal strategy, wherein the Q value is updated through a Gaussian prediction model of the Q value; 3. mapping the optimal strategy into the workflow of the web service combination. Because the method models the learning of the Q value with a Gaussian process, it achieves better accuracy and generalization.

Description

Service combination method combining Gaussian process and reinforcement learning
Technical Field
The invention relates to a method for combining Web services by using a computer, belonging to the field of artificial intelligence.
Background
With the development of computer technology, the requirements placed on software systems have become increasingly complex and changeable. Alongside the growth of the Internet and information technology, the Service-Oriented Architecture (SOA) has gradually emerged: software or components implementing certain functions are published on the Internet as web services, and users can use those functions by communicating with the web services through a messaging protocol. A new software system satisfying a given requirement is then constructed by combining various web services. Weather services and map positioning services are common examples of web services today.
For a given function there are generally multiple services that are similar in function but different in quality of service (QoS). A type of service that can fulfil a certain function is called an abstract service, and the concrete services that fulfil it are called the candidates of that abstract service. For a user requirement, selecting the best-quality service from the candidate services and finally obtaining the best combination of services is the service combination problem; selecting and optimizing the combination according to the QoS attributes of different services is called QoS-aware service combination.
Because the Internet environment is highly dynamic, the QoS attributes of a service may fluctuate or change over time and with the environment, so a service combination method needs a certain adaptivity to cope with environmental change. At the same time, as candidate services keep increasing and business requirements grow more complex, a complex user requirement often involves many abstract services and corresponding candidate services, so the method must also face the challenge of large-scale service combination.
Based on these two problems, some scholars have proposed service combination methods based on Markov Decision Processes (MDP) and reinforcement learning. MDP is a decision-planning technique: in a service combination, the current network environment and context are modeled as states of the MDP, the candidate services available in the current state are modeled as the actions available in the MDP, and after an action is performed the process moves to a new state for the next round of selection, until the whole service combination is completed. Once the service combination process is modeled as an MDP, the search for the optimal service combination becomes the problem of solving the MDP model, for which reinforcement learning is an effective method. In particular, in the large-scale, dynamic environment of the service combination problem, reinforcement learning, which learns through iterative interaction with the environment, is naturally adaptive and well suited to service combination in a network environment. However, in the traditional reinforcement learning algorithm Q-learning, the Q value is recorded in a value table; this lacks generalization ability, yields results that are not accurate enough, and is strongly affected by noise.
Disclosure of Invention
The purpose of the invention is as follows: aiming at the problems in the prior art, the invention discloses a service combination method combining a Gaussian process and reinforcement learning, in which the Gaussian process is used to model the learning of the Q value, so that the Q-value estimate has better accuracy and generalization.
The technical scheme adopted by the invention is as follows:
a service combination method combining Gaussian process and reinforcement learning comprises the following steps:
(1) modeling a service composition problem as a four-tuple Markov decision process;
(2) solving a four-tuple Markov decision process by applying a Q-learning-based reinforcement learning method to obtain an optimal strategy;
(3) and mapping the optimal strategy into the workflow of the web service combination.
Specifically, the service composition problem is modeled in step (1) as a four-tuple Markov decision process as follows:
M=<S,A,P,R>
where S is the set of finite states of the environment; A is the set of callable actions, and A(s) represents the set of actions that can be performed in state s; P is the function describing the MDP state transition, and P(s'|s, a) represents the probability of transitioning to state s' after invoking action a in state s; R is a reward value function, and R(s, a) represents the reward value resulting from invoking action a in state s.
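For readers who prefer code, this four-tuple can be written down as a small data structure. The following Python sketch is purely illustrative; the class name, field names and type aliases are assumptions of the sketch, not part of the patent:

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = str
Service = str  # a concrete candidate service, i.e. an action in the MDP

@dataclass
class ServiceCompositionMDP:
    """Four-tuple M = <S, A, P, R> for QoS-aware service combination."""
    states: List[State]                                   # S: finite set of environment states
    actions: Dict[State, List[Service]]                   # A(s): services callable in state s
    transition: Callable[[State, Service, State], float]  # P(s' | s, a): transition probability
    reward: Callable[[State, Service], float]             # R(s, a): reward derived from observed QoS
```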
Specifically, the step (2) of solving the four-tuple Markov decision process by applying a Q-learning-based reinforcement learning method to obtain the optimal strategy comprises the following steps:
(21) taking the state action pair z as input and the corresponding Q value Q (z) as output, and establishing a Gaussian prediction model of the Q value;
(22) initializing the learning rate σ, the discount rate γ and the greedy strategy probability ε in Q-learning, and setting the current state s_t = 0 and the current time step t = 0;
(23) selecting one service as the current service a_t using a greedy strategy with probability ε, and executing it;
(24) recording the return value r_t obtained by executing the current service a_t in the current state s_t, and the state s_{t+1} after the service is executed; calculating the Q value of the state-action pair z_t = <s_t, a_t> according to:

Q(z_t) ← Q(z_t) + σ·[ r_t + γ·Q(s_{t+1}, a_{t+1}) − Q(z_t) ]

where Q(z_t) is the Q value of the state-action pair z_t = <s_t, a_t>, σ is the learning rate, r_t is the return value, γ is the discount rate, s_{t+1} is the subsequent state reached from the current state s_t after executing service a_t, a_{t+1} is the service selected in state s_{t+1}, and Q(s_{t+1}, a_{t+1}) denotes the Q value of the state-action pair <s_{t+1}, a_{t+1}>;
(25) updating the Q value according to a Gaussian prediction model:
Q(z_{t+1}) = K(Z, z_{t+1})^T · [ K(Z, Z) + ω_n²·I ]^{−1} · f

where I is the identity matrix, ω_n is an uncertainty parameter, Z is the set of historical state-action pairs, f is the set of historical Q values corresponding to Z, K(Z, Z) is the covariance matrix between the historical state-action pairs, whose element in row i and column j is k(z_i, z_j), k(·) being a kernel function, and K(Z, z_{t+1}) is the covariance vector between the historical state-action pairs and the newly entered state-action pair z_{t+1};
the Gaussian prediction model is then updated according to the state-action pair z_{t+1} = <s_{t+1}, a_{t+1}> and the corresponding Q value Q(z_{t+1});
(26) updating the current state: s_t = s_{t+1}; when s_t is the termination state and the convergence condition is met, reinforcement learning ends and the optimal strategy is obtained; otherwise, go to step (23).
Specifically, the kernel function k(·) in the Gaussian prediction model is a Gaussian kernel function:

k(z_i, z_j) = exp( −‖z_i − z_j‖² / (2σ_k²) )

where σ_k is the width of the Gaussian kernel function.
Specifically, the convergence condition in step (26) is that the change in the Q value is smaller than the Q-value threshold Q_th, namely: |Q(z_t) − Q(z_{t+1})| < Q_th.
Beneficial effects: compared with the prior art, the service combination method disclosed by the invention has the following advantages. When the reinforcement-learning Q value is calculated, the traditional practice of recording and looking up the Q value in a value table is improved: the service selected and invoked at each step and the observed QoS attributes are taken as the input and output of an unknown function, and during the iteration the Q value is estimated through a Gaussian process instead of being looked up in a value table, while the parameters of the Gaussian process are learned and updated. This makes the prediction of the Q value more accurate and finally yields a better service combination result. Moreover, because the Gaussian process model can be trained from existing data and can therefore predict and estimate new data, the method has good generalization ability and can better adapt to the dynamic and changeable web service combination environment.
Drawings
FIG. 1 is a basic service composition model;
FIG. 2 is a schematic diagram of service composition modeled by MDP;
FIG. 3 is a schematic diagram of a basic Gaussian process;
FIG. 4 is a flow diagram of a method of service composition incorporating a Gaussian process and reinforcement learning.
Detailed Description
The invention is further elucidated with reference to the drawings and the detailed description.
The basic model of service composition is shown in FIG. 1: a complex software system can be viewed as a workflow composed of a number of components or subsystems, which in the field of service composition are web services. Thus, when combining services, the user's requirement can be modeled as an abstract task workflow diagram in which the individual components are abstract services. For each abstract service there may be multiple candidate services with similar functionality but different QoS (quality of service), so a suitable concrete service can be selected from the candidates according to the QoS attributes, finally combining them into a usable service composition system.
The invention discloses a service combination method combining a Gaussian process and reinforcement learning, which comprises the following steps:
step 1, modeling a service combination problem into a four-tuple Markov decision process:
M=<S,A,P,R>
where S is the set of finite states of the environment; A is the set of callable actions, and A(s) represents the set of actions that can be performed in state s; P is the function describing the MDP state transition, and P(s'|s, a) represents the probability of transitioning to state s' after invoking action a in state s; R is a reward value function, and R(s, a) represents the reward value resulting from invoking action a in state s.
FIG. 2 shows an example of service composition modeled as an MDP, describing the service composition process for a travel scenario. In the MDP model, the candidate services that can be invoked are modeled as distinct actions. Invoking different actions may lead to different states, and the new state determines the set of services that can be invoked next. For each invoked service, its quality, i.e. the reward function in the MDP model, is evaluated through the observed QoS attributes. In this way a service composition problem is transformed into an MDP model, which can then be solved and optimized by a reinforcement learning method.
Step 2, solving a four-tuple Markov decision process by applying a Q-learning-based reinforcement learning method to obtain an optimal strategy;
and solving the MDP model to find the optimal service selection strategy in each state, so that the final combined result is more optimal. In the MDP model, the goodness or badness of an action is not only dependent on the immediate return value generated by the action, but also related to the subsequent state and return caused by the action, and in the reinforcement learning algorithm Q-learning, the estimated value of selecting action a in state s is evaluated by using a Q-value function Q (s, a), and the iterative formula is as follows:
Figure BDA0001219062010000051
wherein σ is a learning rate for controlling the magnitude of the degree of change at each update of the Q value; gamma is discount rate, which is used to control the influence degree of future state; reinforcement learning theory considers that the effect of an immediate return value should be greater than the future possible return values, so that the value of γ is between 0 and 1. R is R (s, a), which is the report value for performing action a in state s. Q (s ', a') represents the Q value selected for a 'after transition to state s' after performing action a, and is used to represent a future prize value.
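For orientation, the table-based update just described can be sketched in a few lines of Python. This is a minimal illustration of classical Q-learning with a value table (the function name, default parameter values and the greedy choice of a' are assumptions of the sketch), not the Gaussian-process variant introduced below:

```python
from collections import defaultdict

# Value table mapping (state, service) pairs to Q values, defaulting to 0.0.
q_table = defaultdict(float)

def q_learning_update(s, a, r, s_next, next_services, sigma=0.1, gamma=0.9):
    """One tabular update: Q(s,a) <- Q(s,a) + sigma * [R + gamma * Q(s',a') - Q(s,a)],
    taking a' greedily over the services available in the next state s'."""
    future = max((q_table[(s_next, a2)] for a2 in next_services), default=0.0)
    q_table[(s, a)] += sigma * (r + gamma * future - q_table[(s, a)])
```

The dictionary q_table is exactly the value table whose limitations are discussed next.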
In the traditional reinforcement learning process, the calculated Q values are recorded, and when Q is later updated, Q(s', a') is obtained by looking it up in the previously computed Q-value table; this is sufficient in some application scenarios. In a highly dynamic service combination scenario, however, this approach lacks generalization ability and cannot cope with data changes in a real environment. Moreover, as the scale of the service combination grows, the space and time required to store and query the value table consume considerable computing power, so the real-time requirement cannot be met well. Therefore, the invention proposes modeling the estimation of the Q value with a Gaussian process, thereby improving generalization ability, coping better with the dynamic environment and obtaining better results in practical applications.
As shown in fig. 4, the method specifically includes the following steps:
(21) taking the state action pair z as input and the corresponding Q value Q (z) as output, and establishing a Gaussian prediction model of the Q value;
the gaussian process is schematically shown in fig. 3, a gaussian process model is trained according to known input and output data, and when a new input arrives, the corresponding output is predicted through the model. The Gaussian process model is uniquely determined by the mean function and the covariance function, so that the adjustment and optimization are easy, and the iterative convergence is relatively fast.
Specifically, a group of n training samples {(z_i = (s_i, a_i), Q(z_i)) | i = 1, ..., n} is selected, where the state-action pair z_i = (s_i, a_i) is the input and Q(z_i), the Q value corresponding to that state-action pair, is the output. z_* and Q_* denote the data to be predicted. The Gaussian process assumes that the inputs and outputs satisfy a joint probability distribution; K(X, X_*) denotes the n × n_* covariance matrix between all training points X and the test points X_* (n is the number of training points, n_* the number of test points), whose element in row i and column j is k(X_i, X_*j), X_i being the i-th element of the set X.
K(X, X), K(X_*, X) and K(X_*, X_*) are defined similarly, and the joint distribution of the training outputs and the test outputs is:

(f, f_*) ~ N( 0, [ [K(X, X) + ω_n²·I, K(X, X_*)], [K(X_*, X), K(X_*, X_*)] ] )

The expectation of the predicted Q(z_*) is α_*^T · K(Z, z_*), where

α_* = [ K(Z, Z) + ω_n²·I ]^{−1} · f

where ω_n denotes an uncertainty parameter, taken to be 1 in this embodiment; I is the identity matrix; Z is the set of historical state-action pairs; f is the set of historical Q values corresponding to Z; K(Z, Z) is the covariance matrix between the historical state-action pairs, whose element in row i and column j is k(z_i, z_j), k(·) being a kernel function; and K(Z, z_{t+1}) is the covariance vector between the historical state-action pairs and the newly entered state-action pair z_{t+1}.
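A minimal NumPy sketch of this Gaussian-process prediction (Gaussian kernel and posterior mean only) might look as follows; the function names and default parameter values are assumptions of the sketch rather than the patent's reference implementation:

```python
import numpy as np

def gaussian_kernel(z1, z2, sigma_k=1.0):
    """k(z_i, z_j) = exp(-||z_i - z_j||^2 / (2 * sigma_k^2))."""
    diff = np.asarray(z1, dtype=float) - np.asarray(z2, dtype=float)
    return float(np.exp(-np.dot(diff, diff) / (2.0 * sigma_k ** 2)))

def gp_predict_q(Z, f, z_new, omega_n=1.0, sigma_k=1.0):
    """Posterior mean Q(z_new) = K(Z, z_new)^T [K(Z, Z) + omega_n^2 I]^{-1} f.

    Z     : historical state-action pairs, each encoded as a numeric vector
    f     : the corresponding historical Q values
    z_new : the newly entered state-action pair (numeric vector)
    """
    n = len(Z)
    K = np.array([[gaussian_kernel(Z[i], Z[j], sigma_k) for j in range(n)] for i in range(n)])
    k_new = np.array([gaussian_kernel(Z[i], z_new, sigma_k) for i in range(n)])
    alpha = np.linalg.solve(K + (omega_n ** 2) * np.eye(n), np.asarray(f, dtype=float))
    return float(k_new @ alpha)
```

For example, gp_predict_q([[0.0, 1.0], [1.0, 2.0]], [0.5, 0.8], [1.0, 1.0]) returns the predicted Q value of the new pair under these assumptions.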
(22) initializing the learning rate σ, the discount rate γ and the greedy strategy probability ε in Q-learning, and setting the current state s_t = 0 and the current time step t = 0;
(23) selecting one service as the current service a_t using a greedy strategy with probability ε and executing it, specifically: randomly generating a random number υ in the interval (0, 1); if υ > ε, a new service a is selected at random; if υ ≤ ε, the service with the largest current Q value is selected as the new service a; this avoids falling into local optima;
(24) recording the return value r_t obtained by executing the current service a_t in the current state s_t, and the state s_{t+1} after the service is executed; calculating the Q value of the state-action pair z_t = <s_t, a_t> according to:

Q(z_t) ← Q(z_t) + σ·[ r_t + γ·Q(s_{t+1}, a_{t+1}) − Q(z_t) ]

where Q(z_t) is the Q value of the state-action pair z_t = <s_t, a_t>, σ is the learning rate, r_t is the return value, γ is the discount rate, s_{t+1} is the subsequent state reached from the current state s_t after executing service a_t, a_{t+1} is the service selected in state s_{t+1}, and Q(s_{t+1}, a_{t+1}) denotes the Q value of the state-action pair <s_{t+1}, a_{t+1}>;
(25) updating the Q value according to a Gaussian prediction model:
Q(z_{t+1}) = K(Z, z_{t+1})^T · [ K(Z, Z) + ω_n²·I ]^{−1} · f

where I is the identity matrix, ω_n is the uncertainty parameter, Z is the set of historical state-action pairs, f is the set of historical Q values corresponding to Z, K(Z, Z) is the covariance matrix between the historical state-action pairs, whose element in row i and column j is k(z_i, z_j), k(·) being a kernel function, and K(Z, z_{t+1}) is the covariance vector between the historical state-action pairs and the newly entered state-action pair z_{t+1}. Many kinds of kernel function can be used; in this embodiment the kernel function k is a Gaussian kernel:

k(z_i, z_j) = exp( −‖z_i − z_j‖² / (2σ_k²) )

where σ_k is the width of the Gaussian kernel function.
Since the Gaussian model has changed with the newly added data point, the Gaussian prediction model must be updated according to the state-action pair z_{t+1} = <s_{t+1}, a_{t+1}> and the corresponding Q value Q(z_{t+1}), for use in the next iterative update of the Q value;
(26) updating the current state: s_t = s_{t+1}; when s_t is the termination state and the convergence condition is met, reinforcement learning ends and the optimal strategy is obtained; otherwise, go to step (23).
The convergence condition in this embodiment is that the Q value has become stable, i.e. the change in the Q value is smaller than the Q-value threshold Q_th: |Q(z_t) − Q(z_{t+1})| < Q_th. The optimal strategy is obtained at that point, and the final service combination result is derived from the optimal strategy.
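Putting steps (22) to (26) together, the following outline shows how the ε-greedy selection, the Q-value target and the Gaussian prediction model could interact in one learning loop. It builds on the gp_predict_q sketch above; the environment interface, the encoding of state-action pairs and all parameter values are hypothetical, so this illustrates the control flow rather than the patent's implementation:

```python
import random

# Assumes gp_predict_q(...) from the Gaussian-process sketch above; env, encode and
# the parameter defaults below are hypothetical and only illustrate the control flow.

def run_gp_q_learning(env, encode, sigma=0.1, gamma=0.9, epsilon=0.8,
                      omega_n=1.0, sigma_k=1.0, q_th=1e-3, max_episodes=100):
    """Illustrative Gaussian-process Q-learning loop for service combination.

    env    : hypothetical environment exposing reset(), actions(s) and
             step(s, a) -> (reward, next_state, done)
    encode : maps a state-action pair <s, a> to a numeric vector (the GP input z)
    """
    Z, f = [], []          # historical state-action pairs and their Q values
    converged = False

    def q(s, a):
        # Q value predicted by the Gaussian model; 0 before any data exists.
        return gp_predict_q(Z, f, encode(s, a), omega_n, sigma_k) if Z else 0.0

    def greedy(s):
        return max(env.actions(s), key=lambda act: q(s, act))

    for _ in range(max_episodes):
        if converged:
            break
        s, done, prev_q = env.reset(), False, None
        while not done:
            # (23) epsilon-greedy: exploit with probability epsilon, otherwise explore.
            a = greedy(s) if random.random() <= epsilon else random.choice(env.actions(s))
            # (24) execute the service, observe the return value and the next state.
            r, s_next, done = env.step(s, a)
            future = 0.0 if done else q(s_next, greedy(s_next))
            new_q = q(s, a) + sigma * (r + gamma * future - q(s, a))
            # (25) record the new observation, which updates the Gaussian prediction model.
            Z.append(encode(s, a))
            f.append(new_q)
            # (26) advance the state; stop once consecutive Q values have stabilised.
            if done and prev_q is not None and abs(new_q - prev_q) < q_th:
                converged = True
            prev_q, s = new_q, s_next

    # Step 3: map the learned optimal strategy to one concrete service per state.
    return {st: greedy(st) for st in getattr(env, "states", []) if env.actions(st)}
```

In practice the growing kernel matrix would need to be updated incrementally or sparsified; that bookkeeping is omitted here for brevity.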

Claims (3)

1. A service combination method combining Gaussian process and reinforcement learning is characterized by comprising the following steps:
(1) modeling the service composition problem as a four-tuple Markov decision process as follows:
M=<S,A,P,R>
where S is the set of finite states of the environment; A is the set of callable actions, and A(s) represents the set of actions that can be performed in state s; P is the function describing the MDP state transition, and P(s'|s, a) represents the probability of transitioning to state s' after invoking action a in state s; R is a return value function, and R(s, a) represents the return value obtained by invoking action a in state s;
(2) solving a four-tuple Markov decision process by applying a Q-learning-based reinforcement learning method to obtain an optimal strategy;
(3) mapping the optimal strategy into a workflow of a web service combination;
the step (2) of solving the four-tuple Markov decision process by applying a Q-learning-based reinforcement learning method to obtain an optimal strategy comprises the following steps:
(21) taking the state action pair z as input and the corresponding Q value Q (z) as output, and establishing a Gaussian prediction model of the Q value;
(22) initializing the learning rate σ, the discount rate γ and the greedy strategy probability ε in Q-learning, and setting the current state s_t = 0 and the current time step t = 0;
(23) selecting the current service a_t using a greedy strategy with probability ε, and executing it;
(24) recording the return value r_t obtained by executing the current service a_t in the current state s_t, and the state s_{t+1} after the current service a_t is executed; calculating the Q value of the state-action pair z_t = <s_t, a_t> according to:

Q(z_t) ← Q(z_t) + σ·[ r_t + γ·Q(s_{t+1}, a_{t+1}) − Q(z_t) ]

where Q(z_t) is the Q value of the state-action pair z_t = <s_t, a_t>, σ is the learning rate, r_t is the return value, γ is the discount rate, s_{t+1} is the subsequent state reached from the current state s_t after executing service a_t, a_{t+1} is the service selected in state s_{t+1}, and Q(s_{t+1}, a_{t+1}) denotes the Q value of the state-action pair <s_{t+1}, a_{t+1}>;
(25) updating the Q value according to a Gaussian prediction model:

Q(z_{t+1}) = K(Z, z_{t+1})^T · [ K(Z, Z) + ω_n²·I ]^{−1} · f

where I is the identity matrix, ω_n is an uncertainty parameter, Z is the set of historical state-action pairs, f is the set of historical Q values corresponding to Z, K(Z, Z) is the covariance matrix between the historical state-action pairs, whose element in row i and column j is k(z_i, z_j), k(·) being a kernel function, and K(Z, z_{t+1}) is the covariance vector between the historical state-action pairs and the newly entered state-action pair z_{t+1};
updating the Gaussian prediction model according to the state-action pair z_{t+1} = <s_{t+1}, a_{t+1}> and the corresponding Q value Q(z_{t+1});
(26) updating the current state: s_t = s_{t+1}; when s_t is the termination state and the convergence condition is met, reinforcement learning ends and the optimal strategy is obtained; otherwise, go to step (23).
2. The service combination method combining Gaussian process and reinforcement learning according to claim 1, wherein the kernel function k(·) in the Gaussian prediction model is a Gaussian kernel function:

k(z_i, z_j) = exp( −‖z_i − z_j‖² / (2σ_k²) )

where σ_k is the width of the Gaussian kernel function.
3. The service combination method combining Gaussian process and reinforcement learning according to claim 1, wherein the convergence condition in step (26) is that the change in the Q value is smaller than the Q-value threshold Q_th, namely: |Q(z_t) − Q(z_{t+1})| < Q_th.
CN201710055817.2A 2017-01-25 2017-01-25 Service combination method combining Gaussian process and reinforcement learning Active CN106850289B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710055817.2A CN106850289B (en) 2017-01-25 2017-01-25 Service combination method combining Gaussian process and reinforcement learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710055817.2A CN106850289B (en) 2017-01-25 2017-01-25 Service combination method combining Gaussian process and reinforcement learning

Publications (2)

Publication Number Publication Date
CN106850289A CN106850289A (en) 2017-06-13
CN106850289B true CN106850289B (en) 2020-04-24

Family

ID=59120622

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710055817.2A Active CN106850289B (en) 2017-01-25 2017-01-25 Service combination method combining Gaussian process and reinforcement learning

Country Status (1)

Country Link
CN (1) CN106850289B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108319852B (en) * 2018-02-08 2022-05-06 北京安信天行科技有限公司 Event discrimination strategy creating method and device
CN108972546B (en) * 2018-06-22 2021-07-20 华南理工大学 Robot constant force curved surface tracking method based on reinforcement learning
CN108958916B (en) * 2018-06-29 2021-06-22 杭州电子科技大学 Workflow unloading optimization method under mobile edge environment
CN109388484B (en) * 2018-08-16 2020-07-28 广东石油化工学院 Multi-resource cloud job scheduling method based on Deep Q-network algorithm
CN109670637A (en) * 2018-12-06 2019-04-23 苏州科技大学 Building energy consumption prediction technique, storage medium, device and system
KR102251316B1 (en) * 2019-06-17 2021-05-12 (주)브이엠에스 솔루션스 Reinforcement learning and simulation based dispatching method within a factory, and an apparatus thereof
CN113065284B (en) * 2021-03-31 2022-11-01 天津国科医工科技发展有限公司 Triple quadrupole mass spectrometer parameter optimization strategy calculation method based on Q learning


Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103248693A (en) * 2013-05-03 2013-08-14 东南大学 Large-scale self-adaptive composite service optimization method based on multi-agent reinforced learning
CN103646008A (en) * 2013-12-13 2014-03-19 东南大学 Web service combination method
CN105046351A (en) * 2015-07-01 2015-11-11 内蒙古大学 Reinforcement learning-based service combination method and system in uncertain environment

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
"Integrating Gaussian Process with Reinforcement Learning for Adaptive Service Composition";Hongbing Wang 等;《Lecture Notes in Computer Science》;20151125;第203-217页 *
"基于多Agent学习机制的服务组合";赵海燕 等;《计算机工程与科学》;20130915;第35卷(第9期);正文第3节 *
"基于强化学习的QoS感知的服务组合优化方案研究";吴琴;《中国优秀硕士学位论文全文数据库 信息科技辑》;20160815;I139-123 *

Also Published As

Publication number Publication date
CN106850289A (en) 2017-06-13

Similar Documents

Publication Publication Date Title
CN106850289B (en) Service combination method combining Gaussian process and reinforcement learning
Wang et al. Adaptive and large-scale service composition based on deep reinforcement learning
CN108932671A (en) A kind of LSTM wind-powered electricity generation load forecasting method joined using depth Q neural network tune
CN110235148A (en) Training action selects neural network
EP3593289A1 (en) Training action selection neural networks using a differentiable credit function
CN108962238A (en) Dialogue method, system, equipment and storage medium based on structural neural networks
CN112541302A (en) Air quality prediction model training method, air quality prediction method and device
CN107241213A (en) A kind of web service composition method learnt based on deeply
CN109840595B (en) Knowledge tracking method based on group learning behavior characteristics
CN111310987B (en) Method and device for predicting free parking space of parking lot, electronic equipment and storage medium
CN112329948A (en) Multi-agent strategy prediction method and device
CN112154458A (en) Reinforcement learning using proxy courses
CN115905691B (en) Preference perception recommendation method based on deep reinforcement learning
CN110235149A (en) Neural plot control
CN112930541A (en) Determining a control strategy by minimizing delusional effects
CN115293623A (en) Training method and device for production scheduling model, electronic equipment and medium
CN114637911A (en) Next interest point recommendation method of attention fusion perception network
US20220269835A1 (en) Resource prediction system for executing machine learning models
CN113379536A (en) Default probability prediction method for optimizing recurrent neural network based on gravity search algorithm
CN113095513A (en) Double-layer fair federal learning method, device and storage medium
CN112541570A (en) Multi-model training method and device, electronic equipment and storage medium
CN114861917A (en) Knowledge graph inference model, system and inference method for Bayesian small sample learning
CN115098613A (en) Method, device and medium for tracking and predicting trajectory data
CN115098583A (en) User portrait depicting method for energy user
CN114528992A (en) Block chain-based e-commerce business analysis model training method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant