CN114626509A - Method for reconstructing explicit model prediction control based on deep learning - Google Patents

Method for reconstructing explicit model prediction control based on deep learning Download PDF

Info

Publication number
CN114626509A
CN114626509A (application CN202210313902.5A)
Authority
CN
China
Prior art keywords
neural network
deep neural
input
control
function
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210313902.5A
Other languages
Chinese (zh)
Other versions
CN114626509B (en
Inventor
张聚
施超
牛彦
潘伟栋
陈德臣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Normal University
Original Assignee
Hangzhou Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Normal University filed Critical Hangzhou Normal University
Priority to CN202210313902.5A priority Critical patent/CN114626509B/en
Publication of CN114626509A publication Critical patent/CN114626509A/en
Application granted granted Critical
Publication of CN114626509B publication Critical patent/CN114626509B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G06N3/08: Learning methods
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES
    • G06Q10/00: Administration; Management
    • G06Q10/04: Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"


Abstract

The invention discloses a predictive control method that reconstructs explicit model predictive control based on deep learning, comprising the following steps: step 1) reformulate explicit model predictive control as a multi-parametric quadratic programming problem; step 2) collect data and build a deep neural network; step 3) train the built deep neural network; step 4) verify the feasibility of the deep neural network; step 5) reconstruct the explicit model predictive control; and step 6) optimize the reconstructed parameters. The invention integrates a deep learning model with explicit model predictive control, addresses the high computational resource demands and long computation times of traditional model predictive control, guarantees control precision and prediction accuracy, and improves computational efficiency.

Description

Method for reconstructing explicit model prediction control based on deep learning
Technical Field
The invention belongs to the technical field of deep learning, is used for developing an accurate deep-learning-based surrogate model and its offline explicit optimal solution, and particularly relates to a predictive control method that reconstructs explicit model predictive control based on deep learning.
Background
Deep learning models are a class of approximate models that have proven highly predictive for complex phenomena. Introducing deep learning models into formulations requiring optimization provides a way to reduce problem complexity while maintaining model accuracy. A deep learning model in the form of a neural network with rectified linear units can be exactly recast as a multi-parametric quadratic programming formulation. However, developing optimal solutions for online applications involving explicit model predictive control remains a challenge. Multi-parametric programming alleviates the burden of online computation for optimization problems involving bounded, uncertain parameters, yet there is still great room for improvement in the offline computation.
Deep learning approximates complex systems and tasks by building sophisticated mathematical models from large amounts of data. These approximate models are increasingly valuable as data-driven modeling techniques, making it important to incorporate deep learning into optimization formulations. Neural networks have been used successfully as surrogate models in various settings, such as modeling, optimization and control, and regression and classification. In all of these applications, artificial neural network models are developed to represent complex, nonlinear processes. However, due to their inherent nonconvexity, obtaining a global solution to an optimization problem involving a neural network imposes a significant computational burden.
Because of their highly connected structure, deep learning models are adept at expressing complex functional relationships. Their ability to approximate functions to arbitrary precision stems from the exponential number of piecewise-connected hyperplanes they represent, which grows with the size of the network. For optimization problems with highly complex, nonlinear components, neural networks with rectified linear unit (ReLU) activations have proven to perform well on regression problems and can be incorporated into the optimization formulation as surrogate models.
Disclosure of Invention
In order to overcome the defects in the prior art, the invention provides a method combining deep learning and multi-parametric programming, used to develop an accurate deep-learning-based surrogate model and its offline explicit optimal solution.
The method integrates a deep learning model, specifically a neural network with rectified linear units, with explicit model predictive control, and can be exactly recast as a multi-parametric quadratic programming formulation. Recasting the deep learning model as a set of piecewise-linear functions allows the predictive model to be incorporated into model-based control strategies such as explicit model predictive control. To reduce the computational burden of solving the piecewise-linear optimization problem online, a fully offline explicit solution of the optimal control problem is obtained using multi-parametric programming. In online applications where time is critical, determining the optimal solution is challenging because of the inherent nonconvexity of the resulting discrete optimization formulation; multi-parametric programming is an effective way to reduce the online computational burden by developing the optimal solution offline. Introducing more advanced surrogate models into the multi-parametric optimization formulation strengthens the advantages of the developed parametric solutions, namely (i) the ability to obtain the optimal solution without re-solving the optimization problem each time the uncertain parameter is realized, (ii) an a priori map of the solution, and (iii) an explicit functional relationship between the optimization variables and the uncertain parameters. One key drawback of these multi-parametric formulations is their reliance on linear or piecewise-linear constraints; incorporating more complex phenomena into the parametric formulation therefore requires approximation. Developing an accurate approximate model to represent a nonlinear functional relationship is not straightforward, and deep learning models based on the ReLU activation function fill this gap.
A neural network with ReLU activations can be represented exactly in a multi-parametric quadratic programming formulation. This exact recasting of ReLU networks offers a new way to incorporate deep learning models into optimization-based formulations, narrowing the gap between model accuracy and computational performance.
In order to make the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention are further described below. The method for reconstructing the explicit model prediction control based on the deep learning comprises the following steps:
step (1), reformulating explicit model predictive control as a multi-parametric quadratic programming problem;
the explicit model predictive control problem is as follows:
$$
\begin{aligned}
\min_{U}\ \ & x_N^\top P x_N + \sum_{k=0}^{N-1}\left(x_k^\top Q x_k + u_k^\top R u_k\right) \\
\text{s.t.}\ \ & x_{k+1} = A x_k + B u_k,\quad k = 0,\dots,N-1 \\
& C_x x_k \le c_x,\quad C_u u_k \le c_u,\quad C_f x_N \le c_f \\
& x_0 = x_{init}
\end{aligned} \tag{1}
$$

wherein $U = [u_0^\top, \dots, u_{N-1}^\top]^\top \in \mathbb{R}^{N n_u}$ is the vector containing the control input sequence; $P, Q \in \mathbb{R}^{n_x \times n_x}$ and $R \in \mathbb{R}^{n_u \times n_u}$ are weighting matrices, chosen such that $P \succeq 0$ and $Q \succeq 0$ are positive semidefinite and $R \succ 0$ is positive definite; $x_k \in \mathbb{R}^{n_x}$ is the state vector, $u_k \in \mathbb{R}^{n_u}$ is the control input, $A \in \mathbb{R}^{n_x \times n_x}$ is the system matrix, and $B \in \mathbb{R}^{n_x \times n_u}$ is the input matrix, with $(A, B)$ controllable; $k$ denotes the $k$-th sample point and $x_N$ the terminal state; $n_x$ is the dimension of the system state; $n_u$ is the dimension of the system control input; $n_{cx}$ is the number of hyperplanes defining the bounded polyhedral state set; $n_{cf}$ is the number of hyperplanes defining the bounded polyhedral terminal set; $n_{cu}$ is the number of hyperplanes defining the bounded polyhedral input set;

the state, terminal and input constraints are bounded polyhedral sets defined by the matrices $C_x \in \mathbb{R}^{n_{cx} \times n_x}$, $C_f \in \mathbb{R}^{n_{cf} \times n_x}$, $C_u \in \mathbb{R}^{n_{cu} \times n_u}$ and the vectors $c_x \in \mathbb{R}^{n_{cx}}$, $c_f \in \mathbb{R}^{n_{cf}}$, $c_u \in \mathbb{R}^{n_{cu}}$; the terminal cost defined by $P$ and the terminal set are selected so as to guarantee closed-loop stability and recursive feasibility of the optimization problem; for a given prediction horizon $N$, the set of initial states $x_{init}$ for which a solution exists is called the feasible region, and the optimization problem (1) can be reformulated as a multi-parametric quadratic programming problem that depends only on the current system state $x_{init}$:
$$
\min_{u}\ \tfrac{1}{2}\, u^\top H u + x_{init}^\top F u
\quad \text{subject to} \quad C_c u \le T x_{init} + c_c \tag{2}
$$

wherein $H \in \mathbb{R}^{s \times s}$ and $F \in \mathbb{R}^{n_x \times s}$ define the objective, and $C_c \in \mathbb{R}^{n_{ineq} \times s}$, $T \in \mathbb{R}^{n_{ineq} \times n_x}$ and $c_c \in \mathbb{R}^{n_{ineq}}$ define the constraints; $n_{ineq}$ is the total number of inequality constraints in the multi-parametric problem (2), and $s$ is the number of control variables in equation (2);
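As a concrete illustration of the condensing step behind equation (2), the sketch below eliminates the states of problem (1) and assembles $H$, $F$, $C_c$, $T$ and $c_c$ for a hypothetical double-integrator system; the matrices $A$, $B$, $Q$, $R$, $P$, the horizon $N$ and the box bounds are invented for illustration and are not the system of the invention:

```python
import numpy as np

# Hypothetical double integrator (illustrative values only)
A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.5], [1.0]])
Q = np.eye(2); P = np.eye(2); R = np.array([[0.1]])
N = 3
nx, nu = B.shape

# Prediction matrices: X = [x_1; ...; x_N] = Sx x_init + Su U
Sx = np.vstack([np.linalg.matrix_power(A, k) for k in range(1, N + 1)])
Su = np.zeros((N * nx, N * nu))
for k in range(1, N + 1):
    for j in range(k):
        Su[(k - 1) * nx:k * nx, j * nu:(j + 1) * nu] = (
            np.linalg.matrix_power(A, k - 1 - j) @ B)

Qbar = np.kron(np.eye(N), Q)
Qbar[-nx:, -nx:] = P                      # terminal weight P on x_N
Rbar = np.kron(np.eye(N), R)

# Substituting X into the cost of (1) and dropping the constant term
# yields min_U 0.5 U' H U + x_init' F U with:
H = 2.0 * (Su.T @ Qbar @ Su + Rbar)
F = 2.0 * (Sx.T @ Qbar @ Su)

# Input box constraints |u_k| <= 1 written as Cc U <= T x_init + cc
Cc = np.vstack([np.eye(N * nu), -np.eye(N * nu)])
T = np.zeros((2 * N * nu, nx))
cc = np.ones(2 * N * nu)
```

Because $R \succ 0$, the condensed $H$ is symmetric positive definite, which is what makes (2) a well-posed mp-QP.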
the multi-parameter quadratic programming problem solution is in the form of a piecewise affine function:
Figure BDA00035682085300000317
for nrThe area of the image to be displayed is,
Figure BDA0003568208530000041
and
Figure BDA0003568208530000042
each region ΘiAre all described by polyhedrons;
Figure BDA0003568208530000043
wherein
Figure BDA0003568208530000044
ciRepresentation description region ΘiNumber of inequalities of polyhedrons, ai,jxinit≤bi,j,j=1,...,cj
Figure BDA0003568208530000045
And is
Figure BDA0003568208530000046
step (2), constructing a data set:
the input space of the function is sampled and normalized to construct an input/output data set, in which initial states sampled from the feasible set of $x_{init}$ are the inputs and the corresponding optimal control input $u$ is the output;
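Step (2) can be sketched as follows; the box bounds, the sample count and the stand-in controller (a linear map in place of the true piecewise affine law) are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_dataset(controller, lo, hi, n_samples=1000):
    """Sample initial states uniformly from a box and query the controller."""
    X = rng.uniform(lo, hi, size=(n_samples, len(lo)))  # sampled x_init
    U = np.array([controller(x) for x in X])            # optimal inputs
    return X, U

def minmax_normalize(Z):
    """Min-max normalization to [0, 1], returning the scaling for later reuse."""
    zmin, zmax = Z.min(axis=0), Z.max(axis=0)
    return (Z - zmin) / (zmax - zmin), (zmin, zmax)

# Stand-in controller; a real one would evaluate the explicit law (3)
X, U = sample_dataset(lambda x: -0.5 * x, lo=[-1.0, -1.0], hi=[1.0, 1.0])
Xn, x_scale = minmax_normalize(X)
```

Storing the normalization constants alongside the data matters: the same scaling must be applied to every state fed to the trained network online.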
step (3), building a deep neural network, and training and verifying the deep neural network by using the data set:
the deep neural network comprises an input layer, three hidden layers and an output layer;
the neural network is added with a projection algorithm on the basis of full connection, so that the output of the neural network can be ensured to meet the constraint condition and feasibility of a control system; the fourth layer of the neural network is a projection algorithm layer, because some data are infeasible during training, the feasibility of the output of the neural network can be ensured through the projection algorithm;
each node in the hidden layers of the deep neural network is associated with an activation function; the ReLU activation function is used, defined as:

$$ y = \max\{0, x\} \tag{5} $$

since the ReLU activation function is piecewise linear, it can equivalently be written as:

$$
f(x) = \begin{cases} 0, & x \le 0 \\ x, & x > 0 \end{cases} \tag{6}
$$

the ReLU activation function approximates a function by decomposing it into a set of piecewise hyperplanes; the number of piecewise hyperplanes grows exponentially with the number of hidden layers of the deep neural network:

$$
N_{hyperplanes} = \mathcal{O}\!\left(\left(\tfrac{M}{n_x}\right)^{(L-1) n_x} M^{n_x}\right) \tag{7}
$$

wherein $L$ is the number of hidden layers, $n_x$ is the number of network input variables (the number of system state variables), and $M$ is the number of nodes per hidden layer;

a deep neural network with ReLU activations is therefore capable of exact function approximation through an exponential number of piecewise affine hyperplanes;
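A minimal numpy sketch of the network of step (3): an input layer, three fully connected ReLU hidden layers, and a linear output layer. The layer widths and random weights are placeholders (trained values would come from the sampled data set), and the projection step described above is omitted:

```python
import numpy as np

rng = np.random.default_rng(1)

def relu(z):
    return np.maximum(0.0, z)  # equation (5), applied elementwise

def make_dnn(sizes):
    """sizes = [n_x, M, M, M, n_u]; returns a list of (W, b) pairs per layer."""
    return [(rng.standard_normal((m, n)) * 0.5, np.zeros(m))
            for n, m in zip(sizes[:-1], sizes[1:])]

def forward(layers, x):
    for W, b in layers[:-1]:
        x = relu(W @ x + b)      # three hidden layers use ReLU
    W, b = layers[-1]
    return W @ x + b             # linear output layer

layers = make_dnn([2, 8, 8, 8, 1])   # n_x = 2 states in, n_u = 1 input out
u_pred = forward(layers, np.array([0.3, -0.7]))
```

Each (W, b) pair corresponds directly to the $W_k$, $b_k$ used in the recasting of equations (8) and (9) below.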
step (4), reconstructing explicit model predictive control with the deep neural network trained and verified in step (3):
a neural network with ReLU activations can be recast exactly within the multi-parametric quadratic programming problem; the multi-parametric quadratic programming formulation allows the neural network to be embedded directly into the optimization problem; the complexity this recasting adds to the overall optimization problem lies in the management of the binary variables;
for a network layer with $n$ nodes, the output takes the form:

$$ x_k = \max\{0,\; W_k x_{k-1} + b_k\} \tag{8} $$

wherein $k$ is the layer index, $W_k$ is the weight matrix of the $k$-th layer, $b_k$ is the bias vector of the $k$-th layer, $x_{k-1} \in \mathbb{R}^{n_{k-1}}$ is the output of the previous layer, and $x_k \in \mathbb{R}^{n_k}$ is the output of the current layer;
equation (8) can be recast exactly in the optimization formulation by introducing binary variables; the reconstruction of the $k$-th hidden layer of the multi-parametric problem is:

$$
\begin{aligned}
& W_k x_{k-1} + b_k = x_k - s_k \\
& x_k \le M y, \qquad s_k \le M (1 - y) \\
& x_k \ge 0, \qquad s_k \ge 0, \qquad y \in \{0,1\}^n
\end{aligned} \tag{9}
$$

wherein $y$ is a vector of binary variables, $s_k \in \mathbb{R}^n$ is a vector of auxiliary (slack) variables, and $M$ is a sufficiently large scalar;

after equation (8) is reconstructed by equation (9), the total number of binary variables $y$ equals the total number of nodes in the hidden layers;

the recast neural network with the ReLU activation function is an exact reconstruction; through the inter-layer constraints, the binary variables force the activation function to output either 0 or $x$; moreover, the numbers of equality and inequality constraints each grow linearly with the total number of nodes $n$; incorporating the recast neural network into the optimization formulation provides an effective strategy for maintaining the high accuracy of the surrogate model;
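The equivalence between (8) and its big-M reconstruction (9) can be checked numerically on a single layer; the weights, the input and the value of M below are arbitrary test data:

```python
import numpy as np

rng = np.random.default_rng(2)
W = rng.standard_normal((4, 3))
b = rng.standard_normal(4)
x_prev = rng.standard_normal(3)
M = 1e3                                  # big-M bound on the activations

z = W @ x_prev + b                       # pre-activation of the layer
y = (z > 0).astype(float)                # binary variables of (9)
x = np.maximum(0.0, z)                   # node outputs
s = np.maximum(0.0, -z)                  # slack variables

# With this choice of y, all constraints of (9) hold and x equals ReLU(z):
assert np.allclose(W @ x_prev + b, x - s)            # equality constraint
assert np.all(x <= M * y) and np.all(s <= M * (1 - y))
assert np.allclose(x, np.maximum(0.0, z))            # matches equation (8)
```

In a mixed-integer solver the roles reverse: $y$ is free and the constraints of (9) force $x$ to the ReLU value; the check above verifies that the feasible assignment exists and is the intended one.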
step (5), optimizing the parameters of the deep neural network model reconstructed in step (4):
after reconstruction, regularization techniques can be used to reduce the number of active nodes (nodes whose value differs from 0); reducing the number of active nodes directly reduces the number of binary variables needed in the reconstruction; a trained and post-processed neural network minimizes unnecessary binary variables; for example, nodes in a hidden layer that are always positive can be represented by linear activation functions, so such nodes require neither a slack variable nor a binary variable.
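The pruning idea of step (5), that always-positive nodes need no binary variable, can be sketched by probing pre-activations over sampled inputs; the weights, biases and sampling box below are synthetic:

```python
import numpy as np

rng = np.random.default_rng(3)
W = rng.standard_normal((5, 2))
b = np.array([5.0, -5.0, 0.1, 5.0, -0.1])   # nodes 0 and 3 biased strongly positive
X = rng.uniform(-1, 1, size=(500, 2))        # sampled network inputs

pre = X @ W.T + b                            # pre-activations, shape (500, 5)
always_positive = np.all(pre > 0, axis=0)    # these nodes behave linearly
n_binaries = int(np.sum(~always_positive))   # binary variables actually needed
```

Sampling can only suggest that a node is always positive; a rigorous version bounds the pre-activation over the whole input polytope (e.g. by interval arithmetic or an LP per node) before dropping its binary variable.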
step (6), realizing helicopter attitude control with the optimized deep neural network.
Preferably, during training of the deep neural network, dropout is used to train the model and update the network parameters, preventing overfitting.
Preferably, k-fold cross-validation is used when validating the deep neural network.
The invention has the following advantages:
1. The invention integrates a deep learning model with explicit model predictive control, addresses the high computational resource demands and long computation times of traditional model predictive control, guarantees control precision and prediction accuracy, and improves computational efficiency.
2. The steps of the method rest on a firm theoretical basis and are simple and clear, with complete theoretical support.
3. The deep neural network adds a projection step on top of its fully connected layers, so that the control output is restricted to its feasible set. The fourth layer of the network is the projection layer; because some training data are infeasible, the projection step improves the feasibility of the control output.
4. The invention introduces binary variables, which let the activation function output either 0 or x through the constraints; embedding the recast network in the optimization formulation therefore improves the accuracy of the model.
Drawings
Fig. 1 is a flow framework diagram of deep learning model based integration with ReLU activation functions and explicit model predictive control in accordance with the method of the present invention.
FIG. 2 is a schematic diagram of a feedforward neural network structure involved in the method of the present invention.
FIG. 3 is a comparison of the approximate control law generated by deep learning of the method of the present invention with the conventional EMPC control law.
FIG. 4 is a comparison graph of deep learning of the method of the present invention and tracking simulation of altitude angle by conventional EMPC.
Fig. 5 is a comparison graph of deep learning of the method of the present invention and tracking simulation of the state trace of the pitch angle by the conventional EMPC.
FIG. 6 is a comparison graph of deep learning of the method of the present invention and tracking simulation of the state trajectory of the rotation angle by the conventional EMPC.
FIG. 7 is a comparison graph of experimental data of deep learning and the traditional EMPC control law.
Detailed Description
The invention is further described below with reference to the accompanying drawings:
the invention discloses a prediction control method for reconstructing an explicit model based on deep learning, which is applied to the field of helicopter attitude control.
A helicopter attitude control method based on deep learning reconstruction explicit model predictive control specifically comprises the following steps as shown in figure 1:
step (1), first, analyze and dynamically model the helicopter system: the forces acting on each axis and component of the helicopter system during operation are analyzed, specifically as follows:
the three-degree-of-freedom helicopter system space state equation is as follows:
the space-state equation of the three-degree-of-freedom helicopter system is:

$$ \dot{x} = A x + B u, \qquad y = C x $$

according to the modeling analysis of the helicopter, the altitude angle $\varepsilon$, the pitch angle $p$ and the rotation angle $r$, together with their derivatives, the altitude angular velocity $\dot{\varepsilon}$, the pitch angular velocity $\dot{p}$ and the rotational angular velocity $\dot{r}$, are selected as the state vector, i.e. $x = [\varepsilon, p, r, \dot{\varepsilon}, \dot{p}, \dot{r}]^\top$; the voltages of the front and rear motors of the three-degree-of-freedom helicopter are selected as the input vector, i.e. $u = [V_f, V_b]^\top$, and the altitude angle $\varepsilon$, pitch angle $p$ and rotation angle $r$ as the output vector $y = [\varepsilon, p, r]^\top$; substituting the values from the relevant parameter table yields the coefficients of the $A$, $B$ and $C$ matrices of the state equation. [The numerical $A$, $B$ and $C$ matrices appear as an image in the original filing.]
step (2), acquire and process data via the explicit model predictive control algorithm: reformulate explicit model predictive control as a multi-parametric quadratic programming problem, as follows:
the explicit model predictive control problem is as follows:
$$
\begin{aligned}
\min_{U}\ \ & x_N^\top P x_N + \sum_{k=0}^{N-1}\left(x_k^\top Q x_k + u_k^\top R u_k\right) \\
\text{s.t.}\ \ & x_{k+1} = A x_k + B u_k,\quad k = 0,\dots,N-1 \\
& C_x x_k \le c_x,\quad C_u u_k \le c_u,\quad C_f x_N \le c_f \\
& x_0 = x_{init}
\end{aligned} \tag{1}
$$

wherein $U = [u_0^\top, \dots, u_{N-1}^\top]^\top \in \mathbb{R}^{N n_u}$ is the vector containing the control input sequence; $P, Q \in \mathbb{R}^{n_x \times n_x}$ and $R \in \mathbb{R}^{n_u \times n_u}$ are weighting matrices, selected such that $P \succeq 0$ and $Q \succeq 0$ are positive semidefinite and $R \succ 0$ is positive definite. $x_k \in \mathbb{R}^{n_x}$ is the state vector, $u_k \in \mathbb{R}^{n_u}$ is the control input, $A \in \mathbb{R}^{n_x \times n_x}$ is the system matrix, and $B \in \mathbb{R}^{n_x \times n_u}$ is the input matrix, with $(A, B)$ controllable. $k$ denotes the $k$-th sample point and $x_N$ the terminal state; $n_x$ is the dimension of the system state; $n_u$ is the dimension of the system control input; $n_{cx}$ is the number of hyperplanes defining the bounded polyhedral state set; $n_{cf}$ is the number of hyperplanes defining the bounded polyhedral terminal set; $n_{cu}$ is the number of hyperplanes defining the bounded polyhedral input set.

The state, terminal and input constraints are bounded polyhedral sets defined by the matrices $C_x \in \mathbb{R}^{n_{cx} \times n_x}$, $C_f \in \mathbb{R}^{n_{cf} \times n_x}$, $C_u \in \mathbb{R}^{n_{cu} \times n_u}$ and the vectors $c_x$, $c_f$, $c_u$.

The terminal cost defined by $P$ and the terminal set are selected so as to guarantee closed-loop stability and recursive feasibility of the optimization problem. For a given prediction horizon $N$, the set of initial states $x_{init}$ for which a solution exists is called the feasible region, and the optimization problem (1) can be reformulated as a multi-parametric quadratic programming problem that depends only on the current system state $x_{init}$:
$$
\min_{u}\ \tfrac{1}{2}\, u^\top H u + x_{init}^\top F u
\quad \text{subject to} \quad C_c u \le T x_{init} + c_c \tag{2}
$$

wherein $H \in \mathbb{R}^{s \times s}$ and $F \in \mathbb{R}^{n_x \times s}$ define the objective, and $C_c \in \mathbb{R}^{n_{ineq} \times s}$, $T \in \mathbb{R}^{n_{ineq} \times n_x}$ and $c_c \in \mathbb{R}^{n_{ineq}}$ define the constraints; $n_{ineq}$ is the total number of inequality constraints in the multi-parametric problem (2), and $s$ is the number of control variables in equation (2).
the multi-parameter quadratic programming problem solution is in the form of a piecewise affine function:
Figure BDA0003568208530000091
for nrThe area of the image to be displayed is,
Figure BDA0003568208530000092
and
Figure BDA0003568208530000093
each region ΘiAre described by polyhedrons.
Figure BDA0003568208530000094
Wherein
Figure BDA0003568208530000095
ciRepresentation description region ΘiNumber of inequalities of polyhedrons, ai,jxinit≤bi,j,j=1,...,cj
Figure BDA0003568208530000096
And is
Figure BDA0003568208530000097
step (3), constructing a data set:
the input space of the function is sampled and the samples are normalized to construct an input/output data set, in which initial states sampled from the feasible set of $x_{init}$ are the inputs and the corresponding optimal control input $u$ is the output;
step (4), building a deep neural network as shown in fig. 2, and training and verifying the deep neural network by using the data set:
the deep neural network comprises an input layer, three hidden layers and an output layer;
on top of its fully connected layers, the neural network adds a projection step so that its output is guaranteed to satisfy the constraints and feasibility requirements of the control system; the fourth layer of the network is this projection layer, which ensures feasibility of the network output even though some training data are infeasible;
each node in the hidden layers is associated with an activation function. Common activation functions include the hyperbolic tangent and the rectified linear unit. The choice of a suitable activation function depends on the problem; neural networks with rectified linear units have consistently shown good performance across many applications.
The activation function adopted by the hidden layers of the deep neural network (DNN) is the ReLU activation function, defined as:

$$ y = \max\{0, x\} \tag{5} $$

It is piecewise linear and can equivalently be written as:

$$
f(x) = \begin{cases} 0, & x \le 0 \\ x, & x > 0 \end{cases} \tag{6}
$$

The ReLU activation function approximates a function by decomposing it into a set of piecewise hyperplanes. The number of piecewise hyperplanes of a ReLU neural network grows exponentially with the number of hidden layers:

$$
N_{hyperplanes} = \mathcal{O}\!\left(\left(\tfrac{M}{n_x}\right)^{(L-1) n_x} M^{n_x}\right) \tag{7}
$$

wherein $L$ is the number of hidden layers, $n_x$ is the number of network input variables, and $M$ is the number of nodes per hidden layer. A deep neural network with ReLU activations is therefore capable of exact function approximation through an exponential number of piecewise affine hyperplanes.
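The exponential-pieces claim behind equation (7) can be probed empirically by counting distinct ReLU activation patterns over sampled inputs, since each pattern corresponds to one affine piece of the network; the network sizes and random weights here are arbitrary:

```python
import numpy as np

# Random 2-hidden-layer ReLU network on a 2-D input box; each distinct joint
# activation pattern over both layers corresponds to one affine piece.
rng = np.random.default_rng(4)
W1, b1 = rng.standard_normal((8, 2)), rng.standard_normal(8)
W2, b2 = rng.standard_normal((8, 8)), rng.standard_normal(8)

X = rng.uniform(-1, 1, size=(5000, 2))
z1 = X @ W1.T + b1
h1 = np.maximum(0.0, z1)
z2 = h1 @ W2.T + b2

patterns = np.hstack([z1 > 0, z2 > 0])               # one row per sample
n_pieces = len({tuple(row) for row in patterns.astype(int)})
```

The count is a lower bound on the true number of pieces (sampling misses small regions), but it grows visibly with depth and width, consistent with equation (7).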
During training, each node in the neural network has an associated weight and bias term, and these values are determined so as to minimize a defined performance criterion. Many strategies exist for finding the optimum, such as gradient descent, stochastic gradient descent, and Levenberg–Marquardt. These techniques determine a set of locally optimal weights and biases that minimize the selected performance criterion. Determining a global optimum is desirable but challenging; in many cases a local solution is sufficient. Before these training algorithms are applied, the input and output data are typically normalized to avoid scaling problems. In the training step it is important to ensure that the developed neural network does not overfit the data. One simple strategy to avoid overfitting is to use a large amount of data during training. More advanced techniques include cost-function regularization and batch normalization. Dropout encourages sparsity in the neural network by randomly removing nodes and their connections during training; dropping random nodes forces the network to be resilient and to identify the most salient features of the data set. Cost-function regularization adds a penalty term to the minimized objective; these additional terms yield a trained neural network with fewer active nodes.
When verifying the feasibility of the deep neural network, several techniques exist to ensure that the network fits the data properly and to provide a realistic measure of the model's effectiveness. Validation is typically performed by comparing the expected output of the real model with the predicted output of the neural network. One common technique is k-fold cross-validation, which helps ensure that the trained neural network provides a good fit. Various test metrics quantify the fit between the predicted and measured output data sets, including the mean squared error (MSE) and the root mean squared error (RMSE).
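The validation metrics and the k-fold split mentioned above can be sketched as follows (the toy arrays are illustrative):

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error between measured and predicted outputs."""
    return float(np.mean((y_true - y_pred) ** 2))

def rmse(y_true, y_pred):
    """Root mean squared error, in the units of the output."""
    return float(np.sqrt(mse(y_true, y_pred)))

def kfold_indices(n, k):
    """Yield (train_idx, val_idx) pairs; every sample validates exactly once."""
    idx = np.arange(n)
    for fold in np.array_split(idx, k):
        yield np.setdiff1d(idx, fold), fold

y_true = np.array([0.0, 1.0, 2.0])
y_pred = np.array([0.0, 1.0, 2.5])
```

Averaging the validation MSE across the k folds gives a less optimistic estimate of generalization than a single train/validation split.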
And (5) reconstructing explicit model prediction control by using the trained and verified deep neural network:
neural networks involving the ReLU activation function can accurately recreate a multi-parameter quadratic programming problem. The use of the mp-Qp formula allows the neural network to be directly embedded into the optimization problem. This reconstruction adds complexity to the overall optimization problem by managing binary variables.
For any network layer with n nodes, the output takes the form:
$$ x_k = \max\{0,\; W_k x_{k-1} + b_k\} \tag{8} $$

wherein $k$ is the layer index, $W_k$ is the weight matrix of the $k$-th layer, $b_k$ is the bias vector of the $k$-th layer, $x_{k-1} \in \mathbb{R}^{n_{k-1}}$ is the output of the previous layer, and $x_k \in \mathbb{R}^{n_k}$ is the output of the current layer.
The importance of the ReLU activation function lies in its piecewise-linear nature: equation (8) can be recast exactly in the optimization formulation by introducing binary variables. The reconstruction of the $k$-th hidden layer of the multi-parametric problem is:

$$
\begin{aligned}
& W_k x_{k-1} + b_k = x_k - s_k \\
& x_k \le M y, \qquad s_k \le M (1 - y) \\
& x_k \ge 0, \qquad s_k \ge 0, \qquad y \in \{0,1\}^n
\end{aligned} \tag{9}
$$

wherein $y$ is a vector of binary variables, $s_k \in \mathbb{R}^n$ is a vector of auxiliary (slack) variables, and $M$ is a sufficiently large scalar.

After equation (8) is reconstructed by equation (9), the total number of binary variables $y$ equals the total number of nodes in the hidden layers.

The recast neural network with the ReLU activation function is an exact reconstruction. Through the inter-layer constraints, the binary variables force the activation function to output either 0 or $x$. Moreover, the numbers of equality and inequality constraints each grow linearly with the total number of nodes $n$. Incorporating the recast neural network into the optimization formulation provides an effective strategy for maintaining the high accuracy of the surrogate model.
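Once recast and trained, the network replaces the online optimizer in the closed loop $x_{k+1} = A x_k + B u_k$. The sketch below closes the loop with a stand-in linear gain where the DNN forward pass would go; $A$, $B$ and $K$ are invented toy values chosen to be stabilizing, not the helicopter model:

```python
import numpy as np

A = np.array([[1.0, 0.1], [0.0, 1.0]])   # toy discrete-time plant
B = np.array([[0.005], [0.1]])
K = np.array([[-10.0, -3.0]])            # placeholder for the trained DNN policy

def simulate(x0, steps=200):
    """Closed-loop rollout; the DNN controller would be evaluated in place of K."""
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        u = K @ x
        x = A @ x + B @ u
    return x

x_final = simulate([1.0, 0.0])
```

Because the closed-loop matrix $A + BK$ here has spectral radius below one, the state contracts toward the origin; the same rollout structure is used to test the learned controller against the explicit MPC trajectories.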
step (6), optimizing the parameters of the deep neural network model reconstructed in step (5):
Training is performed after the reconstruction, and regularization can be used to reduce the number of active nodes (nodes whose value differs from 0). Reducing the number of active nodes directly reduces the number of binary variables required in the recasting. A trained and post-processed neural network also minimizes unnecessary binary variables; for example, nodes in a hidden layer that are always positive can be represented by linear activation functions, so such nodes require neither a slack variable nor a binary variable.
Case analysis
The invention is applied to a three-degree-of-freedom helicopter, which, owing to its MIMO, high-order and nonlinear characteristics, is first controlled with explicit model predictive control to obtain output data. These data are used as training data, and the trained deep learning network is then tested separately on the altitude axis, the rotation axis and the pitch axis, showing the performance of the method combining deep learning with explicit model predictive control in this specific application; comparison of the experimental results demonstrates the superior performance of the invention. Under the control of the deep learning network, the control signal is fed back faster than under explicit model predictive control, the response time is shortened, and the autonomous learning capability of the control system improves the stability of the system during control.
The output data of the system are obtained by the explicit model predictive control method, and the values of the input quantities and the corresponding selected output quantities are combined into a data table. The table is then analyzed to delete unnecessary data, correct abnormal data and repair missing data. Finally, a data set that meets the training requirements is constructed, converted to csv format, and divided into a training set, a validation set and a test set. The overall steps can be seen in the flow chart of Fig. 2: a neural network is built in TensorFlow, the data are imported and then normalized. Training is then run for 500 epochs with a learning rate of 0.01, defining a mean-square-error loss function and creating an optimizer. As shown in Fig. 5 of the specification, after 500 epochs an error of 0.12 still remains between the predicted and actual values; although the final result thus differs from the explicit model predictive control computation, this does not affect the flight stability of the helicopter, which remains stable within this error range. Comparing the solution times, the deep learning network clearly has higher control efficiency than ordinary model predictive control under the same parameters, and the storage and computation burden on the computer is greatly reduced.
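A minimal sketch of this training pipeline (normalization, mean-square-error loss, learning rate 0.01, 500 training epochs) is shown below using plain NumPy gradient descent; the affine law and data are hypothetical stand-ins for the helicopter data, and the patent's actual implementation uses TensorFlow:

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in data: states x mapped to controls u by a hypothetical affine law.
K_true, r_true = np.array([[1.5, -0.5]]), np.array([0.2])
X = rng.uniform(-5.0, 5.0, size=(200, 2))
U = X @ K_true.T + r_true              # shape (200, 1)

# Normalization before training, as in the pipeline above.
mu, sigma = X.mean(axis=0), X.std(axis=0)
Xn = (X - mu) / sigma

# One linear layer trained by gradient descent on the MSE loss,
# learning rate 0.01, 500 training epochs (mirroring the reported settings).
W = np.zeros((2, 1))
b = np.zeros(1)
lr = 0.01
for _ in range(500):
    pred = Xn @ W + b                  # forward pass
    err = pred - U
    loss = (err ** 2).mean()           # mean-square-error loss
    W -= lr * 2.0 * Xn.T @ err / len(Xn)
    b -= lr * 2.0 * err.mean(axis=0)

print(f"final MSE: {loss:.6f}")
```

On this toy affine target the loss converges to nearly zero; on the real piecewise affine helicopter law a residual error (such as the 0.12 reported above) remains because a finite network only approximates the exact EMPC solution.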
According to the experimental results and program runs, under the same conditions and on the premise of stable operation, the deep-learning-based model control achieves a faster solution speed and better explicit computation performance than ordinary explicit model predictive control. In terms of control effect, it can effectively regulate the altitude angle, rotation angle and pitch angle of the three-degree-of-freedom helicopter, quickly reach an ideal steady state, and exhibits good control performance.
FIG. 3 is a comparison of the approximate control law generated by deep learning of the method of the present invention with the conventional EMPC control law.
FIG. 4 is a comparison graph of deep learning of the method of the present invention and tracking simulation of altitude angle by conventional EMPC.
Fig. 5 is a comparison diagram of deep learning of the method of the present invention and tracking simulation of a state trace of a pitch angle by a traditional EMPC.
FIG. 6 is a comparison graph of deep learning of the method of the present invention and tracking simulation of the state trajectory of the rotation angle by the conventional EMPC.
FIG. 7 is a comparison graph of experimental data of deep learning and the traditional EMPC control law.

Claims (3)

1. A method for reconstructing explicit model prediction control based on deep learning is characterized in that: the method comprises the following steps:
step (1), reformulating the explicit model predictive control problem as a multi-parameter quadratic programming problem;
the explicit model predictive control problem is as follows:
$$\min_{U}\; x_N^{\top} P x_N + \sum_{k=0}^{N-1}\left(x_k^{\top} Q x_k + u_k^{\top} R u_k\right) \tag{1}$$
$$\text{s.t.}\quad x_{k+1} = A x_k + B u_k,\quad C_x x_k \le c_x,\quad C_u u_k \le c_u,\quad C_f x_N \le c_f,\quad x_0 = x_{init}$$

wherein U = [u_0^T, ..., u_{N-1}^T]^T ∈ R^{N·n_u} is the vector containing the control input sequence; P, Q ∈ R^{n_x×n_x} and R ∈ R^{n_u×n_u} are weighting matrices, selected such that P ≥ 0 and Q ≥ 0 are positive semidefinite and R > 0 is positive definite; x_k ∈ R^{n_x} is the state vector; u_k ∈ R^{n_u} is the control input; A ∈ R^{n_x×n_x} is the system matrix; B ∈ R^{n_x×n_u} is the input matrix, with (A, B) controllable; k denotes the k-th sample point and x_N the terminal state; n_x is the dimension of the system state; n_u is the dimension of the system control input; n_cx is the number of hyperplanes of the polyhedral set bounding the state; n_cf is the number of hyperplanes of the polyhedral set bounding the terminal state; n_cu is the number of hyperplanes of the polyhedral set bounding the input.
The state, terminal and input constraints are bounded polyhedral sets, defined by the matrices C_x ∈ R^{n_cx×n_x}, C_f ∈ R^{n_cf×n_x}, C_u ∈ R^{n_cu×n_u} and the vectors c_x ∈ R^{n_cx}, c_f ∈ R^{n_cf}, c_u ∈ R^{n_cu}. The terminal cost defined by P and the terminal set should be selected so as to guarantee stability of the closed-loop system and recursive feasibility of the optimization problem. For a given prediction horizon N there is a set of initial states x_init for which a solution is feasible, called the feasible region. The optimization problem (1) is reformulated as a multi-parameter quadratic programming problem that depends only on the current system state x_init:
$$\min_{U}\;\tfrac{1}{2} U^{\top} H U + x_{init}^{\top} F U \qquad \text{subject to}\quad C_c U \le T x_{init} + c_c \tag{2}$$

wherein C_c ∈ R^{n_ineq×N·n_u}, T ∈ R^{n_ineq×n_x} and c_c ∈ R^{n_ineq}; n_ineq is the total number of inequalities in the multi-parameter problem, i.e. the number of inequality constraints in equation (2), and N·n_u is the number of control variables in equation (2);
the multi-parameter quadratic programming problem solution is in the form of a piecewise affine function:
Figure FDA0003568208520000025
for nrThe area of the image to be displayed is,
Figure FDA0003568208520000026
and
Figure FDA0003568208520000027
each region ΘiAre all described by polyhedrons;
Figure FDA0003568208520000028
wherein
Figure FDA0003568208520000029
ciRepresentation description region ΘiNumber of inequalities of polyhedrons, ai,jxinit≤bi,j,j=1,...,cj
Figure FDA00035682085200000210
And is
Figure FDA00035682085200000211
Step (2), constructing a data set:
sampling the input space of the function and normalizing it to construct an input/output data set, wherein the initial states sampled from the initial state set x_init are the inputs and the control input u is the output;
step (3), building a deep neural network, and training and verifying the deep neural network by using the data set:
the deep neural network comprises an input layer, three hidden layers and an output layer;
a projection algorithm is added to the fully connected neural network to ensure that the output of the neural network satisfies the constraint conditions and feasibility of the control system;
each node in the deep neural network hidden layer is associated with an activation function; the activation function employs a ReLU activation function, which is defined in the equation:
y = max{0, x} (5)
since the ReLU activation function is also a piecewise linear function, it can equivalently be expressed as:

$$f(x)=\begin{cases}0, & x \le 0\\ x, & x > 0\end{cases} \tag{6}$$
the ReLU activation function approximates a function by decomposing the original function (5) into a set of piecewise hyperplanes; the number of piecewise hyperplanes grows exponentially with the number of hidden layers of the deep neural network, with a lower bound of the form:

$$\left(\left\lfloor \frac{M}{n_x}\right\rfloor\right)^{(L-1)\,n_x}\sum_{j=0}^{n_x}\binom{M}{j} \tag{7}$$

wherein L is the number of hidden layers, n_x is the number of input variables of the neural network, i.e. the number of system state variables, and M is the number of nodes per hidden layer;
the deep neural network with the ReLU activation function therefore has the capability of exact function approximation, representing any piecewise affine function by means of an exponential number of piecewise affine hyperplanes;
and (4) reconstructing explicit model prediction control by using the deep neural network trained and verified in the step (3):
the neural network with the ReLU activation function can exactly reconstruct the multi-parameter quadratic programming problem; using the multi-parameter quadratic programming formulation allows the neural network to be embedded directly into the optimization problem; the complexity this reconstruction adds to the overall optimization problem lies in the management of the binary variables;
for a deep neural network with n nodes, the output takes the form:
x_k = max{0, W_k x_{k-1} + b_k} (8)
where k is the network layer index, W_k is the weight matrix of the k-th layer, b_k is the bias vector of the k-th layer, x_{k-1} is the output of the previous layer, and x_k is the output of the current layer;
the formula (8) is reconstructed exactly in the optimization formulation by introducing binary variables; the reconstruction of the k-th hidden layer for the multi-parameter problem is as follows:

$$x_k - s_k = W_k x_{k-1} + b_k,\qquad x_k \le M y,\quad s_k \le M(1-y),\qquad x_k \ge 0,\; s_k \ge 0,\; y \in \{0,1\}^n \tag{9}$$

wherein y is a vector of binary variables, s_k is an auxiliary variable vector, and M is a large scalar value;
after the formula (8) is reconstructed by the formula (9), the total number of the binary variables y is equal to the total number of the nodes forming the hidden layer; the binary variable enables the activation function to output a value of 0 or x through the interlayer constraint;
and (5) optimizing the parameters of the deep neural network model reconstructed in the step (4):
reducing the number of active nodes of the deep neural network model by adopting a regularization technology;
and (6) realizing the attitude control of the helicopter by using the optimized deep neural network.
2. The method of claim 1, wherein during deep neural network training, the model is trained using a dropout technique to update network parameters and prevent overfitting.
3. The method of claim 1, wherein during deep neural network validation, a k-fold cross-validation model is employed.
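The piecewise affine law of equations (3) and (4) amounts to a point-location lookup at run time: find the polyhedral region Θ_i containing the current state, then apply its affine gain. A minimal sketch with hypothetical regions and gains (not the helicopter law):

```python
import numpy as np

# Hypothetical PWA control law u*(x) = K_i x + r_i over two polyhedral regions
# Theta_i = {x : A_i x <= b_i} (illustrative values only).
regions = [
    {"A": np.array([[1.0, 0.0]]), "b": np.array([0.0]),   # region 1: x1 <= 0
     "K": np.array([[2.0, 1.0]]), "r": np.array([0.5])},
    {"A": np.array([[-1.0, 0.0]]), "b": np.array([0.0]),  # region 2: x1 >= 0
     "K": np.array([[-1.0, 0.5]]), "r": np.array([0.0])},
]

def pwa_control(x):
    """Point location: find the region containing x, then apply its affine law."""
    for reg in regions:
        if np.all(reg["A"] @ x <= reg["b"] + 1e-12):
            return reg["K"] @ x + reg["r"]
    raise ValueError("x outside the feasible region")

u = pwa_control(np.array([-1.0, 2.0]))   # region 1: u = 2*(-1) + 1*2 + 0.5
```

This sequential region search is exactly the lookup cost that the deep neural network surrogate avoids: a forward pass replaces the search over n_r polyhedra.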
CN202210313902.5A 2022-03-28 2022-03-28 Depth learning-based reconstruction explicit model prediction control method Active CN114626509B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210313902.5A CN114626509B (en) 2022-03-28 2022-03-28 Depth learning-based reconstruction explicit model prediction control method


Publications (2)

Publication Number Publication Date
CN114626509A true CN114626509A (en) 2022-06-14
CN114626509B CN114626509B (en) 2024-06-14


Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6064997A (en) * 1997-03-19 2000-05-16 University Of Texas System, The Board Of Regents Discrete-time tuning of neural network controllers for nonlinear dynamical systems
CN109615146A (en) * 2018-12-27 2019-04-12 东北大学 A kind of wind power prediction method when ultrashort based on deep learning
CN111580389A (en) * 2020-05-21 2020-08-25 浙江工业大学 Three-degree-of-freedom helicopter explicit model prediction control method based on deep learning


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
张宝录; 罗丹婷; 胡鹏; 樊举; 景超: "A logging curve generation method based on a deep neural network model", Electronic Measurement Technology, no. 11, 8 June 2020 (2020-06-08) *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant