CN114626509B - Deep learning-based reconstructed explicit model predictive control method - Google Patents

Deep learning-based reconstructed explicit model predictive control method

Info

Publication number
CN114626509B
CN114626509B (granted publication of application CN202210313902.5A; earlier publication CN114626509A)
Authority
CN
China
Prior art keywords: neural network, input, deep neural, function, control
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210313902.5A
Other languages
Chinese (zh)
Other versions
CN114626509A (en)
Inventor
张聚
施超
牛彦
潘伟栋
陈德臣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Normal University
Original Assignee
Hangzhou Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Normal University filed Critical Hangzhou Normal University
Priority: CN202210313902.5A
Publication of CN114626509A (application publication)
Application granted
Publication of CN114626509B (granted publication)
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G06N 3/08: Learning methods
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES
    • G06Q 10/00: Administration; Management
    • G06Q 10/04: Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"


Abstract

The invention discloses a predictive control method that reconstructs an explicit model based on deep learning, comprising the following steps: step 1) re-expressing the explicit model predictive control as a multiparameter quadratic programming problem; step 2) data collection and deep neural network construction; step 3) training the built deep neural network; step 4) verifying the feasibility of the deep neural network; step 5) reconstructing the explicit model predictive control; and step 6) optimizing the parameters after reconstruction. The invention integrates a deep learning model with explicit model predictive control, addressing the high computational resource demand and long computation times of traditional model predictive control while balancing control precision, prediction accuracy, and computational efficiency.

Description

Deep learning-based reconstructed explicit model predictive control method
Technical Field
The invention belongs to the technical field of deep learning and is used to develop an accurate deep-learning-based proxy model and its offline explicit optimal solution; in particular, it relates to a predictive control method based on a deep-learning-reconstructed model.
Background
Deep learning models are a class of approximation models that have been demonstrated to have strong predictive power for representing complex phenomena. Introducing a deep learning model into the optimization formulation of interest provides a way to reduce the complexity of the problem while maintaining the accuracy of the model. A deep learning model in the form of a neural network with rectified linear units can be exactly recast as a multiparameter quadratic programming formulation. However, developing optimal solutions involving explicit model predictive control in online applications remains a challenge. Multiparameter programming eases the online computational burden of solving optimization problems involving bounded uncertain parameters, but there is still significant room for improvement on the offline side.
Deep learning is a method of approximating complex systems and tasks by building sophisticated mathematical models from large amounts of data. These approximation models are increasingly valuable alongside data-driven modeling techniques, and incorporating deep learning into optimization formulations is critical. The use of neural networks as surrogate models has been successful in various settings, such as modeling, optimization and control, and regression and classification. In all of these applications, the developed artificial neural network model is used to represent a complex nonlinear process. However, obtaining a global solution of the corresponding optimization problem containing the neural network imposes a huge computational burden due to its inherent combinatorial complexity.
Due to their highly connected structure, deep learning models are adept at expressing complex functional relationships. Their ability to approximate functions to arbitrary precision stems from the exponential number of piecewise connected hyperplanes they realize as the size of the neural network grows. For an optimization problem with highly complex and nonlinear components, a neural network with rectified linear unit (ReLU) activation, which has demonstrated high performance on regression-based problems, can be incorporated into the optimization formulation as a surrogate model.
Disclosure of Invention
In order to overcome the defects in the prior art, the invention provides a method combining deep learning and multiparameter programming, used to develop an accurate deep-learning-based proxy model and its offline explicit optimal solution.
The method integrates deep learning models, in particular neural networks with rectified linear units, with explicit model predictive control, so that the multiparameter quadratic programming formulation can be recast exactly. Recasting the deep learning model as a set of piecewise linear functions enables the prediction model to be incorporated into a model-based control strategy such as explicit model predictive control. To alleviate the computational burden of solving the piecewise linear optimization problem online, multiparameter programming is used to obtain a complete offline explicit solution to the optimal control problem. However, in online applications where time is critical, determining the optimal solution is challenging due to the inherent combinatorial complexity of the resulting discrete optimization formulation. Multiparameter programming is an effective method to reduce the online computational burden by developing optimal solutions offline. Introducing a more capable surrogate model into the multiparameter optimization formulation strengthens the advantages of the developed parametric solution, namely (i) the ability to obtain an optimal solution without having to solve the optimization problem every time an uncertain parameter is realized, (ii) an a priori mapping of the solution, and (iii) an explicit functional relationship between the optimization variables and the uncertain parameters. One key disadvantage of these multiparameter model formulations is their reliance on linear or piecewise linear constraints; incorporating more complex phenomena into the parametric formulation therefore requires approximation. Developing an accurate approximation model to represent a nonlinear functional relationship is not straightforward, and deep learning models based on ReLU activation functions fill this gap.
A neural network with ReLU activation functions can be represented exactly in a multiparameter quadratic programming formulation. Exactly recasting such a network offers a new approach to combining deep learning models with optimization-based formulations, narrowing the gap between model accuracy and computational performance.
In order to make the objects, technical solutions and advantages of the present invention more clear, the technical solutions of the present invention are further described below. The explicit model prediction control method based on deep learning reconstruction comprises the following steps:
step (1), re-expressing the explicit model predictive control as a multiparameter quadratic programming problem;
the explicit model predictive control problem is as follows:

min_U J(U, x_init) = x_N^T P x_N + Σ_{k=0}^{N-1} (x_k^T Q x_k + u_k^T R u_k)
s.t. x_{k+1} = A x_k + B u_k, k = 0, ..., N-1,
C_x x_k ≤ d_x, C_u u_k ≤ d_u, C_f x_N ≤ d_f, x_0 = x_init  (1)

where U = [u_0^T, ..., u_{N-1}^T]^T ∈ R^{N·n_u} is the vector containing the control input sequence, and P, Q ∈ R^{n_x×n_x} and R ∈ R^{n_u×n_u} are weighting matrices chosen so that P ⪰ 0 and Q ⪰ 0 are positive semidefinite and R ≻ 0 is positive definite; x_k ∈ R^{n_x} is the state vector, u_k ∈ R^{n_u} is the control input, A ∈ R^{n_x×n_x} is the system matrix and B ∈ R^{n_x×n_u} is the input matrix, with (A, B) controllable; k denotes the k-th sampling instant and x_N the terminal state; n_x is the dimension of the system state; n_u is the dimension of the system control input; n_cx is the number of hyperplanes making up the state bounded-polyhedron set; n_cf is the number of hyperplanes making up the terminal bounded-polyhedron set; n_cu is the number of hyperplanes of the input bounded-polyhedron set;
the state, terminal and input constraints are bounded polyhedral sets defined by the matrices C_x ∈ R^{n_cx×n_x}, C_f ∈ R^{n_cf×n_x}, C_u ∈ R^{n_cu×n_u} and the vectors d_x ∈ R^{n_cx}, d_f ∈ R^{n_cf}, d_u ∈ R^{n_cu}; the terminal cost defined by P and the terminal set are selected to guarantee closed-loop stability and recursive feasibility of the optimization problem; for a given prediction horizon N, the set of initial states x_init admitting a feasible solution is called the feasible domain, and the optimization problem (1) can be restated as a multiparameter quadratic programming problem that depends only on the current system state x_init:

min_U (1/2) U^T H U + x_init^T F U
subject to C_c U ≤ T x_init + c_c  (2)

where H ∈ R^{N·n_u×N·n_u}, F ∈ R^{n_x×N·n_u}, C_c ∈ R^{n_ineq×N·n_u}, T ∈ R^{n_ineq×n_x} and c_c ∈ R^{n_ineq}; n_ineq is the total number of inequality constraints in problem (2), and N·n_u is the number of control variables in problem (2);
the solution of the multiparameter quadratic programming problem takes the form of a piecewise affine function:

u*(x_init) = K_i x_init + r_i  if x_init ∈ Θ_i  (3)

for the n_r regions, with K_i ∈ R^{n_u×n_x} and r_i ∈ R^{n_u}, i = 1, ..., n_r; each region Θ_i is described by a polyhedron:

Θ_i = { x ∈ R^{n_x} : a_{i,j} x ≤ b_{i,j}, j = 1, ..., c_i }  (4)

where c_i denotes the number of inequalities describing the polyhedron of region Θ_i;
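To make the piecewise affine form of the explicit solution concrete, the sketch below evaluates a hypothetical control law u*(x_init) = K_i x_init + r_i by locating the polyhedral region containing x_init. The regions, gains and offsets are illustrative values invented for this example, not taken from the patent:

```python
import numpy as np

# Hypothetical 1-D piecewise affine law over two regions Theta_i = {x : A_i x <= b_i};
# all numbers here are illustrative, not from the patent.
regions = [
    {"A": np.array([[1.0], [-1.0]]), "b": np.array([0.0, 5.0]),  # Theta_1: -5 <= x <= 0
     "K": np.array([[-2.0]]), "r": np.array([0.0])},
    {"A": np.array([[1.0], [-1.0]]), "b": np.array([5.0, 0.0]),  # Theta_2: 0 <= x <= 5
     "K": np.array([[-1.0]]), "r": np.array([0.0])},
]

def pwa_control(x_init):
    """Point location: find the region containing x_init, then apply its affine law."""
    for reg in regions:
        if np.all(reg["A"] @ x_init <= reg["b"] + 1e-9):
            return reg["K"] @ x_init + reg["r"]
    raise ValueError("x_init outside the feasible domain")

u = pwa_control(np.array([2.0]))   # lies in Theta_2, so u = -1 * 2 = -2
```

Online evaluation therefore reduces to a table lookup plus one matrix-vector product, which is the key computational advantage of the explicit solution.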
Step (2), constructing a data set
sampling the input space of the function and normalizing it to construct an input/output data set, in which each initial state sampled from the initial state set is an input x_init and the corresponding control input u is an output;
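A minimal sketch of this data-collection step, assuming a box-shaped feasible domain and using a stand-in function in place of the true explicit-MPC solver (both are assumptions for illustration, not from the patent):

```python
import numpy as np

rng = np.random.default_rng(0)

# Sample initial states from an assumed box-shaped feasible domain.
x_lo, x_hi = np.array([-1.0, -1.0]), np.array([1.0, 1.0])
X = rng.uniform(x_lo, x_hi, size=(1000, 2))          # inputs: sampled initial states

def empc_oracle(x):
    """Stand-in for the explicit-MPC solution; a real data set would query the solver."""
    return -0.5 * x.sum(axis=1, keepdims=True)

U = empc_oracle(X)                                    # outputs: control inputs

# Normalize inputs to zero mean / unit variance before training.
X_norm = (X - X.mean(axis=0)) / X.std(axis=0)
```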
step (3), constructing a deep neural network, and training and verifying the deep neural network by using the data set:
the deep neural network comprises an input layer, three hidden layers and an output layer;
On top of the fully connected layers, the neural network adds a projection algorithm so that the network output is guaranteed to satisfy the constraints and feasibility requirements of the control system; the fourth layer of the neural network is a projection algorithm layer, because some data are infeasible during training, and the projection algorithm ensures the feasibility of the network output;
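The projection-layer idea can be illustrated for the simplest case of box input constraints, where the Euclidean projection reduces to an elementwise clip; the bounds below are assumed for illustration (a general polyhedral constraint set would instead require solving a small quadratic program):

```python
import numpy as np

# Assumed box constraint set U = {u : u_lo <= u <= u_hi}; bounds are illustrative.
u_lo, u_hi = np.array([-1.0, -1.0]), np.array([1.0, 1.0])

def project(u_raw):
    """Euclidean projection onto a box is an elementwise clip."""
    return np.clip(u_raw, u_lo, u_hi)

u_feasible = project(np.array([1.7, -0.3]))   # -> [1.0, -0.3]
```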
Each node in a hidden layer of the deep neural network is associated with an activation function; the ReLU activation function is adopted, defined by the equation:
y = max{0, x}  (5)
Since the ReLU activation function is a piecewise linear function, it can equivalently be expressed as:
y = 0 if x ≤ 0; y = x if x > 0  (6)
A ReLU network approximates a target function by decomposing it into a set of piecewise hyperplanes; the number of piecewise hyperplanes grows exponentially with the number of hidden layers of the deep neural network, with a known lower bound of the form:
N_regions ≥ (⌊M/n_x⌋)^((L-1)·n_x) · Σ_{j=0}^{n_x} C(M, j)  (7)
where L is the number of hidden layers, n_x is the number of input variables of the neural network (i.e., the number of system state variables), and M is the number of nodes per hidden layer;
a deep neural network with ReLU activation functions is therefore capable of accurate function approximation, representing arbitrary functions through an exponential number of piecewise affine hyperplanes;
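The described architecture (three ReLU hidden layers plus a linear output layer) can be sketched as a plain forward pass; the layer widths and the 6-state/2-input dimensions below are assumptions for illustration:

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def dnn_forward(x, weights, biases):
    """Forward pass: ReLU hidden layers followed by a linear output layer."""
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = relu(W @ h + b)
    return weights[-1] @ h + biases[-1]       # output layer, no activation

rng = np.random.default_rng(1)
sizes = [6, 16, 16, 16, 2]                    # assumed: n_x = 6 states in, n_u = 2 controls out
weights = [rng.standard_normal((m, n)) * 0.1 for n, m in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(m) for m in sizes[1:]]

u = dnn_forward(rng.standard_normal(6), weights, biases)
```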
step (4), reconstructing the explicit model predictive control with the deep neural network trained and verified in step (3):
a neural network with ReLU activation functions can be recast exactly within the multiparameter quadratic programming problem; the multiparameter quadratic programming formulation allows the neural network to be embedded directly into the optimization problem; the complexity this reconstruction adds to the overall optimization problem lies in managing the binary variables;
for any network layer with n nodes, the output takes the form:
x_k = max{0, W_k x_{k-1} + b_k}  (8)
where k is the layer index, W_k is the weight matrix of the k-th layer, b_k is the bias vector of the k-th layer, x_{k-1} is the output of the previous layer, and x_k is the output of the current layer;
equation (8) is reconstructed exactly within the optimization problem by introducing binary variables; the reconstruction of the k-th hidden layer in the multiparameter problem is as follows:
x_k - s_k = W_k x_{k-1} + b_k
x_k ≤ M y,  s_k ≤ M(1 - y)
x_k ≥ 0, s_k ≥ 0, y ∈ {0,1}^n  (9)
where y is a vector of binary variables, s_k is an auxiliary (slack) variable vector, and M is a large scalar value;
after the formula (8) is reconstructed through the formula (9), the total number of the binary variables y is equal to the total number of nodes forming the hidden layer;
the reconstruction of a neural network with ReLU activation functions is exact; the binary variables force the activation function to output either 0 or x through the pair of big-M constraints; in addition, the numbers of equality and inequality constraints each grow linearly with the total number of nodes n; incorporating the reconstructed neural network into the optimization formulation thus provides an effective strategy for maintaining the high accuracy of the proxy model;
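The big-M reconstruction in formula (9) can be sanity-checked numerically: for any preactivation z = W x_{k-1} + b, the assignment x_k = max{0, z}, s_k = max{0, -z}, with y = 1 exactly where z > 0, satisfies every constraint. The weights and input below are arbitrary illustrative values:

```python
import numpy as np

# Verify that the natural ReLU assignment satisfies the big-M constraints of (9):
# x_k - s_k = W x_{k-1} + b,  x_k <= M*y,  s_k <= M*(1 - y),  x_k, s_k >= 0.
M = 1e3
W = np.array([[1.0, -2.0], [0.5, 1.0]])
b = np.array([0.3, -1.2])
x_prev = np.array([0.8, 0.6])

z = W @ x_prev + b                            # preactivation: [-0.1, -0.2]
x_k = np.maximum(0.0, z)                      # positive part
s_k = np.maximum(0.0, -z)                     # negative part (slack)
y = (z > 0).astype(float)                     # binary indicator per node

assert np.allclose(x_k - s_k, z)              # equality constraint holds
assert np.all(x_k <= M * y + 1e-9)            # output active only when y = 1
assert np.all(s_k <= M * (1 - y) + 1e-9)      # slack active only when y = 0
```

In a full MILP formulation, a solver would choose y; the check above confirms that the constraint system admits exactly the ReLU input-output behavior.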
step (5), optimizing parameters of the depth neural network model reconstructed in the step (4):
regularization techniques can be used after reconstruction to reduce the number of active nodes (nodes with values other than 0); reducing the number of active nodes directly reduces the number of binary variables required in the reconstruction process; the neural network after training and processing minimizes unnecessary binary variables; for example, nodes in the hidden layer that are always positive are represented by linear activation functions; thus, the node does not need a slack variable or a binary variable.
Step (6), realizing attitude control of the helicopter by using the optimized deep neural network;
Preferably, during deep neural network training, a dropout technique is used to train a model, update network parameters and prevent the occurrence of over-fitting.
Preferably, during deep neural network validation, a k-fold cross validation model is employed.
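A minimal sketch of the index splitting underlying k-fold cross-validation (the fold count and sample count are illustrative):

```python
import numpy as np

def kfold_indices(n_samples, k, seed=0):
    """Shuffle indices once, then split them into k disjoint validation folds."""
    idx = np.random.default_rng(seed).permutation(n_samples)
    return np.array_split(idx, k)

folds = kfold_indices(100, 5)
# Each fold serves once as the validation set; the remaining folds form the training set.
train_0 = np.concatenate(folds[1:])
```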
The invention has the following advantages:
1. The invention integrates a deep learning model with explicit model predictive control, addressing the high computational resource demand and long computation times of traditional model predictive control while balancing control precision, prediction accuracy, and computational efficiency.
2. The steps of the method have a firm theoretical foundation; they are simple and clear, with complete theoretical support.
3. The deep neural network of the invention adds a projection algorithm on top of the fully connected layers, so that the output is confined to the feasible set of control inputs. The fourth layer of the network is a projection algorithm layer; because some data are infeasible during training, the projection algorithm improves the feasibility of the control output.
4. The invention introduces binary variables that force the activation function to output either 0 or x through the big-M constraints. Substituting the network into the optimization formulation in this way improves the accuracy of the model.
Drawings
FIG. 1 is a flow chart of a method of the present invention based on the integration of a deep learning model with a ReLU activation function and explicit model predictive control.
FIG. 2 is a schematic diagram of the structure of a feedforward neural network according to the method of the present invention.
FIG. 3 is a comparison of the approximate control law generated by the deep learning of the method of the present invention with the conventional EMPC control law.
FIG. 4 is a comparison of the deep learning of the method of the present invention with the tracking simulation of the state trace of the altitude angle by the conventional EMPC.
Fig. 5 is a comparison diagram of the state trajectory tracking simulation of the deep learning and the conventional EMPC for the pitch angle of the method of the present invention.
FIG. 6 is a comparison of the state trace tracking simulation of the deep learning and conventional EMPC rotation angle by the method of the present invention.
Fig. 7 is a graph comparing the deep learning of the method of the present invention with various experimental data of the conventional EMPC control law.
Detailed Description
The invention is further described with reference to the accompanying drawings:
The invention discloses a deep-learning-reconstructed explicit model predictive control method applied to the field of helicopter attitude control.
A helicopter attitude control method based on deep-learning-reconstructed explicit model predictive control specifically comprises the following steps, with reference to FIG. 1:
Step (1), first, analyzing and dynamically modeling the helicopter system: analyzing the forces acting on each axis and each component of the helicopter system during operation, specifically comprising the following steps:
The state-space equation of the three-degree-of-freedom helicopter system is:
ẋ = A x + B u,  y = C x
According to the modeling analysis of the helicopter, the altitude angle ε, the pitch angle p and the rotation angle r, together with their angular velocities ε̇ (altitude rate), ṗ (pitch rate) and ṙ (rotation rate), are selected as the state vector, i.e. x = [ε p r ε̇ ṗ ṙ]^T.
The voltages of the front and back motors of the three-degree-of-freedom helicopter are selected as the input vector, i.e. u = [V_f V_b]^T, and the altitude angle ε, pitch angle p and rotation angle r are selected as the output vector y = [ε p r]^T; substituting the values from the relevant parameter table yields the coefficient matrices A, B and C of the state equation.
Step (2), data acquisition and processing are carried out through an explicit model predictive control algorithm: re-expressing the explicit model predictive control as a multiparameter quadratic programming problem; the method specifically comprises the following steps:
the explicit model predictive control problem is as follows:

min_U J(U, x_init) = x_N^T P x_N + Σ_{k=0}^{N-1} (x_k^T Q x_k + u_k^T R u_k)
s.t. x_{k+1} = A x_k + B u_k, k = 0, ..., N-1,
C_x x_k ≤ d_x, C_u u_k ≤ d_u, C_f x_N ≤ d_f, x_0 = x_init  (1)

where U = [u_0^T, ..., u_{N-1}^T]^T ∈ R^{N·n_u} is the vector containing the control input sequence, and P, Q ∈ R^{n_x×n_x} and R ∈ R^{n_u×n_u} are weighting matrices chosen so that P ⪰ 0 and Q ⪰ 0 are positive semidefinite and R ≻ 0 is positive definite. x_k ∈ R^{n_x} is the state vector, u_k ∈ R^{n_u} is the control input, A ∈ R^{n_x×n_x} is the system matrix and B ∈ R^{n_x×n_u} is the input matrix, with (A, B) controllable. k denotes the k-th sampling instant and x_N the terminal state; n_x is the dimension of the system state; n_u is the dimension of the system control input; n_cx is the number of hyperplanes making up the state bounded-polyhedron set; n_cf is the number of hyperplanes making up the terminal bounded-polyhedron set; n_cu is the number of hyperplanes of the input bounded-polyhedron set;
the state, terminal and input constraints are bounded polyhedral sets defined by the matrices C_x ∈ R^{n_cx×n_x}, C_f ∈ R^{n_cf×n_x}, C_u ∈ R^{n_cu×n_u} and the vectors d_x ∈ R^{n_cx}, d_f ∈ R^{n_cf}, d_u ∈ R^{n_cu}.
The terminal cost defined by P and the terminal set are selected to guarantee closed-loop stability and recursive feasibility of the optimization problem. For a given prediction horizon N, the set of initial states x_init admitting a feasible solution is called the feasible domain, and the optimization problem (1) can be restated as a multiparameter quadratic programming problem that depends only on the current system state x_init:

min_U (1/2) U^T H U + x_init^T F U
subject to C_c U ≤ T x_init + c_c  (2)

where H ∈ R^{N·n_u×N·n_u}, F ∈ R^{n_x×N·n_u}, C_c ∈ R^{n_ineq×N·n_u}, T ∈ R^{n_ineq×n_x} and c_c ∈ R^{n_ineq}; n_ineq is the total number of inequality constraints in problem (2), and N·n_u is the number of control variables in problem (2);
the solution of the multiparameter quadratic programming problem takes the form of a piecewise affine function:

u*(x_init) = K_i x_init + r_i  if x_init ∈ Θ_i  (3)

for the n_r regions, with K_i ∈ R^{n_u×n_x} and r_i ∈ R^{n_u}, i = 1, ..., n_r. Each region Θ_i is described by a polyhedron:

Θ_i = { x ∈ R^{n_x} : a_{i,j} x ≤ b_{i,j}, j = 1, ..., c_i }  (4)

where c_i denotes the number of inequalities describing the polyhedron of region Θ_i.
Step (3), constructing a data set
sampling the input space of the function and normalizing it to construct an input/output data set, in which each initial state sampled from the initial state set is an input x_init and the corresponding control input u is an output;
Step (4), constructing a deep neural network as shown in fig. 2, and training and verifying the deep neural network by using the data set:
the deep neural network comprises an input layer, three hidden layers and an output layer;
On top of the fully connected layers, the neural network adds a projection algorithm so that the network output is guaranteed to satisfy the constraints and feasibility requirements of the control system; the fourth layer of the neural network is a projection algorithm layer, because some data are infeasible during training, and the projection algorithm ensures the feasibility of the network output;
Each node in a hidden layer is associated with an activation function. Common activation functions include the sigmoid function, the hyperbolic tangent function and the rectified linear unit; the choice of an appropriate activation function depends on the problem at hand, and the rectified linear unit has consistently shown good performance in many applications.
The activation function adopted by the hidden layers of the deep neural network (DNN) is the ReLU activation function, defined by the equation:
y = max{0, x}  (5)
Since it is a piecewise linear function, it can equivalently be expressed as:
y = 0 if x ≤ 0; y = x if x > 0  (6)
A ReLU network approximates a target function by decomposing it into a set of piecewise hyperplanes. The number of piecewise hyperplanes of a ReLU neural network is exponentially related to the number of hidden layers, with a known lower bound of the form:
N_regions ≥ (⌊M/n_x⌋)^((L-1)·n_x) · Σ_{j=0}^{n_x} C(M, j)  (7)
where L is the number of hidden layers, n_x is the number of inputs, and M is the number of nodes per hidden layer. A deep neural network with ReLU activation functions can therefore approximate arbitrary functions accurately through an exponential number of piecewise affine hyperplanes.
During training, each node in the neural network has an associated weight and bias term, and these values are determined to minimize a defined performance criterion. There are many strategies for determining the optimal values, such as gradient descent, stochastic gradient descent and Levenberg-Marquardt. These techniques find a set of locally optimal weights and bias values that minimize the selected performance criterion; determining the ideal global optimum is challenging, but in many cases a local solution is sufficient. Before applying these training algorithms, the input and output data are typically normalized to avoid scaling problems. In the training step, it is important to ensure that the developed neural network does not overfit the data. One simple strategy to avoid overfitting is to use a large amount of data during training; more advanced techniques include cost-function regularization and batch normalization. Dropout prevents overfitting by randomly "dropping out" nodes and their connections during training, which encourages sparsity in the neural network; discarding random nodes forces the network to be resilient and to identify the most significant features of the data set. Cost-function regularization adds an extra penalty term to the minimized objective function; these additional terms yield a trained neural network with fewer active nodes.
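The dropout technique described above can be sketched as an inverted-dropout mask, which rescales the surviving nodes so the expected layer output matches inference; the drop probability is illustrative:

```python
import numpy as np

def dropout(h, p_drop, rng):
    """Inverted dropout: zero each node with probability p_drop, then rescale the
    survivors by 1/(1 - p_drop) so the expected output is unchanged at inference."""
    mask = (rng.random(h.shape) >= p_drop) / (1.0 - p_drop)
    return h * mask

rng = np.random.default_rng(42)
h = np.ones(1000)
h_drop = dropout(h, 0.5, rng)     # roughly half the nodes zeroed, the rest scaled to 2.0
```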
When verifying the feasibility of the deep neural network, several techniques exist to ensure that the network fits the data properly and to provide realistic measures of model validity. Validating the developed neural network is typically done by comparing the expected output of the real model with the predicted output of the network. One common technique is k-fold cross-validation, which uses a validation criterion to ensure that the trained neural network provides a good fit. Various test metrics quantify the fit of the developed model between predicted and measured output data sets, including the mean square error (MSE) and the root mean square error (RMSE).
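The two test metrics can be written out directly; the sample vectors are illustrative:

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean square error between measured and predicted outputs."""
    return float(np.mean((y_true - y_pred) ** 2))

def rmse(y_true, y_pred):
    """Root mean square error, in the same units as the output."""
    return float(np.sqrt(mse(y_true, y_pred)))

y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.1, 1.9, 3.2])
# mse = (0.01 + 0.01 + 0.04) / 3 = 0.02
```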
And (5) reconstructing explicit model prediction control by using the trained and verified deep neural network:
A neural network with ReLU activation functions can be recast exactly within the multiparameter quadratic programming problem. The mp-QP formulation allows the neural network to be embedded directly into the optimization problem. The complexity this reconstruction adds to the overall optimization problem lies in managing the binary variables.
For any network layer with n nodes, the output takes the form:
x_k = max{0, W_k x_{k-1} + b_k}  (8)
where k is the layer index, W_k is the weight matrix of the k-th layer, b_k is the bias vector of the k-th layer, x_{k-1} is the output of the previous layer, and x_k is the output of the current layer.
The importance of the ReLU activation function is its piecewise linear nature. Equation (8) can be recast exactly within the optimization problem by introducing binary variables. The reconstruction of the k-th hidden layer in the multiparameter problem is as follows:
x_k - s_k = W_k x_{k-1} + b_k
x_k ≤ M y,  s_k ≤ M(1 - y)
x_k ≥ 0, s_k ≥ 0, y ∈ {0,1}^n  (9)
where y is a vector of binary variables, s_k is an auxiliary (slack) variable vector, and M is a large scalar value.
After the formula (8) is reconstructed by the formula (9), the total number of the binary variables y is equal to the total number of nodes constituting the hidden layer.
The recast neural network with ReLU activation functions is an exact reconstruction. The binary variables force the activation function to output either 0 or x through the pair of big-M constraints. Furthermore, the numbers of equality and inequality constraints each grow linearly with the total number of nodes n. Incorporating the recast neural network into the optimization formulation provides an effective strategy for maintaining the high accuracy of the proxy model.
Step (6), optimizing the parameters of the deep neural network model reconstructed in step (5):
Training is performed after reconstruction, and regularization techniques can be used to reduce the number of active nodes (nodes with values other than 0). Reducing the number of active nodes directly reduces the number of binary variables required in the recasting process. The trained, processed neural network may also minimize unnecessary binary variables. For example, nodes in the hidden layer that are always positive are represented by linear activation functions. Thus, the node does not need a slack variable or a binary variable.
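The node-reduction idea can be sketched as follows: over a set of sampled states, nodes whose preactivation is always positive behave linearly and nodes that never activate are dead, so only the remaining nodes need binary variables in the reconstruction. The dimensions and data below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)
W, b = rng.standard_normal((8, 4)), rng.standard_normal(8)   # one hidden layer (assumed sizes)
X = rng.standard_normal((500, 4))                            # sampled states (assumed domain)

Z = X @ W.T + b                                # preactivations for every sample, shape (500, 8)
always_on = np.all(Z > 0, axis=0)              # replace with a linear activation
always_off = np.all(Z <= 0, axis=0)            # dead node: remove entirely
needs_binary = ~(always_on | always_off)       # only these keep a binary variable
n_binaries = int(needs_binary.sum())
```

Each node eliminated this way removes one binary variable (and its slack variable) from formulation (9), directly shrinking the mixed-integer problem.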
Case analysis
According to the invention, for the three-degree-of-freedom helicopter, which is a MIMO, high-order and nonlinear system, output data are obtained by controlling the aircraft with explicit model predictive control, and these data are used as training data. A trained deep learning network is then used to run experiments on the altitude, rotation and pitch axes respectively, demonstrating the performance of deep learning combined with explicit model predictive control in this specific three-degree-of-freedom helicopter application; comparing the experimental results shows its superior performance. Under the control of the deep learning network, the control signal can be fed back faster than with explicit model predictive control, shortening the response time, and the autonomous learning capability of the system increases its stability during control.
And obtaining output data of the system by an explicit model prediction control method, and correspondingly combining the value of the input quantity and the corresponding selected output quantity into data. Then analyzing the table to delete unnecessary data, correcting abnormal data and repairing missing data. And finally constructing a data set meeting training requirements, converting the data set into a csv format, and dividing the data set into a training set, a verification set and a test set. The overall steps can be seen by the overall flow chart of fig. 2, where a neural network is built up at tensorflow, the data is imported, and then normalization is performed. Next, 500 rounds of data training were set with a learning rate of 0.01, a mean square error loss function was defined and an optimizer was created. As shown in figure 5 of the specification, after training of 500, the predicted value and the actual value still have an error of 0.12, and although the obtained result has the error with the calculation of the explicit model prediction control, the flight stability of the helicopter is not affected, and the helicopter can still maintain stable flight within the error range. When the comparison is performed on solving time, under the condition that parameters are the same, the deep learning network is obviously higher than the common model prediction control efficiency. The burden on the computer is also greatly reduced in storage space and in computation.
According to the experimental results and the program runs, under the same conditions and on the premise of stable operation, the deep-learning-based model control achieves a faster solving speed and better computational performance than ordinary explicit model predictive control. In terms of control effect, the altitude angle, rotation angle and pitch angle of the three-degree-of-freedom helicopter can be adjusted effectively, an ideal steady state can be reached quickly, and good control performance is achieved.
FIG. 3 is a comparison of the approximate control law generated by the deep learning of the method of the present invention with the conventional EMPC control law.
FIG. 4 is a comparison of the state-trajectory tracking simulation of the altitude angle between the deep learning of the method of the present invention and conventional EMPC.
FIG. 5 is a comparison of the state-trajectory tracking simulation of the pitch angle between the deep learning of the method of the present invention and conventional EMPC.
FIG. 6 is a comparison of the state-trajectory tracking simulation of the rotation angle between the deep learning of the method of the present invention and conventional EMPC.
FIG. 7 is a graph comparing various experimental data of the deep learning of the method of the present invention and the conventional EMPC control law.

Claims (3)

1. A deep-learning-based reconstructed explicit model predictive control method, characterized in that the method comprises the following steps:
step (1), re-expressing the explicit model predictive control as a multi-parametric quadratic programming problem;
the explicit model predictive control problem is as follows:

  min_U  J(x_init, U) = x_N^T P x_N + Σ_{k=0}^{N-1} ( x_k^T Q x_k + u_k^T R u_k )          (1)
  s.t.   x_{k+1} = A x_k + B u_k,  x_0 = x_init
         C_x x_k ≤ c_x,  C_f x_N ≤ c_f,  C_u u_k ≤ c_u,  k = 0, ..., N-1

where U = [u_0^T, ..., u_{N-1}^T]^T is the vector comprising the control input sequence; P, Q and R are weighting matrices, selected so that P ⪰ 0 and Q ⪰ 0 are positive semidefinite and R ≻ 0 is positive definite; x_k ∈ R^{n_x} is the state vector, u_k ∈ R^{n_u} is the control input, A is the system matrix and B is the input matrix, with (A, B) controllable; k denotes the k-th sampling point and x_N the terminal state; n_x denotes the dimension of the system state; n_u denotes the dimension of the system control input; n_cx denotes the number of hyperplanes composing the state bounded polyhedral set; n_cf denotes the number of hyperplanes composing the terminal bounded polyhedral set; n_cu denotes the number of hyperplanes of the input bounded polyhedral set;
the state, terminal and input constraints are bounded polyhedral sets, defined by the matrices C_x ∈ R^{n_cx×n_x}, C_f ∈ R^{n_cf×n_x}, C_u ∈ R^{n_cu×n_u} and the vectors c_x ∈ R^{n_cx}, c_f ∈ R^{n_cf}, c_u ∈ R^{n_cu}; the terminal cost defined by P and the terminal set are selected to ensure the stability of the closed-loop system and the recursive feasibility of the optimization problem; for a given prediction horizon N, the set of initial states x_init with feasible solutions is called the feasible region, and the optimization problem (1) is restated as a multi-parametric quadratic programming problem, which depends only on the current system state x_init:

  min_U  (1/2) U^T H U + x_init^T F U          (2)
  subject to  C_c U ≤ T x_init + c_c

where H ⪰ 0 and F are the cost matrices of (2), C_c ∈ R^{n_ineq×n_U}, T ∈ R^{n_ineq×n_x} and c_c ∈ R^{n_ineq}; n_ineq is the total number of inequalities in the multi-parametric problem, i.e. the number of inequality constraints in equation (2), and n_U = N·n_u is the number of control variables in equation (2);
the solution of the multi-parametric quadratic programming problem takes the form of a piecewise affine function:

  u*(x_init) = K_i x_init + k_i,  if x_init ∈ Θ_i          (3)

for i = 1, ..., n_r regions, with K_i ∈ R^{n_u×n_x} and k_i ∈ R^{n_u}; each region Θ_i is described by a polyhedron:

  Θ_i = { x_init : a_{i,j} x_init ≤ b_{i,j},  j = 1, ..., c_i }          (4)

where c_i denotes the number of inequalities describing the polyhedron of region Θ_i, with a_{i,j} ∈ R^{1×n_x} and b_{i,j} ∈ R;
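The piecewise affine control law above is evaluated online by locating the region Θ_i that contains the current state and applying that region's affine gain. A minimal NumPy sketch of this sequential region search, with hypothetical one-dimensional region data:

```python
import numpy as np

def pwa_control(x, regions):
    """Evaluate a piecewise affine control law u = K_i x + k_i.

    regions: list of (A_i, b_i, K_i, k_i); x belongs to region i when
    A_i @ x <= b_i holds component-wise.
    """
    for A_i, b_i, K_i, k_i in regions:
        if np.all(A_i @ x <= b_i + 1e-9):   # small tolerance at boundaries
            return K_i @ x + k_i
    raise ValueError("x is outside the feasible region")

# Toy 1-D example: u = -x on [-1, 0], u = -2x on [0, 1]
regions = [
    (np.array([[1.0], [-1.0]]), np.array([0.0, 1.0]),
     np.array([[-1.0]]), np.array([0.0])),
    (np.array([[-1.0], [1.0]]), np.array([0.0, 1.0]),
     np.array([[-2.0]]), np.array([0.0])),
]
print(pwa_control(np.array([-0.5]), regions))   # [0.5]
print(pwa_control(np.array([0.5]), regions))    # [-1.]
```

Real explicit MPC implementations replace the linear scan with binary search trees or hash-based point location when the number of regions n_r is large.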
Step (2), constructing a data set
sampling the input space of the function and normalizing it to construct an input/output data set, wherein each initial state sampled from the initial state set x_init is an input, and the corresponding control input u is an output;
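A minimal sketch of this sampling and normalization step; the state bounds and sample count are illustrative assumptions, and the output labels would be the EMPC control inputs evaluated at each sampled state:

```python
import numpy as np

# Hypothetical bounds on the initial-state set x_init (n_x = 3 states)
x_lo = np.array([-1.0, -1.0, -0.5])
x_hi = np.array([ 1.0,  1.0,  0.5])

rng = np.random.default_rng(1)
X = rng.uniform(x_lo, x_hi, size=(10_000, 3))   # sampled initial states (inputs)

# Min-max normalization of each state component to [0, 1]
X_norm = (X - x_lo) / (x_hi - x_lo)

# The labels U (outputs) would be the EMPC control input u*(x) at each sample
print(X_norm.shape)   # (10000, 3)
```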
step (3), constructing a deep neural network, and training and verifying the deep neural network by using the data set:
the deep neural network comprises an input layer, three hidden layers and an output layer;
a projection algorithm is added to the fully connected neural network, ensuring that the output of the neural network satisfies the constraint conditions and feasibility of the control system;
each node in the hidden layers of the deep neural network is associated with an activation function; the ReLU activation function is adopted, defined by the equation:

  y = max{0, x}          (5)
since the ReLU activation function is itself a piecewise linear function, it can also be expressed by the equation:

  y = x, if x ≥ 0;  y = 0, if x < 0          (6)
the ReLU activation function approximates a function by decomposing the original function (5) into a set of piecewise hyperplanes; the number of piecewise hyperplanes grows exponentially with the number of hidden layers of the deep neural network:

  n_hyperplanes ≥ (⌊M/n_x⌋)^{n_x(L-1)} · Σ_{j=0}^{n_x} C(M, j)          (7)

where L is the number of hidden layers, n_x is the number of input variables of the neural network, i.e. the number of system state variables, and M is the number of nodes per hidden layer;
a deep neural network with ReLU activation functions therefore has the capability of exact function approximation for any function composed of an exponential number of piecewise affine hyperplanes;
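An illustrative NumPy sketch of the network of step (3) — an input layer, three ReLU hidden layers and a linear output layer — with the projection step realized here as simple clipping onto box input constraints, one common way to enforce input feasibility; the layer width, bounds and random weights are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)

def dense(n_in, n_out):
    """Random weight matrix and zero bias for one fully connected layer."""
    return rng.normal(0.0, 0.3, (n_in, n_out)), np.zeros(n_out)

n_x, n_u, M_nodes = 3, 1, 16          # state dim, input dim, nodes per layer
layers = [dense(n_x, M_nodes), dense(M_nodes, M_nodes),
          dense(M_nodes, M_nodes), dense(M_nodes, n_u)]

def dnn_control(x, u_min=-1.0, u_max=1.0):
    """Three ReLU hidden layers + linear output, then projection (clipping)
    onto the box input-constraint set so the output u stays feasible."""
    h = x
    for W, b in layers[:-1]:
        h = np.maximum(0.0, h @ W + b)     # ReLU hidden layers
    W, b = layers[-1]
    u = h @ W + b                          # linear output layer
    return np.clip(u, u_min, u_max)        # projection onto constraints

u = dnn_control(np.array([0.2, -0.1, 0.05]))
assert u.shape == (n_u,) and -1.0 <= u[0] <= 1.0
```

For general polyhedral input constraints the clipping would be replaced by a projection onto the polyhedron (a small QP), which is what a projection algorithm in the strict sense computes.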
step (4), reconstructing the explicit model predictive control with the deep neural network trained and verified in step (3):
a neural network with ReLU activation functions can exactly reconstruct the multi-parametric quadratic programming problem; the multi-parametric quadratic programming formulation allows the neural network to be embedded directly into the optimization problem; the complexity that this reconstruction adds to the overall optimization problem lies in managing the binary variables;
for a deep neural network with n nodes per hidden layer, the output of each layer takes the form:

  x_k = max{0, W_k x_{k-1} + b_k}          (8)

where k is the layer index, W_k is the weight matrix of the k-th layer, b_k is the bias vector of the k-th layer, x_{k-1} is the output of the previous layer, and x_k is the output of the current layer;
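The layer equation above is a single matrix-vector operation; a one-line NumPy realization with hypothetical weights:

```python
import numpy as np

def relu_layer(x_prev, W, b):
    """One hidden layer: x_k = max{0, W_k x_{k-1} + b_k}."""
    return np.maximum(0.0, W @ x_prev + b)

# Hypothetical 2x2 layer
W = np.array([[1.0, -1.0],
              [2.0,  0.0]])
b = np.array([-0.5, 0.0])
x_prev = np.array([1.0, 2.0])
print(relu_layer(x_prev, W, b))   # pre-activation [-1.5, 2.0] -> [0. 2.]
```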
equation (8) is reconstructed exactly in the optimization problem by including binary variables; the reconstruction of the k-th hidden layer of the multi-parametric problem is as follows:

  x_k - s_k = W_k x_{k-1} + b_k
  x_k ≤ M y,  s_k ≤ M (1 - y)          (9)
  x_k ≥ 0,  s_k ≥ 0,  y ∈ {0,1}^n

where y is a vector of binary variables, s_k is an auxiliary (slack) variable vector, and M is a large scalar value;
after equation (8) is reconstructed through equation (9), the total number of binary variables y is equal to the total number of nodes composing the hidden layers; the binary variables enable the activation function to output a value of either 0 or W_k x_{k-1} + b_k through the sandwich (big-M) constraints;
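The big-M reconstruction can be checked numerically: fixing each binary y_j = 1 exactly when the pre-activation is positive satisfies all the constraints and reproduces the ReLU output. A sketch with hypothetical pre-activation values (the constant M is an illustrative assumption and must upper-bound the pre-activation magnitudes):

```python
import numpy as np

M = 1e3   # big-M constant; assumed to upper-bound |pre-activation| values

def check_bigM(z, tol=1e-9):
    """For pre-activation z = W_k x_{k-1} + b_k, verify that y = (z > 0)
    satisfies x - s = z, 0 <= x <= M*y, 0 <= s <= M*(1 - y),
    which forces x = max(0, z), the ReLU output."""
    y = (z > 0).astype(float)
    x = np.maximum(0.0, z)     # candidate layer output x_k
    s = np.maximum(0.0, -z)    # auxiliary slack variable s_k
    assert np.allclose(x - s, z)
    assert np.all(x <= M * y + tol) and np.all(s <= M * (1 - y) + tol)
    return x

print(check_bigM(np.array([-2.0, 0.5, 3.0])))   # [0.  0.5 3. ]
```

In the actual mixed-integer program these constraints are handed to a MILP solver, which branches on y; the check above only illustrates why the sandwich constraints pin x to the ReLU value.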
step (5), optimizing the parameters of the deep neural network model reconstructed in step (4):
reducing the number of active nodes of the deep neural network model by adopting a regularization technique;
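One common realization of such a regularization technique is an L1 penalty added to the training loss, which drives weights toward zero and thereby deactivates nodes; a minimal sketch (the penalty weight lam and the weight values are illustrative assumptions):

```python
import numpy as np

def l1_penalty(weights, lam=1e-3):
    """L1 regularization term added to the training loss; drives weights
    (and hence hidden-node activations) toward zero, pruning active nodes."""
    return lam * sum(np.abs(W).sum() for W in weights)

W1 = np.array([[0.5, -0.2], [0.0, 1.0]])
W2 = np.array([[1.5], [-0.5]])
print(round(l1_penalty([W1, W2]), 4))   # 1e-3 * 3.7 = 0.0037
```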
and (6) realizing attitude control of the helicopter by using the optimized deep neural network.
2. The method of claim 1, wherein during deep neural network training, a dropout technique is used to train the model, update network parameters and prevent overfitting.
3. The method of claim 1, wherein during deep neural network validation, k-fold cross-validation of the model is employed.
CN202210313902.5A 2022-03-28 2022-03-28 Depth learning-based reconstruction explicit model prediction control method Active CN114626509B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210313902.5A CN114626509B (en) 2022-03-28 2022-03-28 Depth learning-based reconstruction explicit model prediction control method


Publications (2)

Publication Number Publication Date
CN114626509A CN114626509A (en) 2022-06-14
CN114626509B true CN114626509B (en) 2024-06-14

Family

ID=81904380

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210313902.5A Active CN114626509B (en) 2022-03-28 2022-03-28 Depth learning-based reconstruction explicit model prediction control method

Country Status (1)

Country Link
CN (1) CN114626509B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109615146A (en) * 2018-12-27 2019-04-12 东北大学 A kind of wind power prediction method when ultrashort based on deep learning
CN111580389A (en) * 2020-05-21 2020-08-25 浙江工业大学 Three-degree-of-freedom helicopter explicit model prediction control method based on deep learning

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6064997A (en) * 1997-03-19 2000-05-16 University Of Texas System, The Board Of Regents Discrete-time tuning of neural network controllers for nonlinear dynamical systems




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant