CN115755606B - Automatic optimization method, medium and equipment for carrier controller based on Bayesian optimization - Google Patents


Info

Publication number
CN115755606B
Authority
CN
China
Prior art keywords
function
optimization
representing
bayesian
controller
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211433936.4A
Other languages
Chinese (zh)
Other versions
CN115755606A (en)
Inventor
苏杰
牟剑秋
许正昊
李晓芸
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Youdao Zhitu Technology Co Ltd
Original Assignee
Shanghai Youdao Zhitu Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Youdao Zhitu Technology Co Ltd filed Critical Shanghai Youdao Zhitu Technology Co Ltd
Priority to CN202211433936.4A priority Critical patent/CN115755606B/en
Publication of CN115755606A publication Critical patent/CN115755606A/en
Application granted granted Critical
Publication of CN115755606B publication Critical patent/CN115755606B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T: CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00: Road transport of goods or passengers
    • Y02T10/10: Internal combustion engine [ICE] based vehicles
    • Y02T10/40: Engine management systems

Landscapes

  • Feedback Control In General (AREA)

Abstract

The invention discloses an automatic optimization method, medium and equipment for a carrier automatic driving controller based on Bayesian optimization. Bayesian optimization is used to optimize the performance of the carrier automatic driving controller automatically, replacing manual parameter tuning and grid-search tuning, which are redundant and inefficient. A batch-parallelization technique improves on the analytic proxy function (acquisition function) of Bayesian optimization and raises the efficiency of controller performance optimization, giving the invention clear practical value and technical advancement.

Description

Automatic optimization method, medium and equipment for carrier controller based on Bayesian optimization
Technical Field
The invention belongs to the technical field of automatic driving for intelligent vehicles, and in particular relates to an automatic optimization method, medium and equipment for a carrier automatic driving controller based on Bayesian optimization.
Background
In recent years, as vehicles have rapidly become more intelligent, automatic driving technology has developed vigorously. A controller running a control algorithm is one of the essential modules of an automatic driving vehicle system; it controls the vehicle so that it tracks a reference trajectory. In general, a control algorithm is designed by modeling the vehicle dynamics and building the controller from that model. Because the model has to be linearized and discretized along the way, the resulting control algorithm has several parameters that must be tuned and calibrated before its performance meets the requirements. In the past, control parameters were tuned manually or by grid search. Both approaches are inefficient and cover only a limited part of the parameter space, so the optimal control performance cannot be reached.
To optimize the performance of control algorithms and make the optimization process more efficient, researchers have carried out related work. Marco et al., in "Automatic LQR Tuning Based on Gaussian Process Global Optimization" (IEEE International Conference on Robotics and Automation, 2016), proposed an automatic LQR controller tuning method based on Gaussian-process Bayesian optimization that uses entropy search as the proxy function, so that the optimal parameter set of the LQR controller can be searched automatically and efficiently. Su, Jie et al., in "Autonomous vehicle control through the dynamics and controller learning" (IEEE Transactions on Vehicular Technology, 2018), further considered LQR controller performance optimization with Gaussian-process Bayesian optimization and, to handle the time-varying characteristics of system operation, designed a time-varying lower confidence bound function as the proxy function, which makes the method better suited to time-varying vehicle scenarios. Riboni, A. et al., in "Bayesian optimization and deep learning for steering wheel angle prediction" (Scientific Reports, Nature, 2022), designed a controller with an LSTM backbone network for steering control of automatic driving vehicles and used Bayesian optimization to search the controller parameters automatically.
These results improve the efficiency of controller performance optimization to some extent, but the methods still have limitations. All of them consider single-process, sequential decision making, so the Bayesian optimization procedure is built on an analytic proxy function and cannot be parallelized. The deep-learning control algorithm of Riboni, A. et al. has many parameters to tune, which makes the parameter space searched by Bayesian optimization very high-dimensional. In addition, the value space of some control parameters is densely continuous, so the number of parameter sets to be searched grows sharply, which further challenges the efficiency of the search task.
Disclosure of Invention
In view of these problems, the main purpose of the invention is to provide an automatic optimization method, medium and equipment for a carrier automatic driving controller based on Bayesian optimization, which use a multi-batch parallelized expected improvement function as the proxy function of Bayesian optimization and thereby offer a better solution for the optimization search of automatic driving control algorithms.
To achieve this purpose, the invention adopts the following technical scheme:
An automatic driving carrier controller optimizing method based on Bayesian optimization comprises the following steps:
S1: initialize a sampling data set $\mathcal{D}$;
S2: model the data set $\mathcal{D}$ using a proxy model;
establish a Bayesian optimization proxy function, and loop over the following steps:
S21: obtain the posterior distribution mean and variance by proxy-model regression;
S22: obtain the parameter set to be evaluated from the Bayesian optimization proxy function;
S23: verify the obtained parameter set on the carrier vehicle, and add the resulting data to the sampling data set $\mathcal{D}$;
S3: if the parameter set to be evaluated meets the termination condition, exit the loop of S2 and end, obtaining the index parameters required by the controller.
As a further description of the invention, S1: model the performance index of the controller to be evaluated to obtain an evaluation function; from the reachable domain of the parameter set to be optimized, select n feasible combinations as the parameter sets to be evaluated, run a controller performance experiment on the carrier vehicle with each of them, and collect the effect index data set $\mathcal{D}$;
S2: take the parameter sets evaluated in S1, $X = \{\theta_1, \dots, \theta_n\}$, as input and the corresponding effect index set $Y = \{\hat{J}(\theta_1), \dots, \hat{J}(\theta_n)\}$ as output, and model the effect index data set $\mathcal{D} = \{X, Y\}$ with the proxy model; here $\theta_i$, $i \in [1, n]$, denotes an evaluated parameter set and $\hat{J}(\theta_i)$ denotes the corresponding controller performance effect value; establish a Bayesian optimization proxy function, and then loop over the following steps:
S21: obtain the posterior distribution mean and variance of the effect index data set $\mathcal{D}$ by proxy-model regression;
S22: substitute the posterior mean function and variance function obtained in S21 into the Bayesian optimization proxy function to obtain the recommended solution $\theta_{n+1}$ predicted by the proxy function;
S23: run a controller performance experiment on the carrier vehicle with the parameter set given by the recommended solution from S22, collect the effect index $\hat{J}(\theta_{n+1})$, and add this data to the existing data set $\mathcal{D}$;
S3: when the difference between the effect index and the ideal index of the controller is smaller than a set threshold, or the difference between the posterior distribution mean and the set threshold value is smaller than a set threshold, exit the loop; the recommended solution obtained is the set of index parameters sought for the controller.
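The loop S1 to S3 can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: a one-dimensional parameter, a hypothetical `run_experiment` function standing in for the on-vehicle test, a zero-mean Gaussian process with a fixed squared-exponential kernel, a Monte Carlo expected-improvement score with a single candidate per round, and an ideal index of zero.

```python
import numpy as np

def se_kernel(a, b, length=0.3, signal_var=1.0):
    # Squared-exponential kernel between two 1-D arrays of parameters.
    d = a[:, None] - b[None, :]
    return signal_var * np.exp(-0.5 * (d / length) ** 2)

def gp_posterior(x_train, y_train, x_test, noise_var=1e-4):
    # Zero-mean GP regression: posterior mean and variance at x_test (S21).
    K = se_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    Ks = se_kernel(x_test, x_train)
    mean = Ks @ np.linalg.solve(K, y_train)
    var = se_kernel(x_test, x_test).diagonal() - np.einsum(
        "ij,ji->i", Ks, np.linalg.solve(K, Ks.T))
    return mean, np.maximum(var, 1e-12)

def mc_expected_improvement(mean, var, best, rng, n_samples=64):
    # Monte Carlo estimate of expected improvement for minimization.
    eps = rng.standard_normal((n_samples, len(mean)))
    samples = mean + np.sqrt(var) * eps
    return np.maximum(best - samples, 0.0).mean(axis=0)

def run_experiment(theta):
    # Hypothetical stand-in for the controller experiment on the vehicle.
    return (theta - 0.6) ** 2

def bayes_opt(n_init=4, n_iter=15, ideal=0.0, tol=1e-3, seed=0):
    rng = np.random.default_rng(seed)
    X = np.linspace(0.0, 1.0, n_init)                  # S1: initial samples
    Y = np.array([run_experiment(t) for t in X])
    candidates = np.linspace(0.0, 1.0, 201)
    for _ in range(n_iter):
        mean, var = gp_posterior(X, Y, candidates)     # S21
        acq = mc_expected_improvement(mean, var, Y.min(), rng)
        theta_next = candidates[np.argmax(acq)]        # S22
        y_next = run_experiment(theta_next)            # S23
        X, Y = np.append(X, theta_next), np.append(Y, y_next)
        if abs(y_next - ideal) < tol:                  # S3
            break
    return X[np.argmin(Y)], Y.min()

best_theta, best_cost = bayes_opt()
```

The candidate grid replaces a proper optimizer of the proxy function only to keep the sketch short; a real deployment would optimize over the mixed parameter space described in the text.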
As a further description of the invention, the evaluation function of the vehicle system control performance index in S1 is modeled as:
$$\hat{J}(\theta) = J(\theta) + \varepsilon$$
where $J(\theta)$ denotes the control performance obtained when the parameter set $\theta$ is applied to the system, a weighted fusion of control accuracy, vehicle safety and control cost; $\varepsilon$ denotes Gaussian-distributed noise with variance $\sigma_\varepsilon^2$; and $\hat{J}(\theta)$ denotes the corresponding noisy evaluation result measured after the parameter set $\theta$ is applied to the system.
As a further description of the present invention, the reachable domain space of the parameter set to be optimized in S1 is a mixed space, where the mixed space includes a discrete space and a continuous space, a part of the parameter set to be optimized belongs to the discrete space, and a part of the parameter set belongs to the continuous space.
As a further description of the invention, the proxy model in S2 is a Gaussian process, which is fully specified by a mean function $\mu(X)$ and a covariance function $K(X, X)$;
the mean function $\mu(X)$ is:
Figure GDA0004258359300000041
where $\psi(X)$ denotes a polynomial function of order p, $\alpha_p$ denotes the coefficient of the corresponding order, and C is a constant;
the covariance function $K(X, X)$ is:
$$K(X, X) = \begin{bmatrix} k(\theta_1, \theta_1) & \cdots & k(\theta_1, \theta_n) \\ \vdots & \ddots & \vdots \\ k(\theta_n, \theta_1) & \cdots & k(\theta_n, \theta_n) \end{bmatrix}$$
where the kernel function $k(\theta_i, \theta_j)$, $i \in [1, n]$, $j \in [1, n]$, has the following complete form:
Figure GDA0004258359300000043
where the diagonal matrix (Figure GDA0004258359300000044) holds the length-scale hyperparameters, and $\lambda_i$, $i \in [1, n]$, denotes the length-scale parameter corresponding to the parameter $\theta_i$;
the objective function of the proxy-model regression is the logarithm of the marginal likelihood:
Figure GDA0004258359300000045
where (Figure GDA0004258359300000046) denotes the likelihood distribution and (Figure GDA0004258359300000047) denotes the set of all hyperparameters of the Gaussian process.
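The kernel and the log marginal likelihood can be illustrated with a short sketch. The exact kernel in the figure is not reproduced in this text, so the code assumes a squared-exponential kernel with one length scale per parameter dimension and a zero prior mean; the function names `ard_se_kernel` and `log_marginal_likelihood` are hypothetical.

```python
import numpy as np

def ard_se_kernel(A, B, lengthscales, signal_var=1.0):
    # Assumed kernel: squared exponential with per-dimension length scales.
    A_s = A / lengthscales
    B_s = B / lengthscales
    sq = (A_s ** 2).sum(1)[:, None] + (B_s ** 2).sum(1)[None, :] - 2.0 * A_s @ B_s.T
    return signal_var * np.exp(-0.5 * np.maximum(sq, 0.0))

def log_marginal_likelihood(X, y, lengthscales, signal_var, noise_var):
    # log p(y | X, hyperparameters) for a zero-mean GP with Gaussian noise.
    n = len(y)
    K = ard_se_kernel(X, X, lengthscales, signal_var) + noise_var * np.eye(n)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return -0.5 * y @ alpha - np.log(np.diag(L)).sum() - 0.5 * n * np.log(2.0 * np.pi)

# Example: score one hyperparameter setting on three two-dimensional parameter sets.
X_demo = np.array([[0.1, 1.0], [0.5, 2.0], [0.9, 0.5]])
y_demo = np.array([0.3, 0.1, 0.4])
lml_demo = log_marginal_likelihood(X_demo, y_demo, np.array([0.5, 1.0]), 1.0, 1e-2)
```

Hyperparameters are then chosen by optimizing this quantity over the length scales, signal variance and noise variance, for example with a gradient-based optimizer.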
As a further description of the invention, the Bayesian optimization proxy function is modeled as:
Figure GDA0004258359300000048
where AC(X) denotes the proxy function, N denotes the number of Monte Carlo samples, i denotes the index of the current sample, q denotes the total number of parallel batches, and j denotes the current batch; $X = \{X_1, \dots, X_q\}$ is the parameter set divided into q batches, where $X_q$ denotes the parameter sets of the q-th batch;
(Figure GDA0004258359300000049) denotes the j-th batch of the posterior mean function (Figure GDA00042583593000000410); L(X) is the result of the Cholesky decomposition of the Gaussian-process posterior covariance (Figure GDA00042583593000000411), that is, it satisfies (Figure GDA00042583593000000412); (Figure GDA00042583593000000413) denotes a sample from the standard normal distribution; and (Figure GDA00042583593000000414) denotes the best value observed in the current data set, min Y.
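The quantities named above (N Monte Carlo samples, q batch points, a Cholesky factor of the posterior covariance, standard normal samples, and the best observed value) match the usual Monte Carlo form of batch expected improvement. The sketch below follows that usual form under stated assumptions: the function name, the jitter term and the `minimize` flag are illustrative choices, and the sign convention is an assumption that may differ from the figure.

```python
import numpy as np

def q_expected_improvement(mean, cov, best, n_samples=64, minimize=True, seed=0):
    """Monte Carlo batch expected improvement for q candidate points.

    mean: posterior mean at the q points, shape (q,)
    cov:  posterior covariance at the q points, shape (q, q)
    best: best value observed so far (min Y when minimizing)
    """
    rng = np.random.default_rng(seed)
    q = len(mean)
    # Cholesky factor L with L @ L.T = cov (small jitter for stability).
    L = np.linalg.cholesky(cov + 1e-10 * np.eye(q))
    eps = rng.standard_normal((n_samples, q))      # standard normal samples
    samples = mean + eps @ L.T                     # joint posterior samples
    improvement = best - samples if minimize else samples - best
    # Best improvement within the batch, averaged over the N samples.
    return np.maximum(improvement, 0.0).max(axis=1).mean()

# Example: a batch of q = 2 correlated candidates when minimizing.
qei_demo = q_expected_improvement(
    np.array([0.2, 0.3]), np.array([[0.04, 0.01], [0.01, 0.09]]), best=0.25)
```

Because the score is computed from joint samples of all q points, the q candidates can be proposed together and evaluated in parallel, which is the property the text relies on.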
As a further description of the invention, the posterior mean function (Figure GDA00042583593000000415) is calculated as:
Figure GDA00042583593000000416
where I denotes the identity matrix, and
Figure GDA00042583593000000417
Figure GDA0004258359300000051
the posterior distribution covariance (Figure GDA0004258359300000052) is calculated as:
Figure GDA0004258359300000053
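For reference, the standard Gaussian-process regression posterior has the following form. This is a textbook statement, not a transcription of the figures: the symbols $X_*$ (points to predict), $\sigma_\varepsilon^2$ (noise variance), $\hat{\mu}$ and $\hat{\Sigma}$ are assumed notation, and the figures may group the terms differently.

```latex
\hat{\mu}(X_*) = \mu(X_*) + K(X_*, X)\,\bigl[K(X, X) + \sigma_\varepsilon^2 I\bigr]^{-1}\,\bigl(Y - \mu(X)\bigr)

\hat{\Sigma}(X_*) = K(X_*, X_*) - K(X_*, X)\,\bigl[K(X, X) + \sigma_\varepsilon^2 I\bigr]^{-1}\,K(X, X_*)
```

The identity matrix $I$ enters only through the noise term, which is consistent with the remark in the text that $I$ denotes the identity matrix.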
As a further description of the invention, the Bayesian optimization proxy function also contains an extremum operation whose form depends on whether the objective function is to be minimized or maximized: a minimization requirement gives the proxy function a minimizing operation, and a maximization requirement gives it a maximizing operation.
A computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the bayesian-optimization-based method of optimizing an autopilot controller.
An automatic driving carrier controller optimizing device based on Bayesian optimization comprises a memory for storing a computer program and a processor, wherein the processor realizes the automatic driving carrier controller optimizing method based on Bayesian optimization when executing the computer program.
Compared with the prior art, the invention has the following technical effects:
The invention provides an automatic optimization method, medium and equipment for a carrier automatic driving controller based on Bayesian optimization. Bayesian optimization is used to optimize the performance of the carrier automatic driving controller automatically, replacing manual parameter tuning and grid-search tuning, which are redundant and inefficient. A batch-parallelization technique improves on the analytic proxy function of Bayesian optimization and raises the efficiency of controller performance optimization, giving the invention clear practical value and technical advancement.
Drawings
FIG. 1 is a flow chart of an autopilot vehicle controller optimization method based on Bayesian optimization;
FIG. 2 is a schematic diagram of how the controller performance of the candidate parameter sets changes while Bayesian optimization runs;
FIG. 3 is a schematic diagram of the trajectory tracking result of an LQR controller designed with the parameter set obtained by Bayesian optimization.
Detailed Description
The invention is described in detail below with reference to the attached drawing figures:
An automatic driving carrier controller optimizing method based on Bayesian optimization, referring to FIGS. 1 to 3, comprises the following steps:
S1: initialize a sampling data set $\mathcal{D}$;
S2: model the data set $\mathcal{D}$ using a proxy model;
establish a Bayesian optimization proxy function, and loop over the following steps:
S21: obtain the posterior distribution mean and variance by proxy-model regression;
S22: obtain the parameter set to be evaluated from the Bayesian optimization proxy function;
S23: verify the obtained parameter set on the carrier vehicle, and add the resulting data to the sampling data set $\mathcal{D}$;
S3: if the parameter set to be evaluated meets the termination condition, exit the loop of S2 and end, obtaining the index parameters required by the controller.
Specifically, this embodiment describes the above steps in detail:
S1: model the performance index of the controller to be evaluated to obtain an evaluation function; from the reachable domain of the parameter set to be optimized, select n feasible combinations as the parameter sets to be evaluated, run a controller performance experiment on the carrier vehicle with each of them, and collect the effect index data set $\mathcal{D}$.
The evaluation function of the vehicle system control performance index is modeled as:
$$\hat{J}(\theta) = J(\theta) + \varepsilon$$
where $J(\theta)$ denotes the control performance obtained when the parameter set $\theta$ is applied to the system, a weighted fusion of control accuracy, vehicle safety and control cost; $\varepsilon$ denotes Gaussian-distributed noise with variance $\sigma_\varepsilon^2$; and $\hat{J}(\theta)$ denotes the corresponding noisy evaluation result measured after the parameter set $\theta$ is applied to the system.
S2: set S1 parameter set to be evaluated X= { θ 1 ,…,θ n As input, the effect index set corresponding to the parameter set to be evaluated
Figure GDA00042583593000000610
As an output, use agent model for effect index dataset +.>
Figure GDA00042583593000000611
Is set up;
wherein θ i ,i∈[1,n]Representing the set of parameters that have been evaluated,
Figure GDA00042583593000000612
representing a controller performance effect value;
in this embodiment, the proxy model is preferably set as a gaussian process, but is not limited to a gaussian process, and the gaussian process is completely described by a mean function μ (X) and a covariance function K (X, X);
the mean function μ (X) is:
Figure GDA0004258359300000071
wherein ψ (X) represents a polynomial function of order p, α p Coefficients representing the corresponding orders, C being a constant;
the covariance function K (X, X) is:
Figure GDA0004258359300000072
wherein the kernel function k (θ i ,θ j ),i∈[1,n],j∈[1,n]The complete form is as follows:
Figure GDA0004258359300000073
wherein, the diagonal array
Figure GDA0004258359300000074
Representing length-stretching hyper-parameters lambda i ,i∈[1,n]Representing the parameter θ i Corresponding telescoping parameters.
The objective function of the gaussian process regression is the logarithm of the edge likelihood distribution as follows:
Figure GDA0004258359300000075
wherein,,
Figure GDA0004258359300000076
representing likelihood distribution +.>
Figure GDA0004258359300000077
Representing all gaussian process related hyper-parameter sets.
Further, a Bayesian optimization proxy function is established. In this embodiment, the Bayesian optimization proxy function is preferably modeled as:
Figure GDA0004258359300000078
where AC(X) denotes the proxy function, N denotes the number of Monte Carlo samples, i denotes the index of the current sample, q denotes the total number of parallel batches, and j denotes the current batch; $X = \{X_1, \dots, X_q\}$ is the parameter set divided into q batches, where $X_q$ denotes the parameter sets of the q-th batch;
(Figure GDA0004258359300000079) denotes the j-th batch of the posterior mean function (Figure GDA00042583593000000710); L(X) is the result of the Cholesky decomposition of the Gaussian-process posterior covariance (Figure GDA00042583593000000711), that is, it satisfies (Figure GDA00042583593000000712); (Figure GDA00042583593000000713) denotes a sample from the standard normal distribution; and (Figure GDA00042583593000000714) denotes the best value observed in the current data set, namely min Y.
After the Bayesian optimization proxy function is determined, the following steps are looped for the effect experiments on the parameter sets to be evaluated:
S21: obtain the posterior distribution mean and variance of the effect index data set $\mathcal{D}$ by proxy-model regression;
the posterior mean function (Figure GDA0004258359300000082) is calculated as:
Figure GDA0004258359300000083
where I denotes the identity matrix, and
Figure GDA0004258359300000084
Figure GDA0004258359300000085
the posterior distribution covariance (Figure GDA0004258359300000086) is calculated as:
Figure GDA0004258359300000087
S22: substitute the posterior mean function and variance function obtained in S21 into the Bayesian optimization proxy function to obtain the recommended solution $\theta_{n+1}$ predicted by the proxy function;
S23: run a controller performance experiment on the carrier vehicle with the parameter set given by the recommended solution from S22, collect the effect index $\hat{J}(\theta_{n+1})$, and add this data to the existing effect index data set $\mathcal{D}$ of S2;
S3: when the difference between the effect index and the ideal index of the controller is smaller than a set threshold, or the difference between the posterior distribution mean and the set threshold value is smaller than a set threshold, exit the loop of S2; the recommended solution obtained is the set of index parameters sought for the controller.
it should be noted that, the present embodiment is not limited to the type of the autopilot controller, and may be used for autopilot controllers with various parameter optimization requirements.
In one embodiment, the vehicle is modeled with the commonly used bicycle model, which is linearized and discretized into the form:
$$z_{k+1} = A z_k + B u_k, \qquad (1)$$
where $z_k = [e,\ d\_e,\ th\_e,\ d\_th\_e,\ delta\_v]^{T}$ denotes the system state vector: e is the lateral offset error of trajectory tracking, d_e is the derivative of the lateral offset error, th_e is the angle offset error of trajectory tracking, d_th_e is the derivative of the angle offset error, and delta_v is the difference between the current speed and the planned speed. $u_k = [delta,\ acc]^{T}$ denotes the system control vector, where delta is the steering angle and acc is the longitudinal acceleration.
The matrices A and B are, respectively:
Figure GDA0004258359300000091
where dt denotes the discrete time step and v denotes the vehicle speed.
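The matrices in the figure are not reproduced in this text. As an illustration only, the sketch below builds one commonly used kinematic-bicycle error-state model with the same state and control ordering; the entries, the function name `error_state_model` and the `wheelbase` parameter are assumptions and may differ from the patent's own matrices.

```python
import numpy as np

def error_state_model(v, dt=0.1, wheelbase=0.5):
    """Assumed discrete error-state model z_{k+1} = A z_k + B u_k.

    State   z = [e, d_e, th_e, d_th_e, delta_v]
    Control u = [delta, acc]
    """
    A = np.zeros((5, 5))
    A[0, 0] = 1.0   # lateral error carries over
    A[0, 1] = dt    # and integrates its derivative
    A[1, 2] = v     # lateral error rate driven by the heading error
    A[2, 2] = 1.0   # heading error carries over
    A[2, 3] = dt    # and integrates its derivative
    A[4, 4] = 1.0   # speed error carries over
    B = np.zeros((5, 2))
    B[3, 0] = v / wheelbase  # steering angle drives the heading error rate
    B[4, 1] = dt             # acceleration integrates into the speed error
    return A, B

A_demo, B_demo = error_state_model(v=2.0)
```

Because A and B depend on the speed v, they have to be rebuilt at each control step as the vehicle speed changes.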
The control performance objective function is modeled as follows:
Figure GDA0004258359300000092
where (Figure GDA0004258359300000093) denotes the expected value and M is the number of experiments. The infinite-horizon optimization objective has to be approximated over a finite horizon, and the expectation is replaced by a noisy estimate, which is approximately characterized by the following function:
Figure GDA0004258359300000094
where (Figure GDA0004258359300000095) denotes Gaussian-distributed noise with variance (Figure GDA0004258359300000096). The designed controller is a linear quadratic regulator (LQR), and Q and R denote the state weight matrix and the control weight matrix, respectively. The state weight parameter corresponding to the position error term is taken as Q[0, 0] = θ[0, 0], and the control weight parameter of the corresponding control quantity is taken as R[0, 0] = θ[0, 1].
The LQR controller is expressed as:
$$u_k = -F_\theta z_k, \qquad (5)$$
where $F_\theta$ is calculated as:
Figure GDA0004258359300000097
and is obtained from the solution of the following algebraic Riccati equation:
Figure GDA0004258359300000098
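The gain computation can be sketched with the standard discrete-time LQR formulas, which are assumed here to correspond to the two figures: the Riccati solution P satisfies $P = A^T P A - A^T P B (R + B^T P B)^{-1} B^T P A + Q$, and the gain is $F = (R + B^T P B)^{-1} B^T P A$. The fixed-point iteration and the function names are illustrative choices, not the patent's stated procedure.

```python
import numpy as np

def solve_dare(A, B, Q, R, max_iter=500, tol=1e-9):
    # Iterate the discrete algebraic Riccati recursion until it stops changing.
    P = Q.copy()
    for _ in range(max_iter):
        BtP = B.T @ P
        P_next = A.T @ P @ A - A.T @ P @ B @ np.linalg.solve(R + BtP @ B, BtP @ A) + Q
        if np.max(np.abs(P_next - P)) < tol:
            return P_next
        P = P_next
    return P

def lqr_gain(A, B, Q, R):
    # Feedback gain F such that u_k = -F z_k.
    P = solve_dare(A, B, Q, R)
    return np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

# Example: scalar system z_{k+1} = z_k + u_k with unit weights.
F_demo = lqr_gain(np.array([[1.0]]), np.array([[1.0]]),
                  np.array([[1.0]]), np.array([[1.0]]))
```

For the scalar example the Riccati equation reduces to $P^2 - P - 1 = 0$, so P is the golden ratio and the gain is $P / (1 + P) \approx 0.618$.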
The LQR controller is used for trajectory tracking control. The reference trajectory for tracking is obtained with the following cubic spline interpolation function:
Figure GDA0004258359300000099
where pos denotes the reference position and $h_1, \dots, h_{m+1}$ denote the m+1 reference anchor points; $a_1, b_1, c_1, d_1, \dots, a_m, b_m, c_m, d_m$ are the corresponding coefficients. In this example, the lateral positions of the reference trajectory anchor points are set to [0.0, 6.0, 12.5, 10.0, 17.5, 20.0, 25.0] and the longitudinal positions are set to [0.0, -3.0, -5.0, 6.5, 3.0, 0.0, 0.0].
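One common way to write a cubic spline with m+1 anchor points and m coefficient sets is the piecewise form below. It is offered as a plausible reading of the figure rather than a transcription; in particular, using the offset $(h - h_i)$ inside each segment is an assumed parameterization.

```latex
\mathrm{pos}(h) = a_i + b_i\,(h - h_i) + c_i\,(h - h_i)^2 + d_i\,(h - h_i)^3,
\qquad h \in [h_i, h_{i+1}], \quad i = 1, \dots, m
```

The coefficients are fixed by requiring the segments to pass through the anchor points and to have continuous first and second derivatives where they join. With the seven anchor points of this example, m = 6 segments are fitted.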
Construct the initial data set $\mathcal{D}$ for the proxy model of Bayesian optimization. In this embodiment, n values of θ are taken to form $X = \{\theta_1, \dots, \theta_n\}$; each parameter set is substituted into the LQR controller for trajectory tracking, which yields the controller effect evaluation set $Y$. In this embodiment, each of the two components of θ takes its values from [0.0001, 0.001, 0.01, 0.1, 1, 10, 100, 1000], so n = 8 × 8 = 64.
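The initial grid can be built as the Cartesian product of the eight levels, which gives the 64 parameter sets mentioned above. In the sketch, `weights_from_theta` is a hypothetical helper that places the two components of θ into Q[0, 0] and R[0, 0]; leaving the remaining diagonal entries at 1 is an assumption, since the text does not state their values.

```python
import itertools
import numpy as np

levels = [0.0001, 0.001, 0.01, 0.1, 1, 10, 100, 1000]

# All combinations of the two tunable weights: 8 x 8 = 64 parameter sets.
X_init = np.array(list(itertools.product(levels, repeat=2)), dtype=float)

def weights_from_theta(theta, n_state=5, n_ctrl=2):
    # Hypothetical mapping from a parameter set to the LQR weight matrices.
    Q = np.eye(n_state)   # assumed default for the untuned state weights
    R = np.eye(n_ctrl)    # assumed default for the untuned control weights
    Q[0, 0] = theta[0]    # weight on the position error term
    R[0, 0] = theta[1]    # weight on the corresponding control quantity
    return Q, R

Q_demo, R_demo = weights_from_theta(X_init[0])
```

Each row of `X_init` would then be turned into weight matrices, run through the tracking experiment, and its measured cost stored in Y.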
Enter the Bayesian optimization main loop.
First, Gaussian process regression is used to obtain the posterior distribution of the data set $\mathcal{D}$. Without loss of generality, the prior mean function of the Gaussian process is taken as zero, and the prior covariance function is computed over the first n-1 points as follows:
$$K(X, X) = \begin{bmatrix} k(\theta_1, \theta_1) & \cdots & k(\theta_1, \theta_{n-1}) \\ \vdots & \ddots & \vdots \\ k(\theta_{n-1}, \theta_1) & \cdots & k(\theta_{n-1}, \theta_{n-1}) \end{bmatrix}$$
where the kernel function $k(\theta_i, \theta_j)$, $i \in [1, n-1]$, $j \in [1, n-1]$, has the complete form:
Figure GDA0004258359300000105
where the diagonal matrix (Figure GDA0004258359300000106) holds the length-scale hyperparameters, and $\lambda_i$, $i \in [1, n-1]$, denotes the length-scale parameter corresponding to the parameter $\theta_i$.
The hyperparameters are all obtained by minimizing the logarithm of the marginal likelihood:
Figure GDA0004258359300000107
where (Figure GDA0004258359300000108) denotes the likelihood distribution and (Figure GDA0004258359300000109) denotes the set of all hyperparameters of the Gaussian process.
The posterior mean function (Figure GDA00042583593000001010) is calculated as:
Figure GDA00042583593000001011
where I denotes the identity matrix, and
Figure GDA00042583593000001012
Figure GDA00042583593000001013
The posterior distribution covariance (Figure GDA00042583593000001014) is calculated as:
Figure GDA0004258359300000111
Second, the next point to be evaluated, as recommended by Bayesian optimization, is obtained with the following proxy function:
Figure GDA0004258359300000112
where AC(X) denotes the proxy function, N denotes the number of Monte Carlo samples, i denotes the index of the current sample, q denotes the total number of parallel batches, and j denotes the current batch. $X = \{X_1, \dots, X_q\}$ is the parameter set divided into q batches, where $X_q$ denotes the parameter sets of the q-th batch.
(Figure GDA0004258359300000113) denotes the j-th batch of the posterior mean function (Figure GDA0004258359300000114); L(X) is the result of the Cholesky decomposition of the Gaussian-process posterior covariance (Figure GDA0004258359300000115), that is, it satisfies (Figure GDA0004258359300000116); (Figure GDA0004258359300000117) denotes a sample from the standard normal distribution; and (Figure GDA0004258359300000118) denotes the best value observed in the current data set, namely min Y.
Third, a controller performance experiment is run on the carrier vehicle with $\theta_{n+1}$, the effect index $\hat{J}(\theta_{n+1})$ is collected, this data is added to the existing data set $\mathcal{D}$, and the posterior distribution of Bayesian optimization is updated.
When the difference between the performance index and the ideal index of the controller, or the difference between the posterior distribution mean of the proxy model and the set threshold value, is smaller than a set threshold, the Bayesian optimization main loop is exited and the parameters sought are obtained.
The above algorithm is deployed on a computer medium and device. In this embodiment, the computer is a notebook computer with an Intel Core i5-10210U CPU and 16 GB of memory, running the Windows 10 operating system with Python 3.9.6, PyTorch 1.12.1, GPyTorch 1.9.0, BoTorch 0.7.2, NumPy 1.23.3 and Matplotlib 3.6.0.
The program operating parameters are configured as follows: the wheel diameter of the vehicle is 0.5 m and the maximum steering angle is 45 degrees. The discrete sampling time of the dynamic model is 0.1 s. The Gaussian process model is a single-task Gaussian process, and the number of initial samples is set to 16. Bayesian optimization is run three times in total, each run searching for 16 rounds; q of the proxy function qEI is set to 1, the number of Monte Carlo samples is set to 64, and the search bounds are set to [0.0001, 100].
FIG. 2 shows how the objective function (the position error of trajectory tracking by the LQR controller) changes while the method of this embodiment (the Bayesian-optimization-based automatic optimization method for the carrier automatic driving control algorithm described in S1 to S3) searches automatically for the optimal parameter set. The result shows that the method of this embodiment can optimize the controller parameter set effectively and automatically.
FIG. 3 shows the trajectory tracking control result of the LQR controller designed with the parameters obtained by the method of this embodiment (the Bayesian-optimization-based automatic optimization method for the carrier automatic driving control algorithm described in S1 to S3 above).
In addition, in other embodiments, the present invention may also provide a bayesian-optimized-based autopilot controller optimization apparatus including a memory and a processor;
the memory is used for storing a computer program;
the processor is configured to implement the automatic optimization method of the vehicle autopilot controller based on bayesian optimization as described in S1 to S3 above when executing the computer program.
In addition, in other embodiments, the present invention may further provide a computer readable storage medium, where a computer program is stored, where the computer program, when executed by a processor, can implement the automatic optimization method for a vehicle autopilot controller based on bayesian optimization as described in S1 to S3 above.
It should be noted that the Memory may include a random access Memory (Random Access Memory, RAM) or a Non-Volatile Memory (NVM), such as at least one magnetic disk Memory. The processor may be a general-purpose processor, including a central processing unit (Central Processing Unit, CPU), a neural network processor (Neural Processor Unit, NPU), etc.; but also digital signal processors (Digital Signal Processing, DSP), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), field programmable gate arrays (Field-Programmable Gate Array, FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components. Of course, the apparatus should also have necessary components to implement the program operation, such as a power supply, a communication bus, and the like.
The above embodiments are only intended to illustrate the technical solution of the present invention and not to limit it; other modifications and equivalent substitutions made by those skilled in the art to the technical solution of the present invention, without departing from its spirit and scope, shall fall within the scope of the claims of the present invention.

Claims (8)

1. A Bayesian-optimization-based method for optimizing an autopilot vehicle controller, characterized by comprising the following steps:
S1: initializing a sample dataset (Figure QLYQS_1);
S2: modeling the dataset (Figure QLYQS_2) using a surrogate model;
establishing a Bayesian optimization proxy function, and looping over the following steps:
S21: obtaining the posterior distribution mean and variance through surrogate model regression;
S22: obtaining a parameter set to be evaluated from the Bayesian optimization proxy function;
S23: verifying the obtained parameter set to be evaluated on the vehicle, and adding the resulting data to the sample dataset (Figure QLYQS_3);
S3: if the parameter set to be evaluated satisfies the termination condition, exiting the loop of step S2, and the procedure ends with the index parameters obtained for the controller;
the Bayesian optimization proxy function in S22 is modeled as:
Figure QLYQS_4
wherein $AC(X)$ denotes the proxy function, $N$ denotes the number of Monte Carlo sampling points, $i$ denotes the index of the current sampling point, $q$ denotes the total number of parallel batches, and $j$ denotes the current batch; $X = \{X_1, \dots, X_q\}$ denotes the parameter set divided into $q$ batches, where $X_q$ denotes the parameter set of the $q$-th batch; (Figure QLYQS_5) denotes the $j$-th batch of the posterior mean function (Figure QLYQS_6); $L(x)$ is obtained by Cholesky decomposition of the Gaussian process posterior distribution covariance (Figure QLYQS_7); (Figure QLYQS_8) denotes a standard normal distribution sample; and (Figure QLYQS_9) denotes the optimal value $\min Y$ observed in the current dataset;
the posterior mean function (Figure QLYQS_10) is calculated as follows:
Figure QLYQS_11
wherein $I$ denotes the identity matrix; $X_{1:n-1} = \{\theta_1, \dots, \theta_{n-1}\}$ denotes the set of parameters to be evaluated; $\theta_1, \dots, \theta_n$ denote the evaluated parameter sets; $k(\theta_n, X_{1:n-1}) = [k(\theta_1, \theta_n), \dots, k(\theta_{n-1}, \theta_n)]^T$ denotes the covariance function between $\theta_n$ and $X_{1:n-1}$; (Figure QLYQS_12) denotes the effect indices corresponding to the evaluated parameter sets; (Figure QLYQS_13) denotes a controller performance effect value; and (Figure QLYQS_14) denotes the noise variance;
the posterior distribution covariance (Figure QLYQS_15) is calculated as follows:
Figure QLYQS_16
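The equation images of claim 1 are not reproduced in this text, so the following is only a minimal Python sketch of the kind of computation the claim describes, not the patented formulas themselves. It assumes a zero-mean Gaussian process with a squared-exponential kernel (the claim does not fix the kernel), the textbook posterior mean and covariance, and a Monte Carlo estimate over a batch of q candidates that draws samples as mean plus Cholesky factor times a standard normal vector. The names `rbf_kernel`, `gp_posterior` and `mc_batch_improvement` are hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0):
    """Squared-exponential kernel (assumed; not specified in claim 1)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length_scale**2)

def gp_posterior(X_train, y_train, X_cand, noise_var=1e-4):
    """Textbook zero-mean GP posterior mean and covariance at X_cand."""
    K = rbf_kernel(X_train, X_train) + noise_var * np.eye(len(X_train))
    K_s = rbf_kernel(X_train, X_cand)
    K_ss = rbf_kernel(X_cand, X_cand)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    V = np.linalg.solve(L, K_s)
    mean = K_s.T @ alpha
    cov = K_ss - V.T @ V
    return mean, cov

def mc_batch_improvement(mean, cov, best_y, n_samples=512, seed=0):
    """Monte Carlo estimate of the expected best improvement in a batch.

    Assumes minimisation: improvement is max(best_y - sample, 0).
    """
    rng = np.random.default_rng(seed)
    q = len(mean)
    # Small jitter keeps the Cholesky factorisation numerically stable.
    L = np.linalg.cholesky(cov + 1e-9 * np.eye(q))
    eps = rng.standard_normal((n_samples, q))   # standard normal samples
    samples = mean + eps @ L.T                  # one joint draw per row
    improvement = np.maximum(best_y - samples, 0.0)
    # Best improvement within the batch, averaged over the N samples.
    return improvement.max(axis=1).mean()
```

In this sketch the batch score is the average, over N samples, of the largest improvement found among the q candidates, which mirrors the roles that claim 1 assigns to N, q, the Cholesky factor and the standard normal samples.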
2. The Bayesian-optimization-based method for optimizing an autopilot controller according to claim 1, wherein:
S1: modeling the performance index of the controller to be evaluated to obtain an evaluation function; according to the index, selecting n feasible combinations as the parameter sets to be evaluated, performing controller performance effect experiments on the vehicle with the parameters to be evaluated, and collecting the effect index dataset (Figure QLYQS_17);
S2: taking the parameter sets to be evaluated from S1, $X = \{\theta_1, \dots, \theta_n\}$, as input and the corresponding effect index set (Figure QLYQS_18) as output, modeling the effect index dataset (Figure QLYQS_19) using the surrogate model; wherein $\theta_i$, $i \in [1, n]$, denotes an evaluated parameter set and (Figure QLYQS_20) denotes a controller performance effect value;
establishing a Bayesian optimization proxy function, and then looping over the following steps:
S21: obtaining the posterior distribution mean and variance of the effect index dataset (Figure QLYQS_21) through surrogate model regression;
S22: substituting the mean function and the variance function of the posterior distribution obtained in S21 into the Bayesian optimization proxy function to obtain the recommended solution $\theta_{n+1}$ predicted by the proxy function;
S23: performing a controller performance effect experiment on the vehicle with the parameter set represented by the recommended solution obtained in S22, collecting the effect index (Figure QLYQS_22), and adding this data to the existing dataset (Figure QLYQS_23);
S3: when the difference between the effect index and the ideal index of the controller is smaller than a set threshold value, or the difference between the posterior distribution mean value and the set threshold value is smaller than the set threshold value, exiting the loop; the recommended solution obtained is the index parameter obtained for the controller.
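The loop of steps S1 to S3 can be sketched as follows. This is an illustrative Python outline under stated assumptions, not the implementation of the claim: the vehicle experiment is replaced by a hypothetical callable `evaluate`, candidates come from a finite grid, the Gaussian process uses a squared-exponential kernel with a constant mean, and single-point closed-form expected improvement stands in for the proxy function. All function names and default values are invented for illustration.

```python
import math
import numpy as np

def kernel(A, B, ls=0.3):
    """Squared-exponential kernel (assumed choice)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls**2)

def posterior(X, y, Xc, noise_var=1e-6):
    """S21: GP regression giving posterior mean and variance at Xc."""
    K = kernel(X, X) + noise_var * np.eye(len(X))
    Ks = kernel(X, Xc)
    mean = y.mean() + Ks.T @ np.linalg.solve(K, y - y.mean())
    var = 1.0 - np.einsum("ij,ij->j", Ks, np.linalg.solve(K, Ks))
    return mean, np.maximum(var, 1e-12)

def expected_improvement(mean, var, best):
    """S22: closed-form expected improvement for minimisation."""
    std = np.sqrt(var)
    z = (best - mean) / std
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    pdf = np.exp(-0.5 * z**2) / math.sqrt(2.0 * math.pi)
    return (best - mean) * cdf + std * pdf

def tune(evaluate, candidates, n_init=3, max_iter=20,
         threshold=1e-3, ideal=0.0, seed=0):
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(candidates), size=n_init, replace=False)
    X = candidates[idx]                       # S1: initial parameter sets
    y = np.array([evaluate(t) for t in X])    # S1: initial effect indices
    for _ in range(max_iter):
        if abs(y.min() - ideal) < threshold:  # S3: termination condition
            break
        mean, var = posterior(X, y, candidates)        # S21
        ei = expected_improvement(mean, var, y.min())  # S22
        theta_next = candidates[np.argmax(ei)]
        y_next = evaluate(theta_next)                  # S23: experiment
        X = np.vstack([X, theta_next])                 # S23: grow dataset
        y = np.append(y, y_next)
    best = np.argmin(y)
    return X[best], y[best]
```

For example, `tune(lambda t: (t[0] - 0.6) ** 2, np.linspace(0.0, 1.0, 101)[:, None])` uses a toy quadratic in place of the tracking error and returns the best parameter found together with its error.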
3. The Bayesian-optimization-based method for optimizing an autopilot controller according to claim 2, wherein: in S1, the vehicle system control performance index evaluation function is modeled as follows:
Figure QLYQS_24
wherein (Figure QLYQS_25) denotes the control performance of the system when fitted with the parameter $\theta$, representing a weighted fusion of control accuracy, vehicle safety and control cost; (Figure QLYQS_26) denotes Gaussian-distributed noise with variance (Figure QLYQS_27); and (Figure QLYQS_28) denotes the corresponding noisy evaluation result obtained by measurement after the parameter set $\theta$ is fitted to the system.
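As a rough illustration of claim 3, the sketch below treats the evaluation as a weighted sum of three cost terms plus additive zero-mean Gaussian noise. The equation image is not reproduced in this text, so the additive form, the weights and the function names `control_performance` and `noisy_evaluation` are assumptions made only for this example.

```python
import random

def control_performance(tracking_error, safety_penalty, control_cost,
                        weights=(0.6, 0.3, 0.1)):
    """Weighted fusion of accuracy, safety and cost (weights are hypothetical)."""
    w_acc, w_safe, w_cost = weights
    return (w_acc * tracking_error
            + w_safe * safety_penalty
            + w_cost * control_cost)

def noisy_evaluation(performance, noise_std, rng):
    """Measured result: true performance plus zero-mean Gaussian noise."""
    return performance + rng.gauss(0.0, noise_std)
```

Averaging repeated calls to `noisy_evaluation` for the same parameter set approaches the noise-free value, which is why the Gaussian process in the method carries a noise variance term.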
4. The Bayesian-optimization-based method for optimizing an autopilot controller according to claim 2, wherein: the reachable domain space of the parameter set to be optimized in S1 is a mixed space comprising a discrete space and a continuous space; part of the parameter set to be optimized belongs to the discrete space, and part belongs to the continuous space.
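A mixed search space of the kind described in claim 4 can be represented as below. This is a sketch only: the parameter names (`lookahead_points`, `q_lateral`, `r_steer`) and their ranges are hypothetical examples of LQR-style tuning parameters and do not come from the patent.

```python
import random

# Hypothetical mixed space: one discrete and two continuous parameters.
SPACE = {
    "lookahead_points": ("discrete", [5, 10, 15, 20]),
    "q_lateral": ("continuous", (0.1, 10.0)),
    "r_steer": ("continuous", (0.01, 1.0)),
}

def sample_parameter_set(space, rng):
    """Draw one feasible parameter set from a mixed discrete/continuous space."""
    theta = {}
    for name, (kind, domain) in space.items():
        if kind == "discrete":
            theta[name] = rng.choice(domain)         # pick an allowed value
        else:
            low, high = domain
            theta[name] = rng.uniform(low, high)     # sample within bounds
    return theta
```

Calling `sample_parameter_set(SPACE, random.Random(0))` n times is one simple way to build the n initial feasible combinations mentioned in step S1.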
5. The Bayesian-optimization-based method for optimizing an autopilot controller according to claim 2, wherein: in S2, the surrogate model is a Gaussian process, which is completely described by a mean function $\mu(X)$ and a covariance function $K(X, X)$;
the mean function $\mu(X)$ is:
Figure QLYQS_29
wherein $\psi(X)$ denotes a polynomial function of order $p$, $\alpha_p$ denotes the coefficient of the corresponding order, and $C$ is a constant;
the covariance function $K(X, X)$ is:
Figure QLYQS_30
wherein the complete form of the kernel function $k(\theta_i, \theta_j)$, $i \in [1, n]$, $j \in [1, n]$, is as follows:
Figure QLYQS_31
wherein the diagonal matrix (Figure QLYQS_32) denotes the length-scale hyperparameters, and $\lambda_i$, $i \in [1, n]$, denotes the length-scale parameter corresponding to the parameter $\theta_i$;
the objective function of the surrogate model regression is the logarithm of the marginal likelihood distribution, as follows:
Figure QLYQS_33
wherein (Figure QLYQS_34) denotes the likelihood distribution and (Figure QLYQS_35) denotes the set of all Gaussian-process-related hyperparameters.
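Because the kernel and likelihood images of claim 5 are not reproduced in this text, the sketch below falls back on standard Gaussian process formulas: a squared-exponential kernel with one length scale per input dimension, a constant mean, and the usual Gaussian log marginal likelihood evaluated through a Cholesky factorisation. Whether the patent uses exactly this kernel or mean form is an assumption; `ard_kernel` and `log_marginal_likelihood` are hypothetical names.

```python
import numpy as np

def ard_kernel(A, B, length_scales, signal_var=1.0):
    """Squared-exponential kernel with per-dimension length scales (assumed)."""
    A_s = A / length_scales
    B_s = B / length_scales
    d2 = ((A_s[:, None, :] - B_s[None, :, :]) ** 2).sum(-1)
    return signal_var * np.exp(-0.5 * d2)

def log_marginal_likelihood(X, y, length_scales, noise_var, mean_const=0.0):
    """Standard GP log marginal likelihood with a constant mean."""
    n = len(X)
    K = ard_kernel(X, X, length_scales) + noise_var * np.eye(n)
    L = np.linalg.cholesky(K)
    r = y - mean_const
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, r))
    # -1/2 r^T K^-1 r  -  1/2 log|K|  -  n/2 log(2*pi)
    return (-0.5 * r @ alpha
            - np.log(np.diag(L)).sum()
            - 0.5 * n * np.log(2.0 * np.pi))
```

In a typical workflow the hyperparameters (length scales, noise variance, mean constant) would be chosen by maximising this quantity with a numerical optimiser, which is what the claim calls the surrogate model regression.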
6. The Bayesian-optimization-based method for optimizing an autopilot controller according to claim 1, wherein: the Bayesian optimization proxy function further includes an extremum operation, which depends on whether the objective function is to be minimized or maximized;
a minimization requirement gives the proxy function a minimizing operation, and a maximization requirement gives the proxy function a maximizing operation.
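One simple way to read claim 6 is that the direction of optimization decides both how the incumbent best value is chosen and how improvement is measured. The sketch below shows that reading in Python; the helper names `incumbent` and `improvement` are hypothetical, and the patent may place the extremum operation differently.

```python
def incumbent(observed_values, minimize=True):
    """Best value observed so far, depending on the optimization direction."""
    return min(observed_values) if minimize else max(observed_values)

def improvement(sample, best, minimize=True):
    """Non-negative improvement of a sampled value over the incumbent."""
    gain = best - sample if minimize else sample - best
    return max(gain, 0.0)
```

For a tracking-error objective, which is to be minimized, `incumbent([0.3, 0.1, 0.2])` gives 0.1 and a sampled value of 0.05 yields an improvement of 0.05.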
7. A computer-readable storage medium, characterized in that: the storage medium has stored thereon a computer program which, when executed by a processor, implements the Bayesian-optimization-based autopilot controller optimization method according to any one of claims 1-6.
8. A Bayesian-optimization-based autopilot vehicle controller optimization device, characterized by comprising a memory for storing a computer program and a processor which, when executing the computer program, implements the Bayesian-optimization-based autopilot controller optimization method according to any one of claims 1-6.
CN202211433936.4A 2022-11-16 2022-11-16 Automatic optimization method, medium and equipment for carrier controller based on Bayesian optimization Active CN115755606B (en)


Publications (2)

Publication Number Publication Date
CN115755606A CN115755606A (en) 2023-03-07
CN115755606B true CN115755606B (en) 2023-07-07

Family

ID=85372191


Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108573281A (en) * 2018-04-11 2018-09-25 中科弘云科技(北京)有限公司 A kind of tuning improved method of the deep learning hyper parameter based on Bayes's optimization
DE102019208262A1 (en) * 2019-06-06 2020-12-10 Robert Bosch Gmbh Method and device for determining model parameters for a control strategy of a technical system with the help of a Bayesian optimization method
DE102019208264A1 (en) * 2019-06-06 2020-12-10 Robert Bosch Gmbh Method and device for determining a control strategy for a technical system
CN110304075B (en) * 2019-07-04 2020-06-26 清华大学 Vehicle track prediction method based on hybrid dynamic Bayesian network and Gaussian process
US11673584B2 (en) * 2020-04-15 2023-06-13 Baidu Usa Llc Bayesian Global optimization-based parameter tuning for vehicle motion controllers
CN111460368A (en) * 2020-05-22 2020-07-28 南京大学 Parallel Bayesian optimization method
CN112163373A (en) * 2020-09-23 2021-01-01 中国民航大学 Radar system performance index dynamic evaluation method based on Bayesian machine learning
DE102020213527A1 (en) * 2020-10-28 2022-04-28 Robert Bosch Gesellschaft mit beschränkter Haftung Method for optimizing a strategy for a robot


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant