CN110259592A - A kind of PID diesel engine self-adapting electronic speed regulating method - Google Patents

A kind of PID diesel engine self-adapting electronic speed regulating method

Info

Publication number
CN110259592A
Authority
CN
China
Prior art keywords
speed
diesel engine
value
target
subtask
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910558083.9A
Other languages
Chinese (zh)
Inventor
惠小亮
张朦朦
李鹏豪
张永林
吴庆林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing Hongjiang Machinery Co Ltd
Original Assignee
Chongqing Hongjiang Machinery Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing Hongjiang Machinery Co Ltd
Priority to CN201910558083.9A
Publication of CN110259592A
Legal status: Pending

Classifications

    • F: MECHANICAL ENGINEERING; LIGHTING; HEATING; WEAPONS; BLASTING
      • F02: COMBUSTION ENGINES; HOT-GAS OR COMBUSTION-PRODUCT ENGINE PLANTS
        • F02D: CONTROLLING COMBUSTION ENGINES
          • F02D31/00: Use of speed-sensing governors to control combustion engines, not otherwise provided for
            • F02D31/001: Electric control of rotation speed
          • F02D41/00: Electrical control of supply of combustible mixture or its constituents
            • F02D41/02: Circuit arrangements for generating control signals
              • F02D41/14: Introducing closed-loop corrections
                • F02D41/1401: Introducing closed-loop corrections characterised by the control or regulation method
                  • F02D41/1405: Neural network control
                  • F02D41/1406: Introducing closed-loop corrections characterised by the control or regulation method with use of an optimisation method, e.g. iteration
                  • F02D2041/1409: Introducing closed-loop corrections characterised by the control or regulation method using at least a proportional, integral or derivative controller
                  • F02D2041/1433: Introducing closed-loop corrections characterised by the control or regulation method using a model or simulation of the system
            • F02D41/24: Electrical control of supply of combustible mixture or its constituents characterised by the use of digital means
              • F02D41/2406: Electrical control of supply of combustible mixture or its constituents characterised by the use of digital means using essentially read only memories
                • F02D41/2425: Particular ways of programming the data
                  • F02D41/2429: Methods of calibrating or learning
                    • F02D41/2438: Active learning methods
            • F02D41/30: Controlling fuel injection
    • G: PHYSICS
      • G06: COMPUTING; CALCULATING OR COUNTING
        • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
          • G06N3/00: Computing arrangements based on biological models
            • G06N3/02: Neural networks
              • G06N3/04: Architecture, e.g. interconnection topology
                • G06N3/045: Combinations of networks
              • G06N3/08: Learning methods

Abstract

The present invention proposes a PID adaptive electronic speed regulation method for diesel engines based on an improved MAXQ algorithm, and relates to diesel engine control methods. The method builds on S-MAXQ, an autonomous optimization algorithm driven by the target speed. The target speed of the diesel engine is set and used as the input of a self-organizing map (SOM) neural network; the PID parameter set needed to reach the target speed at the current moment is solved, and the Q values calculated for the subtasks serve as the output of the SOM. The speed controller obtains the target position of the diesel engine rack, and S-MAXQ is called to calculate the value function of the current speed regulation strategy. When the value function reaches its maximum, the strategy is applied to the rack position controller to adjust the diesel engine speed. The diesel engine speed then approaches the set speed with optimal speed regulation performance. The invention can improve the adaptability of diesel engine electronic speed regulation and improve diesel engine operating performance.

Description

A kind of PID diesel engine self-adapting electronic speed regulating method
Technical field
The present invention relates to the technical field of industrial control, and in particular to an autonomous control method for diesel engines: a PID adaptive speed regulation method for diesel engines based on an improved hierarchical reinforcement learning MAXQ algorithm.
Background art
A governor detects the speed of the diesel engine and, from the difference between the target speed and the actual speed, generates the actuating force needed to adjust the fuel injection quantity. The governor's function is therefore, in effect, speed control. Because the speed of a diesel engine fluctuates and varies widely during operation, the governor must control it in closed loop. Closed-loop control first requires an accurate mathematical model of the system. The structure and working process of a diesel engine are complex, many factors influence its operation, and these factors are strongly coupled with one another and with the system. A diesel engine is therefore a multi-input, multi-output, time-varying, nonlinear complex system, and describing its characteristics solely with the mathematical methods of closed-loop control is unlikely to achieve high accuracy. Control methods that rely on an accurate mathematical model of the diesel engine working process are thus impractical. In addition, because of constraints on volume, reliability and cost, the controllers in common use are built around a microprocessor whose computing speed and memory capacity cannot meet the resource demands of complex control algorithms.
The electronic governor products for diesel engines in common use, whether analog or digital, almost all rely on traditional PID control. Traditional PID control is easy to use, easy to implement and free of steady-state error, but it has the following critical defects. First, the control parameters cannot be tuned online once they have been set, so under strong disturbances the recovery time inevitably lengthens and the overshoot increases, which limits improvement of the system's dynamic and static performance. Second, integral saturation occurs during start-up or large dynamic adjustments.
Therefore, inventing a PID control algorithm that has excellent dynamic performance and strong environmental adaptability, and that can adjust itself adaptively, intelligently and online, is of great significance for improving the performance of diesel engine electronic governors and diesel engine operating performance. Adaptive PID control combines the advantages of adaptive control and conventional PID controllers and has become a current research focus. A search of the prior art literature found the following. The paper "Research on Digital Governing System of Diesel Engine based on single neuron Intelligent PID Control" (Journal of Dalian Railway Institute, 1999) proposes an intelligent PID speed regulation system for diesel engines: "an intelligent PID control algorithm based on classical PID and implemented with a single neuron, which retains the advantages of neural network control while remaining simple and practical. Applied to a digital speed regulation system of a diesel engine, simulation results show that the control algorithm has excellent dynamic performance as well as good adaptability and robustness, providing a new approach to research on the intelligent control of diesel engine digital speed regulation systems." Its shortcoming is that the parameters are optimized by supervised learning, whose teacher signal is difficult to obtain; moreover, the control method has no online learning capability, so its adaptability is poor. The patent with publication number CN108008627 discloses an adaptive PID control method based on parallel-optimized reinforcement learning. The method uses MATLAB to discretize the transfer function by the zero-order hold method and then trains three layers of variables with a Critic network. It overcomes the difficulty conventional PID controllers have in tuning parameters online in real time, but the three-layer training feedback of the Critic network, together with the calculation of the final reinforcement learning value function, adds to the complexity of the traditional PID algorithm. The patent with publication number CN201510492758 discloses an adaptive PID control method for an actuator, in which an expert PID controller and a fuzzy PID controller are each connected to the actuator, and the actuator selects between them according to the current state information and the expected information. Although this controller reduces overshoot and offers high control accuracy, it still requires a large amount of prior knowledge from professionals to decide which controller to use.
Summary of the invention
In view of the above defects of the prior art, the present invention proposes a proportional-integral-derivative (PID) adaptive electronic speed regulation method for diesel engines based on an improved hierarchical reinforcement learning MAXQ algorithm. The MAXQ algorithm of hierarchical reinforcement learning is combined with a self-organizing map (SOM) neural network to form an autonomous optimization algorithm based on the target speed (the S-MAXQ algorithm). On the basis of the S-MAXQ algorithm, the target regulation speed of the diesel engine is taken as the object of study: the target speed of the diesel engine at the current moment is used as the input of the SOM neural network, the Q values calculated for the subtasks are used as the output of the SOM neural network, and the output strategies are distributed evenly over the SOM neural network. The proportional coefficient $K_P$, integral coefficient $K_I$ and derivative coefficient $K_D$ of the PID controller are initialized to form a decision unit, and the output control quantity of the PID controller is calculated. Through the autonomous optimization learning mode S-MAXQ based on the target speed, with the current target as the input of the S-MAXQ algorithm unit, the value function of the current speed regulation strategy is calculated. When the value function reaches its maximum, the speed regulation strategy is optimal; the PID controller evaluates and updates the PID parameters in real time, and when the control quantity approaches the desired output, the diesel engine speed approaches the set speed with optimal speed regulation performance.
A PID adaptive electronic speed regulation method for diesel engines based on an improved hierarchical reinforcement learning MAXQ algorithm, in which: the target speed of the diesel engine is set and used as the input of the SOM neural network; the PID parameter set needed to reach the target speed at the current moment is solved, the output of the SOM being the Q values calculated for the subtasks; the speed controller obtains the target position of the diesel engine rack; the target-speed-based autonomous optimization algorithm S-MAXQ is called to calculate the value function of the current speed regulation strategy; and when the value function reaches its maximum, the strategy is applied to the rack position controller to adjust the diesel engine speed, the actuator PID parameters are updated in real time, and the diesel engine speed is controlled to reach the set target speed.
The invention further provides that the target-speed-based autonomous optimization algorithm S-MAXQ specifically comprises the following. The PID parameter sequence set $M$ corresponding to the final target speed of the diesel engine is decomposed into independent subtasks $\{M_0, M_1, \ldots, M_n\}$, each subtask $M_i$ corresponding to one sub-target speed. Each subtask $M_i$ is extended, and the extended subtask is represented by a four-tuple $\langle \pi_i, T_i, A_i, r_i \rangle$, where $\pi_i$ is the subtask policy, $T_i$ is the termination predicate, $A_i$ is the set of subtasks of $M_i$, and $r_i$ is the pseudo-reward function. The PID parameter sequence set corresponding to each sub-target speed is solved, and the selection policy is decomposed into the policy set $\{\pi_0, \pi_1, \ldots, \pi_n\}$. If every policy in the set is the optimal policy of its subtask, then $\pi = (\pi_0, \pi_1, \ldots, \pi_n)$ is the recursively optimal policy of $M$, which gives the optimal PID selection policy; the $i$-th subtask policy $\pi_i$ is the PID selection policy of the $i$-th subtask $M_i$.
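The four-tuple description above maps naturally onto a small record type. The following is a minimal sketch, not the patent's implementation: the names `Subtask`, `PidGains` and `decompose`, the evenly spaced sub-target speeds, and the tolerance-based termination test are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Hypothetical encoding of one PID parameter set as (Kp, Ki, Kd).
PidGains = Tuple[float, float, float]


@dataclass
class Subtask:
    """One extended subtask M_i, mirroring the four-tuple <pi_i, T_i, A_i, r_i>."""
    sub_target_speed: float                                   # sub-target speed of M_i
    policy: Callable[[float, List[PidGains]], PidGains]       # pi_i: picks a PID set
    terminated: Callable[[float], bool]                       # T_i: termination predicate
    children: List["Subtask"] = field(default_factory=list)   # A_i: subtasks of M_i
    reward: Callable[[float, float], float] = lambda prev_err, err: 0.0  # r_i


def decompose(final_target: float, start: float, n: int, tol: float = 5.0) -> List[Subtask]:
    """Split the overall task M into n + 1 subtasks M_0..M_n.

    Evenly spaced sub-target speeds are an assumption; the text does not say
    how the sub-targets are chosen.
    """
    subtasks = []
    for i in range(n + 1):
        target = start + (final_target - start) * (i + 1) / (n + 1)
        subtasks.append(Subtask(
            sub_target_speed=target,
            policy=lambda s, candidates: candidates[0],  # placeholder policy
            terminated=lambda speed, t=target: abs(speed - t) <= tol,
        ))
    return subtasks
```

With `decompose(1500.0, 900.0, 2)` the overall task is split into three subtasks whose last sub-target equals the final target speed.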
Calculating the value function of the current speed regulation strategy may specifically comprise decomposing the root task value function $V^{\pi}(i,s)$ of the root node according to

$$V^{\pi}(i,s) = V^{\pi}(a_i,s) + C^{\pi}(a_{n-1},s,a_n) + \cdots + C^{\pi}(a_1,s,a_2) + C^{\pi}(a_0,s,a_1)$$

where $V^{\pi}(i,s)$ is the value function of $M_i$, $C^{\pi}(a_{n-1},s,a_n)$ is the completion function of subtask $M_{a_n}$, $s$ is the target speed, and $V^{\pi}(a_i,s)$ is the value function of the subtask.
The PID control algorithm may specifically comprise the following. The diesel engine speed regulation parameter sequence set $M$ is divided into the subtask set $\{M_0, M_1, \ldots, M_n\}$; the learning rate and the discount factor $\gamma$ are initialized; and a state sequence Seq{} is created to record the states experienced while executing subtask $M_i$, the sequence containing the target speed value $s$ of the current subtask and the corresponding optimal PID parameter set $a^*$. Subtask $M_i$ is executed and the immediate reward $r_t(i,s)$ for the target speed at the current moment is received. The immediate reward assesses the quality of the solved PID parameter set and may be assigned according to the actual situation: the closer the solved PID parameter set brings the diesel engine speed to the final target speed, the larger the immediate reward. From the immediate reward and the learning rate, the accumulated reward $V_{t+1}(i,s)$ of executing subtask $M_i$ when the target speed at time $t$ is $s$ is calculated. The weights of the SOM neural network are adjusted to obtain the optimal Q value $Q(s,a^*)$ of subtask $M_i$ at the current target speed $s$, where $s$ is the current target speed and $a^*$ is the corresponding optimal PID parameter set. Assuming that the target speed following the current target speed $s$ is $s'$, the completion function $C_{t+1}(i,s',a)$ of subtask $M_i$ at the next moment, for executing PID parameter set $a$, is calculated from two quantities: $V_t(a^*,s')$, the accumulated reward at time $t$ of executing the optimal PID sequence $a^*$ when the target speed at time $t+1$ is $s'$; and $C_t(i,s,a)$, the completion function of subtask $M_i$ for executing PID parameter set $a$ when the target speed at time $t$ is $s$. These steps are repeated until all subtasks in $\{M_0, M_1, \ldots, M_n\}$ have run, and each target speed $s$ obtained, together with its optimal PID parameter set $a^*$, is placed in Seq{} in turn, giving the optimal strategy for the entire diesel engine speed regulation. Here $C_{t+1}(i,s',a)$ is the completion function of subtask $M_i$ for executing PID parameter set $a$ when the target speed at time $t+1$ is $s'$, $\gamma$ is the discount rate, and the learning rate for completing subtask $M_i$ at time $t$ generally takes a value between 0 and 1.
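The update formulas themselves do not appear in this text. As a hedged illustration only, the sketch below uses tabular updates in the style of the standard MAXQ-0 learning rule, which involve the same quantities named above (immediate reward, accumulated reward $V$, completion function $C$, discount factor and learning rate). The single-step discount, the constant `ALPHA`, and all function and table names are assumptions, not the patent's definitions.

```python
from collections import defaultdict

ALPHA = 0.5   # learning rate in (0, 1]; symbol and value are assumed
GAMMA = 0.9   # discount factor

V = defaultdict(float)  # V[(i, s)]: accumulated reward of subtask/action i at target speed s
C = defaultdict(float)  # C[(i, s, a)]: completion value of PID set a within subtask i


def update_value(i, s, reward, alpha=ALPHA):
    """Blend the immediate reward into the accumulated reward V(i, s)."""
    V[(i, s)] = (1 - alpha) * V[(i, s)] + alpha * reward
    return V[(i, s)]


def update_completion(i, s, a, best_action, s_next, alpha=ALPHA, gamma=GAMMA):
    """Blend the discounted value of the best PID set at the next target speed into C(i, s, a)."""
    target = gamma * V[(best_action, s_next)]
    C[(i, s, a)] = (1 - alpha) * C[(i, s, a)] + alpha * target
    return C[(i, s, a)]
```

Each call moves the stored estimate a fraction `alpha` of the way toward the new sample, so a learning rate near 0 changes the tables slowly and a rate of 1 overwrites them.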
When the value function reaches its maximum, the strategy is applied to the rack position controller to adjust the diesel engine speed; the speed regulation strategy is then optimal and the actuator PID parameters are updated in real time. When the PID control deviation approaches the desired output, the diesel engine speed approaches the set target speed with optimal speed regulation performance.
If executing a given PID parameter sequence brings the diesel engine speed closer to the target speed at that moment than the previous sequence did, the immediate reward is assigned a positive value; otherwise it is assigned a negative value.
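The sign rule above can be sketched in a few lines. The function name `immediate_reward`, the fixed reward magnitude, and the choice to treat an unchanged error as "otherwise" are assumptions for illustration.

```python
def immediate_reward(target_speed, prev_speed, new_speed, magnitude=1.0):
    """Return +magnitude if the new PID set moved the speed closer to the target, else -magnitude."""
    prev_error = abs(target_speed - prev_speed)
    new_error = abs(target_speed - new_speed)
    return magnitude if new_error < prev_error else -magnitude
```

For a target of 1500 rpm, moving from 1400 rpm to 1450 rpm earns a positive reward, while drifting from 1450 rpm back to 1400 rpm earns a negative one.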
The neural network weights are adjusted as follows. Each weight vector of the output layer (the Q value calculated for each subtask) is assigned a random number, and the Q values of the subtasks are normalized to obtain the normalized output-layer weight vectors, one for the Q value of each subtask $j$, where $m$ is the number of output-layer neurons. The initial winning neighborhood $N_{j^*}(0)$ and the initial value of the learning rate $\eta$ are established. An input pattern is drawn at random from the training set and normalized to obtain the normalized set of target speeds. The dot product of the normalized input with each normalized weight vector is calculated; the node with the largest dot product is the winning node $j^*$. The weight adjustment region at time $t$, centered on the winning node $j^*$, is defined as the winning neighborhood $N_{j^*}(t)$. The weight $w_t$ at time $t$ is then calculated for all nodes in the winning neighborhood $N_{j^*}(t)$, which adjusts their weights. In this calculation $w_t$ is the weight at time $t$, $Q_t$ is the value function corresponding to the target speed at time $t$ (the Q value input to the SOM at time $t$), $r_{t-1}$ is the immediate reward at time $t-1$, $\gamma$ is the discount factor, $N$ is the number of neural network nodes, and the learning rate is that of time $t$.
The target speed of the diesel engine is obtained from the speed setting, and the target position of the rack is obtained from the speed controller. The S-MAXQ algorithm is called to calculate the value function of the current speed regulation strategy; when the value function reaches its maximum, the strategy is applied to the rack position controller to adjust the diesel engine speed, and the speed regulation strategy is optimal. When the speed regulation strategy is optimal, the actuator PID parameters are updated in real time, and when the PID control deviation approaches the desired output, the diesel engine speed approaches the set target speed with optimal speed regulation performance.
For any child node of any task $M_i$ ($M_i$ being the upper-level task of that node), the action $a_k$ executed on it is either a subtask or a primitive action (a primitive action being the PID parameter set at that moment). The value function $Q^{\pi}(i,s,a)$ corresponding to the target speed of $M_i$ and the associated PID parameter set consists of the accumulated reward $V^{\pi}(a,s)$ and the completion reward $C^{\pi}(i,s,a)$.
The choice of PID parameter set directly affects the rack position and thus the actual speed of the diesel engine. How the PID parameters are chosen is therefore critical to making the actual speed essentially equal to the set target speed. The traditional PID parameter tuning process uses fixed formulas, is difficult to update online in real time, and gives unsatisfactory results. Applying the present invention to PID parameter adjustment lets the PID parameters accumulate experience during operation and achieve better adaptivity. Adaptive selection of the PID parameters evaluates and obtains the parameters effectively in real time; when the control quantity output by the PID controller approaches the desired output, the diesel engine speed approaches the set speed (the target regulation speed) with optimal speed regulation performance, which improves the adaptability of diesel engine electronic speed regulation and improves diesel engine operating performance.
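To make the role of the PID parameter set concrete, here is a minimal sketch of a positional discrete PID controller whose gains can be replaced between samples, which is the hook an online tuner would need. The class name `PidController`, the method `set_gains`, the sample time `dt`, and the reading of the output as a rack position demand are assumptions for illustration, not details given in the text.

```python
class PidController:
    """Positional discrete PID whose gains can be replaced between samples."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = None

    def set_gains(self, kp, ki, kd):
        """Swap in a new PID parameter set without resetting the controller state."""
        self.kp, self.ki, self.kd = kp, ki, kd

    def step(self, target_speed, actual_speed):
        """Return the control output for one sample (read here as a rack position demand)."""
        error = target_speed - actual_speed
        self.integral += error * self.dt
        if self.prev_error is None:
            derivative = 0.0  # no previous sample yet
        else:
            derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative
```

Keeping the integral and previous error across a `set_gains` call avoids a jump in the output caused purely by state reset when the tuner changes the parameters.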
The present invention forms decision parameters from the control quantity of the PID controller and does not require a large amount of prior knowledge from professionals. Through autonomous optimization learning it calculates the value function of the current speed regulation strategy and obtains the optimal speed regulation strategy. It can improve the adaptability of diesel engine electronic speed regulation and improve diesel engine operating performance.
Brief description of the drawings
Fig. 1 is a structural diagram of the Q value function based on the set of SOM neural networks;
Fig. 2 is a schematic diagram of the method for calculating the root node value function;
Fig. 3 is the SOM neural network training flowchart;
Fig. 4 is the SOM neighborhood shrinkage model diagram;
Fig. 5 is the flowchart of the adaptive PID control algorithm based on S-MAXQ;
Fig. 6 is the block diagram of PID adaptive electronic speed regulation of a diesel engine based on the improved MAXQ algorithm.
Detailed description of the embodiments
The implementation of the invention is described below with reference to the drawings and specific embodiments.
The PID parameters are adjusted adaptively with MAXQ and an SOM neural network so that the governor regulates the diesel engine speed adaptively. Conventional diesel engine speed regulation algorithms, however, need prior knowledge to decompose the learning task, and the hierarchical structure of the task decomposition directly affects the quality of the recursive policy. In some problems (such as diesel engine speed regulation) the hierarchical granularity of the MAXQ algorithm is too coarse, and it is difficult to abstract and decompose the subtasks further. To solve this problem, the invention uses SOM neural networks to model the abstraction mechanism of the subtasks in the speed regulation process (each subtask solves for the PID parameter set needed to reach the target speed at the current moment). Fig. 1 shows the construction of the Q value function based on the set of SOM neural networks: the target speed of the diesel engine is the input of the SOM, and the optimal Q value calculated for each subtask is the output (in Fig. 1, target speed $s_i$ corresponds to the optimal output $Q(s_i,a_i)$, $i = 1, 2, \ldots$, the range of $i$ depending on the number of SOM networks constructed). The optimal speed regulation strategy corresponding to the target speed at that moment can then be obtained and is distributed evenly over the SOM neural networks.
An autonomous optimization algorithm (the S-MAXQ method) is constructed. Finding the PID parameter sequence set corresponding to the final target speed of the diesel engine is regarded as the overall learning task $M$, which is decomposed into the subtask set $\{M_0, M_1, \ldots, M_n\}$. Each subtask $M_i$ has a corresponding sub-target speed, so the task of $M_i$ is to find the PID parameter sequence set for its sub-target speed. The PID selection policy is decomposed into the policy set $\{\pi_0, \pi_1, \ldots, \pi_n\}$, where $\pi_i$ is the PID selection policy of $M_i$ ($\pi_i$ is set according to actual needs; a greedy algorithm is generally used). If every policy $\pi_i$ is the optimal policy of subtask $M_i$, then the hierarchical policy $\pi = (\pi_0, \pi_1, \ldots, \pi_n)$ is the recursively optimal policy of $M$, and the optimal PID selection policy is obtained. The subtask set forms a hierarchy with $M_i$ as the root node, i.e. the task graph.
Fig. 2 is a schematic diagram of the method for calculating the root node value function. $V^{\pi}(i,s)$ is the value function of the root task $M_i$, and $C^{\pi}(a_{n-1},s,a_n)$ is the completion function of subtask $M_{a_n}$. Given the policy set $\pi$ and the final target speed $s$, suppose that at target speed $s$ the root task $M_i$ chooses to execute subtask $M_{a_1}$, then subtask $M_{a_2}$, and so on until its final subtask $M_{a_n}$ is executed. The resulting sequence $(0, a_1, a_2, \ldots, a_n)$ is a top-down path of PID parameter sets obtained under the hierarchical policy $\pi$, and the root task value function $V^{\pi}(i,s)$ of the root node can be decomposed as:

$$V^{\pi}(i,s) = V^{\pi}(a_i,s) + C^{\pi}(a_{n-1},s,a_n) + \cdots + C^{\pi}(a_1,s,a_2) + C^{\pi}(a_0,s,a_1)$$

where $V^{\pi}(a_i,s)$ is the value function of the subtask.
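The decomposition of the root value function along a top-down path can be sketched as a simple sum over lookup tables. The function name `root_value` and the dictionary layout of `V` and `C` are assumptions; the subtask value term is taken here to be the value of the last subtask on the path.

```python
def root_value(path, s, V, C):
    """Sum the decomposition of the root value function along one path.

    path : [a_0, a_1, ..., a_n], the top-down sequence of chosen subtasks
    s    : target speed
    V    : dict mapping (subtask, s) to the subtask's own value
    C    : dict mapping (parent, s, child) to the completion value
    """
    total = V[(path[-1], s)]                   # value of the last subtask on the path
    for parent, child in zip(path, path[1:]):  # C(a_0,s,a_1), ..., C(a_{n-1},s,a_n)
        total += C[(parent, s, child)]
    return total
```

For a path `[0, 1, 2]` the result is the value stored for subtask 2 plus the two completion terms `C[(0, s, 1)]` and `C[(1, s, 2)]`.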
According to the diesel engine speed regulation process, each subtask $M_i$ is extended and represented by a four-tuple $\langle \pi_i, T_i, A_i, r_i \rangle$, where $\pi_i$ is the subtask policy (the PID selection policy of subtask $M_i$); $T_i$ is the termination predicate (once the PID parameter set corresponding to the final target speed has been found, the whole task is complete, i.e. terminated); $A_i$ is the set of subtasks of $M_i$; and $r_i$ is the immediate reward function (if executing a given PID parameter sequence brings the diesel engine speed closer to the target speed at that moment, a positive immediate reward is assigned; otherwise a negative one). According to the MAXQ algorithm, the value function $Q^{\pi}(i,s,a)$ of completing subtask $M_i$ consists of two parts, the accumulated reward $V^{\pi}(a,s)$ and the completion reward $C^{\pi}(i,s,a)$:

$$Q^{\pi}(i,s,a) = V^{\pi}(a,s) + C^{\pi}(i,s,a)$$

Here $C^{\pi}(i,s,a)$ is the completion function, i.e. the completion reward, for choosing PID parameter set $a$ under the PID selection policy $\pi$ when subtask $M_i$ is executed with target speed $s$; $Q^{\pi}(i,s',a')$ is the value function for choosing PID parameter set $a'$ under $\pi$ when subtask $M_i$ is executed with target speed $s'$; and the transition probability is the probability, when executing $M_i$ and choosing $a$ under $\pi$, of moving from the current target speed $s$ to the target speed $s'$ in $N$ time steps.
Once the policy of the bottom-level subtasks has been determined (the PID selection policy for moving from the current target speed $s$ to the target speed $s'$ in $N$ time steps), the transition probability is fully defined. The key to solving $C^{\pi}(i,s,a)$ is therefore to calculate the Q value of the next subtask, so fitting $C^{\pi}(i,s,a)$ reduces to calculating, in the SOM neural network, the Q value of the value function for a target speed and its corresponding PID parameter set.
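For reference, the standard MAXQ formulation defines the completion function in terms of the same three quantities discussed above: the transition probability over $N$ time steps, the discount factor, and the Q value at the next target speed. The form below is that textbook definition, given as an assumption about the relation the text describes rather than as the patent's own formula; the symbol $P_i^{\pi}$ for the transition probability is likewise assumed.

```latex
C^{\pi}(i,s,a) = \sum_{s',\,N} P_i^{\pi}(s',N \mid s,a)\,\gamma^{N}\,
                 Q^{\pi}\bigl(i,s',\pi_i(s')\bigr),
\qquad
Q^{\pi}(i,s,a) = V^{\pi}(a,s) + C^{\pi}(i,s,a)
```

Read this way, $C^{\pi}(i,s,a)$ is the expected discounted value of finishing subtask $M_i$ after PID parameter set $a$ has been applied, which is why computing it reduces to knowing the Q value at the next target speed.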
The diesel engine speed regulation process is variable and complex. Following the decomposition above, a complex speed regulation process can be decomposed into subtasks for several stages; once the subtask of each stage has obtained its optimal PID speed regulation strategy, the optimal strategy for the entire speed regulation of the diesel engine is obtained.
Each subtask of the speed regulation task corresponds to one SOM neural network. At each iteration of subtask $M_i$, the target speed of the diesel engine at the current moment is the input of the neural network, i.e. its training set; the PID parameter set $a_i$ changes the winning neighborhood of the network, which yields the optimal network node, i.e. the optimal PID parameter set.
Fig. 3 shows the SOM neural network training flowchart. The neural network weight adjustment algorithm proceeds as follows:
Initialization: each weight vector of the output layer is assigned a random number and normalized, giving the normalized weight vectors of the Q values calculated for the subtasks; the initial winning neighborhood $N_{j^*}(0)$ and the initial value of the learning rate $\eta$ are established, where $m$ is the number of output-layer neurons.
Receiving input: an input pattern is drawn at random from the training set formed by the diesel engine target speeds and normalized, giving the normalized set of target speeds.
Finding the winning node and defining the winning neighborhood $N_{j^*}(t)$: the dot product of the normalized input with each normalized weight vector is calculated, and the node with the largest dot product is the winning node $j^*$. The weight adjustment region at time $t$ is determined with $j^*$ as its center. The initial neighborhood $N_{j^*}(0)$ is generally large (about 50% to 80% of all nodes), and $N_{j^*}(t)$ shrinks with training time.
Adjusting the winning neighborhood $N_{j^*}(t)$: in the iteration of the MAXQ algorithm at each level, $Q(i,s,a)$ is the value function of subtask $M_i$ for the current target speed $s$ and the corresponding PID parameter set, and $r$ is the reward value, applied with the learning rate. Introducing the feedback error of the SOM neural network, a weight adjustment formula is defined for $j \in N_{j^*}(t)$ that adjusts the weights of all nodes in the winning neighborhood $N_{j^*}(t)$, where $Q_t$ is the value function solved for the target speed and PID set at time $t$, $r_{t-1}$ is the reward value at time $t-1$, and the learning rate is a function of the training time $t$ and of the topological distance $N$ between the $j$-th neuron in the neighborhood and the winning neuron $j^*$.
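The training steps above can be sketched as a single function on a one-dimensional chain of nodes. Note the substitution: the weight update used here is the standard Kohonen rule (move each neighborhood weight toward the input), swapped in because the reward-based adjustment formula of the text is not available here. The function names, the chain topology and the fixed `radius` argument are assumptions.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def find_winner(weights, x):
    """Index of the node whose weight vector has the largest dot product with x."""
    dots = [sum(w_k * x_k for w_k, x_k in zip(w, x)) for w in weights]
    return max(range(len(weights)), key=dots.__getitem__)


def som_step(weights, x, radius, eta):
    """One training step on a 1-D chain of nodes; returns the winning index j*."""
    x = normalize(x)
    weights[:] = [normalize(w) for w in weights]
    j_star = find_winner(weights, x)
    for j, w in enumerate(weights):
        if abs(j - j_star) <= radius:  # node j lies in the winning neighborhood
            moved = [w_k + eta * (x_k - w_k) for w_k, x_k in zip(w, x)]
            weights[j] = normalize(moved)  # standard Kohonen update, then renormalize
    return j_star
```

Because both the input and the weights are normalized, the largest dot product corresponds to the smallest angle between the input and a weight vector, which is why the dot product can serve as the similarity measure.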
Fig. 4 shows the SOM neighborhood shrinkage model. $N_{j^*}(0)$ is the initial neighborhood, and $N_{j^*}(1)$ and $N_{j^*}(2)$ are the neighborhoods after two successive adjustments by the weight adjustment formula; the winning neighborhood shrinks steadily.
Training ends once the learning rate has decayed to the minimum learning rate. Through the SOM training above, the winning node within the winning neighborhood is obtained, that is, the node with the maximum Q value for completing the current subtask. Nodes around the winning node are also strongly affected through lateral mutual excitation, so the weight vectors connected to the winning node (the maximum-Q node) and to all nodes in its winning neighborhood are adjusted toward the input direction to different degrees, with the adjustment strength decreasing gradually with each node's distance from the winning node. The network adjusts its weights in a self-organizing manner over a large number of training samples, so that each output-layer node finally becomes a neuron sensitive to a specific pattern class, and the corresponding instar weight vector becomes the center vector of each input pattern class. Following this principle, the SOM network selects each time the node with the maximum Q value obtained when completing the subtask, and adjusts the weights centered on that node. When the weights of the SOM neural network have settled to stable values, the optimal PID sequence of the subtask at that moment is found.
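The training loop described above can be sketched in Python. This is only a minimal sketch of a conventional one-dimensional Kohonen update, not the patent's weight adjustment formula, which is missing from this text. The function names, the distance weighting `exp(-|j - j*|)`, the geometric learning-rate decay, and all default parameters are illustrative assumptions.

```python
import math
import random

def normalize(vec):
    """Scale a vector to unit Euclidean length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return list(vec) if norm == 0 else [x / norm for x in vec]

def find_winner(weights, pattern):
    """Index of the node whose weight vector has the largest dot product with the pattern."""
    dots = [sum(w * x for w, x in zip(node, pattern)) for node in weights]
    return max(range(len(weights)), key=dots.__getitem__)

def train_som(patterns, m, eta0=0.5, eta_min=0.01, decay=0.9, seed=0):
    """Train a 1-D SOM with m output nodes; stop once the learning rate drops below eta_min."""
    rng = random.Random(seed)
    dim = len(patterns[0])
    # Initialization: random weight vectors, normalized.
    weights = [normalize([rng.random() for _ in range(dim)]) for _ in range(m)]
    radius0 = max(1, int(0.6 * m) // 2)  # assumed initial neighborhood radius
    eta, steps = eta0, 0
    while eta >= eta_min:
        x = normalize(rng.choice(patterns))           # receive a random input pattern
        j_star = find_winner(weights, x)              # winning node
        radius = max(0, round(radius0 * eta / eta0))  # neighborhood shrinks over time
        for j in range(max(0, j_star - radius), min(m, j_star + radius + 1)):
            rate = eta * math.exp(-abs(j - j_star))   # weaker adjustment farther from the winner
            weights[j] = normalize([w + rate * (xi - w) for w, xi in zip(weights[j], x)])
        eta *= decay                                  # assumed geometric decay of the learning rate
        steps += 1
    return weights, steps
```

Calling `train_som([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]], m=6)` returns six unit-length weight vectors and the number of training steps taken before the learning rate fell below `eta_min`. The input patterns here are hypothetical feature vectors.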
Traditional PID parameter tuning uses fixed formulas; its results are not ideal, and real-time online updating is difficult. Applying the S-MAXQ algorithm above to PID parameter tuning lets the PID parameters accumulate experience during operation and gives better adaptivity, which improves the adaptability of diesel engine electronic speed regulation and the operating indices of the diesel engine.
Figure 5 shows the flow chart of the adaptive PID control algorithm based on S-MAXQ. The detailed process is as follows:
1. Divide the overall diesel engine speed regulation task M into the subtask set $\{M_0, M_1, \ldots, M_n\}$; initialize the learning rate, the discount factor γ, and the state sequence Seq{ } that records the states experienced while executing subtask $M_i$. The sequence contains the target speed value s of the current subtask and the corresponding optimal PID parameter set $a^*$;
2. Execute subtask $M_i$ and receive the immediate reward $r_t$ at the current time. The immediate reward $r_t$ evaluates the quality of the solved PID parameter set: the closer the solved PID parameter set brings the diesel engine speed to the final target speed, the larger the immediate reward. The immediate reward can be assigned according to the actual situation;
3. Compute the accumulated reward value [formula missing in source], where $V_t(i, s)$ denotes the accumulated reward of executing subtask $M_i$ when the target speed at time t is s. The learning rate in the formula can be designed according to actual needs [usual value range missing in source];
4. Call the SOM neural network training method and adjust the weights of the SOM neural network to obtain the optimal Q value $Q(s, a^*)$ of subtask $M_i$, where s is the current target speed and $a^*$ is the corresponding optimal PID parameter set;
5. Assuming that the target speed at the moment following the current target speed s is s′, the completion function at the next moment can be computed as: [formula missing in source]
where $V_t(a^*, s')$ is the accumulated reward of executing the optimal PID sequence $a^*$ at time t when the target speed at time $t+1$ is s′; $C_t(i, s, a)$ is the completion function of subtask $M_i$ executing PID parameter set a when the current target speed at time t is s; $C_{t+1}(i, s', a)$ is the completion function of subtask $M_i$ executing PID parameter set a when the current target speed at time $t+1$ is s′; γ is the discount rate; and the learning rate for completing subtask $M_i$ at time t generally ranges from 0 to 1.
6. Repeat the steps above until the entire set $\{M_0, M_1, \ldots, M_n\}$ has been executed, placing each obtained target speed s and its corresponding optimal PID parameter set $a^*$ into Seq{ } in order. This yields the optimal speed regulation strategy for the entire diesel engine speed regulation process.
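Steps 1 to 6 can be illustrated with a short Python sketch. The update formulas are missing from this text, so the sketch assumes forms modeled on the MAXQ-0 learning rule, $V_{t+1} = (1-\alpha)V_t + \alpha r_t$ and $C_{t+1} = (1-\alpha)C_t + \alpha\gamma V_t$; the patent's own formulas may differ. The callables `q_value` (a stand-in for the trained SOM output) and `execute` (a stand-in for running the engine with a PID set), the reward magnitudes of ±1, and the simplified ordering of the steps are all hypothetical.

```python
def update_value(v_old, reward, alpha):
    """Assumed accumulated-reward update: (1 - alpha) * V_t + alpha * r_t."""
    return (1.0 - alpha) * v_old + alpha * reward

def update_completion(c_old, v_next, alpha, gamma):
    """Assumed completion-function update: (1 - alpha) * C_t + alpha * gamma * V_t."""
    return (1.0 - alpha) * c_old + alpha * gamma * v_next

def immediate_reward(prev_speed, new_speed, target):
    """Positive if the new speed is closer to the target than the previous one, else negative."""
    return 1.0 if abs(target - new_speed) < abs(target - prev_speed) else -1.0

def run_subtasks(sub_targets, candidates, q_value, execute, alpha=0.5, gamma=0.9):
    """Walk through the sub-target speeds and build Seq = [(s, a*), ...]."""
    final_target = sub_targets[-1]
    speed, seq, V, C = 0.0, [], {}, {}
    for i, s in enumerate(sub_targets):
        # Stand-in for the SOM step: pick the PID set with the largest Q(s, a).
        a_star = max(candidates, key=lambda a: q_value(s, a))
        new_speed = execute(speed, s, a_star)
        r = immediate_reward(speed, new_speed, final_target)
        V[(i, s)] = update_value(V.get((i, s), 0.0), r, alpha)
        if i > 0:
            # Update the completion value of the previous subtask with the new V.
            prev = (i - 1, sub_targets[i - 1], seq[-1][1])
            C[prev] = update_completion(C.get(prev, 0.0), V[(i, s)], alpha, gamma)
        seq.append((s, a_star))
        speed = new_speed
    return seq, V, C
```

For example, with `sub_targets = [1000.0, 2000.0]`, two candidate `(Kp, Ki, Kd)` tuples, and a toy `q_value` that prefers `Kp` close to `s / 1000`, the returned `seq` pairs each sub-target speed with the candidate that the toy Q function ranks highest.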
Figure 6 shows the block diagram of PID diesel engine adaptive electronic speed regulation based on the improved MAXQ algorithm. The specific steps are as follows:
The target speed of the diesel engine is obtained from the speed setting, and the target position of the rack is obtained from the speed controller. The S-MAXQ algorithm is called to compute the value function of the current speed regulation strategy. When the value function reaches its maximum, the strategy is applied to the rack position controller to adjust the diesel engine speed, and the speed regulation strategy is optimal. Once the strategy is optimal, the actuator PID parameters are updated in real time; as the PID control deviation approaches the desired output, the diesel engine speed approaches the set target speed with optimal speed regulation indices.
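The real-time update of the actuator PID parameters can be illustrated with a discrete PID controller whose gains are replaceable while it runs. This is a minimal sketch under simplifying assumptions: the first-order plant model, its time constant, the gain values, and the single control loop (the text describes a speed controller feeding a rack position controller) are hypothetical and not taken from the source.

```python
class PID:
    """Discrete PID controller whose gains can be replaced while it is running."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def set_gains(self, kp, ki, kd):
        """Swap in a new PID parameter set; the integral state is kept as it is."""
        self.kp, self.ki, self.kd = kp, ki, kd

    def step(self, setpoint, measured):
        """Return the control output for one sample period."""
        error = setpoint - measured
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate(target=1500.0, steps=1000, dt=0.01, tau=0.5):
    """Drive a hypothetical first-order speed model toward the target speed."""
    pid = PID(kp=2.0, ki=5.0, kd=0.0, dt=dt)
    speed = 0.0
    for k in range(steps):
        if k == steps // 2:
            pid.set_gains(3.0, 6.0, 0.0)   # online update of the PID parameter set
        u = pid.step(target, speed)
        speed += dt * (u - speed) / tau    # assumed plant: tau * d(speed)/dt = u - speed
    return speed
```

Because `set_gains` leaves the integral state untouched, a gain change produces a short transient before the speed settles again; in this toy model the speed returns to the target well within the simulated time.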
Next, the adaptive electronic speed regulation method proposed by the invention is analyzed from two angles: algorithm complexity and the ability to optimize subtasks automatically.
First, algorithm complexity. Diesel engine speed regulation is a continuous process. The overall speed regulation task is divided into a set of subtasks $\{M_0, M_1, \ldots, M_n\}$ that are interrelated, so the corresponding S-MAXQ algorithm is recursive: it selects the PID parameter set with the maximum Q value to obtain the optimal PID parameter set for the current speed, and thereby recursively finds the optimal PID selection strategy for the other subtasks. The algorithm iterates over the n subtasks; its time complexity is $O(n)$ and its space complexity is $O(n^2)$. The overall complexity of the algorithm is characterized as linear, and the computational overhead is small.
Second, the ability to obtain the optimal PID parameter set automatically. The S-MAXQ algorithm simulates the abstraction process of the subtasks by adjusting the SOM neural network weights, converting the problem of obtaining the optimal PID parameter set for the current speed into the problem of finding the optimal output state sequence of the SOM neural network. The SOM weights tend to stabilize after several rounds of training, so the maximum output value $Q(s, a)$ is easy to obtain, which yields the current speed and the corresponding optimal PID parameter set.

Claims (6)

1. A PID diesel engine adaptive electronic speed regulation method based on the improved hierarchical reinforcement learning algorithm MAXQ, characterized in that: the target speed of the diesel engine is set and used as the input of a self-organizing map (SOM) neural network to solve for the PID parameter set required to reach the target speed at the current time; the Q value computed for each subtask is the output of the SOM; the speed controller obtains the target position of the diesel engine rack; the target-speed-based autonomous optimization algorithm S-MAXQ is called to compute the value function of the current speed regulation strategy; when the value function reaches its maximum, the strategy is applied to the rack position controller to adjust the diesel engine speed; and the actuator PID parameters are updated in real time to bring the diesel engine speed to the set target speed.
2. The method according to claim 1, characterized in that the target-speed-based autonomous optimization algorithm S-MAXQ specifically comprises: decomposing the PID parameter sequence set M corresponding to the final target speed of the diesel engine into the subtask set $\{M_0, M_1, \ldots, M_n\}$, each subtask $M_i$ corresponding to one sub-target speed; solving for the PID parameter sequence set corresponding to each sub-target speed and decomposing it into the policy set $\{\pi_0, \pi_1, \ldots, \pi_n\}$; if each policy in the policy set is the optimal policy of its corresponding subtask, then the policy set $\pi = (\pi_0, \pi_1, \ldots, \pi_n)$ is the recursively optimal policy of M, which gives the optimal PID selection strategy, where the i-th subtask policy $\pi_i$ is the PID selection strategy of the i-th subtask $M_i$.
3. The method according to claim 1, characterized in that computing the value function of the current speed regulation strategy specifically comprises: decomposing the root task value function $V^{\pi}(i, s)$ of the root node according to the formula $V^{\pi}(i, s) = V^{\pi}(a_i, s) + C^{\pi}(a_{n-1}, s, a_n) + \cdots + C^{\pi}(a_1, s, a_2) + C^{\pi}(a_0, s, a_1)$; extending each subtask in the subtask set $\{M_0, M_1, \ldots, M_n\}$ and representing the extended subtask by the four-tuple $\langle \pi_i, T_i, A_i, r_i \rangle$, where $\pi_i$ is the PID selection strategy of the i-th subtask $M_i$, $T_i$ is the termination predicate, $A_i$ is the subtask set of $M_i$, $r_i$ is the immediate reward function, $V^{\pi}(i, s)$ is the value function of $M_i$, $C^{\pi}(a_{n-1}, s, a_n)$ is the completion function of subtask $M_{a_n}$, s is the target speed, and $V^{\pi}(a_i, s)$ is the value function of the subtask denoted by $a_i$.
4. The method according to claim 1, characterized in that the PID control algorithm specifically comprises: dividing the diesel engine speed regulation parameter sequence set M into the subtask set $\{M_0, M_1, \ldots, M_n\}$; initializing the learning rate, the discount factor γ, and the state sequence Seq{ } that records the states experienced while executing subtask $M_i$, the sequence containing the target speed value s of the current subtask and the corresponding optimal PID parameter set $a^*$; executing subtask $M_i$ and receiving the immediate reward $r_t(i, s)$ for the current target speed; using the immediate reward in the formula [formula missing in source] to compute the accumulated reward $V_{t+1}(i, s)$ of executing subtask $M_i$ when the target speed at time t is s, the formula including a learning rate; adjusting the weights of the SOM neural network to obtain the optimal Q value $Q(s, a^*)$ of subtask $M_i$ at the current target speed s; using the formula [formula missing in source] to compute the completion function $C_{t+1}(i, s', a)$ of subtask $M_i$ executing PID parameter set a, where $V_t(a^*, s')$ is the accumulated reward of executing the optimal PID sequence $a^*$ at time t when the target speed at time $t+1$ is s′, $C_t(i, s, a)$ is the completion function of subtask $M_i$ executing PID parameter set a when the current target speed at time t is s, and the current target speed at time $t+1$ is s′; and repeating until all subtasks in the parameter sequence set $\{M_0, M_1, \ldots, M_n\}$ have been executed, placing each obtained target speed s and its corresponding optimal PID parameter set $a^*$ into Seq{ } in order, thereby obtaining the optimal speed regulation strategy for the entire diesel engine speed regulation.
5. The method according to any one of claims 1-4, characterized in that when the value function reaches its maximum, the strategy is applied to the rack position controller to adjust the diesel engine speed, and the speed regulation strategy is optimal; the actuator PID parameters are updated in real time, and as the PID control deviation approaches the desired output, the diesel engine speed approaches the set target speed with optimal speed regulation indices.
6. The method according to claim 4, characterized in that if executing a given group of PID parameter sequences brings the diesel engine speed closer to the target speed at that moment than executing the previous group did, a positive value is assigned to the immediate reward; otherwise a negative value is assigned to the immediate reward.
The method according to any one of claims 1-4, characterized in that the neural network weights are adjusted, specifically: each weight vector of the output layer is assigned random numbers and the Q value of each subtask is normalized; the initial winning neighborhood $N_{j^*}(0)$ and the initial value of the learning rate η are established, where m is the number of output-layer neurons; the diesel engine target speed training set is normalized to obtain the normalized set; the node with the largest dot product between the normalized weight vector and the normalized input is the winning node; with the winning node $j^*$ as the center, the weight adjustment domain at time t is determined and defined as the winning neighborhood; the weight $w_t$ at time t is computed according to the formula [formula missing in source], adjusting the weights of all nodes in the winning neighborhood, where $w_t$ is the weight at time t, $Q_t$ is the value function corresponding to the target speed at time t, $r_{t-1}$ is the reward value at time $t-1$, the learning rate is that between neighborhood node j and the winning node $j^*$ at training time t, and γ is the discount factor.
CN201910558083.9A 2019-06-26 2019-06-26 A kind of PID diesel engine self-adapting electronic speed regulating method Pending CN110259592A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910558083.9A CN110259592A (en) 2019-06-26 2019-06-26 A kind of PID diesel engine self-adapting electronic speed regulating method

Publications (1)

Publication Number Publication Date
CN110259592A true CN110259592A (en) 2019-09-20

Family

ID=67921524

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910558083.9A Pending CN110259592A (en) 2019-06-26 2019-06-26 A kind of PID diesel engine self-adapting electronic speed regulating method

Country Status (1)

Country Link
CN (1) CN110259592A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101705873A (en) * 2009-11-09 2010-05-12 哈尔滨工程大学 Speed-regulating device and speed-regulating method for sequential supercharged diesel engine
CN102787915A (en) * 2012-06-06 2012-11-21 哈尔滨工程大学 Diesel engine electronic speed adjusting method based on reinforced study of proportion integration differentiation (PID) controller
CN104832307A (en) * 2015-04-09 2015-08-12 哈尔滨工程大学 Diesel engine rotating speed control method
CN108062618A (en) * 2017-11-30 2018-05-22 中国船舶工业***工程研究院 Low-speed diesel engine Economic Analysis Method and system based on biradical line
CN108170147A (en) * 2017-12-31 2018-06-15 南京邮电大学 A kind of unmanned plane mission planning method based on self organizing neural network
CN108667734A (en) * 2018-05-18 2018-10-16 南京邮电大学 A fast routing decision algorithm based on Q-learning and LSTM neural networks

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
张朦朦: "基于分层强化学习的MAUVS围捕策略研究", 《中国优秀硕士学位论文全文数据库(电子期刊)信息科技辑》 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112596374A (en) * 2020-11-26 2021-04-02 中广核核电运营有限公司 Adjusting performance optimization and state monitoring system and method of electronic speed regulator
CN114200840A (en) * 2021-12-10 2022-03-18 广东工业大学 Traditional Chinese medicine pharmacy process operation optimization method based on distributed model predictive control
CN114200840B (en) * 2021-12-10 2023-05-23 广东工业大学 Traditional Chinese medicine pharmaceutical process operation optimization method based on distributed model predictive control
CN114370348A (en) * 2022-01-14 2022-04-19 哈尔滨工程大学 Control parameter setting method for engine rotating speed control system
CN115075967A (en) * 2022-06-29 2022-09-20 东风汽车集团股份有限公司 Electronic throttle control method of supercharged direct injection gasoline engine
CN115075967B (en) * 2022-06-29 2023-11-03 东风汽车集团股份有限公司 Electronic throttle control method of supercharged direct injection gasoline engine
CN115750108A (en) * 2022-11-29 2023-03-07 上海船舶运输科学研究所有限公司 Multifunctional speed regulation driving system and method for marine high-power diesel engine
CN115750108B (en) * 2022-11-29 2024-01-23 上海船舶运输科学研究所有限公司 Multifunctional speed regulation driving system and method for marine high-power diesel engine

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190920