JP7221833B2 - Nonlinear model predictive controller

Info

Publication number: JP7221833B2
Application number: JP2019158051A
Authority: JP
Inventors: 敏之大塚; 想太郎片山; 将弘土井
Original assignee: Kyoto University; Toyota Motor Corp
Current assignee: Kyoto University; Toyota Motor Corp
Priority date: 2019-08-30
Filing date: 2019-08-30
Publication date: 2023-02-14
Anticipated expiration: 2039-08-30
Also published as: JP2021036390A

Description

Application of Article 30, Paragraph 2 of the Patent Act: presented at the 5th meeting of the Research Group on Control Theory Based on Characterization of Nonlinear Phenomena, held on August 31, 2018.

The present invention relates to a nonlinear model predictive control device.

When controlling a system with low stability, such as a bipedal walking robot, it is effective to use model predictive control (receding horizon control), which performs control while predicting the behavior of the system up to a finite time in the future. Model predictive control is a feedback control that, in every control cycle (sampling period), solves an optimal control problem from the current time to a finite time in the future and determines the control input value.

Because a bipedal walking robot is a highly nonlinear system involving walking motions and the like, feedback control of such a robot is preferably performed by nonlinear model predictive control.

Non-Patent Document 1 discloses a control device using a nonlinear control model for a controlled object (for example, the motion of a robot's leg) that is premised on collisions with surrounding objects.

M. Yamakita, A. Taura, and Y. Onodera, "An application of nonlinear receding horizon control to posture control with collisions," Proceedings of International Conference on Advanced Intelligent Mechatronics, July 2005, pp. 1505-1510

However, it is difficult to perform control while predicting at what future timing (from the next moment onward) a collision with a surrounding object (for example, a robot's leg touching the ground) will occur. For this reason, the technique described in Non-Patent Document 1 executes control with the collision timing treated as a provisional time set in advance. A highly accurate nonlinear model predictive control device whose control results better match the actual future collision timing is therefore desired.

The present invention has been made in view of the circumstances described above. Its object is to provide, for a controlled object that is premised on colliding with surrounding objects, a highly accurate nonlinear model predictive control device whose control results better match the actual future collision timing.

The nonlinear model predictive control device according to the present invention is configured to control a controlled object while predicting, at each time, the future response of the controlled object, by performing feedback control while computing an optimization problem of a nonlinear control model of the controlled object. For a controlled object configured to be controlled so as to execute a motion premised on repeated collisions with surrounding objects, the device comprises: prediction means for predicting, at each time, the number of such collisions from that time until a predetermined period later; and computation means for executing the computation of the optimization problem of the nonlinear control model, which is premised on the occurrence of collisions with surrounding objects, in accordance with the predicted number of collisions.

As described above, the present invention predicts, at each time, the number of times the controlled object will collide within a predetermined period, and computes the optimization problem of the nonlinear control model in accordance with the predicted number of collisions. The nonlinear model predictive control device according to the present invention can therefore execute highly accurate predictive control whose control results better match the actual future collision timing.

According to the present invention, it is possible to provide, for a controlled object that is premised on colliding with surrounding objects, a highly accurate nonlinear model predictive control device whose control results better match the actual future collision timing.

FIG. 1 is a schematic diagram showing the robot system according to Embodiment 1.
FIG. 2 is a functional block diagram showing the configuration of the robot system according to Embodiment 1.
FIG. 3 is a diagram for explaining a method of applying the robot according to Embodiment 1 to a compass-type model.
FIG. 4 is a schematic diagram showing the compass-type robot according to Embodiment 1.
FIG. 5 is a schematic diagram showing two states of the compass-type robot of FIG. 4: the single-leg support period and the double-leg support period.
FIG. 6 is a diagram showing the model according to Embodiment 1.
FIGS. 7 to 12 are diagrams each showing a result of simulating walking control of the compass-type model as an example of the nonlinear model predictive control according to Embodiment 1.
FIG. 13 is a graph of the predicted collision times on the horizon, obtained by simulating walking control of the compass-type model as an example of the nonlinear model predictive control according to Embodiment 1.
FIG. 14 is a graph showing an enlarged part of FIG. 13.

Hereinafter, embodiments of the present invention will be described with reference to the drawings. In each drawing, the same elements are denoted by the same reference numerals, and redundant description is omitted as necessary.

<Embodiment 1>
FIG. 1 is a schematic diagram showing a robot system 1 according to Embodiment 1. FIG. 2 is a functional block diagram showing the configuration of the robot system 1 according to Embodiment 1. The robot system 1 has a robot 100 and a control device 2 that controls the motion of the robot 100.

The robot 100 has a torso 102 and two legs, a right leg 110R and a left leg 110L. The robot 100 is a bipedal walking robot capable of walking with its two legs (the right leg 110R and the left leg 110L). The right leg 110R and the left leg 110L are provided at the lower part of the torso 102 of the robot 100. As shown in FIG. 1, the forward direction of the robot 100 is the X-axis direction and the upward direction is the Y-axis direction. In the following, "R" is appended to the reference numerals of components relating to the right leg 110R and "L" to those relating to the left leg 110L; where left and right need not be distinguished for a component, "R" and "L" may be omitted as appropriate.

The right leg 110R has, in order from the side nearer the torso 102, a hip joint 120R, an upper leg 112R, a knee joint 122R, a lower leg 114R, an ankle joint 124R, and a foot 116R. Similarly, the left leg 110L has, in order from the side nearer the torso 102, a hip joint 120L, an upper leg 112L, a knee joint 122L, a lower leg 114L, an ankle joint 124L, and a foot 116L. A sole sensor 118 is provided on the bottom of each of the foot 116R and the foot 116L. The sole sensor 118 detects the load applied to the bottom of the foot 116.

The hip joint 120R and the hip joint 120L are attached to the lower part of the torso 102. The upper leg 112R and the upper leg 112L are connected to the torso 102 via the hip joint 120R and the hip joint 120L, respectively. In other words, the right leg 110R and the left leg 110L are connected to the torso 102 via the hip joint 120R and the hip joint 120L, respectively.

The upper leg 112R and the lower leg 114R are connected via the knee joint 122R. Similarly, the upper leg 112L and the lower leg 114L are connected via the knee joint 122L. The lower leg 114R and the foot 116R are connected via the ankle joint 124R. Similarly, the lower leg 114L and the foot 116L are connected via the ankle joint 124L.

The hip joint 120 rotates about an axis perpendicular to the XY plane (that is, a horizontal axis in the lateral direction of the robot 100). This allows the right leg 110R and the left leg 110L to move back and forth. The robot 100 can therefore walk by moving the right leg 110R and the left leg 110L forward alternately.

The knee joint 122 rotates about an axis perpendicular to the XY plane. This allows the right leg 110R and the left leg 110L to bend at the knee joint 122. The ankle joint 124 also rotates about an axis perpendicular to the XY plane. This allows the foot 116 to move up and down relative to the lower leg 114.

As shown in FIG. 2, each joint of the robot 100 (the hip joint 120, the knee joint 122, and the ankle joint 124) has an angle sensor 130 and a motor 140. The angle sensor 130 is, for example, an encoder, and detects the joint angle of the joint. The motor 140 functions as an actuator that drives the joint. Each joint may also have a torque sensor 136 that detects the torque of the motor 140 of that joint. A camera for detecting the state of the surroundings of the robot 100 may be built into the torso 102.

The control device 2 functions, for example, as a computer. The control device 2 may be mounted inside the robot 100 (for example, in the torso 102). Alternatively, the control device 2 may be physically separate from the robot 100, in which case it may be communicably connected to the robot 100 by wire or wirelessly. The control device 2 controls the motion of the robot 100, in particular the motion of the right leg 110R and the left leg 110L. More specifically, the control device 2 controls the postures of the right leg 110R and the left leg 110L by controlling the torque of the motor of each joint. That is, in the robot system 1, the control device 2 functions as a master device and the robot 100 functions as a slave device.

As its main hardware configuration, the control device 2 has a CPU (Central Processing Unit) 4, a ROM (Read Only Memory) 6, and a RAM (Random Access Memory) 8. The CPU 4 functions as an arithmetic unit that performs control processing, arithmetic processing, and the like. The ROM 6 stores control programs, arithmetic programs, and the like executed by the CPU 4. The RAM 8 temporarily stores processing data and the like.

The control device 2 also has a state acquisition unit 12, a nonlinear model predictive control unit 14, and a servo control unit 16 (hereinafter referred to as "the components"). Each component can be realized, for example, by the CPU 4 executing a program stored in the ROM 6. Each component may also be realized by recording the necessary program on any non-volatile recording medium and installing it as needed. The components are not limited to being realized by software as described above, and may be realized by hardware such as circuit elements.

The state acquisition unit 12 functions as state acquisition means for acquiring data (state parameters) indicating the current walking-related state of the robot 100. The state acquisition unit 12 acquires the detection values of the sensors (the angle sensor 130, the sole sensor 118, and the torque sensor 136). The state acquisition unit 12 then outputs the acquired detection values (and values obtained from the detection values) to the nonlinear model predictive control unit 14. A "value obtained from a detection value" may be, for example, the joint angular velocity (rate of change, time derivative) when the detection value is a joint angle detected by the angle sensor 130. In this case, the state parameters may indicate the joint angles and the joint angular velocities.
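The derivation of a joint angular velocity from successive angle readings can be sketched as follows. This is only an illustration under assumptions: the text does not say how the time derivative is computed, so a backward difference over one sampling period is assumed, and the function names `joint_velocity` and `state_parameters` are hypothetical.

```python
def joint_velocity(prev_angle, curr_angle, dt):
    """Backward-difference estimate of a joint angular velocity [rad/s].

    Assumption: the velocity is approximated from two consecutive
    angle-sensor readings taken dt seconds apart.
    """
    if dt <= 0.0:
        raise ValueError("dt must be positive")
    return (curr_angle - prev_angle) / dt


def state_parameters(prev_angles, curr_angles, dt):
    """Stack the current joint angles and their estimated velocities."""
    velocities = [
        joint_velocity(p, c, dt) for p, c in zip(prev_angles, curr_angles)
    ]
    return list(curr_angles) + velocities
```

For example, `state_parameters([0.0, 0.1], [0.01, 0.08], 0.01)` returns the two current angles followed by velocities of approximately 1.0 and -2.0 rad/s.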

The nonlinear model predictive control unit 14 is the part corresponding to the nonlinear model predictive control device according to the present embodiment. The nonlinear model predictive control unit 14 functions as calculation means for calculating a control input value (input value) for controlling the motion of the robot 100. The nonlinear model predictive control unit 14 can receive at least part of the detection values (and values obtained from the detection values) from the state acquisition unit 12 as state parameters. Based on these state parameters, the nonlinear model predictive control unit 14 calculates the control input value for controlling the motion of the robot 100 using a model predictive control algorithm. The nonlinear model predictive control unit 14 outputs the calculated control input value to the servo control unit 16.

Based on the state parameters, the nonlinear model predictive control unit 14 calculates a control input value for controlling the motion of the controlled object using a nonlinear model predictive control (model predictive control) algorithm, and outputs it to the servo control unit 16. Details of the nonlinear model predictive control will be described later. The nonlinear model predictive control unit 14 may also receive necessary command values (step length, walking period, and the like) from a host controller (not shown) outside the robot system 1. Even when the controlled object is something other than a robot, the same basically applies, except that the output destination is the relevant part of that controlled object.

The servo control unit 16 functions as control means for controlling the motion of the robot 100 using the control input value calculated by the nonlinear model predictive control unit 14. The servo control unit 16 controls each joint of the robot 100 so that the calculated control input value is attained. The servo control unit 16 may also have the function of a servo amplifier. When performing torque control, the servo control unit 16 controls the motor 140 of each joint so that the torque of each joint (joint torque) equals the calculated control input value. In doing so, the servo control unit 16 may perform feedback control using the torque value detected by the torque sensor 136 of each joint.

Next, the nonlinear model predictive control according to the present embodiment will be described.
Nonlinear model predictive control is a control method that, for a nonlinear system, obtains at each sampling time the optimal input (the optimal solution for the control input value) up to a finite time in the future, and uses the initial value of the obtained input as the actual input. In other words, the nonlinear model predictive control unit 14 is configured to control the controlled object while predicting, at each time, the future response of the controlled object, by performing feedback control while computing the optimization problem of the nonlinear control model of the controlled object. Nonlinear model predictive control has three advantages: it is a nonlinear optimal control, it is a feedback control, and constraint conditions are easy to incorporate.
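The receding-horizon principle described here (solve a finite-horizon problem at every sampling time, apply only the first input) can be sketched as below. This is a toy illustration under assumptions, not the solver of the embodiment: the scalar plant `dx/dt = -x**3 + u`, the quadratic cost, and the exhaustive search over a small set of candidate inputs are all hypothetical choices made so that the example stays self-contained.

```python
import itertools


def step(x, u, dt):
    """One forward-Euler step of a hypothetical plant dx/dt = -x**3 + u."""
    return x + dt * (-x**3 + u)


def horizon_cost(x0, inputs, dt):
    """Cost of an input sequence over the horizon (stage plus terminal cost)."""
    x, cost = x0, 0.0
    for u in inputs:
        cost += dt * (x**2 + 0.1 * u**2)
        x = step(x, u, dt)
    return cost + x**2


def nmpc_input(x0, dt, horizon_steps=4,
               candidates=(-1.0, -0.5, 0.0, 0.5, 1.0)):
    """Solve the finite-horizon problem by enumeration; return the first input."""
    best = min(
        itertools.product(candidates, repeat=horizon_steps),
        key=lambda seq: horizon_cost(x0, seq, dt),
    )
    return best[0]


def run_closed_loop(x0, dt=0.1, n_steps=30):
    """Re-solve at every sampling time and apply only the first input."""
    x = x0
    for _ in range(n_steps):
        u = nmpc_input(x, dt)
        x = step(x, u, dt)
    return x
```

Enumeration is used only because it needs no external library; it does not scale to long horizons or continuous inputs, which is why real-time solvers are needed in practice.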

Because nonlinear model predictive control is a feedback control, it is robust against disturbances, and a wide variety of constraint conditions can be combined. Owing to these features, nonlinear model predictive control is expected to be introduced into many systems. With conventional iterative methods such as Newton's method, however, it has been difficult to converge to the optimal solution within the sampling period.

In recent years, the C/GMRES (continuation/generalized minimum residual) method has been devised as an effective numerical method for this problem. The C/GMRES method is an algorithm that combines the continuation method with the GMRES method. For a system whose state changes continuously, the C/GMRES method exploits the continuity of the optimal solution and tracks the optimal solution while computing its rate of change. Using the C/GMRES method makes it possible to control a system in real time even with nonlinear model predictive control. That is, the C/GMRES method makes it possible to solve the optimal control problem up to a finite time in the future within the sampling period. Computation by the C/GMRES method can also be applied in the present embodiment. The C/GMRES method will be described later.

As its main feature, the nonlinear model predictive control unit 14 according to the present embodiment includes the following prediction means and computation means. For a controlled object configured to be controlled so as to execute a motion (for example, walking) premised on repeated collisions with surrounding objects (such as contact with the surrounding environment), the prediction means predicts, at each time, the number of such collisions from that time until a predetermined period later. This prediction means can also be called collision count prediction means.

The computation means executes the computation of the optimization problem of the nonlinear control model, which is premised on the possibility of collisions with surrounding objects, in accordance with the predicted number of collisions (treating each predicted number of collisions as a separate case). An example of control by this nonlinear model predictive control unit 14 is described below.
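One way to picture the two means is sketched below. This passage does not say how the number of collisions is predicted, so the sketch rests on an assumption: a predicted trajectory over the horizon is already available as samples of a guard quantity (for example, the height of the free foot above the walking surface), and a collision is counted whenever that quantity goes from positive to non-positive. The names `count_predicted_collisions` and `select_problem` are hypothetical.

```python
def count_predicted_collisions(guard_values):
    """Count transitions of the guard quantity from positive to non-positive.

    guard_values: samples of a guard quantity along the predicted
    trajectory over the horizon (assumed input, e.g. free-foot height).
    """
    count = 0
    for prev, curr in zip(guard_values, guard_values[1:]):
        if prev > 0.0 and curr <= 0.0:
            count += 1
    return count


def select_problem(n_collisions, problems):
    """Return the optimization-problem formulation for this collision count.

    problems: mapping from collision count to a formulation (hypothetical
    placeholder for the per-count computation described in the text).
    """
    if n_collisions not in problems:
        raise KeyError(f"no formulation for {n_collisions} collisions")
    return problems[n_collisions]
```

With the samples `[0.2, 0.1, -0.01, 0.05, 0.1, 0.0]`, two collisions are counted, so the formulation registered under key `2` would be selected.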

[Controlled object and control objective]
Next, an example in which the nonlinear model predictive control described above is applied to control of the motion of the robot 100 according to the present embodiment will be described. Embodiment 1 describes an example in which the robot 100 as the controlled object is a compass-type model, but the nonlinear model predictive control is applicable even when the robot 100 is not a compass-type model.

The walking motion of the robot 100 includes a motion in which the free leg, that is, the leg not on the ground, collides with (lands on) the ground. The generalized velocity of the robot 100 changes discontinuously across this collision. In other words, a state jump occurs at this moment. Walking is also, in general, a periodic motion. It is therefore possible to control the robot 100 so that a state jump occurs (that is, the free leg lands) at every predetermined period. Note that "landing" is not limited to the free leg colliding with (contacting) the ground. "Landing" means that the free leg comes into contact with the surface on which the robot 100 is walking (the walking surface).

FIG. 3 is a diagram for explaining a method of applying the robot 100 according to Embodiment 1 to a compass-type model. In the example shown in FIG. 3, the right leg 110R is the supporting leg, which is on the ground 90 during the period of support by one leg, and the left leg 110L is the free leg (swing leg). To emulate point contact between the supporting leg and the ground, the control device 2 uses the torque sensor 136 provided at the ankle joint 124 of the supporting leg (the right leg 110R in the example of FIG. 3) to control the torque of the ankle joint 124 of the supporting leg to 0. The control device 2 also controls the knee joints 122 of the right leg 110R and the left leg 110L so that they are locked in the extended state. That is, the control device 2 controls the motors 140 of the knee joints 122 so that the joint angles of the knee joints 122 of the right leg 110R and the left leg 110L are the angle corresponding to the extended state (for example, 0). Furthermore, the control device 2 detects landing of the free leg using the sole sensor 118 of the free leg (the left leg 110L in the example of FIG. 3). In this way, the robot system 1 can emulate a compass-type model.

FIG. 4 is a schematic diagram showing a compass-type robot, for explaining an example in which the robot 100 according to Embodiment 1 is applied to a compass-type model. The robot (bipedal walking robot) 100 illustrated in FIG. 4 is modeled as a compass-type model consisting of a joint 150, a supporting leg link 151, and a free leg link 152. The joint 150 corresponds to the torso 102 and the hip joints 120. The supporting leg link 151 corresponds to whichever of the right leg 110R and the left leg 110L is the supporting leg. The free leg link 152 corresponds to whichever of the right leg 110R and the left leg 110L is the free leg.

Let the mass of the joint 150 be m0. As indicated by the arrow in FIG. 4, an input torque u is applied about the joint 150 as the control input value. The supporting leg link 151 and the free leg link 152 are assumed to have identical physical properties. Let l be the length of each of the supporting leg link 151 and the free leg link 152, and let m be the mass of each. Let d be the distance from the joint 150 to the center of gravity of each link (center of gravity 151m and center of gravity 152m). Let I be the moment of inertia of each of the supporting leg link 151 and the free leg link 152 about its center of gravity (center of gravity 151m and center of gravity 152m).

Let θ1 be the angle of the supporting leg link 151 with respect to the vertical direction, and let θ2 be the angle of the free leg link 152 with respect to the vertical direction. In FIG. 4, the clockwise direction (the direction in which the joint 150 rotates forward about the lower end of each link) is taken as positive. In the state shown in FIG. 4, therefore, θ2 < 0.

The nonlinear model predictive control unit 14 controls the most simplified model (the compass-type model) of a bipedal walking robot 100, which is exemplified by the robot 100 of FIG. 1 and represented as in FIG. 4. This control is real-time dynamic walking control using nonlinear model predictive control.

FIG. 5 is a schematic diagram showing two states of the compass-type robot: the single-leg support period and the double-leg support period. In the motion of the robot 100 shown in FIG. 4, the robot takes two states, as shown by the robots 100-1 and 100-2 in FIG. 5. The state shown by the robot 100-1 is one in which one leg is on the ground 90 and the other leg is off the ground 90 (single-leg support period). The state shown by the robot 100-2 is one in which both legs are on the ground 90 (double-leg support period). In the following, the leg that is on the ground 90 during the single-leg support period is called the supporting leg, and the leg that is not on the ground 90 during that period is called the free leg.

また、歩行動作について次の４つの仮定を置く。
・遊脚と地面９０との衝突は完全非弾性衝突とする。すなわち、地面９０と衝突した脚が跳ね返ることはない。
・衝突の直後に、それまで支持脚だった脚は相互作用無しで地面９０から離れる。
・遊脚と地面９０との衝突は一瞬とする。すなわち、両脚支持期は一瞬とする。
・衝突の撃力によりロボットの速度は瞬間的に変わるが、座標は瞬間的には変わらないものとする。 In addition, the following four assumptions are made about the walking motion.
・The collision between the free leg and the ground 90 is perfectly inelastic; that is, the leg that collides with the ground 90 does not rebound.
・Immediately after the collision, the leg that had been the supporting leg leaves the ground 90 without interaction.
・The collision between the free leg and the ground 90 is instantaneous; that is, the double-leg support phase is instantaneous.
・The impact force of the collision changes the velocities of the robot instantaneously, but the coordinates do not change instantaneously.

これらの仮定により、ロボットの歩行動作におけるダイナミクスは、片脚支持期の運動という連続変化、遊脚と地面９０との衝突という不連続変化、という２つの事象に分けることができる。次に、これらの仮定に基づき、２つの方程式を導出する。 Based on these assumptions, the dynamics of the walking motion of the robot can be divided into two phenomena: a continuous change of movement during the one-leg support period, and a discontinuous change of collision between the free leg and the ground 90 . We then derive two equations based on these assumptions.

まず片脚支持期の運動方程式は、一般化座標ｑを

ととると、ラグランジュの運動方程式が、

と得られる。 First, for the single-leg support phase, taking the generalized coordinates as q = [θ_1 θ_2]^T, Lagrange's equation of motion is obtained as M(q)q(double dot) + H(q, q(dot)) = τ(u).

ここで、τ（ｕ）はジョイント部に対する制御入力トルク、Ｍ（ｑ）は慣性行列、Ｈ（ｑ，ｑ（ドット））は重力とコリオリの力を表す項であり、それぞれ次のような運動方程式で示される通りである。

Here, τ(u) is the control input torque applied to the joint, M(q) is the inertia matrix, and H(q, q(dot)) is the term representing gravity and the Coriolis force; each is given by the following equations.

これより、ロボットの運動方程式及び状態空間ベクトルｘを

とすると、制御に用いるための状態方程式が

と得られる。 From this, taking the equation of motion of the robot and the state-space vector x as

the state equation used for control is obtained as follows.

なお、ここで説明するコンパス型モデルでは、歩行の拘束条件として、ＺＭＰ（zero moment point）は考慮されないものとする。 In the compass model described here, ZMP (zero moment point) is not taken into consideration as a constraint condition for walking.

次に、衝突の式を導出する。簡単のため、本実施の形態では衝突後にθ_１＝θ_２という座標の取り直しを行う（但しこの際物理的な意味は一切不変である）。衝突の際に物理的な座標は変わらないという仮定から、衝突直前の座標をｑ^－＝［θ_１ ^－ θ_２ ^－］^Τ、直後の座標をｑ^＋＝［θ_１ ^＋ θ_２ ^＋］^Τと置き、正方行列を次式のＩ（バー）で置く。

Next, the collision equation is derived. For simplicity, in this embodiment the coordinates are relabeled as θ_1 = θ_2 after the collision (the physical meaning is unchanged by this relabeling). From the assumption that the physical coordinates do not change at the collision, let the coordinates immediately before the collision be q^− = [θ_1^− θ_2^−]^T and those immediately after the collision be q^+ = [θ_1^+ θ_2^+]^T, and let the square matrix I(bar) be defined by the following equation.

すると、次式が成り立つ。

Then, the following formula holds.

それぞれ次式で表される衝突直前の速度、衝突直後の速度

については、遊脚についての角運動量保存則から次式が成り立つ。 For the velocity immediately before the collision and the velocity immediately after the collision, each expressed by the following equations,

the following equation holds from the law of conservation of angular momentum about the free leg.

但し、Ｑ^－、Ｑ^＋は下式で示す通りである。

However, Q ⁻ and Q ⁺ are as shown in the following formulas.

以上により、衝突前後のコンパス型ロボットの状態の変化は、衝突直前と直後の座標ｘ^－、ｘ^＋を用いて

と表すことができる。また、衝突が起きる条件としては遊脚の先端の地面９０からの高さが０になること、すなわち
ψ（ｘ）＝ｌ（ｃｏｓθ_１－ｃｏｓθ_２）＝０・・・（Ａ１４）
である。 From the above, the change in the state of the compass-type robot across the collision can be expressed, using the states x^− and x^+ immediately before and after the collision, as

The collision occurs when the height of the tip of the free leg above the ground 90 becomes 0, that is, when
ψ(x) = l(cos θ_1 − cos θ_2) = 0 ... (A14)
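As a minimal illustration of the collision condition (A14), the guard function can be evaluated directly from the two leg angles. The function name `psi` and the default leg length `l=1.0` (the value used later in the simulation conditions) are choices made for this sketch only.

```python
import math

def psi(theta1, theta2, l=1.0):
    """Guard function of (A14): height of the free-leg tip above the ground."""
    return l * (math.cos(theta1) - math.cos(theta2))

# The free leg is above the ground while psi > 0 and touches it when psi = 0.
print(psi(0.0, 0.3))     # positive: free-leg tip is off the ground
print(psi(-0.14, 0.14))  # zero: both leg tips are on the ground
```
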

上述の説明から分かるように、本実施の形態で例示するコンパス型歩行モデルは、図６で表されるような一般的なモデルで表すことができる。但し、ｆ_ｋ（ｘ（ｔ），ｕ（ｔ））はｋ歩目の時のコンパス型モデルの状態方程式、ψ_ｋ（ｘ）はｋ歩目の衝突の条件、γ_ｋ（ｘ）はｋ歩目の衝突による状態の不連続変化を表す。また、このモデルは歩行以外にもロボット等が外界との接触を行うモデル全般を表すことができる。 As can be seen from the above description, the compass-type walking model exemplified in this embodiment can be represented by a general model such as that shown in FIG. 6. Here, f_k(x(t), u(t)) is the state equation of the compass-type model during the k-th step, ψ_k(x) is the condition for the collision of the k-th step, and γ_k(x) represents the discontinuous change of state caused by the collision of the k-th step. Besides walking, this model can represent models in general in which a robot or the like makes contact with the outside world.

［非線形モデル予測制御とＣ／ＧＭＲＥＳ法による実時間制御アルゴリズム］
次に、このように一般化したモデルに対して非線形モデル予測制御を適用する際の最適制御問題の定式化を予測ホライゾン上の衝突の回数ごとに導出するとともに、この非線形モデル予測制御を実時間で実行する実時間制御アルゴリズムについて説明する。 [Non-linear model predictive control and real-time control algorithm by C/GMRES method]
Next, the formulation of the optimal control problem for applying nonlinear model predictive control to this generalized model is derived for each number of collisions on the prediction horizon, and a real-time control algorithm that executes this nonlinear model predictive control in real time is described.

ここでは、図６で示した歩行動作を一般化したモデルに対して非線形モデル予測制御を適用する手法を説明する。非線形モデル予測制御（Nonlinear model predictive control；以下ＮＭＰＣ）は、制御対象システムのモデルと、システムの現時刻ｔの状態に基づき制御される。ここでのモデルは、本実施の形態で言えば状態方程式ｘ（ドット）＝ｆ_ｋ（ｘ，ｕ）と衝突の式ｘ^＋＝γ_ｋ（ｘ^－）となる。具体的には、ＮＭＰＣは、上記モデルと現時刻ｔの状態に基づき、現時刻ｔから未来ｔ＋Ｔまでのシステムの挙動が最適になるような制御入力ｕ_ＯＰＴ（τ）（ｔ≦τ≦ｔ＋Τ）を求める制御則である。また、ＮＭＰＣは、実際のシステムへの制御入力ｕ（ｔ）を、得られた最適制御入力の初期値、すなわちｕ（ｔ）＝ｕ_ＯＰＴ（ｔ）として与える制御則である。 Here, a method of applying nonlinear model predictive control to the model that generalizes the walking motion shown in FIG. 6 is described. Nonlinear model predictive control (hereinafter NMPC) performs control based on a model of the system to be controlled and the state of the system at the current time t. In this embodiment, the model consists of the state equation x(dot) = f_k(x, u) and the collision equation x^+ = γ_k(x^−). Specifically, NMPC is a control law that, based on this model and the state at the current time t, finds the control input u_OPT(τ) (t ≤ τ ≤ t+T) that optimizes the behavior of the system from the current time t to the future time t+T, and gives the control input u(t) to the actual system as the initial value of the obtained optimal control input, that is, u(t) = u_OPT(t).

＊最適性条件の導出＊
ここで、システムが衝突のような不連続現象を含む際は、それに応じた最適性条件を導出する必要がある。そこで、ここでは、ホライゾン上での衝突回数に応じた最適性条件を導出する。 *Derivation of optimality condition*
Here, when the system includes discontinuous phenomena such as collisions, it is necessary to derive optimality conditions accordingly. Therefore, here, an optimality condition is derived according to the number of collisions on the horizon.

図６のような状態方程式の切り替えを持つシステムにおいて、状態ｘ∈Ｒ^ｎが状態方程式がｆ_ｋ（ｘ，ｕ）に支配されている時、状態ｘはサブシステムｋに支配され、サブシステムｋがアクティブであると記述する。 In a system with switching state equations as in FIG. 6, when the state x ∈ R^n is governed by the state equation f_k(x, u), the state x is said to be governed by subsystem k, and subsystem k is said to be active.

すなわち、図６のモデルは、サブシステムｋがアクティブな時、状態ｘ∈Ｒ^ｎは

に従い、ある条件ψ_ｋ（ｘ）∈Ｒ^ｎについて
ψ_ｋ（ｘ（ｔ_ｋ－））＝０・・・（Ａ１６）
を満たすことで、状態ジャンプ（状態の不連続変化）
ｘ（ｔ_ｋ＋）＝γ_ｋ（ｘ（ｔ_ｋ－））・・・（Ａ１７）
が起きる。 That is, in the model of FIG. 6, while subsystem k is active the state x ∈ R^n follows
x(dot)(t) = f_k(x(t), u(t)) ... (A15)
and when a condition ψ_k(x) ∈ R^n satisfies
ψ_k(x(t_k−)) = 0 ... (A16)
the state jump (discontinuous change of state)
x(t_k+) = γ_k(x(t_k−)) ... (A17)
occurs.

また、アクティブなサブシステムは、サブシステムｋからサブシステムｋ＋１にスイッチし、状態は

に支配される。 The active subsystem then switches from subsystem k to subsystem k+1, and the state is governed by x(dot)(t) = f_{k+1}(x(t), u(t)).

このモデルをＮＭＰＣによって制御する際には、最適性条件、最適制御の必要条件を導出し、それを数値的に解く必要がある。一方、システムが衝突のような不連続事象を含む際はホライゾン上の衝突の数に応じて最適制御の必要条件が変わってくる。したがって、ここでは次の３つの場面、（場面１）ホライゾン上で衝突が起きない時、（場面２）ホライゾン上で衝突が１回起きる時、（場面３）ホライゾン上で衝突が複数回起きる時、についてそれぞれ最適性条件を導出する。 When this model is controlled by NMPC, the optimality conditions (the necessary conditions for optimal control) must be derived and solved numerically. When the system includes discontinuous events such as collisions, however, the necessary conditions for optimal control change with the number of collisions on the horizon. The optimality conditions are therefore derived here for each of the following three scenes: (Scene 1) no collision occurs on the horizon, (Scene 2) one collision occurs on the horizon, and (Scene 3) multiple collisions occur on the horizon.

つまり、上述のように、非線形モデル予測制御部１４は周囲物との衝突が起こり得ることを前提とした非線形制御モデルの最適化問題の演算を行うが、その際、これらの場面に応じて演算を行うことになる。具体的には次の（場面１）～（場面３）に説明するように、衝突が無い場合から複数回発生する場合（総じて、衝突が起こりえる場合）それぞれにおいて最適化問題の演算を行う。 That is, as described above, the nonlinear model predictive control unit 14 solves the optimization problem of the nonlinear control model on the premise that a collision with a surrounding object may occur, and it does so according to these scenes. Specifically, as described in (Scene 1) to (Scene 3) below, the optimization problem is solved for each case, from no collision to multiple collisions (in general, the cases in which a collision may occur).

（場面１）ホライゾン上で衝突が起きない時
ホライゾン上で衝突が起きない時、ＮＭＰＣではシステムのモデル（Ａ１５）とシステムの現時刻の状態ｘ（ｔ）に基づき、評価関数

が最小になるようなホライゾン上の最適制御入力ｕ_ＯＰＴ（τ）（ｔ≦τ≦ｔ＋Τ）を求める。そして、ＮＭＰＣでは、実際のシステムヘの制御入力をｕ（ｔ）＝ｕ_ＯＰＴ（ｔ）として与える。 (Scene 1) When no collision occurs on the horizon
When no collision occurs on the horizon, NMPC finds, based on the system model (A15) and the current state x(t) of the system, the optimal control input u_OPT(τ) (t ≤ τ ≤ t+T) on the horizon that minimizes the evaluation function

The control input to the actual system is then given as u(t) = u_OPT(t).

但し、Ｌ_ｋはサブシステムｋに割り当てられたステージコスト関数、φ_ｋはサブシステムｋに割り当てられた終端コストである。以降では変数τ（０≦τ≦Τ）を評価区間上の時間パラメータとして扱い、ｔは定数パラメータとして扱う。また、ｘ^＊（τ；ｔ）、ｕ^＊（τ；ｔ）をそれぞれｘ（ｔ＋τ）、ｕ（ｔ＋τ）に一致するものとして扱う。 Here, L_k is the stage cost function assigned to subsystem k, and φ_k is the terminal cost assigned to subsystem k. Hereafter, the variable τ (0 ≤ τ ≤ T) is treated as the time parameter on the evaluation interval, and t is treated as a constant parameter. In addition, x^*(τ; t) and u^*(τ; t) are treated as coinciding with x(t+τ) and u(t+τ), respectively.

このとき、ＮＭＰＣの最適制御問題は

のもとで、評価関数

を最小にするような最適制御入力ｕ^＊（τ；ｔ）（０≦τ≦Τ）を求める問題となる。 The optimal control problem of NMPC is then, subject to

to find the optimal control input u^*(τ; t) (0 ≤ τ ≤ T) that minimizes the following evaluation function.

また、実際のシステムヘの制御入力は、
ｕ（ｔ）＝ｕ^＊（０；ｔ）・・・（Ａ２３）
として与えられる。 The control input to the actual system is given as
u(t) = u^*(0; t) ... (A23)

次に、この非線形最適制御問題を数値的に解くために最適制御問題を離散化して扱う。ホライゾンＴをＮ分割すると、解くべき最適制御問題は、
ｘ_０ ^＊（ｔ）＝ｘ（ｔ）・・・（Ａ２４）
ｘ_i+１ ^＊（ｔ）＝ｘ_i ^＊（ｔ）＋ｆ_ｋ（ｘ_i ^＊（ｔ），ｕ_i ^＊（ｔ））Δτ，
ｉ＝０，・・・，Ｎ－１
・・・（Ａ２５）
のもとで、評価関数

を最小にするような制御入力ｕ_i ^＊（ｔ）、ｉ＝０，・・・，Ｎ－１を求める問題になる。但し、Δτ＝Τ／Ｎであり、ｘ_i ^＊（ｔ）、ｕ_i ^＊（ｔ）はそれぞれｘ^＊（ｉΔτ；ｔ）、ｕ^＊（ｉΔτ；ｔ）に相当する値である。 Next, to solve this nonlinear optimal control problem numerically, the optimal control problem is discretized. When the horizon T is divided into N steps, the optimal control problem to be solved is, subject to
x_0^*(t) = x(t) ... (A24)
x_{i+1}^*(t) = x_i^*(t) + f_k(x_i^*(t), u_i^*(t))Δτ, i = 0, ..., N−1 ... (A25)
to find the control inputs u_i^*(t), i = 0, ..., N−1, that minimize the evaluation function

Here, Δτ = T/N, and x_i^*(t) and u_i^*(t) are the values corresponding to x^*(iΔτ; t) and u^*(iΔτ; t), respectively.
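The discretized state prediction (A24)-(A25) is a forward-Euler rollout over the horizon. The following is a minimal sketch of that rollout; the function name `rollout` and the double-integrator dynamics used to exercise it are illustrative assumptions and are not the compass-type model.

```python
def rollout(f, x0, U, dtau):
    """Forward-Euler prediction of (A24)-(A25); returns [x_0, ..., x_N]."""
    xs = [list(x0)]                      # (A24): x_0 = x(t)
    for u in U:                          # (A25) for i = 0, ..., N-1
        x = xs[-1]
        dx = f(x, u)
        xs.append([xi + dxi * dtau for xi, dxi in zip(x, dx)])
    return xs

def double_integrator(x, u):
    """Hypothetical stand-in dynamics: position and velocity driven by u."""
    return [x[1], u]

xs = rollout(double_integrator, [0.0, 0.0], [1.0, 1.0], 0.5)
print(xs)
```
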

この問題に対して停留条件、最適性条件を導出すると、

が得られる。但し、Ｈ_ｋ（ｘ，λ，ｕ）はサブシステムｋに対するハミルトニアン、
Ｈ_ｋ（ｘ，ｕ，λ）＝Ｌ_ｋ（ｘ，ｕ）＋λ^Ｔｆ_ｋ（ｘ，ｕ）・・・（Ａ３０）
である。 Deriving the stationarity conditions (optimality conditions) for this problem yields

where H_k(x, u, λ) is the Hamiltonian for subsystem k,
H_k(x, u, λ) = L_k(x, u) + λ^T f_k(x, u) ... (A30)

このとき、最適制御問題は、（Ａ２４）－（Ａ２９）を満たすような未知量ｘ_０ ^＊（ｔ），・・・，ｘ_Ｎ ^＊（ｔ）、ｕ_０ ^＊（ｔ），・・・，ｕ_Ｎ－１ ^＊（ｔ）、λ_０ ^＊（ｔ），・・・，λ_Ｎ ^＊（ｔ）を求める問題となる。もしｕ_０ ^＊（ｔ），・・・，ｕ_Ｎ－１ ^＊（ｔ）を知っていればｘ_０ ^＊（ｔ），・・・，ｘ_Ｎ ^＊（ｔ）、λ_０ ^＊（ｔ），・・・，λ_Ｎ ^＊（ｔ）は（Ａ２４）－（Ａ２８）から計算することができる。 The optimal control problem is then the problem of finding the unknowns x_0^*(t), ..., x_N^*(t), u_0^*(t), ..., u_{N−1}^*(t), λ_0^*(t), ..., λ_N^*(t) that satisfy (A24)−(A29). If u_0^*(t), ..., u_{N−1}^*(t) are known, then x_0^*(t), ..., x_N^*(t) and λ_0^*(t), ..., λ_N^*(t) can be calculated from (A24)−(A28).

そこで、本質的な未知量としてＵ（ｔ）を

として定義する。 Therefore, the essential unknown U(t) is defined as U(t) = [u_0^*(t)^T, ..., u_{N−1}^*(t)^T]^T.

このとき、Ｕ（ｔ）についての最適性条件で構成されたベクトルは

となる。 The vector composed of the optimality conditions with respect to U(t) is then as follows.
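To make the role of U(t) concrete, the sketch below evaluates an optimality residual for a scalar toy problem: a forward pass for the states, a backward pass for the costates, and then the derivative of the Hamiltonian (A30) with respect to the input at each step. The recursion used here (λ_N from the terminal cost, λ_i = λ_{i+1} + H_x Δτ, residual H_u) is the standard discrete-time form and is assumed, since the corresponding equations are not reproduced in this text; the system x(dot) = ax + bu and the cost L = (qx^2 + ru^2)/2 with φ = 0 are likewise illustrative assumptions.

```python
def residual(x0, U, dtau, a=-1.0, b=1.0, q=1.0, r=0.5):
    """Optimality residual F(U) for x' = a x + b u, L = (q x^2 + r u^2)/2, phi = 0."""
    N = len(U)
    xs = [x0]
    for u in U:                           # forward pass: state prediction
        xs.append(xs[-1] + (a * xs[-1] + b * u) * dtau)
    lam = [0.0] * (N + 1)                 # lam_N = d(phi)/dx = 0
    for i in range(N - 1, -1, -1):        # backward pass: costates
        lam[i] = lam[i + 1] + (q * xs[i] + a * lam[i + 1]) * dtau
    # H_u = r u_i + b lam_{i+1} must vanish at the optimum
    return [r * U[i] + b * lam[i + 1] for i in range(N)]

print(residual(1.0, [0.0, 0.0], 0.1))
```

Only U is treated as unknown: the states and costates are recomputed from U on every evaluation, which is what allows the conditions to be written as a single equation F(U(t), x(t), t) = 0.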

（場面２）ホライゾン上で衝突が１回起きる時
次に、ホライゾン上で衝突が１回起きる場合について最適性条件を導出する。ホライゾン上の時刻τ_ｋで衝突が起きるとすると、状態について、

と予測することができる。 (Scene 2) When one collision occurs on the horizon
Next, the optimality conditions are derived for the case where one collision occurs on the horizon. If the collision occurs at time τ_k on the horizon, the state can be predicted as follows.

このとき、最適制御問題は、これらの状態についての方程式のもとで

を最小にするような最適制御入力ｕ^＊（τ；ｔ）（０≦τ≦Τ）を求める問題となる。 The optimal control problem is then, subject to these state equations, to find the optimal control input u^*(τ; t) (0 ≤ τ ≤ T) that minimizes the following evaluation function.

次に、ホライゾン上で衝突が起きない場合と同様にホライゾン上を離散化して最適性条件を求める。まず、衝突が起きるステップｉ_ｋを
ｉ_ｋΔτ≦τ_ｋ≦（ｉ_ｋ＋１）Δτ ・・・（Ａ３８）
として定義すると、状態の予測は

として行うことができる。ここで、Δτ_ｋ１＝τ_ｋ－ｉ_ｋΔτ、Δτ_ｋ２＝Δτ－Δτ_ｋ１である。 Next, as in the case with no collision on the horizon, the horizon is discretized and the optimality conditions are obtained. First, defining the step i_k at which the collision occurs by
i_k Δτ ≤ τ_k ≤ (i_k + 1)Δτ ... (A38)
the state can be predicted as follows.

Here, Δτ_k1 = τ_k − i_k Δτ and Δτ_k2 = Δτ − Δτ_k1.
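One way to read the split step is the following sketch: the step i_k that contains τ_k is divided into a part of length Δτ_k1 under subsystem k, the state jump γ_k, and a part of length Δτ_k2 under subsystem k+1 driven by the input u^*(τ_k+; t). The exact prediction equations are not reproduced in this text, so this structure is an assumption; the names `rollout_with_jump`, `u_plus`, and the toy dynamics are hypothetical.

```python
import math

def rollout_with_jump(f_k, f_k1, gamma, x0, U, u_plus, tau_k, dtau):
    """Euler prediction over the horizon with one state jump at time tau_k."""
    N = len(U)
    i_k = min(math.floor(tau_k / dtau), N - 1)   # step satisfying (A38)
    dtau_k1 = tau_k - i_k * dtau
    dtau_k2 = dtau - dtau_k1

    def step(x, dx, d):
        return [a + b * d for a, b in zip(x, dx)]

    xs = [list(x0)]
    for i, u in enumerate(U):
        x = xs[-1]
        if i < i_k:                               # before the collision
            xs.append(step(x, f_k(x, u), dtau))
        elif i == i_k:                            # step containing the collision
            x_minus = step(x, f_k(x, u), dtau_k1)
            x_plus = gamma(x_minus)
            xs.append(step(x_plus, f_k1(x_plus, u_plus), dtau_k2))
        else:                                     # after the collision
            xs.append(step(x, f_k1(x, u), dtau))
    return xs, i_k

def point_mass(x, u):
    """Hypothetical dynamics: position and velocity driven by u."""
    return [x[1], u]

def stop(x):
    """Hypothetical jump map: velocity is lost at the collision."""
    return [x[0], 0.0]

xs, i_k = rollout_with_jump(point_mass, point_mass, stop,
                            [0.0, 1.0], [0.0, 0.0], 0.0, 0.15, 0.1)
print(i_k, xs)
```
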

また、ホライゾン上の状態の初期値はこれまでと同様に
ｘ_０ ^＊（ｔ）＝ｘ（ｔ）・・・（Ａ４５）
として与えられる。評価関数Ｊは

と離散化される。 As before, the initial value of the state on the horizon is given as
x_0^*(t) = x(t) ... (A45)
and the evaluation function J is discretized as follows.

このとき最適制御問題は、（Ａ３９）－（Ａ４５）のもとで評価関数（Ａ４６）を最小にするような制御入力の系列ｕ_i ^＊（ｔ）（ｉ＝０，・・・，Ｎ－１）、ｕ^＊（τ_ｋ＋；ｔ）を求める制御入力を求める問題に帰着される。この問題に対して最適性条件を導出するために、衝突条件（Ａ４１）に対してラグランジュ乗数ν∈Ｒ^ｌを導入する。このとき、最適性条件は次のように得られる。 The optimal control problem then reduces to finding the control input sequence u_i^*(t) (i = 0, ..., N−1) and u^*(τ_k+; t) that minimize the evaluation function (A46) subject to (A39)−(A45). To derive the optimality conditions for this problem, a Lagrange multiplier ν ∈ R^l is introduced for the collision condition (A41). The optimality conditions are then obtained as follows.

このとき、最適制御問題は、（Ａ３９）－（Ａ４５）、（Ａ４８）－（Ａ５７）を満たすようなｘ_０ ^＊（ｔ），・・・，ｘ_ｉｋ ^＊（ｔ），ｘ^＊（τ_ｋ－；ｔ），ｘ^＊（τ_ｋ＋；ｔ），ｘ_ｉｋ＋１ ^＊（ｔ），・・・，ｘ_Ｎ ^＊（ｔ），λ_０ ^＊（ｔ），・・・，λ_ｉｋ ^＊（ｔ），λ^＊（τ_ｋ－；ｔ），λ^＊（τ_ｋ＋；ｔ），λ_ｉｋ ^＊（ｔ），・・・，λ_Ｎ ^＊（ｔ），ｕ_０ ^＊（ｔ），・・・，ｕ_ｉｋ ^＊（ｔ），ｕ^＊（τ_ｋ＋；ｔ），ｕ_ｉｋ＋１ ^＊（ｔ），・・・，ｕ_Ｎ－１ ^＊（ｔ），ν_ｋ ^＊（ｔ），τ_ｋを求める問題に帰着される。 The optimal control problem then reduces to finding x_0^*(t), ..., x_ik^*(t), x^*(τ_k−; t), x^*(τ_k+; t), x_{ik+1}^*(t), ..., x_N^*(t), λ_0^*(t), ..., λ_ik^*(t), λ^*(τ_k−; t), λ^*(τ_k+; t), λ_ik^*(t), ..., λ_N^*(t), u_0^*(t), ..., u_ik^*(t), u^*(τ_k+; t), u_{ik+1}^*(t), ..., u_{N−1}^*(t), ν_k^*(t), and τ_k that satisfy (A39)−(A45) and (A48)−(A57).

これらの未知量のうち、本質的な未知量を次のように定義する。

Among these unknown quantities, essential unknown quantities are defined as follows.

但し、Ｕ_ｋ（ｔ）を次のように定義する。

However, U _k (t) is defined as follows.

このような定義により、このＵ（ｔ）が分かっていればｘ_０ ^＊（ｔ），・・・，ｘ_ｉｋ ^＊（ｔ），ｘ^＊（τ_ｋ－；ｔ），ｘ^＊（τ_ｋ＋；ｔ），ｘ_ｉｋ＋１ ^＊（ｔ），・・・，ｘ_Ｎ ^＊（ｔ），λ_０ ^＊（ｔ），・・・，λ_ｉｋ ^＊（ｔ），・・・，λ^＊（τ_ｋ－；ｔ），λ^＊（τ_ｋ＋；ｔ），λ_ｉｋ ^＊（ｔ），・・・，λ_Ｎ ^＊（ｔ）を（Ａ３９）－（Ａ４５）、（Ａ４７）－（Ａ５２）から求めることができる。 With these definitions, if U(t) is known, then x_0^*(t), ..., x_ik^*(t), x^*(τ_k−; t), x^*(τ_k+; t), x_{ik+1}^*(t), ..., x_N^*(t), λ_0^*(t), ..., λ_ik^*(t), ..., λ^*(τ_k−; t), λ^*(τ_k+; t), λ_ik^*(t), ..., λ_N^*(t) can be obtained from (A39)−(A45) and (A47)−(A52).

このときＵ（ｔ）が満たすべき条件は、

と得られる。 The condition that U(t) must satisfy is then obtained as follows.

但しＦ_ｋ（Ｕ_ｋ（ｔ），ｔ）は次式で示される。

However, F _k (U _k (t), t) is represented by the following equation.

（場面３）ホライゾン上で衝突が複数回起きる時
ホライゾンで複数回衝突が起きる場合の最適性条件は、衝突の回数が１回の場合の最適性条件を拡張することで求めることができる。ｋ，・・・，ｋ＋ｌ回目の衝突がホライゾン上の時刻τ_ｋ，・・・，τ_ｋ＋ｌで起きるとする。このとき、（Ａ３８）で求めた衝突が起きるステップｉ_ｋ，・・・，ｉ_ｋ＋ｌについて、（Ａ４０）－（Ａ４３）からｘ^＊（τ_ｋ－；ｔ），ｘ^＊（τ_ｋ＋；ｔ），ｘ_ｉｋ＋１ ^＊（ｔ）を、ｋ＝ｋ，・・・，ｋ＋ｌについて求めることができる。λについても同様に、（Ａ４９）－（Ａ５１）からλ^＊（τ_ｋ－；ｔ），λ^＊（τ_ｋ＋；ｔ），λ_ｉｋ ^＊（ｔ）を各ｉ_ｋ，・・・，ｉ_ｋ＋ｌについて求めることができる。ｉ_ｋ，・・・，ｉ_ｋ＋ｌ以外の全てのｉについては、ｘ_ｉとλ_ｉはそれぞれ（Ａ３９）、（Ａ５２）から計算することができる。 (Scene 3) When multiple collisions occur on the horizon
The optimality conditions for multiple collisions on the horizon can be obtained by extending those for a single collision. Suppose that the k-th, ..., (k+l)-th collisions occur at times τ_k, ..., τ_{k+l} on the horizon. Then, for the steps i_k, ..., i_{k+l} at which the collisions occur, obtained from (A38), x^*(τ_k−; t), x^*(τ_k+; t), and x_{ik+1}^*(t) can be obtained from (A40)−(A43) for k = k, ..., k+l. Similarly for λ, λ^*(τ_k−; t), λ^*(τ_k+; t), and λ_ik^*(t) can be obtained from (A49)−(A51) for each of i_k, ..., i_{k+l}. For all i other than i_k, ..., i_{k+l}, x_i and λ_i can be calculated from (A39) and (A52), respectively.

このとき本質的な未知量は、

として定義することができる。 In this case, the essential unknown can be defined as follows.

このときＵ（ｔ）が満たすべき条件は次のようになる。

At this time, the conditions that U(t) should satisfy are as follows.

ここで、Ｆ_ｋは（Ａ６１）であり、Ｆの要素の各ｉ_ｋステップ目については、

の代わりに

を代用する。 Here, F_k is given by (A61), and for each i_k-th step among the elements of F, the expression

is replaced by the following expression.

＊非線形モデル予測制御の実行アルゴリズム＊
非線形モデル予測制御を行うためには、上述のように求めた最適性条件を数値的に解く必要がある。本実施の形態では短いサンプリング周期内でもロボットのような複雑なシステムの制御を目標とするため、ＮＭＰＣの高速数値計算アルゴリズムであるＣ／ＧＭＲＥＳ法を用いる。 *Execution algorithm for non-linear model predictive control*
In order to perform nonlinear model predictive control, it is necessary to numerically solve the optimality condition obtained as described above. In this embodiment, the C/GMRES method, which is a high-speed numerical calculation algorithm of NMPC, is used in order to control a complicated system such as a robot even within a short sampling period.

＊＊Ｃ／ＧＭＲＥＳ法＊＊
Ｃ／ＧＭＲＥＳ法は連続変形法とＧＭＲＥＳ法を組み合わせた手法である。Ｃ／ＧＭＲＥＳ法では、非線形方程式Ｆ（Ｕ（ｔ），ｘ（ｔ），ｔ）＝０を直接解いてＵ（ｔ）を直接求めるのではない。Ｃ／ＧＭＲＥＳ法では、Ｕ（ｔ）の時間についての連続性を前提として、各サンプリング時刻で最適なＵ（ドット）（ｔ）を求め、
Ｕ（ｔ＋Δｔ）＝Ｕ（ｔ）＋Ｕ（ドット）（ｔ）Δｔ・・・（Ａ６４）
として更新する。但し、Δｔはサンプリング周期である。 **C/GMRES method**
The C/GMRES method combines the continuation method with the GMRES method. Rather than solving the nonlinear equation F(U(t), x(t), t) = 0 directly for U(t), the C/GMRES method assumes that U(t) is continuous in time, obtains the optimal U(dot)(t) at each sampling time, and updates U as
U(t+Δt) = U(t) + U(dot)(t)Δt ... (A64)
where Δt is the sampling period.

Ｃ／ＧＭＲＥＳ法ではＵ（ｔ）を求めるために連続変形法を用いてＦ＝０を

として変形する。ここで、ζ＞０は安定化パラメータである。この方程式は

についての線形方程式としてみなすことができる。そこで、この方程式を線形方程式の高速数値解法であるＧＭＲＥＳ法を用いて解く。以上がＣ／ＧＭＲＥＳ法の簡単な説明である。 In the C/GMRES method, to obtain U(dot)(t), the continuation method is used to transform F = 0 into

where ζ > 0 is a stabilization parameter. This equation can be regarded as a linear equation in U(dot)(t), and it is therefore solved with the GMRES method, a fast numerical method for linear equations. This concludes the brief description of the C/GMRES method.
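The continuation idea can be illustrated with a scalar stand-in. The sketch assumes the usual continuation form, in which F is driven to zero through F(dot) = −ζF, so that U(dot) solves F_U U(dot) = −ζF − F_t; the partial derivatives are taken by forward differences and the scalar linear equation is solved by division rather than by GMRES. The residual `F`, whose root U = sin(t) moves continuously in time, and all function names are hypothetical.

```python
import math

def F(U, t):
    """Toy optimality residual; its root U = sin(t) moves continuously in time."""
    return U - math.sin(t)

def u_dot(U, t, zeta, h=1e-6):
    """Solve F_U * Udot = -zeta * F - F_t for scalar U using forward differences."""
    F0 = F(U, t)
    F_U = (F(U + h, t) - F0) / h
    F_t = (F(U, t + h) - F0) / h
    return (-zeta * F0 - F_t) / F_U

def track(U0, t_end, dt, zeta):
    """Trace the solution in time with updates of the form (A64)."""
    U, t = U0, 0.0
    for _ in range(round(t_end / dt)):
        U += u_dot(U, t, zeta) * dt
        t += dt
    return U, t

U, t = track(0.5, 1.0, 0.001, 100.0)
print(U, math.sin(t))
```

Starting from a deliberately wrong initial guess, the residual decays at the rate set by ζ and the solution is then tracked without any Newton iteration at each sampling time.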

＊＊Ｃ／ＧＭＲＥＳ法の衝突現象への拡張＊＊
Ｃ／ＧＭＲＥＳ法はＵ（ｔ）の連続性が前提にされている。その一方で、Ｃ／ＧＭＲＥＳ法は、衝突現象を含むシステムに対する問題は不連続現象が伴い、この（Ａ６４）による更新では解の連続性の前提が成り立たなくなる場合がある。この連続性を成り立たせるためには、ホライゾン上でのあるサブシステムｋに割り当てられた制御入力が、同様にあるサブシステムに割り当てられた制御入力によって更新される必要がある。 **Extension of the C/GMRES method to collision phenomena**
The C/GMRES method presupposes the continuity of U(t). However, a problem for a system that includes collisions involves discontinuous phenomena, and with the update (A64) the assumed continuity of the solution may fail to hold. For this continuity to hold, a control input assigned to a subsystem k on the horizon must be updated from a control input that is likewise assigned to that same subsystem.

すなわち、ホライゾン上の任意の衝突ｋについて、τ_ｋ（ｔ＋Δｔ）＜ｉ_ｋΔτのとき、τ_ｋ（ｔ＋Δｔ）／Δτ＜ｉ≦ｉ_ｋである各ｉについてｕ_ｉ ^＊（ｔ＋Δｔ）は

として更新する。 That is, for any collision k on the horizon, when τ_k(t+Δt) < i_k Δτ, u_i^*(t+Δt) is updated as follows for each i satisfying τ_k(t+Δt)/Δτ < i ≤ i_k.

同様に、（ｉ_ｋ＋１）Δτ＜τ_ｋ（ｔ＋Δｔ）のとき、ｉ_ｋ＜ｉ＜τ_ｋ（ｔ＋Δｔ）／Δτを満たす各ｉについて、ｕ_ｉ ^＊（ｔ＋Δｔ）を

として更新する。 Similarly, when (i_k + 1)Δτ < τ_k(t+Δt), u_i^*(t+Δt) is updated as follows for each i satisfying i_k < i < τ_k(t+Δt)/Δτ.

もう１つの修正点としてヤコビ行列とベクトルの前進差分近似がある。式（Ａ６５）のヤコビ行列は、あるベクトルＷ∈Ｒ^ｍＮ、ｗ∈Ｒ^ｎ、ω∈Ｒ、十分小さな実数ｈ＞０を用いて、

として前進差分近似される。 Another modification concerns the forward-difference approximation of the product of the Jacobian matrix and a vector. Using vectors W ∈ R^mN and w ∈ R^n, a scalar ω ∈ R, and a sufficiently small real number h > 0, the Jacobian of equation (A65) is approximated by forward differences as follows.
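The core of such an approximation is that a Jacobian-vector product can be obtained from two evaluations of F without forming the Jacobian. The sketch below shows this for a residual that depends on U only; the dependence on x and t, which the text perturbs with w and ω, is omitted for brevity, and the function name `jvp_forward` and the test residual are hypothetical.

```python
def jvp_forward(F, U, W, h=1e-6):
    """Approximate (dF/dU) W by the forward difference (F(U + h W) - F(U)) / h."""
    F0 = F(U)
    F1 = F([u + h * w for u, w in zip(U, W)])
    return [(a - b) / h for a, b in zip(F1, F0)]

def toy_residual(U):
    """Hypothetical residual with Jacobian [[2 U0, 0], [U1, U0]]."""
    return [U[0] ** 2, U[0] * U[1]]

print(jvp_forward(toy_residual, [1.0, 2.0], [1.0, 0.0]))
```

Because only products of this kind are needed by GMRES, the cost per iteration is one extra evaluation of F.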

この差分近似も時刻ｔとｔ＋ｈにおける予測ホライゾン上のサブシステムについての連続性を前提としている。その一方で、
ｉ_ｋΔτ≦τ_ｋ（ｔ）＜ｉ_ｋΔτ＋ｈ・・・（Ａ６９）
のとき、この連続性が成り立たなくなる。 This difference approximation also presupposes continuity of the subsystems on the prediction horizon between times t and t+h. However, this continuity no longer holds when
i_k Δτ ≤ τ_k(t) < i_k Δτ + h ... (A69)

そこで、そのような場合は

として後退差分近似を行う。但し、ｈを十分小さくしておけば、（Ａ６９）はほとんど起こらない。 In such a case, a backward-difference approximation is therefore used instead. If h is made sufficiently small, however, (A69) rarely occurs.

＊＊スイッチによって追加される変数の初期化アルゴリズム＊＊
次に、衝突によって追加される変数の初期化アルゴリズムについて説明する。
もし時刻ｔ－Δｔのホライゾン上にスイッチｋが存在しないが、時刻ｔには存在する場合、すなわちスイッチｋが時刻ｔにホライゾン上に現れる場合、次のようになる。すなわち、このような場合、時刻ｔ－Δｔのときの最適制御問題には含まれていなかった新たな変数ｕ^＊（τ_ｋ＋；ｔ）、ν_ｋ ^＊、τ_ｋが時刻ｔに追加されることになる。このとき新たな変数ｕ^＊（τ_ｋ＋；ｔ）、ν_ｋ ^＊、τ_ｋは前の時刻で求めてはいないため、Ｃ／ＧＭＲＥＳ法によりこれらを求めることができない。また、スイッチｋがＮ－１ステップ以前のｉ_ｋで起きたとき、それまでサブシステムｋについて最適に求められたｕ_ｉｋ ^＊（ｔ），・・・，ｕ_Ｎ－１ ^＊（ｔ）をサブシステムｋ＋１について最適に求め直す必要がある。但し、サンプリング周期を十分小さくしている限り、ほとんどのケースでは、ｉ_ｋ＝Ｎ－１であり、このとき初期化する変数はｕ^＊（τ_ｋ＋；ｔ）、ν_ｋ ^＊、τ_ｋだけでよい。このとき、本来であればホライゾン全体についてニュートン法などを用いてもう１度最適制御問題を解く必要があるが、計算時間が膨大になってしまうという問題点がある。そこで、少ない計算時間で部分的に最適な値を求めることでこれらの値を初期化する手法を採用するとよい。以下に、そのような手法について説明する。 ** Initialization Algorithm for Variables Added by Switches **
Next, we describe the initialization algorithm for variables added by collisions.
If switch k is not on the horizon at time t−Δt but is on the horizon at time t, that is, if switch k appears on the horizon at time t, new variables u^*(τ_k+; t), ν_k^*, and τ_k that were not included in the optimal control problem at time t−Δt are added at time t. Because these new variables were not obtained at the previous time, they cannot be obtained by the C/GMRES method. In addition, when switch k occurs at a step i_k earlier than step N−1, the inputs u_ik^*(t), ..., u_{N−1}^*(t), which had so far been optimized for subsystem k, must be re-optimized for subsystem k+1. As long as the sampling period is sufficiently small, however, i_k = N−1 in most cases, and the only variables to be initialized are then u^*(τ_k+; t), ν_k^*, and τ_k. Strictly, the optimal control problem would have to be solved again over the entire horizon, for example by Newton's method, but the computation time would become enormous. It is therefore preferable to initialize these values by finding partially optimal values in a short computation time. Such a method is described below.

まず、ホライゾン上で衝突条件ψ＝０が満たされたことを観測したとする。すなわち、あるステップで初めて

を観測したときを考える。このとき、部分的に初期化を行うためにまずｉ_ｋステップ目までの制御入力、状態の系列は最適であると仮定する。すなわち、ｉ_ｋステップ目以降での最適な変数を求める問題を定義し解くことでこの追加された変数の初期化を行う。以降では、ｉ_ｋ＝Ｎ－１のとき、ｉ_ｋ＜Ｎ－１の２通りに分けて記述を行う。 First, suppose that satisfaction of the collision condition ψ = 0 is observed on the horizon; that is, consider the case where the following is observed for the first time at some step.

For the partial initialization, the sequences of control inputs and states up to the i_k-th step are first assumed to be optimal. The added variables are then initialized by defining and solving a problem of finding the optimal variables from the i_k-th step onward. The two cases i_k = N−1 and i_k < N−1 are described separately below.

ｉ_ｋ＝Ｎ－１のとき：
このとき、もしτ_ｋ、ｕ^＊（τ_ｋ＋；ｔ）、ν_ｋ ^＊が分かっていれば、既に計算されたｘ_Ｎ－１ ^＊（ｔ）、ｕ_Ｎ－１ ^＊（ｔ）を用いて、

が計算できる。 When i _k =N−1:
In this case, if τ_k, u^*(τ_k+; t), and ν_k^* are known, the following quantities can be calculated using the already computed x_{N−1}^*(t) and u_{N−1}^*(t).

ここで解く最適制御問題は、（Ａ７２）－（Ａ７５）のもとで、Ｎ－１ステップ以降の評価関数

を最小にするようなｕ^＊（τ_ｋ＋；ｔ）、ν_ｋ ^＊、τ_ｋを求める問題である。 The optimal control problem to be solved here is, subject to (A72)−(A75), to find the u^*(τ_k+; t), ν_k^*, and τ_k that minimize the following evaluation function from step N−1 onward.

この問題に対して最適性条件を導出すると、

が得られる。 Deriving the optimality conditions for this problem yields the following.

この初期化の実行は以下のように行われる。ここではホライゾン上で初めてψ_ｋ（ｘ_ｉ ^＊（ｔ））＜０となるのがｉ＝Ｎである場合を考えているため、まずτ_ｋをψ_ｋ（ｘ_Ｎ－１ ^＊（ｔ））とψ_ｋ（ｘ_Ｎ ^＊（ｔ））から

として与える。次に、ｕ^＊（τ_ｋ＋；ｔ）とν_ｋ ^＊（ｔ）とについて、ニュートン法を行うための適当な初期推定解を与える。 This initialization is carried out as follows. Since the case considered here is the one in which ψ_k(x_i^*(t)) < 0 holds for the first time on the horizon at i = N, τ_k is first given from ψ_k(x_{N−1}^*(t)) and ψ_k(x_N^*(t)) as follows.

Next, suitable initial guesses for Newton's method are given for u^*(τ_k+; t) and ν_k^*(t).
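One plausible way to obtain an initial τ_k from the two values of ψ_k is linear interpolation of the zero crossing between the two steps. The formula for this initialization is not reproduced in this text, so the interpolation below is an assumption made for illustration, as are the function name `init_tau_k` and its arguments.

```python
def init_tau_k(psi_prev, psi_next, i, dtau):
    """Estimate the collision time on the horizon between steps i and i+1.

    Assumes psi varies linearly between the two steps (illustrative only).
    """
    if not (psi_prev >= 0.0 > psi_next):
        raise ValueError("psi must change from non-negative to negative")
    return (i + psi_prev / (psi_prev - psi_next)) * dtau

# psi falls from 0.3 to -0.1 between steps 79 and 80 with a step of 0.01 s.
print(init_tau_k(0.3, -0.1, 79, 0.01))
```
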

そして求めたい未知量を

とする。 The unknowns to be found are defined as follows.

また、Ｕ_{ｋ，ｉｎｉｔ}（ｔ）が満たすべき条件を

とする。 The condition that U_{k,init}(t) must satisfy is likewise defined as follows.

上述した求めたい未知量及びＵ_{ｋ，ｉｎｉｔ}（ｔ）が満たすべき条件に対して、前進差分ニュートンＧＭＲＥＳ法による反復を行う。次のアルゴリズム１では、このＵ_{ｋ，ｉｎｉｔ}（ｔ）についてこの初期化をまとめている。 For the unknowns and the condition on U_{k,init}(t) described above, iterations of the forward-difference Newton-GMRES method are performed. Algorithm 1 below summarizes this initialization of U_{k,init}(t).

［アルゴリズム１：Ｕ_{ｋ，ｉｎｉｔ}（ｔ）の初期化（ｉ_ｋ＝Ｎ－１のとき）］
１：τ_ｋを（Ａ９８）により初期化する。
２：ｕ^＊（τ_ｋ＋；ｔ）とν_ｋ ^＊（ｔ）に適当な初期推定解を代入する。
３：ｘ^＊（τ_ｋ－；ｔ），ｘ^＊（τ_ｋ＋；ｔ），ｘ_Ｎ ^＊（ｔ），λ_Ｎ ^＊（ｔ），λ^＊（τ_ｋ＋；ｔ）を求め、Ｆ_{ｋ，ｉｎｉｔ}（Ｕ_{ｋ，ｉｎｉｔ}（ｔ），ｔ）を計算する。
４：ｗｈｉｌｅ ｜Ｆ_{ｋ，ｉｎｉｔ}（Ｕ_{ｋ，ｉｎｉｔ}（ｔ），ｔ）｜＞τ_ｉｎｉｔ ａｎｄ ｉ＜ｉ_ｍａｘ ｄｏ
５：前進差分Ｎｅｗｔｏｎ－ＧＭＲＥＳ法をＦ_{ｋ，ｉｎｉｔ}（Ｕ_{ｋ，ｉｎｉｔ}（ｔ），ｔ）に用いることでΔＵ_{ｋ，ｉｎｉｔ}を求める。
６：Ｕ_{ｋ，ｉｎｉｔ}（ｔ）をＵ_{ｋ，ｉｎｉｔ}（ｔ）←Ｕ_{ｋ，ｉｎｉｔ}（ｔ）＋ΔＵ_{ｋ，ｉｎｉｔ}と更新する。
７：ｘ^＊（τ_ｋ－；ｔ），ｘ^＊（τ_ｋ＋；ｔ），ｘ_Ｎ ^＊（ｔ），λ_Ｎ ^＊（ｔ），λ^＊（τ_ｋ＋；ｔ）を求め、Ｆ_{ｋ，ｉｎｉｔ}（Ｕ_{ｋ，ｉｎｉｔ}（ｔ），ｔ）を計算する。
８：ｅｎｄｗｈｉｌｅ
９：Ｕ_{ｋ，ｉｎｉｔ}（ｔ）の初期化終わり。 [Algorithm 1: Initialization of U _k,init (t) (when i _k =N−1)]
1: Initialize τ_k by (A98).
2: Assign suitable initial guesses to u^*(τ_k+; t) and ν_k^*(t).
3: Compute x^*(τ_k−; t), x^*(τ_k+; t), x_N^*(t), λ_N^*(t), and λ^*(τ_k+; t), and evaluate F_{k,init}(U_{k,init}(t), t).
4: while |F_{k,init}(U_{k,init}(t), t)| > τ_init and i < i_max do
5: Obtain ΔU_{k,init} by applying the forward-difference Newton-GMRES method to F_{k,init}(U_{k,init}(t), t).
6: Update U_{k,init}(t) as U_{k,init}(t) ← U_{k,init}(t) + ΔU_{k,init}.
7: Compute x^*(τ_k−; t), x^*(τ_k+; t), x_N^*(t), λ_N^*(t), and λ^*(τ_k+; t), and evaluate F_{k,init}(U_{k,init}(t), t).
8: end while
9: End of the initialization of U_{k,init}(t).
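The iteration in steps 4 to 8 can be sketched for a residual with two unknowns. The sketch assumes that the loop continues while the residual norm exceeds the tolerance and the iteration count is below its limit; the Jacobian is built column by column with forward differences, and the 2-by-2 linear system is solved by Cramer's rule as a stand-in for GMRES. The function `newton_fd`, the tolerance names, and the test residual are hypothetical.

```python
import math

def newton_fd(F, U, tol=1e-10, i_max=50, h=1e-7):
    """Forward-difference Newton iteration for a two-unknown residual F(U) = 0."""
    i = 0
    r = F(U)
    while math.hypot(r[0], r[1]) > tol and i < i_max:
        # Jacobian columns by forward differences
        c0 = [(a - b) / h for a, b in zip(F([U[0] + h, U[1]]), r)]
        c1 = [(a - b) / h for a, b in zip(F([U[0], U[1] + h]), r)]
        det = c0[0] * c1[1] - c1[0] * c0[1]
        # Solve J dU = -r by Cramer's rule (stand-in for GMRES)
        dU0 = (-r[0] * c1[1] + c1[0] * r[1]) / det
        dU1 = (-c0[0] * r[1] + c0[1] * r[0]) / det
        U = [U[0] + dU0, U[1] + dU1]      # step 6: U <- U + dU
        r = F(U)                           # step 7: re-evaluate the residual
        i += 1
    return U, i

def circle_and_line(U):
    """Hypothetical residual: intersection of a circle of radius 2 with U0 = U1."""
    return [U[0] ** 2 + U[1] ** 2 - 4.0, U[0] - U[1]]

U, iterations = newton_fd(circle_and_line, [1.0, 2.0])
print(U, iterations)
```
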

ｉ_ｋ＜Ｎ－１のとき：
このとき解くべき最適制御問題とは、

のもとで、

を最小にするようなｕ^＊（τ_ｋ＋；ｔ）、ｕ_ｉｋ＋１ ^＊（ｔ），・・・，ｕ_Ｎ－１ ^＊（ｔ）、ν_ｋ ^＊、τ_ｋを求める問題である。 When i _k <N−1:
The optimal control problem to be solved in this case is, subject to

to find the u^*(τ_k+; t), u_{ik+1}^*(t), ..., u_{N−1}^*(t), ν_k^*, and τ_k that minimize the following evaluation function.

この問題に対して最適性条件を導出すると、次のようになる。

The optimality condition for this problem is derived as follows.

ｉ_ｋ＝Ｎ－１の場合と同様に、ｘ_ｉｋ ^＊（ｔ），ｕ_ｉｋ ^＊（ｔ）がこの初期化問題の境界条件として与えられる。この初期化の実行はｉ_ｋ＝Ｎ－１のときと同様である。まず、τ_ｋをψ_ｋ（ｘ_ｉｋ ^＊（ｔ））とψ_ｋ（ｘ_ｉｋ＋１ ^＊（ｔ））から

として与える。 As in the case i_k = N−1, x_ik^*(t) and u_ik^*(t) are given as boundary conditions of this initialization problem. The initialization is carried out in the same way as for i_k = N−1. First, τ_k is given from ψ_k(x_ik^*(t)) and ψ_k(x_{ik+1}^*(t)) as follows.

次に、ｕ（τ_ｋ＋；ｔ）とν_ｋ ^＊（ｔ）に適当な初期推定解を与える。ｕ_ｉｋ＋１ ^＊（ｔ），・・・，ｕ_Ｎ－１ ^＊（ｔ）については前のサンプリング時刻で求めた値をそのままＮｅｗｔｏｎ－ＧＭＲＥＳ法の初期推定解として用いる。求める未知量を

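とする。 Next, suitable initial guesses are given for u(τ_k+; t) and ν_k^*(t). For u_{ik+1}^*(t), ..., u_{N−1}^*(t), the values obtained at the previous sampling time are used as they are as the initial guess for the Newton-GMRES method. The unknowns to be found are defined as follows.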

Ｕ_{ｋ，ｉｎｉｔ}（ｔ）が満たすべき条件は

となる。 The condition that U_{k,init}(t) must satisfy is then as follows.

上述した求めたい未知量及びＵ_{ｋ，ｉｎｉｔ}（ｔ）が満たすべき条件に対して、前進差分ニュートンＧＭＲＥＳ法による反復を行う。次のアルゴリズム２では、このＵ_{ｋ，ｉｎｉｔ}（ｔ）についてこの初期化をまとめている。 For the unknowns and the condition on U_{k,init}(t) described above, iterations of the forward-difference Newton-GMRES method are performed. Algorithm 2 below summarizes this initialization of U_{k,init}(t).

［アルゴリズム２：Ｕ_{ｋ，ｉｎｉｔ}（ｔ）の初期化（ｉ_ｋ＜Ｎ－１のとき）］
１：τ_ｋを（Ａ９８）により初期化する。
２：ｕ^＊（τ_ｋ＋；ｔ）とν_ｋ ^＊（ｔ）に適当な初期推定解を代入する。
３：ｘ^＊（τ_ｋ－；ｔ），ｘ^＊（τ_ｋ＋；ｔ），ｘ_ｉｋ＋１ ^＊（ｔ），・・・，ｘ_Ｎ ^＊（ｔ），λ_Ｎ ^＊（ｔ），・・・，λ_ｉｋ＋２ ^＊（ｔ），λ^＊（τ_ｋ＋；ｔ）を求め、Ｆ_{ｋ，ｉｎｉｔ}を計算する。
４：ｗｈｉｌｅ ｜Ｆ_{ｋ，ｉｎｉｔ}（Ｕ（ｔ），ｘ（ｔ），ｔ）｜＞τ_ｉｎｉｔ ａｎｄ ｉ＜ｉ_ｍａｘ ｄｏ
５：前進差分Ｎｅｗｔｏｎ－ＧＭＲＥＳ法をＦ_{ｋ，ｉｎｉｔ}（Ｕ（ｔ），ｘ（ｔ），ｔ）に用いることでΔＵ_{ｋ，ｉｎｉｔ}を求める。
６：Ｕ_{ｋ，ｉｎｉｔ}（ｔ）をＵ_{ｋ，ｉｎｉｔ}（ｔ）←Ｕ_{ｋ，ｉｎｉｔ}（ｔ）＋ΔＵ_{ｋ，ｉｎｉｔ}と更新する。
７：ｘ^＊（τ_ｋ－；ｔ），ｘ^＊（τ_ｋ＋；ｔ），ｘ_ｉｋ＋１ ^＊（ｔ），・・・，ｘ_Ｎ ^＊（ｔ），λ_Ｎ ^＊（ｔ），・・・，λ_ｉｋ＋２ ^＊（ｔ），λ^＊（τ_ｋ＋；ｔ）を求め、Ｆ_{ｋ，ｉｎｉｔ}を計算する。
８：ｅｎｄｗｈｉｌｅ
９：Ｕ_{ｋ，ｉｎｉｔ}（ｔ）の初期化終わり。 [Algorithm 2: Initialization of U _k,init (t) (when i _k < N−1)]
1: Initialize τ_k by (A98).
2: Assign suitable initial guesses to u^*(τ_k+; t) and ν_k^*(t).
3: Compute x^*(τ_k−; t), x^*(τ_k+; t), x_{ik+1}^*(t), ..., x_N^*(t), λ_N^*(t), ..., λ_{ik+2}^*(t), and λ^*(τ_k+; t), and evaluate F_{k,init}.
4: while |F_{k,init}(U(t), x(t), t)| > τ_init and i < i_max do
5: Obtain ΔU_{k,init} by applying the forward-difference Newton-GMRES method to F_{k,init}(U(t), x(t), t).
6: Update U_{k,init}(t) as U_{k,init}(t) ← U_{k,init}(t) + ΔU_{k,init}.
7: Compute x^*(τ_k−; t), x^*(τ_k+; t), x_{ik+1}^*(t), ..., x_N^*(t), λ_N^*(t), ..., λ_{ik+2}^*(t), and λ^*(τ_k+; t), and evaluate F_{k,init}.
8: end while
9: End of the initialization of U_{k,init}(t).

［数値シミュレーション］
次に、上述した手法を用いて具体的にコンパス型モデルの歩行制御の数値シミュレーションを実行した結果を示す。以下に説明するシミュレーションは、本実施の形態にかかる非線形モデル予測制御のアルゴリズムを、図４で例示したコンパス型モデルにかかるロボット１００に適用したものである。 [Numerical simulation]
Next, the result of numerical simulation of the walking control of the compass type model using the above-described method will be shown. The simulation described below applies the nonlinear model predictive control algorithm according to the present embodiment to the robot 100 according to the compass model illustrated in FIG.

＊評価関数＊
まず、コンパス型歩行制御に用いる際の評価関数を設定する。継続的な歩行制御を実現するために、振り足を前に出す動作を評価関数として加えることを考える。すなわち、遊脚を前に出す速度

を、適当な目標値ν_ｒｅｆに近づけるような項

を評価関数に加える。また、その際に遊脚が地面から高く上がりすぎて非効率な動きをしないようｑ_２（θ_１＋θ_２）^２も加える。最後に、使用エネルギーの少ない自然な歩行を実現するため、ｒｕ^２を加える。 *Evaluation function*
First, the evaluation function used for compass-type walking control is set. To realize continuous walking, a term that rewards swinging the free leg forward is added to the evaluation function. That is, a term that brings the speed at which the free leg swings forward close to a suitable target value ν_ref is added to the evaluation function. In addition, q_2(θ_1 + θ_2)^2 is added so that the free leg does not rise too high above the ground and move inefficiently. Finally, ru^2 is added to achieve natural walking with low energy consumption.

以上から、評価関数は、

となる。また、本実施の形態では、終端コストについてφ（ｘ）＝０としている。 From the above, the evaluation function is as follows.

In the present embodiment, the terminal cost is set to φ(x) = 0.

＊シミュレーション条件＊
シミュレーションに用いたコンパス型モデルの物理パラメータは、ｍ_０＝ｍ＝１．０［ｋｇ］、ｌ＝１．０［ｍ］、ｄ＝０．５［ｍ］、Ｉ＝０．０８３３３［ｋｇ・ｍ^２］として与える。 *Simulation conditions*
The physical parameters of the compass-type model used in the simulation are m_0 = m = 1.0 [kg], l = 1.0 [m], d = 0.5 [m], and I = 0.08333 [kg·m^2].

シミュレーション条件としては、（Ａ６）に基づく状態の初期値はｘ（０）＝［－０．１４ ０．１４ ０．５０ ０．５８］^Ｔ、シミュレーション中のモデルの状態の更新及びサンプリング周期はΔｔ＝０．００１［ｓ］とする。評価関数内のパラメータは、ｑ_１＝ｑ_２＝１．０、ｒ＝０．５、ν_ｒｅｆ＝０．５［ｒａｄ／ｓ］とする。ホライゾンの長さはＴ（ｔ）＝Ｔ_ｆ（１－ｅ^－αｔ）、α＝１．０、Ｔ_ｆ＝０．８［ｓ］とし、評価区間の分割数は、Ｎ＝８０とする。また、差分近似（Ａ６８）、（Ａ７０）の差分は、ｈ＝１．０×１０^－８として与える。 As simulation conditions, the initial value of the state based on (A6) is x(0) = [−0.14 0.14 0.50 0.58]^T, and the period for updating the model state during the simulation and for sampling is Δt = 0.001 [s]. The parameters in the evaluation function are q_1 = q_2 = 1.0, r = 0.5, and ν_ref = 0.5 [rad/s]. The length of the horizon is T(t) = T_f(1 − e^(−αt)) with α = 1.0 and T_f = 0.8 [s], and the number of divisions of the evaluation interval is N = 80. The increment used in the difference approximations (A68) and (A70) is h = 1.0×10^−8.
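The horizon length above starts at zero and grows smoothly toward T_f, so the step Δτ = T(t)/N also grows from zero toward T_f/N = 0.01 [s]. A short sketch of this schedule follows; the function names are chosen for illustration, and the default values are the ones stated in the simulation conditions.

```python
import math

def horizon_length(t, T_f=0.8, alpha=1.0):
    """Horizon length T(t) = T_f (1 - exp(-alpha t))."""
    return T_f * (1.0 - math.exp(-alpha * t))

def step_size(t, N=80, T_f=0.8, alpha=1.0):
    """Discretization step on the horizon, dtau = T(t) / N."""
    return horizon_length(t, T_f, alpha) / N

for t in (0.0, 1.0, 5.0):
    print(t, horizon_length(t), step_size(t))
```
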

＊シミュレーション結果＊
図７～図１２は、本実施の形態にかかる非線形モデル予測制御の例として、コンパス型モデルの歩行制御をシミュレーションした結果を示す図である。図１３はその歩行制御におけるホライゾン上での予測衝突時刻のグラフを示す図で、図１４は図１３の一部を拡大したグラフを示す図である。 *simulation result*
FIGS. 7 to 12 are diagrams showing the results of simulating walking control of the compass-type model as an example of nonlinear model predictive control according to the present embodiment. FIG. 13 is a graph of the predicted collision time on the horizon in this walking control, and FIG. 14 is an enlarged view of part of FIG. 13.

図７ではθ_１の変化を、図８ではθ_２の変化を、図９ではθ_１（ドット）の変化を、図１０ではθ_２（ドット）の変化を、図１１ではｕの変化を、図１２では｜｜Ｆ｜｜の変化を、それぞれ示している。 FIG. 7 shows the change in θ_1, FIG. 8 the change in θ_2, FIG. 9 the change in θ_1(dot), FIG. 10 the change in θ_2(dot), FIG. 11 the change in u, and FIG. 12 the change in ||F||.

図１２における｜｜Ｆ｜｜（エラーノルム）は、（Ａ３２），（Ａ６１），（Ａ６３）で示す各場面におけるＦ（Ｕ（ｔ），ｘ（ｔ），ｔ）の大きさ、すなわち最適解からの現在の解の誤差を表す。｜｜Ｆ｜｜が他の点と比べ大きくなっている点がある。これは、θ_１、θ_２が垂直になっている時刻、すなわち実際の制御対象が地面と衝突を起こしている時刻と一致している。よって、これは衝突によってホライゾン上の最適性条件（場面１～３について説明した最適性条件）が変わったために生じたと考えられる。その点を考慮すると、図７～図１２で示すこのシミュレーション結果は、制御入力が滑らかな挙動となっているのが分かる。 ||F|| (the error norm) in FIG. 12 represents the magnitude of F(U(t), x(t), t) in each of the scenes given by (A32), (A61), and (A63), that is, the error of the current solution from the optimal solution. There are points at which ||F|| is larger than elsewhere. These coincide with the times at which θ_1 and θ_2 change vertically in the plots, that is, the times at which the actual controlled object collides with the ground. This is therefore considered to arise because the collision changes the optimality conditions on the horizon (the optimality conditions described for Scenes 1 to 3). Taking this into account, the simulation results in FIGS. 7 to 12 show that the control input behaves smoothly.

In FIG. 13, the times at which t_k = 0 indicate that no collision was detected on the evaluation interval. With the evaluation interval length set in this simulation, only one collision occurred on the horizon. FIG. 14 shows that the collision time on the horizon is optimized at each time. In this simulation, run with a sampling period of 1 [ms], the NMPC update took about 0.8 [ms] per sampling period, which is shorter than the sampling period, so walking control in real time was achieved.

[Features of the present embodiment]
As described above, one of the main features of the present embodiment is that the prediction means predicts, at each time, the number of collisions of the controlled object from that time until a predetermined period later.

A brief supplementary explanation of this prediction of the number of future collisions follows. The prediction is executed every control cycle, and in the optimal control described above the state at each future time (within the prediction interval) is also predicted in the course of solving the problem. Therefore, before executing the current optimal control, the number of future collisions can be predicted by applying, for example, (A33) to (A36) to the future states predicted by the previous optimal control. Moreover, because this prediction is executed every control cycle, if the length of the prediction interval (horizon) is fixed, the predicted number of collisions increases as the walking speed increases and decreases as the walking slows down.
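The idea of counting collisions on a previously predicted trajectory can be sketched as follows. This is only an illustration under stated assumptions: the collision conditions (A33) to (A36) are not reproduced here, so a generic scalar guard function whose sign change from positive to non-positive marks a collision stands in for them. The names `count_predicted_collisions` and `guard` are hypothetical.

```python
from typing import Callable, Sequence

State = Sequence[float]


def count_predicted_collisions(
    trajectory: Sequence[State],
    guard: Callable[[State], float],
) -> int:
    """Count guard crossings (positive -> non-positive) along a predicted trajectory.

    `guard` is a hypothetical stand-in for the collision conditions of the
    specification: guard(x) > 0 means "no contact", guard(x) <= 0 means "contact".
    """
    if not trajectory:
        return 0
    count = 0
    previous = guard(trajectory[0])
    for state in trajectory[1:]:
        current = guard(state)
        if previous > 0.0 and current <= 0.0:
            count += 1  # the predicted state reached the contact surface
        previous = current
    return count


if __name__ == "__main__":
    # Synthetic one-dimensional "height above ground" samples, for illustration only.
    predicted = [[0.3], [0.1], [-0.1], [0.2], [-0.05]]
    print(count_predicted_collisions(predicted, lambda x: x[0]))
```

Because the count depends only on the stored prediction, it can be evaluated before the current optimization is set up, which is the ordering the paragraph above describes.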

As described above, in the present embodiment the calculation means executes the computation of the optimization problem of the nonlinear control model, which presupposes that collisions with surrounding objects can occur, according to the predicted number of collisions (that is, separately for each predicted number of collisions). A brief supplementary explanation of this computation follows.

In the present embodiment, the collision time is incorporated into the optimal control problem. That is, as illustrated by the control for each of Scenes 1 to 3 in "Derivation of the optimality conditions", the optimal control problem is switched according to which of Scenes 1 to 3 applies (which necessary conditions for optimality are used), based on the predicted number of collisions. If the collision time is not incorporated into the optimization problem, that is, if the collision time is not varied, the problem can be formulated so that the collision occurs at a predetermined collision time, and control can be performed with the C/GMRES method as usual.

However, if the collision time is simply incorporated into the optimal control problem, the optimal control problem to be solved changes with the number of collisions, so a method such as C/GMRES, which relies on the previous solution, may be unable to obtain a usable previous solution. This is because, when the number of collisions differs between the previous and current control cycles, the optimal control problem solved last time differs from the one to be solved this time. Specifically, when the number of collisions changes, the optimal control problem itself changes, and so does the form of the solution vector. For example, variables that were not in the solution vector of the previous optimization problem, such as the Lagrange multiplier ν for the collision condition and the collision time τ, are included in the current one.

Therefore, as described for the present embodiment, it is desirable not only to switch the optimality conditions but also to initialize the variables newly added by this switch. This initialization sets the solution vector based on the previous solution. As illustrated in "Initialization algorithm for variables added by the switch", the newly added variables are, for example, u^*(τ_k+; t), ν_k^*, and τ_k.
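How a solution vector might be extended when the predicted number of collisions changes can be sketched as follows. This is not the initialization algorithm of the specification, which is not reproduced here. The rules used below (set τ to the predicted collision time, set ν to zero, copy the nearest previous input as the post-collision input) are assumptions chosen for illustration, and the names `CollisionVars`, `Solution`, and `warm_start` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CollisionVars:
    tau: float     # collision time on the horizon
    nu: float      # multiplier for the collision condition
    u_plus: float  # control input just after the collision


@dataclass
class Solution:
    inputs: List[float]  # control inputs on the horizon grid (assumed non-empty)
    collisions: List[CollisionVars] = field(default_factory=list)


def warm_start(prev: Solution, predicted_taus: List[float], horizon: float) -> Solution:
    """Build an initial guess whose shape matches the predicted number of collisions.

    Variables already present in the previous solution are reused. Newly added
    ones are initialized by assumed rules: tau from the prediction, nu = 0,
    u_plus copied from the previous input nearest to tau. `horizon` must be > 0.
    """
    n = len(prev.inputs)
    kept = prev.collisions[: len(predicted_taus)]  # drop extras if the count decreased
    added = []
    for tau in predicted_taus[len(kept):]:
        index = min(n - 1, max(0, int(tau / horizon * n)))
        added.append(CollisionVars(tau=tau, nu=0.0, u_plus=prev.inputs[index]))
    return Solution(inputs=list(prev.inputs), collisions=kept + added)


if __name__ == "__main__":
    previous = Solution(inputs=[0.0, 1.0, 2.0, 3.0])
    print(warm_start(previous, [0.4], horizon=0.8))
```

The point of the sketch is the shape change: the previous inputs carry over unchanged, while the entries for a newly predicted collision are given explicit starting values instead of being left undefined.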

Thus, in the present embodiment, it is preferable to compute the optimal control problem based on the predicted number of collisions and to use the solution of the previously solved optimal control problem to set the initial values of the current optimal control problem appropriately.

As described above, the present embodiment predicts, at each time, the number of times the controlled object will collide within a predetermined period, and computes the optimization problem of the nonlinear control model according to the predicted number of collisions. That is, where repeated collisions are presupposed, the nonlinear control model is optimized differently for, for example, each number of collisions that occur within the same period.

Therefore, according to the present embodiment, the deviation between the predicted and actual timing of each individual collision is more likely to be small, and statistically the deviation diminishes further. As a result, the nonlinear model predictive controller according to the present embodiment can perform highly accurate predictive control whose results better match the actual future collision timing.

In other words, the present embodiment formulates the problem according to the number of collisions on the prediction interval of the optimal control and optimizes the control input for the motion and the collision times simultaneously, which makes it possible to generate adaptive, optimal motions according to the state. This optimal control approach formulates the optimal control problem so as to allow switching at arbitrary times and includes the switching times among the control variables so that they are optimized together with the control input. It can thus be described as a method that simultaneously optimizes the switching timing and the control input over the prediction interval.

Therefore, according to the present embodiment, appropriate switching timing and control input can be generated according to the state, so more efficient and stable operation of the controlled object can be achieved. For example, if the switching timing is fixed in advance, it is difficult to satisfy optimality of the control: the controller tries to make the collision occur at the fixed time regardless of the state, which leads to unnatural motion or requires inputs that consume more power. In contrast, the present embodiment can generate appropriate switching timing and can therefore satisfy optimality of the control. Taking walking as an example, when a disturbance pushes the state away from steady walking, changing not only the motion trajectory but also the timing at which the next foot lands allows the system to return to the steady state without strain (with less torque).

(Modifications)
The present invention is not limited to the above embodiment and can be modified as appropriate without departing from its spirit. For example, each leg of the robot 100 has been described as having a hip joint 120, a knee joint 122, and an ankle joint 124, but the configuration is not limited to this. A leg of the robot 100 may have fewer than three joints or more than three joints. In that case, the state vector and the joint torque vector (control input values) can be changed as appropriate according to the number of joints, and functions such as the state equation can likewise be changed as appropriate.

The above embodiment has been described for the case where the nonlinear system is a bipedal robot. However, the nonlinear model predictive control algorithm according to the present embodiment is also applicable to nonlinear systems other than bipedal robots, and to models other than the compass model.

That is, the nonlinear system control method according to the present embodiment is applicable to any nonlinear system in which repeated collisions with surrounding objects are presupposed, such as those exemplified below. Even for a controlled object that at first glance seems unlikely to collide repeatedly with surrounding objects, control in which such collisions can occur repeatedly may be performed intentionally, and the present embodiment is also useful in that case.

For example, the controlled object may be a robot hand or a robot arm provided on a robot. Collisions in this example can occur when the robot hand or robot arm presses against an object such as the surrounding environment or an object being manipulated, when it grasps or releases an object, or when it strikes or returns an object such as a ball. A table tennis robot is one example of a nonlinear system that returns an object such as a ball.

The controlled object in the present embodiment may also be, for example, a humanoid robot or an animal-type robot that can move with its arms and legs in contact with a floor or the like at the same time. Collisions in this example can occur when the humanoid or animal-type robot moves with its arms and legs simultaneously in contact with a wall, floor, table, or the like, or when it climbs a ladder, wall, or the like.

The controlled object in the present embodiment may also be, for example, an unmanned aerial vehicle such as a drone. Collisions in this example can occur when the unmanned aerial vehicle comes into contact with or separates from an object to be manipulated or inspected, or when it grasps or releases an object to be transported or captured.

The controlled object in the present embodiment may also be, for example, a tool of a processing machine. Collisions in this example can occur when the tool comes into contact with or separates from an object such as a workpiece.

The controlled object in the present embodiment may also be, for example, an automobile transmission. Collisions in this example can occur when the clutch of the transmission enters the engaged state (power transmitted) or the disengaged state (power cut off).

The controlled object in the present embodiment may also be, for example, an airplane. Collisions in this example can occur when takeoff or landing is controlled so as to optimize the motion both before and after touchdown. A specific case is controlling the engines and airframe so that the airplane lands along a desired path and decelerates promptly after landing.

The controlled object in the present embodiment may also be, for example, a train. Collisions in this example can occur when the coupling of train cars is controlled so as to optimize the motion both before and after coupling. A specific case is controlling the motor so as to reduce the impact at coupling and the load on the drive motor.

In the above examples, the program can be stored on various types of non-transitory computer readable media and supplied to a computer. Non-transitory computer readable media include various types of tangible storage media. Examples of non-transitory computer readable media include magnetic recording media (for example, flexible disks, magnetic tapes, and hard disk drives), magneto-optical recording media (for example, magneto-optical disks), CD-ROM (Read Only Memory), CD-R, CD-R/W, and semiconductor memory (for example, mask ROM, PROM (Programmable ROM), EPROM (Erasable PROM), flash ROM, and RAM (Random Access Memory)). The program may also be supplied to a computer by various types of transitory computer readable media. Examples of transitory computer readable media include electrical signals, optical signals, and electromagnetic waves. A transitory computer readable medium can supply the program to a computer via a wired communication path, such as an electric wire or optical fiber, or via a wireless communication path.

1 ... robot system, 2 ... control device, 12 ... state acquisition unit, 14 ... nonlinear model predictive control unit, 16 ... servo control unit, 100, 100-1, 100-2 ... robot, 102 ... torso, 110L ... left leg, 110R ... right leg, 112 ... upper leg, 114 ... lower leg, 116 ... foot, 118 ... sole sensor, 120 ... hip joint, 122 ... knee joint, 124 ... ankle joint, 130 ... angle sensor, 136 ... torque sensor, 140 ... motor, 150 ... joint, 151 ... support leg link, 152 ... swing leg link

Claims

A nonlinear model predictive controller configured to control a controlled object while predicting, at each time, the future response of the controlled object, by performing feedback control while computing an optimization problem of a nonlinear control model of the controlled object, the controller comprising:
prediction means for predicting, at each time, for a controlled object configured to be controlled so as to execute an operation that presupposes repeated collisions with surrounding objects, the number of such collisions from that time until a predetermined period later; and
calculation means for executing the computation of the optimization problem of the nonlinear control model, which presupposes that collisions with surrounding objects occur, according to the predicted number of collisions.