JPS62245401A - Learning control system

Info

Publication number: JPS62245401A
Application number: JP8962486A
Authority: JP
Inventors: Taku Arimoto; 有本　卓; Munehisa Takeda; 宗久武田
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 1986-04-18
Filing date: 1986-04-18
Publication date: 1987-10-26

Abstract

PURPOSE: To improve positioning accuracy and obtain learning control with fast convergence by correcting the command value, for each degree of freedom, using all of the error information from a given time up to a fixed time ahead. CONSTITUTION: During a playback operation, the output signal from a controlled object 6 at each sample time is stored in a memory 9 via a detector 7 and an A/D converter 8. When one playback operation is completed, an evaluation function J, for example the integral of the squared error, is calculated in a command value arithmetic unit 1. If the evaluation function is not less than a prescribed value (Jmin), the command value is corrected using the errors from time (t) to time (t+T), and the playback operation is repeated with the new command value. In this way, the target value can be reached sooner than when the command value is corrected using the error at a single time.

Description

[Detailed Description of the Invention] [Industrial Field of Application] The present invention relates to a learning control system for objects that are controlled repetitively, such as playback robots, and in particular to a learning control system with fast convergence.

[Prior Art]

In general, positioning control of an object that is controlled repetitively as described above uses the following learning control system. First, a teaching operation is performed so that the object memorizes the position data of the target work trajectory (called the taught value). A playback operation is then performed according to this taught value, the deviation between the taught value and the actual trajectory is detected, and this deviation is added to the taught value to give the command value for the next playback operation.

More recently, as in Japanese Unexamined Patent Publication No. 60-57409, "Learning Control Method," laid open on April 6, 1985, a learning control system has been proposed in which the dynamic delay time is measured for each degree of freedom, and the deviation is added to the taught value or command value earlier by the measured dynamic delay time before the playback operation, which further improves positioning accuracy.
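The two conventional corrections described above can be sketched as follows. This is only an illustrative sketch: the function names, the `gain` parameter, the expression of the dynamic delay as a whole number of samples (`shift`), and the reuse of the last error sample near the end of the record are assumptions, not details given in the cited publication.

```python
def conventional_update(u, e, gain=1.0):
    """Point-wise correction: u_{k+1}(t) = u_k(t) + gain * e_k(t)."""
    return [ui + gain * ei for ui, ei in zip(u, e)]


def delay_compensated_update(u, e, shift, gain=1.0):
    """Shifted correction: u_{k+1}(t) = u_k(t) + gain * e_k(t + shift).

    `shift` is the measured dynamic delay expressed in samples (assumption).
    Past the end of the record the last error sample is reused, which is
    a choice made for this sketch only.
    """
    last = len(e) - 1
    return [ui + gain * e[min(i + shift, last)] for i, ui in enumerate(u)]


command = [0.0, 0.0, 0.0]
error = [1.0, 2.0, 3.0]
print(conventional_update(command, error))
print(delay_compensated_update(command, error, shift=1))
```

Both rules use a single error sample per time step, which is the limitation the present invention addresses.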

[Problems to Be Solved by the Invention]

In either case, however, conventional learning control corrects the taught value or command value using only the deviation at one specific time, so a large number of trials is required. In other words, only control with poor convergence can be achieved.

The present invention was made to solve the above problem, and its object is to obtain a learning control system that offers good positioning accuracy and fast convergence (that is, a small number of trials).

[Means for Solving the Problems]

The learning control system according to the present invention corrects the taught value or command value using all of the deviation information from a given time up to a fixed time ahead.

[Operation]

Because the learning control of the present invention corrects the command value using the deviation information from a given time up to a fixed time ahead, the correction is more effective than one based only on the value at a single time, for example when the playback value lags far behind the target value. Learning control with fast convergence is thus achieved.

[Embodiment of the Invention]

An embodiment of the present invention is described below with reference to the drawings. In Fig. 1, (1) is a command value arithmetic unit, for example a digital computer, that generates the command value for controlling the controlled object (6). (2) is a D/A converter that converts the digital signal from the command value arithmetic unit (1) into an analog signal, (3) is a comparator formed by, for example, an operational amplifier, (4) is a control circuit, (5) is a servo amplifier, (6) is the controlled object, (7) is a detector that detects the output signal from the controlled object (6), (8) is an A/D converter that converts the analog signal fed back by the detector (7) into a digital signal, and (9) is a memory that stores the digital signal from the A/D converter (8).

The operation of this embodiment is described with reference to the flowchart of Fig. 2. First, in the initial-setting step, the object memorizes the position data of the target work trajectory through a teaching operation or the like, and the various gains, such as the advance information utilization time (T), are set. Next, in the playback step, a playback operation is performed based on these initial settings. During this operation, the output signal from the controlled object (6) at each sample time is stored in the memory (9) via the detector (7) and the A/D converter (8).

When one playback operation is completed, the command value arithmetic unit (1) calculates an evaluation function (J), for example the integral of the squared error, from the stored data. If, in the following decision step, the evaluation function (J) is smaller than the prescribed value (Jmin) (yes), control ends. Otherwise (no), the command value (u(t)) is corrected in the correction step using the errors (e(t) to e(t+T)) over the interval from time (t) to (t+T), and the playback operation is performed again with the new command value. The same procedure is repeated until the evaluation function (J) falls below (Jmin).
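The flowchart procedure can be sketched as a loop of playback, evaluation, decision, and correction. This is a minimal sketch under stated assumptions: the first-order lag in `run_trial` stands in for the real servo system and controlled object, the rectangular-sum approximation of the squared-error integral, the threshold values, the trial limit, and the point-wise placeholder correction are all hypothetical choices made so that the example runs on its own.

```python
def run_trial(u, a=0.2, b=0.8):
    """Hypothetical plant (assumption): first-order lag, y[i] = a*y[i-1] + b*u[i]."""
    y, prev = [], 0.0
    for ui in u:
        prev = a * prev + b * ui
        y.append(prev)
    return y


def evaluate(e, dt):
    """Evaluation function J: discrete approximation of the squared-error integral."""
    return sum(ei * ei for ei in e) * dt


def pointwise_correct(u, e):
    """Placeholder correction (assumption): add the error at the same sample."""
    return [ui + ei for ui, ei in zip(u, e)]


def learn(r, j_min=1e-6, max_trials=50, dt=0.01, correct=pointwise_correct):
    """Repeat playback and correction until J < j_min or max_trials is reached."""
    u = list(r)  # initial setting: the first command is the taught value
    j = float("inf")
    for trial in range(1, max_trials + 1):
        y = run_trial(u)                       # playback; the output is recorded
        e = [ri - yi for ri, yi in zip(r, y)]  # error against the taught value
        j = evaluate(e, dt)
        if j < j_min:                          # decision "yes": control ends
            return u, j, trial
        u = correct(u, e)                      # decision "no": correct and replay
    return u, j, max_trials


taught = [i / 49 for i in range(50)]  # ramp from 0 to 1 as a stand-in trajectory
command, j_final, trials = learn(taught)
print(f"J = {j_final:.2e} after {trials} trials")
```

The `correct` argument is where a correction that uses the errors from (t) to (t+T) would be plugged in; the placeholder here uses only the error at the same sample.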

The command value correction described above is explained in more detail with reference to the characteristic diagram of Fig. 3. In Fig. 3, the horizontal axis represents time and the vertical axis represents the position of the controlled object. The solid line (y_d(t)) is the target trajectory, and the dash-dot line (y_k(t)) is the actual trajectory in the k-th trial. In conventional learning control, the (k+1)-th command value (u_{k+1}(t)) was corrected using only the error (e_k(t)) at time (t), or only the error (e_k(t+Tc)) at a point advanced by a delay compensation time (Tc) obtained by some method. In the present invention, by contrast, the command value is corrected using all (T/Δt) error values (e(t) to e(t+T)) in the interval from time (t) up to the advance information utilization time (T) ahead. The target value can therefore be reached sooner than when the correction relies on the error at a single time.
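A correction that uses every error sample in the interval from (t) to (t+T) might look like the sketch below. The text states only that all of these error values are used; it does not say how they are combined. The weighted sum, the `weights` and `gain` parameters, and the skipping of samples beyond the end of the record are therefore assumptions made for illustration.

```python
def windowed_update(u, e, weights, gain=1.0):
    """Window-based correction (assumed form):

        u_{k+1}[i] = u_k[i] + gain * sum_m weights[m] * e_k[i + m]

    len(weights) is the number of error samples in the window, roughly
    T / dt for a window length T and sample interval dt.
    """
    n = len(e)
    new_u = []
    for i, ui in enumerate(u):
        # Error samples past the end of the record are skipped (assumption).
        corr = sum(w * e[i + m] for m, w in enumerate(weights) if i + m < n)
        new_u.append(ui + gain * corr)
    return new_u


command = [0.0, 0.0, 0.0, 0.0]
error = [1.0, 2.0, 3.0, 4.0]
# Two-sample window with equal weights (hypothetical choice).
print(windowed_update(command, error, weights=[0.5, 0.5]))
```

With a single weight of 1.0 the rule reduces to the conventional point-wise correction, so the window length is the only structural difference between the two.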

In the above embodiment, the servo controller and the controlled object (6) form an analog servo system, but a digital servo system may of course be used instead. The above description was limited to a single degree of freedom, but the invention is equally applicable to controlled objects having multiple degrees of freedom.
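For a controlled object with several degrees of freedom, one way to read "for each degree of freedom" is to apply the same window-based correction to each axis independently. The sketch below assumes this reading; the axis names, the per-axis weight tables, and the weighted-sum form of the correction are hypothetical.

```python
def windowed_update(u, e, weights):
    """Assumed correction: u[i] + sum_m weights[m] * e[i + m], truncated at the end."""
    n = len(e)
    return [
        ui + sum(w * e[i + m] for m, w in enumerate(weights) if i + m < n)
        for i, ui in enumerate(u)
    ]


def update_all_axes(commands, errors, weights_per_axis):
    """Correct each degree of freedom separately, using its own error window."""
    return {
        axis: windowed_update(commands[axis], errors[axis], weights_per_axis[axis])
        for axis in commands
    }


commands = {"axis1": [0.0, 0.0], "axis2": [1.0, 1.0]}
errors = {"axis1": [2.0, 4.0], "axis2": [0.5, 0.5]}
weights = {"axis1": [0.5, 0.5], "axis2": [1.0]}  # different window per axis
print(update_all_axes(commands, errors, weights))
```

Keeping a separate weight table per axis lets each degree of freedom use its own window length.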

[Effect of the Invention]

As described above, according to the present invention, the learning control system is configured to correct the command value, for each degree of freedom, using all of the error information from a given time up to a fixed time ahead. This yields learning control with good positioning accuracy and fast convergence.

[Brief Description of the Drawings]

Fig. 1 is a block diagram showing a learning control device according to an embodiment of the present invention, Fig. 2 is a flowchart showing an example of the processing procedure of the learning control system according to an embodiment of the present invention, and Fig. 3 is a characteristic diagram for explaining the present invention in detail. In the figures, (1) is the command value arithmetic unit, (2) is the D/A converter, (3) is the comparator, (4) is the control circuit, (5) is the servo amplifier, (6) is the controlled object, (7) is the detector, (8) is the A/D converter, and (9) is the memory. The same reference numerals denote the same or corresponding parts throughout the figures.

Claims

[Claims]

A learning control system in which a controlled object having a plurality of degrees of freedom is operated in playback according to a taught value, the deviation between the taught value and the playback trajectory is measured, and the next playback operation is performed with the deviation added to the taught value or to the current command value, characterized in that the command value is corrected, for each degree of freedom, using all of the deviation information from a given time up to a fixed time ahead.