JP2004291228A5

Info

Publication number: JP2004291228A5
Application number: JP2004068133A
Authority: JP
Filing date: 2004-03-10
Publication date: 2007-04-19
Anticipated expiration: 2024-03-10

Claims

An information processing apparatus that performs processing for autonomously selecting and expressing a behavior based on an internal state and an external stimulus, the apparatus comprising:
a plurality of behavior description modules, each describing a behavior associated with a predetermined internal state and external stimulus;
a behavior value calculation database having a data format that associates an input external stimulus with an expected internal state change, that is, the change in internal state expected to occur after the behavior is expressed;
behavior value calculation means for referring to the behavior value calculation database using the internal state and the external stimulus, obtaining a desire value for the behavior associated with the internal state and a satisfaction level based on the internal state, and calculating the behavior value of the behavior described in each of the behavior description modules based on the desire value obtained from the current internal state and the expected satisfaction change obtained from the expected internal state change; and
behavior selection means for selecting a behavior description module based on the calculated behavior value and causing the behavior described in the selected behavior description module to be expressed.
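
As an illustration of the behavior value calculation recited above, the following is a minimal sketch, not the method disclosed in the specification. The claim does not state how the desire value, the satisfaction level, or their combination are computed, so the linear functions, the 0 to 100 range of the internal state, and the `desire_weight` parameter are all assumptions introduced here.

```python
def clamp(level: float, lo: float = 0.0, hi: float = 100.0) -> float:
    """Keep an internal state level inside its assumed 0..100 range."""
    return max(lo, min(hi, level))


def desire_value(level: float) -> float:
    """Assumed form: desire is high when the internal state level is low."""
    return 1.0 - clamp(level) / 100.0


def satisfaction(level: float) -> float:
    """Assumed form: satisfaction rises linearly with the internal state level."""
    return clamp(level) / 100.0


def behavior_value(level: float, expected_change: float,
                   desire_weight: float = 0.5) -> float:
    """Combine the desire value from the current internal state with the
    expected satisfaction change derived from the expected internal state change.
    The weighted sum is an assumption, not taken from the claim."""
    expected_satisfaction_change = (
        satisfaction(level + expected_change) - satisfaction(level)
    )
    return (desire_weight * desire_value(level)
            + (1.0 - desire_weight) * expected_satisfaction_change)


# Hypothetical database content: behavior -> expected change of one internal state.
expected_changes = {"eat": 30.0, "sleep": 0.0}
current_level = 20.0
values = {name: behavior_value(current_level, change)
          for name, change in expected_changes.items()}
```

With these assumed numbers, "eat" receives a higher behavior value than "sleep" because it is expected to raise satisfaction while the desire term is the same for both.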

The information processing apparatus according to claim 1, further comprising learning means for updating the corresponding expected internal state change in the behavior value calculation database based on the amount of internal state change actually obtained as a result of expressing the behavior selected by the behavior selection means.
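
The learning recited in this claim can be pictured as moving a stored expected internal state change toward the change that was actually observed. The sketch below assumes an exponential moving average with a learning rate `alpha`; the claim does not specify the update rule, and the dictionary layout and key names are hypothetical.

```python
def update_expected_change(db: dict, key: tuple, actual_change: float,
                           alpha: float = 0.1) -> float:
    """Move the stored expected change toward the observed change.

    db    : hypothetical mapping (behavior, stimulus) -> expected change
    alpha : assumed learning rate in (0, 1]; not specified by the claim
    """
    old = db[key]
    db[key] = old + alpha * (actual_change - old)
    return db[key]


# Hypothetical entry: eating an apple was expected to raise the state by 30.
db = {("eat", "apple"): 30.0}
new_value = update_expected_change(db, ("eat", "apple"), actual_change=20.0, alpha=0.5)
```

A larger `alpha` makes the database follow recent experience more quickly, at the cost of being more sensitive to a single unusual outcome.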

The information processing apparatus according to claim 1, wherein the behavior value calculation means calculates the behavior value for the behavior described in each of the behavior description modules based on the desire value obtained from the current internal state, the satisfaction level obtained from the current internal state, and the expected satisfaction change.

The information processing apparatus according to claim 1, wherein the behavior value calculation database holds expected internal state changes associated with values of the external stimulus.

The information processing apparatus according to claim 4, wherein, when a value not present in the behavior value calculation database is input, the behavior value calculation means calculates the expected internal state change by performing linear interpolation using a linear model.
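
A short sketch of the linear interpolation described in this claim follows. It assumes the database is a list of (stimulus value, expected internal state change) pairs sorted by stimulus value. Clamping to the nearest stored entry outside the stored range is a choice made here for illustration; the claim does not say how out-of-range inputs are handled.

```python
from bisect import bisect_left


def expected_change(table: list, x: float) -> float:
    """Look up x in a sorted table of (stimulus_value, expected_change) pairs,
    interpolating linearly between the two neighbouring entries when x is absent."""
    xs = [point[0] for point in table]
    i = bisect_left(xs, x)
    if i < len(xs) and xs[i] == x:
        return table[i][1]          # exact hit in the database
    if i == 0:
        return table[0][1]          # below the stored range: clamp (assumption)
    if i == len(xs):
        return table[-1][1]         # above the stored range: clamp (assumption)
    (x0, y0), (x1, y1) = table[i - 1], table[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# Hypothetical table: object size -> expected change in an internal state.
table = [(0.0, 0.0), (10.0, 20.0), (20.0, 30.0)]
midpoint = expected_change(table, 5.0)   # halfway between the first two entries
```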

The information processing apparatus according to claim 1, wherein the behavior selection means always selects, from among the candidate behaviors, the behavior with the maximum behavior value calculated by the behavior value calculation means.

The information processing apparatus according to claim 1, wherein the behavior selection means selects at random from among the candidate behaviors, regardless of the behavior value calculated by the behavior value calculation means.

The information processing apparatus according to claim 1, wherein the behavior selection means selects from among the candidate behaviors according to a probability that depends on the behavior value calculated by the behavior value calculation means.
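
The three selection policies recited in the preceding dependent claims (always the maximum, uniformly at random, and with probability depending on the behavior value) can be sketched as follows. The function names are hypothetical, and the softmax weighting with a `temperature` parameter is one assumed way of turning behavior values into probabilities; the claims do not name a specific distribution.

```python
import math
import random


def select_greedy(values: dict) -> str:
    """Always pick the candidate with the maximum behavior value."""
    return max(values, key=values.get)


def select_random(values: dict, rng=random) -> str:
    """Pick uniformly at random, ignoring the behavior values."""
    return rng.choice(list(values))


def select_by_probability(values: dict, temperature: float = 1.0, rng=random) -> str:
    """Pick with probability increasing in the behavior value.
    Softmax weighting is an assumption made for this sketch."""
    names = list(values)
    top = max(values.values())  # subtract the maximum for numerical stability
    weights = [math.exp((values[n] - top) / temperature) for n in names]
    return rng.choices(names, weights=weights, k=1)[0]


# Hypothetical behavior values for three candidate behaviors.
values = {"eat": 0.55, "sleep": 0.40, "kick": 0.10}
rng = random.Random(0)  # seeded generator so runs are repeatable
greedy_choice = select_greedy(values)
random_choice = select_random(values, rng)
weighted_choice = select_by_probability(values, temperature=0.2, rng=rng)
```

The greedy policy exploits what the database currently predicts, while the random and probabilistic policies let behaviors with low current values be tried, which gives a learning step new outcomes to learn from.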

The information processing apparatus according to claim 1, wherein the behavior value calculation database manages the data format as a set consisting of the behavior described in each of the behavior description modules, a characteristic of an object serving as the external stimulus, and the internal state.

The information processing apparatus according to claim 9, wherein the behavior value calculation means searches the behavior value calculation database using the behavior described in each of the behavior description modules as an index, and determines the internal state from the characteristic of the object serving as the external stimulus.

The information processing apparatus according to claim 9, wherein the behavior value calculation means searches the behavior value calculation database using a certain characteristic of the object serving as the external stimulus as an index, and determines the internal state.
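
The two ways of searching the database described in the claims that depend on claim 9 (indexing by behavior, or indexing by a characteristic of the object) can be illustrated with a small record list. The record layout, field names, and sample entries below are hypothetical; the claims only state that behavior, object characteristic, and internal state are managed as a set.

```python
from typing import List, NamedTuple


class Record(NamedTuple):
    behavior: str            # behavior described in a behavior description module
    object_property: str     # characteristic of the object (external stimulus)
    internal_state: str      # internal state affected by the behavior
    expected_change: float   # expected internal state change


# Hypothetical database content.
DB = [
    Record("eat", "color=red", "nourishment", 30.0),
    Record("eat", "color=green", "nourishment", 10.0),
    Record("kick", "color=red", "fatigue", -5.0),
]


def search_by_behavior(db: List[Record], behavior: str,
                       object_property: str) -> List[Record]:
    """Use the behavior as the index, then narrow by the object characteristic."""
    return [r for r in db
            if r.behavior == behavior and r.object_property == object_property]


def search_by_property(db: List[Record], object_property: str) -> List[Record]:
    """Use one object characteristic as the index, across all behaviors."""
    return [r for r in db if r.object_property == object_property]


states_for_eating_red = [r.internal_state
                         for r in search_by_behavior(DB, "eat", "color=red")]
states_for_red = sorted(r.internal_state
                        for r in search_by_property(DB, "color=red"))
```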

The information processing apparatus according to claim 11, wherein the behavior value calculation means sets to arbitrary values, or averages over, the behavior or the other characteristics of the object serving as the external stimulus, thereby giving the object an abstract value.
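
One possible reading of the averaging in this claim is sketched below: the database is searched by one indexed characteristic, and the expected changes are averaged over every behavior and every other characteristic that shares it. This interpretation, the tuple layout, and the fallback value of 0.0 for an unknown characteristic are assumptions, not details given in the claim.

```python
from statistics import mean


def abstract_value(records: list, indexed_property: str) -> float:
    """Average the expected change over all records sharing the indexed property.

    records: list of (behavior, indexed_property, other_property, expected_change)
    Returns 0.0 when no record matches (an assumed default).
    """
    matches = [change for (_behavior, prop, _other, change) in records
               if prop == indexed_property]
    return mean(matches) if matches else 0.0


# Hypothetical records: the object's color is the indexed characteristic,
# its size is the "other" characteristic that gets averaged out.
records = [
    ("eat", "red", "small", 30.0),
    ("eat", "red", "large", 10.0),
    ("kick", "green", "small", -5.0),
]
value_of_red = abstract_value(records, "red")
```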

A behavior control method for a robot apparatus that autonomously selects and expresses a behavior based on an internal state and an external stimulus, wherein each behavior is described as a behavior description module associated with a predetermined internal state and external stimulus, the method comprising:
a step of managing a behavior value calculation database having a data format that associates an input external stimulus with an expected internal state change, that is, the change in internal state expected to occur after the behavior is expressed;
a behavior value calculation step of referring to the behavior value calculation database using the internal state and the external stimulus, obtaining a desire value for the behavior associated with the internal state and a satisfaction level based on the internal state, and calculating the behavior value of the behavior described in each of the behavior description modules based on the desire value obtained from the current internal state and the expected satisfaction change obtained from the expected internal state change;
a behavior selection step of selecting a behavior description module based on the calculated behavior value and causing the behavior described in the selected behavior description module to be expressed; and
a learning step of updating the behavior value calculation database based on the result after the selected behavior is expressed.
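
The order of the steps in this method claim (look up the database, calculate behavior values, select and express a behavior, then learn from the result) can be sketched as one control cycle. Everything concrete here is an assumption: the nested dictionary layout, the linear desire and satisfaction terms on a 0 to 100 scale, greedy selection, the moving-average update, and the `actual_effect` callback that stands in for the robot actually performing the behavior.

```python
def control_step(level: float, stimulus: str, db: dict, actual_effect,
                 alpha: float = 0.5):
    """Run one cycle: value calculation, selection, expression, learning.

    db            : hypothetical layout {behavior: {stimulus: expected_change}}
    actual_effect : hypothetical callback (behavior, stimulus) -> observed change
    """
    def clamp(v: float) -> float:
        return max(0.0, min(100.0, v))

    # Behavior value calculation step (assumed linear desire and satisfaction).
    values = {}
    for behavior, table in db.items():
        expected = table.get(stimulus, 0.0)
        desire = 1.0 - level / 100.0
        satisfaction_change = (clamp(level + expected) - level) / 100.0
        values[behavior] = desire + satisfaction_change

    # Behavior selection step (greedy selection is an assumption).
    chosen = max(values, key=values.get)

    # Express the behavior and observe the actual internal state change.
    actual = actual_effect(chosen, stimulus)

    # Learning step: move the stored expectation toward the observation.
    old = db[chosen].get(stimulus, 0.0)
    db[chosen][stimulus] = old + alpha * (actual - old)

    return chosen, clamp(level + actual)


# Hypothetical setup: eating an apple actually raises the state by 20, not 30.
db = {"eat": {"apple": 30.0}, "sleep": {"apple": 0.0}}
chosen, new_level = control_step(
    20.0, "apple", db,
    actual_effect=lambda behavior, stimulus: 20.0 if behavior == "eat" else 0.0,
)
```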

A robot apparatus that autonomously selects and expresses a behavior based on an internal state and an external stimulus, the robot apparatus comprising:
a plurality of behavior description modules, each describing a behavior associated with a predetermined internal state and external stimulus;
a behavior value calculation database having a data format that associates an input external stimulus with an expected internal state change, that is, the change in internal state expected to occur after the behavior is expressed;
behavior value calculation means for referring to the behavior value calculation database using the internal state and the external stimulus, obtaining a desire value for the behavior associated with the internal state and a satisfaction level based on the internal state, and calculating the behavior value of the behavior described in each of the behavior description modules based on the desire value obtained from the current internal state and the expected satisfaction change obtained from the expected internal state change; and
behavior selection means for selecting a behavior description module based on the calculated behavior value and causing the behavior described in the selected behavior description module to be expressed.

A computer program written in a computer-readable format so as to execute, on a computer, behavior control of a robot apparatus that autonomously selects and expresses a behavior based on an internal state and an external stimulus,
the computer program comprising a plurality of behavior description modules, each describing a behavior associated with a predetermined internal state and external stimulus,
the computer program causing the computer to execute:
a procedure of managing a behavior value calculation database having a data format that associates an input external stimulus with an expected internal state change, that is, the change in internal state expected to occur after the behavior is expressed;
a behavior value calculation procedure of referring to the behavior value calculation database using the internal state and the external stimulus, obtaining a desire value for the behavior associated with the internal state and a satisfaction level based on the internal state, and calculating the behavior value of the behavior described in each of the behavior description modules based on the desire value obtained from the current internal state and the expected satisfaction change obtained from the expected internal state change;
a behavior selection procedure of selecting a behavior description module based on the calculated behavior value and causing the behavior described in the selected behavior description module to be expressed; and
a learning procedure of updating the behavior value calculation database based on the result after the selected behavior is expressed.