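JP2004291228A5 - Information processing apparatus, action control method for robot apparatus, robot apparatus, and computer program - Google Patents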
- Publication number
- JP2004291228A5
- Authority
- JP
- Japan
- Prior art keywords
- action
- internal state
- behavior
- value
- value calculation
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Claims (15)
An information processing apparatus that autonomously selects and expresses an action based on an internal state and an external stimulus, the apparatus comprising:
a plurality of action description modules, each describing an action associated with a predetermined internal state and external stimulus;
an action value calculation database whose data format associates an input external stimulus with the expected internal state change, that is, the change in internal state expected after the action is expressed;
action value calculation means that refers to the action value calculation database using the internal state and the external stimulus, obtains a desire value for the action associated with the internal state and a satisfaction level based on the internal state, and calculates the action value of the action described in each action description module based on the desire value obtained from the current internal state and the expected satisfaction change obtained from the expected internal state change; and
action selection means that selects an action description module based on the calculated action value and causes the action described in the selected action description module to be expressed.
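The structure of claim 1 can be sketched in Python as follows. This is only an illustration under stated assumptions: the claim gives no formulas, so the linear `desire` and `satisfaction` functions, the weighted sum in `action_value`, the 0 to 100 state range, and every name used here (`ActionModule`, `state_key`, `stimulus_key`, `weight`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActionModule:
    name: str          # action described by this module
    state_key: str     # internal state the action is associated with
    stimulus_key: str  # external stimulus the action responds to


def desire(state: float) -> float:
    # Assumed shape: the lower the internal state (0..100), the stronger the desire.
    return 1.0 - state / 100.0


def satisfaction(state: float) -> float:
    # Assumed shape: satisfaction rises linearly with the internal state.
    return state / 100.0


def action_value(module, internal, stimuli, db, weight=0.5):
    current = internal[module.state_key]
    # Database lookup: (action, stimulus value) -> expected internal state change.
    expected_change = db[(module.name, stimuli[module.stimulus_key])]
    predicted = min(100.0, max(0.0, current + expected_change))
    expected_satisfaction_change = satisfaction(predicted) - satisfaction(current)
    # Assumed combination: weighted sum of desire and expected satisfaction change.
    return weight * desire(current) + (1.0 - weight) * expected_satisfaction_change


def select_action(modules, internal, stimuli, db):
    # Pick the module whose action currently has the highest action value.
    return max(modules, key=lambda m: action_value(m, internal, stimuli, db))
```

With a low nourishment state and a database entry that predicts a large gain from eating, `select_action` would return the module for eating, because both its desire term and its expected satisfaction change are larger.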
The information processing apparatus according to claim 1, further comprising learning means that, based on the internal state change actually obtained as a result of expressing the action selected by the action selection means, updates the corresponding expected internal state change in the action value calculation database.
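The learning described in claim 2 could look like the sketch below. The claim does not say how the stored value is updated, so the exponential moving average and the names `update_expected_change`, `key`, and `alpha` are assumptions made for illustration.

```python
def update_expected_change(db, key, observed_change, alpha=0.1):
    """Move the stored expected internal state change toward the observed one.

    db maps a hypothetical key, e.g. (action, stimulus value), to the
    expected internal state change; alpha is an assumed learning rate.
    """
    old = db[key]
    db[key] = old + alpha * (observed_change - old)
    return db[key]
```

A small `alpha` makes the database change slowly and smooths out noisy observations; a value of 1.0 would simply overwrite the entry with the latest observation.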
The information processing apparatus according to claim 1, wherein the action value calculation means calculates the action value of the action described in each action description module based on the desire value obtained from the current internal state, the satisfaction level obtained from the current internal state, and the expected satisfaction change.
The information processing apparatus according to claim 1, wherein, in the action value calculation database, the expected internal state change is associated with a value of the external stimulus.
The information processing apparatus according to claim 4, wherein, when a value not present in the action value calculation database is input, the action value calculation means calculates the expected internal state change by linear interpolation using a linear model.
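The interpolation in claim 5 can be illustrated with a short Python sketch. It assumes the database holds expected changes at a few discrete stimulus values, sorted by stimulus value; clamping outside the stored range is an added assumption, and the name `interpolate_expected_change` is hypothetical.

```python
from bisect import bisect_left


def interpolate_expected_change(table, x):
    """Linearly interpolate the expected internal state change at stimulus value x.

    table is a list of (stimulus_value, expected_change) pairs sorted by
    stimulus_value. Values outside the stored range are clamped to the
    nearest entry (an assumption, not stated in the claim).
    """
    xs = [point[0] for point in table]
    if x <= xs[0]:
        return table[0][1]
    if x >= xs[-1]:
        return table[-1][1]
    i = bisect_left(xs, x)           # first stored value >= x
    (x0, y0), (x1, y1) = table[i - 1], table[i]
    t = (x - x0) / (x1 - x0)         # position of x between its neighbours
    return y0 + t * (y1 - y0)
```

For a table with entries at stimulus values 0, 100, and 200, an input of 50 falls halfway between the first two entries and receives the midpoint of their expected changes.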
The information processing apparatus according to claim 1, wherein the action selection means always selects, from among the candidate actions, the action with the maximum action value calculated by the action value calculation means.
The information processing apparatus according to claim 1, wherein the action selection means selects at random from among the candidate actions, regardless of the action value calculated by the action value calculation means.
The information processing apparatus according to claim 1, wherein the action selection means selects from among the candidate actions with a probability that depends on the action value calculated by the action value calculation means.
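The three selection rules in the preceding claims (always the maximum, uniformly at random, and with a probability depending on the action value) can be compared side by side. The sketch below is illustrative only: the function names are hypothetical, and making the probability directly proportional to a non-negative action value is one assumed reading of "a probability that depends on the action value".

```python
import random


def select_greedy(values):
    """Always pick the candidate with the maximum action value."""
    return max(values, key=values.get)


def select_random(values, rng=random):
    """Pick uniformly at random, ignoring the action values."""
    return rng.choice(list(values))


def select_proportional(values, rng=random):
    """Pick with probability proportional to the (non-negative) action value."""
    names = list(values)
    weights = [max(values[name], 0.0) for name in names]
    if sum(weights) == 0.0:
        # No candidate has a positive value: fall back to a uniform choice.
        return rng.choice(names)
    return rng.choices(names, weights=weights, k=1)[0]
```

Here `values` maps each candidate action to its calculated action value. The greedy rule is deterministic, while the other two keep some exploration, which matters if the expected changes in the database are themselves being learned.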
The information processing apparatus according to claim 1, wherein the action value calculation database manages the data format as a set consisting of the action described in each action description module, a characteristic of an object serving as the external stimulus, and an internal state.
The information processing apparatus according to claim 9, wherein the action value calculation means searches the action value calculation database using the action described in each action description module as an index, and determines the internal state from the characteristics of the object serving as the external stimulus.
The information processing apparatus according to claim 9, wherein the action value calculation means searches the action value calculation database using a given characteristic of the object serving as the external stimulus as an index, and determines the internal state.
The information processing apparatus according to claim 11, wherein the action value calculation means arbitrarily sets, or averages over, the action or the other characteristics of the object serving as the external stimulus, and assigns an abstract value to the object.
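One possible reading of claims 9 to 12 is sketched below in Python. The record layout (action, object characteristic, internal state, expected change), the function names, and the choice to average over actions when indexing by a characteristic are all assumptions made for illustration; the claims do not fix a concrete data structure.

```python
from collections import defaultdict
from statistics import mean


def lookup_by_action(records, action, characteristic):
    """Index by action, then read the expected change per internal state
    for an object with the given characteristic."""
    return {
        state: change
        for (a, c, state, change) in records
        if a == action and c == characteristic
    }


def lookup_by_characteristic(records, characteristic):
    """Index by object characteristic and average over all actions,
    giving the object an action-independent value per internal state."""
    grouped = defaultdict(list)
    for (_action, c, state, change) in records:
        if c == characteristic:
            grouped[state].append(change)
    return {state: mean(changes) for state, changes in grouped.items()}
```

Each record in `records` is a tuple such as `("eat", "size=large", "nourishment", 30.0)`. The second lookup returns one averaged expected change per internal state, which can be read as an abstract value of the object that does not depend on a particular action.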
An action control method for a robot apparatus that autonomously selects and expresses an action based on an internal state and an external stimulus, each action being described in an action description module in association with a predetermined internal state and external stimulus, the method comprising:
a step of managing an action value calculation database whose data format associates an input external stimulus with the expected internal state change, that is, the change in internal state expected after the action is expressed;
an action value calculation step of referring to the action value calculation database using the internal state and the external stimulus, obtaining a desire value for the action associated with the internal state and a satisfaction level based on the internal state, and calculating the action value of the action described in each action description module based on the desire value obtained from the current internal state and the expected satisfaction change obtained from the expected internal state change;
an action selection step of selecting an action description module based on the calculated action value and causing the action described in the selected action description module to be expressed; and
a learning step of updating the action value calculation database based on the result after the selected action is expressed.
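The order of the steps in this method claim (calculate action values, select, express, learn) can be shown as a single control cycle. The sketch below is hypothetical throughout: the scoring rule (need multiplied by expected change), the database layout, the moving-average update, and the names `control_step`, `act`, and `alpha` are assumptions, not the method as filed.

```python
def control_step(internal, stimulus, db, act, alpha=0.2):
    """Run one cycle: calculate action values, select, express, learn.

    db maps (action, stimulus) to (internal state name, expected change);
    act(action, internal) expresses the action and mutates internal in place.
    """
    def value(action):
        # Action value calculation step (assumed: need times expected change).
        state_key, expected = db[(action, stimulus)]
        need = 1.0 - internal[state_key] / 100.0
        return need * expected

    candidates = [a for (a, s) in db if s == stimulus]
    chosen = max(candidates, key=value)              # action selection step
    state_key, expected = db[(chosen, stimulus)]
    before = internal[state_key]
    act(chosen, internal)                            # express the action
    observed = internal[state_key] - before
    # Learning step: move the stored expectation toward what was observed.
    db[(chosen, stimulus)] = (state_key, expected + alpha * (observed - expected))
    return chosen
```

Calling `control_step` repeatedly lets the stored expectations drift toward the changes the robot actually experiences, so later action values reflect experience rather than the initial table.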
A robot apparatus comprising:
a plurality of action description modules, each describing an action associated with a predetermined internal state and external stimulus;
an action value calculation database whose data format associates an input external stimulus with the expected internal state change, that is, the change in internal state expected after the action is expressed;
action value calculation means that refers to the action value calculation database using the internal state and the external stimulus, obtains a desire value for the action associated with the internal state and a satisfaction level based on the internal state, and calculates the action value of the action described in each action description module based on the desire value obtained from the current internal state and the expected satisfaction change obtained from the expected internal state change; and
action selection means that selects an action description module based on the calculated action value and causes the action described in the selected action description module to be expressed.
A computer program for a computer provided with a plurality of action description modules, each describing an action associated with a predetermined internal state and external stimulus, the program causing the computer to execute:
a procedure for managing an action value calculation database whose data format associates an input external stimulus with the expected internal state change, that is, the change in internal state expected after the action is expressed;
an action value calculation procedure for referring to the action value calculation database using the internal state and the external stimulus, obtaining a desire value for the action associated with the internal state and a satisfaction level based on the internal state, and calculating the action value of the action described in each action description module based on the desire value obtained from the current internal state and the expected satisfaction change obtained from the expected internal state change;
an action selection procedure for selecting an action description module based on the calculated action value and causing the action described in the selected action description module to be expressed; and
a learning procedure for updating the action value calculation database based on the result after the selected action is expressed.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2004068133A JP4552465B2 (en) | 2003-03-11 | 2004-03-10 | Information processing apparatus, action control method for robot apparatus, robot apparatus, and computer program |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2003065586 | 2003-03-11 | ||
JP2004068133A JP4552465B2 (en) | 2003-03-11 | 2004-03-10 | Information processing apparatus, action control method for robot apparatus, robot apparatus, and computer program |
Publications (3)
Publication Number | Publication Date |
---|---|
JP2004291228A (en) | 2004-10-21 |
JP2004291228A5 (en) | 2007-04-19 |
JP4552465B2 (en) | 2010-09-29 |
Family
ID=33421555
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
JP2004068133A Expired - Fee Related JP4552465B2 (en) | 2003-03-11 | 2004-03-10 | Information processing apparatus, action control method for robot apparatus, robot apparatus, and computer program |
Country Status (1)
Country | Link |
---|---|
JP (1) | JP4552465B2 (en) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR100909532B1 (en) * | 2007-02-07 | 2009-07-27 | 삼성전자주식회사 | Method and device for learning behavior of software robot |
US7984013B2 (en) | 2007-02-07 | 2011-07-19 | Samsung Electronics Co., Ltd | Method and apparatus for learning behavior in software robot |
JP5910249B2 (en) * | 2012-03-30 | 2016-04-27 | 富士通株式会社 | Interaction device and interaction control program |
US9751212B1 (en) * | 2016-05-05 | 2017-09-05 | Toyota Jidosha Kabushiki Kaisha | Adapting object handover from robot to human using perceptual affordances |
JP7312511B1 (en) * | 2023-02-17 | 2023-07-21 | 独立行政法人国立高等専門学校機構 | Behavior control method, behavior control program, behavior control device, and communication robot |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2002239952A (en) * | 2001-02-21 | 2002-08-28 | Sony Corp | Robot device, action control method for robot device, program, and recording medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP6963627B2 (en) | Neural architecture search for convolutional neural networks | |
CN111144580B (en) | Hierarchical reinforcement learning training method and device based on imitation learning | |
US9256371B2 (en) | Implementing reinforcement learning based flash control | |
JP2020535546A5 (en) | ||
CN110263979B (en) | Method and device for predicting sample label based on reinforcement learning model | |
CN104076809B (en) | Data processing equipment and data processing method | |
KR20170058954A (en) | System for generating sets of control data for robots | |
US20190262990A1 (en) | Robot skill management | |
JP2016151932A5 (en) | ||
CN108255059B (en) | Robot control method based on simulator training | |
CN108229640B (en) | Emotion expression method and device and robot | |
JP2006309519A (en) | Reinforcement learning system and reinforcement learning program | |
CN111124916B (en) | Model training method based on motion semantic vector and electronic equipment | |
JP2004291228A5 (en) | ||
CN114529010A (en) | Robot autonomous learning method, device, equipment and storage medium | |
JP2009505198A5 (en) | ||
JPWO2020090949A5 (en) | ||
JP2021513148A5 (en) | ||
JP6947029B2 (en) | Control devices, information processing devices that use them, control methods, and computer programs | |
CN107229965B (en) | Anthropomorphic system of intelligent robot and method for simulating forgetting effect | |
CN111930602A (en) | Performance index prediction method and device | |
JPWO2022013933A5 (en) | Control device, control method and program | |
JP7196935B2 (en) | Arithmetic device, action determination method, and control program | |
JP5927797B2 (en) | Robot control device, robot system, behavior control method for robot device, and program | |
JP2013152595A5 (en) |