JP2020157436A

JP2020157436A - Work-piece take-out work device

Info

Publication number: JP2020157436A
Application number: JP2019060437A
Authority: JP
Inventors: 博明大庭; Hiroaki Oba
Original assignee: NTN Corp; NTN Toyo Bearing Co Ltd
Current assignee: NTN Corp
Priority date: 2019-03-27
Filing date: 2019-03-27
Publication date: 2020-10-01
Anticipated expiration: 2039-03-27
Also published as: JP7113778B2

Abstract

To provide a technique for performing machine learning efficiently with a structure that has no singular point. SOLUTION: A work device 101 includes: a holding section 112; an angle adjustment section 111 that adjusts the direction of the holding section 112; a work head 109 on which the angle adjustment section 111 is mounted; a position adjustment section that moves the work head 109 by means of a plurality of driving sections; a first imaging device 115; and an information processor 102. The angle adjustment section 111 has first and second link hubs 32 and 33, a plurality of links arranged in parallel between the link hubs, and a plurality of driving sections that drive the respective links. When the images captured by the first imaging device 115 show that the holding section 112 holds two or more work-pieces, the information processor 102 uses the number of work-pieces held by the holding section 112 and the torque of each driving section of the angle adjustment section 111 as parameters of a machine learning model, drives each driving section on the basis of the driving signals determined by the machine learning model, and causes the holding section 112 to perform an operation that releases the entanglement of the work-pieces. SELECTED DRAWING: Figure 1

Description

The present disclosure relates to a work take-out work device and, more specifically, to machine-learning-based control of a work take-out work device that has an angle adjusting function provided by a link mechanism.

In automated machining or assembly, workpieces are often picked up automatically by a robot, an assembly device, or the like (a so-called work take-out work device) and set in a processing machine or an assembly.

When a workpiece that has been taken out is to be assembled to another workpiece in a subsequent assembly process, the work take-out work device needs to take out workpieces one at a time. However, when handling workpieces that tangle easily, such as springs and clips, the device may grip two or more workpieces at once. The work take-out work device is therefore required not only to take out workpieces but also to untangle them from one another (hereinafter referred to as the untangling function).

In addition, with the recent progress in machine learning methods, it is desirable to introduce machine learning in order to improve the control accuracy of the work take-out work device.

Regarding workpiece take-out control, for example, Patent Document 1 (Japanese Unexamined Patent Publication No. 2017-030135) discloses, using an articulated robot, "a machine learning device in which the optimum operation of a robot for taking out randomly placed workpieces, including workpieces piled in bulk, is selected without human intervention" (see [Abstract]).

Regarding untangling control, for example, Patent Document 2 (Japanese Unexamined Patent Publication No. 05-185388) discloses a component supply device in which "an entanglement detection camera images the workpiece gripping state of a robot hand, and a vision device determines, based on the captured image information, whether the workpiece gripped by the robot hand is entangled" (see [Abstract]).

Japanese Unexamined Patent Publication No. 2017-030135
Japanese Unexamined Patent Publication No. 05-185388
Japanese Unexamined Patent Publication No. 2017-064910
Japanese Unexamined Patent Publication No. 06-144584

For example, the techniques disclosed in Patent Documents 1 and 2 both presuppose an articulated robot. An articulated robot has postures, generally called singular points, in which it structurally cannot be controlled. An articulated robot also requires sensors for detecting the forces and moments applied to the workpiece, so when it is combined with machine learning, the number of learning parameters grows and learning efficiency deteriorates.

There is therefore a need for a technique that, unlike an articulated robot, structurally has no singular point and allows machine learning to be performed efficiently.

The present disclosure has been made in view of the above background, and an object in one aspect is to provide a technique that structurally has no singular point and allows machine learning to be performed efficiently.

A work device that takes out workpieces according to one embodiment includes: a gripping unit that takes a workpiece out of a container and grips it; an angle adjusting unit to which the gripping unit is attached and which adjusts the orientation of the gripping unit; a work head to which the angle adjusting unit is attached; a position adjusting unit that moves the work head by means of a plurality of drive units; a gripping-unit imaging device that photographs the workpiece gripped by the gripping unit; and a control device that controls the work device. The angle adjusting unit includes first and second link hubs, a plurality of links arranged in parallel between the first and second link hubs, and a plurality of drive units that drive the respective links. The control device detects the number of workpieces gripped by the gripping unit based on an image captured by the gripping-unit imaging device. When that number is two or more, the control device uses the number of workpieces gripped by the gripping unit and the torque of each drive unit of the angle adjusting unit as parameters of a machine learning model, determines with the machine learning model the drive signals to be sent to each drive unit of the position adjusting unit and of the angle adjusting unit, and drives each drive unit of the position adjusting unit and of the angle adjusting unit based on the determined drive signals, thereby causing the gripping unit to perform a workpiece untangling operation.

According to one embodiment, machine learning can be performed efficiently with a structure that has no singular point.

The above and other objects, features, aspects, and advantages of the invention will become apparent from the following detailed description of the invention, understood in connection with the accompanying drawings.

FIG. 1 is a diagram showing a configuration example of the work take-out work system 100 according to one embodiment.
FIG. 2 is a diagram showing a configuration example of the angle adjusting mechanism 111.
FIG. 3 shows a configuration example in which the electric actuator 11 for attitude control is attached to the rotation shaft 42 of the angle adjusting mechanism 111.
FIG. 4 is a diagram showing a configuration example of the gripping mechanism 112.
FIG. 5 is a diagram showing an example of the angle adjusting mechanism 111 with the gripping mechanism 112 attached.
FIG. 6 is a diagram showing a configuration example of the hardware of the information processing device 102.
FIG. 7 is a diagram showing a configuration example of the functions that realize the information processing device 102.
FIG. 8 is a diagram showing an example of the operation of the evaluation value function unit 802.
FIG. 9 is a diagram showing an example of the operation pattern table 803.
FIG. 10 is a flowchart showing an example of the processing of the work take-out work system 100.
FIG. 11 is a diagram showing an example of an operation image of the processing of FIG. 10.
FIG. 12 is a flowchart showing an example of the learning process for the untangling work of the work take-out work system 100.
FIG. 13 is a flowchart showing an example of the initial learning process for the untangling work (corresponding to step 1230 of FIG. 12).
FIG. 14 is a flowchart showing an example of the learning process for the untangling work (corresponding to step 1260 of FIG. 12).
FIG. 15 is a flowchart showing an example of the update process for the evaluation value function F of the evaluation value function unit 802.

Hereinafter, embodiments of the technical idea according to the present disclosure will be described with reference to the drawings. In the following description, identical parts are given identical reference numerals. Their names and functions are also the same, so detailed descriptions of them will not be repeated.

<A. System configuration>
FIG. 1 is a diagram showing a configuration example of the work take-out work system 100 according to the present embodiment. With reference to FIG. 1, the work take-out work system 100 includes a work take-out work device 101, an information processing device 102, a control device 103, and imaging devices 114 and 115.

The work take-out work device 101 includes a gantry 104, a first linear motion unit 105, a second linear motion unit 106, a third linear motion unit 107, electric actuators 108A, 108B, and 108C (hereinafter collectively referred to as the electric actuators 108), a work head 109, a rotating unit mounting member 110, an angle adjusting mechanism 111, a gripping mechanism 112, and a work container 113.

The gantry 104 is the base on which the position adjusting device is mounted. The position adjusting device consists of the first linear motion unit 105, the second linear motion unit 106, the third linear motion unit 107, the electric actuators 108A, 108B, and 108C that drive the respective linear motion units, and the work head 109.

The first linear motion unit 105, the second linear motion unit 106, and the third linear motion unit 107 move the work head 109 along the mutually orthogonal X, Y, and Z axes, respectively. In one aspect, each linear motion unit may include a frame, a linear shaft, a linear bush, and a trapezoidal screw and ball screw nut for transmitting power from the electric actuator 108. In another aspect, each linear motion unit may include, instead of the linear shaft, a linear guide or guide rollers that run along the surface of the frame. In yet another aspect, each linear motion unit may include a drive belt instead of the trapezoidal screw. A collision detection sensor may also be provided at the end of each linear motion unit, both to determine the initial position of each electric actuator 108 and to serve as a safety mechanism.

The electric actuators 108 drive the respective linear motion units. In one aspect, each electric actuator 108 is a stepping motor and may transmit power to its linear motion unit via a trapezoidal screw or a drive belt. In another aspect, each electric actuator 108 may be an AC servomotor or a geared motor equipped with an encoder. The information processing device 102 may calculate the current positions of the work head 109 and the gripping mechanism 112 from the step count of the stepping motor or the rotation count reported by the encoder.
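The position calculation mentioned above can be illustrated with a minimal Python sketch. It assumes a stepping motor that drives a lead screw directly; the class name `LinearAxis`, the 200 steps per revolution, and the 2 mm screw lead are illustrative assumptions and are not given in this disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinearAxis:
    """One linear motion unit driven by a stepping motor (hypothetical model)."""
    steps_per_rev: int  # assumed: full steps per motor revolution
    lead_mm: float      # assumed: screw travel per revolution, in millimetres

    def position_mm(self, step_count: int) -> float:
        """Displacement from the home position for a signed step count."""
        return step_count / self.steps_per_rev * self.lead_mm


def head_position(axes, step_counts):
    """Return the (X, Y, Z) position of the work head in millimetres."""
    return tuple(axis.position_mm(n) for axis, n in zip(axes, step_counts))


# Three identical axes for X, Y and Z (values chosen only for illustration).
axes = (LinearAxis(200, 2.0), LinearAxis(200, 2.0), LinearAxis(200, 2.0))
print(head_position(axes, (1000, -400, 50)))
```

With an encoder-equipped motor, the same conversion would use the encoder count and its resolution in place of the step count and `steps_per_rev`.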

The work head 109 is attached to the third linear motion unit 107 so as to move in the vertical direction (Z-axis direction). The work head 109 also has screw holes and attachments for mounting the parts needed for the work.

The rotating unit mounting member 110 is attached to the work head 109 and has screw holes and attachments for mounting the angle adjusting mechanism 111. The angle adjusting mechanism 111 finely adjusts the orientation of the workpiece gripped by the gripping mechanism 112. The base of the angle adjusting mechanism 111 is a rotation mechanism that uses an electric actuator. The rotation mechanism may be separate from the angle adjusting mechanism 111. Details of the angle adjusting mechanism 111 will be described later. The gripping mechanism 112 grips the workpieces handled in the take-out work, for example C-shaped circlips, springs, and coils. Details of the gripping mechanism 112 will be described later. The work container 113 is a box that holds the workpieces.

The information processing device 102 transmits control commands to the work take-out work device 101 via the control device 103, and acquires values such as the motor torque values of the electric actuators 108 and of the electric actuators of the angle adjusting mechanism 111. Details of the information processing device 102 will be described later.

The control device 103 converts data exchanged between the work take-out work device 101 and the information processing device 102. In one aspect, the control device 103 is a control board built around a microcomputer. It may receive from the information processing device 102 commands (command torque, rotation amount, rotation speed, and the like) for the electric actuators 108 of the work take-out work device 101 and for the electric actuators of the angle adjusting mechanism 111, and transmit a control signal to each electric actuator.

The imaging device 114 photographs the work container 113 from above and transmits the resulting image or video to the information processing device 102. The information processing device 102 detects the position of a workpiece based on the image or video received from the imaging device 114, and uses the position adjusting device to move the gripping mechanism 112 to the workpiece take-out position.

The imaging device 115 photographs the workpiece gripped by the gripping mechanism 112 from the side and transmits the resulting image or video to the information processing device 102. The information processing device 102 detects the number of workpieces gripped by the gripping mechanism 112 based on the image or video received from the imaging device 115.

<B. Hardware configuration of system components>
FIG. 2 is a diagram showing a configuration example of the angle adjusting mechanism 111. With reference to FIG. 2, in the angle adjusting mechanism 111 the second link hub 33 on the distal end side is connected to the first link hub 32 on the proximal end side by three sets of link mechanisms 34 so that its posture can be changed. The gripping mechanism 112 shown in FIG. 1 is attached to the second link hub 33 on the distal end side. Although an angle adjusting mechanism 111 with three sets of link mechanisms 34 is shown here, the number of link mechanisms 34 may be four or more.

Each link mechanism 34 is composed of an end link member 35 on the proximal end side, an end link member 36 on the distal end side, and a central link member 37. The link mechanism 34 is a four-bar chain consisting of four revolute pairs. The end link members 35 and 36 on the proximal and distal end sides are L-shaped.

One end of the end link member 35 on the proximal end side is rotatably connected to the first link hub 32 on the proximal end side via a rotation shaft 42. One end of the end link member 36 on the distal end side is rotatably connected to the second link hub 33 on the distal end side via a rotation shaft 73. The other ends of the end link members 35 and 36 are rotatably connected to the two ends of the central link member 37 via rotation shafts 55 and 75, respectively.

The angle adjusting mechanism 111 is a parallel link mechanism with a structure that combines two spherical link mechanisms. The central axes of the revolute pairs between the end link members 35, 36 and the central link member 37 may intersect at some angle or may be parallel.

The angle adjusting mechanism 111 can adjust the relative angle between the central axes of the link hubs through the motion of the links alone, without the motion of multiple serially connected joints that an articulated robot requires. As a result, a slight movement of the tip does not cause the constituent members to move by a large amount, and quick motion is possible. In addition, the angle adjusting mechanism 111 can detect the force applied to the tip of the gripping mechanism 112 in any posture from the motor torque values of the electric actuators that drive the links.

The second link hub 33 changes its posture on a hemisphere as seen from the first link hub 32. The target position of the second link hub 33 as seen from the first link hub 32 and the posture of each link therefore always correspond one to one. Consequently, unlike a serial multi-link structure such as a robot arm, the angle adjusting mechanism 111 has no singular point.

FIG. 3 shows a configuration example in which an electric actuator 11 for attitude control is attached to the rotation shaft 42 of the angle adjusting mechanism 111. The electric actuator 11 is a rotary actuator (motor) with a speed reduction mechanism 62. The electric actuator 11 is installed on the upper surface of the first link hub 32 on the proximal end side so that its rotation axis is coaxial with the rotation shaft 42. The electric actuator 11 and the speed reduction mechanism 62 may be provided as a single unit. The speed reduction mechanism 62 is fixed to the first link hub 32 on the proximal end side by a motor fixing member 63. In one aspect, the electric actuator 11 may be an AC servomotor or a geared motor equipped with an encoder.

In the example shown in FIG. 3, an electric actuator 11 is provided on all three sets of link mechanisms 34, but the angle adjusting mechanism 111 according to the present embodiment is not limited to this. As long as at least two of the link mechanisms 34 are provided with an electric actuator 11 for attitude control, the angle adjusting mechanism 111 can determine the posture of the second link hub 33 on the distal end side relative to the first link hub 32 on the proximal end side.

FIG. 4 is a diagram showing a configuration example of the gripping mechanism 112. The gripping mechanism 112 clamps an object between two opposing claws. The gripping mechanism 112 according to the present embodiment opens and closes the two claws with an air cylinder. State A shows the gripping mechanism 112 open, and state B shows it closed. The gripping mechanism 112 shown in FIG. 4 is one example, and the gripping mechanism 112 according to the present embodiment is not limited to it. In one aspect, the gripping mechanism 112 may be an electrically driven opening and closing mechanism, a mechanism that holds the object by suction, or another clamping mechanism.

FIG. 5 is a diagram showing an example of the angle adjusting mechanism 111 with the gripping mechanism 112 attached. The second link hub 33 on the distal end side of the angle adjusting mechanism 111 may have screw holes for screwing on the gripping mechanism 112, fitting holes, or other attachments. With the configuration shown in FIG. 5, the work take-out work device 101 can perform an untangling operation when the gripping mechanism 112 has gripped a plurality of workpieces.

In this example the position adjusting device moves the angle adjusting mechanism 111, but the work take-out work device 101 according to the present embodiment is not limited to this. The position adjusting device only needs to be able to position the angle adjusting mechanism 111, the gripping mechanism 112, and the workpieces in the work container 113 relative to one another, and in one aspect the position adjusting device may include a mechanism that moves the work container 113.

<C. Circuit and software configuration>
FIG. 6 is a diagram showing a configuration example of the hardware of the information processing device 102. With reference to FIG. 6, the information processing device 102 includes a CPU (Central Processing Unit) 701, a primary storage device 702, a secondary storage device 703, an external device interface 704, an input interface 705, an output interface 706, and a communication interface 707.

The CPU 701 processes the programs and data that run on the information processing device 102. The primary storage device 702 stores the programs executed by the CPU 701 and the data they reference. In one aspect, DRAM (Dynamic Random Access Memory) may be used as the primary storage device 702.

The secondary storage device 703 stores programs, data, and the like over the long term. Because the secondary storage device 703 is generally slower than the primary storage device 702, data used directly by the CPU 701 is placed in the primary storage device 702 and other data is placed in the secondary storage device 703. In one aspect, a non-volatile storage device such as an HDD (Hard Disk Drive) or an SSD (Solid State Drive) may be used as the secondary storage device 703.

The external device interface 704 is used, for example, to connect an auxiliary device to the information processing device 102. In one aspect, a USB (Universal Serial Bus) interface may be used as the external device interface 704. The input interface 705 is used to connect a keyboard, a mouse, and the like. In one aspect, a USB interface may be used as the input interface 705.

The output interface 706 is used to connect an output device such as a display. In one aspect, HDMI (registered trademark) (High-Definition Multimedia Interface) or DVI (Digital Visual Interface) may be used as the output interface 706.

The communication interface 707 is used to communicate with external communication equipment. In one aspect, a LAN (Local Area Network) port, a Wi-Fi (registered trademark) (Wireless Fidelity) transceiver, or the like may be used as the communication interface 707. In one aspect, the information processing device 102 may be a PC (Personal Computer) or a workstation. The processing of the information processing device 102 according to the present embodiment may be executed as a program on the hardware shown in FIG. 6.

FIG. 7 is a diagram showing a configuration example of the functions that realize the information processing device 102. In one aspect, some of the functions shown in FIG. 7 may be realized by executing a program on the hardware shown in FIG. 6. With reference to FIG. 7, the information processing device 102 includes a signal input unit 801, an evaluation value function unit 802, an operation pattern table 803, an operation determination unit 804, a command generation unit 805, an operation result determination unit 806, and an evaluation value function learning unit 807.

The signal input unit 801 acquires the images captured by the imaging devices 114 and 115, and acquires from the work take-out work device 101 the motor torque values of the electric actuators 11 of the angle adjusting mechanism 111. In one aspect, the signal input unit 801 may also acquire the motor torque values of the electric actuators 108 of the position adjusting device.

The evaluation value function unit 802 uses an evaluation value function F, described later, to calculate an evaluation value for each operation pattern based on the number of workpieces detected from the images input to the signal input unit 801, the motor torque values, and the like.

The operation pattern table 803 stores a plurality of operation patterns, each associated with at least one of the movement amount, movement speed, acceleration, and command torque value of each electric actuator of the position adjusting device and the angle adjusting mechanism 111. For the angle adjusting mechanism 111, the operation pattern table 803 may include in an operation pattern the angle of the angle adjusting mechanism 111 or the like rather than command values for the individual actuators.

The operation determination unit 804 selects, from the operation patterns in the operation pattern table 803, the operation pattern with the largest evaluation value as the next operation of the work taking-out work device 101. The command generation unit 805 generates a command value for each electric actuator of the work taking-out work device 101 based on the operation pattern selected by the operation determination unit 804, and transmits the command values to the work taking-out work device 101 via the control device 103.

Based on the number of workpieces gripped by the gripping mechanism 112 before and after execution of the previously selected operation pattern, the operation result determination unit 806 gives a reward of 1 when the number is 1 and a reward of 0 when two or more workpieces are gripped. It gives a reward of -1 when the number of executions of operation patterns exceeds the upper limit.
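The reward rule above can be sketched as a small function. This is only an illustrative sketch: the function name `reward_for` and its parameters are hypothetical, and returning `None` when nothing is gripped is an assumption borrowed from the learning flow described later, in which no reward is generated in that case.

```python
from typing import Optional


def reward_for(held_count: int, executions: int, max_executions: int) -> Optional[int]:
    """Return the reward for the last operation pattern (hypothetical helper).

    held_count     -- workpieces gripped after the pattern was executed
    executions     -- number of patterns executed so far in this attempt
    max_executions -- upper limit on the number of executions
    """
    if held_count == 1:
        return 1        # success: exactly one workpiece remains
    if held_count == 0:
        return None     # assumption: nothing gripped, so no reward is generated
    if executions > max_executions:
        return -1       # failure: execution limit exceeded while still entangled
    return 0            # two or more workpieces still gripped


single = reward_for(held_count=1, executions=1, max_executions=5)
entangled = reward_for(held_count=3, executions=2, max_executions=5)
timed_out = reward_for(held_count=2, executions=6, max_executions=5)
```

The order of the checks follows the flow of the later flowcharts: success is tested first, and the execution limit is only consulted while workpieces are still entangled.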

評価値関数学習部８０７は、動作結果判定部が出力した報酬を教師信号として、動作パターンを選択した時の評価値と、教師信号との差に基づいて評価値関数Ｆを更新する。ある局面において、評価値関数学習部８０７は、予め定められた回数だけ評価値関数Ｆを更新するごとに、評価値関数部８０２で使用する評価値関数Ｆを最新状態に更新してもよい。 The evaluation value function learning unit 807 updates the evaluation value function F based on the difference between the evaluation value when the operation pattern is selected and the teacher signal, using the reward output by the operation result determination unit as the teacher signal. In a certain aspect, the evaluation value function learning unit 807 may update the evaluation value function F used by the evaluation value function unit 802 to the latest state every time the evaluation value function F is updated a predetermined number of times.
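As a rough illustration of this update, the sketch below moves the evaluation value of the selected operation pattern toward the teacher signal in proportion to their difference. The source does not specify the form of the evaluation value function F; the linear model, the learning rate, the class name `LinearEvaluationFunction`, and the synchronization interval `SYNC_EVERY` are all assumptions made for this example.

```python
import copy
from typing import List


class LinearEvaluationFunction:
    """Assumed stand-in for F: one weight vector per operation pattern."""

    def __init__(self, n_patterns: int, n_features: int) -> None:
        self.weights = [[0.0] * n_features for _ in range(n_patterns)]

    def evaluate(self, state: List[float]) -> List[float]:
        # One evaluation value per operation pattern.
        return [sum(w * x for w, x in zip(row, state)) for row in self.weights]

    def update(self, state: List[float], pattern: int, teacher: float,
               learning_rate: float = 0.1) -> float:
        # Difference between the teacher signal (reward) and the current value.
        error = teacher - self.evaluate(state)[pattern]
        row = self.weights[pattern]
        for i, x in enumerate(state):
            row[i] += learning_rate * error * x
        return error


SYNC_EVERY = 10                       # assumed "predetermined number" of updates
learner = LinearEvaluationFunction(n_patterns=4, n_features=2)
acting = copy.deepcopy(learner)       # copy used by the evaluation value function unit
state = [1.0, 0.5]                    # placeholder features (e.g. count, torque)

for step in range(1, 51):
    learner.update(state, pattern=2, teacher=1.0)
    if step % SYNC_EVERY == 0:
        acting = copy.deepcopy(learner)   # refresh the acting copy periodically
```

Keeping a separate acting copy mirrors the aspect in which the function used by the evaluation value function unit 802 is refreshed only after a predetermined number of updates.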

FIG. 8 is a diagram showing an example of the operation of the evaluation value function unit 802. The evaluation value function unit 802 acquires, from the signal input unit 801, the images captured by the imaging devices 114 and 115, the motor torque values, and the like, and inputs them to the evaluation value function F. The information processing device 102 may input the number of workpieces detected from the images to the evaluation value function F. An evaluation value is calculated for each operation pattern. In the example shown in FIG. 8, the evaluation value function unit 802 calculates an evaluation value for each of the n operation patterns a_1 to a_n. In one aspect, the evaluation value function unit 802 may be a program that accepts the images (or the number of workpieces in the images), the motor torque values, and the like as inputs to the evaluation value function F and calculates the evaluation value of each operation pattern.

The n evaluation values output by the evaluation value function F are indices for selecting the operation pattern to execute next. The operation pattern whose evaluation value is the largest is the optimum operation to execute next.

Therefore, the operation determination unit 804 selects, from the n operation patterns, the operation pattern corresponding to the largest evaluation value as the next operation. In the example shown in FIG. 9, "evaluation value = 0.614" is the largest, so the operation determination unit 804 selects the operation pattern a_(n-3) corresponding to "evaluation value = 0.614".

The operation determination unit 804 passes the selected operation pattern a_(n-3) to the command generation unit 805. The command generation unit 805 refers to the operation pattern table 803, generates the command values corresponding to a_(n-3), and outputs them to the control device 103.
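The selection performed by the operation determination unit 804 amounts to taking the index of the largest evaluation value. A minimal sketch follows; the function name and all values other than 0.614 are illustrative assumptions, and indices are 0-based here whereas the text numbers patterns from 1.

```python
from typing import Sequence


def select_operation_pattern(evaluation_values: Sequence[float]) -> int:
    """Return the 0-based index of the pattern with the largest evaluation value."""
    if not evaluation_values:
        raise ValueError("at least one operation pattern is required")
    return max(range(len(evaluation_values)), key=lambda k: evaluation_values[k])


# Illustrative values for n = 6 patterns; only 0.614 is taken from the text.
values = [0.102, 0.331, 0.614, 0.287, 0.455, 0.120]
best = select_operation_pattern(values)   # index 2, i.e. pattern a_(n-3) for n = 6
```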

FIG. 9 is a diagram showing an example of the operation pattern table 803. For each operation pattern, the operation pattern table 803 stores: the movement amount, movement speed, acceleration, and command torque value of the electric actuator 108 of the position adjusting device; the rotation angle, rotation speed, acceleration, deceleration, and command torque value of the rotation mechanism at the base of the angle adjusting mechanism 111; and the bend angle change amount, turning angle change amount, rotation speed, acceleration, deceleration, and command torque value of the angle adjusting mechanism 111.

ある局面において、動作パターンテーブル８０３は、角度調整機構１１１の個別の電動アクチュエータの移動量、移動速度、加速度、および指令トルク値を格納してもよい。また、ある局面において、動作パターンテーブル８０３は、２次記憶装置７０３に記憶され、１次記憶装置７０２に読み出されることにより、ＣＰＵ７０１によって参照されてもよい。 In certain aspects, the motion pattern table 803 may store the movement amount, movement speed, acceleration, and command torque value of the individual electric actuators of the angle adjusting mechanism 111. Further, in a certain aspect, the operation pattern table 803 may be referred to by the CPU 701 by being stored in the secondary storage device 703 and read out by the primary storage device 702.
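One possible in-memory layout for such a table is sketched below. All class and field names are hypothetical, the numeric values are placeholders, and units are left unspecified because the source does not state them.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class AxisCommand:
    """Command for one electric actuator 108 of the position adjusting device."""
    travel: float
    speed: float
    acceleration: float
    command_torque: float


@dataclass(frozen=True)
class BaseRotationCommand:
    """Command for the rotation mechanism at the base of the angle adjusting mechanism."""
    rotation_angle: float
    rotation_speed: float
    acceleration: float
    deceleration: float
    command_torque: float


@dataclass(frozen=True)
class LinkAngleCommand:
    """Command expressed as bend/turning angle changes of the angle adjusting mechanism."""
    bend_angle_change: float
    turn_angle_change: float
    rotation_speed: float
    acceleration: float
    deceleration: float
    command_torque: float


@dataclass(frozen=True)
class OperationPattern:
    position_axes: Tuple[AxisCommand, ...]
    base_rotation: BaseRotationCommand
    link_angle: LinkAngleCommand


# Placeholder entry keyed by pattern index k (1-based, as in a_1 ... a_n).
operation_pattern_table: Dict[int, OperationPattern] = {
    1: OperationPattern(
        position_axes=(AxisCommand(5.0, 20.0, 100.0, 0.3),),
        base_rotation=BaseRotationCommand(15.0, 30.0, 200.0, 200.0, 0.2),
        link_angle=LinkAngleCommand(10.0, 45.0, 60.0, 300.0, 300.0, 0.25),
    ),
}
```

Storing angles of the mechanism rather than per-actuator values corresponds to the option, mentioned for the table, of expressing a pattern in terms of the angle of the angle adjusting mechanism 111.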

<D. Internal processing of the information processing device 102 in the work taking-out operation>
FIG. 10 is a flowchart showing an example of the processing of the work taking-out work system 100. In one aspect, the program for executing the processing of FIG. 10 may be stored in the secondary storage device 703, read into the primary storage device 702, and executed by the CPU 701. In the following, the processing is described assuming that the information processing device 102 executes each step of FIG. 10.

In step S1005, the information processing device 102 detects the positions of the workpieces in the work container 113 based on an image 1 obtained by the imaging device 114 photographing the work container 113 from above. In one aspect, the information processing device 102 may use an existing image recognition technique to detect the positions of the workpieces in the work container 113 from the image 1.

In step S1010, the information processing device 102 determines whether or not a workpiece has been detected in the image 1. When a workpiece has been detected in the image 1 (YES in step S1010), the information processing device 102 shifts control to step S1015. Otherwise (NO in step S1010), the information processing device 102 determines that all workpieces have been taken out of the work container 113 and ends the processing.

In step S1015, the information processing device 102 uses the position adjusting device to move the gripping mechanism 112 to a predetermined position for the taking-out operation, and takes a workpiece out of the work container 113.

ステップＳ１０２０において、情報処理装置１０２は、撮像装置１１５が把持機構１１２の先端を側面から撮影することによって取得された画像２を基に、把持機構１１２が把持しているワークの個数を算出する。ある局面において、情報処理装置１０２は、画像２から把持機構１１２が把持しているワークの個数を算出するために、既存の画像認識技術を用いてもよい。 In step S1020, the information processing device 102 calculates the number of works gripped by the gripping mechanism 112 based on the image 2 acquired by the imaging device 115 photographing the tip of the gripping mechanism 112 from the side surface. In a certain aspect, the information processing apparatus 102 may use an existing image recognition technique in order to calculate the number of workpieces gripped by the gripping mechanism 112 from the image 2.

ステップＳ１０２５において、情報処理装置１０２は、画像２の解析結果に基づいて、把持機構１１２がワークを１個だけ把持しているか否かを判定する。情報処理装置１０２は、把持機構１１２がワークを１個だけ把持していると判定した場合（ステップＳ１０２５にてＹＥＳ）、制御をステップＳ１０３０に移す。そうでない場合（ステップＳ１０２５にてＮＯ）、制御をステップＳ１０３５に移す。 In step S1025, the information processing apparatus 102 determines whether or not the gripping mechanism 112 grips only one work based on the analysis result of the image 2. When the information processing device 102 determines that the gripping mechanism 112 grips only one work (YES in step S1025), the information processing device 102 shifts control to step S1030. If not (NO in step S1025), control is transferred to step S1035.

ステップＳ１０３０において、情報処理装置１０２は、把持機構１１２が把持するワークを予め定められた位置に置き、処理を終了する。 In step S1030, the information processing device 102 places the work gripped by the gripping mechanism 112 at a predetermined position, and ends the process.

In step S1035, the information processing device 102 determines, based on the analysis result of the image 2, whether or not the gripping mechanism 112 is gripping no workpiece at all. When the information processing device 102 determines that the gripping mechanism 112 is not gripping any workpiece (YES in step S1035), it ends the processing. Otherwise (NO in step S1035), the information processing device 102 shifts control to step S1040.

ステップＳ１０４０において、情報処理装置１０２は、撮像装置１１４，１１５が撮影した各画像と、各電動アクチュエータのモータトルク値とを取得する。ある局面において、情報処理装置１０２は、各画像および各電動アクチュエータのモータトルク値に加えて、各種センサー値を各種センサーから取得してもよい。 In step S1040, the information processing device 102 acquires each image taken by the image pickup devices 114 and 115 and the motor torque value of each electric actuator. In a certain aspect, the information processing apparatus 102 may acquire various sensor values from various sensors in addition to the motor torque values of each image and each electric actuator.

In step S1045, the information processing device 102 calculates an evaluation value for each operation pattern, using the acquired images (or the number of workpieces in the images) and the motor torque value of each electric actuator as inputs to the evaluation value function unit 802. In step S1050, the information processing device 102 finds the largest of the evaluation values calculated for the operation patterns and selects the operation pattern a_k corresponding to that value as the next operation.

In step S1055, the information processing device 102 uses the command generation unit 805 to transmit a command for executing the selected operation pattern a_k to the work taking-out work device 101. Based on the received command, the work taking-out work device 101 drives the electric actuators 11 of the angle adjusting mechanism 111, causing the gripping mechanism 112 to perform an untangling operation. In one aspect, the work taking-out work device 101 may cause the gripping mechanism 112 to perform the untangling operation by driving both the electric actuators 11 of the angle adjusting mechanism 111 and the electric actuators 108 of the position adjusting device.

In step S1060, the information processing device 102 again calculates the number of workpieces gripped by the gripping mechanism 112, based on an image 2 acquired by the imaging device 115 photographing the tip of the gripping mechanism 112 from the side.

In step S1065, the information processing device 102 determines, based on the analysis result of the image 2 captured in step S1060, whether or not the gripping mechanism 112 is gripping exactly one workpiece. When the information processing device 102 determines that the gripping mechanism 112 is gripping exactly one workpiece (YES in step S1065), the untangling work is complete, and control shifts to step S1030. Otherwise (NO in step S1065), the untangling work is not complete, and the information processing device 102 shifts control to step S1035.

In steps S1040 to S1065, the information processing device 102 untangles the workpieces by having the work taking-out work device 101 keep executing operation patterns selected on the basis of the evaluation values until the gripping mechanism 112 is gripping a single workpiece.
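The loop formed by steps S1020 to S1065 can be sketched as follows. The callables stand in for the imaging, evaluation, and command paths and are hypothetical. The `max_iterations` guard is an assumption added so that the sketch always terminates; the flowchart as described has no such limit.

```python
from typing import Callable, List, Sequence


def untangle(count_held: Callable[[], int],
             read_state: Callable[[], Sequence[float]],
             evaluate: Callable[[Sequence[float]], Sequence[float]],
             execute: Callable[[int], None],
             max_iterations: int = 100) -> str:
    held = count_held()                                   # S1020
    for _ in range(max_iterations):
        if held == 1:                                     # S1025 / S1065: YES
            return "placed"                               # S1030
        if held == 0:                                     # S1035: YES
            return "dropped"
        state = read_state()                              # S1040
        values = evaluate(state)                          # S1045
        k = max(range(len(values)), key=lambda i: values[i])  # S1050
        execute(k)                                        # S1055
        held = count_held()                               # S1060
    return "gave_up"                                      # assumed safeguard only


# Toy stand-in for the device: pattern 1 shakes off one workpiece per execution.
held_now = [3]
executed: List[int] = []


def toy_execute(k: int) -> None:
    executed.append(k)
    if k == 1:
        held_now[0] -= 1


result = untangle(
    count_held=lambda: held_now[0],
    read_state=lambda: [float(held_now[0]), 0.2],
    evaluate=lambda state: [0.1, 0.9, 0.3],
    execute=toy_execute,
)
```

In the toy run, three entangled workpieces are reduced to one after two executions of the highest-valued pattern, at which point the loop reports that the workpiece can be placed.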

FIG. 11 is a diagram showing an example of how the processing of FIG. 10 operates. State X represents the moment immediately after the gripping mechanism 112 has taken workpieces out of the work container 113 (corresponding to step S1015). The gripping mechanism 112 is gripping three entangled workpieces (workpieces A, B, and C).

Starting from state X, the information processing device 102 repeats the processing of steps S1035 to S1065 of FIG. 10 to make the work taking-out work device 101 perform the untangling work. The information processing device 102 acquires the images (or the number of workpieces in the images) and the motor torque value of each electric actuator in state X (corresponding to step S1040). Next, the information processing device 102 calculates the evaluation value of each operation pattern, using the images (or the number of workpieces in the images) and the motor torque values in state X as inputs to the evaluation value function F (corresponding to step S1045). The information processing device 102 then selects the operation pattern a_(n-3), which has the highest evaluation value (corresponding to step S1050), and transmits the command corresponding to the operation pattern a_(n-3) to the work taking-out work device 101 (corresponding to step S1055).

State Y shows the situation immediately after the work taking-out work device 101 has executed the operation pattern a_(n-3). The information processing device 102 determines whether or not the gripping mechanism 112 is gripping exactly one workpiece (corresponding to steps S1060 and S1065). In state Y, the gripping mechanism 112 is still gripping three workpieces, and the untangling work is not complete. The information processing device 102 therefore repeats the processing from step S1035 to step S1065.

State Z shows the situation immediately after the work taking-out work device 101 has executed the operation pattern a_(n-1) in state Y. In state Z, workpieces B and C have dropped, and the untangling of the workpieces gripped by the gripping mechanism 112 is complete. The information processing device 102 detects the completion of the untangling work from the image 2 acquired by the imaging device 115 photographing the tip of the gripping mechanism 112 from the side. After the untangling work is complete, the information processing device 102 has the work taking-out work device 101 carry the workpiece to a predetermined position. The information processing device 102 may then send the work taking-out work device 101 a command to take out the next workpiece.

<E. Learning processing for the work taking-out operation>
In the example described with reference to FIGS. 10 and 11, the information processing device 102 calculates the evaluation value of each operation pattern based on the images 1 and 2 captured by the imaging devices 114 and 115 (or the number of workpieces in the images), the current motor torque values of the angle adjusting mechanism 111 and the position adjusting device, and the like, and makes the operation succeed by sequentially executing the operation pattern with the largest evaluation value. The evaluation value function F of FIG. 8 therefore needs to be optimized so that, given the images (or the number of workpieces in the images), the motor torque values, and the like, it outputs the largest evaluation value for the optimum operation pattern to execute next.

However, the initial state of the workpieces to be taken out (for example, how the workpieces are entangled with each other) may differ each time they are gripped by the gripping mechanism 112. In addition, when the work taking-out work device 101 executes an operation pattern, the posture and entanglement of the workpieces gripped by the gripping mechanism 112 may also change. It is difficult to construct a rule-based operation program that anticipates all of these states. The work taking-out work system 100 according to the present embodiment therefore uses reinforcement learning to optimize the evaluation value function F through repeated trial operations.

図１２は、ワーク取り出し作業システム１００の絡まり解き作業の学習処理の一例を示すフローチャートである。ある局面において、図１２の処理を実行するためのプログラムは、２次記憶装置７０３に記憶され、１次記憶装置７０２に読み出されることにより、ＣＰＵ７０１によって実行されてもよい。これ以降、情報処理装置１０２が図１２の各ステップを実行するものとして当該学習処理を説明する。 FIG. 12 is a flowchart showing an example of a learning process of the entanglement work of the work taking-out work system 100. In a certain aspect, the program for executing the process of FIG. 12 may be executed by the CPU 701 by being stored in the secondary storage device 703 and read out by the primary storage device 702. Hereinafter, the learning process will be described assuming that the information processing apparatus 102 executes each step of FIG. 12.

ステップＳ１２１０において、情報処理装置１０２は、変数ｊに１を代入する。ステップＳ１２２０において、情報処理装置１０２は、変数ｊの値が定数Ｊ１以下であるか否かを判定する。情報処理装置１０２は、変数ｊの値が定数Ｊ１以下であると判定すると（ステップＳ１２２０にてＹＥＳ）、ステップＳ１２３０に制御を移す。そうでない場合（ステップＳ１２２０にてＮＯ）、情報処理装置１０２は、ステップＳ１２５０に制御を移す。評価値関数Ｆが未学習の初期状態において、情報処理装置１０２は、変数ｊが定数Ｊ１に達するまで、絡まり解き作業初期学習を繰り返し実行する。 In step S1210, the information processing apparatus 102 assigns 1 to the variable j. In step S1220, the information processing apparatus 102 determines whether or not the value of the variable j is equal to or less than the constant J1. When the information processing apparatus 102 determines that the value of the variable j is equal to or less than the constant J1 (YES in step S1220), the information processing apparatus 102 shifts control to step S1230. If not (NO in step S1220), the information processing apparatus 102 transfers control to step S1250. In the initial state where the evaluation value function F is unlearned, the information processing apparatus 102 repeatedly executes the initial learning of the entanglement unraveling work until the variable j reaches the constant J1.

ステップＳ１２３０において、情報処理装置１０２は、絡まり解き作業の初期学習処理を実行する。絡まり解き作業の初期学習処理については後述する。ステップＳ１２４０において、情報処理装置１０２は、変数ｊの値をインクリメントする。以降は、情報処理装置１０２は、上限回数として予め定められた回数Ｊ１まで、絡まり解き作業の初期学習処理を繰り返し実行する。 In step S1230, the information processing device 102 executes the initial learning process of the entanglement unraveling work. The initial learning process of the entanglement unraveling work will be described later. In step S1240, the information processing apparatus 102 increments the value of the variable j. After that, the information processing apparatus 102 repeatedly executes the initial learning process of the entanglement unraveling work up to a predetermined number of times J1 as the upper limit number of times.

ステップＳ１２５０において、情報処理装置１０２は、変数ｊの値が定数Ｊ２以下であるか否かを判定する。情報処理装置１０２は、変数ｊの値が定数Ｊ２以下であると判定すると（ステップＳ１２５０にてＹＥＳ）、ステップＳ１２６０に制御を移す。そうでない場合（ステップＳ１２５０にてＮＯ）、情報処理装置１０２は、学習処理を終了する。 In step S1250, the information processing apparatus 102 determines whether or not the value of the variable j is equal to or less than the constant J2. When the information processing apparatus 102 determines that the value of the variable j is equal to or less than the constant J2 (YES in step S1250), the information processing apparatus 102 shifts control to step S1260. If not (NO in step S1250), the information processing apparatus 102 ends the learning process.

ステップＳ１２６０において、情報処理装置１０２は、絡まり解き作業の学習処理を実行する。絡まり解き作業の学習処理については後述する。ステップＳ１２７０において、情報処理装置１０２は、変数ｊの値をインクリメントする。以降は、情報処理装置１０２は、変数ｊが上限値として予め定められた定数Ｊ２より大きくなるまで、絡まり解き作業の学習処理を繰り返し実行する。 In step S1260, the information processing device 102 executes the learning process of the entanglement unraveling work. The learning process of the entanglement unraveling work will be described later. In step S1270, the information processing apparatus 102 increments the value of the variable j. After that, the information processing apparatus 102 repeatedly executes the learning process of the entanglement unraveling work until the variable j becomes larger than the predetermined constant J2 as the upper limit value.
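The two-phase schedule of FIG. 12 can be sketched as two consecutive loops over the same counter j. The function and parameter names are hypothetical. Because j is not reset between the phases, the constant J2 bounds the total number of trials rather than the number of trials in the second phase.

```python
from typing import Callable, List


def run_learning(initial_episode: Callable[[int], None],
                 learning_episode: Callable[[int], None],
                 j1: int, j2: int) -> int:
    j = 1                        # S1210
    while j <= j1:               # S1220
        initial_episode(j)       # S1230: initial learning (random selection)
        j += 1                   # S1240
    while j <= j2:               # S1250
        learning_episode(j)      # S1260: learning using the evaluation values
        j += 1                   # S1270
    return j


initial_calls: List[int] = []
learning_calls: List[int] = []
final_j = run_learning(initial_calls.append, learning_calls.append, j1=3, j2=5)
```

With J1 = 3 and J2 = 5, the initial learning runs for j = 1 to 3 and the subsequent learning runs for j = 4 and 5.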

図１３は、絡まり解き作業の初期学習処理（図１２のステップＳ１２３０に対応）の一例を示すフローチャートである。ある局面において、図１３の処理を実行するためのプログラムは、２次記憶装置７０３に記憶され、１次記憶装置７０２に読み出されることにより、ＣＰＵ７０１によって実行されてもよい。これ以降、情報処理装置１０２が図１３の各ステップを実行するものとして当該初期学習処理を説明する。 FIG. 13 is a flowchart showing an example of the initial learning process (corresponding to step S1230 in FIG. 12) of the entanglement unraveling work. In a certain aspect, the program for executing the process of FIG. 13 may be executed by the CPU 701 by being stored in the secondary storage device 703 and read out by the primary storage device 702. Hereinafter, the initial learning process will be described assuming that the information processing apparatus 102 executes each step of FIG.

ステップＳ１３０５において、情報処理装置１０２は、位置調整装置により、把持機構１１２を予め定められた位置（ワーク取り出し開始位置）に移動させるための指令をワーク取り出し作業装置１０１に送信する。 In step S1305, the information processing device 102 transmits a command for moving the gripping mechanism 112 to a predetermined position (work take-out start position) to the work take-out work device 101 by the position adjusting device.

In step S1310, the information processing device 102 assigns 1 to the variable i. In step S1315, the information processing device 102 acquires, as state information S1, the images captured by the imaging devices 114 and 115 of the work taking-out work device 101 (or the number of workpieces in the images) and the motor torque value of each electric actuator. The state information S1 may further include sensor values that the information processing device 102 has acquired from various sensors.

In step S1320, the information processing device 102 uses the operation determination unit 804 to select the operation pattern a_k to execute next by means of a random number. Specifically, the information processing device 102 determines the index number k of the operation pattern based on a random number between 1 and n.

In step S1325, the information processing device 102 stores the state information S1 in the evaluation value function learning unit 807, and then transmits a command for causing the work taking-out work device 101 to execute the operation pattern a_k.

In step S1330, after the work taking-out work device 101 has executed the operation pattern a_k, the information processing device 102 acquires, as state information S2, the images captured by the imaging devices 114 and 115 of the work taking-out work device 101 (or the number of workpieces in the images) and the motor torque value of each electric actuator. The state information S2 may further include sensor values that the information processing device 102 has acquired from various sensors.

ステップＳ１３３５において、情報処理装置１０２は、撮像装置１１５が把持機構１１２の先端を側面から撮影することによって取得された画像２を基に、把持機構１１２が把持しているワークの個数を算出する。ある局面において、情報処理装置１０２は、画像２から把持機構１１２が把持しているワークの個数を算出するために、既存の画像認識技術を用いてもよい。 In step S1335, the information processing device 102 calculates the number of works gripped by the gripping mechanism 112 based on the image 2 acquired by the imaging device 115 photographing the tip of the gripping mechanism 112 from the side surface. In a certain aspect, the information processing apparatus 102 may use an existing image recognition technique in order to calculate the number of workpieces gripped by the gripping mechanism 112 from the image 2.

ステップＳ１３４０において、情報処理装置１０２は、画像２の解析結果に基づいて、把持機構１１２がワークを１個だけ把持しているか否かを判定する。情報処理装置１０２は、把持機構１１２がワークを１個だけ把持していると判定した場合（ステップＳ１３４０にてＹＥＳ）、制御をステップＳ１３４５に移す。そうでない場合（ステップＳ１３４０にてＮＯ）、情報処理装置１０２は制御をステップＳ１３６０に移す。 In step S1340, the information processing apparatus 102 determines whether or not the gripping mechanism 112 grips only one work based on the analysis result of the image 2. When the information processing apparatus 102 determines that the gripping mechanism 112 grips only one work (YES in step S1340), the information processing apparatus 102 shifts control to step S1345. If not (NO in step S1340), the information processing apparatus 102 shifts control to step S1360.

In step S1345, the information processing device 102 sets the end determination to True (completed) and sets the reward R for the "operation pattern a_k" to 1. In this example, the reward R is 1 on success, -1 on failure, and 0 otherwise, but the reward values are not limited to these. It is sufficient that the rewards for success and for failure differ from each other.

In step S1350, the information processing device 102 stores, in the evaluation value function learning unit 807, the executed "operation pattern a_k", the "state information S1, S2", the "reward R (R = 1)", and the "end determination True (completed)". In step S1355, the information processing device 102 executes the update processing of the evaluation value function F.

In step S1360, the information processing device 102 determines, based on the analysis result of the image 2, whether or not the gripping mechanism 112 is gripping no workpiece at all. When the information processing device 102 determines that the gripping mechanism 112 is not gripping any workpiece (YES in step S1360), it ends the processing without generating a reward R. Otherwise (NO in step S1360), the information processing device 102 shifts control to step S1365.

ステップＳ１３６５において、情報処理装置１０２は、変数ｉの値が定数Ｎ１より大きいか否かを判定する。定数Ｎ１は、絡まり解き作業中に繰り返してよい動作パターンの実行回数の上限値である。情報処理装置１０２は、変数ｉの値が定数Ｎ１より大きいと判定した場合（ステップＳ１３６５にてＹＥＳ）、動作パターンの実行回数が上限に達したと判断し、制御をステップＳ１３７０に移す。そうでない場合（ステップＳ１３６５にてＮＯ）、情報処理装置１０２は制御をステップＳ１３７５に移す。 In step S1365, the information processing apparatus 102 determines whether or not the value of the variable i is larger than the constant N1. The constant N1 is an upper limit of the number of times the operation pattern is executed that may be repeated during the untangling work. When the information processing apparatus 102 determines that the value of the variable i is larger than the constant N1 (YES in step S1365), it determines that the number of executions of the operation pattern has reached the upper limit, and shifts control to step S1370. If not (NO in step S1365), the information processing apparatus 102 shifts control to step S1375.

In step S1370, the information processing device 102 sets the end determination to True and sets the reward R for the operation pattern a_k to -1. The processing from step S1350 onward is as described above. In step S1375, the information processing device 102 increments the value of the variable i. In step S1380, the information processing device 102 sets the end determination to False and sets the reward R for the executed "operation pattern a_k" to 0.

In step S1385, the information processing device 102 stores, in the evaluation value function learning unit 807, the executed "operation pattern a_k", the "state information S1, S2", the "reward R (R = 0)", and the "end determination False (not completed)". In step S1390, the information processing device 102 executes the update processing of the evaluation value function F.
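One trial of the initial learning processing of FIG. 13 can be sketched as below. The names `Transition` and `initial_learning_episode` and the callables are hypothetical. The sketch assumes that control returns to step S1315 after step S1390, which the text does not state explicitly, and it only records the stored tuples; the update of the evaluation value function F in steps S1355 and S1390 is left out.

```python
import random
from typing import Callable, List, NamedTuple, Tuple


class Transition(NamedTuple):
    pattern: int               # index k of the executed operation pattern (0-based)
    s1: Tuple[float, ...]      # state information before execution
    s2: Tuple[float, ...]      # state information after execution
    reward: int
    done: bool                 # end determination


def initial_learning_episode(observe: Callable[[], Tuple[float, ...]],
                             execute: Callable[[int], None],
                             count_held: Callable[[], int],
                             n_patterns: int, n1: int,
                             rng: random.Random) -> List[Transition]:
    log: List[Transition] = []
    i = 1                                                  # S1310
    while True:
        s1 = observe()                                     # S1315
        k = rng.randrange(n_patterns)                      # S1320: random choice
        execute(k)                                         # S1325
        s2 = observe()                                     # S1330
        held = count_held()                                # S1335
        if held == 1:                                      # S1340: YES
            log.append(Transition(k, s1, s2, 1, True))     # S1345, S1350
            return log
        if held == 0:                                      # S1360: YES, no reward
            return log
        if i > n1:                                         # S1365: YES
            log.append(Transition(k, s1, s2, -1, True))    # S1370, S1350
            return log
        i += 1                                             # S1375
        log.append(Transition(k, s1, s2, 0, False))        # S1380, S1385


# Toy device that never untangles: two workpieces stay gripped throughout.
episode = initial_learning_episode(
    observe=lambda: (2.0, 0.1),
    execute=lambda k: None,
    count_held=lambda: 2,
    n_patterns=4, n1=3, rng=random.Random(0),
)
rewards = [t.reward for t in episode]
```

With the upper limit N1 = 3, the toy trial stores three intermediate transitions with reward 0 and then ends with reward -1 on the fourth execution, because the check is "i greater than N1".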

FIG. 14 is a flowchart showing an example of the learning process for the untangling work (corresponding to step S1260 of FIG. 12). In one aspect, the program for executing the process of FIG. 14 may be stored in the secondary storage device 703, read into the primary storage device 702, and executed by the CPU 701. Hereinafter, the learning process is described on the assumption that the information processing device 102 executes each step of FIG. 14. In FIG. 14, processes identical to those described above are given the same reference signs, and their description is not repeated.

In step S1410, the information processing device 102 uses the evaluation value function unit 802 to calculate an evaluation value for each operation pattern based on the state information S1. In step S1420, the information processing device 102 refers to the operation pattern table 803 and selects the operation pattern with the highest evaluation value.

In the initial learning process for untangling in FIG. 13, there is not yet enough learning information, so the information processing device 102 selects the next operation pattern at random in step S1320. In the learning process for untangling in FIG. 14, by contrast, at least a certain amount of learning information has accumulated in the evaluation value function learning unit 807, so the information processing device 102 calculates the evaluation value of each operation pattern from the evaluation value function F in step S1410. In the process of FIG. 14 as well, the information processing device 102 updates the evaluation value function F as needed, thereby improving the accuracy of the untangling work.
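The difference between the two selection rules can be sketched as below. This is an illustrative sketch under assumptions: the function name, the `min_samples` threshold, and the `evaluate` callable standing in for the evaluation value function F are all hypothetical; the source states only that selection is random while learning information is insufficient and evaluation-based once a certain amount has accumulated.

```python
import random
from typing import Callable, Sequence


def select_operation_pattern(
    state: Sequence[float],
    num_patterns: int,
    stored_samples: int,
    min_samples: int,
    evaluate: Callable[[Sequence[float]], Sequence[float]],
    rng: random.Random,
) -> int:
    """Return the index k of the next operation pattern a_k (hypothetical sketch)."""
    if stored_samples < min_samples:
        # Initial learning (FIG. 13, step S1320): choose at random
        return rng.randrange(num_patterns)
    # Learning process (FIG. 14, step S1410): one evaluation value per pattern
    values = evaluate(state)
    # Step S1420: pick the pattern with the highest evaluation value
    return max(range(num_patterns), key=lambda k: values[k])
```

Passing the random number generator explicitly keeps the sketch reproducible when a seeded `random.Random` is supplied.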

FIG. 15 is a flowchart showing an example of the update process for the evaluation value function F of the evaluation value function unit 802. In one aspect, the program for executing the process of FIG. 15 may be stored in the secondary storage device 703, read into the primary storage device 702, and executed by the CPU 701. Hereinafter, the update process is described on the assumption that the information processing device 102 executes each step of FIG. 15.

In step S1510, the information processing device 102 reads out the state information S1 and S2, the reward R, and the end determination stored in the evaluation value function learning unit 807 for each operation pattern a_k.

In step S1520, the information processing device 102 uses the data read out in step S1510 to update the internal parameters of the evaluation value function F' used for learning. The evaluation value function F' is a learning copy prepared separately from the evaluation value function F that is used to calculate evaluation values. The evaluation value function F' is used by the evaluation value function learning unit 807, and the evaluation value function F is used by the evaluation value function unit 802.

In step S1530, the information processing device 102 copies the evaluation value function F' to the evaluation value function F each time the learning process has been repeated a predetermined number of times. The information processing device 102 may execute the process of FIG. 15 at any time, including during the processes of FIGS. 12 to 14.
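The separation of F' (learning) from F (evaluation) with a periodic copy can be sketched as follows. The class name `EvaluationFunctionPair`, the dictionary representation of the internal parameters, and the `sync_every` argument are assumptions made for illustration; the parameter update itself (step S1520) is represented only by storing new values.

```python
import copy


class EvaluationFunctionPair:
    """Hypothetical holder for the parameters of F' (learning) and F (acting)."""

    def __init__(self, params: dict, sync_every: int) -> None:
        self.learning_params = dict(params)                       # F'
        self.acting_params = copy.deepcopy(self.learning_params)  # F
        self.sync_every = sync_every  # predetermined number of repetitions
        self.updates = 0

    def update(self, new_params: dict) -> None:
        # Step S1520: only F' receives the updated internal parameters
        self.learning_params = dict(new_params)
        self.updates += 1
        # Step S1530: copy F' to F every `sync_every` repetitions
        if self.updates % self.sync_every == 0:
            self.acting_params = copy.deepcopy(self.learning_params)
```

Keeping F fixed between copies means that the evaluation values used to select operation patterns do not shift after every single learning step.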

The learning process for the evaluation value function F is described in detail below. Because the evaluation value function F is a neural network, training it requires a teacher signal. The information processing device 102 determines the teacher signal y according to the end determination, as follows.

When the end determination of the untangling process is True, the teacher signal y is as follows.

When the end determination of the untangling process is False, the teacher signal y is as follows.
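The equations for the two cases are not reproduced in this text. Assuming the evaluation value function is trained in the manner of a standard Q-learning target, the teacher signal would commonly take the form below. The discount factor \gamma and the exact notation are assumptions and are not taken from the source.

```latex
y =
\begin{cases}
R, & \text{end determination is True} \\
R + \gamma \max_{a'} Q(s', a'), & \text{end determination is False}
\end{cases}
```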

Here, "s' = T2, a'" denotes the operation pattern for which "Q(s, a)" is maximized. The information processing device 102 obtains the squared error E between the teacher signal y above and the evaluation value function F, and trains the neural network by error backpropagation. The evaluation value function F is expressed by equation (3) below.

The information processing device 102 then substitutes equation (3) into equation (4) below to calculate the error.

Error backpropagation optimizes the internal parameters of the neural network so that E becomes 0. As learning progresses, the value of equation (5) below therefore approaches 0.

In reinforcement learning, likewise, equation (6) below holds once learning has progressed sufficiently, so training the neural network by error backpropagation yields the same result as reinforcement learning.
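Equations (3) to (6) are not reproduced in this text. One reading consistent with the surrounding description, offered only as a sketch, is given below. It assumes a standard Q-learning formulation in which s is the state before the operation pattern a_k, s' is the state after it, y is the teacher signal, and \gamma is a discount factor. The factor 1/2 in the error and the symbol \gamma are conventions assumed here, not taken from the source.

```latex
\begin{align}
F &= Q(s, a_k) \tag{3} \\
E &= \tfrac{1}{2}\,\bigl(y - F\bigr)^{2} \tag{4} \\
y - Q(s, a_k) &\to 0 \tag{5} \\
Q(s, a_k) &= R + \gamma \max_{a'} Q(s', a') \tag{6}
\end{align}
```

Under this reading, driving E toward 0 makes Q(s, a_k) agree with the teacher signal y, which for an unfinished episode is exactly the right-hand side of (6).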

As described above, the work taking-out work device 101 according to the present embodiment has no serial articulated structure and instead consists only of linear motion mechanisms and a parallel link. As a result, the singularity problem of articulated robots does not arise, and the device can work in less space than an articulated robot. In machine learning as well, the work taking-out work device 101 can use as learning data only the motor torque values of the electric actuators attached to the base-end-side link hub of the parallel link and of the electric actuators of the position adjustment device. Compared with an articulated robot, the work taking-out work device 101 therefore has fewer learning parameters in the untangling process, which makes machine learning easier. This makes it possible to improve the accuracy of the untangling process when taking out workpieces that tangle easily, such as C-shaped circlips, springs, and coils.

The embodiments disclosed herein should be considered illustrative in all respects and not restrictive. The scope of the present invention is defined by the claims rather than by the above description, and is intended to include all modifications within the meaning and scope equivalent to the claims.

11, 108A, 108B, 108C electric actuator; 32 first link hub; 33 second link hub; 34 link mechanism; 35, 36 end link member; 37 central link member; 42, 55, 73, 75 rotating shaft; 62 reduction mechanism; 63 motor fixing member; 100 work system; 101 work device; 102 information processing device; 103 control device; 104 gantry; 105 first linear motion unit; 106 second linear motion unit; 107 third linear motion unit; 109 work head; 110 rotation unit mounting member; 111 angle adjustment mechanism; 112 gripping mechanism; 113 workpiece container; 114, 115 imaging device; 702 primary storage device; 703 secondary storage device; 704 external device interface; 705 input interface; 706 output interface; 707 communication interface; 801 signal input unit; 802 evaluation value function unit; 803 operation pattern table; 804 operation determination unit; 805 command generation unit; 806 operation result determination unit; 807 evaluation value function learning unit.

Claims

A work device that takes out workpieces, comprising:
a grip portion that takes a workpiece out of a container and grips it;
an angle adjustment unit to which the grip portion is attached and which adjusts the orientation of the grip portion;
a work head on which the angle adjustment unit is mounted;
a position adjustment unit that moves the work head by means of a plurality of drive units;
a first imaging device that images the workpiece gripped by the grip portion; and
an information processing device that controls the work device,
wherein the angle adjustment unit includes:
first and second link hubs;
a plurality of links arranged in parallel between the first and second link hubs; and
a plurality of drive units that respectively drive the plurality of links, and
wherein the information processing device:
detects the number of workpieces gripped by the grip portion based on an image captured by the first imaging device;
when the number of workpieces gripped by the grip portion is two or more, uses the number and the image of the workpieces gripped by the grip portion and the torque of each drive unit of the angle adjustment unit as parameters of a machine learning model, and determines, by the machine learning model, the respective drive signals to be transmitted to each drive unit of the position adjustment unit and of the angle adjustment unit; and
drives each drive unit of the position adjustment unit and each drive unit of the angle adjustment unit based on the determined drive signals, thereby causing the grip portion to perform a workpiece untangling operation.

The work device according to claim 1, wherein the information processing device:
after the grip portion has performed the untangling operation, again detects the number of workpieces gripped by the grip portion based on an image captured by the first imaging device; and
when the number of workpieces gripped by the grip portion is two or more, causes the grip portion to perform the workpiece untangling operation again.

The work device according to claim 1 or 2, wherein the information processing device:
determines, based on an image captured by the first imaging device, whether the number of workpieces gripped by the grip portion is one;
when the number of workpieces gripped by the grip portion is one, determines that the workpiece untangling work has been completed; and
generates reward data used for training the machine learning model and adds the reward data to the learning data input to the machine learning model.

The work device according to claim 3, wherein the information processing device:
determines, based on an image captured by the first imaging device, whether the number of workpieces gripped by the grip portion is zero; and
when the number of workpieces gripped by the grip portion is zero, ends the untangling work without generating reward data used for training the machine learning model.

The work device according to any one of claims 1 to 4, further comprising a second imaging device that images the workpieces in the container,
wherein the information processing device:
determines, based on an image captured by the second imaging device, whether there is a workpiece in the container; and
when it determines that there is a workpiece in the container, transmits to each drive unit of the position adjustment unit and of the angle adjustment unit the drive signals for moving the grip portion to a predetermined take-out position.

The work device according to claim 5, wherein the information processing device ends the work of taking workpieces out of the container when it determines, based on an image captured by the second imaging device, that there is no workpiece in the container.

The work device according to any one of claims 1 to 6, wherein, when determining the respective drive signals to be transmitted to each drive unit of the position adjustment unit and of the angle adjustment unit, the information processing device further acquires the torque of each drive unit of the position adjustment unit and adds the torque of each drive unit of the position adjustment unit to the parameters of the machine learning model.

The work device according to any one of claims 1 to 7, wherein each drive signal transmitted to each drive unit of the position adjustment unit and of the angle adjustment unit includes information on the command torque, rotation speed, and rotation amount of the respective drive unit.

The work device according to any one of claims 1 to 8, wherein the position adjustment unit includes a three-axis linear motion mechanism.

The work device according to any one of claims 1 to 9, wherein the drive units of the position adjustment unit are stepping motors.

The work device according to any one of claims 1 to 10, wherein the drive units of the angle adjustment unit are stepping motors.