JP7113778B2 - Work take-out device

Info

Publication number: JP7113778B2
Application number: JP2019060437A
Authority: JP
Inventors: 博明大庭
Original assignee: NTN Corp
Current assignee: NTN Corp
Priority date: 2019-03-27
Filing date: 2019-03-27
Publication date: 2022-08-05
Anticipated expiration: 2039-03-27
Also published as: JP2020157436A

Description

The present disclosure relates to a work picking device, and more particularly to machine-learning-based control of a work picking device that has an angle adjustment function provided by a link mechanism.

In automated machining or assembly work, workpieces (hereinafter also referred to as "work") are often picked up automatically by a robot, an assembly device, or the like (a so-called work picking device) and set in a processing device or an assembly.

When a picked workpiece is to be assembled onto another workpiece in a subsequent assembly process, the work picking device must take out workpieces one at a time. However, when handling workpieces that tangle easily, such as springs and clips, the work picking device may grip two or more workpieces at once. The work picking device is therefore required not only to pick workpieces but also to untangle workpieces from one another (hereinafter referred to as the untangling function).

In addition, as machine learning techniques have advanced in recent years, it is desirable to introduce machine learning to improve the control accuracy of work picking devices.

Regarding work picking control, for example, Patent Document 1 (Japanese Patent Application Laid-Open No. 2017-030135) discloses, using an articulated robot, "a machine learning device in which the optimal operation of a robot when taking out randomly placed workpieces, including workpieces in a bulk-stacked state, is selected without human intervention" (see [Abstract]).

Regarding untangling control, for example, Patent Document 2 (Japanese Patent Application Laid-Open No. 05-185388) discloses a parts supply device in which "a tangle detection camera captures an image of the state in which a robot hand grips a workpiece, and a vision device determines, based on the captured image information, whether the workpiece gripped by the robot hand is tangled" (see [Abstract]).

Japanese Patent Application Laid-Open No. 2017-030135
Japanese Patent Application Laid-Open No. 05-185388
Japanese Patent Application Laid-Open No. 2017-064910
Japanese Patent Application Laid-Open No. 06-144584

For example, the techniques disclosed in Patent Documents 1 and 2 both presuppose an articulated robot. An articulated robot generally has postures, called singularities, in which it becomes structurally uncontrollable. An articulated robot also requires sensors to detect the forces and moments applied to the workpiece, so when it is combined with machine learning, the number of learning parameters increases and learning efficiency deteriorates.

A technique is therefore needed that, unlike an articulated robot, has no structural singularities and allows machine learning to be performed efficiently.

The present disclosure has been made in view of the above background, and an object in one aspect is to provide a technique that has no structural singularities and allows machine learning to be performed efficiently.

A working device that picks workpieces according to an embodiment includes: a gripping unit that takes a workpiece out of a container and grips it; an angle adjustment unit to which the gripping unit is attached and which adjusts the orientation of the gripping unit; a working head to which the angle adjustment unit is attached; a position adjustment unit that moves the working head by means of a plurality of drive units; a gripping unit imaging device that images the workpiece gripped by the gripping unit; and a control device that controls the working device. The angle adjustment unit includes first and second link hubs, a plurality of links arranged in parallel between the first and second link hubs, and a plurality of drive units that drive the respective links. The control device detects the number of workpieces gripped by the gripping unit based on an image captured by the gripping unit imaging device. When that number is two or more, the control device uses the number of gripped workpieces and the torque of each drive unit of the angle adjustment unit as parameters of a machine learning model, determines by means of the machine learning model the drive signals to be sent to each drive unit of the position adjustment unit and of the angle adjustment unit, and drives those drive units based on the determined drive signals, thereby causing the gripping unit to perform a workpiece untangling operation.

According to an embodiment, machine learning can be performed efficiently without structural singularities.

The above and other objects, features, aspects, and advantages of the present invention will become apparent from the following detailed description of the invention, taken in conjunction with the accompanying drawings.

FIG. 1 is a diagram showing a configuration example of a work picking system 100 according to an embodiment.
FIG. 2 is a diagram showing a configuration example of an angle adjustment mechanism 111.
FIG. 3 is a diagram showing a configuration example in which an electric actuator 11 for attitude control is attached to a rotation shaft 42 of the angle adjustment mechanism 111.
FIG. 4 is a diagram showing a configuration example of a gripping mechanism 112.
FIG. 5 is a diagram showing an example of the angle adjustment mechanism 111 with the gripping mechanism 112 attached.
FIG. 6 is a diagram showing a hardware configuration example of an information processing device 102.
FIG. 7 is a diagram showing a configuration example of the functions that implement the information processing device 102.
FIG. 8 is a diagram showing an example of the operation of an evaluation value function unit 802.
FIG. 9 is a diagram showing an example of a motion pattern table 803.
FIG. 10 is a flowchart showing an example of processing by the work picking system 100.
FIG. 11 is a diagram showing an example of an operation image of the processing of FIG. 10.
FIG. 12 is a flowchart showing an example of learning processing for the untangling work of the work picking system 100.
FIG. 13 is a flowchart showing an example of initial learning processing for the untangling work (corresponding to step 1230 in FIG. 12).
FIG. 14 is a flowchart showing an example of learning processing for the untangling work (corresponding to step 1260 in FIG. 12).
FIG. 15 is a flowchart showing an example of update processing for an evaluation value function F of the evaluation value function unit 802.

Hereinafter, embodiments of the technical concept according to the present disclosure will be described with reference to the drawings. In the following description, identical parts are given identical reference numerals. Their names and functions are also the same, so detailed descriptions of them will not be repeated.

<A. System configuration>
FIG. 1 is a diagram showing a configuration example of the work picking system 100 according to the present embodiment. Referring to FIG. 1, the work picking system 100 includes a work picking device 101, an information processing device 102, a control device 103, and imaging devices 114 and 115.

The work picking device 101 includes a base 104, a first linear motion unit 105, a second linear motion unit 106, a third linear motion unit 107, electric actuators 108A, 108B, and 108C (hereinafter referred to collectively as the electric actuators 108), a working head 109, a rotary unit attachment member 110, an angle adjustment mechanism 111, a gripping mechanism 112, and a work container 113.

The base 104 is a stand on which the position adjustment device is mounted. The position adjustment device consists of the first linear motion unit 105, the second linear motion unit 106, the third linear motion unit 107, the electric actuators 108A, 108B, and 108C that drive the respective linear motion units, and the working head 109.

The first linear motion unit 105, the second linear motion unit 106, and the third linear motion unit 107 move the working head 109 along the mutually orthogonal X, Y, and Z axes, respectively. In one aspect, each linear motion unit may include a frame, a linear shaft, a linear bushing, and a trapezoidal screw and ball screw nut for transmitting power from the electric actuator 108. In one aspect, each linear motion unit may include, instead of the linear shaft, a linear guide or guide rollers that slide on the surface of the frame. In one aspect, each linear motion unit may include a drive belt instead of the trapezoidal screw. A collision detection sensor may also be provided at the end of each linear motion unit, both to determine the initial position of each electric actuator 108 and to serve as a safety mechanism.

The electric actuators 108 drive the respective linear motion units. In one aspect, each electric actuator 108 may be a stepping motor that transmits power to its linear motion unit via a trapezoidal screw or a drive belt. In one aspect, each electric actuator 108 may instead be an AC servomotor or a geared motor with an encoder. The information processing device 102 may calculate the current positions of the working head 109 and the gripping mechanism 112 from the step count of the stepping motor or the rotation count of the encoder.
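The position calculation from step counts mentioned above can be sketched as follows. This is a minimal illustration, assuming a stepping motor that drives a screw directly; the function names, `steps_per_rev`, `lead_mm`, and the numeric defaults are hypothetical and do not appear in the source.

```python
def steps_to_position_mm(step_count: int,
                         steps_per_rev: int = 200,
                         lead_mm: float = 2.0) -> float:
    """Convert an accumulated step count into linear travel on one axis.

    Assumption (not from the source): the screw advances lead_mm per motor
    revolution, with no microstepping and no gear reduction.
    """
    if steps_per_rev <= 0:
        raise ValueError("steps_per_rev must be positive")
    return step_count / steps_per_rev * lead_mm


def head_position_mm(steps_xyz):
    """Return the (X, Y, Z) position from the three per-axis step counts."""
    return tuple(steps_to_position_mm(s) for s in steps_xyz)
```

With an encoder-equipped motor, the same conversion would apply with encoder counts per revolution in place of `steps_per_rev`.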

The working head 109 is attached to the third linear motion unit 107 so that it moves vertically (in the Z-axis direction). The working head 109 also has screw holes and attachments for mounting the parts needed for the work.

The rotary unit attachment member 110 is attached to the working head 109 and has screw holes and attachments for mounting the angle adjustment mechanism 111. The angle adjustment mechanism 111 finely adjusts the orientation of the workpiece gripped by the gripping mechanism 112. The base of the angle adjustment mechanism 111 is a rotation mechanism that uses an electric actuator. The rotation mechanism may be separate from the angle adjustment mechanism 111. Details of the angle adjustment mechanism 111 are described later. The gripping mechanism 112 grips the workpiece handled in the picking operation, for example a C-shaped circlip, a spring, or a coil. Details of the gripping mechanism 112 are described later. The work container 113 is a box that holds the workpieces.

The information processing device 102 transmits control commands to the work picking device 101 via the control device 103, and acquires values such as the motor torque of the electric actuators 108 and of the electric actuators of the angle adjustment mechanism 111. Details of the information processing device 102 are described later.

The control device 103 converts the data exchanged between the work picking device 101 and the information processing device 102. In one aspect, the control device 103 may be a control board built around a microcomputer that receives, from the information processing device 102, commands (command torque, rotation amount, rotation speed, and the like) for the electric actuators 108 of the work picking device 101 and the electric actuators of the angle adjustment mechanism 111, and transmits control signals to the respective electric actuators.

The imaging device 114 photographs the work container 113 from above and transmits the resulting image or video to the information processing device 102. Based on the image or video received from the imaging device 114, the information processing device 102 detects the position of a workpiece and uses the position adjustment device to move the gripping mechanism 112 to the workpiece pick-up position.

The imaging device 115 photographs the workpiece gripped by the gripping mechanism 112 from the side and transmits the resulting image or video to the information processing device 102. Based on the image or video received from the imaging device 115, the information processing device 102 detects the number of workpieces gripped by the gripping mechanism 112.

<B. Hardware configuration of system components>
FIG. 2 is a diagram showing a configuration example of the angle adjustment mechanism 111. Referring to FIG. 2, the angle adjustment mechanism 111 connects the second link hub 33 on the distal side to the first link hub 32 on the proximal side through three sets of link mechanisms 34 so that its attitude can be changed. The gripping mechanism 112 shown in FIG. 1 is attached to the second link hub 33 on the distal side. Although the angle adjustment mechanism 111 shown here has three sets of link mechanisms 34, the number of link mechanisms 34 may be four or more.

Each link mechanism 34 consists of a proximal end link member 35, a distal end link member 36, and a central link member 37. The link mechanism 34 is a four-bar chain link mechanism made up of four revolute pairs. The proximal and distal end link members 35 and 36 are L-shaped.

One end of the proximal end link member 35 is rotatably connected to the proximal first link hub 32 via a rotation shaft 42. One end of the distal end link member 36 is rotatably connected to the distal second link hub 33 via a rotation shaft 73. The other ends of the end link members 35 and 36 are rotatably connected to the two ends of the central link member 37 via rotation shafts 55 and 75, respectively.

The angle adjustment mechanism 111 is a parallel link mechanism with a structure that combines two spherical link mechanisms. The central axes of the revolute pairs between the end link members 35 and 36 and the central link member 37 may intersect at some angle or may be parallel.

The angle adjustment mechanism 111 can adjust the relative angle between the central axes of the link hubs through the motion of the links alone, without moving a series of joints connected in a chain as an articulated robot does. As a result, a small movement of the tip does not require large movements of the constituent members, and quick motion is possible. In addition, the angle adjustment mechanism 111 can detect the force applied to the tip of the gripping mechanism 112 in any posture from the motor torque values of the electric actuators that drive the links.

The second link hub 33 changes its attitude on a hemispherical surface as viewed from the first link hub 32. The target position of the second link hub 33 as viewed from the first link hub 32 therefore always corresponds one-to-one with the postures of the links. Consequently, unlike a multi-link structure such as a robot arm, the angle adjustment mechanism 111 has no singularities.

FIG. 3 shows a configuration example in which the electric actuator 11 for attitude control is attached to the rotation shaft 42 of the angle adjustment mechanism 111. The electric actuator 11 is a rotary actuator (motor) with a speed reduction mechanism 62. The electric actuator 11 is installed on the upper surface of the proximal first link hub 32 so that the rotation axis of the electric actuator 11 is coaxial with the rotation shaft 42. The electric actuator 11 and the speed reduction mechanism 62 may be provided as a single unit. The speed reduction mechanism 62 is fixed to the proximal first link hub 32 by a motor fixing member 63. In one aspect, the electric actuator 11 may be an AC servomotor or a geared motor with an encoder.

In the example shown in FIG. 3, an electric actuator 11 is provided on all three sets of link mechanisms 34, but the angle adjustment mechanism 111 according to the present embodiment is not limited to this. As long as electric actuators 11 for attitude control are provided on at least two of the link mechanisms 34, the angle adjustment mechanism 111 can determine the attitude of the distal second link hub 33 relative to the proximal first link hub 32.

FIG. 4 is a diagram showing a configuration example of the gripping mechanism 112. The gripping mechanism 112 clamps an object between two opposing claws. The gripping mechanism 112 according to the present embodiment opens and closes the two claws using an air cylinder. State A shows the gripping mechanism 112 open. State B shows the gripping mechanism 112 closed. The gripping mechanism 112 shown in FIG. 4 is an example, and the gripping mechanism 112 according to the present embodiment is not limited to it. In one aspect, the gripping mechanism 112 may be an electrically driven opening and closing mechanism, a mechanism that holds the object by suction, or another clamping mechanism.

FIG. 5 is a diagram showing an example of the angle adjustment mechanism 111 with the gripping mechanism 112 attached. The distal second link hub 33 of the angle adjustment mechanism 111 may have screw holes for fastening the gripping mechanism 112, fitting holes, or other attachments. With the configuration shown in FIG. 5, the work picking device 101 can perform an untangling operation when the gripping mechanism 112 has gripped a plurality of workpieces.

In this example, the position adjustment device moves the angle adjustment mechanism 111, but the work picking device 101 according to the present embodiment is not limited to this. The position adjustment device only needs to be able to position the angle adjustment mechanism 111, the gripping mechanism 112, and the workpieces in the work container 113 relative to one another. In one aspect, the position adjustment device may include a mechanism that moves the work container 113.

<C. Circuit and software configuration>
FIG. 6 is a diagram showing a hardware configuration example of the information processing device 102. Referring to FIG. 6, the information processing device 102 includes a CPU (Central Processing Unit) 701, a primary storage device 702, a secondary storage device 703, an external device interface 704, an input interface 705, an output interface 706, and a communication interface 707.

The CPU 701 processes the programs and data that run on the information processing device 102. The primary storage device 702 stores the programs executed by the CPU 701 and the data they reference. In one aspect, DRAM (Dynamic Random Access Memory) may be used as the primary storage device 702.

The secondary storage device 703 stores programs, data, and the like over the long term. Because the secondary storage device 703 is generally slower than the primary storage device 702, data used directly by the CPU 701 is placed in the primary storage device 702 and other data is placed in the secondary storage device 703. In one aspect, a non-volatile storage device such as an HDD (Hard Disk Drive) or an SSD (Solid State Drive) may be used as the secondary storage device 703.

The external device interface 704 is used, for example, to connect an auxiliary device to the information processing device 102. In one aspect, a USB (Universal Serial Bus) interface may be used as the external device interface 704. The input interface 705 is used to connect a keyboard, a mouse, and the like. In one aspect, a USB interface may be used as the input interface 705.

The output interface 706 is used to connect an output device such as a display. In one aspect, HDMI (registered trademark) (High-Definition Multimedia Interface) or DVI (Digital Visual Interface) may be used as the output interface 706.

The communication interface 707 is used to communicate with external communication equipment. In one aspect, a LAN (Local Area Network) port, a Wi-Fi (registered trademark) (Wireless Fidelity) transceiver, or the like may be used as the communication interface 707. In one aspect, the information processing device 102 may be a PC (Personal Computer) or a workstation. The processing of the information processing device 102 according to the present embodiment may be executed as a program on the hardware shown in FIG. 6.

FIG. 7 is a diagram showing a configuration example of the functions that implement the information processing device 102. In one aspect, some of the functions shown in FIG. 7 may be realized by executing a program on the hardware shown in FIG. 6. Referring to FIG. 7, the information processing device 102 includes a signal input unit 801, an evaluation value function unit 802, a motion pattern table 803, a motion determination unit 804, a command generation unit 805, a motion result determination unit 806, and an evaluation value function learning unit 807.

The signal input unit 801 acquires the images captured by the imaging devices 114 and 115, and acquires from the work picking device 101 the motor torque values of the electric actuators 11 of the angle adjustment mechanism 111. In one aspect, the signal input unit 801 may also acquire the motor torque values of the electric actuators 108 of the position adjustment device.

The evaluation value function unit 802 uses the evaluation value function F, described later, to calculate an evaluation value for each motion pattern based on inputs such as the number of workpieces detected from the images supplied to the signal input unit 801 and the motor torque values.

The motion pattern table 803 stores a plurality of motion patterns, each associated with at least one of the movement amount, movement speed, acceleration, and command torque value of each electric actuator of the position adjustment device and the angle adjustment mechanism 111. For the angle adjustment mechanism 111, the motion pattern table 803 may include in a motion pattern the angle of the angle adjustment mechanism 111 or similar quantities rather than command values for the individual actuators.

The motion determination unit 804 selects, from the motion patterns in the motion pattern table 803, the motion pattern with the largest evaluation value as the next motion of the work picking device 101. The command generation unit 805 generates command values for the electric actuators of the work picking device 101 based on the motion pattern selected by the motion determination unit 804, and transmits those command values to the work picking device 101 via the control device 103.

The motion result determination unit 806 assigns a reward based on the number of workpieces gripped by the gripping mechanism 112 before and after execution of the previously selected motion pattern: a reward of 1 when the number of gripped workpieces is one, and a reward of 0 when two or more workpieces are gripped. In addition, it assigns a reward of -1 when the number of motion pattern executions exceeds the upper limit.
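The reward rule above can be sketched as a small function. This is only an illustration under stated assumptions: the name `motion_reward` and its parameters are hypothetical, and the source specifies neither which rule takes precedence when the execution limit is exceeded nor the reward when no workpiece is gripped.

```python
def motion_reward(gripped_count: int, executions: int, max_executions: int) -> int:
    """Return the reward used as the teacher signal for one motion pattern.

    Assumptions (not stated in the source): the execution-limit check takes
    precedence over the count check, and gripping zero workpieces yields 0.
    """
    if executions > max_executions:
        return -1  # too many untangling attempts
    if gripped_count == 1:
        return 1   # exactly one workpiece remains: untangling succeeded
    return 0       # two or more workpieces are still gripped
```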

The evaluation value function learning unit 807 treats the reward output by the motion result determination unit 806 as a teacher signal and updates the evaluation value function F based on the difference between the teacher signal and the evaluation value at the time the motion pattern was selected. In one aspect, the evaluation value function learning unit 807 may refresh the evaluation value function F used by the evaluation value function unit 802 to the latest state each time it has updated the evaluation value function F a predetermined number of times.
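As an illustration of an update driven by the difference between the teacher signal and the evaluation value, the sketch below uses a linear model with one weight vector per motion pattern. The source does not specify the form of the evaluation value function F; the linear form, the class name `LinearEvaluationFunction`, the feature layout, and the learning rate are all assumptions made for this example.

```python
class LinearEvaluationFunction:
    """Hypothetical stand-in for F: one linear scorer per motion pattern."""

    def __init__(self, n_patterns: int, n_features: int, learning_rate: float = 0.1):
        self.weights = [[0.0] * n_features for _ in range(n_patterns)]
        self.learning_rate = learning_rate

    def evaluate(self, features):
        """Return one evaluation value per motion pattern."""
        return [sum(w * x for w, x in zip(row, features)) for row in self.weights]

    def update(self, features, pattern_index: int, teacher_signal: float) -> float:
        """Move the selected pattern's value toward the teacher signal.

        Returns the error (teacher signal minus evaluation value before update).
        """
        value = self.evaluate(features)[pattern_index]
        error = teacher_signal - value
        row = self.weights[pattern_index]
        for i, x in enumerate(features):
            row[i] += self.learning_rate * error * x  # gradient step on squared error
        return error
```

Here `features` might hold, for example, the detected workpiece count followed by the motor torque values; only the weights of the pattern that was actually executed are changed.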

FIG. 8 is a diagram showing an example of the operation of the evaluation value function unit 802. The evaluation value function unit 802 acquires from the signal input unit 801 the images captured by the imaging devices 114 and 115, the motor torque values, and the like, and inputs them to the evaluation value function F. The information processing device 102 may instead input the number of workpieces detected from the images to the evaluation value function F. An evaluation value is calculated for each motion pattern. In the example shown in FIG. 8, the evaluation value function unit 802 calculates an evaluation value for each of the n motion patterns a_1 to a_n. In one aspect, the evaluation value function unit 802 may be a program that accepts the images (or the number of workpieces shown in the images), the motor torque values, and the like as inputs to the evaluation value function F and calculates the evaluation value of each motion pattern.

The n evaluation values output by the evaluation value function F are indices for selecting the operation pattern to execute next: the operation pattern whose evaluation value is largest is the optimal operation to execute next.

Accordingly, the operation determination unit 804 selects, from among the n operation patterns, the one corresponding to the maximum evaluation value as the next operation. In the example shown in FIG. 9, the evaluation value 0.614 is the maximum, so the operation determination unit 804 selects the corresponding operation pattern a_{n-3}.

The operation determination unit 804 passes the selected operation pattern a_{n-3} to the command generation unit 805. The command generation unit 805 refers to the operation pattern table 803, generates the command values corresponding to a_{n-3}, and outputs them to the control device 103.
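
The selection rule described above amounts to taking the index of the largest evaluation value. The following is a minimal sketch of that rule only; the function name and the sample values (other than 0.614, which the text quotes) are illustrative assumptions and do not come from the patent.

```python
from typing import Sequence


def select_operation_pattern(evaluation_values: Sequence[float]) -> int:
    """Return the 0-based index of the operation pattern with the largest evaluation value."""
    if not evaluation_values:
        raise ValueError("at least one evaluation value is required")
    return max(range(len(evaluation_values)), key=lambda k: evaluation_values[k])


# Hypothetical evaluation values for n = 6 operation patterns a_1 to a_6.
values = [0.102, 0.233, 0.614, 0.187, 0.051, 0.340]
best = select_operation_pattern(values)
# best is 2 (0-based), i.e. the third pattern a_3, which equals a_{n-3} for n = 6.
```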

FIG. 9 is a diagram showing an example of the operation pattern table 803. For each operation pattern, the operation pattern table 803 stores: the movement amount, movement speed, acceleration, and command torque value of the electric actuator 108 of the position adjustment device; the rotation angle, rotation speed, acceleration, deceleration, and command torque value of the rotation mechanism at the base of the angle adjustment mechanism 111; and the bend angle change amount, turning angle change amount, rotation speed, acceleration, deceleration, and command torque value of the angle adjustment mechanism 111.

In one aspect, the operation pattern table 803 may store the movement amount, movement speed, acceleration, and command torque value of each individual electric actuator of the angle adjustment mechanism 111. In one aspect, the operation pattern table 803 may be stored in the secondary storage device 703, read into the primary storage device 702, and referred to by the CPU 701.
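
One way to picture a row of the operation pattern table 803 is as a record with one field per stored quantity. The sketch below is an illustration only: the class name, field names, units, and numeric values are assumptions, and the patent does not specify how the table is laid out in memory.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class OperationPattern:
    # Electric actuator 108 of the position adjustment device
    move_amount: float
    move_speed: float
    move_acceleration: float
    move_command_torque: float
    # Rotation mechanism at the base of the angle adjustment mechanism 111
    base_rotation_angle: float
    base_rotation_speed: float
    base_acceleration: float
    base_deceleration: float
    base_command_torque: float
    # Angle adjustment mechanism 111
    bend_angle_change: float
    turning_angle_change: float
    link_rotation_speed: float
    link_acceleration: float
    link_deceleration: float
    link_command_torque: float


def command_values(table: Dict[int, OperationPattern], k: int) -> OperationPattern:
    """Look up the stored values for operation pattern a_k (k is 1-based)."""
    return table[k]


# Hypothetical single-row table; every number here is made up for illustration.
table = {
    1: OperationPattern(
        move_amount=10.0, move_speed=50.0, move_acceleration=200.0, move_command_torque=0.3,
        base_rotation_angle=15.0, base_rotation_speed=30.0, base_acceleration=100.0,
        base_deceleration=100.0, base_command_torque=0.2,
        bend_angle_change=5.0, turning_angle_change=20.0, link_rotation_speed=40.0,
        link_acceleration=150.0, link_deceleration=150.0, link_command_torque=0.25,
    )
}
row = command_values(table, 1)
```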

<D. Internal Processing of the Information Processing Device 102 in the Work Picking Operation>
FIG. 10 is a flowchart showing an example of the processing of the work picking system 100. In one aspect, a program for executing the processing of FIG. 10 may be stored in the secondary storage device 703, read into the primary storage device 702, and executed by the CPU 701. In the following, the processing is described on the assumption that the information processing device 102 executes each step of FIG. 10.

In step S1005, the information processing device 102 detects the positions of the workpieces in the work container 113 based on image 1, which the imaging device 114 obtains by photographing the work container 113 from above. In one aspect, the information processing device 102 may use existing image recognition techniques to detect the workpiece positions from image 1.

In step S1010, the information processing device 102 determines whether a workpiece has been detected in image 1. If so (YES in step S1010), it passes control to step S1015. Otherwise (NO in step S1010), it determines that all workpieces have been taken out of the work container 113 and ends the processing.

In step S1015, the information processing device 102 uses the position adjustment device to move the gripping mechanism 112 to a predetermined position for the picking operation and takes a workpiece out of the work container 113.

In step S1020, the information processing device 102 calculates the number of workpieces held by the gripping mechanism 112 based on image 2, which the imaging device 115 obtains by photographing the tip of the gripping mechanism 112 from the side. In one aspect, the information processing device 102 may use existing image recognition techniques to calculate this number from image 2.

In step S1025, the information processing device 102 determines, based on the analysis result of image 2, whether the gripping mechanism 112 is holding exactly one workpiece. If so (YES in step S1025), it passes control to step S1030. Otherwise (NO in step S1025), it passes control to step S1035.

In step S1030, the information processing device 102 places the workpiece held by the gripping mechanism 112 at a predetermined position and ends the processing.

In step S1035, the information processing device 102 determines, based on the analysis result of image 2, whether the gripping mechanism 112 is holding no workpiece at all. If it determines that the gripping mechanism 112 is holding no workpiece (YES in step S1035), it ends the processing. Otherwise (NO in step S1035), it passes control to step S1040.

In step S1040, the information processing device 102 acquires the images captured by the imaging devices 114 and 115 and the motor torque value of each electric actuator. In one aspect, the information processing device 102 may additionally acquire sensor values from various sensors.

In step S1045, the information processing device 102 inputs the acquired images (or the number of workpieces shown in the images) and the motor torque value of each electric actuator to the evaluation value function unit 802 and calculates an evaluation value for each operation pattern. In step S1050, the information processing device 102 finds the maximum among the evaluation values calculated for the operation patterns and selects the corresponding operation pattern a_k as the next operation.

In step S1055, the information processing device 102 uses the command generation unit 805 to send the work picking device 101 a command for executing the selected operation pattern a_k. Based on the received command, the work picking device 101 drives the electric actuator 11 of the angle adjustment mechanism 111 so that the gripping mechanism 112 performs an untangling motion. In one aspect, the work picking device 101 may drive both the electric actuator 11 of the angle adjustment mechanism 111 and the electric actuator 108 of the position adjustment device to make the gripping mechanism 112 perform the untangling motion.

In step S1060, the information processing device 102 again calculates the number of workpieces held by the gripping mechanism 112 based on image 2, obtained by photographing the tip of the gripping mechanism 112 from the side with the imaging device 115.

In step S1065, the information processing device 102 determines, based on the analysis result of image 2 captured in step S1060, whether the gripping mechanism 112 is holding exactly one workpiece. If so (YES in step S1065), the untangling work is complete and the information processing device 102 passes control to step S1030. Otherwise (NO in step S1065), the untangling work is not yet complete and the information processing device 102 passes control to step S1035.

Through steps S1040 to S1065, the information processing device 102 keeps having the work picking device 101 execute operation patterns selected on the basis of the evaluation values until the gripping mechanism 112 holds a single workpiece, and can thereby untangle the workpieces.
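
The control flow of steps S1020 to S1065 can be sketched as a single loop. This is a minimal sketch under stated assumptions: the three callables stand in for image analysis, evaluation by the evaluation value function F, and command transmission, and their names are hypothetical. As in the flowchart as described, the loop has no iteration limit of its own.

```python
from typing import Callable, Sequence


def untangle_after_pick(
    count_gripped: Callable[[], int],
    evaluate: Callable[[], Sequence[float]],
    execute: Callable[[int], None],
) -> bool:
    """Return True if exactly one workpiece ends up gripped, False if none remains."""
    count = count_gripped()                    # S1020: count workpieces in image 2
    while True:
        if count == 1:                         # S1025 / S1065: exactly one workpiece
            return True                        # proceed to S1030 (place the workpiece)
        if count == 0:                         # S1035: nothing is gripped
            return False
        values = evaluate()                    # S1040 and S1045: state -> evaluation values
        k = max(range(len(values)), key=lambda idx: values[idx])  # S1050: best pattern
        execute(k)                             # S1055: send the command
        count = count_gripped()                # S1060: count again


# Simulated run: three workpieces, still three after one motion, then one.
_counts = iter([3, 3, 1])
executed = []
result = untangle_after_pick(lambda: next(_counts), lambda: [0.1, 0.9, 0.3], executed.append)
```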

FIG. 11 is a diagram showing an example of how the processing of FIG. 10 operates. State X shows the moment immediately after the gripping mechanism 112 has taken workpieces out of the work container 113 (corresponding to step S1015). The gripping mechanism 112 is holding three entangled workpieces (workpieces A, B, and C).

Starting from state X, the information processing device 102 has the work picking device 101 perform the untangling work by repeating steps S1035 to S1065 of FIG. 10. The information processing device 102 acquires the images (or the number of workpieces shown in the images) and the motor torque value of each electric actuator in state X (corresponding to step S1040). Next, it inputs these to the evaluation value function F and calculates the evaluation value of each operation pattern (corresponding to step S1045). It then selects the operation pattern a_{n-3}, which has the highest evaluation value (corresponding to step S1050), and sends the command corresponding to a_{n-3} to the work picking device 101 (corresponding to step S1055).

State Y shows the situation immediately after the work picking device 101 has executed operation pattern a_{n-3}. The information processing device 102 determines whether the gripping mechanism 112 is holding exactly one workpiece (corresponding to steps S1060 and S1065). In state Y, the gripping mechanism 112 is still holding three workpieces, so the untangling work is not complete. The information processing device 102 therefore repeats steps S1035 to S1065.

State Z shows the situation immediately after the work picking device 101, in state Y, has executed operation pattern a_{n-1}. In state Z, workpieces B and C have fallen, and the untangling of the workpieces held by the gripping mechanism 112 is complete. The information processing device 102 detects the completion of the untangling work from image 2, which the imaging device 115 obtains by photographing the tip of the gripping mechanism 112 from the side. Once the untangling work is complete, the information processing device 102 has the work picking device 101 carry the workpiece to a predetermined position. The information processing device 102 may then send the work picking device 101 a command to pick the next workpiece.

<E. Learning Processing for the Work Picking Operation>
In the example described with reference to FIGS. 10 and 11, the information processing device 102 calculates the evaluation value of each operation pattern based on images 1 and 2 captured by the imaging devices 114 and 115 (or the number of workpieces shown in the images), the current motor torque values of the angle adjustment mechanism 111 and the position adjustment device, and the like, and brings the operation to a successful conclusion by successively executing the operation pattern with the largest evaluation value. The evaluation value function F of FIG. 8 therefore needs to be optimized so that, given the images (or the number of workpieces shown in the images), the motor torque values, and the like, it outputs the maximum evaluation value for the optimal operation pattern to execute next.

However, the initial state of the workpieces to be picked (for example, how they are entangled) may differ each time they are gripped by the gripping mechanism 112. In addition, when the work picking device 101 executes an operation pattern, the posture and entanglement of the workpieces held by the gripping mechanism 112 may change. It is difficult to build a rule-based operation program that anticipates every such state. The work picking system 100 according to the present embodiment therefore uses reinforcement learning to optimize the evaluation value function F in the course of repeated trial operations.

FIG. 12 is a flowchart showing an example of the learning processing for the untangling work of the work picking system 100. In one aspect, a program for executing the processing of FIG. 12 may be stored in the secondary storage device 703, read into the primary storage device 702, and executed by the CPU 701. In the following, the learning processing is described on the assumption that the information processing device 102 executes each step of FIG. 12.

In step S1210, the information processing device 102 assigns 1 to the variable j. In step S1220, it determines whether the value of j is less than or equal to the constant J1. If so (YES in step S1220), it passes control to step S1230. Otherwise (NO in step S1220), it passes control to step S1250. Starting from the initial state in which the evaluation value function F is untrained, the information processing device 102 repeats the initial learning of the untangling work as long as j does not exceed J1.

In step S1230, the information processing device 102 executes the initial learning processing for the untangling work, which is described later. In step S1240, it increments the value of j. The information processing device 102 thus repeats the initial learning processing up to J1 times, the predetermined upper limit.

In step S1250, the information processing device 102 determines whether the value of j is less than or equal to the constant J2. If so (YES in step S1250), it passes control to step S1260. Otherwise (NO in step S1250), it ends the learning processing.

In step S1260, the information processing device 102 executes the learning processing for the untangling work, which is described later. In step S1270, it increments the value of j. The information processing device 102 thus repeats the learning processing until j exceeds the constant J2, the predetermined upper limit.
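
The two-phase schedule of FIG. 12 can be sketched as two consecutive loops over the same counter j. The function and parameter names below are assumptions for illustration. Because j is not reset between the phases, the second phase runs J2 - J1 times when J2 is larger than J1.

```python
from typing import Callable


def run_learning_schedule(
    initial_episode: Callable[[], None],
    learning_episode: Callable[[], None],
    j1: int,
    j2: int,
) -> None:
    """Run the initial learning phase, then the regular learning phase, as in FIG. 12."""
    j = 1                        # S1210
    while j <= j1:               # S1220
        initial_episode()        # S1230: initial learning (random operation patterns)
        j += 1                   # S1240
    while j <= j2:               # S1250
        learning_episode()       # S1260: learning with the evaluation value function F
        j += 1                   # S1270


# Example with hypothetical limits J1 = 3 and J2 = 5.
log = []
run_learning_schedule(lambda: log.append("initial"), lambda: log.append("learn"), 3, 5)
```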

FIG. 13 is a flowchart showing an example of the initial learning processing for the untangling work (corresponding to step S1230 of FIG. 12). In one aspect, a program for executing the processing of FIG. 13 may be stored in the secondary storage device 703, read into the primary storage device 702, and executed by the CPU 701. In the following, the initial learning processing is described on the assumption that the information processing device 102 executes each step of FIG. 13.

In step S1305, the information processing device 102 sends the work picking device 101 a command to move the gripping mechanism 112, by means of the position adjustment device, to a predetermined position (the work picking start position).

In step S1310, the information processing device 102 assigns 1 to the variable i. In step S1315, it acquires, as state information S1, the images captured by the imaging devices 114 and 115 of the work picking device 101 (or the number of workpieces shown in the images) and the motor torque value of each electric actuator. The state information S1 may further include sensor values that the information processing device 102 has acquired from various sensors.

In step S1320, the information processing device 102 uses the operation determination unit 804 to select the next operation pattern a_k at random. Specifically, it determines the index k of the operation pattern from a random number between 1 and n.

In step S1325, the information processing device 102 stores the state information S1 in the evaluation value function learning unit 807 and then sends the work picking device 101 a command to execute operation pattern a_k.

In step S1330, after the work picking device 101 has executed operation pattern a_k, the information processing device 102 acquires, as state information S2, the images captured by the imaging devices 114 and 115 of the work picking device 101 (or the number of workpieces shown in the images) and the motor torque value of each electric actuator. The state information S2 may further include sensor values that the information processing device 102 has acquired from various sensors.

In step S1335, the information processing device 102 calculates the number of workpieces held by the gripping mechanism 112 based on image 2, which the imaging device 115 obtains by photographing the tip of the gripping mechanism 112 from the side. In one aspect, the information processing device 102 may use existing image recognition techniques to calculate this number from image 2.

In step S1340, the information processing device 102 determines, based on the analysis result of image 2, whether the gripping mechanism 112 is holding exactly one workpiece. If so (YES in step S1340), it passes control to step S1345. Otherwise (NO in step S1340), it passes control to step S1360.

In step S1345, the information processing device 102 sets the end determination to True (complete) and sets the reward R for operation pattern a_k to 1. In this embodiment the reward R is 1 for success, -1 for failure, and 0 otherwise, but the reward is not limited to these values; it suffices that the rewards for success and for failure differ.

In step S1350, the information processing device 102 stores in the evaluation value function learning unit 807 the executed operation pattern a_k, the state information S1 and S2, the reward R (R = 1), and the end determination True (complete). In step S1355, it executes the update processing of the evaluation value function F.
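
The items stored in step S1350 form one record per executed operation pattern. The sketch below shows one possible in-memory representation; the class names, field names, and the use of a plain list are assumptions, since the patent does not describe the storage format of the evaluation value function learning unit 807.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass(frozen=True)
class Transition:
    pattern_index: int   # index k of the executed operation pattern a_k
    state_before: Any    # state information S1
    state_after: Any     # state information S2
    reward: float        # reward R
    done: bool           # end determination (True = complete)


class TransitionStore:
    """Hypothetical container for the records kept by the learning unit."""

    def __init__(self) -> None:
        self._items: List[Transition] = []

    def save(self, transition: Transition) -> None:
        self._items.append(transition)

    def read_all(self) -> List[Transition]:
        # Return a copy so callers cannot modify the stored history.
        return list(self._items)


store = TransitionStore()
store.save(Transition(pattern_index=4, state_before={"count": 3},
                      state_after={"count": 1}, reward=1.0, done=True))
```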

In step S1360, the information processing device 102 determines, based on the analysis result of image 2, whether the gripping mechanism 112 is holding no workpiece at all. If it determines that the gripping mechanism 112 is holding no workpiece (YES in step S1360), it ends the processing without generating a reward R. Otherwise (NO in step S1360), it passes control to step S1365.

In step S1365, the information processing device 102 determines whether the value of the variable i is greater than the constant N1. The constant N1 is the upper limit on the number of operation patterns that may be executed during one untangling attempt. If the value of i is greater than N1 (YES in step S1365), the information processing device 102 judges that the number of executions has reached the upper limit and passes control to step S1370. Otherwise (NO in step S1365), it passes control to step S1375.

In step S1370, the information processing device 102 sets the end determination to True and sets the reward R for operation pattern a_k to -1. The processing from step S1350 onward is as described above, with the reward R stored as -1. In step S1375, the information processing device 102 increments the value of the variable i. In step S1380, it sets the end determination to False and sets the reward R for the executed operation pattern a_k to 0.

In step S1385, the information processing device 102 stores in the evaluation value function learning unit 807 the executed operation pattern a_k, the state information S1 and S2, the reward R (R = 0), and the end determination False (not complete). In step S1390, it executes the update processing of the evaluation value function F.
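
The reward and end-determination rules of steps S1340 to S1380 can be condensed into one function. This is a sketch of the decision logic only, with a hypothetical function name; it uses the reward values 1, -1, and 0 given for this embodiment and returns None for the case in which no reward is generated.

```python
from typing import Optional, Tuple


def judge_step(gripped_count: int, i: int, n1: int) -> Optional[Tuple[float, bool]]:
    """Return (reward R, end determination), or None when no reward is generated."""
    if gripped_count == 1:       # S1340 YES -> S1345: untangling succeeded
        return (1.0, True)
    if gripped_count == 0:       # S1360 YES: nothing gripped, end without a reward
        return None
    if i > n1:                   # S1365 YES -> S1370: execution limit exceeded
        return (-1.0, True)
    return (0.0, False)          # S1375 and S1380: continue with the next pattern
```

For example, with N1 = 10, `judge_step(3, 2, 10)` yields a reward of 0 with the end determination False, so another operation pattern is tried.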

FIG. 14 is a flowchart showing an example of the learning processing for the untangling work (corresponding to step S1260 of FIG. 12). In one aspect, a program for executing the processing of FIG. 14 may be stored in the secondary storage device 703, read into the primary storage device 702, and executed by the CPU 701. In the following, the learning processing is described on the assumption that the information processing device 102 executes each step of FIG. 14. In FIG. 14, processing identical to that described above is given the same reference signs, and its description is not repeated.

In step S1410, the information processing device 102 uses the evaluation value function unit 802 to calculate the evaluation value of each operation pattern based on the state information S1. In step S1420, it refers to the operation pattern table 803 and selects the operation pattern with the highest evaluation value.

In the initial learning processing of FIG. 13, too little learning information is available, so the information processing device 102 selects the next operation pattern at random in step S1320. In the learning processing of FIG. 14, by contrast, at least a certain amount of learning information has accumulated in the evaluation value function learning unit 807, so the information processing device 102 calculates the evaluation value of each operation pattern with the evaluation value function F in step S1410. In the processing of FIG. 14 as well, the information processing device 102 updates the evaluation value function F as needed, which improves the accuracy of the untangling work.
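
The difference between the two phases lies only in how the next operation pattern is chosen. The sketch below illustrates that difference with a single hypothetical function: passing no evaluation values gives the random choice of step S1320, and passing them gives the greedy choice of steps S1410 and S1420. The function name and the optional random generator argument are assumptions.

```python
import random
from typing import Optional, Sequence


def choose_pattern_index(
    n: int,
    evaluation_values: Optional[Sequence[float]] = None,
    rng: Optional[random.Random] = None,
) -> int:
    """Return the 1-based index k of the next operation pattern a_k."""
    if evaluation_values is None:
        # Initial learning (S1320): uniform random index between 1 and n inclusive.
        generator = rng if rng is not None else random.Random()
        return generator.randint(1, n)
    if len(evaluation_values) != n:
        raise ValueError("expected one evaluation value per operation pattern")
    # Regular learning (S1410, S1420): pattern with the highest evaluation value.
    return 1 + max(range(n), key=lambda idx: evaluation_values[idx])
```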

FIG. 15 is a flowchart showing an example of the update processing of the evaluation value function F of the evaluation value function unit 802. In one aspect, a program for executing the processing of FIG. 15 may be stored in the secondary storage device 703, read into the primary storage device 702, and executed by the CPU 701. In the following, the update processing is described on the assumption that the information processing device 102 executes each step of FIG. 15.

In step S1510, the information processing device 102 reads the state information S1 and S2, the reward R, and the end determination stored in the evaluation value function learning unit 807 for each operation pattern a_k.

In step S1520, the information processing device 102 uses the data read in step S1510 to update the internal parameters of the evaluation value function F' for learning. The evaluation value function F' is prepared for learning separately from the evaluation value function F, which is used to calculate the evaluation values. The evaluation value function F' is used by the evaluation value function learning unit 807, and the evaluation value function F is used by the evaluation value function unit 802.

In step S1530, the information processing device 102 copies the evaluation value function F' to the evaluation value function F each time the learning processing has been repeated a predetermined number of times. The information processing device 102 may also execute the processing of FIG. 15 as needed during the processing of FIGS. 12 to 14.
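
Keeping F' for learning and copying it to F at intervals can be sketched as follows. The class name, the representation of the internal parameters as a dictionary, and the parameter `sync_every` (the predetermined number of repetitions) are assumptions for illustration; the actual parameter update of step S1520 is represented here only by handing in new parameter values.

```python
import copy
from typing import Any, Dict


class EvaluationFunctionPair:
    """Hypothetical holder for the parameters of F' (learning) and F (evaluation)."""

    def __init__(self, params: Dict[str, Any], sync_every: int) -> None:
        if sync_every < 1:
            raise ValueError("sync_every must be at least 1")
        self.learning_params = copy.deepcopy(params)    # F'
        self.acting_params = copy.deepcopy(params)      # F
        self.sync_every = sync_every
        self.updates = 0

    def apply_update(self, new_params: Dict[str, Any]) -> None:
        self.learning_params = copy.deepcopy(new_params)    # S1520: update F'
        self.updates += 1
        if self.updates % self.sync_every == 0:              # S1530: copy F' to F
            self.acting_params = copy.deepcopy(self.learning_params)


pair = EvaluationFunctionPair({"w": 0.0}, sync_every=2)
pair.apply_update({"w": 0.1})   # F still holds w = 0.0
pair.apply_update({"w": 0.2})   # second update triggers the copy, F now holds w = 0.2
```

Selecting operation patterns with F while training F' means the evaluation values used for selection change only at each copy rather than after every update.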

The learning processing of the evaluation value function F is described in detail below. Because the evaluation value function F is a neural network, a teacher signal is required for learning. The information processing device 102 determines the teacher signal y according to the end determination, as follows.

When the end determination of the untangling processing is True, the teacher signal y is as follows.

When the end determination of the untangling processing is False, the teacher signal y is as follows.
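The equations for the two cases are not reproduced in this text. A form commonly used in Q-learning, and consistent with the surrounding description, is sketched below. The discount factor \gamma and the notation are assumptions, not the formulas of the embodiment; s' is taken to be the state after the operation pattern has been executed, and a' ranges over the available operation patterns.

```latex
y =
\begin{cases}
R, & \text{end determination is True} \\
R + \gamma \max_{a'} Q(s', a'), & \text{end determination is False}
\end{cases}
\qquad 0 \le \gamma \le 1
```

Under this reading, a completed untangling episode contributes only its own reward, while an unfinished one also counts the best evaluation value expected from the next state.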

Here, "s' = T2, a'" denotes the operation pattern that maximizes "Q(s, a)". The information processing device 102 obtains the squared error E between the above teacher signal y and the evaluation value function F, and trains the neural network by the error backpropagation method. The evaluation value function F is expressed by the following equation (3).

The information processing device 102 then substitutes equation (3) into the following equation (4) to calculate the error.

The error backpropagation method optimizes the internal parameters of the neural network so that E approaches 0. Accordingly, the value of the following equation (5) approaches 0 as learning progresses.

In reinforcement learning, likewise, the following equation (6) holds once learning has been performed sufficiently. Training the neural network by the error backpropagation method therefore yields a result equivalent to the learning result of reinforcement learning.
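As a worked illustration of how minimizing the squared error moves the evaluation value toward the teacher signal, the sketch below replaces the neural network with a single scalar value q. The function names, the learning rate, and the form E = (y - q)^2 are assumptions chosen for illustration; equations (3) to (6) of the embodiment are not reproduced in this text.

```python
def squared_error(y, q):
    """Squared error E between the teacher signal y and the current evaluation value q."""
    return (y - q) ** 2


def gradient_step(y, q, lr=0.25):
    """One gradient-descent step on E with respect to q, using dE/dq = -2 * (y - q)."""
    return q + lr * 2.0 * (y - q)


def train(y, q0, steps):
    """Apply repeated gradient steps and record E after each one."""
    q, errors = q0, []
    for _ in range(steps):
        q = gradient_step(y, q)
        errors.append(squared_error(y, q))
    return q, errors
```

For example, `train(1.0, 0.0, 3)` halves the gap between q and y on every step, so the recorded errors shrink toward 0 as the text describes.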

As described above, the work picking device 101 according to the present embodiment does not have a serial multi-joint structure; instead, it consists only of linear motion mechanisms and a parallel link mechanism. As a result, the singularity problem of articulated robots does not arise, and the device can work in less space than an articulated robot. In machine learning as well, the work picking device 101 can use as learning data only the motor torque values of the electric actuators attached to the base-end link hub of the parallel link mechanism and of the electric actuators of the position adjusting device. Compared with an articulated robot, the work picking device 101 therefore has fewer learning parameters in the workpiece untangling processing, which makes machine learning easier. This makes it possible to improve the accuracy of the untangling processing when taking out workpieces that tangle easily, such as C-shaped circlips, springs, and coils.

The embodiments disclosed herein should be considered illustrative in all respects and not restrictive. The scope of the present invention is indicated by the claims rather than by the above description, and is intended to include all modifications within the meaning and scope equivalent to the claims.

Reference Signs List: 11, 108A, 108B, 108C electric actuator; 32 first link hub; 33 second link hub; 34 link mechanism; 35, 36 end link member; 37 center link member; 42, 55, 73, 75 rotating shaft; 62 reduction mechanism; 63 motor fixing member; 100 work system; 101 work picking device; 102 information processing device; 103 control device; 104 base; 105 first linear motion unit; 106 second linear motion unit; 107 third linear motion unit; 109 working head; 110 rotation unit mounting member; 111 angle adjustment mechanism; 112 gripping mechanism; 113 work container; 114, 115 imaging device; 702 primary storage device; 703 secondary storage device; 704 external device interface; 705 input interface; 706 output interface; 707 communication interface; 801 signal input unit; 802 evaluation value function unit; 803 operation pattern table; 804 operation determination unit; 805 command generation unit; 806 operation result determination unit; 807 evaluation value function learning unit.

Claims

1. A working device for taking out workpieces, comprising:
a gripping unit that takes a workpiece out of a container and grips it;
an angle adjustment unit to which the gripping unit is attached and which adjusts the orientation of the gripping unit;
a working head on which the angle adjustment unit is mounted;
a position adjustment unit that moves the working head by means of a plurality of drive units;
a first imaging device that captures an image of the workpiece gripped by the gripping unit; and
an information processing device that controls the working device,
wherein the angle adjustment unit includes:
first and second link hubs;
a plurality of links arranged in parallel between the first and second link hubs; and
a plurality of drive units that drive the respective links,
and wherein the information processing device:
detects the number of workpieces gripped by the gripping unit based on the image captured by the first imaging device;
based on the number of workpieces gripped by the gripping unit being two or more, uses the number and the image of the workpieces gripped by the gripping unit and the torque of each drive unit of the angle adjustment unit as parameters of a machine learning model, and determines, by the machine learning model, the respective drive signals to be transmitted to the drive units of the position adjustment unit and the angle adjustment unit; and
causes the gripping unit to perform a workpiece untangling operation by driving each drive unit of the position adjustment unit and each drive unit of the angle adjustment unit based on the determined drive signals.

2. The working device according to claim 1, wherein the information processing device:
after the gripping unit has performed the untangling operation, again detects the number of workpieces gripped by the gripping unit based on an image captured by the first imaging device; and
based on the number of workpieces gripped by the gripping unit being two or more, causes the gripping unit to perform the workpiece untangling operation again.

3. The working device according to claim 1 or 2, wherein the information processing device:
determines whether the number of workpieces gripped by the gripping unit is one based on the image captured by the first imaging device;
based on the number of workpieces gripped by the gripping unit being one, determines that the workpiece untangling work has been completed; and
generates reward data used for training the machine learning model and adds the reward data to the training data input to the machine learning model.

4. The working device according to claim 3, wherein the information processing device:
determines whether the number of workpieces gripped by the gripping unit is zero based on the image captured by the first imaging device; and
based on the number of workpieces gripped by the gripping unit being zero, ends the untangling work without generating the reward data used for training the machine learning model.

5. The working device according to any one of claims 1 to 4, further comprising a second imaging device that captures an image of the workpieces in the container,
wherein the information processing device:
determines whether there is a workpiece in the container based on the image captured by the second imaging device; and
based on determining that there is a workpiece in the container, transmits, to each drive unit of the position adjustment unit and the angle adjustment unit, the drive signal for moving the gripping unit to a predetermined take-out position.

6. The working device according to claim 5, wherein the information processing device,
upon determining that there is no workpiece in the container based on the image captured by the second imaging device, ends the work of taking workpieces out of the container.

7. The working device according to any one of claims 1 to 6, wherein the information processing device, when determining the respective drive signals to be transmitted to the drive units of the position adjustment unit and the angle adjustment unit, further acquires the torque of each drive unit of the position adjustment unit and adds the torque of each drive unit of the position adjustment unit to the parameters of the machine learning model.

8. The working device according to any one of claims 1 to 7, wherein the respective drive signals transmitted to the drive units of the position adjustment unit and the angle adjustment unit include information on the command torque, the rotation speed, and the rotation amount of each drive unit.

9. The working device according to any one of claims 1 to 8, wherein the position adjustment unit includes a three-axis linear motion mechanism.

10. The working device according to any one of claims 1 to 9, wherein the drive units of the position adjustment unit are stepping motors.

11. The working device according to any one of claims 1 to 10, wherein the drive units of the angle adjustment unit are stepping motors.
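For illustration only, the count-based control flow recited for the untangling operation (repeat while two or more workpieces are gripped, finish with reward data at exactly one, finish without reward data at zero) might be sketched as follows. The function name `untangle_loop`, the callables `count_workpieces` and `do_untangle`, the reward value 1.0, and the `max_attempts` limit are all hypothetical and are not taken from the claims.

```python
def untangle_loop(count_workpieces, do_untangle, max_attempts=10):
    """Repeat the untangling operation until one or zero workpieces remain gripped.

    count_workpieces: callable returning the number of gripped workpieces
                      (standing in for detection from the first imaging device).
    do_untangle:      callable taking that number and performing one untangling
                      operation (standing in for the model-driven drive signals).
    """
    for attempt in range(max_attempts):
        n = count_workpieces()
        if n == 1:
            # Untangling complete: reward data is generated (value assumed).
            return {"completed": True, "reward": 1.0, "attempts": attempt}
        if n == 0:
            # Nothing gripped: end without generating reward data.
            return {"completed": False, "reward": None, "attempts": attempt}
        do_untangle(n)
    # Assumed safeguard: give up after max_attempts operations.
    return {"completed": False, "reward": None, "attempts": max_attempts}
```

The attempt limit is an added safeguard so that the sketch always terminates; the claims themselves do not recite such a limit.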