JP2022118954A - Object detection device and autonomous mobile body

Info

Publication number: JP2022118954A
Application number: JP2021015822A
Authority: JP
Inventors: 将哉南田; Masaya Minamida; 憲司山村; Kenji Yamamura; 朋晃野々目; Tomoaki Nonome
Original assignee: Toyota Industries Corp
Current assignee: Toyota Industries Corp
Priority date: 2021-02-03
Filing date: 2021-02-03
Publication date: 2022-08-16

Abstract

To suppress a reduction in the accuracy with which an object detection device detects a target object. SOLUTION: An object detection device applies bounding boxes surrounding a target object captured in an image. The object detection device deletes bounding boxes by NMS. The object detection device also calculates an inclusion ratio for two mutually overlapping bounding boxes. The inclusion ratio is calculated by dividing the area of the intersection of the two mutually overlapping bounding boxes by the area of one of the two bounding boxes. The object detection device deletes one of the two mutually overlapping bounding boxes on the basis of the inclusion ratio. SELECTED DRAWING: Figure 3

Description

本開示は、物体検出装置、及び自律移動体に関する。 The present disclosure relates to an object detection device and an autonomous mobile body.

特許文献１に開示の物体検出装置は、画像から物体を検出する。物体検出装置は、検出する対象となる対象物体を囲むバウンディングボックスを付与する。バウンディングボックスは、対象物体が存在する領域の候補である。バウンディングボックスは、１つの対象物体に対して複数付与される場合がある。特許文献１に開示の物体検出装置は、１つの対象物体に対してバウンディングボックスが複数付与された場合、NMS（Non Maximum Suppression）によってバウンディングボックスを削除している。NMSは、互いに重複する２つのバウンディングボックスの重複割合が閾値を超えた場合、信頼度スコアの低い方のバウンディングボックスを削除する処理である。重複割合は、互いに重複している２つのバウンディングボックスの積集合の領域を当該２つのバウンディングボックスの和集合の領域で除算することで算出される。 An object detection device disclosed in Patent Document 1 detects an object from an image. An object detection device provides a bounding box surrounding a target object to be detected. A bounding box is a candidate for the region in which the target object exists. A plurality of bounding boxes may be given to one target object. The object detection device disclosed in Patent Document 1 deletes the bounding box by NMS (Non Maximum Suppression) when a plurality of bounding boxes are given to one target object. NMS is a process of deleting a bounding box with a lower reliability score when the overlapping ratio of two bounding boxes that overlap with each other exceeds a threshold. The overlap ratio is calculated by dividing the intersection area of two bounding boxes that overlap each other by the area of the union of the two bounding boxes.

特開２０２０－７１７９３号公報 JP 2020-71793 A

NMSは、重複割合が閾値を超えない場合には、バウンディングボックスの削除を行わない。このため、NMSによってバウンディングボックスを削除した後であっても、同一の対象物体に対して複数のバウンディングボックスが付与されている場合がある。例えば、２つのバウンディングボックスの領域の大きさが異なる場合にはNMSによるバウンディングボックスの削除を行えない場合がある。この場合、いずれのバウンディングボックスに対象物体が存在しているかを特定しにくく、物体検出装置による対象物体の検出精度の低下を招くおそれがある。 The NMS does not remove bounding boxes if the overlap ratio does not exceed the threshold. Therefore, even after the bounding box is deleted by the NMS, a plurality of bounding boxes may be assigned to the same target object. For example, if the two bounding boxes have different sizes, the NMS may not be able to delete the bounding boxes. In this case, it is difficult to specify in which bounding box the target object exists, and there is a possibility that the detection accuracy of the target object by the object detection device may be lowered.

上記課題を解決する物体検出装置は、画像から対象物体を検出する物体検出装置であって、前記画像に映る前記対象物体を囲むバウンディングボックスを付与する付与部と、前記バウンディングボックスが複数存在する場合に、互いに重複している２つの前記バウンディングボックスの積集合の領域を当該２つの前記バウンディングボックスの和集合の領域で除算することで重複割合を算出し、前記重複割合が閾値を超えている場合には、互いに重複している２つの前記バウンディングボックスのうち信頼度スコアが低い方を削除する第１削除部と、前記バウンディングボックスが複数存在する場合に、互いに重複している２つの前記バウンディングボックスの積集合の領域を当該２つの前記バウンディングボックスのうち一方の領域で除算することで包含割合を算出する包含割合算出部と、前記包含割合に基づき、互いに重複している２つの前記バウンディングボックスの一方を削除する第２削除部と、を備える。 An object detection device that solves the above problem is an object detection device that detects a target object from an image, and includes: an applying unit that applies a bounding box surrounding the target object appearing in the image; a first deletion unit that, when a plurality of the bounding boxes exist, calculates an overlap ratio by dividing the area of the intersection of two mutually overlapping bounding boxes by the area of the union of the two bounding boxes and, when the overlap ratio exceeds a threshold, deletes whichever of the two mutually overlapping bounding boxes has the lower reliability score; an inclusion ratio calculation unit that, when a plurality of the bounding boxes exist, calculates an inclusion ratio by dividing the area of the intersection of two mutually overlapping bounding boxes by the area of one of the two bounding boxes; and a second deletion unit that deletes one of the two mutually overlapping bounding boxes based on the inclusion ratio.

第２削除部は、包含割合によってバウンディングボックスを削除する。このため、重複割合によって削除を行えないバウンディングボックスであっても、第２削除部による削除を行い得る。第１削除部によるバウンディングボックスの削除と、第２削除部によるバウンディングボックスの削除とを行うことで、第１削除部によるバウンディングボックスの削除のみを行う場合に比べて、同一の対象物体に付与されたバウンディングボックスの数を減らすことができる。これにより、物体検出装置による対象物体の検出精度の低下を抑制できる。 The second deletion unit deletes bounding boxes according to the inclusion ratio. Therefore, even a bounding box that cannot be deleted based on the overlap ratio can be deleted by the second deletion unit. Performing both the deletion by the first deletion unit and the deletion by the second deletion unit reduces the number of bounding boxes applied to the same target object, compared with performing only the deletion by the first deletion unit. As a result, a decrease in the accuracy with which the object detection device detects the target object can be suppressed.

上記課題を解決する自律移動体は、対象物体を追跡する自律移動体であって、移動体と、前記移動体に設けられており、前記対象物体を撮像するカメラと、物体検出装置と、を備え、前記物体検出装置は、前記カメラから画像を取得する取得部と、前記画像に映る前記対象物体を囲むバウンディングボックスを付与する付与部と、前記バウンディングボックスが複数存在する場合に、互いに重複している２つの前記バウンディングボックスの積集合の領域を当該２つの前記バウンディングボックスの和集合の領域で除算することで重複割合を算出し、前記重複割合が閾値を超えている場合には、互いに重複している２つの前記バウンディングボックスのうち信頼度スコアが低い方を削除する第１削除部と、前記バウンディングボックスが複数存在する場合に、互いに重複している２つの前記バウンディングボックスの積集合の領域を当該２つの前記バウンディングボックスのうち一方の領域で除算することで包含割合を算出する包含割合算出部と、前記包含割合に基づき、互いに重複している２つの前記バウンディングボックスの一方を削除する第２削除部と、を備え、前記自律移動体は、前記第１削除部、及び前記第２削除部により前記バウンディングボックスの削除を行った後に残った前記バウンディングボックスに基づき、前記移動体と前記対象物体との相対位置を導出する導出部と、前記相対位置に基づき、前記対象物体を追跡するように前記移動体を移動させる移動制御部と、を備える。 An autonomous mobile body that solves the above problem is an autonomous mobile body that tracks a target object, and includes a mobile body, a camera that is provided on the mobile body and captures images of the target object, and an object detection device. The object detection device includes: an acquisition unit that acquires an image from the camera; an applying unit that applies a bounding box surrounding the target object appearing in the image; a first deletion unit that, when a plurality of the bounding boxes exist, calculates an overlap ratio by dividing the area of the intersection of two mutually overlapping bounding boxes by the area of the union of the two bounding boxes and, when the overlap ratio exceeds a threshold, deletes whichever of the two has the lower reliability score; an inclusion ratio calculation unit that, when a plurality of the bounding boxes exist, calculates an inclusion ratio by dividing the area of the intersection of two mutually overlapping bounding boxes by the area of one of the two bounding boxes; and a second deletion unit that deletes one of the two mutually overlapping bounding boxes based on the inclusion ratio. The autonomous mobile body further includes: a derivation unit that derives the relative position between the mobile body and the target object based on the bounding box remaining after the deletions by the first deletion unit and the second deletion unit; and a movement control unit that moves the mobile body so as to track the target object based on the relative position.

物体検出装置による対象物体の検出精度の低下を抑制できる。自律移動体は、物体検出装置によって検出された対象物体を追跡する。これにより、自律移動体は、同一の対象物体を追跡することができる。 It is possible to suppress deterioration in detection accuracy of the target object by the object detection device. The autonomous mobile body tracks the target object detected by the object detection device. This allows the autonomous mobile body to track the same target object.

本発明によれば、物体検出装置による対象物体の検出精度の低下を抑制できる。 According to the present invention, a decrease in the accuracy with which the object detection device detects the target object can be suppressed.

自律移動体を示す概略図。 FIG. 1 is a schematic diagram showing the autonomous mobile body.
自律移動体を示すブロック図。 FIG. 2 is a block diagram showing the autonomous mobile body.
物体検出処理を示すフローチャート。 FIG. 3 is a flowchart showing the object detection process.
対象物体が映る画像を示す図。 FIG. 4 is a diagram showing an image in which the target object appears.
バウンディングボックスが付与された画像を示す図。 FIG. 5 is a diagram showing the image with bounding boxes applied.
重複割合によるバウンディングボックスの削除を行った後の画像を示す図。 FIG. 6 is a diagram showing the image after bounding boxes have been deleted based on the overlap ratio.
包含割合によるバウンディングボックスの削除を行った後の画像を示す図。 FIG. 7 is a diagram showing the image after bounding boxes have been deleted based on the inclusion ratio.
移動処理を示すフローチャート。 FIG. 8 is a flowchart showing the movement process.

以下、物体検出装置、及び自律移動体の一実施形態について説明する。 An embodiment of an object detection device and an autonomous mobile body will be described below.
図１に示すように、自律移動体１０は、車両２０と、制御ユニットＣＵと、外界センサ３１と、カメラ４１と、を備える。車両２０は、車体２１と、複数の車輪２２と、駆動機構２３と、を備える。車両２０は、制御装置３２に制御されることで、対象物体Ｔを追跡するように自律移動する移動体である。自律移動体１０は、対象物体Ｔとの離間距離が所定の範囲内となるように移動する。対象物体Ｔは、人である。 As shown in FIG. 1, the autonomous mobile body 10 includes a vehicle 20, a control unit CU, an external sensor 31, and a camera 41. The vehicle 20 includes a vehicle body 21, a plurality of wheels 22, and a drive mechanism 23. The vehicle 20 is a mobile body that, under the control of the control device 32, moves autonomously so as to track the target object T. The autonomous mobile body 10 moves so that its separation distance from the target object T stays within a predetermined range. The target object T is a person.

図２に示すように、駆動機構２３は、車輪２２を回転させるためのモータ２４と、モータ２４を駆動させるモータドライバ２５と、エンコーダ２６と、を備える。制御ユニットＣＵは、制御装置３２と、物体検出装置５１と、を備える。なお、図示は省略するが、モータ２４及びモータドライバ２５は、車輪２２の数と同数設けられる。これにより、各車輪２２の回転数と回転方向を独立して制御することが可能である。エンコーダ２６は、車輪２２毎に個別に設けられている。 As shown in FIG. 2, the drive mechanism 23 includes a motor 24 for rotating the wheels 22, a motor driver 25 for driving the motor 24, and an encoder 26. The control unit CU includes a control device 32 and an object detection device 51. Although not shown, as many motors 24 and motor drivers 25 are provided as there are wheels 22. This makes it possible to control the rotation speed and rotation direction of each wheel 22 independently. An encoder 26 is provided individually for each wheel 22.

モータドライバ２５には、制御装置３２から指令が入力される。モータドライバ２５は、制御装置３２からの指令に応じてモータ２４を制御する。 A command is input to the motor driver 25 from the control device 32. The motor driver 25 controls the motor 24 according to commands from the control device 32.
エンコーダ２６は、例えば、モータ２４の回転軸の回転量に基づいたパルス信号を出力するインクリメンタル型のエンコーダである。エンコーダ２６は、モータ２４の回転軸の回転数を検出する。モータドライバ２５は、エンコーダ２６の検出結果から、モータ２４の回転数、及び回転方向を認識可能である。車両２０は、モータ２４の駆動による車輪２２の回転によって移動する。 The encoder 26 is, for example, an incremental encoder that outputs a pulse signal based on the amount of rotation of the rotating shaft of the motor 24. The encoder 26 detects the rotation speed of the rotating shaft of the motor 24. The motor driver 25 can recognize the rotation speed and rotation direction of the motor 24 from the detection result of the encoder 26. The vehicle 20 moves as the wheels 22 are rotated by the motor 24.

外界センサ３１としては、制御装置３２に車両２０の周辺に存在する物体を認識させることができ、かつ、自律移動体１０から物体までの距離を測定できるものが用いられる。物体は、対象物体Ｔ及び障害物を含む。障害物は、対象物体Ｔとは異なる物体である。 The external sensor 31 is a sensor that allows the control device 32 to recognize objects around the vehicle 20 and that can measure the distance from the autonomous mobile body 10 to an object. The objects include the target object T and obstacles. An obstacle is an object different from the target object T.

本実施形態では、外界センサ３１として、レーザー距離計を用いている。レーザー距離計は、LIDAR（Laser Imaging Detection and Ranging）、あるいは、レーザーレンジファインダと呼ばれることもある。外界センサ３１は、レーザーを周辺に照射し、レーザーが当たった部分によって反射された反射光を受光することで周辺環境を認識可能な距離計である。本実施形態の外界センサ３１としては、水平方向の照射角度を変更しながらレーザーを照射する二次元距離計が用いられる。 In this embodiment, a laser distance meter is used as the external sensor 31. A laser distance meter is also called a LIDAR (Laser Imaging Detection and Ranging) or a laser rangefinder. The external sensor 31 is a distance meter that can recognize the surrounding environment by emitting a laser into the surroundings and receiving the light reflected by the portion the laser hits. The external sensor 31 of this embodiment is a two-dimensional distance meter that emits the laser while changing the irradiation angle in the horizontal direction.

外界センサ３１は、周囲にレーザーを照射し、レーザーが当たった点から反射された反射光を受光することで点までの距離を導出する。レーザーが当たった点は、物体の表面の一部を表す点である。点の位置は、極座標系の座標で表すことができる。極座標系における点の座標は、直交座標系の座標に変換することができる。極座標系から直交座標系への変換は、外界センサ３１によって行われてもよいし、制御装置３２で行われてもよい。本実施形態では、外界センサ３１により極座標系から直交座標系への変換が行われているとする。外界センサ３１は、センサ座標系での点の座標を導出する。センサ座標系は、外界センサ３１を原点とする直交座標系である。センサ座標系は、例えば、水平方向のうち１方向をＸ軸、水平方向のうちＸ軸に直交する方向をＹ軸とする座標系である。適宜、センサ座標系の座標をセンサ座標と称する。外界センサ３１は、レーザーを照射することにより得られた複数の点の座標を点群データとして制御装置３２に出力する。 The external sensor 31 irradiates the surroundings with a laser beam and receives reflected light reflected from the point hit by the laser beam, thereby deriving the distance to the point. The point hit by the laser is a point that represents a portion of the object's surface. The position of a point can be represented by coordinates in a polar coordinate system. The coordinates of a point in a polar coordinate system can be transformed into coordinates in a rectangular coordinate system. The conversion from the polar coordinate system to the orthogonal coordinate system may be performed by the external sensor 31 or may be performed by the control device 32 . In this embodiment, it is assumed that the external sensor 31 performs conversion from the polar coordinate system to the orthogonal coordinate system. The external sensor 31 derives the coordinates of points in the sensor coordinate system. The sensor coordinate system is an orthogonal coordinate system with the external sensor 31 as the origin. The sensor coordinate system is, for example, a coordinate system in which one of the horizontal directions is the X-axis and the horizontal direction perpendicular to the X-axis is the Y-axis. Coordinates of the sensor coordinate system are appropriately referred to as sensor coordinates. The external sensor 31 outputs the coordinates of a plurality of points obtained by irradiating the laser to the control device 32 as point group data.
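The conversion from polar coordinates to sensor coordinates described above can be sketched as follows. This is a minimal illustration, not the sensor's actual firmware: the function names and the convention that the horizontal angle is measured counterclockwise from the X axis, in radians, are assumptions that the text does not specify.

```python
import math

def polar_to_sensor_xy(distance, angle_rad):
    """Convert one laser return (range, horizontal angle) to sensor X/Y.

    Assumption: the angle is measured from the sensor's X axis.
    """
    return (distance * math.cos(angle_rad), distance * math.sin(angle_rad))

def scan_to_point_cloud(scan):
    """Convert a list of (distance, angle) returns into point cloud data."""
    return [polar_to_sensor_xy(distance, angle) for distance, angle in scan]

# Hypothetical scan: one return straight ahead, one at 90 degrees.
points = scan_to_point_cloud([(1.0, 0.0), (2.0, math.pi / 2)])
```

Each entry of `points` is a sensor coordinate with the external sensor at the origin, which is the form in which the point cloud data is passed to the control device.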

制御装置３２は、プロセッサ３３と、記憶部３４と、を備える。プロセッサ３３としては、例えば、CPU（Central Processing Unit）、GPU（Graphics Processing Unit）、又はDSP（Digital Signal Processor）が用いられる。記憶部３４は、RAM（Random access memory）及びROM（Read Only Memory）を含む。記憶部３４は、処理をプロセッサ３３に実行させるように構成されたプログラムコードまたは指令を格納している。記憶部３４、即ち、コンピュータ可読媒体は、汎用または専用のコンピュータでアクセスできるあらゆる利用可能な媒体を含む。制御装置３２は、ASIC（Application Specific Integrated Circuit）やFPGA（Field Programmable Gate Array）等のハードウェア回路によって構成されていてもよい。処理回路である制御装置３２は、コンピュータプログラムに従って動作する１つ以上のプロセッサ、ASICやFPGA等の１つ以上のハードウェア回路、或いは、それらの組み合わせを含み得る。 The control device 32 includes a processor 33 and a storage section 34 . As the processor 33, for example, a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), or a DSP (Digital Signal Processor) is used. The storage unit 34 includes RAM (Random Access Memory) and ROM (Read Only Memory). Storage unit 34 stores program code or instructions configured to cause processor 33 to perform processing. Storage 34, or computer-readable media, includes any available media that can be accessed by a general purpose or special purpose computer. The control device 32 may be configured by a hardware circuit such as an ASIC (Application Specific Integrated Circuit) or an FPGA (Field Programmable Gate Array). The processing circuitry, controller 32, may include one or more processors operating according to a computer program, one or more hardware circuits such as ASICs or FPGAs, or a combination thereof.

カメラ４１は、ＲＧＢカメラである。カメラ４１は、撮像素子を備える。撮像素子としては、ＣＣＤイメージセンサ、及びＣＭＯＳイメージセンサを挙げることができる。カメラ４１は、赤、緑及び青の３色のカラー信号で構成された画像を出力する。 The camera 41 is an RGB camera. The camera 41 includes an image sensor. Examples of the image sensor include a CCD image sensor and a CMOS image sensor. The camera 41 outputs an image composed of color signals for the three colors red, green, and blue.

カメラ４１は、外界センサ３１と同一方向を向くように設けられている。詳細にいえば、カメラ４１は、外界センサ３１による物体の検出可能範囲とカメラ４１による撮像範囲とが同一方向を向くように設けられている。カメラ４１は、所定のフレームレートで撮像を行って画像を生成する。 The camera 41 is provided so as to face the same direction as the external sensor 31. Specifically, the camera 41 is provided so that the range in which the external sensor 31 can detect objects and the imaging range of the camera 41 face the same direction. The camera 41 captures images at a predetermined frame rate.

物体検出装置５１は、プロセッサ５２と、記憶部５３と、を備える。プロセッサ５２としては、例えば、CPU、GPU、又はDSPが用いられる。記憶部５３は、RAM及びROMを含む。記憶部５３は、処理をプロセッサ５２に実行させるように構成されたプログラムコードまたは指令を格納している。記憶部５３、即ち、コンピュータ可読媒体は、汎用または専用のコンピュータでアクセスできるあらゆる利用可能な媒体を含む。物体検出装置５１は、ASICやFPGA等のハードウェア回路によって構成されていてもよい。処理回路である物体検出装置５１は、コンピュータプログラムに従って動作する１つ以上のプロセッサ、ASICやFPGA等の１つ以上のハードウェア回路、或いは、それらの組み合わせを含み得る。物体検出装置５１は、制御装置３２とは異なる装置であってもよいし、制御装置３２と同一の装置であってもよい。即ち、物体検出装置５１は、制御装置３２の一機能であってもよい。 The object detection device 51 includes a processor 52 and a storage section 53 . A CPU, GPU, or DSP, for example, is used as the processor 52 . The storage unit 53 includes RAM and ROM. Storage unit 53 stores program code or instructions configured to cause processor 52 to perform processing. Storage 53, or computer-readable media, includes any available media that can be accessed by a general purpose or special purpose computer. The object detection device 51 may be configured by a hardware circuit such as ASIC or FPGA. The object detection device 51, which is a processing circuit, may include one or more processors operating according to a computer program, one or more hardware circuits such as ASIC and FPGA, or a combination thereof. The object detection device 51 may be a device different from the control device 32 or may be the same device as the control device 32 . That is, the object detection device 51 may be one function of the control device 32 .

物体検出装置５１は、カメラ４１によって撮像された画像から物体と、物体の位置と、を検出する。物体のクラスは、少なくとも、対象物体Ｔを検出できるように設定されている。本実施形態では、対象物体Ｔが「人」なので、物体のクラスは、少なくとも「人」を含む。物体検出装置５１は、画像から人と、人の位置とを検出するといえる。本実施形態では、画像から対象物体Ｔである人のみを検出する場合について説明するが、物体のクラスに「人」以外を設定することで、物体検出装置５１に多クラス分類を行わせてもよい。 The object detection device 51 detects an object and the position of the object from the image captured by the camera 41. The object classes are set so that at least the target object T can be detected. In this embodiment, the target object T is a "person", so the object classes include at least "person". In other words, the object detection device 51 detects a person and the position of the person from the image. This embodiment describes the case in which only a person, the target object T, is detected from the image, but the object detection device 51 may be made to perform multi-class classification by setting classes other than "person".

物体検出装置５１が行う物体検出処理について説明する。物体検出処理は、カメラ４１によって撮像された画像から対象物体Ｔを検出する処理である。物体検出処理は、所定の制御周期で繰り返し行われる。 The object detection process performed by the object detection device 51 will now be described. The object detection process detects the target object T from the image captured by the camera 41. The object detection process is repeated at a predetermined control cycle.

図３に示すように、ステップＳ１において、物体検出装置５１は、カメラ４１から画像を取得する。ステップＳ１の処理を行うことで、物体検出装置５１は、取得部を備えているといえる。本実施形態において、物体検出装置５１は、図４に示す画像ＩＭを取得したとする。図４に示す画像ＩＭには、対象物体Ｔである人の後ろ姿を図示している。以下、図４に示す画像ＩＭを例に挙げて説明を行う。 As shown in FIG. 3, in step S1 the object detection device 51 acquires an image from the camera 41. By performing the process of step S1, the object detection device 51 can be said to include an acquisition unit. In this embodiment, it is assumed that the object detection device 51 has acquired the image IM shown in FIG. 4. The image IM shown in FIG. 4 shows the back view of a person who is the target object T. The following description uses the image IM shown in FIG. 4 as an example.

図３及び図５に示すように、ステップＳ２において、物体検出装置５１は、画像ＩＭにバウンディングボックスＢを付与する。バウンディングボックスＢは、画像ＩＭに映る対象物体Ｔを囲む枠である。バウンディングボックスＢは、対象物体Ｔが存在する領域の候補である。バウンディングボックスＢは、１つの対象物体Ｔに対して１又は複数付与される。バウンディングボックスＢは、対象物体Ｔの一部、あるいは、対象物体Ｔの全部を囲む。図５に示す例では、対象物体Ｔに対して、５つのバウンディングボックスＢが付与されている。適宜、５つのバウンディングボックスＢのそれぞれに、個別に符号Ｂ１，Ｂ２，Ｂ３，Ｂ４，Ｂ５を付して説明を行う。なお、説明の便宜上、図５に示すバウンディングボックスＢの大きさの差異、及びバウンディングボックスＢ同士のずれ量は誇張して表現している。バウンディングボックスＢには、バウンディングボックスＢ毎に信頼度スコアが設定されている。信頼度スコアとは、バウンディングボックスＢで囲まれる領域に存在する物体が対象物体Ｔであることの信頼度の指標である。本実施形態であれば、バウンディングボックスＢで囲まれる領域に存在する物体が人であることの信頼度の指標である。信頼度スコアが高いほど、バウンディングボックスＢで囲まれる領域に存在する物体が対象物体Ｔである確率が高いといえる。 As shown in FIGS. 3 and 5, in step S2 the object detection device 51 applies bounding boxes B to the image IM. A bounding box B is a frame surrounding the target object T appearing in the image IM. A bounding box B is a candidate for the area where the target object T exists. One or more bounding boxes B are applied to one target object T. A bounding box B encloses part or all of the target object T. In the example shown in FIG. 5, five bounding boxes B are applied to the target object T. Where appropriate, the five bounding boxes B are individually denoted B1, B2, B3, B4, and B5. For convenience of explanation, the differences in size among the bounding boxes B shown in FIG. 5 and the offsets between them are exaggerated. A reliability score is set for each bounding box B. The reliability score is an index of how reliably the object in the area surrounded by the bounding box B is the target object T. In this embodiment, it is an index of how reliably the object in the area surrounded by the bounding box B is a person. The higher the reliability score, the higher the probability that the object in the area surrounded by the bounding box B is the target object T.

ステップＳ２の処理は、例えば、機械学習によって学習を行った学習済みモデルを用いて行われる。学習済みモデルは、記憶部５３に記憶されている。学習モデルとしては、例えば、RCNN(Regional Convolutional Neural Network)、fast RCNN、faster RCNN、YOLO(You Only Look Once)、及びSSD(Single Shot Detector)を挙げることができる。即ち、学習モデルとしては、領域単位で物体認識を行うモデルを用いている。学習モデルとして、R-CNNを用いた場合、物体検出装置５１は、カメラ４１から取得した画像ＩＭから複数の候補領域を抽出する。候補領域とは、画像ＩＭにおいて物体が含まれている可能性のある領域である。それぞれの候補領域の特徴量は、CNNにより計算される。物体検出装置５１は、この特徴量に基づき、候補領域の信頼度スコアを算出する。物体検出装置５１は、信頼度スコアが閾値よりも高い候補領域をバウンディングボックスＢとして出力する。ステップＳ２の処理を行うことで、物体検出装置５１は、付与部を備えているといえる。 The process of step S2 is performed using, for example, a trained model that has been trained by machine learning. A trained model is stored in the storage unit 53 . Examples of learning models include RCNN (Regional Convolutional Neural Network), fast RCNN, faster RCNN, YOLO (You Only Look Once), and SSD (Single Shot Detector). That is, as a learning model, a model that performs object recognition on a region-by-region basis is used. When using R-CNN as a learning model, the object detection device 51 extracts a plurality of candidate regions from the image IM obtained from the camera 41 . A candidate area is an area that may contain an object in the image IM. The feature amount of each candidate region is calculated by CNN. The object detection device 51 calculates the reliability score of the candidate area based on this feature quantity. The object detection device 51 outputs, as a bounding box B, a candidate area whose reliability score is higher than the threshold. By performing the process of step S2, it can be said that the object detection device 51 is provided with the applying unit.
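The last part of step S2, in which candidate regions whose reliability score is higher than a threshold are output as bounding boxes B, can be sketched as below. This is only an illustration under stated assumptions: the `Candidate` type, the `(x1, y1, x2, y2)` corner format, and the threshold value are hypothetical, and the scores would in practice come from the trained model rather than being written by hand.

```python
from typing import List, NamedTuple, Tuple

class Candidate(NamedTuple):
    """Hypothetical container for one candidate region and its score."""
    box: Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed format
    score: float                            # reliability score from the model

def select_bounding_boxes(candidates: List[Candidate],
                          score_threshold: float) -> List[Candidate]:
    """Keep only candidates whose score is strictly higher than the threshold."""
    return [c for c in candidates if c.score > score_threshold]

# Hypothetical candidates standing in for model output.
candidates = [
    Candidate((10, 10, 60, 160), 0.92),
    Candidate((12, 14, 58, 150), 0.75),
    Candidate((200, 30, 230, 60), 0.10),
]
boxes = select_bounding_boxes(candidates, score_threshold=0.5)
```

Here the third candidate is dropped because its score does not exceed the assumed threshold of 0.5; the two remaining candidates are the bounding boxes passed on to step S3.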

次に、ステップＳ３において、物体検出装置５１は、NMSによってバウンディングボックスＢの削除を行う。物体検出装置５１は、重複割合を算出する。重複割合は、IoU（Intersection over Union）とも呼ばれ、以下の（１）式で表現される。 Next, in step S3, the object detection device 51 deletes the bounding box B by NMS. The object detection device 51 calculates the overlapping ratio. The overlapping ratio is also called IoU (Intersection over Union) and is expressed by the following formula (1).

IoU＝(Area of intersection)/(Area of union)…（１）
Area of intersectionは、互いに重複している２つのバウンディングボックスＢの積集合の領域である。Area of intersectionは、２つのバウンディングボックスＢが互いに重なり合う部分の面積ともいえる。Area of unionは、互いに重複している２つのバウンディングボックスＢの和集合の領域である。Area of unionは、２つのバウンディングボックスＢのうち少なくともいずれかに含まれる部分の面積ともいえる。重複割合は、互いに重複している２つのバウンディングボックスＢの積集合の領域を当該２つのバウンディングボックスＢの和集合の領域で除算することで算出されるといえる。 Area of intersection is the area of the intersection of two bounding boxes B that overlap each other, that is, the area of the portion where the two bounding boxes B overlap. Area of union is the area of the union of the two mutually overlapping bounding boxes B, that is, the area of the portion included in at least one of the two bounding boxes B. The overlap ratio is thus calculated by dividing the area of the intersection of two mutually overlapping bounding boxes B by the area of their union.
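Formula (1) can be illustrated with a short sketch for axis-aligned boxes. The `(x1, y1, x2, y2)` corner representation and the function names are assumptions made for this example; the text does not prescribe a data format.

```python
def area(box):
    """Area of a box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)

def intersection_area(a, b):
    """Area of the portion where boxes a and b overlap (0 if they do not)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(0.0, width) * max(0.0, height)

def iou(a, b):
    """Overlap ratio of formula (1): intersection area over union area."""
    inter = intersection_area(a, b)
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

# Two 2 x 2 boxes offset by one unit share a 1 x 1 region.
example = iou((0, 0, 2, 2), (1, 1, 3, 3))
```

The union is obtained as the sum of the two areas minus the intersection, so the shared region is not counted twice. In the example the intersection is 1 and the union is 4 + 4 - 1 = 7.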

物体検出装置５１は、重複割合が閾値を超えている場合には、互いに重複している２つのバウンディングボックスＢのうち信頼度スコアが低い方を削除する。閾値としては、例えば、０．３～０．７から任意の値を設定することができる。 When the overlapping ratio exceeds the threshold, the object detection device 51 deletes the one with the lower reliability score of the two bounding boxes B that overlap each other. Any value from 0.3 to 0.7, for example, can be set as the threshold value.

物体検出装置５１は、同一の対象物体Ｔに付与された複数のバウンディングボックスＢの全ての組み合わせについて、NMSを適用する。これにより、５つのバウンディングボックスＢのうち重複割合が閾値を超えており、かつ、重複しているバウンディングボックスＢよりも信頼度スコアが低いバウンディングボックスＢは削除される。 The object detection device 51 applies NMS to all combinations of the multiple bounding boxes B applied to the same target object T. As a result, of the five bounding boxes B, any bounding box B whose overlap ratio exceeds the threshold and whose reliability score is lower than that of the bounding box B it overlaps is deleted.
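The deletion in step S3 can be sketched with the commonly used greedy form of NMS, in which boxes are visited from the highest reliability score downward. The greedy ordering, the `(box, score)` tuple format, and the default threshold of 0.5 are assumptions of this sketch; the text only states that the lower-scoring box of an overlapping pair is deleted when the overlap ratio exceeds the threshold.

```python
def iou(a, b):
    """Overlap ratio of two (x1, y1, x2, y2) boxes."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    inter = max(0.0, width) * max(0.0, height)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def nms(detections, iou_threshold=0.5):
    """Greedy NMS over a list of (box, score) pairs.

    A box is kept only if its overlap ratio with every already kept,
    higher-scoring box does not exceed the threshold.
    """
    kept = []
    for box, score in sorted(detections, key=lambda d: d[1], reverse=True):
        if all(iou(box, kept_box) <= iou_threshold for kept_box, _ in kept):
            kept.append((box, score))
    return kept

# Hypothetical detections: two similar boxes and one much smaller box.
detections = [
    ((0, 0, 10, 10), 0.9),
    ((1, 1, 11, 11), 0.8),
    ((0, 0, 3, 3), 0.7),
]
remaining = nms(detections)
```

In this example the second box overlaps the first with an overlap ratio of 81/119, so it is deleted. The small third box has an overlap ratio of only 9/100 with the first and therefore survives, even though it lies entirely inside it.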

図６に示す例では、５つのバウンディングボックスＢのうち２つのバウンディングボックスＢ４，Ｂ５が削除されている。これにより、５つのバウンディングボックスＢのうちバウンディングボックスＢ１，Ｂ２，Ｂ３が残る。このように、NMSでは、同一の対象物体Ｔを囲むバウンディングボックスＢを１つにすることができない場合が生じ得る。２つのバウンディングボックスＢの大きさの差が著しく大きい場合、２つのバウンディングボックスＢの積集合は小さいほうのバウンディングボックスＢの影響により小さくなる。一方で、２つのバウンディングボックスＢの和集合は大きいほうのバウンディングボックスＢの影響により大きくなる。これにより、２つのバウンディングボックスＢが重複している場合であっても、（１）式により算出される重複割合が閾値を超えない場合がある。重複割合が閾値を超えない場合には、バウンディングボックスＢの削除が行われないため、２つのバウンディングボックスＢの大きさの差を原因として、NMSによるバウンディングボックスＢの削除を行えない場合が生じる。本実施形態では、以下の処理によって、バウンディングボックスＢを更に削除する。ステップＳ３の処理を行うことで、物体検出装置５１は、第１削除部を備えているといえる。 In the example shown in FIG. 6, two bounding boxes B4 and B5 out of five bounding boxes B are deleted. As a result, of the five bounding boxes B, bounding boxes B1, B2, and B3 remain. Thus, in NMS, there may be a case where the bounding box B surrounding the same target object T cannot be made into one. If the size difference between two bounding boxes B is significantly large, the intersection of the two bounding boxes B will be smaller due to the effect of the smaller bounding box B. On the other hand, the union of two bounding boxes B is larger due to the influence of the larger bounding box B. As a result, even when two bounding boxes B overlap, the overlapping ratio calculated by the formula (1) may not exceed the threshold. If the overlapping ratio does not exceed the threshold, the bounding box B is not deleted. Therefore, there may be a case where the bounding box B cannot be deleted by the NMS due to the difference in size between the two bounding boxes B. In this embodiment, the bounding box B is further deleted by the following processing. By performing the process of step S3, it can be said that the object detection device 51 is provided with the first deletion unit.
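The effect of a large size difference on formula (1) can be checked with a small numerical sketch. It assumes the simplest case, in which the smaller box lies entirely inside the larger one, so that the intersection equals the small box and the union equals the large box. The box sizes are hypothetical.

```python
def iou_of_nested_boxes(small_area, large_area):
    """Overlap ratio when the smaller box lies entirely inside the larger one.

    The intersection is then the small box and the union is the large box.
    """
    return small_area / large_area

# Hypothetical sizes: a 4 x 6 box nested inside a 10 x 20 box.
ratio = iou_of_nested_boxes(4 * 6, 10 * 20)

# 0.3 is the low end of the threshold range mentioned for NMS.
deleted_by_nms = ratio > 0.3
```

The overlap ratio here is 24/200 = 0.12, which stays below even the lowest threshold in the stated range, so NMS leaves both boxes in place although one fully contains the other.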

図３に示すように、ステップＳ４において、物体検出装置５１は、包含割合を算出する。包含割合を、IoA（Intersection over Area）と定義する。IoAは、以下の（２）式から算出することができる。 As shown in FIG. 3, in step S4 the object detection device 51 calculates the inclusion ratio. The inclusion ratio is defined as IoA (Intersection over Area). IoA can be calculated from the following formula (2).

IoA＝(Area of intersection)/(Area)…（２）
Areaは、互いに重複している２つのバウンディングボックスＢのうち一方の領域である。包含割合は、互いに重複している２つのバウンディングボックスＢの積集合の領域を当該２つのバウンディングボックスＢのうち一方の領域で除算することで算出されるといえる。（２）式から把握できるように、包含割合とは、２つのバウンディングボックスＢの一方が他方に包含される割合を示す指標である。包含割合が１に近いほど、２つのバウンディングボックスＢのうち分母となるバウンディングボックスＢは、もう一方のバウンディングボックスＢに包含されているといえる。包含割合が１の場合、２つのバウンディングボックスＢのうち分母となるバウンディングボックスＢの全体が、もう一方のバウンディングボックスＢに包含されている。包含割合が０より大きく、かつ、１より小さい場合、２つのバウンディングボックスＢのうち分母となるバウンディングボックスＢの一部が、もう一方のバウンディングボックスＢに包含されている。物体検出装置５１は、同一の対象物体Ｔに付与された複数のバウンディングボックスＢの全ての組み合わせについて、包含割合を算出する。なお、ここでいう組み合わせとは、２つのバウンディングボックスＢの組み合わせのうち、分母を入れ替えたパターンを含むものである。２つのバウンディングボックスＢ１，Ｂ２を例に挙げると、分母をバウンディングボックスＢ１とする包含割合と、分母をバウンディングボックスＢ２とする包含割合との２パターンが算出される。ステップＳ４の処理を行うことで、物体検出装置５１は、包含割合算出部を備えているといえる。 Area is the area of one of the two bounding boxes B that overlap each other. The inclusion ratio is thus calculated by dividing the area of the intersection of two mutually overlapping bounding boxes B by the area of one of the two bounding boxes B. As can be seen from formula (2), the inclusion ratio is an index indicating the degree to which one of the two bounding boxes B is included in the other. The closer the inclusion ratio is to 1, the more of the bounding box B used as the denominator is included in the other bounding box B. When the inclusion ratio is 1, the entire bounding box B used as the denominator is included in the other bounding box B. When the inclusion ratio is greater than 0 and less than 1, part of the bounding box B used as the denominator is included in the other bounding box B. The object detection device 51 calculates the inclusion ratio for all combinations of the multiple bounding boxes B applied to the same target object T. Here, the combinations include, for each pair of bounding boxes B, both choices of denominator. Taking the two bounding boxes B1 and B2 as an example, two inclusion ratios are calculated: one with the bounding box B1 as the denominator and one with the bounding box B2 as the denominator. By performing the process of step S4, the object detection device 51 can be said to include an inclusion ratio calculation unit.
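Formula (2) and the two denominator patterns can be sketched as follows. The `(x1, y1, x2, y2)` corner format, the function names, and the example boxes are assumptions made for illustration.

```python
def area(box):
    """Area of a box given as (x1, y1, x2, y2)."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])

def intersection_area(a, b):
    """Area of the portion where boxes a and b overlap (0 if they do not)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(0.0, width) * max(0.0, height)

def ioa(denominator_box, other_box):
    """Inclusion ratio of formula (2) with denominator_box as the denominator.

    The result is the fraction of denominator_box that lies inside other_box.
    """
    denominator_area = area(denominator_box)
    if denominator_area == 0:
        return 0.0
    return intersection_area(denominator_box, other_box) / denominator_area

# Hypothetical pair: a small box B2 entirely inside a large box B1.
b1 = (0, 0, 10, 20)
b2 = (2, 2, 6, 8)
ioa_b2 = ioa(b2, b1)  # denominator is B2
ioa_b1 = ioa(b1, b2)  # denominator is B1
```

With B2 as the denominator the inclusion ratio is 1, because all of B2 lies inside B1. With B1 as the denominator it is 24/200 = 0.12, which shows why both patterns are evaluated for every pair.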

Next, in step S5, the object detection device 51 deletes bounding boxes B whose inclusion ratio is equal to or greater than an inclusion threshold. Specifically, for each combination of two bounding boxes B whose inclusion ratio calculated by formula (2) is equal to or greater than the inclusion threshold, the object detection device 51 deletes the bounding box B that serves as the denominator. As a result, the smaller of the two bounding boxes B in the combination is deleted. The inclusion threshold can be set to any value greater than 0. When the inclusion threshold is 1, the bounding box B that is entirely contained in the other bounding box B is deleted. When the inclusion threshold is less than 1, a bounding box B that is partly contained in the other bounding box B is deleted, depending on the inclusion ratio. In this embodiment, the inclusion threshold is set so that only one of the plurality of bounding boxes B remains. The bounding boxes B representing the area where the target object T exists can thereby be narrowed down to one. In the example shown in FIG. 7, the bounding boxes B2 and B3 are deleted, leaving only the bounding box B1. The object detection device 51 can thus identify that the target object T exists in the area enclosed by the bounding box B1. That is, the object detection device 51 can detect the target object T from the image. By performing the process of step S5, the object detection device 51 can be said to include a second deletion unit.
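The deletion of step S5 can be sketched as follows, again assuming `(x1, y1, x2, y2)` boxes. The function name `delete_by_inclusion` and the rule that an already deleted box no longer causes further deletions are assumptions made for this sketch; the latter keeps one of two identical boxes, a case the text does not address.

```python
def _area(b):
    """Area of an axis-aligned box (x1, y1, x2, y2); 0 if the box is empty."""
    return max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])


def _inclusion_ratio(denom_box, other_box):
    """Intersection area divided by the area of denom_box."""
    inter = _area((max(denom_box[0], other_box[0]),
                   max(denom_box[1], other_box[1]),
                   min(denom_box[2], other_box[2]),
                   min(denom_box[3], other_box[3])))
    a = _area(denom_box)
    return inter / a if a > 0.0 else 0.0


def delete_by_inclusion(boxes, inclusion_threshold):
    """Return the boxes that survive the inclusion-ratio deletion."""
    deleted = set()
    for i, box_i in enumerate(boxes):        # box_i plays the denominator
        for j, box_j in enumerate(boxes):
            if i == j or j in deleted:       # assumption: ignore deleted boxes
                continue
            if _inclusion_ratio(box_i, box_j) >= inclusion_threshold:
                deleted.add(i)               # delete the denominator box
                break
    return [b for k, b in enumerate(boxes) if k not in deleted]
```

With boxes `[(0, 0, 10, 10), (2, 2, 6, 6), (1, 1, 9, 12)]` and a threshold of 0.8, the second and third boxes are deleted and only the first remains, analogous to the FIG. 7 example in which only B1 remains. The result depends on the threshold: at 0.7 the first box itself reaches the threshold against the third, so a different box survives.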

Bounding boxes B are deleted only when a plurality of bounding boxes B have been assigned in step S2. Therefore, when only one bounding box B has been assigned in step S2, the processes of steps S3 to S5 need not be performed. Similarly, when deletion by NMS leaves only one bounding box B, the processes of steps S4 and S5 need not be performed.

The movement process performed by the control device 32 will now be described. The movement process moves the vehicle 20 so that it tracks the target object T. The movement process is repeated at a predetermined control cycle.

As shown in FIG. 8, in step S11, the control device 32 acquires the detection result of the external sensor 31. The control device 32 thereby obtains point cloud data representing the shape of the environment around the autonomous mobile body 10.

Next, in step S12, the control device 32 acquires the detection result of the object detection device 51. The detection result of the object detection device 51 is, for example, position information of the target object T. The position information of the target object T is, for example, two-dimensional coordinates indicating the position of the bounding box B in the image IM. In other words, the control device 32 acquires the position information of the bounding box B that remains after deletion based on the overlap ratio and deletion based on the inclusion ratio.

Next, in step S13, the control device 32 derives the relative position between the vehicle 20 and the target object T. The relative position is derived from the detection result of the external sensor 31 and the detection result of the object detection device 51. In other words, the relative position is derived from the bounding box B that remains after deletion based on the overlap ratio and deletion based on the inclusion ratio. The relative position between the vehicle 20 and the target object T is, for example, the sensor coordinates of the target object T. The relative position may be expressed in any coordinate system in which the positional relationship between the vehicle 20 and the target object T can be grasped, such as a coordinate system whose origin is the center of the vehicle 20 in the horizontal direction. Whichever coordinate system is used, as long as its relationship to the sensor coordinate system is known, the sensor coordinates can be converted into coordinates of that coordinate system. In this embodiment, the sensor coordinates of the target object T are used as the relative position between the vehicle 20 and the target object T.

The control device 32 can derive the coordinates of objects in the sensor coordinate system from the point cloud data acquired from the external sensor 31. For example, the control device 32 clusters the points obtained from the external sensor 31 and regards each cluster of points as one object. The control device 32 can thereby derive the sensor coordinates of the objects located within the detectable range of the external sensor 31. From the detection result of the object detection device 51, the control device 32 identifies which of the objects detected by the external sensor 31 is the target object T. The control device 32 then uses the sensor coordinates of the object identified as the target object T as the relative position between the vehicle 20 and the target object T.

The control device 32 may identify the bearing of the target object T from the detection result of the object detection device 51 and identify the target object T based on this bearing. From the position of the bounding box B in the image IM, the control device 32 can identify the bearing of the target object T relative to the vehicle 20. The control device 32 can thereby identify the object located at that bearing in the sensor coordinate system as the target object T.
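One way to realize this bearing-based identification is sketched below. The publication does not specify the camera model or the axis conventions; the sketch assumes a pinhole camera with horizontal focal length `fx` and principal point `cx` (both in pixels), a camera and sensor that share the same forward axis, and sensor coordinates with x forward and y to the right. All names are hypothetical.

```python
import math


def bearing_from_box(box, fx, cx):
    """Horizontal bearing (rad) of the box centre; positive is to the right."""
    u = 0.5 * (box[0] + box[2])          # horizontal centre of the box in pixels
    return math.atan2(u - cx, fx)


def pick_cluster_by_bearing(target_bearing, clusters):
    """Index of the cluster (x_forward, y_right) closest in bearing to the target."""
    def angular_gap(c):
        d = math.atan2(c[1], c[0]) - target_bearing
        return abs(math.atan2(math.sin(d), math.cos(d)))   # wrap to [0, pi]
    return min(range(len(clusters)), key=lambda i: angular_gap(clusters[i]))
```

For example, a box centred on the principal point gives a bearing of 0, so the cluster lying most nearly straight ahead is selected.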

The control device 32 may derive the coordinates of the target object T in the camera coordinate system from the detection result of the object detection device 51 and identify the target object T based on those coordinates. The camera coordinate system is a coordinate system whose origin is the camera 41. Coordinates in the camera coordinate system are referred to as camera coordinates where appropriate. The camera coordinates of the target object T can be derived from, for example, the position information of the bounding box B, the scale of the bounding box B, the mounting position of the camera 41, and the mounting angle of the camera 41. The control device 32 can convert the camera coordinates of the target object T into sensor coordinates of the target object T. The conversion from camera coordinates to sensor coordinates is based on the offset between the origins of the sensor coordinate system and the camera coordinate system and on the misalignment between their coordinate axes. The control device 32 identifies the target object T based on the agreement between the sensor coordinates of the target object T obtained by converting the camera coordinates and the sensor coordinates of the objects obtained from the point cloud data. For example, the control device 32 takes as the target object T the object located at the sensor coordinates obtained by converting the camera coordinates, or the object closest to those sensor coordinates. By performing the process of step S13, the control device 32 can be said to include a derivation unit.
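The conversion and matching described above can be sketched in two dimensions. The sketch assumes the axis misalignment is a single yaw angle and the origin offset is a planar translation; `camera_to_sensor`, `nearest_cluster`, and their parameters are hypothetical names, and a real system would typically use a full 3D extrinsic calibration.

```python
import math


def camera_to_sensor(p_cam, yaw, offset):
    """Rotate a 2D camera-frame point by yaw (rad), then add the origin offset."""
    c, s = math.cos(yaw), math.sin(yaw)
    x, y = p_cam
    return (c * x - s * y + offset[0], s * x + c * y + offset[1])


def nearest_cluster(p_sensor, centroids):
    """Index of the point-cloud cluster centroid closest to p_sensor."""
    return min(range(len(centroids)),
               key=lambda i: math.dist(p_sensor, centroids[i]))
```

The cluster returned by `nearest_cluster` corresponds to the "object closest to those sensor coordinates" option; its centroid would then serve as the relative position.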

In step S14, the control device 32 moves the vehicle 20 so that it tracks the target object T. Based on the sensor coordinates of the target object T obtained in step S13, the control device 32 moves the vehicle 20 so that the separation distance between the vehicle 20 and the target object T stays within a predetermined range. For example, the control device 32 generates a command that makes the vehicle 20 travel in the direction from the vehicle 20 toward the target object T and sends this command to the motor driver 25. The control device 32 may also generate the command so that the speed of the vehicle 20 increases as the separation distance between the vehicle 20 and the target object T increases. By performing the process of step S14, the control device 32 can be said to include a movement control unit.
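A minimal sketch of such a command follows. The publication does not give a control law; the proportional rule, the parameters `max_dist`, `gain`, and `max_speed`, and the choice to stop once the target is within `max_dist` are all assumptions made for illustration.

```python
import math


def tracking_command(target_xy, max_dist, gain, max_speed):
    """Return (heading_rad, speed) for a target given in sensor coordinates.

    The heading points from the vehicle toward the target. The speed grows
    with the distance in excess of max_dist and is capped at max_speed.
    """
    x, y = target_xy
    distance = math.hypot(x, y)
    heading = math.atan2(y, x)
    excess = distance - max_dist
    speed = min(max_speed, gain * excess) if excess > 0.0 else 0.0
    return heading, speed
```

For a target at `(3.0, 4.0)` with `max_dist=2.0`, `gain=0.2`, and `max_speed=1.0`, the distance is 5.0 and the commanded speed is about 0.6; a target already within `max_dist` yields a speed of 0.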

The operation of this embodiment will now be described.
The control device 32 moves the vehicle 20 based on the detection result of the object detection device 51. The object detection device 51 combines deletion of bounding boxes B based on the overlap ratio with deletion of bounding boxes B based on the inclusion ratio. Compared with deletion based on the overlap ratio alone, this reduces the number of bounding boxes B assigned to the same target object T. When a plurality of bounding boxes B are assigned to the same target object T, it is difficult to determine which bounding box B encloses the area where the target object T exists. In this embodiment, reducing the number of bounding boxes B makes it easier to determine the area where the target object T exists.

In particular, the autonomous mobile body 10 that tracks the target object T must keep detecting the same target object T in order to track it. When a plurality of bounding boxes B exist, it may be impossible to determine which bounding box B corresponds to the target object T detected in the previous control cycle. In that case, the target object T may no longer be trackable. By narrowing the bounding boxes B down to one, as in this embodiment, the object detection device 51 can keep detecting the same target object T.

The effects of this embodiment will now be described.
(1) The object detection device 51 deletes bounding boxes B based on the overlap ratio and deletes bounding boxes B based on the inclusion ratio. Compared with deletion based on the overlap ratio alone, this reduces the number of bounding boxes B assigned to the same target object T. A decrease in the accuracy with which the object detection device 51 detects the target object T can thereby be suppressed.

(2) The autonomous mobile body 10 tracks the target object T detected by the object detection device 51. By tracking the target object T detected by the object detection device 51, the autonomous mobile body 10 can track the same target object T. This keeps the autonomous mobile body 10 from losing track of the target object T or from tracking an object other than the target object T.

The embodiment can be modified as follows. The embodiment and the following modifications can be combined with one another to the extent that they are technically consistent.
○ Of a combination of two bounding boxes B whose inclusion ratio is equal to or greater than the inclusion threshold, the object detection device 51 may delete the bounding box B with the lower reliability score. Alternatively, the object detection device 51 may delete the bounding box B other than the one serving as the denominator of formula (2). In this case, the larger of the two bounding boxes B in the combination is deleted. In short, the object detection device 51 only needs to be able to delete one of the two mutually overlapping bounding boxes B based on the inclusion ratio. Which of the two bounding boxes B is deleted may be changed as appropriate according to factors such as the type of the target object T.

○ Depending on the inclusion threshold, a plurality of bounding boxes B may remain even after deletion based on the overlap ratio and deletion based on the inclusion ratio. In this case, the object detection device 51 may narrow the bounding boxes B down to one using an index other than the overlap ratio and the inclusion ratio. The object detection device 51 may also generate a single bounding box B by fusing the plurality of bounding boxes B. As a technique for fusing a plurality of bounding boxes B, NMW (Non-Maximum Weighted) can be used, for example.

Even if a plurality of bounding boxes B remain, the object detection device 51 need not reduce them to one. Even in this case, the reduced number of bounding boxes B suppresses a decrease in the accuracy with which the object detection device 51 detects the target object T.

○ A stereo camera may be used as the camera 41. A stereo camera includes a plurality of cameras. The object detection device 51 acquires a parallax image from the images captured by the plurality of cameras. The parallax image indicates the pixel difference that arises between the cameras when the same feature point is imaged by the plurality of cameras. A feature point is a part where parallax can be obtained, such as an edge of an object, that is, a pixel of the captured image at which the luminance changes. The object detection device 51 can derive the camera coordinates of the feature points using the baseline length of the stereo camera, the focal length, the parallax image, and the like. The object detection device 51 can then derive the camera coordinates of objects by clustering the feature points. The object detection device 51 can also identify the position of the target object T in the image using an image obtained from the stereo camera. The object detection device 51 can thereby derive the camera coordinates of the target object T. The object detection device 51 converts the camera coordinates of the target object T into coordinates of a coordinate system in which the positional relationship between the vehicle 20 and the target object T can be grasped. The control device 32 may use these coordinates as the relative position between the vehicle 20 and the target object T and move the vehicle 20. When a stereo camera is used as the camera 41, the autonomous mobile body 10 need not include the external sensor 31.
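The derivation of camera coordinates from parallax can be illustrated with the standard relation for a rectified stereo pair, in which depth equals focal length times baseline divided by disparity. The publication does not state this formula explicitly; the function and parameter names below (`stereo_point`, `focal_px`, `baseline_m`, `cx`, `cy`) are assumptions for this sketch.

```python
def stereo_point(u, v, disparity, focal_px, baseline_m, cx, cy):
    """Camera coordinates (x, y, z) in metres of a feature point.

    u, v: pixel position in the reference image; disparity: pixel difference
    between the two cameras; cx, cy: principal point in pixels.
    """
    if disparity <= 0:
        raise ValueError("disparity must be positive to triangulate")
    z = focal_px * baseline_m / disparity      # depth along the optical axis
    x = (u - cx) * z / focal_px                # lateral offset
    y = (v - cy) * z / focal_px                # vertical offset
    return (x, y, z)
```

With a focal length of 500 px, a baseline of 0.1 m, and a disparity of 25 px, the depth is about 2 m; clustering such points would then give per-object camera coordinates.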

A ToF (Time of Flight) camera may be used as the camera 41. A ToF camera emits pulsed light such as infrared light, detects the reflected light with an image sensor, and measures the distance to the measured point from the time the pulsed light takes to return. A ToF camera yields a distance image in which each pixel of the image is associated with a distance in the depth direction. The object detection device 51 can derive the relative position between the vehicle 20 and the target object T using the distance image. In this case as well, the autonomous mobile body 10 need not include the external sensor 31.
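As a small worked example of the time-of-flight principle: the pulse travels to the measured point and back, so the distance is the speed of light times the round-trip time, divided by two. The function name `tof_distance` is hypothetical.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # exact by definition of the metre


def tof_distance(round_trip_time_s):
    """Distance in metres to the reflecting point for a measured round-trip time."""
    return SPEED_OF_LIGHT_M_PER_S * round_trip_time_s / 2.0
```

A round-trip time of 20 ns corresponds to a distance of roughly 3 m.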

○ The object detection device 51 may derive the camera coordinates of the target object T from changes over time in the position and orientation of the camera 41. In this case, the autonomous mobile body 10 includes an external sensor 31 that detects the position and orientation of the camera 41. The object detection device 51 acquires from the sensor the position and orientation of the camera 41 at two different times, a first time and a second time. The object detection device 51 derives the change in the position and orientation of the camera 41 between the first time and the second time. The object detection device 51 then derives the camera coordinates of a feature point by triangulation, using the change in the image coordinates of the same feature point between the images at the first time and the second time together with the change in the position and orientation of the camera 41 between those times. That is, the object detection device 51 derives the camera coordinates by regarding the camera 41 at the first time and the camera 41 at the second time as a single stereo camera.

○ The external sensor 31 may be any sensor that can detect the position relative to the target object T. For example, the external sensor 31 may be a radar.
○ The moving body may be a flying body or a multi-legged walking robot.

○ The object detection process only needs to perform both deletion of bounding boxes B based on the overlap ratio and deletion of bounding boxes B based on the inclusion ratio, and the order of these processes may be changed. For example, deletion based on the overlap ratio may be performed after deletion based on the inclusion ratio.
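The two deletion stages and their interchangeable order can be sketched together. The sketch assumes detections given as `(box, score)` pairs with `(x1, y1, x2, y2)` boxes, uses the common greedy form of NMS for the overlap-ratio stage, and deletes the denominator box in the inclusion-ratio stage; the names `nms`, `inclusion_filter`, and `detect` and the flag `inclusion_first` are hypothetical.

```python
def _area(b):
    """Area of an axis-aligned box (x1, y1, x2, y2); 0 if the box is empty."""
    return max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])


def _intersection(a, b):
    """Area of the intersection of two boxes."""
    return _area((max(a[0], b[0]), max(a[1], b[1]),
                  min(a[2], b[2]), min(a[3], b[3])))


def nms(dets, iou_threshold):
    """Greedy NMS: keep boxes in score order, drop those overlapping a kept box."""
    kept = []
    for box, score in sorted(dets, key=lambda d: d[1], reverse=True):
        suppressed = False
        for kept_box, _ in kept:
            inter = _intersection(box, kept_box)
            union = _area(box) + _area(kept_box) - inter
            if union > 0.0 and inter / union > iou_threshold:
                suppressed = True
                break
        if not suppressed:
            kept.append((box, score))
    return kept


def inclusion_filter(dets, inclusion_threshold):
    """Delete each box whose inclusion ratio in a surviving box meets the threshold."""
    deleted = set()
    for i, (box_i, _) in enumerate(dets):
        a = _area(box_i)
        for j, (box_j, _) in enumerate(dets):
            if i == j or j in deleted or a == 0.0:
                continue
            if _intersection(box_i, box_j) / a >= inclusion_threshold:
                deleted.add(i)
                break
    return [d for k, d in enumerate(dets) if k not in deleted]


def detect(dets, iou_threshold, inclusion_threshold, inclusion_first=False):
    """Run both deletion stages in either order."""
    if inclusion_first:
        return nms(inclusion_filter(dets, inclusion_threshold), iou_threshold)
    return inclusion_filter(nms(dets, iou_threshold), inclusion_threshold)
```

The two orders can give different survivors for some inputs and thresholds, so the order is a design choice to be checked against the target application rather than a neutral detail.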

○ The image IM may be a grayscale image.
○ The assignment unit, the first deletion unit, the inclusion ratio calculation unit, and the second deletion unit may each be a separate device. That is, the object detection device 51 may be a unit including a plurality of devices.

○ The derivation unit and the movement control unit may each be a separate device. That is, the control device 32 may be a unit including a plurality of devices.
○ The object detection device 51 may function as the derivation unit.

○ The target object T may be an object other than a person. For example, the target object T may be a moving body other than the vehicle 20.
○ The object detection device 51 may be mounted on a device other than the autonomous mobile body 10.

B… bounding box, IM… image, T… target object, 10… autonomous mobile body, 20… vehicle as moving body, 32… control device as derivation unit and movement control unit, 41… camera, 51… object detection device as assignment unit, first deletion unit, inclusion ratio calculation unit, second deletion unit, and acquisition unit.

Claims

An object detection device that detects a target object from an image, comprising:
an assignment unit that assigns a bounding box surrounding the target object appearing in the image;
a first deletion unit that, when a plurality of the bounding boxes exist, calculates an overlap ratio by dividing the area of the intersection of two mutually overlapping bounding boxes by the area of the union of the two bounding boxes and, when the overlap ratio exceeds a threshold, deletes the one of the two mutually overlapping bounding boxes that has the lower reliability score;
an inclusion ratio calculation unit that, when a plurality of the bounding boxes exist, calculates an inclusion ratio by dividing the area of the intersection of two mutually overlapping bounding boxes by the area of one of the two bounding boxes; and
a second deletion unit that deletes one of the two mutually overlapping bounding boxes based on the inclusion ratio.

An autonomous mobile body that tracks a target object, comprising:
a moving body;
a camera that is provided on the moving body and captures an image of the target object; and
an object detection device,
wherein the object detection device includes:
an acquisition unit that acquires an image from the camera;
an assignment unit that assigns a bounding box surrounding the target object appearing in the image;
a first deletion unit that, when a plurality of the bounding boxes exist, calculates an overlap ratio by dividing the area of the intersection of two mutually overlapping bounding boxes by the area of the union of the two bounding boxes and, when the overlap ratio exceeds a threshold, deletes the one of the two mutually overlapping bounding boxes that has the lower reliability score;
an inclusion ratio calculation unit that, when a plurality of the bounding boxes exist, calculates an inclusion ratio by dividing the area of the intersection of two mutually overlapping bounding boxes by the area of one of the two bounding boxes; and
a second deletion unit that deletes one of the two mutually overlapping bounding boxes based on the inclusion ratio, and
wherein the autonomous mobile body further comprises:
a derivation unit that derives the relative position between the moving body and the target object based on the bounding box remaining after the first deletion unit and the second deletion unit have deleted bounding boxes; and
a movement control unit that moves the moving body so as to track the target object based on the relative position.