JP7081526B2

JP7081526B2 - Object detector

Info

Publication number: JP7081526B2
Application number: JP2019030462A
Authority: JP
Inventors: 将崇石▲崎▼
Original assignee: Toyota Industries Corp
Current assignee: Toyota Industries Corp
Priority date: 2019-02-22
Filing date: 2019-02-22
Publication date: 2022-06-07
Anticipated expiration: 2039-02-22
Also published as: JP2020135617A

Description

The present invention relates to an object detection device.

A moving body such as a vehicle is equipped with an object detection device for detecting objects such as people and obstacles. The object detection device described in Patent Document 1 detects the position of an object captured by a stereo camera on the basis of a parallax image obtained from the images captured by the stereo camera. The device calculates a frequency distribution of parallax for each region obtained by dividing the parallax image in the left-right direction. From the relationship between the position in the X-axis direction of the parallax image and the thinned parallax, the device can obtain a map that gives a bird's-eye view of the environment shown in the image. The thinned parallax is the parallax converted at a thinning rate that depends on the distance.

Patent Document 1: Japanese Unexamined Patent Publication No. 2016-206801

In some cases, it is desirable for an object detection device to determine whether a detected object is a person candidate, for example when different processing is performed depending on whether the detected object is a person candidate.
An object of the present invention is to provide an object detection device capable of determining whether a detected object is a person candidate.

An object detection device that solves the above problem includes: a stereo camera; a parallax image acquisition unit that acquires, from images captured by the stereo camera, a parallax image in which a parallax is associated with each pixel; a coordinate calculation unit that calculates, for each feature point for which a parallax has been acquired, coordinates in a three-dimensional coordinate system representing positions in real space; a counting unit that counts the feature points contained in each of the areas obtained by dividing a coordinate plane representing the horizontal plane of the three-dimensional coordinate system, separately for each of a plurality of divisions defined by ranges of feature point height; an object determination unit that determines that an object exists in each area in which the total number of feature points exceeds a threshold; an object height setting unit that sets, for each area determined to contain an object, the highest division in which feature points exist as the height of the object; a same-object determination unit that determines that objects of the same height located in adjacent areas are the same object; a calculation unit that calculates, for each object after the determination by the same-object determination unit, the size of the area occupied by the object; and a person candidate determination unit that determines that an object whose occupied area size, as calculated by the calculation unit, is equal to or less than a predetermined value is a person candidate.

When an object appears in an image captured by the stereo camera, the edges of the object and similar parts become feature points, and a parallax is calculated for them. Therefore, when feature points are counted for each of the areas obtained by dividing the horizontal plane of the three-dimensional coordinate system, areas in which an object exists contain a larger number of feature points. Counting the feature points in each area and comparing the count with the threshold makes it possible to extract the areas in which an object exists. Setting the predetermined value on the basis of the size of the area occupied by a person appearing in the image makes it possible to determine which objects are person candidates. In this way, whether an object is a person candidate can be determined.
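The grouping and size test described above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the 4-neighbour adjacency rule, the use of the number of occupied areas as the "size", and the names `label_objects`, `person_candidates`, and `max_cells` are all assumptions made for the example.

```python
from collections import deque

def label_objects(height_by_area):
    """Group adjacent areas that share the same height division.

    height_by_area maps (col, row) -> height division index for every
    area judged to contain an object. Returns a list of sets of areas.
    Adjacency is 4-neighbour here, which is an assumption.
    """
    seen = set()
    objects = []
    for start in sorted(height_by_area):
        if start in seen:
            continue
        seen.add(start)
        group = {start}
        queue = deque([start])
        while queue:
            col, row = queue.popleft()
            neighbours = ((col + 1, row), (col - 1, row),
                          (col, row + 1), (col, row - 1))
            for nxt in neighbours:
                if (nxt in height_by_area and nxt not in seen
                        and height_by_area[nxt] == height_by_area[start]):
                    seen.add(nxt)
                    group.add(nxt)
                    queue.append(nxt)
        objects.append(group)
    return objects

def person_candidates(height_by_area, max_cells=1):
    """Return the objects that occupy at most max_cells areas (hypothetical limit)."""
    return [obj for obj in label_objects(height_by_area) if len(obj) <= max_cells]

# Hypothetical scene: a three-area-wide object in height division 4 next to
# a single area in height division 3.
scene = {(0, 0): 4, (1, 0): 4, (2, 0): 4, (3, 0): 3}
candidates = person_candidates(scene)
```

In this hypothetical scene the wide object is merged into one object of three areas and rejected, while the single area is kept as a person candidate.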

The object detection device may further include a person candidate coordinate calculation unit that calculates the coordinates of the person candidate on the image from the coordinates of the person candidate in real space, and a person determination unit that performs person detection processing on the coordinates of the person candidate on the image and determines whether the person candidate is a person.

The person determination unit performs the person detection processing on objects determined to be person candidates. A person can therefore be detected in a shorter time than when the person detection processing is performed on the entire image.
In the object detection device, where the division with the lowest range of feature point height is defined as the lowest division, the object height setting unit may operate as follows when feature points exist in a plurality of divisions including the lowest division. If feature points exist in divisions that continue upward from the lowest division, the object height setting unit sets, as the height of the object, the highest of the divisions in which feature points exist contiguously from the lowest division. If no feature points exist in the division adjacent to the lowest division, the object height setting unit sets the lowest division as the height of the object.
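The contiguous-division rule can be sketched as a short function. The function name `object_height` and the list layout are hypothetical, and the behaviour when the lowest division is empty is not specified in this passage, so returning `None` in that case is an assumption.

```python
def object_height(counts_by_division):
    """Pick the height division of the object in one area.

    counts_by_division[i] is the number of feature points in division i,
    with index 0 being the lowest division. Returns the index of the
    highest division reached contiguously from the lowest one, or None
    when the lowest division is empty (that case is an assumption).
    """
    if not counts_by_division or counts_by_division[0] == 0:
        return None
    top = 0
    for index in range(1, len(counts_by_division)):
        if counts_by_division[index] == 0:
            break  # a gap ends the contiguous run
        top = index
    return top
```

For example, counts of `[4, 0, 0, 2]` give division index 0: the points in the top division are separated from the lowest division by a gap and do not raise the object height.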

This makes it possible to detect objects while excluding floating objects.

According to the present invention, it is possible to determine whether a detected object is a person candidate.

FIG. 1 is a side view of a forklift on which the object detection device is mounted.
FIG. 2 is a schematic configuration diagram of the forklift and the object detection device.
FIG. 3 is a diagram showing a first image.
FIG. 4 is a flowchart showing the processing performed by the object detection device.
FIG. 5 is a diagram showing a parallax image.
FIG. 6 is a diagram showing a plot area in which feature points are plotted.
FIG. 7 is a diagram showing the areas of the plot area in which an object exists.
FIG. 8 is a diagram showing the plot area in which areas containing an object are given individual labels for each division.
FIG. 9 is a diagram showing the plot area in which areas containing an object are given individual labels for each same object.

An embodiment of the object detection device will be described below.
As shown in FIG. 1, a forklift 10 includes a vehicle body 11 and a cargo handling device 12 provided on the vehicle body 11. The forklift 10 may perform traveling and cargo handling operations automatically, or may perform them in response to operation by an operator on board.

As shown in FIG. 2, the forklift 10 includes a main controller 20, a traveling motor M1, a travel control device 23 that controls the traveling motor M1, and a vehicle speed sensor 24. The main controller 20 performs control related to the traveling operation and the cargo handling operation. The main controller 20 includes a CPU 21 and a memory 22 that stores programs and the like for performing various kinds of control.

The CPU 21 of the main controller 20 issues a rotation speed command for the traveling motor M1 to the travel control device 23 so that the vehicle speed of the forklift 10 reaches a target speed. The travel control device 23 of the present embodiment is a motor driver. The vehicle speed sensor 24 of the present embodiment is a rotation speed sensor that detects the rotation speed of the traveling motor M1. The vehicle speed sensor 24 outputs the rotation speed of the traveling motor M1 to the travel control device 23. Based on the command from the main controller 20, the travel control device 23 controls the traveling motor M1 so that its rotation speed matches the command.

The forklift 10 is equipped with an object detection device 30. The object detection device 30 includes a stereo camera 31 and an image processing unit 41 that processes the images captured by the stereo camera 31. As shown in FIG. 1, the stereo camera 31 is arranged, for example on the upper part of the vehicle body 11, so that it looks down from above the forklift 10 on the road surface on which the forklift 10 travels. The vehicle body 11 includes a base provided with the seat and members related to traveling, and a head guard provided above the seat. The stereo camera 31 of the present embodiment captures images of the area behind the forklift 10. The objects detected by the object detection device 30 are therefore objects behind the forklift 10.

Alternatively, a stereo camera that captures images of the area in front of the forklift 10 may be used to detect objects in front of the forklift 10. Separate stereo cameras that capture images of the areas in front of and behind the forklift 10 may also be used to detect objects on both the front and rear sides. That is, objects in any direction can be detected by changing the arrangement of the stereo camera. When a stereo camera that captures images of the area in front of the forklift 10 is provided, it is mounted, for example, on the upper part of the vehicle body 11 or the upper part of the cargo handling device 12.

As shown in FIG. 2, the stereo camera 31 includes two cameras 32 and 33. The cameras 32 and 33 use, for example, CCD image sensors or CMOS image sensors. The cameras 32 and 33 are arranged so that their optical axes are parallel to each other. In the present embodiment, the two cameras 32 and 33 are arranged side by side in the horizontal direction. One of the two cameras is referred to as the first camera 32 and the other as the second camera 33. When the image captured by the first camera 32 is called the first image and the image captured by the second camera 33 is called the second image, the same object appears shifted in the lateral direction between the first image and the second image. More specifically, when the same object is imaged, the object in the first image and the object in the second image are offset in lateral pixel position [px] by an amount that depends on the distance between the cameras 32 and 33. The first image and the second image have the same number of pixels; for example, 640 × 480 [px] (VGA) images are used. The first image and the second image are in RGB format.

The image processing unit 41 includes a CPU 42 and a storage unit 43 composed of a RAM, a ROM, and the like. The storage unit 43 stores various programs for detecting objects from the images captured by the stereo camera 31. The image processing unit 41 may include dedicated hardware that executes at least part of the various kinds of processing, for example an application-specific integrated circuit (ASIC). The image processing unit 41 may be configured as circuitry including one or more processors that operate according to a computer program, one or more dedicated hardware circuits such as ASICs, or a combination of these. The processor includes a CPU and memory such as a RAM and a ROM. The memory stores program code or instructions configured to cause the CPU to execute processing. The memory, that is, a computer-readable medium, includes any medium that can be accessed by a general-purpose or dedicated computer.

The object detection processing performed by the image processing unit 41 will be described below. The object detection processing is performed repeatedly while the forklift 10 is in the activated state. The activated state is a state in which the forklift 10 can be made to perform traveling and cargo handling operations. The following description takes as an example the object detection processing performed when the environment shown in FIG. 3 is captured by the stereo camera 31. FIG. 3 shows a first image I1 obtained by imaging the surroundings of the forklift 10. As can be seen from the first image I1, people and objects other than people are present around the forklift 10. The objects other than people are obstacles that hinder the travel of the forklift 10.

As shown in FIGS. 4 and 5, in step S1, the image processing unit 41 acquires a parallax image dp. The parallax image dp is an image in which a parallax [px] is associated with each pixel. The parallax is obtained by comparing the first image I1 with the second image and calculating, for the same feature point appearing in both images, the difference in pixel position between the first image I1 and the second image. A feature point is a portion that can be recognized as a boundary, such as an edge of an object. Feature points can be detected from luminance information and the like.

The image processing unit 41 acquires the first image I1 and the second image of the same frame from the video captured by the stereo camera 31. Using the RAM, which temporarily stores each image, the image processing unit 41 converts the images from RGB to YCrCb. The image processing unit 41 may also perform distortion correction, edge enhancement, and similar processing. The image processing unit 41 performs stereo processing, which calculates the parallax by comparing the similarity between the pixels of the first image I1 and the pixels of the second image. The stereo processing may use a method that calculates the parallax for each pixel, or a block matching method that divides each image into blocks containing a plurality of pixels and calculates the parallax for each block. The similarity between pixels is compared using, for example, the sum of absolute differences (SAD) or the sum of squared differences (SSD). The image processing unit 41 acquires the parallax image dp using the first image I1 as the reference image and the second image as the comparison image. For each pixel of the first image I1, the image processing unit 41 extracts the most similar pixel of the second image and calculates, as the parallax, the difference in lateral pixel position between the pixel of the first image I1 and the pixel most similar to it. This yields the parallax image dp, in which a parallax is associated with each pixel of the first image I1, the reference image. The parallax image dp does not necessarily need to be displayed; the term refers to data in which a parallax is associated with each pixel. By performing the processing of step S1, the image processing unit 41 functions as the parallax image acquisition unit.
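The SAD comparison can be illustrated on a single image row. The sketch below is a simplified, pure-Python illustration and not the stereo processing used in the embodiment: it works on one row of luminance values, assumes the reference image comes from the left camera so the match is searched toward smaller X, and the names `sad`, `disparity_at`, `half_window`, and `max_disparity` are hypothetical.

```python
def sad(left_row, right_row, x_left, x_right, half_window):
    """Sum of absolute differences between two 1-D windows."""
    return sum(
        abs(left_row[x_left + k] - right_row[x_right + k])
        for k in range(-half_window, half_window + 1)
    )

def disparity_at(left_row, right_row, x, max_disparity, half_window=1):
    """Parallax [px] at column x of the reference row.

    Searches leftwards in the comparison row, assuming the reference
    image comes from the left camera (an assumption for this sketch).
    The caller must keep the window around x inside the row.
    """
    best_d, best_cost = 0, None
    for d in range(max_disparity + 1):
        x_right = x - d
        if x_right - half_window < 0:
            break  # window would leave the image
        cost = sad(left_row, right_row, x, x_right, half_window)
        if best_cost is None or cost < best_cost:
            best_d, best_cost = d, cost
    return best_d

# Hypothetical luminance rows: the bright edge sits 2 px further left in
# the comparison row than in the reference row.
left_row = [10, 10, 10, 10, 10, 200, 200, 10, 10, 10]
right_row = [10, 10, 10, 200, 200, 10, 10, 10, 10, 10]
d = disparity_at(left_row, right_row, x=5, max_disparity=4)
```

Replacing the absolute difference with a squared difference in `sad` would give the SSD variant mentioned in the text.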

FIG. 5 shows the parallax image dp obtained from the first image I1 and the second image. In the parallax image dp shown in FIG. 5, the magnitude of the parallax is represented by shading. The parallax is larger at positions closer to the stereo camera 31 and smaller at positions farther from it. In the following, the coordinate in the lateral direction (X-axis direction) of the parallax image dp is called the X coordinate Xi, and the coordinate in the vertical direction (Y-axis direction) is called the Y coordinate Yi. Because the parallax image dp is acquired using the first image I1 as the reference image, the X coordinate Xi and the Y coordinate Yi can also be regarded as coordinates of the first image I1. The X coordinate Xi indicates the lateral pixel position and the Y coordinate Yi indicates the vertical pixel position. For example, if the parallax image dp is 640 × 480 [px], its center coordinates can be expressed as (Xi: 320, Yi: 240).

As shown in FIG. 4, in step S2, the image processing unit 41 removes the parallax of the road surface from the parallax image dp. The road surface is the surface on which the forklift 10 stands. The parallax of the road surface is obtained in advance using equation (1).

In equation (1), M_0(y) is the parallax produced by the road surface, and y is the Y coordinate Yi in the parallax image dp. B is the separation distance between the first camera 32 and the second camera 33, that is, the baseline length [mm]; more specifically, it is the distance between the optical axis of the first camera 32 and the optical axis of the second camera 33. H is the installation height [mm] of the stereo camera 31; more specifically, it is the distance from the road surface to the stereo camera 31. θ is the installation angle of the stereo camera 31, measured with 0° corresponding to the optical axis of the stereo camera 31 extending horizontally. F is the focal length [mm]. The road surface parallax obtained from equation (1) is stored in the storage unit 43. By removing the road surface parallax from the parallax image dp, the image processing unit 41 obtains a parallax image dp from which the road surface parallax has been removed. In other words, it obtains a parallax image dp that retains only the parallax produced by objects located higher than the road surface.
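Equation (1) itself does not survive in this text, so the sketch below is a reconstruction under stated assumptions and should not be read as the patent's formula. It assumes a pinhole camera at height H pitched down by θ over a flat road, a focal length expressed in pixels, and an image row measured downward from the image center; under those assumptions the road parallax is B(y·cos θ + f·sin θ)/H. How the road parallax is removed is also not spelled out, so the margin-based comparison in `remove_road` and all function and parameter names are hypothetical.

```python
import math

def road_disparity(y_rel_px, baseline_mm, height_mm, pitch_deg, focal_px):
    """Parallax [px] that a flat road would produce at one image row.

    y_rel_px is the row measured downward from the image centre. Derived
    for a pinhole camera pitched down by pitch_deg; this is a
    reconstruction under assumptions, not the patent's equation (1).
    Rows above the horizon see no road and are clamped to 0.
    """
    pitch = math.radians(pitch_deg)
    value = baseline_mm * (
        y_rel_px * math.cos(pitch) + focal_px * math.sin(pitch)
    ) / height_mm
    return max(0.0, value)

def remove_road(disparity_rows, road_table, margin_px=1.0):
    """Zero out parallax values that do not exceed the road parallax by margin_px."""
    return [
        [d if d > road_table[y] + margin_px else 0.0 for d in row]
        for y, row in enumerate(disparity_rows)
    ]

# Hypothetical setup: 100 mm baseline, camera 2000 mm above the road,
# pitched down 30 degrees, focal length 500 px.
centre_row_road = road_disparity(0, 100, 2000, 30, 500)
cleaned = remove_road([[12.0, 20.0]], [centre_row_road])
```

With these hypothetical values the optical axis meets the road 4000 mm from the camera, which corresponds to a parallax of 12.5 px at the center row; a measured value of 12.0 px is treated as road and a value of 20.0 px is kept.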

In step S3, the image processing unit 41 calculates the coordinates of the feature points in the world coordinate system. First, the image processing unit 41 calculates the coordinates of the feature points in the camera coordinate system. The camera coordinate system is a three-axis orthogonal coordinate system in which the optical axis is the Z axis and the two axes orthogonal to the optical axis are the X axis and the Y axis. The coordinates of a feature point in the camera coordinate system can be expressed by the Z coordinate Zc, the X coordinate Xc, and the Y coordinate Yc in the camera coordinate system. The Z coordinate Zc, the X coordinate Xc, and the Y coordinate Yc can be calculated using equations (2) to (4), respectively.

In equations (2) to (4), B is the baseline length [mm], f is the focal length [mm], and d is the parallax [px]. xp is an arbitrary X coordinate Xi in the parallax image dp, and x' is the X coordinate Xi of the center of the parallax image dp. yp is an arbitrary Y coordinate Yi in the parallax image dp, and y' is the Y coordinate Yi of the center of the parallax image dp.

The coordinates of a feature point in the camera coordinate system are calculated by setting xp to the X coordinate Xi of the feature point in the parallax image dp, yp to the Y coordinate Yi of the feature point, and d to the parallax associated with the coordinates of the feature point.
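Equations (2) to (4) are not reproduced in this text. The sketch below uses the standard pinhole stereo triangulation relations, Zc = B·f/d, Xc = (xp − x')·Zc/f, and Yc = (yp − y')·Zc/f, which are consistent with the symbol definitions above but are an assumption about the exact form of the patent's equations. The focal length is taken in pixels so that the units cancel, and the function name `camera_coordinates` is hypothetical.

```python
def camera_coordinates(xp, yp, d, baseline_mm, focal_px, cx, cy):
    """Triangulate one feature point into camera coordinates [mm].

    xp, yp: pixel position in the parallax image; d: parallax [px];
    cx, cy: image centre (x' and y' in the text). The focal length is
    taken in pixels so that the units cancel, which is an assumption.
    """
    if d <= 0:
        raise ValueError("parallax must be positive")
    zc = baseline_mm * focal_px / d
    xc = (xp - cx) * zc / focal_px
    yc = (yp - cy) * zc / focal_px
    return xc, yc, zc

# Hypothetical values: 640 x 480 image, 100 mm baseline, 500 px focal length.
point = camera_coordinates(420, 240, 12.5, 100, 500, 320, 240)
```

With these hypothetical values, a feature point 100 px to the right of the image center with a parallax of 12.5 px lies 4000 mm along the optical axis and 800 mm to the side.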

Here, the world coordinate system, which is the three-dimensional coordinate system in real space, is defined as a three-axis orthogonal coordinate system in which the Y axis extends in the traveling direction of the forklift 10, the Z axis extends in the vertical direction, and the X axis is orthogonal to the Y axis and the Z axis. The coordinates of a feature point in the world coordinate system can be expressed by the X coordinate Xw, the Y coordinate Yw, and the Z coordinate Zw in the world coordinate system.

The image processing unit 41 performs a world coordinate conversion that converts coordinates in the camera coordinate system into the world coordinate system using equation (5).

Here, H in equation (5) is the installation height [mm] of the stereo camera 31 in the world coordinate system, and θ is the angle formed by the optical axes of the cameras 32 and 33 and the horizontal plane, plus 90°.

Of the world coordinates obtained by the world coordinate conversion, the X coordinate Xw indicates the distance from the forklift 10 to the feature point in the left-right direction of the forklift 10. Left and right here are defined with the direction in which the stereo camera 31 faces taken as the front. The Y coordinate Yw indicates the distance from the forklift 10 to the feature point in the traveling direction of the forklift 10. The Z coordinate Zw indicates the height of the feature point above the road surface. By performing the processing of step S3, the image processing unit 41 functions as the coordinate calculation unit.
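Equation (5) is not reproduced in this text. A conversion of this kind is a rotation about the X axis followed by a height offset, and the sketch below shows one such conversion under explicitly chosen conventions: camera axes Xc to the right, Yc downward, and Zc along the optical axis, with the camera pitched down from horizontal by `pitch_deg`. These conventions and the function name `camera_to_world` are assumptions; in particular, `pitch_deg` is not the patent's θ, which is defined as that angle plus 90°.

```python
import math

def camera_to_world(xc, yc, zc, height_mm, pitch_deg):
    """Convert camera coordinates [mm] to world coordinates [mm].

    Camera axes: Xc right, Yc down, Zc along the optical axis.
    World axes: Xw right, Yw forward along the ground, Zw up.
    pitch_deg is the downward tilt of the optical axis from horizontal,
    which differs from the patent's theta (that angle plus 90 degrees).
    """
    pitch = math.radians(pitch_deg)
    xw = xc
    yw = zc * math.cos(pitch) - yc * math.sin(pitch)
    zw = height_mm - zc * math.sin(pitch) - yc * math.cos(pitch)
    return xw, yw, zw

# Hypothetical check: camera 2000 mm high, pitched down 30 degrees. A point
# 4000 mm along the optical axis should land on the road surface (Zw = 0).
xw, yw, zw = camera_to_world(0.0, 0.0, 4000.0, 2000.0, 30.0)
```

The check point lies where the optical axis meets the road, so its height Zw comes out as approximately zero and its forward distance Yw as roughly 3464 mm.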

Next, as shown in FIGS. 4 and 6, in step S4, the image processing unit 41 divides the XY plane, which is the coordinate plane representing the horizontal plane in the world coordinate system, into a plurality of areas A1 to form a plot area A2. The plot area A2 can be regarded as an image whose lateral direction is the X-axis direction and whose vertical direction is the Y-axis direction, with each area A1 treated as one pixel. The origin O of the plot area A2 is the point at which the Y coordinate Yw is smallest and the X coordinate Xw is at the center of the plot area A2. The origin O corresponds to the position of the forklift 10, or more specifically, the position of the stereo camera 31. The farther a point is from the origin O in the X-axis direction, the farther it is from the forklift 10 in the left-right direction, and the farther it is from the origin O in the Y-axis direction, the farther it is from the forklift 10 in the front-rear direction. The origin O is (Xw: 0, Yw: 0); X coordinates Xw to the left of the origin O take negative values and X coordinates Xw to the right take positive values. That is, the sign of the X coordinate Xw indicates whether a point lies to the left or right of the forklift 10. The plot area A2 is a bird's-eye view of the world coordinate system in which the feature points are plotted. In other words, the plot area A2 is a bird's-eye view of the environment captured by the stereo camera 31.

Each area A1 has the same size and is, for example, a square with sides of 500 [mm]. In the present embodiment, the size of the area A1 is set in consideration of the maximum horizontal dimension that an upright person can have when viewed from above. The maximum horizontal dimension of an upright person is, for example, the person's shoulder width. The average value for adults, for example, can be used as the shoulder width.
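Mapping a world position to its area A1 can be sketched as an integer division by the area size. The sketch assumes that the origin O lies on a boundary between areas, so that columns to the left of O have negative indices; the text does not state how the grid is aligned with O, and the names `CELL_MM` and `area_index` are hypothetical.

```python
import math

CELL_MM = 500  # side of one area A1 [mm], the example value from the text

def area_index(xw_mm, yw_mm, cell_mm=CELL_MM):
    """Return (col, row) of the area A1 containing a world XY position.

    col is negative to the left of the origin O and row grows with the
    distance from the forklift. Aligning O with an area boundary is an
    assumption of this sketch.
    """
    return math.floor(xw_mm / cell_mm), math.floor(yw_mm / cell_mm)
```

For example, a feature point 250 mm to the left of the origin and 1200 mm away from it falls in column -1, row 2.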

Next, in step S5, the image processing unit 41 counts the feature points contained in each area A1. The image processing unit 41 determines from the X coordinate Xw and the Y coordinate Yw of each feature point which area A1 the feature point is located in, and counts the feature points for each area A1. Because the plot area A2 is an XY plane, the Z coordinate Zw, which is the height information of a feature point, is lost from the coordinates of the feature point. When counting, the image processing unit 41 therefore counts the feature points separately for each division defined by a range of feature point height. When a plurality of feature points have the same X coordinate Xw and Y coordinate Yw and differ only in the Z coordinate Zw, they occupy the same coordinates in the plot area A2. In this case, each of the feature points at the same coordinates is counted individually.

The divisions defined by ranges of feature point height are set so that adjacent objects can be distinguished from each other. Preferably, when a person and an object next to that person are present in the imaging range of the stereo camera 31, the divisions are set so that the person and the object can be distinguished as separate objects. In the present embodiment, the feature point height is divided into four divisions: 500 [mm] or more and less than 1000 [mm], 1000 [mm] or more and less than 1500 [mm], 1500 [mm] or more and less than 2000 [mm], and 2000 [mm] or more. In the following description, the range of 500 [mm] or more and less than 1000 [mm] is called the first division, the range of 1000 [mm] or more and less than 1500 [mm] the second division, the range of 1500 [mm] or more and less than 2000 [mm] the third division, and the range of 2000 [mm] or more the fourth division. The first division is the lowest division, that is, the division with the lowest range of feature point height. By performing the processing of steps S4 and S5, the image processing unit 41 functions as the counting unit.
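The per-area, per-division counting can be sketched as follows. The division boundaries are the example values from the text; ignoring feature points below 500 mm is an assumption, since the text does not say how such points are handled, and the names `division_index` and `count_by_division` are hypothetical.

```python
from collections import defaultdict

# Lower bounds [mm] of the first to fourth divisions, from the text.
DIVISION_LOWER_BOUNDS_MM = (500, 1000, 1500, 2000)

def division_index(zw_mm):
    """Return 0..3 for the first to fourth division, or None below 500 mm."""
    index = None
    for i, lower in enumerate(DIVISION_LOWER_BOUNDS_MM):
        if zw_mm >= lower:
            index = i
    return index

def count_by_division(points):
    """Count feature points per area and per height division.

    points is an iterable of (area, zw_mm) pairs, where area is any
    hashable area identifier such as a (col, row) tuple. Points below
    the first division are skipped, which is an assumption.
    """
    counts = defaultdict(lambda: [0] * len(DIVISION_LOWER_BOUNDS_MM))
    for area, zw_mm in points:
        index = division_index(zw_mm)
        if index is not None:
            counts[area][index] += 1
    return dict(counts)

# Hypothetical feature points as (area, height [mm]) pairs.
points = [((0, 2), 600), ((0, 2), 999.9), ((0, 2), 1000),
          ((0, 2), 2500), ((1, 2), 300)]
counts = count_by_division(points)
```

Feature points that share the same area are counted individually, so the area `(0, 2)` in this hypothetical input ends up with two points in the first division, one in the second, and one in the fourth.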

Next, as shown in FIGS. 4 and 7, in step S6, the image processing unit 41 detects the areas A1 in which an object exists. Based on the count in step S5, the image processing unit 41 determines that an object exists in any area A1 in which the total number of feature points exceeds a threshold value. The total number of feature points is the sum of the numbers of feature points counted for the individual divisions. The threshold value is set so that an area A1 containing only the few feature points that arise from causes such as the limited accuracy of stereo processing is not determined to contain an object. That is, the threshold value is set so that an object is not determined to exist where no object exists. FIG. 7 plots the areas A1 determined to contain an object.
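The determination in step S6 reduces to a per-area comparison. The sketch below assumes the per-division counts are held in a dictionary keyed by area index; the function name and the threshold value used in the usage note are hypothetical, since the description does not state a number.

```python
def areas_with_objects(counts, threshold):
    """Return the set of areas whose total feature-point count exceeds the threshold.

    counts: {area_index: [n1, n2, n3, n4]} with one count per height division.
    threshold: hypothetical noise threshold; the description gives no value.
    """
    return {
        area
        for area, per_division in counts.items()
        if sum(per_division) > threshold  # total over all divisions
    }
```

For instance, with a threshold of 5, an area holding counts `[3, 4, 0, 0]` is kept and an area holding `[1, 0, 0, 1]` is treated as stereo-matching noise.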

Since the parallax becomes smaller with increasing distance from the stereo camera 31, the change in the calculated coordinates of a feature point for a parallax change of 1 becomes larger as the Y coordinate Yw increases, that is, as the distance from the origin O increases. As a result, the feature points become more sparsely distributed as the Y coordinate Yw increases, and an area A1 with a larger Y coordinate Yw is less likely to be determined to contain an object. However, an area A1 with a larger Y coordinate Yw is farther from the forklift 10, and the need to detect an object in an area A1 far from the forklift 10 is lower than in an area A1 close to the forklift 10. Further, since the object detection process is performed repeatedly, an object that was not detected because of its distance comes to be detected once the forklift 10 advances and approaches it. Therefore, even though an object is less likely to be determined to exist in an area A1 with a larger Y coordinate Yw, this is not considered to cause any practical problem. By performing the processing of step S6, the image processing unit 41 functions as an object determination unit.
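The effect of distance on point density can be illustrated with the standard stereo relation, depth = focal length × baseline / parallax. The sketch below is an illustration only: the focal length and baseline are hypothetical values, and camera tilt is ignored, so the computed depth only approximates the Y coordinate Yw.

```python
def depth_mm(disparity_px, focal_px=700.0, baseline_mm=200.0):
    """Depth from parallax using the standard pinhole stereo relation.

    focal_px and baseline_mm are hypothetical camera parameters.
    """
    return focal_px * baseline_mm / disparity_px


def depth_step_mm(disparity_px, focal_px=700.0, baseline_mm=200.0):
    """Change in depth when the parallax decreases by 1 pixel."""
    return (depth_mm(disparity_px - 1, focal_px, baseline_mm)
            - depth_mm(disparity_px, focal_px, baseline_mm))
```

Because depth is inversely proportional to parallax, `depth_step_mm` grows as the parallax shrinks, which is why distant areas A1 receive fewer feature points.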

Next, in step S7, the image processing unit 41 sets the height of the object existing in each area A1. The height of the object is one of the divisions representing the height ranges of the feature points. When feature points exist in a plurality of divisions including the first division, which is the lowest division, and feature points exist in divisions continuing from the first division, the image processing unit 41 sets, as the height of the object, the highest of the divisions in which feature points exist continuously from the first division. For example, in an area A1 in which feature points exist in each of the first, second, and fourth divisions and no feature points exist in the third division, the divisions in which feature points exist continuously from the first division extend only to the second division, so the height of the object is the second division. A division in which feature points exist is a division in which feature points were counted in step S5. The image processing unit 41 may set a count threshold for the number of counted feature points and determine that feature points exist in a division whose count exceeds the count threshold. That is, it may determine that no feature points exist in a division whose count does not reach the count threshold.

When feature points exist in a plurality of divisions including the first division but no feature points exist in the division adjoining the first division, the image processing unit 41 sets the first division as the height of the object. For example, when feature points exist in the first and third divisions and no feature points exist in the second division, the first division is set as the height of the object.

When feature points exist in a plurality of divisions excluding the first division, the image processing unit 41 sets the highest of those divisions as the height of the object. For example, when feature points exist in the second and third divisions and not in the first and fourth divisions, the third division is set as the height of the object. When feature points exist in only one division, the image processing unit 41 sets that division as the height of the object. For example, when feature points exist only in the second division, the second division is set as the height of the object. In FIG. 7, the height of the object is represented by the shading of each area A1 in which an object exists.

When an object is located on the road surface, its feature points exist only in the first division or in divisions continuing from the first division. Therefore, in an area A1 in which feature points exist in the first division and also in a division above a division without feature points, it can be inferred that an object floating above the road surface, such as a flying object, is present. In this case, the object floating above the road surface is regarded as not existing, and only the object below it is recognized as an object. Accordingly, the highest division among the divisions in which feature points exist may be determined by considering only the feature points produced by an object located on the road surface. By performing the processing of step S7, the image processing unit 41 functions as an object height setting unit.
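The height-setting rules of step S7 can be condensed into one function. This is a sketch under the assumption that the presence of feature points in each division has already been decided (for example by the optional count threshold); the name `object_height` is hypothetical.

```python
def object_height(present):
    """Return the division number used as the object height, or None.

    present: sequence of four booleans; present[0] is the first (lowest) division.
    """
    if not any(present):
        return None  # no feature points in any division
    if present[0]:
        # Object stands on the road surface: follow the unbroken run
        # upward from the first division and ignore anything above a gap.
        height = 1
        for has_points in present[1:]:
            if not has_points:
                break
            height += 1
        return height
    # First division empty: take the highest division with feature points.
    return max(i + 1 for i, has_points in enumerate(present) if has_points)
```

With feature points in the first, second, and fourth divisions the function returns 2, matching the example in the text, because the fourth division is attributed to an object floating above the road surface.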

Next, as shown in FIGS. 4 and 8, in step S8, the image processing unit 41 performs a labeling process that assigns a label to each area A1 determined to contain an object. A label is a unique identifier, and in the present embodiment the numbers 1 to 4 are used as labels. The image processing unit 41 assigns a label according to the height of the object existing in the area A1. The first division corresponds to number 1, the second division to number 2, the third division to number 3, and the fourth division to number 4. In FIG. 8, number 0 is assigned to each area A1 determined not to contain an object.

Next, as shown in FIGS. 4 and 9, in step S9, the image processing unit 41 identifies identical objects among the objects detected in the individual areas A1. The image processing unit 41 determines that objects of the same height extending across a plurality of areas A1 are the same object. More specifically, the image processing unit 41 determines that objects of the same height existing in mutually adjacent areas A1 are the same object. Mutually adjacent areas A1 include, in addition to the four areas A1 adjoining in the Y-axis and X-axis directions, the four diagonally adjoining areas A1. That is, each area A1 and the eight areas A1 surrounding it are mutually adjacent areas A1. The image processing unit 41 performs a labeling process that assigns a label to the areas A1 determined to contain the same object. That is, the labels assigned per height in step S8 are replaced with labels assigned per identical object. A label is a unique identifier, and in the present embodiment it is a number assigned to each identical object. In the example shown in FIG. 9, the numbers 1 to 15 are assigned, one to each identical object, and the image processing unit 41 determines that 15 objects exist in the plot area A2. By performing the processing of step S9, the image processing unit 41 functions as an identical object determination unit. In FIGS. 8 and 9, for convenience of explanation, column numbers 0 to 27 and row numbers 0 to 27 are attached to the plot area A2. The column and row numbers take as 0 the area A1 that is farthest from the origin O in the Y-axis direction and farthest to the left of the origin O in the X-axis direction.
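Step S9 amounts to connected-component labeling with 8-neighbour adjacency, restricted to areas of equal height. The sketch below uses a breadth-first flood fill, which is one common way to do this and not necessarily the method used in the embodiment; `label_objects` and the grid layout are hypothetical.

```python
from collections import deque


def label_objects(height_grid):
    """Assign one label per group of 8-connected areas with the same height.

    height_grid: list of rows; each entry is the step S8 label (0 = no object).
    Returns (label_grid, number_of_objects).
    """
    rows, cols = len(height_grid), len(height_grid[0])
    labels = [[0] * cols for _ in range(rows)]
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if height_grid[r][c] == 0 or labels[r][c] != 0:
                continue  # empty area, or already part of a labeled object
            next_label += 1
            labels[r][c] = next_label
            queue = deque([(r, c)])
            while queue:
                cr, cc = queue.popleft()
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nr, nc = cr + dr, cc + dc
                        if (0 <= nr < rows and 0 <= nc < cols
                                and labels[nr][nc] == 0
                                and height_grid[nr][nc] == height_grid[r][c]):
                            labels[nr][nc] = next_label
                            queue.append((nr, nc))
    return labels, next_label
```

Diagonally touching areas of equal height receive the same label, while neighbouring areas of different height are kept apart, which is what lets a person standing next to a taller object be separated from it.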

Next, in step S10, the image processing unit 41 calculates, for each identical object, the size of the areas A1 occupied by the object. That is, the size of the occupied areas A1 is calculated for each of the 15 objects distinguished in step S9. Hereinafter, the areas A1 occupied by one identical object are referred to as an area block A3. If the object extends across a plurality of areas A1, the area block A3 consists of a plurality of areas A1; otherwise, the area block A3 consists of a single area A1.

The image processing unit 41 calculates the size of the area block A3 from the maximum number of areas A1 in the X-axis direction and the maximum number of areas A1 in the Y-axis direction of the area block A3. As an example, the calculation of the size of the area block A3 assigned number 8 in FIG. 9 is described. The area block A3 has a maximum row number of 11 and a minimum row number of 9, so the maximum number of areas A1 in the X-axis direction is 3. The area block A3 has a maximum column number of 24 and a minimum column number of 22, so the maximum number of areas A1 in the Y-axis direction is 3. Therefore, the size of the area block A3 can be calculated as a 3 × 3 region A4 in the plot area A2. By performing the processing of step S10, the image processing unit 41 functions as a calculation unit.
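The size calculation in step S10 is a bounding-extent computation over the areas of one area block. In the sketch below, pairing row numbers with the X-axis count and column numbers with the Y-axis count simply follows the worked example in the text; the function name and the cell representation are hypothetical.

```python
def block_size(cells):
    """Return (areas along X, areas along Y) for one area block A3.

    cells: iterable of (row_number, column_number) pairs belonging to the block.
    Row numbers give the X-axis count and column numbers the Y-axis count,
    following the worked example for block number 8.
    """
    cells = list(cells)
    rows = [row for row, _ in cells]
    cols = [col for _, col in cells]
    n_x = max(rows) - min(rows) + 1
    n_y = max(cols) - min(cols) + 1
    return n_x, n_y
```

The block need not fill its bounding region: a block touching rows 9 to 11 and columns 22 to 24 is reported as 3 × 3 even if some areas inside that region are empty.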

Next, in step S11, the image processing unit 41 determines whether each of the objects numbered 1 to 15 is a human candidate. The image processing unit 41 determines that an object existing in an area block A3 whose size is equal to or less than a predetermined value is a human candidate. The predetermined value is set, for example, so that an upright person can be extracted as a human candidate. When an upright person is viewed from above, the number of areas A1 occupied by the person is estimated to be 1 or 2. In the present embodiment, with a margin added, the predetermined value is set to number of areas in the X-axis direction × number of areas in the Y-axis direction = 3 × 3. In other words, the image processing unit 41 determines that an object measuring 1500 [mm] or less in the X-axis direction and 1500 [mm] or less in the Y-axis direction is a human candidate. The area block A3 with number 8 can therefore be said to contain a human candidate.

The predetermined value may be changed depending on the posture of the person to be detected. For example, comparing the same person standing upright with that person seated or with arms spread, the horizontal dimensions are larger when the person is seated or has arms spread. Accordingly, the predetermined value may be made larger when detecting a seated person or a person with arms spread than when detecting only an upright person. When the predetermined value is increased, people in various postures can be detected, but objects other than people are also more likely to be detected as human candidates. The predetermined value may therefore be set in consideration of the processing load of the image processing unit 41. By performing the processing of step S11, the image processing unit 41 functions as a human candidate determination unit.
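The human-candidate test of step S11 can be sketched as a comparison against a configurable limit. The default of 3 × 3 areas is the value given for the embodiment; the function name and the idea of passing larger limits for other postures are illustrative assumptions.

```python
def is_human_candidate(n_x, n_y, max_x=3, max_y=3):
    """Return True when the area block fits within the predetermined size.

    n_x, n_y: number of areas A1 along the X and Y axes of the area block A3.
    max_x, max_y: predetermined value; 3 x 3 is the embodiment's setting, and
    larger limits (hypothetical) would admit seated or arms-spread postures.
    """
    return n_x <= max_x and n_y <= max_y
```

Raising the limits trades recall for workload: more postures pass the test, but more non-human objects are also handed to the later person detection process.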

Next, in step S12, the image processing unit 41 calculates the coordinates of the human candidate on the image captured by the stereo camera 31. In the present embodiment, the coordinates of the human candidate on the first image I1 are calculated. First, the image processing unit 41 calculates the coordinates of the human candidate in the world coordinate system from the positional relationship between the origin O and the area block A3. As the coordinates of the human candidate in the world coordinate system, for example, the X coordinate Xw is taken as the center coordinate of the area block A3, and the Y coordinate Yw is taken as the coordinate of the area block A3 closest to the origin O. The image processing unit 41 takes, as the X coordinate Xw of the human candidate, half of the sum of the distance in the X-axis direction from the origin O to the nearest area A1 in the area block A3 and the distance in the X-axis direction from the origin O to the farthest area A1 in the area block A3. The image processing unit 41 takes the shortest distance in the Y-axis direction from the origin O to the area block A3 as the Y coordinate Yw of the human candidate. As an example, the calculation of the coordinates of the human candidate in the area block A3 assigned number 8 in FIG. 9 is described. The distance in the X-axis direction from the origin O to the nearest area A1 in the area block A3 is 1500 [mm], the distance of three areas. The distance in the X-axis direction from the origin O to the farthest area A1 in the area block A3 is 2500 [mm], the distance of five areas. Therefore, the X coordinate Xw of the human candidate in the area block A3 can be calculated from the following equation (6).

The shortest distance in the Y-axis direction from the origin O to the area block A3 is 1500 [mm], the distance of three areas. Therefore, the Y coordinate Yw of the human candidate is 1500 [mm]. The Z coordinate Zw of the human candidate is treated as 0, because the feet of the person, that is, the ground, are specified. The world coordinates of the human candidate in the area block A3 with number 8 can thus be calculated as (Xw: -2000, Yw: 1500, Zw: 0).
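The world-coordinate calculation for the human candidate can be sketched as below. Equation (6) is not reproduced in this text, so the function follows only the verbal description: Xw is the mean of the nearest and farthest X distances, Yw is the shortest Y distance, and Zw is 0. Passing the X distances as signed values (negative here) is an assumption made so that the result matches the stated Xw of -2000; the function name is hypothetical.

```python
def candidate_world_coordinates(x_near_mm, x_far_mm, y_near_mm):
    """Return (Xw, Yw, Zw) of a human candidate in millimetres.

    x_near_mm, x_far_mm: signed X positions of the nearest and farthest
        areas A1 of the area block A3 relative to the origin O (assumed signed).
    y_near_mm: shortest Y distance from the origin O to the area block A3.
    """
    xw = (x_near_mm + x_far_mm) / 2  # midpoint along the X axis
    zw = 0.0  # the feet of the person, i.e. the ground
    return (xw, y_near_mm, zw)
```

Calling `candidate_world_coordinates(-1500, -2500, 1500)` reproduces the values given for block number 8, and using -1000 instead of -1500 for the near side gives the -1750 variant, which measures to the near edge of the block.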

Next, the image processing unit 41 converts the world coordinates of the human candidate into camera coordinates. The conversion from world coordinates to camera coordinates can be performed using the following equation (7).

By substituting the world coordinates of the human candidate for the X coordinate Xw, Y coordinate Yw, and Z coordinate Zw in equation (7), the camera coordinates of the human candidate can be calculated.

Next, the image processing unit 41 calculates the coordinates of the human candidate on the first image I1 from the camera coordinates using the following equations (8) and (9).

By substituting the camera coordinates of the human candidate for the X coordinate Xc, Y coordinate Yc, and Z coordinate Zc in equations (8) and (9), the coordinates of the human candidate on the first image I1 can be calculated. More specifically, by using equations (7) and (8), the center coordinates of the feet of the human candidate on the first image I1 can be calculated. By performing the processing of step S12, the image processing unit 41 functions as a human candidate coordinate calculation unit.

Next, as shown in FIGS. 3 and 4, in step S13, the image processing unit 41 performs a person detection process on the coordinates P1 in the first image I1 at which a human candidate has been determined to exist. More specifically, since the coordinates P1 are the coordinates of the feet of the human candidate, the person detection process is performed on a region A5 that extends from the foot coordinates P1 in the direction in which the Y coordinate Yi increases. The person detection process is performed by a feature extraction method that extracts features from the first image I1, for example HOG (Histograms of Oriented Gradients) or SIFT (Scale Invariant Feature Transform). The image processing unit 41 can thereby determine whether the human candidate detected in step S11 is a person or an object other than a person. Since the position of the human candidate relative to the forklift 10 has been calculated in step S12, the image processing unit 41 can grasp the positional relationship between the forklift 10 and the person. By performing the processing of step S13, the image processing unit 41 functions as a person determination unit.

The operation of the present embodiment will be described.
When an object appears in an image captured by the stereo camera 31, the edges and similar parts of the object become feature points, and the parallax is calculated. When the feature points are counted for each of the areas A1 into which the XY plane of the world coordinate system is divided, the number of feature points is larger in an area A1 in which an object exists. By counting the feature points for each area A1 and comparing the count with the threshold value, the areas A1 in which an object exists can be extracted. When a person is viewed from above, the size of the areas A1 occupied by the person can be estimated. Therefore, whether an object is a human candidate can be determined from the size of the areas A1 occupied by the same object.

When the detected object is a human candidate, the image processing unit 41 determines whether the human candidate is a person. When the detected object is not a human candidate, the image processing unit 41 does not determine whether it is a person. That is, the processing performed by the image processing unit 41 differs depending on whether the object is a human candidate. By first extracting the objects that are human candidates and then determining whether each of those objects is a person, the time required to detect a person is shortened compared with performing the person detection process on the entire first image I1.

In the forklift 10, when the detected object is a person, processing different from that for an object other than a person may be performed. For example, in the case of a forklift 10 operated by an occupant, when a person is detected by the object detection device 30, the main controller 20 notifies the occupant that a person is nearby. The notification is given using, for example, a display device that notifies by display or a buzzer that notifies by sound. The main controller 20 may also give a notification to people around the forklift 10 so that they recognize that the forklift 10 is nearby. In the case of a forklift 10 that operates automatically, the travel route or the vehicle speed may be changed depending on whether the object is a person or an object other than a person. For example, when avoiding an object, the main controller 20 sets a larger avoidance distance, or a lower vehicle speed when passing nearby, when the object is a person than when the object is not a person.

The effects of the present embodiment will be described.
(1) The image processing unit 41 detects objects from the plot area A2, which is a bird's-eye view of the scene captured by the stereo camera 31. The image processing unit 41 can then determine whether an object is a human candidate from the size of the areas A1 occupied by the object.

(2) The image processing unit 41 performs the person detection process at the coordinates on the first image I1 at which a human candidate exists. Since the person detection process is performed only on objects determined to be human candidates, the time required to detect a person can be shortened compared with performing the person detection process on the entire first image I1.

(3) When feature points exist in a plurality of divisions including the first division, and feature points exist in divisions continuing from the first division, the image processing unit 41 sets, as the height of the object, the highest of the divisions in which feature points exist continuously from the first division. When feature points exist in a plurality of divisions including the first division but no feature points exist in the division adjoining the first division, the image processing unit 41 sets the first division as the height of the object. This makes it possible to detect objects while excluding floating objects.

The embodiment can be modified and implemented as follows. The embodiment and the following modifications can be implemented in combination with each other within a technically consistent range.
○ The image processing unit 41 may set, as the height of the object, the highest of the divisions in which feature points exist, regardless of whether a division without feature points lies between divisions in which feature points exist. That is, when an object floating above the road surface is presumed to exist, the image processing unit 41 may set the height of the object taking into account the feature points produced by the floating object.

○ When feature points exist in a plurality of divisions excluding the first division, and a division without feature points lies between the divisions in which feature points exist, the image processing unit 41 may set the lower of those divisions as the height of the object. For example, when feature points exist in the second and fourth divisions and no feature points exist in the third division, the second division may be set as the height of the object. In that situation, two objects facing each other vertically can be said to exist in the same area A1. In this case, the highest division among the divisions in which feature points exist is determined by considering only the feature points produced by the lower of the two vertically facing objects.

○ The image processing unit 41 need not determine whether a human candidate is a person. Even in this case, control similar to that performed when the object is a person can be carried out. For example, when an object that is a human candidate is detected, a notification may be given to the occupant or to the human candidate. The travel route or the vehicle speed may also be changed depending on whether the object is a human candidate.

○ The height of the feature points may be divided into four or more divisions. For example, in addition to the divisions of the embodiment, a division of 0 [mm] or more and less than 500 [mm] may be added as a fifth division.
○ When calculating the coordinates of the human candidate in the world coordinate system, the image processing unit 41 may take any coordinate of the area block A3 as the X coordinate Xw and any coordinate of the area block A3 as the Y coordinate Yw. For example, the image processing unit 41 may take the coordinate of the area block A3 closest to the origin O as the X coordinate Xw and the center coordinate of the area block A3 as the Y coordinate Yw. As long as the image processing unit 41 knows which position in the area block A3 is used as the coordinates of the human candidate in the first image I1, it can determine in which direction of the first image I1 the region A5 containing the human candidate extends. Therefore, the same effects as in the embodiment are obtained even when arbitrary coordinates of the area block A3 are used as the X coordinate Xw and the Y coordinate Yw in calculating the position of the human candidate in the world coordinate system.

○画像処理部４１は、ステップＳ１２において、原点ＯからエリアブロックＡ３までのＸ軸方向への最短距離と、原点ＯからエリアブロックＡ３におけるＸ軸方向に対して原点Ｏから最も離間した部分までのＸ軸方向への距離と、を加算した値を二分した距離を人候補のＸ座標Ｘｗとしてもよい。例えば、図９で番号８が付与されたエリアブロックＡ３に存在する人候補のＸ座標Ｘｗを算出する場合、原点ＯからエリアブロックＡ３までのＸ軸方向への最短距離は、２エリア分の距離である１０００［ｍｍ］になる。原点ＯからエリアブロックＡ３におけるＸ軸方向に対して原点Ｏから最も離間した部分までのＸ軸方向への距離は、５エリア分の距離である２５００［ｍｍ］になる。すると、エリアブロックＡ３に存在する人候補のＸ座標Ｘｗは、－１７５０になる。なお、Ｘ座標ＸｗをエリアブロックＡ３の中心座標とする場合、中心座標とは、エリアブロックＡ３のＸ軸方向に対するエリアＡ１のうち中央のエリアＡ１の範囲内の座標であればよい。 ○ In step S12, the image processing unit 41 may use, as the X coordinate Xw of the person candidate, half of the sum of the shortest distance in the X-axis direction from the origin O to the area block A3 and the distance in the X-axis direction from the origin O to the part of the area block A3 farthest from the origin O in the X-axis direction. For example, when calculating the X coordinate Xw of a person candidate in the area block A3 numbered 8 in FIG. 9, the shortest distance in the X-axis direction from the origin O to the area block A3 is 1000 [mm], the width of two areas. The distance in the X-axis direction from the origin O to the part of the area block A3 farthest from the origin O is 2500 [mm], the width of five areas. The X coordinate Xw of the person candidate in the area block A3 is then -1750. When the center coordinate of the area block A3 is used as the X coordinate Xw, the center coordinate may be any coordinate within the central area A1 among the areas A1 that the area block A3 spans in the X-axis direction.
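
The arithmetic of this example can be written out as follows. The symbols d_near and d_far are illustrative names for the two distances and do not appear in the original text. The negative sign is taken from the stated result; reading it as the area block lying on the negative side of the X axis is an assumption.

```latex
\[
|X_w| = \frac{d_{\mathrm{near}} + d_{\mathrm{far}}}{2}
      = \frac{1000 + 2500}{2} = 1750\ \mathrm{[mm]},
\qquad X_w = -1750
\]
```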

○画像処理部４１は、人候補のワールド座標系での座標を算出する際に、領域Ａ４の複数箇所の座標を算出してもよい。例えば、領域Ａ４の対角となる２つの角の座標を算出してもよい。この場合、第１画像Ｉ１の領域Ａ５の大きさを適切に算出することができる。 ○ When calculating the coordinates of the person candidate in the world coordinate system, the image processing unit 41 may calculate the coordinates of a plurality of locations in the region A4. For example, the coordinates of two diagonally opposite corners of the region A4 may be calculated. In this case, the size of the region A5 in the first image I1 can be calculated appropriately.

○画像処理部４１は、エリアＡ１に物体が存在しているか否かを判定する際の閾値をプロットエリアのＹ座標Ｙｗに応じて変更してもよい。画像処理部４１は、Ｙ軸方向に対して原点Ｏから離れたエリアＡ１ほど、閾値を低くする。これにより、Ｙ軸方向に対して原点Ｏから離れたエリアＡ１ほど、特徴点の総和が低くても物体が存在していると判定されることになる。実施形態に記載したように、Ｙ座標Ｙｗが大きくなるほど特徴点は離散的になる。このため、Ｙ軸方向に対して原点Ｏから離れたエリアＡ１ほど、閾値を低くすることで、特徴点が離散しやすいエリアＡ１でも物体の検出を行うことができる。即ち、フォークリフト１０から離れている物体であっても検出が可能になる。 ○ The image processing unit 41 may change the threshold value for determining whether or not an object exists in the area A1 according to the Y coordinate Yw of the plot area. The image processing unit 41 lowers the threshold value as the area A1 is farther from the origin O in the Y-axis direction. As a result, it is determined that the area A1 farther from the origin O in the Y-axis direction has an object even if the sum of the feature points is lower. As described in the embodiment, the larger the Y coordinate Yw, the more discrete the feature points. Therefore, by lowering the threshold value as the area A1 is farther from the origin O in the Y-axis direction, the object can be detected even in the area A1 where the feature points are likely to be discrete. That is, even an object far from the forklift 10 can be detected.
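
A distance-dependent threshold of this kind might be sketched as below. The linear fall-off, the function names, and all numeric values (near_threshold, far_threshold, max_range_mm) are assumptions chosen for illustration; the text only states that the threshold is lowered for areas farther from the origin O in the Y-axis direction.

```python
def area_threshold(y_w_mm, near_threshold=20.0, far_threshold=5.0,
                   max_range_mm=10000.0):
    """Return the feature-point threshold for an area at Y coordinate y_w_mm.

    The threshold falls linearly from near_threshold at the origin to
    far_threshold at max_range_mm (all values are hypothetical).
    """
    ratio = min(max(y_w_mm / max_range_mm, 0.0), 1.0)
    return near_threshold - (near_threshold - far_threshold) * ratio


def object_present(total_feature_points, y_w_mm):
    """An object is judged present when the total exceeds the threshold."""
    return total_feature_points > area_threshold(y_w_mm)
```

With these placeholder values, an area holding 10 feature points is judged to contain an object at Yw = 8000 [mm] but not at Yw = 1000 [mm].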

○画像処理部４１は、特徴点の計数を行う際に、Ｘ座標Ｘｗ及びＹ座標Ｙｗが同一であり、Ｚ座標Ｚｗのみが異なる複数の特徴点が存在する場合、Ｚ座標Ｚｗが最も大きい特徴点のみを計数してもよい。 ○ When counting the feature points, if there are a plurality of feature points that have the same X coordinate Xw and Y coordinate Yw and differ only in the Z coordinate Zw, the image processing unit 41 may count only the feature point with the largest Z coordinate Zw.
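
One way to apply this rule before counting is sketched below. The function name and the representation of a feature point as an (Xw, Yw, Zw) tuple are assumptions for illustration.

```python
def keep_highest_per_column(points):
    """Keep, for each (Xw, Yw) pair, only the point with the largest Zw.

    points: iterable of (x_w, y_w, z_w) tuples (hypothetical representation).
    """
    highest = {}
    for x, y, z in points:
        key = (x, y)
        if key not in highest or z > highest[key]:
            highest[key] = z
    return [(x, y, z) for (x, y), z in highest.items()]
```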

○物体が人か否かの判定は、教師有り学習モデルによる機械学習を行った人判定部に対して第１画像Ｉ１を入力することで行われてもよい。詳細にいえば、学習済みの人判定部に対して、第１画像Ｉ１のうち物体が写っている領域Ａ５を入力することで、物体が人か否かを判定させてもよい。人判定部としては、例えば、サポートベクタマシン、ニューラルネットワーク、ナイーブベイズ、決定木等の教師有り学習器を採用することが可能である。機械学習に用いる教師データとしては、画像から抽出された人の形状要素や、外観要素などの画像固有成分が用いられる。形状要素として、例えば、人の大きさや輪郭などが挙げられる。外観要素としては、例えば、光源情報、テクスチャ情報、カメラ情報などが挙げられる。光源情報には、反射率や、陰影等に関する情報が含まれる。テクスチャ情報には、カラー情報等が含まれる。カメラ情報には、画質、解像度、画角等に関する情報が含まれる。 ○ Judgment as to whether or not the object is a human may be performed by inputting the first image I1 to the human determination unit that has performed machine learning by the supervised learning model. More specifically, the trained person determination unit may be made to determine whether or not the object is a person by inputting the area A5 in the first image I1 in which the object is captured. As the human determination unit, for example, a supervised learning device such as a support vector machine, a neural network, naive Bayes, or a decision tree can be adopted. As the teacher data used for machine learning, image-specific components such as human shape elements and appearance elements extracted from the image are used. Examples of the shape element include the size and contour of a person. Examples of the appearance element include light source information, texture information, camera information, and the like. The light source information includes information on reflectance, shading, and the like. The texture information includes color information and the like. The camera information includes information on image quality, resolution, angle of view, and the like.

○三次元座標系は、直交座標系に限られず、極座標系としてもよい。この場合、極座標系のうち水平面を表す座表面は、座標面の原点を中心として扇状のエリアＡ１に分割される。 ○ The three-dimensional coordinate system is not limited to an orthogonal coordinate system and may be a polar coordinate system. In this case, the coordinate plane representing the horizontal plane in the polar coordinate system is divided into fan-shaped areas A1 centered on the origin of the coordinate plane.
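
Assigning a feature point to a fan-shaped area could look like the sketch below. The additional division into radial rings, the step sizes, and the function name are assumptions; the text only states that the plane is divided into fan-shaped areas centered on the origin.

```python
import math


def polar_area_index(x_w, y_w, radial_step_mm=500.0, angular_step_deg=10.0):
    """Return (ring index, sector index) of the fan-shaped area holding a point.

    The ring index grows with distance from the origin; the sector index is
    the angle from the X axis divided by angular_step_deg (hypothetical steps).
    """
    radius = math.hypot(x_w, y_w)
    angle_deg = math.degrees(math.atan2(y_w, x_w))
    return int(radius // radial_step_mm), math.floor(angle_deg / angular_step_deg)
```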

○ステレオカメラ３１によって撮像された画像のうち第２画像から人を検出するようにしてもよい。この場合、画像処理部４１は、プロットエリアＡ２の原点ＯとエリアブロックＡ３との位置関係から第２画像上での人候補の座標を算出するが、第２画像は比較画像であるため、基線長に応じたずれが生じる。このため、画像処理部４１は、基線長に応じて第２画像上での人候補の座標を補正し、補正した座標に対して人検出処理を行う。 ○ A person may be detected from the second image among the images captured by the stereo camera 31. In this case, the image processing unit 41 calculates the coordinates of the person candidate on the second image from the positional relationship between the origin O of the plot area A2 and the area block A3; however, because the second image is the comparison image, a shift corresponding to the baseline length occurs. The image processing unit 41 therefore corrects the coordinates of the person candidate on the second image according to the baseline length and performs the person detection process on the corrected coordinates.

○画像処理部４１は、視差画像ｄｐから路面の視差を除去する際に、ハフ変換などの直線抽出法を用いて路面を検出してもよい。路面は、平坦であるため、路面の視差はステレオカメラ３１から離れるにつれて徐々に小さくなっていく。従って、視差画像ｄｐのＹ座標Ｙｉと、視差との二次元座標系に視差をプロットすると、路面の視差が直線状になって現れる。画像処理部４１は、この直線の視差を路面の視差として除去する。 ○ The image processing unit 41 may detect the road surface by using a linear extraction method such as a Hough transform when removing the parallax of the road surface from the parallax image dp. Since the road surface is flat, the parallax of the road surface gradually decreases as the distance from the stereo camera 31 increases. Therefore, when the parallax is plotted in the two-dimensional coordinate system of the Y coordinate Yi of the parallax image dp and the parallax, the parallax of the road surface appears as a straight line. The image processing unit 41 removes the parallax of this straight line as the parallax of the road surface.
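
A minimal Hough-style sketch of this road removal is given below, assuming the parallax values have already been collected as (Yi, parallax) pairs. The function names, the 1-degree angle resolution, the bin width, and the tolerance are assumptions for illustration, and the dominant line is simply taken to be the road surface.

```python
import math
from collections import Counter


def hough_dominant_line(points, theta_steps=180, rho_step=1.0):
    """Vote in (theta, rho) space and return the line with the most votes.

    A line is parameterised as rho = y * cos(theta) + d * sin(theta),
    where y is the image Y coordinate Yi and d is the parallax.
    """
    votes = Counter()
    for y, d in points:
        for k in range(theta_steps):
            theta = math.pi * k / theta_steps
            rho = y * math.cos(theta) + d * math.sin(theta)
            votes[(k, round(rho / rho_step))] += 1
    (k, rho_bin), _ = votes.most_common(1)[0]
    return math.pi * k / theta_steps, rho_bin * rho_step


def remove_road(points, tolerance=1.5):
    """Drop the points lying within tolerance of the dominant line."""
    theta, rho = hough_dominant_line(points)
    return [(y, d) for y, d in points
            if abs(y * math.cos(theta) + d * math.sin(theta) - rho) > tolerance]
```

In this sketch the points that remain after `remove_road` are those off the straight line, that is, parallax values that do not belong to the road surface.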

○画像処理部４１は、路面の視差を除去しなくてもよい。路面は、高さが低いため、視差画像ｄｐから路面の視差を除去しない場合であっても、第１区分には含まれず、実施形態と同様の効果が得られる。 ○ The image processing unit 41 does not have to remove the parallax on the road surface. Since the height of the road surface is low, even if the parallax of the road surface is not removed from the parallax image dp, it is not included in the first category, and the same effect as that of the embodiment can be obtained.

○視差画像取得部、座標算出部、計数部、物体判定部、物体高設定部、同一物体判定部、算出部、人候補判定部、人候補座標算出部及び人判定部は、それぞれ、個別の制御装置によって構成されていてもよい。 ○ The parallax image acquisition unit, coordinate calculation unit, counting unit, object determination unit, object height setting unit, same object determination unit, calculation unit, person candidate determination unit, person candidate coordinate calculation unit, and person determination unit may each be configured by a separate control device.

○カメラ座標からワールド座標への変換はテーブルデータによって行われてもよい。テーブルデータは、Ｙ座標ＹｃとＺ座標Ｚｃの組み合わせにＹ座標Ｙｗを対応させたテーブルデータと、Ｙ座標ＹｃとＺ座標Ｚｃとの組み合わせにＺ座標Ｚｗを対応させたテーブルデータである。これらのテーブルデータを画像処理部４１のＲＯＭなどに記憶しておくことで、カメラ座標系におけるＹ座標ＹｃとＺ座標Ｚｃから、ワールド座標系におけるＹ座標Ｙｗ及びＺ座標Ｚｗを求めることができる。同様に、ワールド座標からカメラ座標への変換についてもテーブルデータによって行われてもよい。 ○ Conversion from camera coordinates to world coordinates may be performed by table data. The table data is table data in which the combination of the Y coordinate Yc and the Z coordinate Zc corresponds to the Y coordinate Yw, and the table data in which the combination of the Y coordinate Yc and the Z coordinate Zc corresponds to the Z coordinate Zw. By storing these table data in the ROM of the image processing unit 41 or the like, the Y coordinate Yw and the Z coordinate Zw in the world coordinate system can be obtained from the Y coordinate Yc and the Z coordinate Zc in the camera coordinate system. Similarly, the conversion from world coordinates to camera coordinates may be performed by table data.
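
The table-based conversion could be sketched as follows. The function names are hypothetical, and `placeholder_transform` (an untilted camera at an assumed mounting height) merely stands in for the conversion formula of the embodiment, which is not reproduced here. The point of the sketch is only that the two tables are filled once and then read by key.

```python
def build_tables(transform, yc_values, zc_values):
    """Precompute Yw and Zw for every (Yc, Zc) combination.

    transform: function (yc, zc) -> (yw, zw); supplied by the caller.
    Returns two dicts keyed by (yc, zc), mirroring the two tables in the text.
    """
    yw_table, zw_table = {}, {}
    for yc in yc_values:
        for zc in zc_values:
            yw, zw = transform(yc, zc)
            yw_table[(yc, zc)] = yw
            zw_table[(yc, zc)] = zw
    return yw_table, zw_table


def placeholder_transform(yc, zc, camera_height_mm=2000):
    """Placeholder only: untilted camera, Yc pointing down, Zc pointing forward."""
    return zc, camera_height_mm - yc


yw_table, zw_table = build_tables(placeholder_transform,
                                  range(0, 3000, 500), range(0, 5000, 500))
```

At run time the conversion is then a lookup such as `yw_table[(500, 1500)]` and `zw_table[(500, 1500)]` instead of a calculation.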

○第１カメラ３２と第２カメラ３３は、鉛直方向に並んで配置されていてもよい。
○第１画像Ｉ１の画素数と第２画像の画素数とは異なっていてもよい。例えば、比較画像である第２画像の画素数を視差画像の画素数と同一とし、基準画像である第１画像Ｉ１の画素数を第２画像の画素数よりも多くしてもよい。 ○ The first camera 32 and the second camera 33 may be arranged side by side in the vertical direction.
○ The number of pixels of the first image I1 and the number of pixels of the second image may be different. For example, the number of pixels of the second image, which is the comparison image, may be the same as the number of pixels of the parallax image, and the number of pixels of the first image I1, which is the reference image, may be larger than the number of pixels of the second image.

○ステレオカメラ３１は、３つ以上のカメラを備えていてもよい。
○フォークリフト１０は、エンジンの駆動によって走行するものでもよい。この場合、走行制御装置は、エンジンへの燃料噴射量などを制御する装置となる。 ○ The stereo camera 31 may include three or more cameras.
○ The forklift 10 may be driven by an engine. In this case, the travel control device is a device that controls the fuel injection amount to the engine and the like.

○物体検出装置３０は、建設機械、自動搬送車、トラックなどフォークリフト１０以外の産業車両や乗用車などの移動体に搭載されていてもよい。 ○ The object detection device 30 may be mounted on a moving body other than the forklift 10, such as an industrial vehicle (for example, a construction machine, an automated guided vehicle, or a truck) or a passenger car.

３０…物体検出装置、３１…ステレオカメラ、４１…視差画像取得部、座標算出部、計数部、物体判定部、物体高設定部、同一物体判定部、算出部、人候補判定部、人候補座標算出部及び人判定部として機能する画像処理部。 30 ... object detection device, 31 ... stereo camera, 41 ... image processing unit that functions as the parallax image acquisition unit, coordinate calculation unit, counting unit, object determination unit, object height setting unit, same object determination unit, calculation unit, person candidate determination unit, person candidate coordinate calculation unit, and person determination unit.

Claims

ステレオカメラと、
前記ステレオカメラによる撮像が行われた画像から各画素に視差が対応付けられた視差画像を取得する視差画像取得部と、
前記視差が取得された特徴点について、実空間上での位置を表す三次元座標系での座標を算出する座標算出部と、
前記三次元座標系の水平面を表す座表面を複数に分割した各エリアに含まれる前記特徴点を、前記特徴点の高さの範囲で分けた複数の区分毎に計数する計数部と、
各エリアのうち前記特徴点の総和が閾値を超えたエリアには、物体が存在していると判定する物体判定部と、
前記物体が存在していると判定された前記エリア毎に、前記特徴点が存在する区分のうち最も高い区分を前記物体の高さとして設定する物体高設定部と、
同一の高さの前記物体のうち隣接したエリアに存在している物体を同一物体であると判定する同一物体判定部と、
前記同一物体判定部により判定が行われた後の前記物体毎に、前記物体が占有するエリアの大きさを算出する算出部と、
前記算出部により算出された前記エリアの大きさが所定値以下の前記物体を人候補であると判定する人候補判定部と、を備える物体検出装置。 An object detection device comprising: a stereo camera;
a parallax image acquisition unit that acquires, from an image captured by the stereo camera, a parallax image in which a parallax is associated with each pixel;
a coordinate calculation unit that calculates, for each feature point for which the parallax has been acquired, coordinates in a three-dimensional coordinate system representing a position in real space;
a counting unit that counts the feature points included in each of a plurality of areas, into which a coordinate plane representing the horizontal plane of the three-dimensional coordinate system is divided, for each of a plurality of categories defined by height ranges of the feature points;
an object determination unit that determines that an object exists in each area in which the total number of the feature points exceeds a threshold;
an object height setting unit that sets, for each area in which the object is determined to exist, the highest category in which the feature points exist as the height of the object;
a same object determination unit that determines that objects of the same height existing in adjacent areas are the same object;
a calculation unit that calculates, for each object after the determination by the same object determination unit, the size of the area occupied by the object; and
a person candidate determination unit that determines that an object for which the size of the area calculated by the calculation unit is equal to or smaller than a predetermined value is a person candidate.
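
For orientation only, the processing chain recited in this claim can be sketched as below. This is not the claimed implementation: the equal-width height categories, the 4-neighbour definition of "adjacent", the use of the number of areas as the "size", and every parameter value and name are assumptions made for illustration.

```python
from collections import defaultdict, deque


def detect_person_candidates(points, area_mm=500, category_mm=500,
                             num_categories=4, count_threshold=3, max_areas=4):
    """Return the area blocks judged to be person candidates.

    points: iterable of (x_w, y_w, z_w) feature points in world coordinates.
    """
    # Counting unit: counts[(ix, iy)][c] = points in area (ix, iy), category c.
    counts = defaultdict(lambda: [0] * num_categories)
    for x, y, z in points:
        c = min(max(int(z // category_mm), 0), num_categories - 1)
        counts[(int(x // area_mm), int(y // area_mm))][c] += 1

    # Object determination and object height setting:
    # keep areas whose total exceeds the threshold, height = highest category.
    heights = {area: max(i for i, n in enumerate(cs) if n > 0)
               for area, cs in counts.items() if sum(cs) > count_threshold}

    # Same object determination: merge adjacent areas of equal height.
    seen, candidates = set(), []
    for start in heights:
        if start in seen:
            continue
        seen.add(start)
        block, queue = [], deque([start])
        while queue:
            ix, iy = queue.popleft()
            block.append((ix, iy))
            for nb in ((ix + 1, iy), (ix - 1, iy), (ix, iy + 1), (ix, iy - 1)):
                if nb not in seen and heights.get(nb) == heights[start]:
                    seen.add(nb)
                    queue.append(nb)
        # Calculation and person candidate determination: small blocks only.
        if len(block) <= max_areas:
            candidates.append(sorted(block))
    return candidates
```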

前記人候補の実空間上での座標から、前記人候補の前記画像上での座標を算出する人候補座標算出部と、
前記人候補の前記画像上での座標に対して人検出処理を行い、前記人候補が人か否かを判定する人判定部と、を備える請求項１に記載の物体検出装置。 The object detection device according to claim 1, further comprising: a person candidate coordinate calculation unit that calculates coordinates of the person candidate on the image from coordinates of the person candidate in real space; and
a person determination unit that performs a person detection process on the coordinates of the person candidate on the image and determines whether or not the person candidate is a person.

前記区分のうち前記特徴点の高さの範囲が最も低い区分を最低区分とすると、
前記物体高設定部は、前記特徴点が前記最低区分を含む複数の区分に存在する場合、
前記最低区分から連続して前記区分に前記特徴点が存在する場合、前記最低区分に連続して前記特徴点が存在する区分のうち最も高い区分を前記物体の高さとして設定し、
前記最低区分に連続した前記区分に前記特徴点が存在しない場合、前記最低区分を前記物体の高さとして設定する請求項１又は請求項２に記載の物体検出装置。 The object detection device according to claim 1 or claim 2, wherein, where the category with the lowest height range of the feature points among the categories is defined as the lowest category,
when the feature points exist in a plurality of categories including the lowest category, the object height setting unit
sets, as the height of the object, the highest of the categories in which the feature points exist continuously from the lowest category when the feature points exist in categories continuous from the lowest category, and
sets the lowest category as the height of the object when the feature points do not exist in the category continuous with the lowest category.
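
The height-setting rule of this claim can be sketched as below. The function name and the representation of an area as a list of booleans (index 0 being the lowest category) are assumptions for illustration. When the lowest category is empty, the sketch falls back to the rule of claim 1 and returns the highest occupied category.

```python
def object_height_category(occupied):
    """Return the index of the category used as the object height.

    occupied: list of bools, occupied[i] is True when category i contains
    feature points; index 0 is the lowest category. Returns None when empty.
    """
    present = [i for i, has_points in enumerate(occupied) if has_points]
    if not present:
        return None
    if not occupied[0]:
        # Lowest category empty: highest occupied category (claim 1 rule).
        return present[-1]
    # Lowest category occupied: climb while the next category is occupied.
    height = 0
    while height + 1 < len(occupied) and occupied[height + 1]:
        height += 1
    return height
```

For example, feature points in categories 0 and 2 only give height category 0, because the category continuous with the lowest one is empty.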