JP2018092608A

JP2018092608A - Information processing device, imaging device, apparatus control system, movable body, information processing method, and program

Info

Publication number: JP2018092608A
Application number: JP2017190783A
Authority: JP
Inventors: Seiya Amano; Soichiro Yokota; Atsushi Yoshida; Tabito Suzuki; Yoichiro Obayashi; Hiroki Kubozono; Shintaro Kida; Daisuke Okada; 輔宏木村
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 2016-11-25
Filing date: 2017-09-29
Publication date: 2018-06-14
Anticipated expiration: 2037-09-29
Also published as: JP7062904B2

Abstract

PROBLEM TO BE SOLVED: To provide an information processing device, an imaging device, an apparatus control system, a movable body, an information processing method, and a program with which sufficient accuracy in detecting objects can be ensured. SOLUTION: An information processing device of the present invention comprises an acquisition part, a creation part, a detection part, an extraction part, and a separation part. The acquisition part acquires a depth map having distance information for every pixel. The creation part creates, on the basis of the depth map, first correspondence information in which positions in the transverse direction and positions in the depth direction are associated with each other. The detection part detects an assembled area of the distance information from the first correspondence information. The extraction part extracts the contour of the assembled area. The separation part separates the assembled area on the basis of the characteristics of the contour extracted by the extraction part. SELECTED DRAWING: Figure 5

Description

The present invention relates to an information processing device, an imaging device, a device control system, a moving body, an information processing method, and a program.

Conventionally, with regard to automobile safety, vehicle body structures and the like have been developed from the viewpoint of how to protect pedestrians and occupants when a pedestrian and an automobile collide. In recent years, however, advances in information processing and image processing technology have produced techniques for detecting people and automobiles at high speed. Automobiles that apply these techniques to brake automatically before colliding with an object, thereby preventing the collision, have already been developed. Automatic vehicle control requires accurate measurement of the distance to an object such as a person or another vehicle, and for this purpose ranging by millimeter-wave radar and laser radar, as well as ranging by stereo camera, has been put to practical use. For example, when ranging with a stereo camera, a parallax image is generated from the amount of displacement (parallax) between local regions captured by the left and right cameras, and the distance between an object ahead and the host vehicle can be measured. Clustering processing is then performed to detect a group of parallax pixels located at similar distances (having similar parallax values) as one object.

To detect objects from three-dimensional data obtained with a stereo camera or the like, the calculated parallax must be detected as a cluster. However, when objects are close to each other, they are often judged to belong to the same parallax cluster, so multiple objects are easily misdetected as a single combined object. For example, Patent Document 1 discloses a technique that detects object planes from the direction in which distance data are aligned and detects objects from the positional relationship of those planes.

However, the prior art could not sufficiently prevent multiple objects from being erroneously detected as a single combined object. In other words, there was a problem in that it was difficult to ensure sufficient object detection accuracy.

The present invention has been made in view of the above, and its object is to provide an information processing device, an imaging device, a device control system, a moving body, an information processing method, and a program capable of ensuring sufficient object detection accuracy.

To solve the problem described above and achieve the object, the present invention is an information processing device comprising: an acquisition unit that acquires a distance image having distance information for each pixel; a generation unit that generates, based on the distance image, first correspondence information in which lateral positions and depth-direction positions are associated with each other; a detection unit that detects a collection region of the distance information from the first correspondence information; an extraction unit that extracts the contour of the collection region; and a separation unit that separates the collection region based on features of the contour extracted by the extraction unit.

According to the present invention, sufficient object detection accuracy can be ensured.

FIG. 1 is a schematic diagram illustrating the configuration of a device control system according to an embodiment.
FIG. 2 is a schematic block diagram of the imaging unit and the analysis unit.
FIG. 3 is a diagram illustrating the positional relationship between the subject and the imaging lens of each camera unit.
FIG. 4 is a diagram schematically explaining the functions of the analysis unit.
FIG. 5 is a diagram illustrating an example of the functions of the object detection processing unit.
FIG. 6 is a diagram illustrating an example of the functions of the road surface detection processing unit.
FIG. 7 is a diagram illustrating an example of a captured image.
FIG. 8 is a diagram illustrating an example of a High Umap.
FIG. 9 is a diagram illustrating an example of a Standard Umap.
FIG. 10 is a diagram illustrating an example of the detailed functions of the clustering processing unit.
FIG. 11 is a diagram illustrating an example of a captured image.
FIG. 12 is a diagram illustrating an example of an isolated region.
FIG. 13 is a diagram illustrating the region on the parallax image corresponding to the isolated region illustrated in FIG. 12.
FIG. 14 is a diagram showing the size range defined for each object type.
FIG. 15 is a diagram for explaining the rejection process.
FIG. 16 is a flowchart illustrating an example of processing by the clustering processing unit.
FIG. 17 is a flowchart illustrating an example of the isolated region detection process.
FIG. 18 is a diagram illustrating an example of the result of binarization processing.
FIG. 19 is a diagram illustrating an example of the functions of the surface detection processing unit.
FIG. 20 is a flowchart showing the flow of the surface detection process.
FIG. 21 is a flowchart illustrating an example of contour extraction processing by the extraction unit.
FIG. 22 is a diagram for explaining the search process.
FIG. 23 is a diagram for explaining the search process.
FIG. 24 is a diagram for explaining the search process.
FIG. 25 is a diagram for explaining the search process.
FIG. 26 is a diagram illustrating the connection of pixels for which the search order has been recorded.
FIG. 27 is a flowchart illustrating an example of separation processing by the separation unit.
FIG. 28 is a diagram for explaining how the position of the collection region is determined.
FIG. 29 is a diagram for explaining a method of specifying target pixels.
FIG. 30 is a diagram for explaining a method of specifying target pixels.
FIG. 31 is a diagram for explaining a method of generating the second correspondence information.
FIG. 32 is a diagram for explaining another form of the second correspondence information.
FIG. 33 is a diagram for explaining a method of specifying the separation position.
FIG. 34 is a diagram for explaining separation by the separation unit.

Embodiments of an information processing device, an imaging device, a device control system, a moving body, an information processing method, and a program according to the present invention are described in detail below with reference to the accompanying drawings. FIG. 1 is a schematic diagram illustrating the configuration of a device control system 100 according to the embodiment. As shown in FIG. 1, the device control system 100 is installed in a vehicle 101, such as an automobile, which is an example of a moving body. The device control system 100 includes an imaging unit 102, an analysis unit 103, a control unit 104, and a display unit 105.

The imaging unit 102 is installed near the rear-view mirror on the windshield 106 of the vehicle 101, which is an example of a moving body, and captures images of, for example, the view in the traveling direction of the vehicle 101. Various data, including image data obtained by the imaging operation of the imaging unit 102, are supplied to the analysis unit 103. Based on the data supplied from the imaging unit 102, the analysis unit 103 analyzes recognition targets such as the road surface on which the vehicle 101 is traveling, vehicles ahead of the vehicle 101, pedestrians, and obstacles. Based on the analysis results of the analysis unit 103, the control unit 104 issues warnings and the like to the driver of the vehicle 101 via the display unit 105. The control unit 104 also provides driving support based on the analysis results, such as control of various in-vehicle devices and steering or brake control of the vehicle 101. Although the vehicle 101 is described below as an example of the moving body, the device control system 100 of the present embodiment is also applicable to ships, aircraft, robots, and the like.

FIG. 2 is a schematic block diagram of the imaging unit 102 and the analysis unit 103. In this example, the analysis unit 103 functions as the "information processing device", and the combination of the imaging unit 102 and the analysis unit 103 functions as the "imaging device". The control unit 104 described above functions as the "control unit" and controls a device (in this example, the vehicle 101) based on the output of the imaging device. The imaging unit 102 consists of two camera units assembled in parallel: a first camera unit 1A for the left eye and a second camera unit 1B for the right eye. That is, the imaging unit 102 is configured as a stereo camera that captures stereo images. A stereo image is an image comprising a plurality of captured images obtained by imaging from a plurality of viewpoints (the captured images correspond one-to-one to the viewpoints), and the imaging unit 102 is a device for capturing such stereo images. The first camera unit 1A and the second camera unit 1B include imaging lenses 5A and 5B, image sensors 6A and 6B, and sensor controllers 7A and 7B, respectively. The image sensors 6A and 6B are, for example, CCD image sensors or CMOS image sensors. CCD is an abbreviation for "Charge Coupled Device". CMOS is an abbreviation for "Complementary Metal-Oxide Semiconductor". The sensor controllers 7A and 7B perform exposure control of the image sensors 6A and 6B, image readout control, communication with external circuits, transmission control of image data, and the like.

The analysis unit 103 includes a data bus line 10, a serial bus line 11, a CPU 15, an FPGA 16, a ROM 17, a RAM 18, a serial IF 19, and a data IF 20. CPU is an abbreviation for "Central Processing Unit". FPGA is an abbreviation for "Field-Programmable Gate Array". ROM is an abbreviation for "Read Only Memory". RAM is an abbreviation for "Random Access Memory". IF is an abbreviation for "interface".

The imaging unit 102 described above is connected to the analysis unit 103 via the data bus line 10 and the serial bus line 11. The CPU 15 controls the overall operation of the analysis unit 103 as well as the execution of image processing and image recognition processing. Luminance image data of the images captured by the image sensors 6A and 6B of the first camera unit 1A and the second camera unit 1B are written into the RAM 18 of the analysis unit 103 via the data bus line 10. Control data for changing sensor exposure values, control data for changing image readout parameters, various setting data, and the like from the CPU 15 or the FPGA 16 are transmitted and received via the serial bus line 11.

The FPGA 16 performs processing that requires real-time performance on the image data stored in the RAM 18. Of the luminance image data (captured images) captured by the first camera unit 1A and the second camera unit 1B, the FPGA 16 treats one as a reference image and the other as a comparison image. The FPGA 16 then calculates, as the parallax value of a corresponding image portion (parallax image data), the positional displacement between the image portion on the reference image and the image portion on the comparison image that correspond to the same point in the imaging region.

FIG. 3 shows the positional relationship on the XZ plane among the subject 30, the imaging lens 5A of the first camera unit 1A, and the imaging lens 5B of the second camera unit 1B. In FIG. 3, the distance b between the imaging lenses 5A and 5B and the focal length f of the imaging lenses 5A and 5B are fixed values. Let Δ1 be the displacement of the X coordinate of the imaging lens 5A with respect to the gazing point G of the subject 30, and let Δ2 be the displacement of the X coordinate of the imaging lens 5B with respect to the gazing point G. In this case, the FPGA 16 calculates the parallax value d, which is the difference between the X coordinates of the imaging lenses 5A and 5B with respect to the gazing point G of the subject 30, using Equation 1.
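Equation 1 itself does not survive in this text. Under the geometry described above, with Δ1 and Δ2 measured in opposite directions for the two lenses, the parallax would be their sum. The depth relation on the right is the standard pinhole stereo result and is given here as an assumption, not as the patent's own formula; it agrees with the later expression z = BF/(d − offset) when the offset is zero.

```latex
d = \Delta_1 + \Delta_2, \qquad Z = \frac{b\,f}{d}
```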

The FPGA 16 of the analysis unit 103 applies processing that requires real-time performance, such as gamma correction and distortion correction (rectification of the left and right captured images), to the luminance image data supplied from the imaging unit 102. The FPGA 16 then generates parallax image data by performing the calculation of Equation 1 on the luminance image data processed in this way, and writes the parallax image data into the RAM 18.

Returning to FIG. 2, the description continues. The CPU 15 controls the sensor controllers 7A and 7B of the imaging unit 102 and performs overall control of the analysis unit 103. The ROM 17 stores a three-dimensional object recognition program for executing situation recognition, prediction, three-dimensional object recognition, and the like, which are described later. The three-dimensional object recognition program is an example of an image processing program. The CPU 15 acquires, for example, CAN information of the host vehicle (vehicle speed, acceleration, steering angle, yaw rate, etc.) as parameters via the data IF 20. In accordance with the three-dimensional object recognition program stored in the ROM 17, the CPU 15 then controls the execution of various processes such as situation recognition using the luminance images and parallax images stored in the RAM 18, thereby recognizing recognition targets such as a preceding vehicle. CAN is an abbreviation for "Controller Area Network".

Recognition data for the recognition targets are supplied to the control unit 104 via the serial IF 19. The control unit 104 uses the recognition data to provide driving support such as brake control and speed control of the host vehicle.

FIG. 4 is a diagram schematically explaining the functions of the analysis unit 103. Stereo images captured by the imaging unit 102, which constitutes the stereo camera, are supplied to the analysis unit 103. For example, when the first camera unit 1A and the second camera unit 1B are color cameras, each of them performs color-to-luminance conversion, generating a luminance (Y) signal from the RGB (red, green, blue) signals by the calculation of Equation 2. Each of the first camera unit 1A and the second camera unit 1B supplies the luminance image data (captured image) generated by the color-to-luminance conversion to the preprocessing unit 111 of the analysis unit 103. The pair consisting of the luminance image data (captured image) captured by the first camera unit 1A and the luminance image data (captured image) captured by the second camera unit 1B can be regarded as a stereo image. In this example, the preprocessing unit 111 is implemented by the FPGA 16.
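Equation 2 is not reproduced in this text. As a rough sketch of what a color-to-luminance conversion of this kind looks like, the following uses the ITU-R BT.601 luma weights; the weights and the function name `rgb_to_luminance` are assumptions for illustration and are not taken from the patent.

```python
def rgb_to_luminance(r, g, b, weights=(0.299, 0.587, 0.114)):
    """Return a luminance value as a weighted sum of R, G, and B.

    The default weights are the ITU-R BT.601 luma coefficients, used here
    as an assumption; the patent's Equation 2 may use different values.
    """
    wr, wg, wb = weights
    return wr * r + wg * g + wb * b


if __name__ == "__main__":
    # A mid-gray pixel keeps its level because the weights sum to 1.
    print(rgb_to_luminance(128, 128, 128))
```

Green contributes most to the result and blue least, which is why a weighted sum is used rather than a plain average.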

The preprocessing unit 111 preprocesses the luminance image data received from the first camera unit 1A and the second camera unit 1B. In this example, gamma correction is performed as the preprocessing. The preprocessing unit 111 then supplies the preprocessed luminance image data to the parallelized image generation unit 112.

The parallelized image generation unit 112 applies parallelization processing (distortion correction) to the luminance image data supplied from the preprocessing unit 111. This processing converts the luminance image data output from the first camera unit 1A and the second camera unit 1B into the ideal parallelized stereo image that would be obtained if two pinhole cameras were mounted in parallel. Specifically, the distortion amount of each pixel is calculated using the polynomials Δx = f(x, y) and Δy = g(x, y), and the result is used to convert each pixel of the luminance image data output from the first camera unit 1A and the second camera unit 1B. The polynomials are based on, for example, fifth-order polynomials in x (horizontal image position) and y (vertical image position). This yields parallel luminance images in which the distortion of the optical systems of the first camera unit 1A and the second camera unit 1B has been corrected. In this example, the parallelized image generation unit 112 is implemented by the FPGA 16.

The parallax image generation unit 113 generates, from the stereo image captured by the imaging unit 102, a parallax image having a parallax value for each pixel, which is an example of a distance image having distance information for each pixel. Here, the parallax image generation unit 113 uses the luminance image data of the first camera unit 1A as reference image data and the luminance image data of the second camera unit 1B as comparison image data, and performs the calculation of Equation 1 to generate parallax image data indicating the parallax between the reference image data and the comparison image data. Specifically, for a given row of the reference image data, the parallax image generation unit 113 defines a block of multiple pixels (for example, 16 pixels × 1 pixel) centered on one pixel of interest. In the same row of the comparison image data, a block of the same size as the defined reference block is shifted one pixel at a time in the horizontal line direction (X direction). For each shift, the parallax image generation unit 113 calculates a correlation value indicating the correlation between a feature amount representing the pixel values of the block defined in the reference image data and a feature amount representing the pixel values of the block in the comparison image data. Note that the parallax image here means information in which a vertical position, a horizontal position, and a depth-direction position (parallax) are associated with each other.

Based on the calculated correlation values, the parallax image generation unit 113 also performs matching processing that selects, from among the blocks of the comparison image data, the block most strongly correlated with the block of the reference image data. It then calculates, as the parallax value d, the positional displacement between the pixel of interest in the reference block and the corresponding pixel in the comparison block selected by the matching processing. Parallax image data are obtained by performing this calculation of the parallax value d over the entire reference image data or over a specific region of it. Various known techniques can be used to generate the parallax image. In short, the parallax image generation unit 113 can be regarded as calculating (generating) a distance image having distance information for each pixel (in this example, a parallax image) from the stereo image captured by the stereo camera.

As the feature amount of a block used in the matching processing, the value (luminance value) of each pixel in the block can be used, for example. As the correlation value, one can use, for example, the sum of the absolute differences between the value (luminance value) of each pixel in the block of the reference image data and the value (luminance value) of the corresponding pixel in the block of the comparison image data. In this case, the block with the smallest sum is detected as the most strongly correlated block.
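The block matching described above can be sketched for a single rectified row as follows. This is a minimal illustration, not the FPGA implementation: the function names, the search range `max_disp`, and the assumption that a point at column x in the reference (left) image appears at column x − d in the comparison (right) image are all choices made for this example.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized pixel blocks."""
    return sum(abs(a - b) for a, b in zip(block_a, block_b))


def match_disparity(ref_row, cmp_row, x, block=16, max_disp=64):
    """Find the integer disparity of pixel x on one rectified image row.

    A block of `block` pixels around x is taken from the reference row and
    compared with same-sized blocks of the comparison row shifted left by
    d = 0, 1, 2, ... pixels (assumed shift direction). The shift with the
    smallest SAD wins. Returns (best_disparity, list_of_costs).
    """
    half = block // 2
    lo, hi = x - half, x - half + block
    if lo < 0 or hi > len(ref_row):
        raise ValueError("block does not fit inside the row")
    ref_block = ref_row[lo:hi]
    best_d, best_cost, costs = 0, None, []
    for d in range(max_disp + 1):
        if lo - d < 0:  # shifted block would leave the image
            break
        cost = sad(ref_block, cmp_row[lo - d:hi - d])
        costs.append(cost)
        if best_cost is None or cost < best_cost:
            best_d, best_cost = d, cost
    return best_d, costs


if __name__ == "__main__":
    ref = [(i * 37) % 101 for i in range(80)]  # synthetic textured row
    cmp_ = ref[5:] + [0] * 5                   # same row displaced by 5 pixels
    print(match_disparity(ref, cmp_, x=40)[0])
```

Returning the whole cost list as well as the winner makes it possible to refine the result to sub-pixel precision from the costs next to the minimum.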

For the matching processing of the parallax image generation unit 113, methods such as SSD (Sum of Squared Difference), ZSSD (Zero-mean Sum of Squared Difference), SAD (Sum of Absolute Difference), or ZSAD (Zero-mean Sum of Absolute Difference) can be used. When the matching processing requires a parallax value at sub-pixel resolution (finer than one pixel), an estimated value is used. The estimate can be obtained by, for example, the equiangular line method or the quadratic curve method. However, the estimated sub-pixel parallax value contains error. For this reason, a technique such as EEC (estimation error correction) that reduces the estimation error may be used.
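The two sub-pixel estimators named above can be sketched from the three matching costs around the integer minimum. The formulas below are the commonly used textbook forms of equiangular line fitting and parabola fitting, given as an assumption about what the patent refers to. The function names and the convention that the returned offset is added to the integer disparity are illustrative.

```python
def subpixel_equiangular(c_minus, c_zero, c_plus):
    """Equiangular line fitting: two lines of equal and opposite slope.

    c_minus, c_zero, c_plus are the matching costs at disparities d-1, d, d+1,
    where d is the integer minimum. Returns an offset in [-0.5, 0.5].
    """
    denom = (c_minus - c_zero) if c_minus >= c_plus else (c_plus - c_zero)
    if denom <= 0:  # flat cost curve, no refinement possible
        return 0.0
    return (c_minus - c_plus) / (2.0 * denom)


def subpixel_parabola(c_minus, c_zero, c_plus):
    """Quadratic curve fitting: vertex of the parabola through three costs."""
    denom = c_minus - 2.0 * c_zero + c_plus
    if denom <= 0:  # not a strict minimum
        return 0.0
    return (c_minus - c_plus) / (2.0 * denom)


if __name__ == "__main__":
    # Costs sampled from |x - 0.25| at x = -1, 0, 1 (a V-shaped SAD-like curve).
    print(subpixel_equiangular(1.25, 0.25, 0.75))
    # Costs sampled from (x - 0.25)**2 at x = -1, 0, 1 (an SSD-like curve).
    print(subpixel_parabola(1.5625, 0.0625, 0.5625))
```

Equiangular fitting tends to suit V-shaped cost curves such as SAD, while parabola fitting suits quadratic curves such as SSD; applying either to the other shape is one source of the estimation error that the text mentions.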

In this example, the parallax image generation unit 113 is implemented by the FPGA 16. The parallax image generated by the parallax image generation unit 113 is supplied to the object detection processing unit 114. In this example, the functions of the object detection processing unit 114 are implemented by the CPU 15 executing the three-dimensional object recognition program.

FIG. 5 is a diagram illustrating an example of the functions of the object detection processing unit 114. As shown in FIG. 5, the object detection processing unit 114 includes an acquisition unit 121, a road surface detection processing unit 122, a clustering processing unit 123, and a tracking processing unit 124. The acquisition unit 121 acquires the parallax image generated by the parallax image generation unit 113. The acquisition unit 121 can be regarded as having the function of acquiring a distance image having distance information for each pixel (in this example, a parallax image) calculated from a stereo image captured by the stereo camera. The parallax image acquired by the acquisition unit 121 is input to the road surface detection processing unit 122 and the clustering processing unit 123.

As illustrated in FIG. 6, the road surface detection processing unit 122 includes a road surface estimation unit 131 and a generation unit 132. Using the parallax image, the road surface estimation unit 131 generates correspondence information in which a vertical position, indicating the vertical direction of the image (the up-down direction orthogonal to the optical axis of the stereo camera), is associated with a depth position, indicating the direction of the optical axis of the stereo camera. In this example, the road surface estimation unit 131 votes each pixel (parallax value) of the parallax image into a map whose vertical axis is the vertical image coordinate (y) and whose horizontal axis is the parallax value d (hereinafter referred to as the "V map (V-Disparity map)"), selects sample points from the voted parallax points by a predetermined method, and estimates the road surface shape by fitting a straight line (or a curve) to the selected points. Various known techniques can be used for this road surface estimation. The V map is a two-dimensional histogram built from the (x coordinate value, y coordinate value, parallax value d) triples of the parallax image, with the parallax value d on the x axis, the y coordinate value on the y axis, and the frequency on the z axis. In short, the correspondence information (the V map in this example) can also be regarded as information that records a parallax frequency value for each combination of a vertical position and a parallax value d (corresponding to a depth position). The estimation result of the road surface estimation unit 131 (road surface estimation information) is input to the generation unit 132 and the clustering processing unit 123.
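The V map voting and the straight-line road fit described above can be sketched as follows. This is a minimal illustration, not the method of the embodiment: the function names `build_v_map` and `fit_road_line`, the use of integer parallax values with 0 meaning "no parallax", and the sample selection rule (the most frequent y for each d) are all assumptions, since the text only says that samples are chosen "by a predetermined method".

```python
def build_v_map(disparity, max_d):
    """Vote every valid parallax value d at image row y into v_map[y][d]."""
    v_map = [[0] * (max_d + 1) for _ in disparity]
    for y, row in enumerate(disparity):
        for d in row:
            if 0 < d <= max_d:  # assumption: 0 marks a pixel without parallax
                v_map[y][d] += 1
    return v_map


def fit_road_line(v_map):
    """Least-squares fit of y = a * d + b through one sample per d column.

    Assumed sample rule: for each d, take the row y with the highest count.
    """
    samples = []
    for d in range(1, len(v_map[0])):
        column = [row[d] for row in v_map]
        best = max(column)
        if best > 0:
            samples.append((d, column.index(best)))
    n = len(samples)
    sum_d = sum(d for d, _ in samples)
    sum_y = sum(y for _, y in samples)
    sum_dd = sum(d * d for d, _ in samples)
    sum_dy = sum(d * y for d, y in samples)
    denom = n * sum_dd - sum_d * sum_d
    if denom == 0:
        raise ValueError("need samples at two or more distinct parallax values")
    a = (n * sum_dy - sum_d * sum_y) / denom
    b = (sum_y - a * sum_d) / n
    return a, b
```

With the fitted pair (a, b), the road surface row for a given parallax value d is a * d + b, which is the quantity the text later calls y0.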

Based on the parallax image, the generation unit 132 generates first correspondence information in which a horizontal position, indicating a direction orthogonal to the optical axis of the stereo camera, is associated with a depth position, indicating the direction of the optical axis of the stereo camera. To remove noise, the first correspondence information is preferably generated from the pixels of the parallax image that correspond to a range higher than the road surface (an example of a reference object that serves as the reference for object height). In this example, the first correspondence information is a two-dimensional histogram whose horizontal axis is the actual distance in the horizontal direction, whose vertical axis is the parallax value d of the parallax image, and whose depth axis is the frequency. The first correspondence information can also be regarded as information that records a parallax frequency value for each combination of an actual distance and a parallax value d.

Because the road surface estimation by the road surface estimation unit 131 yields a linear equation representing the road surface, once the parallax value d is determined, the corresponding y coordinate y0 is determined, and this coordinate y0 is the height of the road surface. For example, when the parallax value is d and the y coordinate is y', the difference y' − y0 indicates the height from the road surface at the parallax value d. The height H from the road surface at the coordinates (d, y') can be obtained from H = (z × (y' − y0)) / f. Here, z is the distance calculated from the parallax value d (z = BF / (d − offset)), and f is the focal length of the imaging unit 102 converted into the same unit as (y' − y0). BF is the product of the baseline length B and the focal length f of the imaging unit 102, and offset is the parallax obtained when an object at infinity is imaged.
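The height calculation above can be written out as a short sketch. The function name and the parameter names (`road_a`, `road_b` for the fitted road line y0 = road_a × d + road_b, and `f_px` for the focal length expressed in the unit of y' − y0) are illustrative assumptions; the two formulas are the ones given in the text, with the sign convention of y' − y0 kept as written there.

```python
def height_above_road(d, y_prime, road_a, road_b, bf, offset, f_px):
    """Height H of the point (d, y') above the road surface.

    H = z * (y' - y0) / f, with z = BF / (d - offset) and y0 taken from
    the road line y0 = road_a * d + road_b (assumed parameterization).
    """
    if d == offset:
        raise ValueError("parallax equals offset: the point is at infinity")
    y0 = road_a * d + road_b   # road surface row at this parallax value
    z = bf / (d - offset)      # distance along the optical axis
    return z * (y_prime - y0) / f_px


# Example: BF = 1000, offset = 0, d = 10 gives z = 100.
# With y0 = 20, y' = 35 and f = 500 the height is 100 * 15 / 500.
print(height_above_road(10, 35, 2, 0, 1000, 0, 500))  # prints 3.0
```

The result is in the unit of z, so if BF is given in metre-pixels the height comes out in metres.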

The generation unit 132 generates at least one of the "High Umap", the "Standard Umap", and the "Small Umap" as the first correspondence information. These maps are described below, starting with the High Umap. Let x be the horizontal position in the parallax image, y the vertical position, and d the parallax value set for each pixel. The generation unit 132 votes the points (x, y, d) of the parallax image that fall within a second range, which is the part of a first range above the road surface whose height is equal to or greater than a predetermined value, according to their (x, d) values. This yields a two-dimensional histogram whose horizontal axis is x of the parallax image, whose vertical axis is the parallax value d, and whose depth axis is the frequency. The horizontal axis of this histogram is then converted into actual distance to produce the High Umap.

For example, the captured image shown in FIG. 7 contains a person group 1 consisting of an adult and a child, a person group 2 consisting of adults, a pole, and a vehicle. In this example, the range of actual heights from 150 cm to 200 cm above the road surface is set as the second range, and the High Umap into which the parallax values d of this second range are voted is as shown in FIG. 8. The parallax values d of the child, whose height is less than 150 cm, are not voted and therefore do not appear on the map. The vertical axis shows thinned parallax, obtained by thinning the parallax value d at a thinning rate that depends on the distance. The High Umap generated by the generation unit 132 is input to the clustering processing unit 123.

Next, the Standard Umap is described. Let x be the horizontal position in the parallax image, y the vertical position, and d the parallax value set for each pixel. The generation unit 132 votes the points (x, y, d) of the parallax image that fall within the first range according to their (x, d) values, which yields a two-dimensional histogram whose horizontal axis is x of the parallax image, whose vertical axis is the parallax value d, and whose depth axis is the frequency. The horizontal axis of this histogram is then converted into actual distance to produce the Standard Umap. In the example of FIG. 7, the range from 0 cm to 200 cm (which includes the second range described above) is set as the first range, and the Standard Umap into which the parallax values d of this first range are voted is as shown in FIG. 9. Together with the Standard Umap, the generation unit 132 can also generate height information: among the parallax points voted into the Standard Umap (pairs of actual distance and parallax value d), it records the height of the point with the greatest height (h) above the road surface, so that a height is recorded for each point of a map whose horizontal axis is the actual distance (the distance in the left-right direction of the camera) and whose vertical axis is the parallax value d. The height information can be regarded as information that records a height for each combination of an actual distance and a parallax value d. In the following description, this height information is referred to as the "Standard Umap height map". The position of each pixel in the Standard Umap height map corresponds to the position of the same pixel in the Standard Umap. The Standard Umap and the Standard Umap height map generated by the generation unit 132 are input to the clustering processing unit 123. Because this processing generates a bird's-eye view map (bird's-eye image, overhead image) to make objects easier to detect, the horizontal axis need not be the actual distance itself as long as it is a quantity equivalent to the actual distance.
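The voting that produces a U map and its height map can be sketched as below. This is only an illustration under stated assumptions: each input point is taken to be already reduced to a triple (x bin, d bin, height above road), the conversion of the horizontal axis to actual distance is omitted, and the function name `build_u_map` is hypothetical. Passing the first range (for example 0 to 2.0 m) gives a Standard Umap style result, and passing the second range (for example 1.5 to 2.0 m) gives a High Umap style result.

```python
def build_u_map(points, min_h, max_h, n_x, n_d):
    """Vote (x, d, h) points whose height lies in [min_h, max_h].

    Returns (freq, height): freq[d][x] counts the votes, and height[d][x]
    keeps the greatest height voted into that cell.
    """
    freq = [[0] * n_x for _ in range(n_d)]
    height = [[0.0] * n_x for _ in range(n_d)]
    for x, d, h in points:
        if not (min_h <= h <= max_h):
            continue  # outside the height range for this map
        if not (0 <= x < n_x and 0 <= d < n_d):
            continue  # outside the map
        freq[d][x] += 1
        height[d][x] = max(height[d][x], h)
    return freq, height


points = [(1, 2, 1.7), (1, 2, 1.2), (0, 1, 0.9), (2, 3, 2.5)]
standard, standard_height = build_u_map(points, 0.0, 2.0, 4, 4)
high, _ = build_u_map(points, 1.5, 2.0, 4, 4)
```

In this toy input the point at height 0.9 (for instance a child) is voted into `standard` but not into `high`, which mirrors the FIG. 8 example.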

Next, the Small Umap is described. Let x be the horizontal position in the parallax image, y the vertical position, and d the parallax value set for each pixel. The generation unit 132 votes the points (x, y, d) of the parallax image that fall within the first range according to their (x, d) values, voting fewer points than when creating the Standard Umap, which yields a two-dimensional histogram whose horizontal axis is x of the parallax image, whose vertical axis is the parallax value d, and whose depth axis is the frequency. The horizontal axis of this histogram is then converted into actual distance to produce the Small Umap. Compared with the Standard Umap, the Small Umap has a lower distance resolution per pixel. Together with the Small Umap, the generation unit 132 can also generate height information: among the parallax points voted into the Small Umap (pairs of actual distance and parallax value d), it records the height of the point with the greatest height (h) above the road surface, so that a height is recorded for each point of a map whose horizontal axis is the actual distance (the distance in the left-right direction of the camera) and whose vertical axis is the parallax value d. The height information can be regarded as information that records a height for each combination of an actual distance and a parallax value d. In the following description, this height information is referred to as the "Small Umap height map". The position of each pixel in the Small Umap height map corresponds to the position of the same pixel in the Small Umap. The Small Umap and the Small Umap height map generated by the generation unit 132 are input to the clustering processing unit 123.

In this example, the generation unit 132 generates the Standard Umap, and the generated Standard Umap is input to the clustering processing unit 123. The configuration is not limited to this; for example, when object detection uses the High Umap, the Standard Umap, and the Small Umap height map, the generation unit 132 may generate the High Umap, the Standard Umap, and the Small Umap, and these maps may be input to the clustering processing unit 123.

Returning to FIG. 5, the description continues. The clustering processing unit 123 uses the various types of information received from the road surface detection processing unit 122 to detect object positions on the parallax image acquired by the acquisition unit 121. FIG. 10 is a diagram illustrating an example of the detailed functions of the clustering processing unit 123. As illustrated in FIG. 10, the clustering processing unit 123 includes an isolated region detection processing unit 140, a surface detection processing unit 141, a parallax image processing unit 150, and a rejection processing unit 160.

The isolated region detection processing unit 140 is an example of a "detection unit". From the first correspondence information described above (the Standard Umap in this example), it detects isolated regions (aggregate regions), each of which is a region formed by a cluster of parallax values d. In the following description, this detection is referred to as the "isolated region detection process". For example, in the captured image shown in FIG. 11, there are guardrails 81 and 82 on the left and right, and a vehicle 77 and a vehicle 79 are traveling in opposite directions on either side of the center line. One vehicle, the vehicle 77 or the vehicle 79, is traveling in each lane. Two poles 80A and 80B stand between the vehicle 79 and the guardrail 82. FIG. 12 is the Standard Umap obtained from the captured image shown in FIG. 11, and each framed region corresponds to an isolated region.

The surface detection processing unit 141 calculates surface information from the contour of each isolated region (aggregate region) detected by the isolated region detection processing unit 140 and adds it to the detection result as three-dimensional structure information. When it determines that a detected isolated region has a plurality of side surfaces facing the same direction, it separates that isolated region. In the following description, the processing by the surface detection processing unit 141 is referred to as the "surface detection process". Its details are described later.

The parallax image processing unit 150 performs parallax image processing, which detects the region on the parallax image and the object information in real space that correspond to each isolated region detected by the isolated region detection processing unit 140. FIG. 13 is a diagram showing the regions on the parallax image that correspond to the isolated regions shown in FIG. 12 (the result of processing by the parallax image processing unit 150). In FIG. 13, the region 91 corresponds to the guardrail 81, the region 92 to the vehicle 77, the region 93 to the vehicle 79, the region 94 to the pole 80A, the region 95 to the pole 80B, and the region 96 to the guardrail 82.

The rejection processing unit 160 performs a rejection process that selects the objects to be output, based on the regions on the parallax image and the object information in real space detected by the parallax image processing unit 150. The rejection processing unit 160 performs size rejection, which looks at the size of an object, and overlap rejection, which looks at the positional relationship between objects. In size rejection, for example, a detection result whose size does not fall within the size range defined for each object type shown in FIG. 14 is rejected. In the example of FIG. 15, the region 91 and the region 96 are rejected. In overlap rejection, among the regions on the parallax image that were detected by the parallax image processing and correspond to isolated regions (detection results on the real U map), the results that overlap one another are sorted into those to keep and those to discard.

FIG. 16 is a flowchart illustrating an example of the processing performed by the clustering processing unit 123. In this example, the Standard Umap, the Standard Umap height map, the parallax image, and the road surface estimation information are input as input information, and the detection results on the parallax image are output as output information. First, the isolated region detection processing unit 140 performs the isolated region detection process (step S1). Next, the surface detection processing unit 141 performs the surface detection process (step S2). The parallax image processing unit 150 then performs the parallax image processing (step S3). Finally, the rejection processing unit 160 performs the rejection process using the result of the parallax image processing in step S3 (step S4) and outputs the final detection results on the parallax image as output information. The surface detection process of step S2 may instead be performed between the parallax image processing of step S3 and the rejection process of step S4.

The output information (detection results) from the clustering processing unit 123 is input to the tracking processing unit 124 shown in FIG. 5. The tracking processing unit 124 determines that a detection result (detected object) from the clustering processing unit 123 is a tracking target when it appears continuously over a plurality of frames and, when it is a tracking target, outputs that detection result to the control unit 104 as an object detection result.

Next, the isolated region detection process is described in detail. FIG. 17 is a flowchart illustrating an example of the isolated region detection process. In this example, the Standard Umap is input as input information; the output information will become clear from the description below. First, the isolated region detection processing unit 140 performs a labeling process that groups each cluster of parallax values in the Standard Umap and assigns it an ID (step S21). Specifically, the isolated region detection processing unit 140 takes each pixel of the Standard Umap in turn as the target pixel and binarizes the target pixel and its eight neighboring pixels (the eight pixels that correspond one-to-one to the right, upper right, up, upper left, left, lower left, down, and lower right directions): a pixel that contains a frequency value is set to the pixel value "1", and a pixel that contains no frequency value is set to "0". The binarization method is not limited to this and is arbitrary; for example, among the eight neighboring pixels, a pixel whose parallax frequency value is equal to or greater than a threshold may be set to "1" and every other pixel to "0". A closed region formed by a set of pixels with the value "1" is then treated as one parallax cluster (one group), and an ID is assigned to each pixel in that closed region. Each ID is set to a value that identifies its group.
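The labeling step (binarize, then give every 8-connected group of "1" pixels its own ID) can be sketched as follows. The breadth-first flood fill used here is one common way to label connected components and is an assumption; the text does not say which labeling algorithm the embodiment uses. The name `label_regions` and the `threshold` parameter (which covers the thresholded variant mentioned above) are likewise illustrative.

```python
from collections import deque

# Offsets of the eight neighbors of a pixel, as (row, column) steps.
NEIGHBORS_8 = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
               (0, 1), (1, -1), (1, 0), (1, 1)]


def label_regions(freq, threshold=1):
    """Return an ID map: 0 for background, 1, 2, ... for each 8-connected group.

    A pixel counts as "1" when its frequency value is >= threshold.
    """
    rows, cols = len(freq), len(freq[0])
    ids = [[0] * cols for _ in range(rows)]
    next_id = 1
    for r in range(rows):
        for c in range(cols):
            if freq[r][c] < threshold or ids[r][c] != 0:
                continue
            ids[r][c] = next_id
            queue = deque([(r, c)])
            while queue:  # flood fill over the 8-neighborhood
                cr, cc = queue.popleft()
                for dr, dc in NEIGHBORS_8:
                    nr, nc = cr + dr, cc + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and freq[nr][nc] >= threshold
                            and ids[nr][nc] == 0):
                        ids[nr][nc] = next_id
                        queue.append((nr, nc))
            next_id += 1
    return ids
```

Because the neighborhood includes the diagonals, two pixels that touch only at a corner end up in the same group.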

FIG. 18 is a diagram illustrating an example of the result of the binarization process; the same ID is assigned to each of the five pixels included in the region 200.

Returning to FIG. 17, the description continues. After step S21, the isolated region detection processing unit 140 performs a detection rectangle creation process (step S22). Specifically, the isolated region detection processing unit 140 calculates the rectangle that circumscribes the aggregate region of pixels assigned the same ID and uses this circumscribed rectangle as the detection rectangle. Here, a detection rectangle means information indicating the position and size of the rectangle, for example the coordinates of a corner of the rectangle together with its height and width. Next, the isolated region detection processing unit 140 performs a size check process that checks the size of the detection rectangle created in step S22 (step S23). For example, when the size of the detection rectangle created in step S22 is equal to or smaller than a threshold predetermined as a size corresponding to noise, the isolated region detection processing unit 140 discards that detection rectangle. Next, the isolated region detection processing unit 140 performs a frequency check process that checks the frequency value (parallax frequency value) of each pixel included in the detection rectangle created in step S22 (step S24). For example, when the cumulative total of the frequency values (parallax frequency values) included in the detection rectangle created in step S22 is equal to or smaller than a threshold predetermined as the number required to represent an object, the isolated region detection processing unit 140 discards that detection rectangle.
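Steps S22 to S24 can be sketched as one pass over the ID map. Several details are assumptions made for illustration: a rectangle is stored as (left column, top row, width, height), "size at or below the noise threshold" is interpreted as both width and height being at most `min_size`, and the cumulative frequency is summed over every pixel inside the rectangle. The function name `detection_rectangles` is hypothetical.

```python
def detection_rectangles(ids, freq, min_size, min_total_freq):
    """Circumscribed rectangle per ID, filtered by size and by total frequency."""
    bounds = {}  # group ID -> [left, top, right, bottom]
    for r, row in enumerate(ids):
        for c, gid in enumerate(row):
            if gid == 0:
                continue
            if gid not in bounds:
                bounds[gid] = [c, r, c, r]
            else:
                b = bounds[gid]
                b[0], b[1] = min(b[0], c), min(b[1], r)
                b[2], b[3] = max(b[2], c), max(b[3], r)

    kept = {}
    for gid, (left, top, right, bottom) in bounds.items():
        width, height = right - left + 1, bottom - top + 1
        # Size check (S23): assumed rule, discard noise-sized rectangles.
        if width <= min_size and height <= min_size:
            continue
        # Frequency check (S24): too few votes to represent an object.
        total = sum(freq[r][c]
                    for r in range(top, bottom + 1)
                    for c in range(left, right + 1))
        if total <= min_total_freq:
            continue
        kept[gid] = (left, top, width, height)
    return kept
```

The surviving dictionary corresponds to the detection rectangles that the isolated region detection process passes on.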

Through the isolated region detection process described above, information indicating the detection rectangles on the Standard Umap is output as output information. An ID identifying the group is assigned to each pixel included in a detection rectangle on the Standard Umap. In other words, information indicating the map of IDs grouped on the Standard Umap (the "ID Umap on the Standard Umap", which may be referred to simply as the "ID map" when there is no need to distinguish it from others) is also output as output information.

Next, the surface detection process is described in detail. As illustrated in FIG. 19, the surface detection processing unit 141 includes an interpolation processing unit 142, an extraction unit 143, and a separation unit 144. The detection result of the isolated region detection processing unit 140 is input to the surface detection processing unit 141, which uses it to perform the surface detection process.

FIG. 20 is a flowchart showing the flow of the surface detection process. The process shown in FIG. 20 is repeated for each detection result of the isolated region detection process (each detection rectangle on the Standard Umap). First, the interpolation processing unit 142 performs an interpolation process (step S100). Next, the extraction unit 143 performs a contour extraction process (step S101). Next, the separation unit 144 performs a separation process (step S102). The details of each step are described later.

The interpolation processing unit 142 illustrated in FIG. 19 performs an interpolation process on the detection result of the isolated region detection processing unit 140 in order to smooth out noise. An ID is newly assigned to each pixel added by the interpolation process.

The extraction unit 143 illustrated in FIG. 19 extracts the contour of an aggregate region. In the present embodiment, the extraction unit 143 sets, for each of the pixels that make up the contour, direction information indicating the direction in which the contour continues. The details are as follows. FIG. 21 is a flowchart illustrating an example of the contour extraction process performed by the extraction unit 143, and its contents are described below.

As illustrated in FIG. 21, the extraction unit 143 performs a start pixel search process that searches for a start pixel (step S111). More specifically, as illustrated in FIG. 22, the extraction unit 143 searches for a pixel having an ID from the lower left toward the upper right of the detection rectangle and takes the first pixel found as the target pixel.

Returning to FIG. 21, the description continues. When a start pixel is found (step S112: Yes), the extraction unit 143 performs the process of extracting the contour (step S113). When no start pixel is found (step S112: No), the process ends.

The processing in step S113 is described in detail below. In this example, the extraction unit 143 searches the eight pixels adjacent to the target pixel for a pixel having an ID, in the counterclockwise search order shown in FIG. 23. In that order, the lower-left neighbor is searched first, the neighbor directly below second, the lower-right neighbor third, the right neighbor fourth, the upper-right neighbor fifth, the neighbor directly above sixth, the upper-left neighbor seventh, and the left neighbor eighth. When a pixel having an ID is found, the search order at that point is recorded in the target pixel, and the search is repeated with the found pixel as the next target pixel. For example, as shown in FIG. 24, when the upper-right neighbor is the first pixel found to have an ID, the search order "4" is recorded in the target pixel, and the search continues with that upper-right neighbor as the next target pixel. The values recorded in these examples are consistent with numbering the eight search positions from 0 (lower left = 0 through left = 7), so that the upper right, the fifth position searched, is recorded as "4". The search order "4" recorded in the target pixel indicates the direction in which the contour continues (here, from the target pixel toward the upper right) and corresponds to the direction information.

Then, as shown in FIG. 25, the eight pixels adjacent to the new target pixel are searched for a pixel having an ID, again in the counterclockwise search order shown in FIG. 23. A pixel that has already been processed (a pixel in which a search order has been recorded) does not have a search order recorded again even if it is reached by the search. In the example of FIG. 25, the lower-left neighbor of the target pixel (the previous target pixel, in which the search order "4" was recorded) already has a search order recorded, so the right neighbor of the target pixel is found as the pixel having an ID. The search order at this point, "3", is therefore recorded in the target pixel, and the search continues with the right neighbor as the next target pixel. In this way, as shown in FIG. 26, information indicating the search order (corresponding to the direction information) is recorded (set) for each pixel that makes up the contour of the aggregate region (the set of pixels assigned an ID). The contour of the aggregate region is thus the chain of pixels in which a search order has been recorded. Based on the features of the contour extracted in this way (here, the direction in which the pixels making up the contour of the object are lined up), the extraction unit 143 can calculate the back surface position and the side surface positions. For example, when the contour is traced counterclockwise (the case of FIG. 23), the distance to the back surface can be calculated from the Y coordinate value (the coordinate value in the depth direction, that is, the direction of the parallax value d) at which the most pixels carry direction information indicating the right-to-left direction (search order "7" in this example). Likewise, the Y coordinate value at which the most pixels carry direction information indicating the top-to-bottom direction (search order "1" in this example) can be calculated as the position of the left side surface, and the Y coordinate value at which the most pixels carry direction information indicating the bottom-to-top direction (search order "5" in this example) can be calculated as the position of the right side surface.
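The contour extraction with per-pixel direction codes can be sketched as below, using the numbering from the examples above (0 = lower left, 1 = below, 2 = lower right, 3 = right, 4 = upper right, 5 = above, 6 = upper left, 7 = left). One point is an assumption that goes beyond the text: after each step this sketch restarts the counterclockwise search from a position rotated back from the direction just taken, which is the usual boundary-following rule and keeps the walk on the outline of a filled region. The text itself only describes the fixed order of FIG. 23. Rows are numbered from top to bottom, and the names `trace_contour` and `back_row` are hypothetical.

```python
from collections import Counter

# (row step, column step) for direction codes 0..7, counterclockwise
# starting at the lower left; rows grow downward.
OFFSETS = [(1, -1), (1, 0), (1, 1), (0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1)]


def trace_contour(mask):
    """Map each contour pixel (row, col) to the code of the step leaving it."""
    rows, cols = len(mask), len(mask[0])

    def has_id(r, c):
        return 0 <= r < rows and 0 <= c < cols and mask[r][c] != 0

    # Start pixel: scan from the lower left toward the upper right.
    start = next(((r, c) for r in range(rows - 1, -1, -1)
                  for c in range(cols) if mask[r][c] != 0), None)
    codes = {}
    if start is None:
        return codes
    cur, search_from, first_code = start, 0, None
    for _ in range(8 * rows * cols):  # safety bound on the walk length
        found = None
        for k in range(8):
            code = (search_from + k) % 8
            if has_id(cur[0] + OFFSETS[code][0], cur[1] + OFFSETS[code][1]):
                found = code
                break
        if found is None:
            break  # a single isolated pixel has no contour step
        if cur == start and found == first_code:
            break  # back at the start, about to repeat the first step
        if first_code is None:
            first_code = found
        codes.setdefault(cur, found)  # a recorded pixel is not overwritten
        cur = (cur[0] + OFFSETS[found][0], cur[1] + OFFSETS[found][1])
        # Assumed rule: rotate the search start back from the last step.
        search_from = (found + 7) % 8 if found % 2 else (found + 6) % 8
    return codes


def back_row(codes):
    """Row holding the most right-to-left (code 7) contour pixels, or None."""
    counts = Counter(r for (r, _), code in codes.items() if code == 7)
    return counts.most_common(1)[0][0] if counts else None
```

For a filled 3 x 3 block, the walk runs along the bottom row with code 3, up the right column with code 5, back along the top row with code 7, and down the left column with code 1, and never enters the center pixel.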

Returning to FIG. 19, the description continues. The separation unit 144 separates the aggregate region based on the feature of the contour extracted by the extraction unit 143. More specifically, the separation unit 144 separates the aggregate region when the contour extracted by the extraction unit 143 has a plurality of side faces facing the same direction. In the present embodiment, the separation unit 144 determines, based on the direction information set for each of the pixels constituting the contour, whether a plurality of side faces facing the same direction exist. More specifically, the separation unit 144 searches each row of a region of interest (a region on the Standard Umap, which is the first correspondence information) that includes the aggregate region detected by the isolated region detection processing unit 140, in the horizontal direction (the direction of the horizontal (X) axis of the Standard Umap), and identifies as a target pixel each pixel for which direction information indicating the same vertical direction as its connection direction is counted for the second or subsequent time.

The separation unit 144 then generates second correspondence information in which each horizontal position is associated with a frequency value obtained by counting the target pixels along the depth direction (the direction of the vertical (Y) axis of the Standard Umap). Based on this second correspondence information, the separation unit 144 determines whether the frequency value corresponding to a second side face exceeds a threshold, and if it does, treats that second side face as a separation target. Furthermore, the separation unit 144 identifies as the separation position the horizontal position corresponding to the farthest position in the depth direction, in the direction from the horizontal position corresponding to the second side face to be separated toward the horizontal position corresponding to the first side face. The separation unit 144 then splits the single aggregate region into two aggregate regions with the identified separation position as the boundary, and reassigns IDs (assigns a separate ID to each).

More specifically, the processing is as follows. FIG. 27 is a flowchart illustrating an example of the processing (separation processing) performed by the separation unit 144. The separation unit 144 repeats the processing of FIG. 27 for each aggregate region whose contour has been extracted by the extraction unit 143. The flowchart of FIG. 27 is described below.

The separation unit 144 determines whether the aggregate region (object) is located on the right or the left (step S11). More specifically, from the left-end and right-end positions of the aggregate region, the separation unit 144 determines whether the region lies to the left or to the right of the center X coordinate of the Standard Umap. Here, as shown in FIG. 28, the distance from the center X coordinate of the Standard Umap to the X coordinate of the left end of the aggregate region is denoted Diff_Left, and the distance from the center X coordinate to the X coordinate of the right end of the aggregate region is denoted Diff_Right. When Diff_Left is larger than Diff_Right by a predetermined number of pixels (any number that allows left and right to be distinguished; in this example, four pixels, corresponding to 50 cm), the separation unit 144 determines that the aggregate region is located on the left. When Diff_Right is larger than Diff_Left by the predetermined number of pixels, the separation unit 144 determines that the aggregate region is located on the right. When neither condition holds, the separation unit 144 determines that the aggregate region is located at the center. In that case, the result of step S11 is negative and the processing ends. On the other hand, when the aggregate region is determined to be located on the right or the left, the result of step S11 is affirmative and the processing proceeds to step S12.
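The decision in step S11 can be sketched as below. This is an illustrative reading only: Diff_Left and Diff_Right are treated here as signed offsets from the center column, and "larger by the predetermined number of pixels" is taken to mean a difference of at least `margin` pixels. Both choices, and the name `classify_side`, are assumptions not stated in the text.

```python
def classify_side(left_x, right_x, center_x, margin=4):
    """Classify an aggregate region as 'left', 'right', or 'center' (step S11).

    left_x / right_x: X coordinates of the region's left and right ends.
    center_x: center X coordinate of the map.
    margin: pixel margin; 4 pixels (about 50 cm) is the example in the text.
    """
    diff_left = center_x - left_x    # assumed signed: positive if left end is left of center
    diff_right = right_x - center_x  # assumed signed: positive if right end is right of center
    if diff_left - diff_right >= margin:
        return "left"
    if diff_right - diff_left >= margin:
        return "right"
    return "center"  # step S11 is negative and separation processing ends


side = classify_side(left_x=60, right_x=90, center_x=50)
```

Only regions classified as left or right go on to step S12; a region straddling the center is left untouched.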

In step S12, the separation unit 144 generates the second correspondence information described above. In this example, the separation unit 144 searches each row of the region of interest that includes the aggregate region (which may be, for example, the detection rectangle on the Standard Umap) in the horizontal direction, and identifies as a target pixel each pixel for which direction information indicating the same vertical direction as its connection direction is counted for the second or subsequent time. The separation unit 144 then generates second correspondence information in which each position in the horizontal direction (the direction of the horizontal axis of the Standard Umap) is associated with a frequency value obtained by counting the target pixels along the depth direction (the direction of the vertical axis of the Standard Umap). For example, when the aggregate region is located on the right, the search starts at the left end of the region of interest and proceeds from left to right, and the separation unit 144 identifies as a target pixel each pixel for which direction information indicating the top-to-bottom direction (search order "1" in this example) is counted for the second or subsequent time.

For example, when the rows are searched in order from the bottom row as in FIG. 29, a pixel for which search order "1" is set is found in the third search, but because such a pixel has been found only once (the count of pixels with search order "1" is 1), no target pixel exists in that row. In the fourth search, as in FIG. 30, pixels with search order "1" are found twice (the count is 2), so a target pixel exists, and the pixel counted the second time is identified as the target pixel. The separation unit 144 then counts, for each position in the horizontal direction (the direction of the horizontal axis of the Standard Umap), the number of target pixels identified in this way along the depth direction (the direction of the vertical axis of the Standard Umap), and associates with that position a frequency value corresponding to the count. In the example of FIG. 31, a frequency value indicating a count of "5" (the number of target pixels counted along the depth direction) is associated with horizontal position P. The second correspondence information is generated in this way.
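The row-by-row counting of step S12 can be sketched as follows. The grid layout (a list of rows whose cells hold a recorded search order or `None`), the function name `second_correspondence`, and the `left_to_right` switch are hypothetical; the text describes only the case of a region on the right, scanned left to right with search order "1".

```python
def second_correspondence(grid, code=1, left_to_right=True):
    """Build the per-column frequency of target pixels (step S12).

    grid: list of rows over the region of interest; each cell is the search
          order recorded for a contour pixel, or None (hypothetical layout).
    code: direction code of the side face being looked for ("1" in the text
          for a region on the right).
    Returns a list whose entry x is the number of target pixels in column x.
    """
    width = len(grid[0]) if grid else 0
    freq = [0] * width
    for row in grid:
        xs = range(width) if left_to_right else range(width - 1, -1, -1)
        seen = 0
        for x in xs:
            if row[x] == code:
                seen += 1
                if seen >= 2:  # second or later hit in this row: a target pixel
                    freq[x] += 1
    return freq


roi = [
    [1, None, None, 1, None],     # two hits: the pixel at x=3 is a target pixel
    [1, None, None, 1, None],     # same
    [None, 1, None, None, None],  # one hit only: no target pixel in this row
]
freq = second_correspondence(roi)
```

In this toy region of interest, column 3 collects a frequency of 2 and every other column stays at 0.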

To handle cases such as an aggregate region with an oblique shape, the second correspondence information may instead associate with each horizontal position, as its frequency value, the total number of target pixels counted along the depth direction within a predetermined horizontal range that includes that position. For example, in FIG. 32, the predetermined range including horizontal position P is the range that includes the adjacent positions Q and R on its left and right, and the number of target pixels counted along the depth direction within this range is "5". Similarly, the predetermined range including horizontal position Q is the range that includes the adjacent positions S and P, and the number of target pixels counted along the depth direction within this range is "5". Likewise, the predetermined range including horizontal position R is the range that includes the adjacent positions P and T, and the number of target pixels counted along the depth direction within this range is "5". In this case, the second correspondence information associates a frequency value indicating a count of "5" with each of positions Q, P, and R.
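The windowed variant can be sketched as a sliding sum over the per-column counts. The helper name `windowed_frequency` and the `radius` parameter are illustrative; a radius of 1 corresponds to the range "the position plus its left and right neighbors" used in the FIG. 32 example. The input values below are made up so that all five target pixels fall in one column, which is only one arrangement that yields the counts quoted in the text.

```python
def windowed_frequency(freq, radius=1):
    """Sum per-column target-pixel counts over a horizontal window.

    freq: per-column counts of target pixels along the depth direction.
    radius: number of neighboring columns included on each side (assumed 1).
    The window is clipped at the edges of the region of interest.
    """
    n = len(freq)
    return [sum(freq[max(0, x - radius):min(n, x + radius + 1)]) for x in range(n)]


# Columns in the order S, Q, P, R, T with all five target pixels in column P.
smoothed = windowed_frequency([0, 0, 5, 0, 0])
```

With this input, positions Q, P, and R each receive a frequency of 5, while S and T receive 0.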

Returning to FIG. 27, the description continues. After generating the second correspondence information in step S12, the separation unit 144 determines whether the aggregate region (the aggregate region from which the second correspondence information was generated) is a separation target (step S13). In this example, the maximum frequency value (maximum frequency) of the second correspondence information can be regarded as the length of the second side face, and the separation unit 144 determines whether this maximum frequency is sufficiently long relative to the original depth of the region. Specifically, the separation unit 144 determines whether the maximum frequency of the second correspondence information is larger than a first threshold, which is used to judge whether the length is sufficient relative to the original depth (that is, whether a first condition is satisfied). In addition, focusing on the distance of the detected object (the distance in the depth direction), the separation unit 144 determines whether the maximum frequency of the second correspondence information (corresponding to the length of the second side face) is larger than a second threshold, which is used to check that an aggregate region merely shaped like a side face because of parallax error has not been detected as a second side face (that is, whether a second condition is satisfied). Here, the separation unit 144 determines that the aggregate region is a separation target when both the first condition and the second condition are satisfied. In short, the separation unit 144 determines, based on the second correspondence information, whether the frequency value corresponding to the second side face exceeds the thresholds, and if it does, treats the second side face as a separation target. Put another way, a frequency value in the second correspondence information that exceeds the thresholds is the frequency value corresponding to the second side face to be separated.
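The two-condition test of step S13 reduces to comparing the peak frequency against both thresholds, as in the sketch below. How the first threshold is derived from the region's original depth and how the second is derived from the object distance are not given in the text, so both are simply passed in as parameters; the function name is hypothetical.

```python
def is_separation_target(freq, first_threshold, second_threshold):
    """Return True when the region should be split (step S13).

    freq: second correspondence information as per-column frequency values.
    first_threshold: minimum side-face length relative to the region depth.
    second_threshold: minimum length guarding against parallax-error shapes.
    Both thresholds must be exceeded by the maximum frequency.
    """
    if not freq:
        return False
    peak = max(freq)  # regarded in the text as the length of the second side face
    return peak > first_threshold and peak > second_threshold


decision = is_separation_target([0, 5, 5, 5, 0], first_threshold=3, second_threshold=4)
```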

If the result of step S13 is affirmative (step S13: Yes), the separation unit 144 performs processing to separate the aggregate region (step S14). If the result of step S13 is negative (step S13: No), the processing ends. The specific content of step S14 is described below. The separation unit 144 determines that the horizontal position associated with the maximum frequency value of the second correspondence information is the position of the second side face. For example, in the case of FIG. 32, of the three horizontal positions (P, Q, R) associated with the maximum frequency value, position P, which is associated with the target pixel nearest in the depth direction (the target pixel with the largest parallax value; in this example, a pixel for which search order "1" is set), can be determined to be the position corresponding to the second side face. The separation unit 144 then identifies the pixel having the minimum parallax (the pixel farthest in the depth direction) in the direction from the horizontal position corresponding to the second side face toward the horizontal position corresponding to the first side face. When the aggregate region is located on the right, as shown in FIG. 33, the separation unit 144 identifies as the separation position the horizontal position of the pixel having the minimum parallax in the leftward direction from the position corresponding to the second side face. In other words, the horizontal position of the pixel with the largest blank area in the depth direction is identified as the separation position. Then, as shown in FIG. 34, the separation unit 144 splits the single aggregate region into two aggregate regions (a first aggregate region and a second aggregate region in the example of FIG. 34) with the separation position identified in this way as the boundary.
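One possible reading of the separation-position search in step S14 is sketched below. It interprets "the pixel with the largest blank area in the depth direction" as the column whose nearest occupied cell lies farthest from the near edge of the region of interest. The mask layout (row 0 nearest, larger row index farther), the inclusive scan range, the tie-breaking rule, and the function names are all assumptions for illustration.

```python
def blank_depth(mask, x):
    """Number of empty cells in column x before the first occupied cell.

    mask: list of rows of 0/1 values, with row 0 assumed to be the nearest depth.
    A completely empty column counts as blank over its full height.
    """
    for r, row in enumerate(mask):
        if row[x]:
            return r
    return len(mask)


def separation_column(mask, x_first, x_second):
    """Scan from the second side face toward the first and return the column
    with the largest blank depth (the first such column wins on ties)."""
    step = -1 if x_second > x_first else 1
    best_x, best_blank = None, -1
    for x in range(x_second, x_first + step, step):
        b = blank_depth(mask, x)
        if b > best_blank:
            best_x, best_blank = x, b
    return best_x


# Toy region: one object around columns 0-2, another around columns 4-5,
# joined through column 3 only at a far row.
mask = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 0, 0, 0, 1, 1],
    [1, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 1, 0],
]
split_x = separation_column(mask, x_first=0, x_second=4)
```

In this toy mask the scan settles on column 3, where the region is occupied only at a far row; pixels on either side of that column would then receive separate IDs.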

As described above, in the present embodiment the aggregate region is separated based on the feature of its contour extracted by the extraction unit 143. More specifically, the separation unit 144 separates the aggregate region when the contour extracted by the extraction unit 143 has a plurality of faces facing the same direction. This is because a single object does not normally have more than one face facing the same direction, so when multiple faces in the same direction exist, it can be judged that multiple objects have been joined and erroneously detected as one. This prevents multiple objects from being erroneously detected as a single combined object. That is, according to the present embodiment, sufficient object detection accuracy can be ensured.

Although embodiments of the present invention have been described above, the present invention is not limited to the embodiments exactly as described, and at the implementation stage the constituent elements may be modified and embodied without departing from the gist of the invention. Various inventions can also be formed by appropriately combining the constituent elements disclosed in the embodiments described above. For example, some constituent elements may be removed from the full set of constituent elements shown in an embodiment.

The program executed by the device control system 100 of the embodiment described above may be provided as a file in an installable or executable format recorded on a computer-readable recording medium such as a CD-ROM, a flexible disk (FD), a CD-R, a DVD (Digital Versatile Disk), or a USB (Universal Serial Bus) memory, or may be provided or distributed via a network such as the Internet. The various programs may also be provided preinstalled in a ROM or the like.

DESCRIPTION OF SYMBOLS
1A First camera unit
1B Second camera unit
5A, 5B Imaging lens
6A, 6B Image sensor
7A, 7B Sensor controller
10 Data bus line
11 Serial bus line
15 CPU
16 FPGA
17 ROM
18 RAM
19 Serial IF
20 Data IF
100 Device control system
101 Vehicle
102 Imaging unit
103 Analysis unit
104 Control unit
105 Display unit
106 Windshield
111 Preprocessing unit
112 Parallelized image generation unit
113 Parallax image generation unit
114 Object detection processing unit
121 Acquisition unit
122 Road surface detection processing unit
123 Clustering processing unit
124 Tracking processing unit
131 Road surface estimation unit
132 Generation unit
140 Isolated region detection processing unit
141 Surface detection processing unit
142 Interpolation processing unit
143 Extraction unit
144 Separation unit
150 Parallax image processing unit
160 Rejection processing unit

Japanese Patent No. 3349060

Claims

An information processing device comprising:
an acquisition unit that acquires a distance image having distance information for each pixel;
a generation unit that generates, based on the distance image, first correspondence information in which a position in a horizontal direction and a position in a depth direction are associated with each other;
a detection unit that detects an aggregate region of the distance information from the first correspondence information;
an extraction unit that extracts a contour of the aggregate region; and
a separation unit that separates the aggregate region based on a feature of the contour extracted by the extraction unit.

The information processing device according to claim 1, wherein the separation unit separates the aggregate region when the contour extracted by the extraction unit has a plurality of faces facing the same direction.

The information processing device according to claim 2, wherein the extraction unit sets, for each of a plurality of pixels constituting the contour, direction information indicating a direction in which the plurality of pixels are connected.

The information processing device according to claim 3, wherein the separation unit determines, based on the direction information set for each of the plurality of pixels constituting the contour, a state in which a plurality of side faces facing the same direction exist.

The information processing device according to claim 4, wherein the separation unit:
searches each row of a region of interest including the aggregate region in the horizontal direction, and identifies as a target pixel a pixel for which the direction information indicating the same vertical direction as the direction of connection is counted for the second or subsequent time; and
generates second correspondence information in which the position in the horizontal direction is associated with a frequency value obtained by counting the target pixels along the depth direction.

The information processing device according to claim 5, wherein the separation unit determines, based on the second correspondence information, whether a frequency value corresponding to a second side face exceeds a threshold, and treats the second side face as a separation target when the frequency value corresponding to the second side face exceeds the threshold.

The information processing device according to claim 6, wherein the separation unit identifies as a separation position the position in the horizontal direction corresponding to the farthest position in the depth direction, in a direction from the position in the horizontal direction corresponding to the second side face toward the position in the horizontal direction corresponding to a first side face.

An imaging device comprising the information processing device according to any one of claims 1 to 7.

A device control system comprising: the imaging device according to claim 8; and a control unit that controls a device based on an output result of the imaging device.

A moving body comprising the device control system according to claim 9, the moving body being controlled by the control unit.

An information processing method comprising:
an acquisition step of acquiring a distance image having distance information for each pixel;
a generation step of generating, based on the distance image, first correspondence information in which a position in a horizontal direction and a position in a depth direction are associated with each other;
a detection step of detecting an aggregate region of the distance information from the first information;
an extraction step of extracting a contour of the aggregate region; and
a separation step of separating the aggregate region based on a feature of the contour extracted in the extraction step.

A program for causing a computer to execute:
an acquisition step of acquiring a distance image having distance information for each pixel;
a generation step of generating, based on the distance image, first information in which a position in a horizontal direction and a position in a depth direction are associated with each other;
a detection step of detecting an aggregate region of the distance information from the first information;
an extraction step of extracting a contour of the aggregate region; and
a separation step of separating the aggregate region based on a feature of the contour extracted in the extraction step.