JP6515020B2

JP6515020B2 - Peripheral monitoring system for work machine

Info

Publication number: JP6515020B2
Application number: JP2015233978A
Authority: JP
Inventors: 芳永清田; 俊介大槻; 晋相澤
Original assignee: Sumitomo Heavy Industries Ltd
Current assignee: Sumitomo Heavy Industries Ltd
Priority date: 2015-11-30
Filing date: 2015-11-30
Publication date: 2019-05-15
Anticipated expiration: 2035-11-30
Also published as: JP2017102605A

Description

The present invention relates to a periphery monitoring system for a work machine that monitors the surroundings of the work machine.

A human body detection device is known that has an image sensor and a heat-sensing thermopile array whose imaging range and heat detection range overlap. The device limits the face extraction range to the human-like region indicated by the output of the thermopile array, which reduces unnecessary computation during image identification processing (see Patent Document 1).

Patent Document 1: Japanese Patent Application Laid-Open No. 2006-059015

However, the above device requires that the image sensor and the thermopile array be installed together and that the imaging range and the heat detection range overlap accurately, which complicates the system configuration.

In view of the above, it is desirable to provide a periphery monitoring system for a work machine that can detect a person around the work machine with a simpler system configuration.

A periphery monitoring system for a work machine according to an embodiment of the present invention detects a person present around the work machine using an image captured by an imaging device attached to the work machine. The system includes an extraction unit that extracts a part of the captured image as an identification processing target image, and an identification unit that identifies, by image recognition processing, whether an image included in the identification processing target image extracted by the extraction unit is an image of a person. The identification unit makes this identification based on two results: the identification result of a classifier generated by machine learning, and the identification result of an auxiliary identification unit that supplementarily identifies whether the image included in the identification processing target image is an image of a person based on a bias of image features in the identification processing target image.

The above means provide a periphery monitoring system for a work machine that can detect a person around the work machine with a simpler system configuration.

A side view of a shovel equipped with a periphery monitoring system according to an embodiment of the present invention.
A functional block diagram showing a configuration example of the periphery monitoring system.
Examples of images captured by the rear camera.
A schematic diagram showing an example of the geometric relationship used when cutting out an identification processing target image from a captured image.
A top view of the real space behind the shovel.
A diagram showing the flow of processing for generating a normalized image from a captured image.
A diagram showing the relationship among a captured image, an identification processing target image region, and a normalized image.
A diagram showing the relationship between an identification processing target image region and an identification processing unsuitable region.
A diagram showing examples of normalized images.
A schematic diagram showing another example of the geometric relationship used when cutting out an identification processing target image from a captured image.
A diagram showing an example of feature images in a captured image.
A functional block diagram showing a configuration example of the identification unit.
A diagram explaining identification processing by the luminance filter unit.
A conceptual diagram showing the person identification capability of the identification unit.
A conceptual diagram showing the relationship between a normalized image and weak classifiers.
A flowchart showing the flow of identification processing.
A flowchart showing the flow of identification processing.

FIG. 1 is a side view of a shovel, a construction machine on which a periphery monitoring system 100 according to an embodiment of the present invention is mounted. An upper swing body 3 is mounted on a lower traveling body 1 of the shovel via a swing mechanism 2. A boom 4 is attached to the upper swing body 3. An arm 5 is attached to the tip of the boom 4, and a bucket 6 is attached to the tip of the arm 5. The boom 4, the arm 5, and the bucket 6 constitute an excavation attachment and are hydraulically driven by a boom cylinder 7, an arm cylinder 8, and a bucket cylinder 9, respectively. The upper swing body 3 is provided with a cabin 10 and carries a power source such as an engine. An imaging device 40 is attached to the upper part of the upper swing body 3. Specifically, a rear camera 40B, a left side camera 40L, and a right side camera 40R are attached to the upper rear end, the upper left end, and the upper right end of the upper swing body 3, respectively. A controller 30 and an output device 50 are installed in the cabin 10.

FIG. 2 is a functional block diagram showing a configuration example of the periphery monitoring system 100. The periphery monitoring system 100 mainly includes the controller 30, the imaging device 40, and the output device 50.

The controller 30 is a control device that performs drive control of the shovel. In this embodiment, the controller 30 is an arithmetic processing unit that includes a CPU and an internal memory, and it realizes various functions by causing the CPU to execute a drive control program stored in the internal memory.

The controller 30 also determines, based on the outputs of various devices, whether a person is present around the shovel, and controls various devices according to the determination result. Specifically, the controller 30 receives the outputs of the imaging device 40 and an input device 41 and executes software programs corresponding to an extraction unit 31, an identification unit 32, a tracking unit 33, and a control unit 35. According to the execution result, it either outputs a control command to a machine control device 51 to perform drive control of the shovel or causes the output device 50 to output various information. The controller 30 may instead be a control device dedicated to image processing.

The imaging device 40 captures images of the surroundings of the shovel and outputs the captured images to the controller 30. In this embodiment, the imaging device 40 is a wide-angle camera that uses an imaging element such as a CCD, and it is attached to the upper part of the upper swing body 3 so that its optical axis points obliquely downward.

The input device 41 receives input from the operator. In this embodiment, the input device 41 includes operation devices (operation levers, operation pedals, and the like), a gate lock lever, buttons installed at the tips of the operation devices, buttons attached to an in-vehicle display, a touch panel, and the like.

The output device 50 outputs various information and includes, for example, an in-vehicle display that displays image information, an in-vehicle speaker that outputs audio information, an alarm buzzer, and an alarm lamp. In this embodiment, the output device 50 outputs various information in response to control commands from the controller 30.

The machine control device 51 controls the movement of the shovel and includes, for example, a control valve that controls the flow of hydraulic oil in the hydraulic system, a gate lock valve, and an engine control device.

The extraction unit 31 is a functional element that extracts an identification processing target image from an image captured by the imaging device 40. Specifically, the extraction unit 31 extracts the identification processing target image by image processing with a relatively small amount of computation (hereinafter, "pre-stage image recognition processing"). This processing extracts simple features based on local luminance gradients or edges, geometric features obtained by a Hough transform or the like, and features related to the area or aspect ratio of regions segmented by luminance. The identification processing target image is an image portion (a part of the captured image) that is subject to subsequent image processing, and it includes a human candidate image. A human candidate image is an image portion (a part of the captured image) that is considered likely to be a human image.

The identification unit 32 is a functional element that identifies whether a human candidate image included in the identification processing target image extracted by the extraction unit 31 is a human image. Specifically, the identification unit 32 makes this identification by image processing with a relatively large amount of computation (hereinafter, "post-stage image recognition processing"), such as image recognition processing that uses an image feature description typified by HOG (Histograms of Oriented Gradients) features together with a classifier generated by machine learning. The more accurately the extraction unit 31 extracts identification processing target images, the higher the proportion of human candidate images that the identification unit 32 identifies as human images. When a captured image of the desired quality cannot be obtained, for example in an environment unsuited to imaging such as at night or in bad weather, the identification unit 32 may identify every human candidate image in the identification processing target images extracted by the extraction unit 31 as a human image. This prevents a person from going undetected.
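The fallback behavior described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: `classify` is a hypothetical stand-in for the HOG-based classifier, and `imaging_conditions_ok` is a hypothetical flag for whatever check decides that image quality is adequate.

```python
from typing import Callable, List, Sequence, TypeVar

Candidate = TypeVar("Candidate")


def identify_persons(
    candidates: Sequence[Candidate],
    classify: Callable[[Candidate], bool],
    imaging_conditions_ok: bool,
) -> List[bool]:
    """Return one person / non-person decision per human candidate image."""
    if not imaging_conditions_ok:
        # Poor imaging conditions (night, bad weather): accept every
        # candidate as a person so that nobody goes undetected.
        return [True] * len(candidates)
    # Normal conditions: run the costly post-stage classifier per candidate.
    return [classify(c) for c in candidates]
```

The design trades false alarms for safety: when the classifier cannot be trusted, the output of the cheaper pre-stage extraction is used as is.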

Next, with reference to FIG. 3, how a human image appears in an image of the area behind the shovel captured by the rear camera 40B will be described. The two captured images in FIG. 3 are examples of images captured by the rear camera 40B. The dotted circles in FIG. 3 indicate the presence of a human image and are not displayed in the actual captured images.

The rear camera 40B is a wide-angle camera and is mounted at a height from which it looks down on a person obliquely from above. The appearance of a human image in the captured image therefore varies greatly with the direction of the person as seen from the rear camera 40B. For example, a human image appears more inclined the closer it is to the left or right edge of the captured image. This image leaning is caused by the wide-angle lens. In addition, the head appears larger the closer the person is to the rear camera 40B, and the legs become hidden in the blind spot of the shovel body. These effects result from the installation position of the rear camera 40B. It is therefore difficult to identify a human image in the captured image by image processing without first processing the captured image.

The periphery monitoring system 100 according to the embodiment of the present invention therefore normalizes the identification processing target image to make the human image it contains easier to identify. Here, "normalization" means converting the identification processing target image into an image of a predetermined size and a predetermined shape. In this embodiment, an identification processing target image, which can take various shapes in the captured image, is converted into a rectangular image of a predetermined size by projective transformation. For example, a projective transformation matrix with eight variables is used.
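A projective transformation has eight free parameters, and four point correspondences are enough to determine them. The sketch below solves for those parameters so that the four corners of a quadrilateral region map onto the corners of a rectangle. The function names and the sample coordinates are illustrative assumptions; the source does not specify how the matrix is computed.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography_from_quad(src: Sequence[Point], dst: Sequence[Point]) -> List[float]:
    """Return h0..h7 such that each src point maps to its dst point.

    u = (h0*x + h1*y + h2) / (h6*x + h7*y + 1)
    v = (h3*x + h4*y + h5) / (h6*x + h7*y + 1)
    """
    a: List[List[float]] = []
    b: List[float] = []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1.0, 0.0, 0.0, 0.0, -u * x, -u * y])
        b.append(u)
        a.append([0.0, 0.0, 0.0, x, y, 1.0, -v * x, -v * y])
        b.append(v)
    return solve_linear(a, b)


def apply_homography(h: Sequence[float], p: Point) -> Point:
    """Map one point through the 8-parameter projective transformation."""
    x, y = p
    w = h[6] * x + h[7] * y + 1.0
    return ((h[0] * x + h[1] * y + h[2]) / w, (h[3] * x + h[4] * y + h[5]) / w)


# Example: an arbitrary (made-up) quadrilateral mapped to a 32 x 64 rectangle.
quad = [(10.0, 20.0), (50.0, 25.0), (60.0, 90.0), (5.0, 80.0)]
rect = [(0.0, 0.0), (32.0, 0.0), (32.0, 64.0), (0.0, 64.0)]
params = homography_from_quad(quad, rect)
```

Fixing the ninth matrix entry to 1 is what leaves eight unknowns; it is a common convention and is assumed here.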

An example of the processing by which the periphery monitoring system 100 normalizes the identification processing target image (hereinafter, "normalization processing") will now be described with reference to FIGS. 4 to 6. FIG. 4 is a schematic diagram showing an example of the geometric relationship that the extraction unit 31 uses when cutting out an identification processing target image from a captured image.

The box BX in FIG. 4 is a virtual three-dimensional object in real space and, in this embodiment, is a virtual rectangular parallelepiped defined by eight vertices A to H. The point Pr is a reference point set in advance for referring to an identification processing target image. In this embodiment, the reference point Pr is a point set in advance as the assumed standing position of a person and is located at the center of the quadrangle ABCD defined by the four vertices A to D. The size of the box BX is set based on the orientation, stride, height, and the like of a person. In this embodiment, the quadrangles ABCD and EFGH are squares with a side length of, for example, 800 mm, and the height of the rectangular parallelepiped is, for example, 1800 mm. That is, the box BX is a rectangular parallelepiped 800 mm wide, 800 mm deep, and 1800 mm high.

The quadrangle ABGH defined by the four vertices A, B, G, and H forms a virtual plane region TR that corresponds to the region of the identification processing target image in the captured image. The quadrangle ABGH serving as the virtual plane region TR is inclined with respect to the virtual ground, which is a horizontal plane.

In this embodiment, the box BX, a virtual rectangular parallelepiped, is used to define the relationship between the reference point Pr and the virtual plane region TR. However, as long as a virtual plane region TR that faces the imaging device 40 and is inclined with respect to the virtual ground can be defined in association with an arbitrary reference point Pr, other geometric relationships, such as one using a different virtual three-dimensional object, may be used, as may other mathematical relationships such as functions or conversion tables.

FIG. 5 is a top view of the real space behind the shovel and shows the positional relationship between the rear camera 40B and the virtual plane regions TR1 and TR2 when those regions are referenced using the reference points Pr1 and Pr2. In this embodiment, a reference point Pr can be placed at each grid point of a virtual grid on the virtual ground. However, the reference points Pr may instead be placed irregularly on the virtual ground, or at equal intervals on line segments that extend radially from the projection point of the rear camera 40B onto the virtual ground. For example, the line segments extend radially at 1-degree intervals, and the reference points Pr are placed on each line segment at 100 mm intervals.
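The radial layout in the example above (line segments every 1 degree, points every 100 mm) can be generated as in the following sketch. The angular range and the maximum range are parameters chosen for illustration; the source does not state them.

```python
import math
from typing import List, Tuple


def radial_reference_points(
    origin: Tuple[float, float],
    start_deg: float,
    end_deg: float,
    max_range_mm: float,
    angle_step_deg: float = 1.0,
    spacing_mm: float = 100.0,
) -> List[Tuple[float, float]]:
    """Place reference points on rays that fan out from the camera's ground projection.

    origin is the projection point of the camera onto the virtual ground.
    Integer counters are used so that floating-point drift does not add
    or drop a ray or a point.
    """
    ox, oy = origin
    n_rays = int(round((end_deg - start_deg) / angle_step_deg)) + 1
    n_steps = int(max_range_mm // spacing_mm)
    points: List[Tuple[float, float]] = []
    for i in range(n_rays):
        theta = math.radians(start_deg + i * angle_step_deg)
        for k in range(1, n_steps + 1):
            r = k * spacing_mm
            points.append((ox + r * math.cos(theta), oy + r * math.sin(theta)))
    return points
```

Compared with a square grid, this layout keeps the angular sampling uniform as seen from the camera, at the cost of sparser coverage far from it.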

As shown in FIGS. 4 and 5, when the virtual plane region TR1 is referenced using the reference point Pr1, the first face of the box BX, defined by the quadrangle ABFE (see FIG. 4), is arranged so that it directly faces the rear camera 40B. That is, the line segment connecting the rear camera 40B and the reference point Pr1 is orthogonal, in top view, to the first face of the box BX placed in relation to the reference point Pr1. Likewise, when the virtual plane region TR2 is referenced using the reference point Pr2, the first face of the box BX is arranged so that it directly faces the rear camera 40B. That is, the line segment connecting the rear camera 40B and the reference point Pr2 is orthogonal, in top view, to the first face of the box BX placed in relation to the reference point Pr2. This relationship holds no matter which grid point the reference point Pr is placed on. In other words, the box BX is always arranged so that its first face directly faces the rear camera 40B.
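The placement rule can be sketched as below: given the camera's ground projection and a reference point Pr, the box is rotated about the vertical axis so that face ABFE is perpendicular, in top view, to the line from the camera to Pr. The vertex layout is an assumption inferred from the text (A and B on the ground nearest the camera, E and F above them, G and H above C and D); FIG. 4 is not available here to confirm it.

```python
import math
from typing import Dict, List, Tuple

Point3 = Tuple[float, float, float]


def box_vertices(
    camera_xy: Tuple[float, float],
    pr_xy: Tuple[float, float],
    side_mm: float = 800.0,
    height_mm: float = 1800.0,
) -> Dict[str, Point3]:
    """Return vertices A..H of box BX centred on Pr with face ABFE facing the camera."""
    cx, cy = camera_xy
    px, py = pr_xy
    dist = math.hypot(px - cx, py - cy)
    if dist == 0.0:
        raise ValueError("reference point coincides with the camera projection")
    ux, uy = (px - cx) / dist, (py - cy) / dist  # depth direction (camera -> Pr)
    vx, vy = -uy, ux                             # lateral direction
    half = side_mm / 2.0

    def corner(depth_sign: float, lateral_sign: float, z: float) -> Point3:
        return (
            px + depth_sign * half * ux + lateral_sign * half * vx,
            py + depth_sign * half * uy + lateral_sign * half * vy,
            z,
        )

    return {
        "A": corner(-1, -1, 0.0), "B": corner(-1, +1, 0.0),
        "C": corner(+1, +1, 0.0), "D": corner(+1, -1, 0.0),
        "E": corner(-1, -1, height_mm), "F": corner(-1, +1, height_mm),
        "G": corner(+1, +1, height_mm), "H": corner(+1, -1, height_mm),
    }


def virtual_plane_region(v: Dict[str, Point3]) -> List[Point3]:
    """Quadrangle ABGH: near bottom edge to far top edge, inclined to the ground."""
    return [v["A"], v["B"], v["G"], v["H"]]
```

Because the rotation depends only on the direction from the camera to Pr, the same function serves every grid point.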

FIG. 6 is a diagram showing the flow of processing for generating a normalized image from a captured image. Specifically, FIG. 6(A) is an example of an image captured by the rear camera 40B and shows the box BX placed in relation to the reference point Pr in real space. FIG. 6(B) shows the region of the identification processing target image cut out of the captured image (hereinafter, "identification processing target image region TRg"), which corresponds to the virtual plane region TR shown in the captured image of FIG. 6(A). FIG. 6(C) shows a normalized image TRgt obtained by normalizing the identification processing target image having the identification processing target image region TRg.

As shown in FIG. 6(A), the box BX placed in real space in relation to the reference point Pr1 determines the position of the virtual plane region TR in real space, which in turn determines the identification processing target image region TRg on the captured image that corresponds to the virtual plane region TR.

Thus, once the position of the reference point Pr in real space is determined, the position of the virtual plane region TR in real space is uniquely determined, and so is the identification processing target image region TRg in the captured image. The extraction unit 31 can then normalize the identification processing target image having the identification processing target image region TRg to generate a normalized image TRgt of a predetermined size. In this embodiment, the size of the normalized image TRgt is, for example, 64 pixels high by 32 pixels wide.
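One way to produce the fixed-size normalized image is inverse mapping: for every pixel of the 64 x 32 output, compute where it comes from in the captured image and sample there. The sketch below assumes the eight projective parameters (mapping output coordinates to captured-image coordinates) are already known and uses nearest-neighbour sampling for brevity. Both choices are assumptions, not details given in the source.

```python
from typing import List, Sequence

Image = List[List[int]]


def warp_to_normalized(
    image: Image,
    h_dst_to_src: Sequence[float],
    width: int = 32,
    height: int = 64,
    fill: int = 0,
) -> Image:
    """Sample a captured image into a width x height normalized image.

    h_dst_to_src holds h0..h7 of a projective transformation that maps an
    output pixel (u, v) to a source position (x, y):
        x = (h0*u + h1*v + h2) / (h6*u + h7*v + 1)
        y = (h3*u + h4*v + h5) / (h6*u + h7*v + 1)
    Output pixels that fall outside the captured image keep the fill value.
    """
    h = h_dst_to_src
    rows, cols = len(image), len(image[0])
    out = [[fill] * width for _ in range(height)]
    for v in range(height):
        for u in range(width):
            w = h[6] * u + h[7] * v + 1.0
            x = (h[0] * u + h[1] * v + h[2]) / w
            y = (h[3] * u + h[4] * v + h[5]) / w
            xi, yi = int(round(x)), int(round(y))  # nearest neighbour
            if 0 <= xi < cols and 0 <= yi < rows:
                out[v][u] = image[yi][xi]
    return out
```

Iterating over output pixels rather than input pixels guarantees that every pixel of the normalized image receives a value, whatever the shape of the source region.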

FIG. 7 is a diagram showing the relationship among a captured image, an identification processing target image region, and a normalized image. Specifically, FIG. 7(A1) shows an identification processing target image region TRg3 in the captured image, and FIG. 7(A2) shows a normalized image TRgt3 of the identification processing target image having the region TRg3. FIG. 7(B1) shows an identification processing target image region TRg4 in the captured image, and FIG. 7(B2) shows a normalized image TRgt4 of the identification processing target image having the region TRg4. Similarly, FIG. 7(C1) shows an identification processing target image region TRg5 in the captured image, and FIG. 7(C2) shows a normalized image TRgt5 of the identification processing target image having the region TRg5.

As shown in FIG. 7, the identification processing target image region TRg5 in the captured image is larger than the identification processing target image region TRg4. This is because the distance between the rear camera 40B and the virtual plane region corresponding to TRg5 is smaller than the distance between the rear camera 40B and the virtual plane region corresponding to TRg4. Similarly, the region TRg4 is larger than the region TRg3, because the virtual plane region corresponding to TRg4 is closer to the rear camera 40B than the virtual plane region corresponding to TRg3. That is, the larger the distance between the corresponding virtual plane region and the rear camera 40B, the smaller the identification processing target image region in the captured image. The normalized images TRgt3, TRgt4, and TRgt5, on the other hand, are all rectangular images of the same size.
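The distance dependence can be checked with an idealized pinhole model, as sketched below. The camera height (2000 mm), downward pitch (30 degrees), and focal length (500 pixels) are made-up values, and lens distortion is ignored, so this is only an approximation of a real wide-angle camera. The region is placed straight ahead of the camera with its near edge on the ground and its far edge at the top of the box.

```python
import math
from typing import List, Sequence, Tuple

Point2 = Tuple[float, float]
Point3 = Tuple[float, float, float]


def project(p: Point3, cam_height_mm: float, pitch_deg: float, focal_px: float) -> Point2:
    """Project a world point (x right, y forward, z up) with a pitched-down pinhole camera."""
    pitch = math.radians(pitch_deg)
    x, y, z = p[0], p[1], p[2] - cam_height_mm
    depth = y * math.cos(pitch) - z * math.sin(pitch)  # along the optical axis
    up = y * math.sin(pitch) + z * math.cos(pitch)     # along the camera's up axis
    if depth <= 0.0:
        raise ValueError("point is behind the camera")
    return (focal_px * x / depth, -focal_px * up / depth)


def polygon_area(pts: Sequence[Point2]) -> float:
    """Shoelace formula."""
    n = len(pts)
    s = sum(pts[i][0] * pts[(i + 1) % n][1] - pts[(i + 1) % n][0] * pts[i][1]
            for i in range(n))
    return abs(s) / 2.0


def region_area_px(
    distance_mm: float,
    cam_height_mm: float = 2000.0,
    pitch_deg: float = 30.0,
    focal_px: float = 500.0,
    side_mm: float = 800.0,
    height_mm: float = 1800.0,
) -> float:
    """Image area of the inclined region ABGH whose reference point is distance_mm ahead."""
    half = side_mm / 2.0
    quad: List[Point3] = [
        (-half, distance_mm - half, 0.0),       # A: near, ground
        (half, distance_mm - half, 0.0),        # B: near, ground
        (half, distance_mm + half, height_mm),  # G: far, top
        (-half, distance_mm + half, height_mm), # H: far, top
    ]
    return polygon_area([project(q, cam_height_mm, pitch_deg, focal_px) for q in quad])


areas = [region_area_px(d) for d in (2000.0, 4000.0, 6000.0)]
```

Under these assumed values the projected area shrinks as the region moves away from the camera, which is why a fixed-size normalized image is needed before classification.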

In this way, the extraction unit 31 normalizes identification processing target images, which can take various shapes and sizes in the captured image, into rectangular images of a predetermined size, and thereby normalizes human candidate images that include human images. Specifically, the extraction unit 31 places the image portion estimated to be the head of the human candidate image (hereinafter, "head image portion") in a predetermined region of the normalized image. It places the image portion estimated to be the torso of the human candidate image (hereinafter, "torso image portion") in another predetermined region of the normalized image, and the image portion estimated to be the legs of the human candidate image (hereinafter, "leg image portion") in yet another predetermined region. The extraction unit 31 can also obtain the normalized image with the inclination (image leaning) of the human candidate image relative to the shape of the normalized image suppressed.

Next, with reference to FIG. 8, normalization processing will be described for the case where the identification processing target image region includes an image region that is unsuitable for identification and adversely affects the identification of a human image (hereinafter, "identification processing unsuitable region"). An identification processing unsuitable region is a known region in which a human image cannot exist. Examples include a region in which the body of the shovel appears (hereinafter, "vehicle body capture region") and a region that extends beyond the captured image (hereinafter, "protruding region"). FIG. 8 shows the relationship between the identification processing target image region and the identification processing unsuitable regions and corresponds to FIGS. 7(C1) and 7(C2). In the left part of FIG. 8, the region hatched with lines sloping down to the right corresponds to the protruding region R1, and the region hatched with lines sloping down to the left corresponds to the vehicle body capture region R2.

In this embodiment, when the identification processing target image region TRg5 includes the protruding region R1 and a part of the vehicle body capture region R2, the extraction unit 31 masks those identification processing unsuitable regions and then generates the normalized image TRgt5 of the identification processing target image having the region TRg5. Alternatively, the extraction unit 31 may generate the normalized image TRgt5 first and then mask the portions of TRgt5 that correspond to the identification processing unsuitable regions.

The right part of FIG. 8 shows the normalized image TRgt5. In the right part of FIG. 8, the region hatched with lines sloping down to the right represents the mask region M1 corresponding to the protruding region R1, and the region hatched with lines sloping down to the left represents the mask region M2 corresponding to a part of the vehicle body capture region R2.

By masking the image of the identification processing unsuitable regions in this way, the extraction unit 31 prevents that image from affecting the identification processing performed by the identification unit 32. With this masking, the identification unit 32 can identify whether the image is a human image using only the image in the regions of the normalized image other than the mask regions, without being affected by the image of the identification processing unsuitable regions. The extraction unit 31 may use any other known method instead of masking to keep the image of the identification processing unsuitable regions from affecting the identification processing by the identification unit 32.
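The masking step can be sketched as follows, with images represented as nested lists of luminance values and the unsuitable regions as a boolean mask of the same shape. The mean-luminance function is only a stand-in for whatever feature the identification unit actually computes; it shows a feature being evaluated over unmasked pixels only. All names here are illustrative.

```python
from typing import List

Image = List[List[int]]
Mask = List[List[bool]]


def apply_mask(image: Image, unsuitable: Mask, fill: int = 0) -> Image:
    """Return a copy of image with every unsuitable pixel replaced by fill."""
    return [
        [fill if unsuitable[r][c] else image[r][c] for c in range(len(image[0]))]
        for r in range(len(image))
    ]


def mean_luminance_outside_mask(image: Image, unsuitable: Mask) -> float:
    """Compute a feature from unmasked pixels only, so masked pixels cannot bias it."""
    values = [
        image[r][c]
        for r in range(len(image))
        for c in range(len(image[0]))
        if not unsuitable[r][c]
    ]
    if not values:
        raise ValueError("every pixel is masked")
    return sum(values) / len(values)
```

Overwriting masked pixels with a constant and excluding them from feature computation are two different strategies; the second avoids introducing an artificial edge at the mask boundary.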

Next, with reference to FIG. 9, the characteristics of the normalized images generated by the extraction unit 31 will be described. FIG. 9 shows examples of normalized images. Of the 14 normalized images shown in FIG. 9, those closer to the left end of the figure contain images of human candidates located closer to the rear camera 40B, and those closer to the right end contain images of human candidates located farther from the rear camera 40B.

図９に示すように、抽出部３１は、実空間における仮想平面領域ＴＲと後方カメラ４０Ｂとの間の後方水平距離（図５に示すＹ軸方向の水平距離）に関係なく、何れの正規化画像内においてもほぼ同じ割合で頭部画像部分、胴体部画像部分、脚部画像部分等を配置できる。そのため、抽出部３１は、識別部３２が識別処理を実行する際の演算負荷を低減でき、且つ、その識別結果の信頼性を向上できる。なお、上述の後方水平距離は、実空間における仮想平面領域ＴＲと後方カメラ４０Ｂとの間の位置関係に関する情報の一例であり、抽出部３１は、抽出した識別処理対象画像にその情報を付加する。また、上述の位置関係に関する情報は、仮想平面領域ＴＲに対応する参照点Ｐｒと後方カメラ４０Ｂとを結ぶ線分の後方カメラ４０Ｂの光軸に対する上面視角度等を含む。 As shown in FIG. 9, the extraction unit 31 can arrange the head image portion, the torso image portion, the leg image portion, and so on at substantially the same proportions in every normalized image, regardless of the rear horizontal distance (the horizontal distance in the Y-axis direction shown in FIG. 5) between the virtual plane region TR and the rear camera 40B in real space. Therefore, the extraction unit 31 can reduce the computational load when the identification unit 32 executes the identification processing, and can improve the reliability of the identification result. The above-mentioned rear horizontal distance is an example of information on the positional relationship between the virtual plane region TR and the rear camera 40B in real space, and the extraction unit 31 adds that information to the extracted identification processing target image. The information on the positional relationship also includes, for example, the top-view angle, relative to the optical axis of the rear camera 40B, of the line segment connecting the rear camera 40B and the reference point Pr corresponding to the virtual plane region TR.

以上の構成により、周辺監視システム１００は、撮像装置４０の方向を向き且つ水平面である仮想地面に対して傾斜する仮想平面領域ＴＲに対応する識別処理対象画像領域ＴＲｇから正規化画像ＴＲｇｔを生成する。そのため、人の高さ方向及び奥行き方向の見え方を考慮した正規化を実現できる。その結果、人を斜め上から撮像するように建設機械に取り付けられる撮像装置４０の撮像画像を用いた場合であっても建設機械の周囲に存在する人をより確実に検知できる。特に、人が撮像装置４０に接近した場合であっても、撮像画像上の十分な大きさの領域を占める識別処理対象画像から正規化画像を生成できるため、その人を確実に検知できる。 With the above configuration, the periphery monitoring system 100 generates the normalized image TRgt from the identification processing target image region TRg corresponding to the virtual plane region TR, which faces the imaging device 40 and is inclined with respect to the virtual ground, a horizontal plane. Therefore, normalization that takes into account how a person appears in both the height direction and the depth direction can be realized. As a result, even when using an image captured by the imaging device 40, which is attached to the construction machine so as to image a person obliquely from above, a person present around the construction machine can be detected more reliably. In particular, even when a person approaches the imaging device 40, the normalized image can be generated from an identification processing target image that occupies a sufficiently large region of the captured image, so that the person can be detected reliably.

また、周辺監視システム１００は、実空間における仮想直方体であるボックスＢＸの４つの頂点Ａ、Ｂ、Ｇ、Ｈで形成される矩形領域として仮想平面領域ＴＲを定義する。そのため、実空間における参照点Ｐｒと仮想平面領域ＴＲとを幾何学的に対応付けることができ、さらには、実空間における仮想平面領域ＴＲと撮像画像における識別処理対象画像領域ＴＲｇとを幾何学的に対応付けることができる。 In addition, the periphery monitoring system 100 defines the virtual plane region TR as the rectangular region formed by the four vertices A, B, G, and H of the box BX, which is a virtual rectangular parallelepiped in real space. Therefore, the reference point Pr in real space can be geometrically associated with the virtual plane region TR, and the virtual plane region TR in real space can in turn be geometrically associated with the identification processing target image region TRg in the captured image.

また、抽出部３１は、識別処理対象画像領域ＴＲｇに含まれる識別処理不適領域の画像をマスク処理する。そのため、識別部３２は、車体映り込み領域Ｒ２を含む識別処理不適領域の画像の影響を受けることなく、正規化画像におけるマスク領域以外の領域の画像を用いて人画像であるかを識別できる。 In addition, the extraction unit 31 performs mask processing on the image of the identification processing unsuitable region included in the identification processing target image region TRg. Therefore, the identification unit 32 can identify whether it is a human image using an image of a region other than the mask region in the normalized image without being affected by the image of the identification inappropriate region including the body reflection region R2.

また、抽出部３１は、参照点Ｐｒ毎に識別処理対象画像を抽出可能である。また、識別処理対象画像領域ＴＲｇのそれぞれは、対応する仮想平面領域ＴＲを介して、人の想定立ち位置として予め設定される参照点Ｐｒの１つに関連付けられる。そのため、周辺監視システム１００は、人が存在する可能性が高い参照点Ｐｒを任意の方法で抽出することで、人候補画像を含む可能性が高い識別処理対象画像を抽出できる。この場合、人候補画像を含む可能性が低い識別処理対象画像に対して、比較的演算量の多い画像処理による識別処理が施されてしまうのを防止でき、人検知処理の高速化を実現できる。 Further, the extraction unit 31 can extract an identification processing target image for each reference point Pr. Each identification processing target image region TRg is associated, via the corresponding virtual plane region TR, with one of the reference points Pr preset as assumed standing positions of a person. Therefore, by extracting, with any method, the reference points Pr at which a person is likely to be present, the periphery monitoring system 100 can extract the identification processing target images that are likely to contain a human candidate image. This prevents computationally expensive image-based identification processing from being applied to identification processing target images that are unlikely to contain a human candidate image, and thus speeds up the human detection processing.

次に、図１０及び図１１を参照し、人候補画像を含む可能性が高い識別処理対象画像を抽出部３１が抽出する処理の一例について説明する。なお、図１０は、抽出部３１が撮像画像から識別処理対象画像を切り出す際に用いる幾何学的関係の一例を示す概略図であり、図４に対応する。また、図１１は、撮像画像における特徴画像の一例を示す図である。なお、特徴画像は、人の特徴的な部分を表す画像であり、望ましくは、実空間における地面からの高さが変化し難い部分を表す画像である。そのため、特徴画像は、例えば、ヘルメットの画像、肩の画像、頭の画像、人に取り付けられる反射板若しくはマーカの画像等を含む。 Next, with reference to FIG. 10 and FIG. 11, an example of the process by which the extraction unit 31 extracts an identification processing target image that is likely to contain a human candidate image will be described. FIG. 10 is a schematic view showing an example of the geometrical relationship used when the extraction unit 31 cuts out an identification processing target image from a captured image, and corresponds to FIG. 4. FIG. 11 is a view showing an example of a feature image in a captured image. A feature image is an image representing a characteristic part of a person, desirably a part whose height above the ground in real space does not change easily. Feature images therefore include, for example, an image of a helmet, an image of a shoulder, an image of a head, and an image of a reflector or marker attached to a person.

特に、ヘルメットは、その形状がおよそ球体であり、その投影像が撮像画像上に投影されたときに撮像方向によらず常に円形に近いという特徴を有する。また、ヘルメットは、表面が硬質で光沢又は半光沢を有し、その投影像が撮像画像上に投影されたときに局所的な高輝度領域とその領域を中心とする放射状の輝度勾配を生じさせ易いという特徴を有する。そのため、ヘルメットの画像は、特徴画像として特に相応しい。なお、その投影像が円形に近いという特徴、局所的な高輝度領域を中心とする放射状の輝度勾配を生じさせ易いという特徴等は、撮像画像からヘルメットの画像を見つけ出す画像処理のために利用されてもよい。また、撮像画像からヘルメットの画像を見つけ出す画像処理は、例えば、輝度平滑化処理、ガウス平滑化処理、輝度極大点探索処理、輝度極小点探索処理等を含む。 In particular, a helmet is approximately spherical, so its projection onto the captured image is always nearly circular regardless of the imaging direction. A helmet also has a hard, glossy or semi-glossy surface, so its projection onto the captured image tends to produce a local high-luminance region and a radial luminance gradient centered on that region. The helmet image is therefore particularly suitable as a feature image. The near-circular projection, the tendency to produce a radial luminance gradient centered on a local high-luminance region, and similar characteristics may be used in the image processing that finds the helmet image in the captured image. That image processing includes, for example, luminance smoothing, Gaussian smoothing, a search for luminance maxima, and a search for luminance minima.

本実施例では、抽出部３１は、前段画像認識処理によって、撮像画像におけるヘルメット画像（厳密にはヘルメットであると推定できる画像）を見つけ出す。ショベルの周囲で作業する人はヘルメットを着用していると考えられるためである。そして、抽出部３１は、見つけ出したヘルメット画像の位置から最も関連性の高い参照点Ｐｒを導き出す。その上で、抽出部３１は、その参照点Ｐｒに対応する識別処理対象画像を抽出する。 In the present embodiment, the extraction unit 31 finds a helmet image (strictly speaking, an image that can be estimated to be a helmet) in the captured image by the pre-stage image recognition process. This is because a person working around the shovel is considered to be wearing a helmet. Then, the extraction unit 31 derives the most relevant reference point Pr from the position of the helmet image found. Then, the extraction unit 31 extracts an identification processing target image corresponding to the reference point Pr.

具体的には、抽出部３１は、図１０に示す幾何学的関係を利用し、撮像画像におけるヘルメット画像の位置から関連性の高い参照点Ｐｒを導き出す。なお、図１０の幾何学的関係は、実空間における仮想頭部位置ＨＰを定める点で図４の幾何学的関係と相違するが、その他の点で共通する。 Specifically, the extraction unit 31 derives a highly relevant reference point Pr from the position of the helmet image in the captured image, using the geometrical relationship shown in FIG. The geometrical relationship in FIG. 10 is different from the geometrical relationship in FIG. 4 in that the virtual head position HP in the real space is determined, but is common in other points.

仮想頭部位置ＨＰは、参照点Ｐｒ上に存在すると想定される人の頭部位置を表し、参照点Ｐｒの真上に配置される。本実施例では、参照点Ｐｒ上の高さ１７００ｍｍのところに配置される。そのため、実空間における仮想頭部位置ＨＰが決まれば、実空間における参照点Ｐｒの位置が一意に決まり、実空間における仮想平面領域ＴＲの位置も一意に決まる。また、撮像画像における識別処理対象画像領域ＴＲｇも一意に決まる。そして、抽出部３１は、識別処理対象画像領域ＴＲｇを有する識別処理対象画像を正規化して所定サイズの正規化画像ＴＲｇｔを生成できる。 The virtual head position HP represents the head position of a person assumed to exist on the reference point Pr, and is disposed directly above the reference point Pr. In this embodiment, it is arranged at a height of 1700 mm above the reference point Pr. Therefore, if the virtual head position HP in the real space is determined, the position of the reference point Pr in the real space is uniquely determined, and the position of the virtual plane region TR in the real space is also uniquely determined. In addition, the identification processing target image area TRg in the captured image is also uniquely determined. Then, the extraction unit 31 can normalize the identification processing target image having the identification processing target image region TRg to generate a normalized image TRgt of a predetermined size.

逆に、実空間における参照点Ｐｒの位置が決まれば、実空間における仮想頭部位置ＨＰが一意に決まり、実空間における仮想頭部位置ＨＰに対応する撮像画像上の頭部画像位置ＡＰも一意に決まる。そのため、頭部画像位置ＡＰは、予め設定されている参照点Ｐｒのそれぞれに対応付けて予め設定され得る。なお、頭部画像位置ＡＰは、参照点Ｐｒからリアルタイムに導き出されてもよい。 Conversely, once the position of the reference point Pr in real space is determined, the virtual head position HP in real space is uniquely determined, and the head image position AP on the captured image corresponding to that virtual head position HP is uniquely determined as well. Therefore, a head image position AP can be set in advance in association with each of the preset reference points Pr. The head image position AP may also be derived from the reference point Pr in real time.
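The chain from a reference point Pr to a virtual head position HP and on to a head image position AP can be sketched as follows. This is only an illustration under strong assumptions: the description does not specify the camera model, so a simple pinhole camera is assumed here, and `project_to_image`, `head_image_position`, the coordinate frame, and every camera parameter are hypothetical. Only the 1700 mm head height comes from the text.

```python
import math

HEAD_HEIGHT_MM = 1700.0  # height of HP above Pr, as stated in the description


def project_to_image(point, cam_height_mm, pitch_rad, focal_px, cx, cy):
    """Project a world point (X, Y, Z) in mm to pixel coordinates (u, v).

    Assumed frame (not from the source): X to the right, Y horizontally
    away from the camera, Z up. The camera sits at (0, 0, cam_height_mm)
    and is pitched down by pitch_rad from the horizontal (pinhole model).
    """
    x, y, z = point
    dz = z - cam_height_mm
    # Component along the optical axis and along the image "down" axis.
    depth = y * math.cos(pitch_rad) - dz * math.sin(pitch_rad)
    down = -y * math.sin(pitch_rad) - dz * math.cos(pitch_rad)
    if depth <= 0:
        raise ValueError("point is behind the camera")
    return (cx + focal_px * x / depth, cy + focal_px * down / depth)


def head_image_position(pr, **camera):
    """Map a ground reference point Pr = (X, Y) to its head image position AP."""
    hp = (pr[0], pr[1], HEAD_HEIGHT_MM)  # virtual head position HP above Pr
    return project_to_image(hp, **camera)


# Hypothetical camera parameters and a small grid of reference points (mm).
camera = dict(cam_height_mm=2000.0, pitch_rad=math.radians(30.0),
              focal_px=400.0, cx=320.0, cy=240.0)
reference_points = [(x, y) for y in (1000.0, 2000.0, 3000.0)
                    for x in (-1000.0, 0.0, 1000.0)]

# Precomputed table: each preset Pr gets its AP once, ahead of time.
ap_table = {pr: head_image_position(pr, **camera) for pr in reference_points}
```

Building `ap_table` once corresponds to presetting AP for each Pr; calling `head_image_position` on demand corresponds to the real-time alternative mentioned in the text.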

そこで、抽出部３１は、前段画像認識処理により後方カメラ４０Ｂの撮像画像内でヘルメット画像を探索する。図１１上図は、抽出部３１がヘルメット画像ＨＲｇを見つけ出した状態を示す。そして、抽出部３１は、ヘルメット画像ＨＲｇを見つけ出した場合、その代表位置ＲＰを決定する。なお、代表位置ＲＰは、ヘルメット画像ＨＲｇの大きさ、形状等から導き出される位置である。本実施例では、代表位置ＲＰは、ヘルメット画像ＨＲｇを含むヘルメット画像領域の中心画素の位置である。図１１下図は、図１１上図における白線で区切られた矩形画像領域であるヘルメット画像領域の拡大図であり、そのヘルメット画像領域の中心画素の位置が代表位置ＲＰであることを示す。 Therefore, the extraction unit 31 searches for a helmet image in the captured image of the rear camera 40B by the front-stage image recognition processing. The upper diagram in FIG. 11 shows a state in which the extraction unit 31 finds the helmet image HRg. Then, when the helmet image HRg is found out, the extraction unit 31 determines the representative position RP. The representative position RP is a position derived from the size, shape, and the like of the helmet image HRg. In the present embodiment, the representative position RP is the position of the central pixel of the helmet image area including the helmet image HRg. The lower part of FIG. 11 is an enlarged view of a helmet image area which is a rectangular image area divided by white lines in the upper view of FIG. 11 and shows that the position of the central pixel of the helmet image area is the representative position RP.

その後、抽出部３１は、例えば最近傍探索アルゴリズムを用いて代表位置ＲＰの最も近傍にある頭部画像位置ＡＰを導き出す。図１１下図は、代表位置ＲＰの近くに６つの頭部画像位置ＡＰ１〜ＡＰ６が予め設定されており、そのうちの頭部画像位置ＡＰ５が代表位置ＲＰの最も近傍にある頭部画像位置ＡＰであることを示す。 After that, the extraction unit 31 derives a head image position AP closest to the representative position RP using, for example, a nearest neighbor search algorithm. In the lower part of FIG. 11, six head image positions AP1 to AP6 are previously set near the representative position RP, and the head image position AP5 among them is the head image position AP closest to the representative position RP. Indicates that.
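A minimal sketch of this nearest-neighbor step is shown below, assuming a brute-force search over the preset positions with Euclidean pixel distance. The function name and all coordinates are hypothetical; the description names neither the distance metric nor the search structure.

```python
import math


def nearest_head_image_position(rp, head_image_positions):
    """Return (label, position) of the preset AP closest to RP.

    Brute-force search with Euclidean pixel distance (an assumption).
    """
    return min(head_image_positions.items(),
               key=lambda item: math.dist(rp, item[1]))


# Hypothetical pixel coordinates for six preset head image positions.
preset_aps = {
    "AP1": (100, 80), "AP2": (130, 80), "AP3": (160, 80),
    "AP4": (100, 110), "AP5": (130, 110), "AP6": (160, 110),
}
rp = (127, 104)  # hypothetical representative position of the helmet image

label, position = nearest_head_image_position(rp, preset_aps)
```

With only a handful of candidates near RP, a linear scan is sufficient; a spatial index would matter only if the number of preset positions were large.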

そして、抽出部３１は、図１０に示す幾何学的関係を利用し、導き出した最近傍の頭部画像位置ＡＰから、仮想頭部位置ＨＰ、参照点Ｐｒ、仮想平面領域ＴＲを辿って、対応する識別処理対象画像領域ＴＲｇを抽出する。その後、抽出部３１は、抽出した識別処理対象画像領域ＴＲｇを有する識別処理対象画像を正規化して正規化画像ＴＲｇｔを生成する。 Then, using the geometrical relationship shown in FIG. 10, the extraction unit 31 traces from the derived nearest head image position AP through the virtual head position HP, the reference point Pr, and the virtual plane region TR to extract the corresponding identification processing target image region TRg. The extraction unit 31 then normalizes the identification processing target image having the extracted identification processing target image region TRg to generate the normalized image TRgt.

このようにして、抽出部３１は、撮像画像における人の特徴画像の位置であるヘルメット画像ＨＲｇの代表位置ＲＰと、予め設定された頭部画像位置ＡＰの１つ（頭部画像位置ＡＰ５）とを対応付けることで識別処理対象画像を抽出する。 In this manner, the extraction unit 31 extracts the identification processing target image by associating the representative position RP of the helmet image HRg, which is the position of the person's feature image in the captured image, with one of the preset head image positions AP (here, head image position AP5).

なお、抽出部３１は、図１０に示す幾何学的関係を利用する代わりに、頭部画像位置ＡＰと参照点Ｐｒ、仮想平面領域ＴＲ、又は識別処理対象画像領域ＴＲｇとを直接的に対応付ける参照テーブルを利用し、頭部画像位置ＡＰに対応する識別処理対象画像を抽出してもよい。 Note that, instead of utilizing the geometrical relationship shown in FIG. 10, the extraction unit 31 directly associates the head image position AP with the reference point Pr, the virtual plane region TR, or the identification processing target image region TRg. A table may be used to extract an identification processing target image corresponding to the head image position AP.

また、抽出部３１は、山登り法、Mean-shift法等の最近傍探索アルゴリズム以外の他の公知のアルゴリズムを用いて代表位置ＲＰから参照点Ｐｒを導き出してもよい。例えば、山登り法を用いる場合、抽出部３１は、代表位置ＲＰの近傍にある複数の頭部画像位置ＡＰを導き出し、代表位置ＲＰとそれら複数の頭部画像位置ＡＰのそれぞれに対応する参照点Ｐｒとを紐付ける。このとき、抽出部３１は、代表位置ＲＰと頭部画像位置ＡＰが近いほど重みが大きくなるように参照点Ｐｒに重みを付ける。そして、複数の参照点Ｐｒの重みの分布を山登りし、重みの極大点に最も近い重みを有する参照点Ｐｒから識別処理対象画像領域ＴＲｇを抽出する。 The extraction unit 31 may also derive the reference point Pr from the representative position RP using a known algorithm other than the nearest neighbor search algorithm, such as the hill climbing method or the Mean-shift method. For example, when the hill climbing method is used, the extraction unit 31 derives a plurality of head image positions AP near the representative position RP, and links the representative position RP with the reference point Pr corresponding to each of those head image positions AP. At this time, the extraction unit 31 weights each reference point Pr so that the weight increases as the corresponding head image position AP is closer to the representative position RP. It then climbs the distribution of the weights of the reference points Pr, and extracts the identification processing target image region TRg from the reference point Pr whose weight is closest to the local maximum of the weights.

抽出部３１は、最初にヘルメット画像ＨＲｇを見つけ出し、見つけ出したヘルメット画像ＨＲｇの代表位置ＲＰから、頭部画像位置ＡＰ、仮想頭部位置ＨＰ、参照点（想定立ち位置）Ｐｒ、仮想平面領域ＴＲを経て識別処理対象画像領域ＴＲｇを特定する。そして、特定した識別処理対象画像領域ＴＲｇを有する識別処理対象画像を抽出して正規化することで、所定サイズの正規化画像ＴＲｇｔを生成できる。 The extraction unit 31 first finds the helmet image HRg, and from the representative position RP of the helmet image HRg found, the head image position AP, the virtual head position HP, the reference point (the assumed standing position) Pr, and the virtual plane region TR Then, the identification processing target image area TRg is specified. Then, by extracting and normalizing the identification processing target image having the identified identification processing target image region TRg, a normalized image TRgt of a predetermined size can be generated.

或いは、抽出部３１は、最初に頭部画像位置ＡＰの１つを取得し、取得した頭部画像位置ＡＰに対応するヘルメット画像領域でヘルメット画像ＨＲｇを見つけ出した場合に、そのときの頭部画像位置ＡＰから、仮想頭部位置ＨＰ、参照点（想定立ち位置）Ｐｒ、仮想平面領域ＴＲを経て、識別処理対象画像領域ＴＲｇを特定する。そして、特定した識別処理対象画像領域ＴＲｇを有する識別処理対象画像を抽出して正規化することで、所定サイズの正規化画像ＴＲｇｔを生成できる。 Alternatively, when the extraction unit 31 first acquires one of the head image positions AP and finds the helmet image HRg in the helmet image area corresponding to the acquired head image position AP, the head image at that time From the position AP, through the virtual head position HP, the reference point (expected standing position) Pr, and the virtual plane region TR, an identification processing target image region TRg is identified. Then, by extracting and normalizing the identification processing target image having the identified identification processing target image region TRg, a normalized image TRgt of a predetermined size can be generated.

以上の構成により、周辺監視システム１００の抽出部３１は、撮像画像における特徴画像としてのヘルメット画像を見つけ出し、そのヘルメット画像の代表位置ＲＰと所定画像位置としての頭部画像位置ＡＰの１つとを対応付けることで識別処理対象画像を抽出する。そのため、簡易なシステム構成で後段画像認識処理の対象となる画像部分を絞り込むことができる。 With the above configuration, the extraction unit 31 of the periphery monitoring system 100 finds a helmet image as a feature image in the captured image, and associates the representative position RP of the helmet image with one of the head image positions AP as a predetermined image position. Thus, the identification processing target image is extracted. Therefore, it is possible to narrow down the image portion to be the target of the post-stage image recognition processing with a simple system configuration.

なお、抽出部３１は、最初に撮像画像からヘルメット画像ＨＲｇを見つけ出し、そのヘルメット画像ＨＲｇの代表位置ＲＰに対応する頭部画像位置ＡＰの１つを導き出し、その頭部画像位置ＡＰの１つに対応する識別処理対象画像を抽出してもよい。或いは、抽出部３１は、最初に頭部画像位置ＡＰの１つを取得し、その頭部画像位置ＡＰの１つに対応する特徴画像の位置を含む所定領域であるヘルメット画像領域内にヘルメット画像が存在する場合に、その頭部画像位置ＡＰの１つに対応する識別処理対象画像を抽出してもよい。 The extraction unit 31 first finds the helmet image HRg from the captured image, derives one of the head image positions AP corresponding to the representative position RP of the helmet image HRg, and selects one of the head image positions AP. A corresponding identification process target image may be extracted. Alternatively, the extraction unit 31 first acquires one of the head image positions AP, and a helmet image within a helmet image area which is a predetermined area including the position of the feature image corresponding to the one of the head image positions AP In the case where there exist, the identification processing target image corresponding to one of the head image positions AP may be extracted.

また、抽出部３１は、図１０に示すような所定の幾何学的関係を利用し、撮像画像におけるヘルメット画像の代表位置ＲＰから識別処理対象画像を抽出してもよい。この場合、所定の幾何学的関係は、撮像画像における識別処理対象画像領域ＴＲｇと、識別処理対象画像領域ＴＲｇに対応する実空間における仮想平面領域ＴＲと、仮想平面領域ＴＲに対応する実空間における参照点Ｐｒ（人の想定立ち位置）と、参照点Ｐｒに対応する仮想頭部位置ＨＰ（人の想定立ち位置に対応する人の特徴的な部分の実空間における位置である仮想特徴位置）と、仮想頭部位置ＨＰに対応する撮像画像における頭部画像位置ＡＰ（仮想特徴位置に対応する撮像画像における所定画像位置）との幾何学的関係を表す。 Further, the extraction unit 31 may extract the identification processing target image from the representative position RP of the helmet image in the captured image by using a predetermined geometrical relationship as shown in FIG. In this case, the predetermined geometrical relationship is determined in the identification processing target image region TRg in the captured image, the virtual plane region TR in real space corresponding to the identification processing target image region TRg, and the real space corresponding to the virtual plane region TR. Reference point Pr (person's assumed standing position), virtual head position HP corresponding to reference point Pr (virtual feature position in real space of characteristic part of person corresponding to assumed human standing position) It represents a geometrical relationship with a head image position AP (a predetermined image position in the captured image corresponding to the virtual feature position) in the captured image corresponding to the virtual head position HP.

或いは、抽出部３１は、撮像画像における複数の所定画像部分のそれぞれを正規化して複数の正規化画像を生成し、それら正規化画像のうちヘルメット画像を含む正規化画像を識別処理対象画像として抽出してもよい。複数の所定画像部分は、例えば、撮像画像上に予め定められた複数の識別処理対象画像領域ＴＲｇである。識別処理対象画像領域ＴＲｇ（図６参照。）は、実空間における仮想平面領域ＴＲに対応し、仮想平面領域ＴＲは実空間における参照点Ｐｒに対応する。そして、識別部３２は、抽出部３１が抽出した識別処理対象画像が人画像であるかを識別する。この場合、抽出部３１は、１つの正規化画像を生成した段階でその正規化画像にヘルメット画像が含まれるか否かを判定する。但し、複数の正規化画像を生成した段階でそれら複数の正規化画像のそれぞれにヘルメット画像が含まれるか否かを纏めて判定してもよい。また、全ての正規化画像を生成した段階でそれら全ての正規化画像のそれぞれにヘルメット画像が含まれるか否かを纏めて判定してもよい。また、抽出部３１は、所定画像部分の一部を正規化した段階でその部分的に正規化された画像にヘルメット画像が含まれるか否かを判定してもよい。 Alternatively, the extraction unit 31 may normalize each of a plurality of predetermined image portions in the captured image to generate a plurality of normalized images, and extract, as identification processing target images, those normalized images that contain a helmet image. The predetermined image portions are, for example, a plurality of identification processing target image regions TRg determined in advance on the captured image. Each identification processing target image region TRg (see FIG. 6) corresponds to a virtual plane region TR in real space, and each virtual plane region TR corresponds to a reference point Pr in real space. The identification unit 32 then identifies whether the identification processing target image extracted by the extraction unit 31 is a human image. In this case, each time one normalized image is generated, the extraction unit 31 determines whether that normalized image contains a helmet image. However, the extraction unit 31 may instead make this determination collectively for a batch of normalized images once several have been generated, or collectively for all normalized images once all of them have been generated. The extraction unit 31 may also determine whether a helmet image is contained in a partially normalized image, at the stage where only part of a predetermined image portion has been normalized.

次に図１２を参照し、識別部３２の詳細について説明する。図１２は識別部３２の構成例を示す機能ブロック図である。 Next, the details of the identification unit 32 will be described with reference to FIG. 12. FIG. 12 is a functional block diagram showing a configuration example of the identification unit 32.

識別部３２は、主に、輝度フィルタ部３２ａ、画像特徴量算出部３２ｂ、汎用識別部３２ｃ、特殊識別部３２ｄ、パタンフィルタ部３２ｅ、及び調整部３２ｆを含む。 The identification unit 32 mainly includes a luminance filter unit 32a, an image feature quantity calculation unit 32b, a general identification unit 32c, a special identification unit 32d, a pattern filter unit 32e, and an adjustment unit 32f.

輝度フィルタ部３２ａは、識別処理対象画像における画像特徴の偏りに基づいて人の画像であるか否かを識別する補助識別部の一例である。補助識別部は、画像特徴量算出部３２ｂが算出する画像特徴量に基づく識別を補助する。但し、輝度フィルタ部３２ａは省略されてもよい。 The luminance filter unit 32a is an example of an auxiliary identification unit that identifies whether or not the image is a person based on the bias of the image feature in the identification processing target image. The auxiliary identification unit assists identification based on the image feature amount calculated by the image feature amount calculator 32b. However, the luminance filter unit 32a may be omitted.

本実施例では、輝度フィルタ部３２ａによる識別は、汎用識別部３２ｃの識別結果が出る前に実行される。そのため、輝度フィルタ部３２ａで人の画像でないと識別された識別処理対象画像が汎用識別部３２ｃによる識別処理の対象となるのを防止し、無駄な識別処理が行われるのを防止できる。具体的には、輝度フィルタ部３２ａは、抽出部３１が抽出した識別処理対象画像の輝度の偏りが所定値以上の場合にその識別処理対象画像は人画像でないと識別する。汎用識別部３２ｃ及び特殊識別部３２ｄによる画像特徴量に基づく識別での誤報を防止するためである。「誤報」は、誤った識別結果を出力することを意味し、例えば、人画像でないにもかかわらず人画像であると識別することを含む。一方で、その識別処理対象画像の輝度の偏りが所定値より小さい場合にはその識別処理対象画像は人画像であると暫定的に識別する。特に、ＨＯＧ特徴量に基づく識別が行われる場合には輝度勾配ヒストグラムが正規化される。そのため、識別処理対象画像における路面画像の僅かな明暗差による輝度勾配パタンが人の存在に基づく輝度勾配パタンに似ているときには、その路面画像が人画像であると識別されてしまう場合がある。輝度フィルタ部３２ａは、そのような路面画像が人画像であると識別されてしまうのを防止できる。例えば、輝度の偏りは、夏の日差しによる強い陰影、路面上の白線、縁石等の原因によって大きくなる傾向を有する。輝度フィルタ部３２ａは、そのような原因を含む画像が人画像であると識別されてしまうのを防止できる。 In the present embodiment, the identification by the luminance filter unit 32a is performed before the identification result of the general identification unit 32c is output. Therefore, it is possible to prevent the identification processing target image identified as not being a human image in the luminance filter unit 32a from being the target of the identification processing by the general identification unit 32c, and to prevent the unnecessary identification processing from being performed. Specifically, the luminance filter unit 32a identifies that the classification processing target image is not a human image when the deviation of the luminance of the classification processing target image extracted by the extraction unit 31 is equal to or greater than a predetermined value. This is to prevent false alarms in identification based on the image feature amount by the general purpose identification unit 32c and the special identification unit 32d. The “false alarm” means to output an erroneous identification result, and includes, for example, identifying as being a human image although it is not a human image. 
On the other hand, when the deviation of the luminance of the identification processing target image is smaller than a predetermined value, the identification processing target image is tentatively identified as a human image. In particular, the luminance gradient histogram is normalized when the identification based on the HOG feature is performed. Therefore, when the luminance gradient pattern by the slight contrast of the road surface image in the identification processing target image resembles the luminance gradient pattern based on the presence of a person, the road surface image may be identified as a human image. The luminance filter unit 32a can prevent such a road surface image from being identified as a human image. For example, the brightness deviation tends to be large due to strong shadows due to summer sunlight, white lines on the road surface, curbs, and the like. The luminance filter unit 32a can prevent an image including such a cause from being identified as a human image.

図１３は輝度フィルタ部３２ａによる識別処理を説明する図である。図１３（Ａ）は抽出部３１によって抽出された識別処理対象画像としての正規化画像ＴＲｇｔの一例である。図１３（Ｂ）は図１３（Ａ）の識別処理対象画像に対して設定される７つの領域ＲＧ１〜ＲＧ７を示す。領域ＲＧ１は正規化画像ＴＲｇｔの全体に相当する領域である。領域ＲＧ２及び領域ＲＧ３は正規化画像ＴＲｇｔの右上の頂点と左下の頂点とを結ぶ対角線で区分けされた２つの領域である。領域ＲＧ４及び領域ＲＧ５は正規化画像ＴＲｇｔの左上の頂点と右下の頂点とを結ぶ対角線で区分けされた２つの領域である。領域ＲＧ６は正規化画像ＴＲｇｔの上半分の領域であり、領域ＲＧ７は正規化画像ＴＲｇｔの下半分の領域である。 FIG. 13 is a diagram for explaining identification processing by the luminance filter unit 32a. FIG. 13A is an example of a normalized image TRgt as an identification processing target image extracted by the extraction unit 31. FIG. 13B shows seven regions RG1 to RG7 set for the identification processing target image of FIG. 13A. The region RG1 is a region corresponding to the entire normalized image TRgt. A region RG2 and a region RG3 are two regions divided by diagonal lines connecting the upper right vertex and the lower left vertex of the normalized image TRgt. A region RG4 and a region RG5 are two regions divided by diagonal lines connecting the top left vertex and the bottom right vertex of the normalized image TRgt. Region RG6 is the upper half of normalized image TRgt, and region RG7 is the lower half of normalized image TRgt.
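The seven regions can be sketched as pixel sets over a normalized image. The image size, the function name, and the assignment of the labels RG2/RG3 and RG4/RG5 to a particular side of each diagonal are assumptions made for illustration; the description only states which diagonal separates each pair. Pixels lying exactly on a diagonal are assigned to RG3 or RG5 here, which is an arbitrary choice.

```python
def region_membership(width, height):
    """Return {name: set of (x, y)} for RG1..RG7 on a width x height image."""
    regions = {f"RG{i}": set() for i in range(1, 8)}
    for y in range(height):
        for x in range(width):
            # Pixel centre in normalized coordinates: (0, 0) is the
            # upper-left corner and (1, 1) is the lower-right corner.
            u = (x + 0.5) / width
            v = (y + 0.5) / height
            regions["RG1"].add((x, y))  # the whole image
            # Diagonal from upper-right (1, 0) to lower-left (0, 1): u + v = 1.
            regions["RG2" if u + v < 1 else "RG3"].add((x, y))
            # Diagonal from upper-left (0, 0) to lower-right (1, 1): u = v.
            regions["RG4" if u > v else "RG5"].add((x, y))
            # Upper half versus lower half.
            regions["RG6" if v < 0.5 else "RG7"].add((x, y))
    return regions


regions = region_membership(8, 8)  # hypothetical 8 x 8 normalized image
```

Each of the three pairs partitions the full image, so every pixel contributes to RG1 and to exactly one region of each pair.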

図１３（Ｃ）は図１３（Ｂ）の領域ＲＧ１の各画素の輝度のヒストグラムを示す。図１３（Ｄ）は図１３（Ｃ）のヒストグラムにおける隣接するビンの値を合計する調整を行った後の調整後ヒストグラムを示す。 FIG. 13C shows a histogram of the luminance of each pixel in the region RG1 of FIG. 13B. FIG. 13 (D) shows the adjusted histogram after the adjustment of summing the values of adjacent bins in the histogram of FIG. 13 (C).

図１３（Ａ）の識別処理対象画像は路面上の白線の画像を含む。図１３（Ａ）に示すように識別処理対象画像が局所的には比較的強い明暗差を有するが全体的には比較的弱い明暗差を有する場合、輝度フィルタ部３２ａはその識別処理対象画像が人画像でないと識別する。 The identification processing target image in FIG. 13(A) contains an image of a white line on the road surface. When, as shown in FIG. 13(A), the identification processing target image has relatively strong contrast locally but relatively weak contrast overall, the luminance filter unit 32a identifies that the identification processing target image is not a human image.

具体的には、輝度フィルタ部３２ａは、図１３（Ｂ）に示すように、識別処理対象画像に対して７つの領域ＲＧ１〜ＲＧ７を設定する。そして、７つの領域ＲＧ１〜ＲＧ７のそれぞれについて以下の処理を実行する。以下では、領域ＲＧ１に対する処理を一例として説明するが、領域ＲＧ２〜ＲＧ７に対しても同様の処理が適用される。 Specifically, as shown in FIG. 13B, the luminance filter unit 32a sets seven regions RG1 to RG7 on the identification processing target image. Then, the following processing is performed for each of the seven regions RG1 to RG7. Although the process for region RG1 will be described below as an example, the same process is applied to regions RG2 to RG7.

最初に輝度フィルタ部３２ａは領域ＲＧ１の有効画素割合を算出する。「有効画素割合」は、領域ＲＧ１内の全画素数に占める有効画素数の割合を意味する。「有効画素数」はマスク領域以外の領域にある画素の数（非マスク画素数）を意味する。 First, the luminance filter unit 32a calculates the effective pixel ratio of the region RG1. The “effective pixel ratio” means the ratio of the number of effective pixels to the total number of pixels in the region RG1. “Effective pixel count” means the number of pixels in the area other than the mask area (non-masked pixel count).

有効画素割合が所定値（例えば５０％）以下の場合、輝度フィルタ部３２ａは識別処理対象画像が人画像であると識別する。有効画素数が少なく適切な識別ができないと推定されるためである。すなわち、輝度フィルタ部３２ａは、適切な識別ができない場合には、人画像が非人画像であると誤って識別してしまうのを防止するため暫定的に人画像であると識別し、後続の識別処理に最終的な識別を委ねるようにする。 When the effective pixel ratio is equal to or less than a predetermined value (for example, 50%), the luminance filter unit 32a identifies that the identification processing target image is a human image. This is because it is estimated that the number of effective pixels is small and appropriate identification can not be performed. That is, the luminance filter unit 32a provisionally identifies the human image as a human image in order to prevent the human image from being erroneously identified as a non-human image if proper identification can not be performed. The final identification should be entrusted to the identification process.
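The effective pixel ratio check can be sketched as below. Representing a region and the mask region as sets of pixel coordinates is an assumption for illustration, and the function names are hypothetical; the 50% threshold is the example value given in the text.

```python
def effective_pixel_ratio(region_pixels, mask_pixels):
    """Share of the region's pixels that lie outside the mask region."""
    if not region_pixels:
        raise ValueError("region must contain at least one pixel")
    valid = sum(1 for pixel in region_pixels if pixel not in mask_pixels)
    return valid / len(region_pixels)


def should_run_histogram_check(region_pixels, mask_pixels, min_ratio=0.5):
    """False means: provisionally 'human image', defer to later stages."""
    return effective_pixel_ratio(region_pixels, mask_pixels) > min_ratio


# Hypothetical 4 x 4 region with two different mask regions.
region = {(x, y) for x in range(4) for y in range(4)}
half_masked = {(x, y) for x in range(4) for y in range(2)}  # 8 of 16 pixels
lightly_masked = {(0, 0), (1, 0)}                           # 2 of 16 pixels
```

For `half_masked` the ratio is exactly 0.5, which is not greater than the threshold, so the region is treated as provisionally human without building a histogram.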

有効画素割合が所定値より大きい場合、輝度フィルタ部３２ａは、領域ＲＧ１の各画素の輝度を１６階調に分類して輝度のヒストグラムを生成する。 When the effective pixel ratio is larger than the predetermined value, the luminance filter unit 32a classifies the luminance of each pixel of the region RG1 into 16 gradations and generates a luminance histogram.

例えば、輝度フィルタ部３２ａは、図１３（Ｃ）に示すように、領域ＲＧ１の各画素の２５６階調の輝度値をビットシフト演算によって１６階調に変換して分類する。 For example, as shown in FIG. 13C, the luminance filter unit 32a converts the luminance value of 256 gradations of each pixel of the region RG1 into 16 gradations by a bit shift operation and classifies.
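A minimal sketch of this step, assuming 8-bit luminance values (0 to 255) for the effective pixels of one region: shifting right by four bits keeps the upper four bits, which maps 256 levels onto 16 bins. The function name is hypothetical.

```python
def luminance_histogram_16(luminances):
    """Count 8-bit luminance values in 16 bins using a 4-bit right shift."""
    hist = [0] * 16
    for value in luminances:
        if not 0 <= value <= 255:
            raise ValueError("expected an 8-bit luminance value")
        hist[value >> 4] += 1  # 0..15 -> bin 0, 16..31 -> bin 1, ...
    return hist


# Hypothetical luminance values of the effective pixels in one region.
hist16 = luminance_histogram_16([0, 15, 16, 128, 255])
```

The shift avoids a division per pixel, which is presumably why the text singles it out.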

そして、輝度フィルタ部３２ａは、図１３（Ｃ）のヒストグラムにおける隣接する２つのビンの値を合計してそのヒストグラムを調整する。図１３（Ｄ）はその調整後のヒストグラムを示す。 Then, the luminance filter unit 32 a sums the values of two adjacent bins in the histogram of FIG. 13C and adjusts the histogram. FIG. 13 (D) shows the histogram after adjustment.

そして、輝度フィルタ部３２ａは、その調整後のヒストグラムの何れかのビンの値が所定値ＴＨ１以上の場合、その識別処理対象画像は人画像でないと識別する。一方で、その調整後のヒストグラムの各ビンの値が何れも所定値ＴＨ１未満の場合、その識別処理対象画像は人画像であると暫定的に識別する。路面画像等では全体的に弱い明暗差のため輝度が特定の範囲に集中する傾向があるのに対し、人画像では比較的強い明暗差のため輝度が広い範囲に分散する傾向があるためである。図１３（Ｄ）の例では、輝度フィルタ部３２ａは、第４階調のビンの値が所定値ＴＨ１以上のため、その識別処理対象画像は人画像でないと識別する。また、本実施例では、輝度フィルタ部３２ａは、隣接する２つのビンの値の合計が所定値ＴＨ１以上となった時点で人画像でないと識別し、その他の隣接する２つのビンの合計処理を中止する。図１３（Ｄ）の例では、輝度フィルタ部３２ａは、第５階調以降のビンの値の算出を中止する。 Then, if the value of any bin of the adjusted histogram is equal to or greater than a predetermined value TH1, the luminance filter unit 32a identifies that the identification processing target image is not a human image. If, on the other hand, the value of every bin of the adjusted histogram is less than the predetermined value TH1, it tentatively identifies the identification processing target image as a human image. This is because in a road surface image and the like, the overall contrast is weak and the luminance tends to concentrate in a specific range, whereas in a human image the contrast is relatively strong and the luminance tends to spread over a wide range. In the example of FIG. 13(D), the luminance filter unit 32a identifies that the identification processing target image is not a human image because the value of the fourth gradation bin is equal to or greater than the predetermined value TH1. Further, in the present embodiment, the luminance filter unit 32a identifies the image as not a human image as soon as the sum of the values of two adjacent bins reaches the predetermined value TH1, and skips the summation for the remaining pairs of adjacent bins. In the example of FIG. 13(D), the luminance filter unit 32a therefore stops calculating the bin values for the fifth and subsequent gradations.
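The pairwise summation with early termination can be sketched as follows. It is assumed here that "adjacent two bins" means non-overlapping pairs (bins 1 and 2, bins 3 and 4, and so on), which yields eight adjusted bins from sixteen; the function name, the histogram values, and the threshold value are hypothetical.

```python
def check_concentration(hist16, th1):
    """Sum non-overlapping pairs of adjacent bins, stopping at the first
    pair whose sum reaches th1.

    Returns (not_human, adjusted_bin): adjusted_bin is the 1-based index
    of the adjusted bin that triggered the decision, or None.
    """
    if len(hist16) % 2 != 0:
        raise ValueError("expected an even number of bins")
    for i in range(0, len(hist16), 2):
        if hist16[i] + hist16[i + 1] >= th1:
            return True, i // 2 + 1  # later pairs are never summed
    return False, None


# Hypothetical 16-bin histogram with luminance concentrated in bins 7 and 8.
concentrated = [5, 5, 20, 30, 40, 60, 900, 700, 30, 20, 10, 5, 0, 0, 0, 0]
spread = [100] * 16
TH1 = 1000  # hypothetical threshold
```

For `concentrated`, the first three pair sums (10, 50, 100) stay below the threshold and the fourth (1600) reaches it, so the loop exits without touching the fifth and later pairs.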

In this manner, the luminance filter unit 32a individually identifies, for each of the regions RG1 to RG7, whether or not the identification processing target image is a human image. For example, it identifies the identification processing target image as not being a human image when the effective pixel ratio is sufficiently high and the luminance histogram is strongly biased.

The luminance filter unit 32a then outputs its final identification result based on the seven individual identification results. For example, the luminance filter unit 32a identifies the identification processing target image as not being a human image when all seven identification results are "not a human image".

When the luminance filter unit 32a identifies the identification processing target image as not being a human image because its luminance is strongly biased, it may notify the operator of that fact through the output device 50.

In the above example, the luminance filter unit 32a in effect converts 256-gradation luminance into 16 gradations and classifies it, and then further converts it into 8 gradations and classifies it. That is, it performs a two-stage conversion. This is because image features relating to the luminance of the identification processing target image are carried over more accurately than when 256-gradation luminance is converted directly into 8 gradations (a one-stage conversion). However, the luminance filter unit 32a may convert 256-gradation luminance directly into 8 gradations, classify it, and then perform identification using the predetermined value TH1. A conversion of three or more stages may also be used, and the final number of gradations may be other than eight.

In the above example, the predetermined value TH1 is set to 77% of the total number of pixels in the region (excluding missing pixels). However, another pixel count may be adopted as the predetermined value TH1. The number of missing pixels means, for example, the number of pixels in the mask region.
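
The luminance test described above can be sketched roughly as follows. This is a minimal illustration, not the implementation in the specification: the function names, the representation of the 16-gradation histogram as a list of counts, and the handling of a region with no valid pixels are assumptions made for the example. Only the pairwise summation into 8 gradations, the early stop, the 77% threshold, and the "all seven regions" rule come from the text.

```python
def region_is_tentatively_human(hist16, th1_ratio=0.77):
    """Apply the TH1 test to one region's 16-gradation luminance histogram.

    hist16 is assumed to count only valid pixels (masked pixels excluded).
    """
    valid_pixels = sum(hist16)
    if valid_pixels == 0:
        # Assumption: with no valid pixels the region cannot reject the image.
        return True
    th1 = th1_ratio * valid_pixels
    # Sum non-overlapping pairs of adjacent bins (16 gradations -> 8 gradations).
    for i in range(0, len(hist16), 2):
        if hist16[i] + hist16[i + 1] >= th1:
            # Luminance is concentrated in one range: stop early, not a human image.
            return False
    return True


def luminance_filter(region_histograms):
    """Final result: 'not human' only if every region says 'not human'."""
    all_not_human = all(
        not region_is_tentatively_human(h) for h in region_histograms
    )
    return not all_not_human


# Hypothetical histograms: one concentrated in the fourth 8-gradation bin, one spread out.
concentrated = [0, 0, 0, 0, 0, 0, 40, 45, 5, 5, 5, 0, 0, 0, 0, 0]
spread = [6] * 16
```

With these sample histograms, seven concentrated regions lead to rejection, while a single spread-out region among the seven is enough to keep the image as a tentative human image.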

The image feature amount calculation unit 32b calculates an image feature amount of the identification processing target image. In this embodiment, the image feature amount calculation unit 32b divides the identification processing target image of 64 vertical pixels by 32 horizontal pixels into 128 HOG blocks of 4 vertical pixels by 4 horizontal pixels each, and calculates a luminance gradient histogram as the image feature amount (HOG feature amount) for each HOG block.
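
As a small illustration of the block layout only, the division of a 64 x 32 image into 4 x 4 blocks can be sketched as below. The function name and the nested-list image representation are assumptions for the example, and the luminance gradient histogram computed per block is deliberately not shown because the specification does not give its details here.

```python
def split_into_hog_blocks(image, block=4):
    """Split a 2-D image (list of rows) into block x block tiles, row-major order."""
    rows, cols = len(image), len(image[0])
    blocks = []
    for top in range(0, rows, block):
        for left in range(0, cols, block):
            tile = [row[left:left + block] for row in image[top:top + block]]
            blocks.append(tile)
    return blocks


# Hypothetical 64 (vertical) x 32 (horizontal) luminance image.
image = [[(r * 32 + c) % 256 for c in range(32)] for r in range(64)]
blocks = split_into_hog_blocks(image)
# 16 block rows x 8 block columns = 128 blocks
```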

The general-purpose identification unit 32c is a general-purpose classifier generated by machine learning using a large number of teacher images. In this embodiment, the general-purpose identification unit 32c is set so that, for example, the true positive rate, which is the ratio of the number of teacher images whose identification result was a true positive to the total number of teacher images, is 95%, and the true negative rate, which is the ratio of the number of teacher images whose identification result was a true negative to the total number of teacher images, is 95%. "True positive" means that a human image was correctly identified as a human image, and "true negative" means that a non-human image was correctly identified as a non-human image.

The special identification unit 32d is a classifier generated by machine learning using a large number of teacher images for which the identification result of the preceding classifier was a false positive. "False positive" means that a non-human image was mistakenly identified as a human image. "False negative" means that a human image was mistakenly identified as a non-human image. In this embodiment, the special identification unit 32d includes first to fourth special classifiers generated by machine learning using teacher images for which the identification result of the general-purpose classifier was a false positive. These teacher images are clustered (classified) into a predetermined number of clusters by, for example, the k-means method; the predetermined number is, for example, equal to the number of special classifiers, which is four in this embodiment. A corresponding special classifier is then generated by machine learning using the teacher images contained in each cluster.

The general-purpose identification unit 32c and the special identification unit 32d together constitute a cascade classifier. Specifically, identification by the fourth special classifier is performed only on identification processing target images that the third special classifier identified as human images. Similarly, identification by the third special classifier is performed only on identification processing target images that the second special classifier identified as human images, and identification by the second special classifier is performed only on identification processing target images that the first special classifier identified as human images. Identification by the first special classifier is performed only on identification processing target images that the general-purpose identification unit 32c identified as human images. However, the special identification unit 32d may consist of one, two, or three special classifiers, or of five or more special classifiers.
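
The cascade arrangement can be illustrated with the following sketch. The stage functions here are toy stand-ins (simple numeric thresholds) and all names are hypothetical; the only point taken from the text is that each stage runs solely on inputs the previous stage identified as human images.

```python
def cascade_identify(image, general, specials):
    """Return True only if the general stage and every special stage say 'human'."""
    if not general(image):
        return False
    for special in specials:
        if not special(image):
            # Later special classifiers are never evaluated for this image.
            return False
    return True


calls = []  # records which stages were actually evaluated


def make_stage(name, threshold):
    """Build a toy stage that says 'human' when the input reaches a threshold."""
    def stage(x):
        calls.append(name)
        return x >= threshold
    return stage


general = make_stage("G", 1)
specials = [make_stage("S1", 2), make_stage("S2", 3),
            make_stage("S3", 4), make_stage("S4", 5)]

result = cascade_identify(2.5, general, specials)
evaluated = list(calls)  # stages run for the input 2.5
```

In this toy run the input passes the general stage and the first special stage, is rejected by the second, and the third and fourth special stages are never called.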

FIG. 14 is a conceptual diagram showing the human identification capability of the identification unit 32. Specifically, FIGS. 14(A) to 14(C) show three example combinations of the following ranges: the range of identification processing target images identified as human images by the general-purpose identification unit 32c, the range of those identified as non-human images by the general-purpose identification unit 32c, the range of those identified as human images by the special identification unit 32d, and the range of those identified as non-human images by the special identification unit 32d.

In each of FIGS. 14(A) to 14(C), the substantially rectangular range D enclosed by a solid line represents the range to which non-human images belong. The outside of range D represents the range to which human images belong. The range G enclosed by a dotted circle represents the range of identification processing target images identified as non-human images by the general-purpose identification unit 32c. The outside of range G represents the range of identification processing target images identified as human images by the general-purpose identification unit 32c. The range S1 enclosed by a dash-dot circle represents the range of identification processing target images identified as non-human images by the first special classifier. The outside of range S1 represents the range of identification processing target images identified as human images by the first special classifier. Similarly, the ranges S2, S3, and S4 enclosed by dash-dot circles represent the ranges of identification processing target images identified as non-human images by the second, third, and fourth special classifiers, respectively. The outsides of ranges S2, S3, and S4 represent the ranges of identification processing target images identified as human images by the second, third, and fourth special classifiers, respectively.

From the above relationships, in each of FIGS. 14(A) to 14(C), the region RG1 filled in black represents the range of identification processing target images for which the identification result of the identification unit 32 is a false positive. That is, it represents the range of identification processing target images that are non-human images but are nevertheless identified as human images by both the general-purpose identification unit 32c and the special identification unit 32d. The region RG2 shown with dot hatching represents the range of identification processing target images for which the identification result of the identification unit 32 is a false negative. That is, it represents the range of identification processing target images that are human images but are nevertheless identified as non-human images by the special identification unit 32d. Therefore, the larger the region RG1, the more false alarms occur, and the larger the region RG2, the more missed detections occur.

In the three examples of FIGS. 14(A) to 14(C), the human identification capability of the identification unit 32 is substantially the same. That is, the total area of region RG1 and the total area of region RG2 are approximately equal across the examples, and the true positive rate, true negative rate, false positive rate, and false negative rate are also approximately equal.

On the other hand, range G in FIG. 14(A) is smaller than range G in FIG. 14(B), and range G in FIG. 14(B) is smaller than range G in FIG. 14(C). Range G in FIG. 14(A) is completely contained within range D. This indicates that the true negative rate of identification by the general-purpose identification unit 32c is 100% (no false alarms). Range G in FIG. 14(C) protrudes substantially from range D. This indicates that the false negative rate of identification by the general-purpose identification unit 32c is relatively high (missed detections are relatively frequent).

Ranges S1 to S4 in FIG. 14(A) are larger than ranges S1 to S4 in FIG. 14(B), and ranges S1 to S4 in FIG. 14(B) are larger than ranges S1 to S4 in FIG. 14(C). Ranges S1 to S4 in FIG. 14(C) are completely contained within range D. This indicates that the true negative rate of identification by the special identification unit 32d is 100% (no false alarms). Ranges S1 to S4 in FIG. 14(A) protrude substantially from range D. This indicates that the false negative rate of identification by the special identification unit 32d is relatively high (missed detections are relatively frequent).

Therefore, compared with the identification unit 32 having the characteristics shown in FIG. 14(A), the identification unit 32 having the characteristics shown in FIG. 14(B) can reduce false alarms in identification by the general-purpose identification unit 32c without changing its human identification capability. Likewise, compared with the identification unit 32 having the characteristics shown in FIG. 14(C), the identification unit 32 having the characteristics shown in FIG. 14(B) can reduce false alarms in identification by the special identification unit 32d without changing its human identification capability.

The pattern filter unit 32e is another example of the auxiliary identification unit. In this embodiment, identification by the pattern filter unit 32e is executed after the identification result of the general-purpose identification unit 32c has been obtained. An erroneous identification result of the general-purpose identification unit 32c can therefore be overturned. Specifically, when the pattern filter unit 32e judges that the bias among the identification results of the plurality of weak classifiers constituting the general-purpose classifier serving as the general-purpose identification unit 32c is inappropriate for a human image, it identifies the identification processing target image as not being a human image. In this case, the pattern filter unit 32e identifies the image as not being a human image even if the general-purpose identification unit 32c had identified it as a human image. This is to prevent false alarms in the image-feature-based identification by the general-purpose identification unit 32c and the special identification unit 32d. Conversely, when it judges that the bias among the identification results is appropriate for a human image, it identifies the identification processing target image as a human image. That is, it does not overturn the identification result of the general-purpose identification unit 32c. However, the pattern filter unit 32e may be omitted.

A "weak classifier" is a component of a strong classifier generated by machine learning using a large number of teacher images. Strong classifiers are, for example, the general-purpose classifier and the first to fourth special classifiers. The identification result of a strong classifier is based on a weighted majority vote over the identification results of its constituent weak classifiers.

The "identification result" is expressed, for example, as "humanity", a value representing how human-like the image is. "Humanity" is a positive value whose absolute value grows as the image becomes more human-like, and a negative value whose absolute value grows as the image becomes less human-like. The humanity value of zero can be used as the value representing the identification boundary between human images and non-human images (hereinafter, the "identification boundary value"). In this case, an image is identified as a human image if its humanity is zero or more, and as a non-human image if its humanity is less than zero. However, the identification boundary value may be positive or negative. The identification boundary value is used as an adjustment parameter for adjusting the tendency of false alarms to occur in identification by each of the general-purpose identification unit 32c and the special identification unit 32d.

Like strong classifiers, weak classifiers are generated by machine learning using a large number of teacher images. In this embodiment, one weak classifier is generated for each of the 128 HOG blocks in one identification processing target image, and an identification result is output for each HOG block.

FIG. 15 is a conceptual diagram showing the relationship between a normalized image TRgt, as the identification processing target image, and the weak classifiers. FIG. 15(A) shows the normalized image TRgt of 64 vertical pixels by 32 horizontal pixels divided into 128 HOG blocks arranged as 16 vertical blocks by 8 horizontal blocks. FIG. 15(B) shows the 84 HOG blocks at the center of the normalized image TRgt divided into four sections SC1 to SC4. The normalized image TRgt is divided into the four sections SC1 to SC4 by its two diagonals. FIG. 15(C) shows another configuration example of the four sections SC1 to SC4.

Basically, the identification result of each of the general-purpose identification unit 32c and the special identification unit 32d, as strong classifiers, is based on a weighted majority vote over the identification results of the 128 weak classifiers shown in FIG. 15(A). A strong classifier outputs, for example, an identification result of "human image" if the humanity derived by the weighted majority vote is zero or more, and an identification result of "non-human image" if it is less than zero.
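
A weighted majority vote with an adjustable identification boundary value can be sketched as follows. The encoding of each weak classifier's output as +1 (human) or -1 (non-human), the weights, and the function name are assumptions for illustration; the specification states only that the strong classifier's humanity is derived by a weighted majority vote and compared against the boundary value.

```python
def strong_classify(weak_votes, weights, boundary=0.0):
    """Combine weak-classifier votes into a humanity value and a decision.

    weak_votes: +1 for 'human', -1 for 'non-human' (assumed encoding).
    weights:    one non-negative weight per weak classifier (assumed).
    boundary:   identification boundary value; humanity >= boundary means 'human'.
    """
    humanity = sum(w * v for w, v in zip(weights, weak_votes))
    return humanity, humanity >= boundary


# Hypothetical three-classifier example (the embodiment uses 128 weak classifiers).
votes = [+1, -1, +1]
weights = [2, 3, 1]
humanity, is_human = strong_classify(votes, weights)                  # boundary 0
_, is_human_strict = strong_classify(votes, weights, boundary=1)      # raised boundary
```

Raising the boundary value makes the classifier more reluctant to report a human image, which is the lever the adjustment parameter provides.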

The identification unit 32 of this embodiment has four composite weak classifiers in addition to the strong classifiers. The four composite weak classifiers are the first to fourth composite weak classifiers. The identification result of the first composite weak classifier is based on a weighted majority vote over the identification results of the 22 weak classifiers corresponding to the 22 HOG blocks belonging to section SC1 shown in FIG. 15(B). These 22 weak classifiers are also weak classifiers that constitute the general-purpose classifier. The same applies to the identification results of the second to fourth composite weak classifiers.

Based on the combination of the identification results of the composite weak classifiers (the identification result pattern), the pattern filter unit 32e judges whether the bias among the identification results of the plurality of weak classifiers constituting the general-purpose classifier serving as the general-purpose identification unit 32c is appropriate for a human image. When it judges that the bias is inappropriate for a human image, it identifies the identification processing target image as not being a human image, even if the identification result of the general-purpose identification unit 32c was "human image".

In this embodiment, 16 identification result patterns are generated from the combinations of the identification results of the first to fourth composite weak classifiers. At least one of the 16 identification result patterns is preset as a normal pattern, and the remaining identification result patterns are preset as abnormal patterns. The normal patterns include, for example, the case where the identification results of all of the first to fourth composite weak classifiers are "human image". The abnormal patterns include, for example, the case where the identification results of two or more of the first to fourth composite weak classifiers are "non-human image".

When the combination of the identification results of the first to fourth composite weak classifiers belongs to the normal patterns, the pattern filter unit 32e judges that the bias among the identification results of the plurality of weak classifiers constituting the general-purpose classifier is appropriate for a human image. It then identifies the relevant identification processing target image as a human image. That is, it does not overturn the identification result of the general-purpose identification unit 32c. Conversely, when the combination of the identification results of the first to fourth composite weak classifiers belongs to the abnormal patterns, it judges that the bias among the identification results of the plurality of weak classifiers constituting the general-purpose classifier is inappropriate for a human image. It then identifies the relevant identification processing target image as not being a human image, even if the identification result of the general-purpose identification unit 32c was "human image".
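
The pattern check can be sketched as a lookup in a preset table of normal patterns. In this sketch the normal set is assumed to be every pattern with fewer than two "non-human" section results; the specification only says that all-human is an example of a normal pattern and that two or more non-human results is an example of an abnormal pattern, so the treatment of exactly one non-human result is an assumption, as are all names.

```python
from itertools import product

# All 16 combinations of the four composite weak classifier results
# (True = "human image", False = "non-human image").
ALL_PATTERNS = list(product([True, False], repeat=4))

# Assumed preset: patterns with fewer than two "non-human" results are normal.
NORMAL_PATTERNS = {p for p in ALL_PATTERNS if p.count(False) < 2}


def pattern_filter(general_says_human, section_results,
                   normal_patterns=NORMAL_PATTERNS):
    """Keep a 'human' result only if the section pattern is a normal pattern."""
    if not general_says_human:
        # The filter only overturns "human" results; "non-human" stays as is.
        return False
    return tuple(section_results) in normal_patterns


kept = pattern_filter(True, (True, True, True, True))
overturned = pattern_filter(True, (True, False, False, True))
```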

A weak classifier belonging to one of the four composite weak classifiers may also belong to one or more of the other composite weak classifiers. For example, as shown in FIG. 15(C), part of section SC1, enclosed by a two-dot chain line, may overlap part of section SC2, enclosed by a dash-dot line, and another part of section SC1 may overlap part of section SC3, enclosed by a broken line. Part of section SC4, enclosed by a solid line, may overlap part of section SC2, and another part of section SC4 may overlap part of section SC3. The sections may also be arranged apart from one another. For example, weak classifiers that belong to no section may exist between two sections.

The number of composite weak classifiers may be one, two, or three, or may be five or more.

In the above example, when the pattern filter unit 32e judges that the bias among the identification results of the plurality of weak classifiers constituting the general-purpose classifier is inappropriate for a human image, it identifies the identification processing target image as not being a human image even if the identification result of the general-purpose classifier was "human image". However, the pattern filter unit 32e may likewise identify the identification processing target image as not being a human image when it judges that the bias among the identification results of the plurality of weak classifiers constituting a special classifier is inappropriate for a human image, even if the identification result of that special classifier was "human image".

When the pattern filter unit 32e identifies the identification processing target image as not being a human image because the bias among the identification results of the plurality of weak classifiers is inappropriate for a human image, it may notify the operator of that fact through the output device 50.

The adjustment unit 32f is a functional element that adjusts the characteristics of the identification unit 32. In this embodiment, the adjustment unit 32f changes an adjustment parameter relating to at least one of the general-purpose identification unit 32c and the special identification unit 32d in response to an operator command entered through the input device 41. The adjustment parameter is a parameter for adjusting the characteristics of the identification unit 32 and includes information on the characteristics of each of the plurality of classifiers constituting the identification unit 32. In this embodiment, the adjustment parameter is a parameter for adjusting the tendency of false alarms to occur, and is, for example, the identification boundary value.

For example, the adjustment unit 32f changes the identification boundary value to change at least one of the true positive rate and the false positive rate of identification by at least one of the general-purpose identification unit 32c and the special identification unit 32d. Specifically, the adjustment unit 32f selects, in response to an operator command, one characteristic setting from a plurality of pre-registered characteristic settings (preset data) that have substantially the same human identification capability. These are, for example, the three different characteristic settings of the identification unit 32 shown in FIGS. 14(A) to 14(C). The adjustment unit 32f switches, for example, the current characteristic setting shown in FIG. 14(B) to the characteristic setting shown in FIG. 14(A) in response to operator input through a touch panel or the like. However, instead of selecting one characteristic setting from the plurality of pre-registered characteristic settings, the adjustment unit 32f may change the identification boundary value directly.

Each of the plurality of characteristic settings is associated with, for example, one or more usage environments. For example, a scrap yard as a usage environment of the shovel is associated with the characteristic setting shown in FIG. 14(A), and a road construction site as a usage environment of the shovel is associated with the characteristic setting shown in FIG. 14(C). Each characteristic setting consists of a combination of the value of the adjustment parameter for the general-purpose identification unit 32c and the value of the adjustment parameter for the special identification unit 32d (hereinafter, an "adjustment parameter set"). Selecting one characteristic setting from the plurality of characteristic settings therefore means selecting one adjustment parameter set from a plurality of adjustment parameter sets. The adjustment parameter set may be selected by any method. For example, the adjustment unit 32f may select one adjustment parameter set from the plurality of adjustment parameter sets by having the operator choose, on a screen, the category of the usage environment of the shovel, such as a scrap yard, a road construction site, or a dredging site.
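
Preset selection by usage environment can be sketched as a pair of lookup tables. Everything numeric here is hypothetical: the specification gives no boundary values, so the values below are chosen only so that setting "A" makes the general-purpose stage lenient and the special stages strict (a small range G and large ranges S1 to S4), setting "C" does the opposite, and "B" sits between them. The class, field, and function names and the fallback to "B" are likewise assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AdjustmentParameterSet:
    general_boundary: float  # identification boundary value for the general-purpose unit
    special_boundary: float  # identification boundary value for the special unit


# Hypothetical presets standing in for the settings of FIGS. 14(A) to 14(C).
PRESETS = {
    "A": AdjustmentParameterSet(general_boundary=-1.0, special_boundary=1.0),
    "B": AdjustmentParameterSet(general_boundary=0.0, special_boundary=0.0),
    "C": AdjustmentParameterSet(general_boundary=1.0, special_boundary=-1.0),
}

# Environment-to-setting associations stated in the text.
ENVIRONMENT_TO_PRESET = {
    "scrap yard": "A",
    "road construction site": "C",
}


def select_parameter_set(environment, default="B"):
    """Return the adjustment parameter set for the operator-selected environment."""
    return PRESETS[ENVIRONMENT_TO_PRESET.get(environment, default)]


chosen = select_parameter_set("scrap yard")
```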

When the operator notices that a false alarm has occurred, the operator may use the input device 41 to notify the identification unit 32 that a false alarm occurred and which identification processing target image caused it. In this case, the adjustment unit 32f may automatically select a more appropriate characteristic setting from among the plurality of pre-registered characteristic settings based on the notified content. A more appropriate characteristic setting is, for example, one that is less likely to produce a false alarm for that identification processing target image.

With this configuration, the adjustment unit 32f can reduce false alarms caused by the general-purpose identification unit 32c (false alarms scattered over the entire background) while maintaining the human identification capability, for example by switching from the characteristic setting of FIG. 14A to the characteristic setting of FIG. 14B. This switching lowers the false positive rate of identification by the general-purpose identification unit 32c and raises the true positive rate of identification by the special identification unit 32d. As a result, it is possible to alleviate a situation in which false alarms occur evenly across the entire captured image of a specific usage environment such as a scrap yard.

Likewise, the adjustment unit 32f can reduce false alarms caused by the special identification unit 32d (false alarms concentrated on a specific portion) while maintaining the human identification capability, for example by switching from the characteristic setting of FIG. 14C to the characteristic setting of FIG. 14B. This switching lowers the false positive rate of identification by the special identification unit 32d and raises the true positive rate of identification by the general-purpose identification unit 32c. As a result, it is possible to alleviate a situation in which an image of a specific signboard or the like is repeatedly identified as a human image.

また、調整部３２ｆは、使用環境毎に異なる学習サンプルを用いて識別器を生成し直すといった煩雑な作業を必要とすることなく、簡易且つ迅速に識別部３２による誤報の発生傾向を調整できる。その結果、使用環境毎に異なる誤報の発生傾向に柔軟に対応できる。 In addition, the adjustment unit 32 f can easily and quickly adjust the false alarm occurrence tendency of the identification unit 32 without requiring a complicated operation such as regenerating a classifier using different learning samples for each use environment. As a result, it is possible to flexibly cope with the tendency of false alarms to differ depending on the use environment.

また、上述の例では、調整部３２ｆは、調整パラメータとして識別境界値を採用するが、他の値を採用してもよい。例えば、抽出部３１、追跡部３３等の特性に応じて値が変化する別の調整パラメータを採用してもよい。或いは、識別処理対象画像のグレースケール化に関する値を調整パラメータとして採用してもよい。 Further, in the above-described example, the adjustment unit 32 f adopts the identification boundary value as the adjustment parameter, but another value may be adopted. For example, another adjustment parameter whose value changes according to the characteristics of the extraction unit 31, the tracking unit 33, and the like may be adopted. Alternatively, a value relating to the gray scale of the identification process target image may be adopted as the adjustment parameter.

The adjustment unit 32f may also switch the way the classifiers are connected according to the usage environment of the shovel. For example, the adjustment unit 32f may change the order in which the cascade-connected general-purpose classifier and first to fourth special classifiers perform identification. Information on the connection method, such as the order of identification, is included in the characteristic setting. In this way, the adjustment unit 32f can select one characteristic setting from the plurality of characteristic settings registered in advance to generate and adjust a cascade classifier.

When the adjustment unit 32f determines that the tendency of false alarms by the identification unit 32 cannot be changed even by switching the adjustment parameter set, it may notify the operator to that effect. This is to inform the operator that machine learning based on new teacher images is needed.

以上の構成により、様々な使用環境で使用されるショベルに搭載される周辺監視システム１００は、使用環境に適した調整パラメータセットを用いることで人識別能力の特性を調整できる。その結果、周辺監視システム１００は、特定の使用環境で発生する特定の誤報を抑制できる。 With the above configuration, the periphery monitoring system 100 mounted on a shovel used in various usage environments can adjust the characteristics of the human identification capability by using the adjustment parameter set suitable for the usage environment. As a result, the surrounding area monitoring system 100 can suppress a specific false alarm occurring in a specific usage environment.

次に図１６を参照し、識別部３２による識別処理の流れについて説明する。図１６は識別処理の流れを示すフローチャートであり、識別部３２は識別処理対象画像を取得する度に繰り返しこの識別処理を実行する。 Next, the flow of the identification process by the identification unit 32 will be described with reference to FIG. FIG. 16 is a flowchart showing the flow of identification processing, and the identification unit 32 repeatedly executes this identification processing each time an identification processing target image is acquired.

最初に、識別部３２の輝度フィルタ部３２ａは、識別処理対象画像の輝度の偏りが小さいか否かを判定する（ステップＳＴ１）。本実施例では、輝度フィルタ部３２ａは、識別処理対象画像の各画素の輝度のヒストグラムのビンの値が何れも所定値未満である場合に識別処理対象画像の輝度の偏りが小さいと判定する。 First, the luminance filter unit 32a of the identification unit 32 determines whether the deviation of the luminance of the identification processing target image is small (step ST1). In the present embodiment, the luminance filter unit 32a determines that the deviation of the luminance of the identification processing target image is small when all the bin values of the luminance histogram of each pixel of the identification processing target image are less than a predetermined value.
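
The check in step ST1 can be sketched as follows. This is a minimal illustration that assumes 8-bit luminance values; the number of bins (16) and the threshold (512) are placeholder assumptions, since the description only states that every bin value must be below a predetermined value.

```python
def luminance_bias_is_small(pixels, num_bins=16, max_bin_count=512):
    """Return True when every bin of the luminance histogram is below the threshold.

    pixels: iterable of integer luminance values in the range 0..255.
    num_bins and max_bin_count are hypothetical values chosen for illustration.
    """
    counts = [0] * num_bins
    for value in pixels:
        counts[value * num_bins // 256] += 1
    return all(count < max_bin_count for count in counts)
```

An image whose luminance is concentrated in one bin (for example, a uniformly gray patch) fails the check and is treated as a non-human image without further processing.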

When it is determined that the luminance bias of the identification processing target image is small (YES in step ST1), the image feature amount calculation unit 32b of the identification unit 32 calculates the image feature amount of the identification processing target image (step ST2). In the present embodiment, the image feature amount calculation unit 32b divides the identification processing target image of 64 vertical pixels by 32 horizontal pixels into 128 HOG blocks of 4 vertical pixels by 4 horizontal pixels, and calculates a luminance gradient histogram as the image feature amount (HOG feature amount) for each HOG block. Each of the general-purpose identification unit 32c and the special identification unit 32d of the identification unit 32 then identifies whether the identification processing target image is a human image or a non-human image based on the HOG feature amounts calculated by the image feature amount calculation unit 32b.
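
The block count follows from (64 / 4) × (32 / 4) = 16 × 8 = 128. The following sketch shows one way a per-block luminance gradient histogram could be computed. The use of central differences, 9 unsigned orientation bins, and magnitude weighting are assumptions commonly made for HOG features and are not specified in the description.

```python
import math


def hog_block_histograms(image, block=4, bins=9):
    """Compute one gradient-orientation histogram per block.

    image: list of rows of luminance values (for example 64 rows x 32 columns).
    block and bins are illustrative parameters; bins=9 is an assumption.
    """
    height, width = len(image), len(image[0])
    bin_width = 180.0 / bins
    histograms = []
    for top in range(0, height, block):
        for left in range(0, width, block):
            hist = [0.0] * bins
            for y in range(top, top + block):
                for x in range(left, left + block):
                    # Central differences, clamped at the image border.
                    gx = image[y][min(x + 1, width - 1)] - image[y][max(x - 1, 0)]
                    gy = image[min(y + 1, height - 1)][x] - image[max(y - 1, 0)][x]
                    magnitude = math.hypot(gx, gy)
                    angle = math.degrees(math.atan2(gy, gx)) % 180.0
                    hist[min(int(angle / bin_width), bins - 1)] += magnitude
            histograms.append(hist)
    return histograms
```

For a 64 × 32 input this returns 128 histograms, one per HOG block, which would then be passed to the classifiers.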

その後、識別部３２のパタンフィルタ部３２ｅは、汎用識別部３２ｃが識別処理対象画像を人画像であると識別したか否かを判定する（ステップＳＴ３）。 Thereafter, the pattern filter unit 32e of the identification unit 32 determines whether the general-purpose identification unit 32c has identified the identification processing target image as a human image (step ST3).

When it is determined that the identification result of the general-purpose identification unit 32c is a human image (YES in step ST3), the pattern filter unit 32e determines whether the identification result pattern, which is the combination of the identification results of the plurality of composite weak classifiers constituting the general-purpose identification unit 32c, is appropriate (step ST4).

そして、識別結果パタンが適切であると判定した場合（ステップＳＴ４のＹＥＳ）、識別部３２は、識別処理対象画像を人画像として識別する（ステップＳＴ５）。 When it is determined that the identification result pattern is appropriate (YES in step ST4), the identification unit 32 identifies an identification processing target image as a human image (step ST5).

On the other hand, when it is determined that the luminance bias of the identification processing target image is large (NO in step ST1), when it is determined that the identification result of the general-purpose identification unit 32c is a non-human image (NO in step ST3), or when it is determined that the identification result pattern is not appropriate (NO in step ST4), the identification unit 32 identifies the identification processing target image as a non-human image (step ST6).
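
The branching of steps ST1 to ST6 described above can be summarized in the following sketch. The function name and the boolean inputs are hypothetical; each input stands for the outcome of the corresponding determination, and the feature calculation of step ST2 is assumed to have been performed before the second input is obtained.

```python
def identify_fig16(luminance_bias_small: bool,
                   general_says_human: bool,
                   pattern_is_appropriate: bool) -> str:
    """Return "human" or "non-human" following the flow of FIG. 16."""
    if not luminance_bias_small:        # NO in step ST1
        return "non-human"              # step ST6
    if not general_says_human:          # NO in step ST3
        return "non-human"              # step ST6
    if not pattern_is_appropriate:      # NO in step ST4
        return "non-human"              # step ST6
    return "human"                      # step ST5
```

Only an image that passes all three determinations is identified as a human image.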

In this way, apart from the identification by the classifiers, the periphery monitoring system 100 supplementarily identifies whether an image is a human image based on a bias in image features, such as a bias in luminance or a bias in the identification results of the plurality of weak classifiers constituting the general-purpose classifier. This suppresses the identification of specific background images and the like as human images and thereby improves the human identification capability.

次に図１７を参照し、識別部３２による識別処理の流れについて説明する。図１７は識別処理の別の一例の流れを示すフローチャートであり、識別部３２は識別処理対象画像を取得する度に繰り返しこの識別処理を実行する。図１７の識別処理は、ステップＳＴ３Ａを有する点で図１６の識別処理と相違するがその他の点で共通する。そのため、共通部分の説明を省略し、相違部分を詳細に説明する。 Next, the flow of identification processing by the identification unit 32 will be described with reference to FIG. FIG. 17 is a flowchart showing another example of the identification process, and the identification unit 32 repeatedly executes this identification process each time an identification processing target image is acquired. The identification process of FIG. 17 is different from the identification process of FIG. 16 in that it has step ST3A, but is common in the other points. Therefore, the description of the common part is omitted, and the different part will be described in detail.

When it is determined that the identification result of the general-purpose identification unit 32c is a human image (YES in step ST3), the identification unit 32 determines whether the human degree that the general-purpose identification unit 32c outputs as its identification result is equal to or less than a predetermined value (step ST3A). This predetermined value is, for example, sufficiently larger than the identification boundary value.

When it is determined that the human degree is equal to or less than the predetermined value (YES in step ST3A), the pattern filter unit 32e of the identification unit 32 determines whether the identification result pattern, which is the combination of the identification results of the plurality of composite weak classifiers constituting the general-purpose identification unit 32c, is appropriate (step ST4). The subsequent processing is the same as in the identification processing of FIG. 16.

On the other hand, when it is determined that the human degree is larger than the predetermined value (NO in step ST3A), the identification unit 32 identifies the identification processing target image as a human image without determining whether the identification result pattern is appropriate (step ST5).

このように、図１７の識別処理では、汎用識別部３２ｃにより識別処理対象画像が明らかに人画像であると判定された場合には、パタンフィルタ部３２ｅによる補助的な識別が省略される。 As described above, in the identification processing of FIG. 17, when the general-purpose identification unit 32 c clearly determines that the identification processing target image is a human image, auxiliary identification by the pattern filter unit 32 e is omitted.
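
A sketch of the flow of FIG. 17 is given below. It assumes, for illustration only, that the general-purpose identification unit 32c reports a human image when the human degree exceeds the identification boundary value; the parameter names and the use of a callable for the pattern filter are likewise hypothetical. Passing the pattern filter as a callable makes it visible that the filter is not evaluated when the human degree exceeds the predetermined value.

```python
from typing import Callable


def identify_fig17(luminance_bias_small: bool,
                   human_degree: float,
                   boundary: float,
                   skip_threshold: float,
                   pattern_is_appropriate: Callable[[], bool]) -> str:
    """Return "human" or "non-human" following the flow of FIG. 17."""
    if not luminance_bias_small:        # NO in step ST1
        return "non-human"
    if human_degree <= boundary:        # NO in step ST3 (assumed decision rule)
        return "non-human"
    if human_degree > skip_threshold:   # NO in step ST3A: clearly a human image
        return "human"                  # step ST5, pattern filter skipped
    # YES in step ST3A: fall back to the pattern filter (step ST4).
    return "human" if pattern_is_appropriate() else "non-human"
```

Because `skip_threshold` is chosen well above `boundary`, only borderline results incur the cost of the pattern filter.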

その結果、図１７の識別処理は、図１６の識別処理による効果と同じ効果を実現しながら処理負荷を低減させることができる。 As a result, the identification process of FIG. 17 can reduce the processing load while realizing the same effect as the effect of the identification process of FIG.

ここで再び図２を参照し、コントローラ３０の他の機能要素についての説明を継続する。 Referring again to FIG. 2, the description of the other functional elements of the controller 30 will be continued.

The tracking unit 33 is a functional element that tracks the identification results output by the identification unit 32 at predetermined time intervals and outputs a final human detection result. In the present embodiment, the tracking unit 33 determines that the corresponding human candidate image is a human image when a predetermined number of consecutive identification results for the same person satisfy a predetermined condition. That is, it determines that a person is present at the corresponding three-dimensional position (actual position). Whether the results relate to the same person is determined based on the actual position. Specifically, based on the actual position (reference point PrI) of the person appearing in the image identified as a human image in the first identification processing by the identification unit 32, the tracking unit 33 derives the range that the person can reach within a predetermined time. The reachable range is set based on the maximum turning speed of the shovel, the maximum traveling speed of the shovel, the maximum moving speed of a person, and the like. If the actual position (reference point PrII) of the person appearing in the image identified as a human image in the second identification processing is within that range, the tracking unit 33 determines that the person is the same person. The same applies to the third and subsequent identification processes. The tracking unit 33 then determines that a person is present at the corresponding three-dimensional position when, for example, four of six consecutive identification results are identified as human images of the same person. Even if an image is identified as a human image in the first identification processing, the tracking unit 33 determines that no person is present at the corresponding three-dimensional position when no human image of the same person is identified in the three consecutive identification processes that follow.
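
The tracking rule described above can be illustrated by the following sketch. The class and function names are hypothetical, the reachable range is simplified to a sphere of a given radius around the previous actual position, and the use of a sliding window over the latest six results is an assumption about how "six consecutive identification results" is evaluated.

```python
import math
from collections import deque


def is_same_person(previous, current, reachable_radius):
    """Treat two detections as the same person if the new actual position
    lies within the (simplified, spherical) reachable range."""
    return math.dist(previous, current) <= reachable_radius


class PresenceTracker:
    """Decide presence from the latest identification results for one person."""

    def __init__(self, window=6, required_hits=4, max_consecutive_misses=3):
        self.results = deque(maxlen=window)
        self.required_hits = required_hits
        self.max_consecutive_misses = max_consecutive_misses
        self.consecutive_misses = 0

    def update(self, identified: bool) -> str:
        """Record one identification result and return the current decision."""
        self.results.append(identified)
        self.consecutive_misses = 0 if identified else self.consecutive_misses + 1
        if sum(self.results) >= self.required_hits:
            return "present"
        if self.consecutive_misses >= self.max_consecutive_misses:
            return "absent"
        return "undecided"
```

Requiring several hits before reporting presence trades a short delay for fewer false alarms, while the miss limit discards isolated detections.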

このように、抽出部３１、識別部３２、及び追跡部３３の組み合わせは、撮像装置４０の撮像画像に基づいてショベルの周辺に人が存在するか否かを検知する人検知部３４を構成する。 Thus, the combination of the extraction unit 31, the identification unit 32, and the tracking unit 33 constitutes a human detection unit 34 that detects whether or not a person is present around the shovel based on the image captured by the imaging device 40. .

With this configuration, the human detection unit 34 can suppress the occurrence of false alarms (determining that a person is present even though no person is present), missed alarms (determining that no person is present even though a person is present), and the like.

また、人検知部３４は、人画像であると識別された画像に写る人の実在位置の推移に基づき、人がショベルに近づいているのかショベルから遠ざかっているのかを判断できる。そして、人検知部３４は、その人の実在位置のショベルからの距離が所定値を下回った場合に制御部３５に制御指令を出力して警報を出力させてもよい。この場合、人検知部３４は、ショベルの動作情報（例えば旋回速度、旋回方向、走行速度、走行方向等）に応じて所定値を調整してもよい。 Further, the human detection unit 34 can determine whether a person is approaching or away from the shovel based on the transition of the actual position of the person shown in the image identified as the human image. Then, the human detection unit 34 may output a control command to the control unit 35 and output an alarm when the distance from the shovel of the actual position of the person is less than a predetermined value. In this case, the human detection unit 34 may adjust the predetermined value in accordance with the operation information (for example, the turning speed, the turning direction, the traveling speed, the traveling direction, and the like) of the shovel.

The human detection unit 34 may also distinguish among at least two levels of human detection state and a human non-detection state. For example, a state in which at least one of a distance-related condition and a reliability-related condition is satisfied may be determined to be a first human detection state (caution state), and a state in which both are satisfied may be determined to be a second human detection state (alarm state). The distance-related condition includes, for example, that the distance from the shovel to the actual position of the person appearing in the image identified as a human image is less than a predetermined value. The reliability-related condition includes, for example, that four of six consecutive identification results are identified as human images of the same person. In the first human detection state (caution state), a first alarm is output as a preliminary alarm that has low accuracy but a quick response. The first alarm is, for example, a low-volume beep, and is stopped automatically when neither of the two conditions is satisfied any longer. In the second human detection state (alarm state), a second alarm is output as a formal alarm that has high accuracy but a slow response. The second alarm is, for example, a loud melody; it is not stopped automatically even if at least one of the conditions is no longer satisfied, and an operation by the operator is required to stop it.
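
The two-level alarm behavior can be sketched as a small state holder. The class, method, and state names are hypothetical; the sketch only illustrates that the first alarm follows the conditions automatically, whereas the second alarm stays latched until the operator stops it.

```python
class AlarmController:
    """Illustrative latch for the first (caution) and second (alarm) states."""

    def __init__(self):
        self.second_alarm_latched = False

    def update(self, distance_condition: bool, reliability_condition: bool) -> str:
        """Return "second_alarm", "first_alarm", or "off" for the current cycle."""
        if distance_condition and reliability_condition:
            self.second_alarm_latched = True
        if self.second_alarm_latched:
            return "second_alarm"   # kept until operator_stop() is called
        if distance_condition or reliability_condition:
            return "first_alarm"    # stops automatically when both conditions clear
        return "off"

    def operator_stop(self) -> None:
        """Operator action required to stop the second alarm."""
        self.second_alarm_latched = False
```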

制御部３５は、各種装置を制御する機能要素である。本実施例では、制御部３５は入力装置４１を介した操作者の入力に応じて各種装置を制御する。例えば、タッチパネルを通じて入力された画像切換指令に応じて車載ディスプレイの画面に表示される表示画像を切り換える。表示画像は、後方カメラ４０Ｂのスルー画像、右側方カメラ４０Ｒのスルー画像、左側方カメラ４０Ｌのスルー画像、視点変換画像等を含む。視点変換画像は、例えば、複数のカメラの撮像画像から合成される鳥瞰画像（ショベルの真上にある仮想視点から見た画像）である。 The control unit 35 is a functional element that controls various devices. In the present embodiment, the control unit 35 controls various devices in accordance with the input of the operator via the input device 41. For example, the display image displayed on the screen of the in-vehicle display is switched according to the image switching command input through the touch panel. The display image includes a through image of the rear camera 40B, a through image of the right side camera 40R, a through image of the left side camera 40L, a viewpoint conversion image, and the like. The viewpoint conversion image is, for example, a bird's eye view image (an image viewed from a virtual viewpoint directly above the shovel) synthesized from captured images of a plurality of cameras.

また、制御部３５は、人検知部３４を構成する追跡部３３の最終的な人検知結果に応じて各種装置を制御する。例えば、追跡部３３の最終的な人検知結果に応じて機械制御装置５１に制御指令を出力してショベルの状態を第１状態と第２状態との間で切り換える。第１状態は、ショベルの動きの制限が解除されている状態、警報の出力が停止されている状態等を含む。第２状態はショベルの動きを制限し或いは停止させている状態、警報を出力させている状態等を含む。本実施例では、制御部３５は、追跡部３３の最終的な人検知結果に基づいてショベルの周辺の所定範囲内に人が存在すると判定した場合、機械制御装置５１に制御指令を出力してショベルの状態を第１状態から第２状態に切り換える。例えば、ショベルの動きを停止させる。この場合、操作者による操作は無効にされる。具体的には、ゲートロック弁に制御指令を出力して操作装置を油圧システムから切り離すことで無操作状態を強制的に創出してショベルの動きを停止させる。或いは、エンジン制御装置に制御指令を出力してエンジンを停止させてもよい。或いは、油圧アクチュエータに流入する作動油の流量を制御する制御弁に制御指令を出力して制御弁の開口面積、開口面積変化速度等を変化させることで油圧アクチュエータの動きを制限してもよい。この場合、最大旋回速度、最大走行速度等が低減される。 In addition, the control unit 35 controls various devices in accordance with the final human detection result of the tracking unit 33 configuring the human detection unit 34. For example, a control command is output to the machine control device 51 according to the final human detection result of the tracking unit 33 to switch the state of the shovel between the first state and the second state. The first state includes a state in which the restriction on the movement of the shovel is released, a state in which the output of an alarm is stopped, and the like. The second state includes a state in which the movement of the shovel is restricted or stopped, a state in which an alarm is output, and the like. In the present embodiment, the control unit 35 outputs a control command to the machine control device 51 when it is determined that there is a person within a predetermined range around the shovel based on the final person detection result of the tracking unit 33. The state of the shovel is switched from the first state to the second state. For example, the movement of the shovel is stopped. In this case, the operation by the operator is invalidated. Specifically, a control command is output to the gate lock valve to separate the operating device from the hydraulic system, thereby forcibly creating a non-operation state and stopping the movement of the shovel. 
Alternatively, a control command may be output to the engine control device to stop the engine. Alternatively, the movement of the hydraulic actuator may be restricted by outputting a control command to a control valve that controls the flow rate of hydraulic fluid flowing into the hydraulic actuator to change the opening area of the control valve, the opening area change speed, and the like. In this case, the maximum turning speed, the maximum traveling speed and the like are reduced.

The control unit 35 returns the state of the shovel to the first state when a predetermined release condition is satisfied after the state of the shovel has been set to the second state. That is, when the predetermined release condition is satisfied after the movement of the shovel has been restricted or stopped, the restriction or stop is released. The predetermined release condition includes, for example, "determining that no person is present within a predetermined range around the shovel" (hereinafter referred to as the "first release condition"). The predetermined release condition additionally includes, for example, "a state in which the shovel will not start moving being secured" (hereinafter referred to as the "second release condition"). The predetermined release condition may also include "the operator having confirmed that no person is present around the shovel" (hereinafter referred to as the "third release condition"). In the present embodiment, flags are used to manage whether the movement of the shovel is restricted or stopped and whether each of the first release condition, the second release condition, and the third release condition is satisfied.

The first release condition includes, for example, "the control unit 35 determining, based on the final human detection result of the tracking unit 33 constituting the human detection unit 34, that no person is present within the predetermined range around the shovel."

The second release condition includes, for example, "all operating devices having been in the neutral position for a predetermined time or longer," "the gate lock lever being lowered (the operating devices being disabled)," "the operator's hands and feet being away from all operating devices," "a predetermined button operation having been performed," and the like. The control unit 35 detects that "all operating devices are in the neutral position" based on, for example, the presence or absence of a command from each operating device, the output value of a sensor that detects the operation amount of each operating device, and the like. The condition "for a predetermined time or longer" prevents the second release condition from being satisfied merely because the operating devices were momentarily in the neutral position. The control unit 35 detects that "the operator's hands and feet are away from the operating devices" based on, for example, an image captured by a camera that images the interior of the cab, the output of an electrostatic sensor attached to an operating device (for example, the grip of an operating lever), and the like. The control unit 35 detects that "a predetermined button operation has been performed" when, for example, a confirmation button (for example, the horn button or a software button displayed on the same screen) is pressed while a message such as "Has it been ensured that the shovel will not start moving?" is displayed on the screen of the in-vehicle display.

The third release condition is satisfied, for example, when the confirmation button is pressed while a message such as "Have you confirmed that no one is around the shovel?" is displayed on the screen of the in-vehicle display. The third release condition may be omitted.

所定の解除条件に第３解除条件が含まれる場合、第１解除条件と第２解除条件が満たされると、ショベルは制限解除可能状態となる。制限解除可能状態は、ショベル周辺に人がいないことを操作者が確認しさえすれば制限を解除できる状態を意味する。 When the third cancellation condition is included in the predetermined cancellation condition, the shovel is in the restriction cancellation possible state when the first cancellation condition and the second cancellation condition are satisfied. The state where the restriction can be released means that the restriction can be canceled as long as the operator confirms that there is no person around the shovel.

第１解除条件、第２解除条件、及び第３解除条件のそれぞれが満たされる順番に制限はない。例えば、第３解除条件、第２解除条件、第１解除条件の順で条件が満たされた場合であっても、制御部３５はショベルの動きの制限又は停止を解除する。 There is no restriction on the order in which each of the first release condition, the second release condition, and the third release condition is satisfied. For example, even if the conditions are satisfied in the order of the third release condition, the second release condition, and the first release condition, the control unit 35 releases the restriction or stop of the movement of the shovel.
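
The flag-based management of the release conditions can be illustrated as follows. The field and function names are hypothetical. Because each condition is stored as an independent flag, the order in which the flags become true does not affect the result.

```python
from dataclasses import dataclass


@dataclass
class ReleaseFlags:
    restricted: bool = False        # movement of the shovel is restricted or stopped
    first_condition: bool = False   # no person within the predetermined range
    second_condition: bool = False  # shovel secured so that it will not start moving
    third_condition: bool = False   # operator confirmed that no one is around


def is_releasable(flags: ReleaseFlags) -> bool:
    """Restriction-releasable state: first and second conditions are satisfied."""
    return flags.restricted and flags.first_condition and flags.second_condition


def should_release(flags: ReleaseFlags, use_third_condition: bool = True) -> bool:
    """Return True when the restriction or stop may be released."""
    if not is_releasable(flags):
        return False
    return flags.third_condition or not use_third_condition
```

Setting `use_third_condition=False` corresponds to the variant in which the third release condition is omitted.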

また、制御部３５は、所定の解除条件が満たされた後で所定の待ち時間が経過したときにその制限又は停止を解除してもよい。急な解除によって操作者を慌てさせることがないようにするためである。 Further, the control unit 35 may release the restriction or the stop when the predetermined waiting time has elapsed after the predetermined cancellation condition is satisfied. This is to prevent the operator from being upset by sudden release.

When the control unit 35 has restricted or stopped the movement of the shovel, it may output a control command to the in-vehicle display serving as the output device 50 to display the captured image containing the human image that caused the restriction or stop. For example, when a human image is included only in the image captured by the left side camera 40L, the through image of the left side camera 40L may be displayed alone. Alternatively, when a human image is included in each of the image captured by the left side camera 40L and the image captured by the rear camera 40B, the through images of the two cameras may be displayed side by side at the same time, or a single composite image (for example, a viewpoint conversion image) containing the images captured by the two cameras may be displayed. An image indicating that the movement is being restricted or stopped, guidance on how to release the restriction, and the like may also be displayed. The image portion corresponding to the human candidate image identified as a human image may be highlighted. For example, the outline of the identification processing target image area TRg may be displayed in a predetermined color. When a waiting time after the predetermined release condition is satisfied has been set, the display may indicate that the predetermined release condition has been satisfied and then show a countdown of the waiting time. When an alarm is being output during the waiting time, the volume of the alarm may be reduced gradually as the waiting time elapses.

When the control unit 35 has restricted or stopped the movement of the shovel, it may also output a control command to the in-vehicle speaker serving as the output device 50 to output an alarm on the side where the person who caused the restriction or stop is present. In this case, the in-vehicle speaker consists of, for example, a right side speaker installed on the right wall of the cab, a left side speaker installed on the left wall, and a rear speaker installed on the rear wall. When a human image is included only in the image captured by the left side camera 40L, the control unit 35 outputs the alarm only from the left side speaker. Alternatively, the control unit 35 may localize the sound using a surround system that includes a plurality of speakers.

When the human detection unit 34 identifies a human candidate image as a human image, the control unit 35 may output only an alarm without restricting or stopping the movement of the shovel. In this case as well, the control unit 35 may determine, as described above, that a state in which at least one of the distance-related condition and the reliability-related condition is satisfied is the first human detection state (caution state) and that a state in which both are satisfied is the second human detection state (alarm state). As in the case where the movement of the shovel is restricted or stopped, the control unit 35 may stop the alarm in the second human detection state (alarm state) when the predetermined release condition is satisfied. This is because, unlike the alarm in the first human detection state (caution state), which can be stopped automatically, stopping the alarm in the second human detection state (alarm state) requires an operation by the operator.

Although preferred embodiments of the present invention have been described in detail above, the present invention is not limited to these embodiments, and various modifications and substitutions may be made to them without departing from the scope of the present invention.

For example, in the embodiment described above, the periphery monitoring system 100 includes both the luminance filter unit 32a and the pattern filter unit 32e as auxiliary identification units that assist identification based on image feature amounts. However, the present invention is not limited to this configuration. For example, the periphery monitoring system 100 may include only the luminance filter unit 32a, or only the pattern filter unit 32e, as the auxiliary identification unit. Even when only one of them is provided as the auxiliary identification unit, the periphery monitoring system 100 can identify persons more reliably than a configuration without any auxiliary identification unit.
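The optional use of either auxiliary identification unit can be sketched as a pipeline with pluggable filters. This is a hedged illustration only: `identify_person`, `pre_filter`, and `post_filter` are hypothetical names, and combining the results with a simple logical AND is an assumption. The claims state only that one auxiliary identification may run before the classifier result is obtained and another after it.

```python
from typing import Callable, Optional, Sequence

Image = Sequence[Sequence[float]]      # rows of pixel values
Predicate = Callable[[Image], bool]    # True means "may be a person"


def identify_person(image: Image,
                    classifier: Predicate,
                    pre_filter: Optional[Predicate] = None,
                    post_filter: Optional[Predicate] = None) -> bool:
    """Combine a learned classifier with optional auxiliary filters.

    Either filter may be omitted, which corresponds to a configuration
    with only one auxiliary identification unit (or none).
    """
    if pre_filter is not None and not pre_filter(image):
        return False   # rejected before the classifier runs
    if not classifier(image):
        return False
    if post_filter is not None and not post_filter(image):
        return False   # classifier result overruled afterwards
    return True
```

Rejecting in the pre-filter also skips the classifier call, which is one plausible reason to place a cheap check first.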

The embodiment described above assumes that a person is detected using a captured image from the imaging device 40 attached to the upper swing body 3 of a shovel, but the present invention is not limited to this configuration. It can also be applied to a configuration that uses a captured image from an imaging device attached to the main body of another work machine, such as a mobile crane, a fixed crane, a lifting magnet machine, or a forklift.

In the embodiment described above, three cameras are used to image the blind spot areas of the shovel, but one, two, or four or more cameras may be used instead.

In the embodiment described above, the person detection process is applied individually to each of the plurality of captured images, but it may instead be applied to a single composite image generated from the plurality of captured images.

1 ... lower traveling body
2 ... swing mechanism
3 ... upper swing body
4 ... boom
5 ... arm
6 ... bucket
7 ... boom cylinder
8 ... arm cylinder
9 ... bucket cylinder
10 ... cabin
30 ... controller
31 ... extraction unit
32 ... identification unit
33 ... tracking unit
34 ... person detection unit
35 ... control unit
40 ... imaging device
40B ... rear camera
40L ... left side camera
40R ... right side camera
41 ... input device
50 ... output device
51 ... machine control device
100 ... periphery monitoring system
AP, AP1 to AP6 ... head image position
BX ... box
HD ... head
HP ... virtual head position
HRg ... helmet image
M1, M2 ... mask area
Pr, Pr1, Pr2, Pr10 to Pr12 ... reference point
R1 ... protruding area
R2 ... vehicle body reflection area
RP ... representative position
TR, TR1, TR2, TR10 to TR12 ... virtual plane area
TRg, TRg3, TRg4, TRg5 ... identification processing target image area
TRgt, TRgt3, TRgt4, TRgt5 ... normalized image

Claims

A periphery monitoring system for a work machine that detects a person present around the work machine using a captured image from an imaging device attached to the work machine, the system comprising:
an extraction unit that extracts a part of the captured image as an identification processing target image; and
an identification unit that identifies, by image recognition processing, whether an image included in the identification processing target image extracted by the extraction unit is an image of a person,
wherein the identification unit identifies whether the image included in the identification processing target image is an image of a person based on an identification result of a classifier generated by machine learning and on an identification result of an auxiliary identification unit that supplementarily identifies, based on a bias of image features in the identification processing target image, whether or not the image included in the identification processing target image is an image of a person.

The periphery monitoring system for a work machine according to claim 1, wherein the auxiliary identification unit divides the identification processing target image into a plurality of regions, identifies for each region whether the image is an image of a person, and identifies whether the image included in the identification processing target image is an image of a person based on the identification results for the respective regions.

The periphery monitoring system for a work machine according to claim 2, wherein at least two of the plurality of regions overlap.

The periphery monitoring system for a work machine according to any one of claims 1 to 3, wherein the identification by the auxiliary identification unit is executed after the identification result of the classifier is obtained.

The periphery monitoring system for a work machine according to any one of claims 1 to 4, wherein
the extraction unit converts the identification processing target image to generate a rectangular normalized image of a predetermined size,
the normalized image is divided into four regions by its two diagonals, and
the auxiliary identification unit identifies, for each of the four regions, whether the image is an image of a person, and identifies whether the image included in the identification processing target image is an image of a person based on the identification results for the respective four regions.

The periphery monitoring system for a work machine according to claim 1, wherein the auxiliary identification unit identifies whether or not the image is an image of a person based on a bias of luminance as the image feature in the identification processing target image.

The periphery monitoring system for a work machine according to claim 6, wherein the identification by the auxiliary identification unit is executed before the identification result of the classifier is obtained.
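
As an informal illustration of the claim above in which the rectangular normalized image is divided into four regions by its two diagonals, the following sketch assigns each pixel to one of the four triangular regions and computes a per-region statistic. The function names are hypothetical, mean pixel value is used only as a stand-in feature, and the tie-break for pixels lying exactly on a diagonal is an arbitrary assumption; none of this is taken from the patent text.

```python
def diagonal_region(x: int, y: int, width: int, height: int) -> str:
    """Return which triangle ("top", "bottom", "left", "right") holds pixel (x, y).

    The pixel centre is compared with both diagonals using integer
    arithmetic, so the result does not depend on floating-point rounding.
    The y axis points downward, as is usual for images.
    """
    # Centre is above the diagonal running from top-left to bottom-right.
    above_main = (2 * y + 1) * width < (2 * x + 1) * height
    # Centre is above the diagonal running from bottom-left to top-right.
    above_anti = (2 * y + 1) * width < (2 * (width - x) - 1) * height
    if above_main and above_anti:
        return "top"
    if not above_main and not above_anti:
        return "bottom"
    return "left" if above_anti else "right"


def region_means(image: list[list[float]]) -> dict[str, float]:
    """Mean pixel value in each of the four diagonal regions."""
    height, width = len(image), len(image[0])
    sums = {"top": 0.0, "bottom": 0.0, "left": 0.0, "right": 0.0}
    counts = {name: 0 for name in sums}
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            name = diagonal_region(x, y, width, height)
            sums[name] += value
            counts[name] += 1
    return {name: sums[name] / counts[name] if counts[name] else 0.0
            for name in sums}
```

A per-region decision could then compare each region's statistic with a threshold, but the thresholds and the rule for combining the four results are left open here.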