WO2018110377A1 - Video monitoring device - Google Patents

Video monitoring device

Info

Publication number
WO2018110377A1
Authority
WO
WIPO (PCT)
Prior art keywords
asset
unit
straight line
image
extraction unit
Application number
PCT/JP2017/043746
Other languages
French (fr)
Japanese (ja)
Inventor
海斗 笹尾
伊藤 渡
一成 岩永
Original Assignee
株式会社日立国際電気
Application filed by 株式会社日立国際電気
Priority to JP2018556603A (granted as JP6831396B2)
Publication of WO2018110377A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/60 Analysis of geometric attributes

Definitions

  • The present invention relates to a video monitoring apparatus, for example a video monitoring apparatus having a function of detecting asset candidate areas in an image.
  • Objects intruding into a monitored area are monitored using imaging devices such as monitoring cameras, and techniques in which a device or system performs this monitoring automatically, instead of manned monitoring by an observer, are being studied. A video monitoring device with such a function can use its detection results to record only video in which a moving object appears, display a warning icon on a display device, and sound a buzzer to alert the observer, reducing the burden of monitoring work that conventionally required constant visual confirmation.
  • As an example of automatically detecting intruding objects, monitoring techniques using the so-called background difference method have long been widely used (see, for example, Patent Literature 1): the difference in luminance (or pixel value) between the input image and a reference background image free of the objects to be detected is computed, and change regions where the difference exceeds a predetermined threshold are treated as containing, or possibly containing, an object to be detected.
  • In the technique of Patent Literature 1, two kinds of temporal change are detected automatically from two images captured at different times by a terminal device with imaging and positioning functions: changes in the target object itself, and changes in the relative relationship between the object and the background (such as the inclination of a utility pole).
  • The SS (Selective Search) method detects regions where an object exists, or is likely to exist, from a single input image based on its color and texture similarity.
  • These object region detection methods automatically detect and report such regions by a difference method or the SS method. However, the difference method detects only moving objects and cannot detect background objects such as buildings, while the SS method tends to detect large numbers of object-like regions, making it prone to false alarms, and its computational cost is too high for real-time operation on a general CPU (Central Processing Unit).
  • The present invention has been made in view of this situation and aims to solve the above problems.
  • The present invention is a video monitoring device that detects asset regions in a monitored area from an input image acquired by a video acquisition unit, comprising: an edge extraction unit that acquires an edge image from the input image; a straight line extraction unit that extracts straight lines from the edge image; a straight line classification unit that classifies the extracted lines into a plurality of groups; a representative straight line selection unit that selects one representative line from each group; an asset candidate area extraction unit that extracts asset candidate areas using the representative lines; an identification database unit storing the image group needed to identify the asset candidate areas; and an asset identification unit that identifies detection target assets from the asset candidate areas by referring to the identification database unit.
  • The asset candidate area extraction unit may detect the position of each asset candidate area. The device may further comprise an asset state estimation unit that estimates the state of the detection target asset identified by the asset identification unit, calculating at least one of the asset's angle and its three-dimensional position.
  • An object of the present invention is to provide an automatic asset-region detection technique that is robust against noise such as changes in sunlight.
  • Reference signs: 100 image monitoring system; 101 imaging device; 102 video acquisition unit; 103 image processing unit; 104 data communication unit; 105 recording control unit; 106 display control unit; 107 notification device; 108 instruction device; 109 display output device; 110 recording device; 111 calculation device; 112 output device; 201 edge extraction unit; 202 straight line extraction unit; 203 straight line classification unit; 204 representative straight line selection unit; 205 asset candidate area extraction unit; 206 identification database unit; 207 asset identification unit; 208 asset state estimation unit.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Geometry (AREA)
  • Image Analysis (AREA)

Abstract

Provided is a technology for automatic detection of asset regions that is robust against noise such as changes in sunlight. An edge extraction unit 201 performs edge extraction and binarization on image data acquired from a video acquisition unit 102. A straight line extraction unit 202 performs straight line detection on the edge image extracted by the edge extraction unit 201. A straight line classification unit 203 groups the detected lines. A representative straight line selection unit 204 extracts a representative line for each group. An asset candidate area extraction unit 205 extracts the asset candidate area associated with each representative line. An asset identification unit 207 refers to an identification database unit 206 and identifies assets to be detected among the asset candidate areas. An asset state estimation unit 208 estimates the angle and three-dimensional position of each identified detection target asset.

Description

Video monitoring device
The present invention relates to a video monitoring apparatus, for example a video monitoring apparatus having a function of detecting asset candidate areas in an image.
Objects intruding into a monitored area are monitored using imaging devices such as monitoring cameras. Techniques in which a device or system performs this monitoring automatically, instead of manned monitoring by an observer, are also being studied. A video monitoring device with such a function can use its detection results to record only video in which a moving object appears, display a warning icon on a display device, and sound a buzzer or the like to alert the observer, so it helps reduce the burden of monitoring work, which conventionally required constant visual confirmation.
As an example of a technique for automatically detecting objects intruding into a monitored area, monitoring techniques using the so-called background difference method have long been widely used (see, for example, Patent Literature 1). An object detection method based on the background difference method computes the difference in luminance (or pixel value) between an input image obtained from an imaging device or the like and a reference background image in which the objects to be detected do not appear, and monitors on the assumption that an object to be detected exists, or may exist, in any change region where that difference exceeds a predetermined threshold.
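To make the background difference method concrete, the following is a minimal Python/OpenCV sketch, not the method of Patent Literature 1 itself; the file names and the threshold value are assumptions, since the text only calls for "a predetermined threshold".

```python
import cv2

# Background difference sketch: flag pixels whose luminance differs from an
# object-free reference background image by more than a predetermined threshold.
background = cv2.imread("background.png", cv2.IMREAD_GRAYSCALE)  # hypothetical file
frame = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE)            # hypothetical file

THRESHOLD = 30  # assumed value
diff = cv2.absdiff(frame, background)
change_mask = diff > THRESHOLD

# Nonzero pixels in change_mask form the change regions that may contain
# an object to be detected.
```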
In the technique disclosed in Patent Literature 1, two kinds of temporal change are detected automatically from two images captured at different times by a terminal device having imaging and positioning functions: changes in the target object itself, and changes in the relative relationship between the object and the background (such as the inclination of a utility pole).
There are also methods that detect object regions from a single input image obtained from an imaging device or the like; one example is the SS (Selective Search) method. The SS method detects regions where an object exists, or is likely to exist, based on the color and texture similarity of the input image.
Patent Literature 1: JP 2016-18463 A
The object region detection methods described above automatically detect and report regions where an object exists or is likely to exist, using a difference method or the SS method. However, the difference method can detect only moving objects; it cannot detect objects that belong to the background, such as buildings. Furthermore, when the environment changes, for example when a new shadow appears in the camera image due to a change in sunlight, the change is recognized as an appearing object and causes false alarms. The SS method, meanwhile, has the characteristic of detecting large numbers of object-like regions, which readily causes false alarms, and its computational cost is so high that real-time operation on a general CPU (Central Processing Unit) is difficult.
Means of distinguishing the target objects from other, non-target objects and from noise such as sunlight changes include classification by feature extraction and machine learning on the detected object regions, and machine learning methods such as deep learning that require no hand-designed features. With any of these machine learning methods, achieving high accuracy requires improving both the cropping accuracy of the candidate regions given as input and the quality of the training data.
The present invention has been made in view of this situation and aims to solve the above problems.
The present invention is a video monitoring device that detects asset regions in a monitored area from an input image acquired by a video acquisition unit, comprising: an edge extraction unit that acquires an edge image from the input image; a straight line extraction unit that extracts straight lines from the edge image; a straight line classification unit that classifies the extracted lines into a plurality of groups; a representative straight line selection unit that selects one representative line from each of the classified groups; an asset candidate area extraction unit that extracts asset candidate areas using the representative lines; an identification database unit storing the image group needed to identify the asset candidate areas; and an asset identification unit that identifies detection target assets from the asset candidate areas by referring to the identification database unit.

The asset candidate area extraction unit may also detect the position of each asset candidate area.

The device may further comprise an asset state estimation unit that estimates the state of the detection target asset identified by the asset identification unit; the asset state estimation unit may calculate at least one of the angle and the three-dimensional position of the detection target asset.
The present invention thus provides an automatic asset-region detection technique that is robust against noise such as changes in sunlight.
FIG. 1 is a block diagram showing the schematic configuration of an image monitoring system according to the embodiment. FIG. 2 is a block diagram of the image processing unit according to the embodiment. FIG. 3 shows an example of the edge extraction and binarization performed by the edge extraction unit. FIG. 4 shows an example of the straight line extraction performed by the straight line extraction unit. FIG. 5 shows an example of the straight line classification performed by the straight line classification unit. FIG. 6 shows an example of the representative straight line selection performed by the representative straight line selection unit. FIG. 7 shows an example of the asset region extraction performed by the asset candidate area extraction unit. FIG. 8 shows an example of the asset identification performed by the asset identification unit. FIG. 9 is a flowchart showing the processing of the image processing unit.
Next, a mode for carrying out the present invention (hereinafter simply "the embodiment") is described concretely with reference to the drawings. The embodiment describes an image monitoring technique with the function of acquiring video from an imaging device such as a camera and detecting asset regions in the monitored area by image processing. In this technique, an edge image is generated from the acquired image, features related to linearity are extracted and selected from it, and asset candidate areas on the image are detected based on those features. Here, "asset" refers to structures such as roads, bridges, and signs, and in particular to structures whose shape is characterized by straight lines.
FIG. 1 is a block diagram showing the schematic configuration of the image monitoring system 100 according to the present embodiment. As hardware, the image monitoring system 100 is implemented as an electronic computer system equipped with a CPU, on which each function is executed. The hardware may also be something other than an electronic computer system, such as a DSP (Digital Signal Processor), an FPGA (Field-Programmable Gate Array), or a GPU (Graphics Processing Unit).
The image monitoring system 100 comprises an imaging device 101, a recording device 110, a calculation device 111, and an output device 112. The imaging device 101 is a television camera or the like, and there may be one or several. Video captured by the imaging device 101 is output to the calculation device 111.
The calculation device 111 comprises a video acquisition unit 102, an image processing unit 103, a data communication unit 104, a recording control unit 105, and a display control unit 106.
The video acquisition unit 102 acquires video from the imaging device 101 or the recording device 110. Specifically, the video acquisition unit 102 obtains image data as a one-, two-, or three-dimensional array from real-time image data supplied by the imaging device 101 or from a video signal input from the recording device 110, in which image data has been recorded.
To reduce the influence of noise and flicker, this image data may be preprocessed with a smoothing filter, a contour enhancement filter, density conversion, and similar processing. A data format such as RGB color or monochrome may be selected according to the application. Furthermore, to reduce processing cost, the image data may be reduced to a predetermined size.
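Purely as an illustration of such a preprocessing chain, a sketch follows; the particular filter, kernel size, and target resolution are assumptions rather than values given in the text.

```python
import cv2

img = cv2.imread("frame.png")                 # hypothetical input file
img = cv2.GaussianBlur(img, (3, 3), 0)        # smoothing filter against noise and flicker
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # monochrome data format
small = cv2.resize(gray, (640, 360))          # reduction to a predetermined size
```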
The image processing unit 103 takes the image data obtained by the video acquisition unit 102 as input, detects asset regions by image processing, and estimates their state; its configuration and processing are described in more detail below. The data communication unit 104 exchanges the results of the image processing unit 103, information stored in the recording device 110, and the like with other devices installed in the local area, with a monitoring center on the network, and so on.
The recording control unit 105 uses the detection and estimation results of the image processing unit 103 to control video recording, including the compression rate and the recording interval of the recorded video. The display control unit 106 controls the display of the video acquired by the video acquisition unit 102, the results obtained by the image processing unit 103, and information stored in the recording device 110. The recording device 110 records and retains the video obtained by the video acquisition unit 102 in accordance with commands from the recording control unit 105.
The output device 112 comprises a notification device 107, an instruction device 108, and a display output device 109. The notification device 107 notifies the user of abnormal states detected by the image processing unit 103 by sound, light, or the like. The instruction device 108 accepts input from the user, such as the parameters to be used by the image processing unit 103 or an instruction to stop a notification in response to its content. The display output device 109 displays various kinds of information.
The calculation device 111 may be configured as a single device or as a collection of several devices; its configuration is not restricted.
Next, the functions of the image processing unit 103 are described concretely with reference to FIGS. 2 to 8.

FIG. 2 is a block diagram of the image processing unit 103. The image processing unit 103 comprises an edge extraction unit 201, a straight line extraction unit 202, a straight line classification unit 203, a representative straight line selection unit 204, an asset candidate area extraction unit 205, an identification database unit 206, an asset identification unit 207, and an asset state estimation unit 208.
FIG. 3 shows an example of the edge extraction and binarization performed by the edge extraction unit 201. As illustrated, the edge extraction unit 201 extracts edges from the input image 301 acquired by the video acquisition unit 102. In the present embodiment, the edge image 302 is obtained by applying a Sobel filter in the vertical direction only to the luminance values of the input image data and binarizing the result with a discriminant analysis method (Otsu's method or the like).

As the information used for edge extraction, the RGB values of each pixel of the input image data, the H (hue), S (saturation), and V (value) components after conversion to HSV space, or information obtained by correcting and integrating these may be used. Other filters, such as a horizontal Sobel filter or a Laplacian filter, may be used instead of the vertical Sobel filter. Furthermore, the threshold used for binarization may be chosen arbitrarily.
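A minimal sketch of this step is shown below. It assumes that "a Sobel filter in the vertical direction only" means the x-derivative Sobel kernel, which responds to vertical contours such as pole outlines; that reading, and the 3x3 kernel size, are assumptions.

```python
import cv2

gray = cv2.imread("input_image.png", cv2.IMREAD_GRAYSCALE)  # input image 301 (luminance)

# The x-derivative Sobel kernel responds to vertical contours such as utility poles.
sobel = cv2.convertScaleAbs(cv2.Sobel(gray, cv2.CV_16S, 1, 0, ksize=3))

# Otsu's method serves as the discriminant-analysis binarization.
_, edge_image = cv2.threshold(sobel, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
# edge_image corresponds to the edge image 302.
```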
FIG. 4 shows an example of the straight line extraction performed by the straight line extraction unit 202. As illustrated, the straight line extraction unit 202 obtains a straight line image 401 from the edge image 302 extracted by the edge extraction unit 201.

In the present embodiment, the Hough transform is applied as the line extraction process. Specifically, one third of the height of the acquired image is set as the threshold on the number of votes; for example, if the height of the acquired image is 1080, the vote threshold is 360. The vote threshold may instead be set dynamically using the width of the acquired image or the like, or set to an arbitrary value. When the straight line image 401 is generated, the lines obtained by the Hough transform are drawn with 8-connectivity at a thickness of 2. The thickness of each line and the choice between 4- and 8-connectivity may be set dynamically according to the size of the acquired image, or set arbitrarily.

To keep only lines with a strong vertical component, the slope of each extracted line is calculated and, taking a vertical line on the image as 0 degrees, only lines whose slope lies in the range -k1 to k2 (degrees) are selected, where k1 and k2 are arbitrary constants. In this embodiment, k1 = k2 = 15 is used with the aim of extracting assets such as utility poles with high accuracy. When lines are extracted with the Hough transform and expressed as ρ = x·cos(θ) + y·sin(θ), ρ and θ may be used as the selection criteria instead of k1 and k2.
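Continuing the sketch above (reusing edge_image), the height/3 vote threshold and the ±15 degree slope gate might look as follows. In the ρ/θ parameterization, θ = 0 corresponds to a vertical line, so near-vertical lines have θ near 0 or near 180 degrees.

```python
import numpy as np
import cv2

h, w = edge_image.shape
vote_threshold = h // 3  # e.g. 360 votes for an image 1080 pixels high

lines = cv2.HoughLines(edge_image, 1, np.pi / 180, vote_threshold)

K1 = K2 = 15  # degrees; the example values for extracting utility poles
near_vertical = []
if lines is not None:
    for rho, theta in lines[:, 0]:
        deg = np.degrees(theta)  # theta near 0 or 180 degrees means near-vertical
        if deg <= K2 or deg >= 180 - K1:
            near_vertical.append((rho, theta))
```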
FIG. 5 shows an example of the line classification (grouping) performed by the straight line classification unit 203. The straight line classification unit 203 classifies the lines of the straight line image 401 extracted by the straight line extraction unit 202 into several groups.

After projecting each line onto the image, the straight line classification unit 203 searches the columns of the image in turn, and lines any part of which falls within a band of fixed width R1 around a column are classified into the same group, yielding the line classification image 501. In this embodiment R1 = 20. As in the straight line extraction unit 202, ρ and θ from the Hough transform may instead be used as the classification criteria. Here, the line classification image 501 has been classified into five groups, group 1 (G1) through group 5 (G5). When the classification result is displayed, the groups are indicated by color coding or by label names.
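A simplified grouping sketch follows, continuing from the lines found above. Keying each line by the x position where it crosses the image's vertical midline is an assumption made for brevity; the text scans every column of the projected lines, but for near-vertical lines the effect is similar.

```python
import numpy as np

R1 = 20  # grouping width used in this embodiment

def x_at_row(rho, theta, y):
    # Solve rho = x*cos(theta) + y*sin(theta) for x at image row y.
    return (rho - y * np.sin(theta)) / np.cos(theta)

groups = []  # each group is a list of (rho, theta) lines
for rho, theta in near_vertical:
    x_mid = x_at_row(rho, theta, h / 2)
    for group in groups:
        if abs(x_mid - x_at_row(*group[0], h / 2)) <= R1:
            group.append((rho, theta))
            break
    else:
        groups.append([(rho, theta)])
# groups corresponds to the line classification image 501 (G1, G2, ...).
```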
FIG. 6 shows an example of the representative line selection performed by the representative straight line selection unit 204. The representative straight line selection unit 204 selects one representative line from each of the line groups (G1 to G5) classified by the straight line classification unit 203. Specifically, from each classified group it selects as the representative the line that obtained the largest number of votes in the Hough transform of the straight line extraction unit 202, yielding the representative line image 601.

More concretely, the line with the highest Hough vote score in each group (G1 to G5) is scanned from the top and the bottom on the edge image, and the portions where edges continue for a fixed number of pixels are taken as the line's base and tip. In the figure, the first to fifth group representative lines LG1 to LG5, corresponding to group 1 (G1) through group 5 (G5), are shown selected. As the method of selecting a representative line, the average, maximum, minimum, or median of the slopes of the lines in each group may also be used.
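OpenCV's HoughLines does not expose the vote counts, so the sketch below substitutes the number of edge pixels lying under each line for the vote score; that substitution is an assumption.

```python
def edge_support(edge_image, rho, theta):
    # Proxy for the Hough vote count: edge pixels on the line, one sample per row.
    hgt, wdt = edge_image.shape
    hits = 0
    for y in range(hgt):
        x = int(round(x_at_row(rho, theta, y)))
        if 0 <= x < wdt and edge_image[y, x] > 0:
            hits += 1
    return hits

# One representative line per group: the line with the strongest edge support.
representatives = [max(g, key=lambda ln: edge_support(edge_image, *ln))
                   for g in groups]
```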
FIG. 7 shows an example of the asset region extraction performed by the asset candidate area extraction unit 205. Here, the asset candidate area extraction unit 205 detects each asset candidate position on the image (its upper-left and lower-right coordinates).

First, to determine the x coordinates, the asset candidate area extraction unit 205 uses the line classification image 501 obtained by the straight line classification unit 203 to compute the minimum and maximum x coordinates of a rectangle enclosing each group, and takes these as the upper-left and lower-right x coordinates of the asset candidate position. A fixed offset R2 may be added to the width computed from these x coordinates; in this embodiment R2 = 20 and the x coordinates are widened outward evenly, but only one side may instead be widened outward or narrowed inward.

Next, to determine the y coordinates, the asset candidate area extraction unit 205 computes the logical AND of the edge image 302 extracted by the edge extraction unit 201 and the representative line image 601 obtained by the representative straight line selection unit 204, then applies dilation and erosion several times to obtain the ANDed and dilated image 701. In the illustrated example, dilation with a 5x5 filter was applied three times and erosion with a 1x5 filter three times.

The size of each filter and the number of applications may be changed arbitrarily. The asset candidate area extraction unit 205 then labels the image 701 and computes the area, height, width, and so on of each label region 702. From these computed items, the overall largest label region 702 is determined for each representative line. Since there are five representative lines here, there are five regions, the first to fifth label regions 702_1 to 702_5.

The asset candidate area extraction unit 205 then computes the minimum and maximum y coordinates of a rectangle enclosing each label region 702 (first to fifth label regions 702_1 to 702_5) and takes these as the upper-left and lower-right y coordinates of the asset candidate position. A fixed offset R3 may be applied to the height computed from these y coordinates. In this embodiment R3 = 0, but as with the x coordinates, the y coordinates may be widened outward evenly or narrowed inward, or only one side may be widened or narrowed.

Finally, the asset candidate area extraction unit 205 combines the x and y coordinates obtained above into the asset candidate areas 703 (first to fifth asset candidate areas 703_1 to 703_5). FIG. 7 shows the composite image 704 obtained by projecting the asset candidate areas 703 onto the input image 301.
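Continuing the sketches above, the following derives each candidate's x extent from its group's lines and its y extent from the AND-dilate-erode-label chain. The kernel sizes and repetition counts follow the example in the text; drawing the representative lines with cv2.line and matching label regions to representatives by x overlap are simplifying assumptions.

```python
import numpy as np
import cv2

R2, R3 = 20, 0  # offsets used in this embodiment

# Draw the representative lines (thickness 2) to form the representative line image 601.
rep_image = np.zeros_like(edge_image)
for rho, theta in representatives:
    a, b = np.cos(theta), np.sin(theta)
    p1 = (int(a * rho - 2000 * b), int(b * rho + 2000 * a))
    p2 = (int(a * rho + 2000 * b), int(b * rho - 2000 * a))
    cv2.line(rep_image, p1, p2, 255, thickness=2)

# AND with the edge image, then dilate (5x5, three times) and erode (1x5, three times).
anded = cv2.bitwise_and(edge_image, rep_image)
anded = cv2.dilate(anded, np.ones((5, 5), np.uint8), iterations=3)
anded = cv2.erode(anded, np.ones((1, 5), np.uint8), iterations=3)

# Label the result (image 701) and keep the largest region per representative line.
n, labels, stats, _ = cv2.connectedComponentsWithStats(anded)
candidates = []
for group in groups:
    xs = [x_at_row(rho, theta, h / 2) for rho, theta in group]
    x_min, x_max = int(min(xs)) - R2 // 2, int(max(xs)) + R2 // 2
    overlapping = [i for i in range(1, n)
                   if stats[i, cv2.CC_STAT_LEFT] < x_max
                   and stats[i, cv2.CC_STAT_LEFT] + stats[i, cv2.CC_STAT_WIDTH] > x_min]
    if overlapping:
        best = max(overlapping, key=lambda i: stats[i, cv2.CC_STAT_AREA])
        y_min = stats[best, cv2.CC_STAT_TOP] - R3
        y_max = y_min + stats[best, cv2.CC_STAT_HEIGHT] + 2 * R3
        candidates.append((x_min, y_min, x_max, y_max))
# candidates corresponds to the asset candidate areas 703.
```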
FIG. 8 shows an example of the asset identification performed by the asset identification unit 207. As illustrated, asset identification by deep learning is applied to the composite image 704, in which the asset candidate areas 703 extracted by the asset candidate area extraction unit 205 are projected onto the input image 301, using the identification database unit 206, which stores a large number of asset images captured beforehand in a variety of scenes; noise regions are excluded as a result.

That is, the identification database unit 206 collects image data of the detection target assets (for example, utility poles and signs) and of non-asset regions (regions other than the detection target assets), and learning is performed without hand-crafted features by machine learning such as deep learning. Using the extracted regions as training data makes it possible to automate the creation of training data and to improve accuracy.

In the example of FIG. 8, the asset identification unit 207 identifies the utility poles corresponding to the first and third asset candidate areas 703_1 and 703_3 as detection target assets, treats everything else as noise, and obtains the identification image 801. In this embodiment, the identification target (detection target asset) may also be a road sign, traffic light, electric light, street light, road marking such as a white line, electric wire, bridge pier, pipe, building pillar, lintel, top board, or other asset; identification is not limited to the two classes of detection target and non-detection target but can be extended to multi-class identification, and the approach is applicable to any object characterized by straight lines. Instead of deep learning, a general machine learning method such as an SVM (Support Vector Machine) may be used once some feature set has been decided; a wide range of features can be used, from rule-based ones such as the presence or absence of electric wires or signboards to general-purpose features such as HoG features. This candidate region extraction means operates faster than the conventional SS method and produces fewer false alarms.
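As a hedged sketch of the SVM alternative mentioned above (the primary embodiment uses deep learning), the following classifies each candidate crop with HOG features; the window and cell sizes are assumptions, and the training crops and labels are assumed to come from the identification database unit 206.

```python
import cv2
import numpy as np

HOG = cv2.HOGDescriptor((64, 128), (16, 16), (8, 8), (8, 8), 9)

def hog_features(crop):
    return HOG.compute(cv2.resize(crop, (64, 128))).flatten()

def train_asset_classifier(train_crops, train_labels):
    # train_crops: grayscale images; train_labels: 1 = target asset, 0 = noise.
    svm = cv2.ml.SVM_create()
    svm.setType(cv2.ml.SVM_C_SVC)
    svm.setKernel(cv2.ml.SVM_RBF)
    X = np.array([hog_features(c) for c in train_crops], dtype=np.float32)
    y = np.array(train_labels, dtype=np.int32)
    svm.train(X, cv2.ml.ROW_SAMPLE, y)
    return svm

def identify(svm, gray, candidates):
    kept = []
    for (x1, y1, x2, y2) in candidates:
        feat = hog_features(gray[y1:y2, x1:x2]).reshape(1, -1).astype(np.float32)
        _, pred = svm.predict(feat)
        if int(pred[0, 0]) == 1:  # identified as a detection target asset
            kept.append((x1, y1, x2, y2))
    return kept
```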
The asset state estimation unit 208 estimates the angle and three-dimensional position of an asset identified as a detection target asset by the asset identification unit 207. The position and angle of the asset may be estimated from the image by coordinate transformation, the Hough transform, or the like, or by a system that additionally uses three-dimensional information obtained from ToF (Time of Flight) sensors or stereo cameras, position sensors such as GPS (Global Positioning System), and angle sensors such as a spirit level. Furthermore, to efficiently create the training data needed by the asset identification unit 207, the image processing unit 103 can feed the image group stored in the identification database unit 206 into the edge extraction unit 201 and use the asset candidate areas and asset areas obtained by the asset candidate area extraction unit 205 and the asset identification unit 207 as training data.
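As one concrete possibility for the image-based angle estimate, the tilt of a pole from vertical can be read directly off its representative line's Hough angle; treating that angle as the asset's tilt is an assumption, and the text equally allows sensor-based estimation.

```python
import numpy as np

def tilt_from_vertical_degrees(theta):
    # theta: Hough angle of the representative line (theta = 0 is vertical).
    deg = np.degrees(theta)
    return deg if deg <= 90 else deg - 180  # signed tilt in degrees
```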
 FIG. 9 is a flowchart focusing mainly on the processing performed by the image processing unit 103 of the image monitoring system 100; the processing of the image processing unit 103 is briefly summarized below with reference to this flowchart.
 First, as image input processing, the video acquisition unit 102 acquires image data from the imaging device 101 or the recording device 110 and outputs it to the edge extraction unit 201 (S10). The edge extraction unit 201 applies predetermined filtering to the image data acquired from the video acquisition unit 102 and performs edge extraction processing and binarization processing (S12). The straight line extraction unit 202 performs straight line detection processing on the edge image 302 extracted by the edge extraction unit 201 and acquires a straight line image 401 (S14).
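 The patent does not name the operators used in S12 and S14; as one plausible concretization, the following OpenCV sketch uses Canny edge extraction and a probabilistic Hough transform, with illustrative thresholds and a stand-in file name.

```python
# Hedged sketch of S10-S14 with OpenCV; file name and thresholds are assumptions.
import cv2
import numpy as np

frame = cv2.imread("input.png")              # image input (S10)
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray, 50, 150)             # edge extraction + binarization (S12)
lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180,
                        threshold=80, minLineLength=60,
                        maxLineGap=5)        # straight line detection (S14)
# lines: N x 1 x 4 array of segment endpoints (x1, y1, x2, y2), or None
```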
 Subsequently, the straight line classification unit 203 performs straight line grouping processing, collecting straight lines that lie near one another into one group (S16). Next, the processing of the representative straight line selection unit 204 and the asset candidate region extraction unit 205 is performed as asset region extraction processing (S18): the representative straight line selection unit 204 extracts a representative straight line for each group, and the asset candidate region extraction unit 205 performs labeling processing and the like to extract an asset candidate region corresponding to each representative straight line.
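 One simple way to realize the grouping of S16 and the representative-line selection of S18 is by midpoint proximity and segment length; the sketch below follows that reading, and the distance threshold is an assumption.

```python
# Illustrative sketch of S16/S18: group segments by midpoint proximity, then
# keep the longest segment of each group as its representative line.
# segments: iterable of (x1, y1, x2, y2), e.g. lines[:, 0] from HoughLinesP.
import numpy as np

def group_lines(segments, max_dist=30.0):
    groups = []                               # each: {"mid": ..., "segs": [...]}
    for x1, y1, x2, y2 in segments:
        mid = np.array([(x1 + x2) / 2.0, (y1 + y2) / 2.0])
        for g in groups:
            if np.linalg.norm(mid - g["mid"]) < max_dist:
                g["segs"].append((x1, y1, x2, y2))
                break
        else:
            groups.append({"mid": mid, "segs": [(x1, y1, x2, y2)]})
    return groups

def representative(group):
    """The longest segment stands in for the whole group."""
    return max(group["segs"],
               key=lambda s: np.hypot(s[2] - s[0], s[3] - s[1]))
```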
 Further, the asset identification unit 207 refers to the identification database unit 206 and identifies the detection target assets among the asset candidate areas (S20). The asset state estimation unit 208 then performs state estimation (estimation of the angle and the three-dimensional position) of the identified detection target assets (S22).
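 Chaining the sketches above gives one possible end-to-end reading of S10 through S22; extract_edges(), detect_lines(), candidate_region(), identify(), and estimate_state() are hypothetical stand-ins for the operators sketched earlier, the labeling-based region extraction, the database lookup, and the angle/position estimation.

```python
# Hypothetical glue code for S10-S22, reusing the sketches above.
def process_frame(frame):
    edges = extract_edges(frame)                    # S12 (e.g., Canny)
    segments = detect_lines(edges)                  # S14 (e.g., HoughLinesP)
    groups = group_lines(segments)                  # S16
    reps = [representative(g) for g in groups]      # S18 (selection)
    regions = [candidate_region(r) for r in reps]   # S18 (labeling/extraction)
    assets = [r for r in regions if identify(r)]    # S20 (database lookup)
    return [estimate_state(a) for a in assets]      # S22
```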
 As described above, according to the present embodiment, a plurality of asset candidate areas appearing in an image can be detected individually from the video input from the imaging device 101 or the recorded video of the recording device 110, without being affected by moving objects or the like. As a result, a detection technique is obtained that has an automatic asset-region detection function robust against noise such as changes in sunlight. In other words, this detection technique makes it possible to record suitable asset areas, to improve the reporting function when an asset is detected, and to appropriately inform the observer through the monitor display.
 The present invention has been described above on the basis of an embodiment. This embodiment is an exemplification, and it will be understood by those skilled in the art that various modifications of the combinations of its components are possible and that such modifications are also within the scope of the present invention.
100 image monitoring system, 101 imaging device, 102 video acquisition unit, 103 image processing unit, 104 data communication unit, 105 recording control unit, 106 display control unit, 107 reporting device, 108 instruction device, 109 display output device, 110 recording device, 111 calculation device, 112 output device, 201 edge extraction unit, 202 straight line extraction unit, 203 straight line classification unit, 204 representative straight line selection unit, 205 asset candidate region extraction unit, 206 identification database unit, 207 asset identification unit, 208 asset state estimation unit

Claims (4)

  1.  A video monitoring device that detects an asset area in a monitoring area from an input image acquired by a video acquisition unit, the device comprising:
     an edge extraction unit that acquires an edge image from the input image;
     a straight line extraction unit that extracts straight lines from the edge image;
     a straight line classification unit that classifies the extracted straight lines into a plurality of groups;
     a representative straight line selection unit that selects one representative straight line from each of the classified groups;
     an asset candidate region extraction unit that extracts asset candidate areas using the representative straight lines;
     an identification database unit that stores an image group necessary for identifying the asset candidate areas; and
     an asset identification unit that identifies a detection target asset from the asset candidate areas with reference to the identification database unit.
  2.  The video monitoring device according to claim 1, wherein the asset candidate region extraction unit detects the region position of each asset candidate area.
  3.  The video monitoring device according to claim 1, further comprising an asset state estimation unit that performs state estimation of the detection target asset identified by the asset identification unit,
     wherein the asset state estimation unit calculates at least either the angle or the three-dimensional position of the detection target asset.
  4.  The video monitoring device according to claim 2, further comprising an asset state estimation unit that performs state estimation of the detection target asset identified by the asset identification unit,
     wherein the asset state estimation unit calculates at least either the angle or the three-dimensional position of the detection target asset.
PCT/JP2017/043746 2016-12-15 2017-12-06 Video monitoring device WO2018110377A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2018556603A JP6831396B2 (en) 2016-12-15 2017-12-06 Video monitoring device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2016243402 2016-12-15
JP2016-243402 2016-12-15

Publications (1)

Publication Number Publication Date
WO2018110377A1 true WO2018110377A1 (en) 2018-06-21

Family

ID=62558664

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2017/043746 WO2018110377A1 (en) 2016-12-15 2017-12-06 Video monitoring device

Country Status (2)

Country Link
JP (1) JP6831396B2 (en)
WO (1) WO2018110377A1 (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH05215547A (en) * 1992-02-06 1993-08-24 Nippon Telegr & Teleph Corp <Ntt> Method for determining corresponding points between stereo images
JPH0694442A (en) * 1992-09-17 1994-04-05 Nippon Telegr & Teleph Corp <Ntt> Apparatus for measuring bent degree of utility pole
JP2011080845A (en) * 2009-10-06 2011-04-21 Topcon Corp Method and apparatus for creating three-dimensional data
JP2015088168A (en) * 2013-09-25 2015-05-07 国際航業株式会社 Learning sample creation device, learning sample creation program, and automatic recognition device

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2020135776A (en) * 2019-02-26 2020-08-31 プラス株式会社 Imaging device
CN111724440A (en) * 2020-05-27 2020-09-29 杭州数梦工场科技有限公司 Orientation information determining method and device of monitoring equipment and electronic equipment
CN111724440B (en) * 2020-05-27 2024-02-02 杭州数梦工场科技有限公司 Method and device for determining azimuth information of monitoring equipment and electronic equipment

Also Published As

Publication number Publication date
JPWO2018110377A1 (en) 2019-10-24
JP6831396B2 (en) 2021-02-17

Similar Documents

Publication Publication Date Title
CN109886130B (en) Target object determination method and device, storage medium and processor
RU2484531C2 (en) Apparatus for processing video information of security alarm system
JP6144656B2 (en) System and method for warning a driver that visual recognition of a pedestrian may be difficult
JP6764481B2 (en) Monitoring device
CN110650316A (en) Intelligent patrol and early warning processing method and device, electronic equipment and storage medium
JP2017531883A (en) Method and system for extracting main subject of image
CN104966304A (en) Kalman filtering and nonparametric background model-based multi-target detection tracking method
JP2014059875A5 (en)
Lin et al. Collaborative pedestrian tracking and data fusion with multiple cameras
KR20140095333A (en) Method and apparratus of tracing object on image
Ozcelik et al. A vision based traffic light detection and recognition approach for intelligent vehicles
US20110280442A1 (en) Object monitoring system and method
CN111967396A (en) Processing method, device and equipment for obstacle detection and storage medium
CN111382637A (en) Pedestrian detection tracking method, device, terminal equipment and medium
CN114140745A (en) Method, system, device and medium for detecting personnel attributes of construction site
WO2018110377A1 (en) Video monitoring device
CN113920585A (en) Behavior recognition method and device, equipment and storage medium
Kumar et al. Traffic surveillance and speed limit violation detection system
EP3044734B1 (en) Isotropic feature matching
Noman et al. An optimized and fast scheme for real-time human detection using raspberry pi
CN113052139A (en) Deep learning double-flow network-based climbing behavior detection method and system
KR102150661B1 (en) Method and Apparatus for Preprocessing Image for Detecting Objects
WO2022198507A1 (en) Obstacle detection method, apparatus, and device, and computer storage medium
Tang Development of a multiple-camera tracking system for accurate traffic performance measurements at intersections
Lan et al. A new vehicle detection algorithm for real-time image processing system

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17881642

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2018556603

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17881642

Country of ref document: EP

Kind code of ref document: A1