WO2018110377A1 - Video monitoring device - Google Patents

Video monitoring device

Info

Publication number
WO2018110377A1
Authority
WO
WIPO (PCT)
Prior art keywords
asset
unit
straight line
image
extraction unit
Application number
PCT/JP2017/043746
Other languages
French (fr)
Japanese (ja)
Inventor
海斗 笹尾
伊藤 渡
一成 岩永
Original Assignee
株式会社日立国際電気
Application filed by 株式会社日立国際電気
Priority to JP2018556603A (granted as JP6831396B2)
Publication of WO2018110377A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/60 Analysis of geometric attributes

Definitions

  • The present invention relates to a video monitoring apparatus, for example a video monitoring apparatus having a function of detecting asset candidate areas in an image.
  • Objects intruding into a monitored area are monitored using imaging devices such as monitoring cameras, and techniques in which a device or system performs this monitoring automatically, instead of manned monitoring by an observer, are being studied. A video monitoring device with such a function can use its detection results to record only video in which a moving object appears, display a warning icon on a display device, and sound a buzzer to alert the observer, reducing the burden of monitoring work that conventionally required constant visual confirmation.
  • As an example of automatically detecting intruding objects, monitoring techniques using the so-called background difference method have long been widely used (see, for example, Patent Literature 1): the difference in luminance (or pixel value) between the input image and a reference background image free of the objects to be detected is computed, and change regions where the difference exceeds a predetermined threshold are treated as containing, or possibly containing, an object to be detected.
  • In the technique of Patent Literature 1, two kinds of temporal change are detected automatically from two images captured at different times by a terminal device with imaging and positioning functions: changes in the target object itself, and changes in the relative relationship between the object and the background (such as the inclination of a utility pole).
  • The SS (Selective Search) method detects regions where an object exists, or is likely to exist, from a single input image based on its color and texture similarity.
  • These object region detection methods automatically detect and report such regions by a difference method or the SS method. However, the difference method detects only moving objects and cannot detect background objects such as buildings, while the SS method tends to detect large numbers of object-like regions, making it prone to false alarms, and its computational cost is too high for real-time operation on a general CPU (Central Processing Unit).
  • The present invention has been made in view of this situation and aims to solve the above problems.
  • The present invention is a video monitoring device that detects asset regions in a monitored area from an input image acquired by a video acquisition unit, comprising: an edge extraction unit that acquires an edge image from the input image; a straight line extraction unit that extracts straight lines from the edge image; a straight line classification unit that classifies the extracted lines into a plurality of groups; a representative straight line selection unit that selects one representative line from each group; an asset candidate area extraction unit that extracts asset candidate areas using the representative lines; an identification database unit storing the image group needed to identify the asset candidate areas; and an asset identification unit that identifies detection target assets from the asset candidate areas by referring to the identification database unit.
  • The asset candidate area extraction unit may detect the position of each asset candidate area. The device may further comprise an asset state estimation unit that estimates the state of the detection target asset identified by the asset identification unit, calculating at least one of the asset's angle and its three-dimensional position.
  • An object of the present invention is to provide an automatic asset-region detection technique that is robust against noise such as changes in sunlight.
  • Reference signs: 100 image monitoring system; 101 imaging device; 102 video acquisition unit; 103 image processing unit; 104 data communication unit; 105 recording control unit; 106 display control unit; 107 notification device; 108 instruction device; 109 display output device; 110 recording device; 111 calculation device; 112 output device; 201 edge extraction unit; 202 straight line extraction unit; 203 straight line classification unit; 204 representative straight line selection unit; 205 asset candidate area extraction unit; 206 identification database unit; 207 asset identification unit; 208 asset state estimation unit.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Geometry (AREA)
  • Image Analysis (AREA)

Abstract

Provided is a technology for automatic detection of asset regions that is robust against noise such as changes in sunlight. An edge extraction unit 201 performs edge extraction and binarization on image data acquired from a video acquisition unit 102. A straight line extraction unit 202 performs straight line detection on the edge image extracted by the edge extraction unit 201. A straight line classification unit 203 groups the detected lines. A representative straight line selection unit 204 extracts a representative line for each group. An asset candidate area extraction unit 205 extracts the asset candidate area associated with each representative line. An asset identification unit 207 refers to an identification database unit 206 and identifies assets to be detected among the asset candidate areas. An asset state estimation unit 208 estimates the angle and three-dimensional position of each identified detection target asset.

Description

Video monitoring device
The present invention relates to a video monitoring apparatus, for example a video monitoring apparatus having a function of detecting asset candidate areas in an image.
Objects intruding into a monitored area are monitored using imaging devices such as monitoring cameras. Techniques in which a device or system performs this monitoring automatically, instead of manned monitoring by an observer, are also being studied. A video monitoring device with such a function can use its detection results to record only video in which a moving object appears, display a warning icon on a display device, and sound a buzzer or the like to alert the observer, so it helps reduce the burden of monitoring work, which conventionally required constant visual confirmation.
As an example of a technique for automatically detecting objects intruding into a monitored area, monitoring techniques using the so-called background difference method have long been widely used (see, for example, Patent Literature 1). An object detection method based on the background difference method computes the difference in luminance (or pixel value) between an input image obtained from an imaging device or the like and a reference background image in which the objects to be detected do not appear, and monitors on the assumption that an object to be detected exists, or may exist, in any change region where that difference exceeds a predetermined threshold.
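To make the background difference method concrete, the following is a minimal Python/OpenCV sketch, not the method of Patent Literature 1 itself; the file names and the threshold value are assumptions, since the text only calls for "a predetermined threshold".

```python
import cv2

# Background difference sketch: flag pixels whose luminance differs from an
# object-free reference background image by more than a predetermined threshold.
background = cv2.imread("background.png", cv2.IMREAD_GRAYSCALE)  # hypothetical file
frame = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE)            # hypothetical file

THRESHOLD = 30  # assumed value
diff = cv2.absdiff(frame, background)
change_mask = diff > THRESHOLD

# Nonzero pixels in change_mask form the change regions that may contain
# an object to be detected.
```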
In the technique disclosed in Patent Literature 1, two kinds of temporal change are detected automatically from two images captured at different times by a terminal device having imaging and positioning functions: changes in the target object itself, and changes in the relative relationship between the object and the background (such as the inclination of a utility pole).
There are also methods that detect object regions from a single input image obtained from an imaging device or the like; one example is the SS (Selective Search) method. The SS method detects regions where an object exists, or is likely to exist, based on the color and texture similarity of the input image.
Patent Literature 1: JP 2016-18463 A
The object region detection methods described above automatically detect and report regions where an object exists or is likely to exist, using a difference method or the SS method. However, the difference method can detect only moving objects; it cannot detect objects that belong to the background, such as buildings. Furthermore, when the environment changes, for example when a new shadow appears in the camera image due to a change in sunlight, the change is recognized as an appearing object and causes false alarms. The SS method, meanwhile, has the characteristic of detecting large numbers of object-like regions, which readily causes false alarms, and its computational cost is so high that real-time operation on a general CPU (Central Processing Unit) is difficult.
Means of distinguishing the target objects from other, non-target objects and from noise such as sunlight changes include classification by feature extraction and machine learning on the detected object regions, and machine learning methods such as deep learning that require no hand-designed features. With any of these machine learning methods, achieving high accuracy requires improving both the cropping accuracy of the candidate regions given as input and the quality of the training data.
The present invention has been made in view of this situation and aims to solve the above problems.
The present invention is a video monitoring device that detects asset regions in a monitored area from an input image acquired by a video acquisition unit, comprising: an edge extraction unit that acquires an edge image from the input image; a straight line extraction unit that extracts straight lines from the edge image; a straight line classification unit that classifies the extracted lines into a plurality of groups; a representative straight line selection unit that selects one representative line from each of the classified groups; an asset candidate area extraction unit that extracts asset candidate areas using the representative lines; an identification database unit storing the image group needed to identify the asset candidate areas; and an asset identification unit that identifies detection target assets from the asset candidate areas by referring to the identification database unit.

The asset candidate area extraction unit may also detect the position of each asset candidate area.

The device may further comprise an asset state estimation unit that estimates the state of the detection target asset identified by the asset identification unit; the asset state estimation unit may calculate at least one of the angle and the three-dimensional position of the detection target asset.
The present invention thus provides an automatic asset-region detection technique that is robust against noise such as changes in sunlight.
FIG. 1 is a block diagram showing the schematic configuration of an image monitoring system according to the embodiment. FIG. 2 is a block diagram of the image processing unit according to the embodiment. FIG. 3 shows an example of the edge extraction and binarization performed by the edge extraction unit. FIG. 4 shows an example of the straight line extraction performed by the straight line extraction unit. FIG. 5 shows an example of the straight line classification performed by the straight line classification unit. FIG. 6 shows an example of the representative straight line selection performed by the representative straight line selection unit. FIG. 7 shows an example of the asset region extraction performed by the asset candidate area extraction unit. FIG. 8 shows an example of the asset identification performed by the asset identification unit. FIG. 9 is a flowchart showing the processing of the image processing unit.
Next, a mode for carrying out the present invention (hereinafter simply "the embodiment") is described concretely with reference to the drawings. The embodiment describes an image monitoring technique with the function of acquiring video from an imaging device such as a camera and detecting asset regions in the monitored area by image processing. In this technique, an edge image is generated from the acquired image, features related to linearity are extracted and selected from it, and asset candidate areas on the image are detected based on those features. Here, "asset" refers to structures such as roads, bridges, and signs, and in particular to structures whose shape is characterized by straight lines.
FIG. 1 is a block diagram showing the schematic configuration of the image monitoring system 100 according to the present embodiment. As hardware, the image monitoring system 100 is implemented as an electronic computer system equipped with a CPU, on which each function is executed. The hardware may also be something other than an electronic computer system, such as a DSP (Digital Signal Processor), an FPGA (Field-Programmable Gate Array), or a GPU (Graphics Processing Unit).
The image monitoring system 100 comprises an imaging device 101, a recording device 110, a calculation device 111, and an output device 112. The imaging device 101 is a television camera or the like, and there may be one or several. Video captured by the imaging device 101 is output to the calculation device 111.
The calculation device 111 comprises a video acquisition unit 102, an image processing unit 103, a data communication unit 104, a recording control unit 105, and a display control unit 106.
The video acquisition unit 102 acquires video from the imaging device 101 or the recording device 110. Specifically, the video acquisition unit 102 obtains image data as a one-, two-, or three-dimensional array from real-time image data supplied by the imaging device 101 or from a video signal input from the recording device 110, in which image data has been recorded.
To reduce the influence of noise and flicker, this image data may be preprocessed with a smoothing filter, a contour enhancement filter, density conversion, and similar processing. A data format such as RGB color or monochrome may be selected according to the application. Furthermore, to reduce processing cost, the image data may be reduced to a predetermined size.
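Purely as an illustration of such a preprocessing chain, a sketch follows; the particular filter, kernel size, and target resolution are assumptions rather than values given in the text.

```python
import cv2

img = cv2.imread("frame.png")                 # hypothetical input file
img = cv2.GaussianBlur(img, (3, 3), 0)        # smoothing filter against noise and flicker
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # monochrome data format
small = cv2.resize(gray, (640, 360))          # reduction to a predetermined size
```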
The image processing unit 103 takes the image data obtained by the video acquisition unit 102 as input, detects asset regions by image processing, and estimates their state; its configuration and processing are described in more detail below. The data communication unit 104 exchanges the results of the image processing unit 103, information stored in the recording device 110, and the like with other devices installed in the local area, with a monitoring center on the network, and so on.
The recording control unit 105 uses the detection and estimation results of the image processing unit 103 to control video recording, including the compression rate and the recording interval of the recorded video. The display control unit 106 controls the display of the video acquired by the video acquisition unit 102, the results obtained by the image processing unit 103, and information stored in the recording device 110. The recording device 110 records and retains the video obtained by the video acquisition unit 102 in accordance with commands from the recording control unit 105.
The output device 112 comprises a notification device 107, an instruction device 108, and a display output device 109. The notification device 107 notifies the user of abnormal states detected by the image processing unit 103 by sound, light, or the like. The instruction device 108 accepts input from the user, such as the parameters to be used by the image processing unit 103 or an instruction to stop a notification in response to its content. The display output device 109 displays various kinds of information.
The calculation device 111 may be configured as a single device or as a collection of several devices; its configuration is not restricted.
Next, the functions of the image processing unit 103 are described concretely with reference to FIGS. 2 to 8.

FIG. 2 is a block diagram of the image processing unit 103. The image processing unit 103 comprises an edge extraction unit 201, a straight line extraction unit 202, a straight line classification unit 203, a representative straight line selection unit 204, an asset candidate area extraction unit 205, an identification database unit 206, an asset identification unit 207, and an asset state estimation unit 208.
FIG. 3 shows an example of the edge extraction and binarization performed by the edge extraction unit 201. As illustrated, the edge extraction unit 201 extracts edges from the input image 301 acquired by the video acquisition unit 102. In the present embodiment, the edge image 302 is obtained by applying a Sobel filter in the vertical direction only to the luminance values of the input image data and binarizing the result with a discriminant analysis method (Otsu's method or the like).

As the information used for edge extraction, the RGB values of each pixel of the input image data, the H (hue), S (saturation), and V (value) components after conversion to HSV space, or information obtained by correcting and integrating these may be used. Other filters, such as a horizontal Sobel filter or a Laplacian filter, may be used instead of the vertical Sobel filter. Furthermore, the threshold used for binarization may be chosen arbitrarily.
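A minimal sketch of this step is shown below. It assumes that "a Sobel filter in the vertical direction only" means the x-derivative Sobel kernel, which responds to vertical contours such as pole outlines; that reading, and the 3x3 kernel size, are assumptions.

```python
import cv2

gray = cv2.imread("input_image.png", cv2.IMREAD_GRAYSCALE)  # input image 301 (luminance)

# The x-derivative Sobel kernel responds to vertical contours such as utility poles.
sobel = cv2.convertScaleAbs(cv2.Sobel(gray, cv2.CV_16S, 1, 0, ksize=3))

# Otsu's method serves as the discriminant-analysis binarization.
_, edge_image = cv2.threshold(sobel, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
# edge_image corresponds to the edge image 302.
```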
FIG. 4 shows an example of the straight line extraction performed by the straight line extraction unit 202. As illustrated, the straight line extraction unit 202 obtains a straight line image 401 from the edge image 302 extracted by the edge extraction unit 201.

In the present embodiment, the Hough transform is applied as the line extraction process. Specifically, one third of the height of the acquired image is set as the threshold on the number of votes; for example, if the height of the acquired image is 1080, the vote threshold is 360. The vote threshold may instead be set dynamically using the width of the acquired image or the like, or set to an arbitrary value. When the straight line image 401 is generated, the lines obtained by the Hough transform are drawn with 8-connectivity at a thickness of 2. The thickness of each line and the choice between 4- and 8-connectivity may be set dynamically according to the size of the acquired image, or set arbitrarily.

To keep only lines with a strong vertical component, the slope of each extracted line is calculated and, taking a vertical line on the image as 0 degrees, only lines whose slope lies in the range -k1 to k2 (degrees) are selected, where k1 and k2 are arbitrary constants. In this embodiment, k1 = k2 = 15 is used with the aim of extracting assets such as utility poles with high accuracy. When lines are extracted with the Hough transform and expressed as ρ = x·cos(θ) + y·sin(θ), ρ and θ may be used as the selection criteria instead of k1 and k2.
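Continuing the sketch above (reusing edge_image), the height/3 vote threshold and the ±15 degree slope gate might look as follows. In the ρ/θ parameterization, θ = 0 corresponds to a vertical line, so near-vertical lines have θ near 0 or near 180 degrees.

```python
import numpy as np
import cv2

h, w = edge_image.shape
vote_threshold = h // 3  # e.g. 360 votes for an image 1080 pixels high

lines = cv2.HoughLines(edge_image, 1, np.pi / 180, vote_threshold)

K1 = K2 = 15  # degrees; the example values for extracting utility poles
near_vertical = []
if lines is not None:
    for rho, theta in lines[:, 0]:
        deg = np.degrees(theta)  # theta near 0 or 180 degrees means near-vertical
        if deg <= K2 or deg >= 180 - K1:
            near_vertical.append((rho, theta))
```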
FIG. 5 shows an example of the line classification (grouping) performed by the straight line classification unit 203. The straight line classification unit 203 classifies the lines of the straight line image 401 extracted by the straight line extraction unit 202 into several groups.

After projecting each line onto the image, the straight line classification unit 203 searches the columns of the image in turn, and lines any part of which falls within a band of fixed width R1 around a column are classified into the same group, yielding the line classification image 501. In this embodiment R1 = 20. As in the straight line extraction unit 202, ρ and θ from the Hough transform may instead be used as the classification criteria. Here, the line classification image 501 has been classified into five groups, group 1 (G1) through group 5 (G5). When the classification result is displayed, the groups are indicated by color coding or by label names.
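A simplified grouping sketch follows, continuing from the lines found above. Keying each line by the x position where it crosses the image's vertical midline is an assumption made for brevity; the text scans every column of the projected lines, but for near-vertical lines the effect is similar.

```python
import numpy as np

R1 = 20  # grouping width used in this embodiment

def x_at_row(rho, theta, y):
    # Solve rho = x*cos(theta) + y*sin(theta) for x at image row y.
    return (rho - y * np.sin(theta)) / np.cos(theta)

groups = []  # each group is a list of (rho, theta) lines
for rho, theta in near_vertical:
    x_mid = x_at_row(rho, theta, h / 2)
    for group in groups:
        if abs(x_mid - x_at_row(*group[0], h / 2)) <= R1:
            group.append((rho, theta))
            break
    else:
        groups.append([(rho, theta)])
# groups corresponds to the line classification image 501 (G1, G2, ...).
```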
FIG. 6 shows an example of the representative line selection performed by the representative straight line selection unit 204. The representative straight line selection unit 204 selects one representative line from each of the line groups (G1 to G5) classified by the straight line classification unit 203. Specifically, from each classified group it selects as the representative the line that obtained the largest number of votes in the Hough transform of the straight line extraction unit 202, yielding the representative line image 601.

More concretely, the line with the highest Hough vote score in each group (G1 to G5) is scanned from the top and the bottom on the edge image, and the portions where edges continue for a fixed number of pixels are taken as the line's base and tip. In the figure, the first to fifth group representative lines LG1 to LG5, corresponding to group 1 (G1) through group 5 (G5), are shown selected. As the method of selecting a representative line, the average, maximum, minimum, or median of the slopes of the lines in each group may also be used.
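OpenCV's HoughLines does not expose the vote counts, so the sketch below substitutes the number of edge pixels lying under each line for the vote score; that substitution is an assumption.

```python
def edge_support(edge_image, rho, theta):
    # Proxy for the Hough vote count: edge pixels on the line, one sample per row.
    hgt, wdt = edge_image.shape
    hits = 0
    for y in range(hgt):
        x = int(round(x_at_row(rho, theta, y)))
        if 0 <= x < wdt and edge_image[y, x] > 0:
            hits += 1
    return hits

# One representative line per group: the line with the strongest edge support.
representatives = [max(g, key=lambda ln: edge_support(edge_image, *ln))
                   for g in groups]
```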
FIG. 7 shows an example of the asset region extraction performed by the asset candidate area extraction unit 205. Here, the asset candidate area extraction unit 205 detects each asset candidate position on the image (its upper-left and lower-right coordinates).

First, to determine the x coordinates, the asset candidate area extraction unit 205 uses the line classification image 501 obtained by the straight line classification unit 203 to compute the minimum and maximum x coordinates of a rectangle enclosing each group, and takes these as the upper-left and lower-right x coordinates of the asset candidate position. A fixed offset R2 may be added to the width computed from these x coordinates; in this embodiment R2 = 20 and the x coordinates are widened outward evenly, but only one side may instead be widened outward or narrowed inward.

Next, to determine the y coordinates, the asset candidate area extraction unit 205 computes the logical AND of the edge image 302 extracted by the edge extraction unit 201 and the representative line image 601 obtained by the representative straight line selection unit 204, then applies dilation and erosion several times to obtain the ANDed and dilated image 701. In the illustrated example, dilation with a 5x5 filter was applied three times and erosion with a 1x5 filter three times.

The size of each filter and the number of applications may be changed arbitrarily. The asset candidate area extraction unit 205 then labels the image 701 and computes the area, height, width, and so on of each label region 702. From these computed items, the overall largest label region 702 is determined for each representative line. Since there are five representative lines here, there are five regions, the first to fifth label regions 702_1 to 702_5.

The asset candidate area extraction unit 205 then computes the minimum and maximum y coordinates of a rectangle enclosing each label region 702 (first to fifth label regions 702_1 to 702_5) and takes these as the upper-left and lower-right y coordinates of the asset candidate position. A fixed offset R3 may be applied to the height computed from these y coordinates. In this embodiment R3 = 0, but as with the x coordinates, the y coordinates may be widened outward evenly or narrowed inward, or only one side may be widened or narrowed.

Finally, the asset candidate area extraction unit 205 combines the x and y coordinates obtained above into the asset candidate areas 703 (first to fifth asset candidate areas 703_1 to 703_5). FIG. 7 shows the composite image 704 obtained by projecting the asset candidate areas 703 onto the input image 301.
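Continuing the sketches above, the following derives each candidate's x extent from its group's lines and its y extent from the AND-dilate-erode-label chain. The kernel sizes and repetition counts follow the example in the text; drawing the representative lines with cv2.line and matching label regions to representatives by x overlap are simplifying assumptions.

```python
import numpy as np
import cv2

R2, R3 = 20, 0  # offsets used in this embodiment

# Draw the representative lines (thickness 2) to form the representative line image 601.
rep_image = np.zeros_like(edge_image)
for rho, theta in representatives:
    a, b = np.cos(theta), np.sin(theta)
    p1 = (int(a * rho - 2000 * b), int(b * rho + 2000 * a))
    p2 = (int(a * rho + 2000 * b), int(b * rho - 2000 * a))
    cv2.line(rep_image, p1, p2, 255, thickness=2)

# AND with the edge image, then dilate (5x5, three times) and erode (1x5, three times).
anded = cv2.bitwise_and(edge_image, rep_image)
anded = cv2.dilate(anded, np.ones((5, 5), np.uint8), iterations=3)
anded = cv2.erode(anded, np.ones((1, 5), np.uint8), iterations=3)

# Label the result (image 701) and keep the largest region per representative line.
n, labels, stats, _ = cv2.connectedComponentsWithStats(anded)
candidates = []
for group in groups:
    xs = [x_at_row(rho, theta, h / 2) for rho, theta in group]
    x_min, x_max = int(min(xs)) - R2 // 2, int(max(xs)) + R2 // 2
    overlapping = [i for i in range(1, n)
                   if stats[i, cv2.CC_STAT_LEFT] < x_max
                   and stats[i, cv2.CC_STAT_LEFT] + stats[i, cv2.CC_STAT_WIDTH] > x_min]
    if overlapping:
        best = max(overlapping, key=lambda i: stats[i, cv2.CC_STAT_AREA])
        y_min = stats[best, cv2.CC_STAT_TOP] - R3
        y_max = y_min + stats[best, cv2.CC_STAT_HEIGHT] + 2 * R3
        candidates.append((x_min, y_min, x_max, y_max))
# candidates corresponds to the asset candidate areas 703.
```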
FIG. 8 shows an example of the asset identification performed by the asset identification unit 207. As illustrated, asset identification by deep learning is applied to the composite image 704, in which the asset candidate areas 703 extracted by the asset candidate area extraction unit 205 are projected onto the input image 301, using the identification database unit 206, which stores a large number of asset images captured beforehand in a variety of scenes; noise regions are excluded as a result.

That is, the identification database unit 206 collects image data of the detection target assets (for example, utility poles and signs) and of non-asset regions (regions other than the detection target assets), and learning is performed without hand-crafted features by machine learning such as deep learning. Using the extracted regions as training data makes it possible to automate the creation of training data and to improve accuracy.

In the example of FIG. 8, the asset identification unit 207 identifies the utility poles corresponding to the first and third asset candidate areas 703_1 and 703_3 as detection target assets, treats everything else as noise, and obtains the identification image 801. In this embodiment, the identification target (detection target asset) may also be a road sign, traffic light, electric light, street light, road marking such as a white line, electric wire, bridge pier, pipe, building pillar, lintel, top board, or other asset; identification is not limited to the two classes of detection target and non-detection target but can be extended to multi-class identification, and the approach is applicable to any object characterized by straight lines. Instead of deep learning, a general machine learning method such as an SVM (Support Vector Machine) may be used once some feature set has been decided; a wide range of features can be used, from rule-based ones such as the presence or absence of electric wires or signboards to general-purpose features such as HoG features. This candidate region extraction means operates faster than the conventional SS method and produces fewer false alarms.
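As a hedged sketch of the SVM alternative mentioned above (the primary embodiment uses deep learning), the following classifies each candidate crop with HOG features; the window and cell sizes are assumptions, and the training crops and labels are assumed to come from the identification database unit 206.

```python
import cv2
import numpy as np

HOG = cv2.HOGDescriptor((64, 128), (16, 16), (8, 8), (8, 8), 9)

def hog_features(crop):
    return HOG.compute(cv2.resize(crop, (64, 128))).flatten()

def train_asset_classifier(train_crops, train_labels):
    # train_crops: grayscale images; train_labels: 1 = target asset, 0 = noise.
    svm = cv2.ml.SVM_create()
    svm.setType(cv2.ml.SVM_C_SVC)
    svm.setKernel(cv2.ml.SVM_RBF)
    X = np.array([hog_features(c) for c in train_crops], dtype=np.float32)
    y = np.array(train_labels, dtype=np.int32)
    svm.train(X, cv2.ml.ROW_SAMPLE, y)
    return svm

def identify(svm, gray, candidates):
    kept = []
    for (x1, y1, x2, y2) in candidates:
        feat = hog_features(gray[y1:y2, x1:x2]).reshape(1, -1).astype(np.float32)
        _, pred = svm.predict(feat)
        if int(pred[0, 0]) == 1:  # identified as a detection target asset
            kept.append((x1, y1, x2, y2))
    return kept
```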
The asset state estimation unit 208 estimates the angle and three-dimensional position of an asset identified as a detection target asset by the asset identification unit 207. The position and angle of the asset may be estimated from the image by coordinate transformation, the Hough transform, or the like, or by a system that additionally uses three-dimensional information obtained from ToF (Time of Flight) sensors or stereo cameras, position sensors such as GPS (Global Positioning System), and angle sensors such as a spirit level. Furthermore, to efficiently create the training data needed by the asset identification unit 207, the image processing unit 103 can feed the image group stored in the identification database unit 206 into the edge extraction unit 201 and use the asset candidate areas and asset areas obtained by the asset candidate area extraction unit 205 and the asset identification unit 207 as training data.
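As one concrete possibility for the image-based angle estimate, the tilt of a pole from vertical can be read directly off its representative line's Hough angle; treating that angle as the asset's tilt is an assumption, and the text equally allows sensor-based estimation.

```python
import numpy as np

def tilt_from_vertical_degrees(theta):
    # theta: Hough angle of the representative line (theta = 0 is vertical).
    deg = np.degrees(theta)
    return deg if deg <= 90 else deg - 180  # signed tilt in degrees
```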
 FIG. 9 is a flowchart focusing mainly on the processing performed by the image processing unit 103 of the image monitoring system 100; the processing of the image processing unit 103 is briefly summarized below with reference to this flowchart.
 First, as image input processing, the video acquisition unit 102 acquires image data from the imaging device 101 or the recording device 110 and outputs it to the edge extraction unit 201 (S10). The edge extraction unit 201 applies predetermined filtering to the image data acquired from the video acquisition unit 102 and performs edge extraction processing and binarization processing (S12). The straight line extraction unit 202 performs straight line detection processing on the edge image 302 extracted by the edge extraction unit 201 and acquires a straight line image 401 (S14).
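 The patent does not name the operators used in S12 and S14; as one plausible concretization, the following OpenCV sketch uses Canny edge extraction and a probabilistic Hough transform, with illustrative thresholds and a stand-in file name.

```python
# Hedged sketch of S10-S14 with OpenCV; file name and thresholds are assumptions.
import cv2
import numpy as np

frame = cv2.imread("input.png")              # image input (S10)
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray, 50, 150)             # edge extraction + binarization (S12)
lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180,
                        threshold=80, minLineLength=60,
                        maxLineGap=5)        # straight line detection (S14)
# lines: N x 1 x 4 array of segment endpoints (x1, y1, x2, y2), or None
```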
 Subsequently, the straight line classification unit 203 performs straight line grouping processing, collecting straight lines that lie near one another into one group (S16). Next, the processing of the representative straight line selection unit 204 and the asset candidate region extraction unit 205 is performed as asset region extraction processing (S18): the representative straight line selection unit 204 extracts a representative straight line for each group, and the asset candidate region extraction unit 205 performs labeling processing and the like to extract an asset candidate region corresponding to each representative straight line.
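 One simple way to realize the grouping of S16 and the representative-line selection of S18 is by midpoint proximity and segment length; the sketch below follows that reading, and the distance threshold is an assumption.

```python
# Illustrative sketch of S16/S18: group segments by midpoint proximity, then
# keep the longest segment of each group as its representative line.
# segments: iterable of (x1, y1, x2, y2), e.g. lines[:, 0] from HoughLinesP.
import numpy as np

def group_lines(segments, max_dist=30.0):
    groups = []                               # each: {"mid": ..., "segs": [...]}
    for x1, y1, x2, y2 in segments:
        mid = np.array([(x1 + x2) / 2.0, (y1 + y2) / 2.0])
        for g in groups:
            if np.linalg.norm(mid - g["mid"]) < max_dist:
                g["segs"].append((x1, y1, x2, y2))
                break
        else:
            groups.append({"mid": mid, "segs": [(x1, y1, x2, y2)]})
    return groups

def representative(group):
    """The longest segment stands in for the whole group."""
    return max(group["segs"],
               key=lambda s: np.hypot(s[2] - s[0], s[3] - s[1]))
```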
 Further, the asset identification unit 207 refers to the identification database unit 206 and identifies the detection target assets among the asset candidate areas (S20). The asset state estimation unit 208 then performs state estimation (estimation of the angle and the three-dimensional position) of the identified detection target assets (S22).
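 Chaining the sketches above gives one possible end-to-end reading of S10 through S22; extract_edges(), detect_lines(), candidate_region(), identify(), and estimate_state() are hypothetical stand-ins for the operators sketched earlier, the labeling-based region extraction, the database lookup, and the angle/position estimation.

```python
# Hypothetical glue code for S10-S22, reusing the sketches above.
def process_frame(frame):
    edges = extract_edges(frame)                    # S12 (e.g., Canny)
    segments = detect_lines(edges)                  # S14 (e.g., HoughLinesP)
    groups = group_lines(segments)                  # S16
    reps = [representative(g) for g in groups]      # S18 (selection)
    regions = [candidate_region(r) for r in reps]   # S18 (labeling/extraction)
    assets = [r for r in regions if identify(r)]    # S20 (database lookup)
    return [estimate_state(a) for a in assets]      # S22
```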
 As described above, according to the present embodiment, a plurality of asset candidate areas appearing in an image can be detected individually from the video input from the imaging device 101 or the recorded video of the recording device 110, without being affected by moving objects or the like. As a result, a detection technique is obtained that has an automatic asset-region detection function robust against noise such as changes in sunlight. In other words, this detection technique makes it possible to record suitable asset areas, to improve the reporting function when an asset is detected, and to appropriately inform the observer through the monitor display.
 The present invention has been described above on the basis of an embodiment. This embodiment is an exemplification, and it will be understood by those skilled in the art that various modifications of the combinations of its components are possible and that such modifications are also within the scope of the present invention.
100 image monitoring system, 101 imaging device, 102 video acquisition unit, 103 image processing unit, 104 data communication unit, 105 recording control unit, 106 display control unit, 107 reporting device, 108 instruction device, 109 display output device, 110 recording device, 111 calculation device, 112 output device, 201 edge extraction unit, 202 straight line extraction unit, 203 straight line classification unit, 204 representative straight line selection unit, 205 asset candidate region extraction unit, 206 identification database unit, 207 asset identification unit, 208 asset state estimation unit

Claims (4)

  1.  A video monitoring device that detects an asset area in a monitoring area from an input image acquired by a video acquisition unit, the device comprising:
     an edge extraction unit that acquires an edge image from the input image;
     a straight line extraction unit that extracts straight lines from the edge image;
     a straight line classification unit that classifies the extracted straight lines into a plurality of groups;
     a representative straight line selection unit that selects one representative straight line from each of the classified groups;
     an asset candidate region extraction unit that extracts asset candidate areas using the representative straight lines;
     an identification database unit that stores an image group necessary for identifying the asset candidate areas; and
     an asset identification unit that identifies a detection target asset from the asset candidate areas with reference to the identification database unit.
  2.  The video monitoring device according to claim 1, wherein the asset candidate region extraction unit detects the region position of each asset candidate area.
  3.  The video monitoring device according to claim 1, further comprising an asset state estimation unit that performs state estimation of the detection target asset identified by the asset identification unit,
     wherein the asset state estimation unit calculates at least either the angle or the three-dimensional position of the detection target asset.
  4.  The video monitoring device according to claim 2, further comprising an asset state estimation unit that performs state estimation of the detection target asset identified by the asset identification unit,
     wherein the asset state estimation unit calculates at least either the angle or the three-dimensional position of the detection target asset.
PCT/JP2017/043746 2016-12-15 2017-12-06 Video monitoring device WO2018110377A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2018556603A JP6831396B2 (en) 2016-12-15 2017-12-06 Video monitoring device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2016243402 2016-12-15
JP2016-243402 2016-12-15

Publications (1)

Publication Number Publication Date
WO2018110377A1 true WO2018110377A1 (en) 2018-06-21

Family

ID=62558664

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2017/043746 WO2018110377A1 (en) 2016-12-15 2017-12-06 Video monitoring device

Country Status (2)

Country Link
JP (1) JP6831396B2 (en)
WO (1) WO2018110377A1 (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH05215547A (en) * 1992-02-06 1993-08-24 Nippon Telegr & Teleph Corp <Ntt> Method for determining corresponding points between stereo images
JPH0694442A (en) * 1992-09-17 1994-04-05 Nippon Telegr & Teleph Corp <Ntt> Apparatus for measuring bent degree of utility pole
JP2011080845A (en) * 2009-10-06 2011-04-21 Topcon Corp Method and apparatus for creating three-dimensional data
JP2015088168A (en) * 2013-09-25 2015-05-07 国際航業株式会社 Learning sample creation device, learning sample creation program, and automatic recognition device

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2020135776A (en) * 2019-02-26 2020-08-31 プラス株式会社 Imaging device
CN111724440A (en) * 2020-05-27 2020-09-29 杭州数梦工场科技有限公司 Orientation information determining method and device of monitoring equipment and electronic equipment
CN111724440B (en) * 2020-05-27 2024-02-02 杭州数梦工场科技有限公司 Method and device for determining azimuth information of monitoring equipment and electronic equipment

Also Published As

Publication number Publication date
JPWO2018110377A1 (en) 2019-10-24
JP6831396B2 (en) 2021-02-17

Similar Documents

Publication Publication Date Title
CN109886130B (en) Target object determination method and device, storage medium and processor
RU2484531C2 (en) Apparatus for processing video information of security alarm system
JP6144656B2 (en) System and method for warning a driver that visual recognition of a pedestrian may be difficult
JP6764481B2 (en) Monitoring device
CN110650316A (en) Intelligent patrol and early warning processing method and device, electronic equipment and storage medium
JP2017531883A (en) Method and system for extracting main subject of image
CN104966304A (en) Kalman filtering and nonparametric background model-based multi-target detection tracking method
JP2014059875A5 (en)
Lin et al. Collaborative pedestrian tracking and data fusion with multiple cameras
KR20140095333A (en) Method and apparratus of tracing object on image
Ozcelik et al. A vision based traffic light detection and recognition approach for intelligent vehicles
US20110280442A1 (en) Object monitoring system and method
CN111967396A (en) Processing method, device and equipment for obstacle detection and storage medium
CN111382637A (en) Pedestrian detection tracking method, device, terminal equipment and medium
CN114140745A (en) Method, system, device and medium for detecting personnel attributes of construction site
WO2018110377A1 (en) Video monitoring device
CN113920585A (en) Behavior recognition method and device, equipment and storage medium
Kumar et al. Traffic surveillance and speed limit violation detection system
EP3044734B1 (en) Isotropic feature matching
Noman et al. An optimized and fast scheme for real-time human detection using raspberry pi
CN113052139A (en) Deep learning double-flow network-based climbing behavior detection method and system
KR102150661B1 (en) Method and Apparatus for Preprocessing Image for Detecting Objects
WO2022198507A1 (en) Obstacle detection method, apparatus, and device, and computer storage medium
Tang Development of a multiple-camera tracking system for accurate traffic performance measurements at intersections
Lan et al. A new vehicle detection algorithm for real-time image processing system

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17881642

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2018556603

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17881642

Country of ref document: EP

Kind code of ref document: A1