JP2024072568A - Information processing program, information processing device, information processing method, and model generation method

Info

Publication number: JP2024072568A
Application number: JP2022183469A
Authority: JP
Inventors: 誠一郎近藤; 逸夫熊澤; 賢登立野; ミャグマルスレンナイダンスレン; ヤーンジェンチュワンゾウ
Original assignee: Tokyo Electric Power Co Inc; Tokyo Institute of Technology NUC; Tokyo Electric Power Co Holdings Inc
Current assignee: Tokyo Institute of Technology NUC; Tokyo Electric Power Co Holdings Inc
Priority date: 2022-11-16
Filing date: 2022-11-16
Publication date: 2024-05-28

Abstract

[Problem] To provide an information processing program, an information processing device, an information processing method, and a model generation method for deriving the distance between power transmission and distribution equipment and surrounding objects. [Solution] The information processing program acquires a plurality of images that are captured by a stereo camera and include power transmission and distribution equipment as a subject; inputs the acquired images into a first model, trained to generate depth information when a plurality of images are input, to generate depth information; inputs an acquired image into a second model, trained to classify power transmission and distribution equipment and surrounding objects when an image is input, to acquire a classification of the power transmission and distribution equipment and the surrounding objects; and derives the distance between the power transmission and distribution equipment and the surrounding objects based on the acquired classification and the depth information. [Selected Figure] Figure 12

Description

The present invention relates to an information processing program, an information processing device, an information processing method, and a model generation method for deriving the distance between power transmission and distribution equipment and surrounding objects.

When power transmission and distribution equipment, particularly overhead distribution lines, runs close to buildings, roads, railways, trees, and the like, a clearance distance is prescribed so that these surrounding objects neither touch nor sever the overhead distribution line. However, changes in the environment around an overhead distribution line can cause new surrounding objects to appear or the clearance distance to shrink. It is therefore necessary to inspect periodically whether the clearance distance between the overhead distribution line and surrounding objects still complies with the prescribed distance.

In relation to these circumstances, Patent Document 1 proposes a patrol inspection system that captures images of abnormal locations caused by a crane, flying object, tree, or the like coming into contact with an overhead distribution line from its side.

JP 2018-74757 A

However, although the above known technique can check whether an overhead distribution line is in contact with a surrounding object, it cannot derive the clearance distance between the overhead distribution line and the surrounding object. The present invention has been made in view of this situation. Its purpose is to provide an information processing program, an information processing device, an information processing method, and a model generation method for deriving the distance between power transmission and distribution equipment and surrounding objects.

An information processing program according to one aspect of the present application acquires a plurality of images that are captured by a stereo camera and include power transmission and distribution equipment as a subject; inputs the acquired images into a first model, trained to generate depth information when a plurality of images are input, to generate depth information; inputs an acquired image into a second model, trained to classify power transmission and distribution equipment and surrounding objects when an image is input, to acquire a classification of the power transmission and distribution equipment and the surrounding objects; and derives the distance between the power transmission and distribution equipment and the surrounding objects based on the acquired classification and the depth information.

According to one aspect of the present application, it is possible to derive the distance between power transmission and distribution equipment and surrounding objects.

FIG. 1 is an explanatory diagram showing an example of the configuration of a determination system.
FIG. 2 is a block diagram showing an example of the hardware configuration of a determination server.
FIG. 3 is a block diagram showing an example of the hardware configuration of a user terminal.
FIG. 4 is an explanatory diagram showing an example of a setting DB.
FIG. 5 is an explanatory diagram showing an example of an image position DB.
FIG. 6 is an explanatory diagram showing an example of a determination result DB.
FIG. 7 is an explanatory diagram showing an example of the configuration of a first model.
FIG. 8 is an explanatory diagram showing an example of the configuration of a feature extraction module.
FIG. 9 is an explanatory diagram showing an example of the configuration of a second model.
FIG. 10 is a flowchart showing an example of the procedure of a first model generation process.
FIG. 11 is a flowchart showing an example of the procedure of a second model generation process.
FIG. 12 is a flowchart showing an example of the procedure of a determination process.
FIG. 13 is a flowchart showing an example of the procedure of a determination process.
FIG. 14 is an explanatory diagram showing an example of a warning screen.

(Embodiment 1)
An embodiment is described below with reference to the drawings. FIG. 1 is an explanatory diagram showing an example of the configuration of a determination system. The determination system 100 includes a determination server 1, a user terminal 2, a stereo camera 3, and a position measuring device 4. The determination server 1 and the user terminal 2 are communicatively connected via a network N. The stereo camera 3 and the position measuring device 4 are communicatively connected to the user terminal 2.

Power transmission and distribution equipment is a concept that covers both power transmission equipment and power distribution equipment. The following description focuses on overhead distribution lines, but the same applies to overhead transmission lines. Below, an overhead distribution line is also referred to simply as a distribution line. The target is not limited to power transmission and distribution equipment; communication equipment, in particular communication lines or optical fibers, may also be targeted.

The determination server 1 is configured as a server computer, a workstation, a PC (Personal Computer), or the like. The determination server 1 may also be configured as a multi-computer consisting of multiple computers, a virtual machine constructed virtually by software, or a quantum computer. Furthermore, the functions of the determination server 1 may be realized by a cloud service.

The user terminal 2 is a terminal used by an end user. The user terminal 2 is configured as a notebook PC, a tablet computer, a smartphone, or the like. Only one user terminal 2 is shown in FIG. 1, but there may be two or more.

The stereo camera 3 is a camera equipped with two imaging units, each including an image sensor and an optical system. By capturing a subject simultaneously with two imaging units at different positions, the stereo camera 3 can also record information in the depth direction. A stereo camera 3 is assumed to be provided for each user terminal 2, but this is not a requirement as long as it causes no operational problem. The stereo camera 3 may also have the functions carried out by the user terminal 2. In addition, the stereo camera 3 may include the first model, the second model, and so on, described later, and have a function for deriving the distance between a distribution line and its surrounding objects.

The position measuring device 4 is a device that measures the geographic coordinates of the current position. For example, the position measuring device 4 receives radio waves from satellites that make up a satellite positioning system, such as GPS (Global Positioning System) satellites, quasi-zenith satellites, GLONASS (Global Navigation Satellite System) satellites, and Galileo satellites, and obtains the current position. The purpose of position measurement is to attach the shooting position to the images captured by the stereo camera 3. The position measuring device 4 is preferably housed in the casing of the stereo camera 3. Alternatively, the user terminal 2 may be equipped with the position measuring device 4, provided that the shooting position can still be attached to the images.

FIG. 2 is a block diagram showing an example of the hardware configuration of the determination server. The determination server 1 includes a control unit 11, a main memory unit 12, an auxiliary storage unit 13, a communication unit 15, and a reading unit 16. The control unit 11, the main memory unit 12, the auxiliary storage unit 13, the communication unit 15, and the reading unit 16 are connected by a bus B.

The control unit 11 has one or more arithmetic processing devices such as a CPU (Central Processing Unit), an MPU (Micro-Processing Unit), or a GPU (Graphics Processing Unit). By reading and executing a control program 1P (program, program product) stored in the auxiliary storage unit 13, the control unit 11 performs the various information processing and control processing of the determination server 1 and realizes functional units such as a first acquisition unit, a second acquisition unit, a generation unit, and a derivation unit.

The main memory unit 12 is an SRAM (Static Random Access Memory), a DRAM (Dynamic Random Access Memory), a flash memory, or the like. The main memory unit 12 mainly stores, on a temporary basis, the data that the control unit 11 needs to execute arithmetic processing.

The auxiliary storage unit 13 is a hard disk, an SSD (Solid State Drive), or the like, and stores the control program 1P and the various DBs (databases) that the control unit 11 needs to execute processing. The auxiliary storage unit 13 stores a setting DB 131, a captured image DB 132, an image position DB 133, and a determination result DB 134. The auxiliary storage unit 13 also stores a first model 141 and a second model 142. The auxiliary storage unit 13 may be an external storage device connected to the determination server 1. The various DBs and other data stored in the auxiliary storage unit 13 may instead be stored in a database server or cloud storage separate from the determination server 1.

The communication unit 15 communicates with the user terminal 2 via the network N. The control unit 11 may also use the communication unit 15 to download the control program 1P from another computer via the network N or the like and store it in the auxiliary storage unit 13.

The reading unit 16 reads a portable storage medium 1a, such as a CD (Compact Disc)-ROM or a DVD (Digital Versatile Disc)-ROM. The control unit 11 may read the control program 1P from the portable storage medium 1a via the reading unit 16 and store it in the auxiliary storage unit 13. The control unit 11 may also read the control program 1P from a semiconductor memory 1b.

FIG. 3 is a block diagram showing an example of the hardware configuration of the user terminal. The user terminal 2 includes a control unit 21, a main memory unit 22, an auxiliary storage unit 23, a communication unit 24, an input unit 25, a display unit 26, and a serial communication unit 27. The components are connected by a bus B.

The control unit 21 has one or more arithmetic processing devices such as a CPU, an MPU, or a GPU. The control unit 21 provides various functions by reading and executing a control program 2P (program, program product) stored in the auxiliary storage unit 23.

The main memory unit 22 is an SRAM, a DRAM, a flash memory, or the like. The main memory unit 22 mainly stores, on a temporary basis, the data that the control unit 21 needs to execute arithmetic processing.

The auxiliary storage unit 23 is a hard disk, an SSD, or the like, and stores the various data that the control unit 21 needs to execute processing. The auxiliary storage unit 23 may store stereo images acquired from the stereo camera 3 and position information acquired from the position measuring device 4. The auxiliary storage unit 23 may be an external storage device connected to the user terminal 2. The various DBs and other data stored in the auxiliary storage unit 23 may instead be stored in a database server or cloud storage.

The communication unit 24 communicates with the determination server 1 via the network N. The control unit 21 may also use the communication unit 24 to download the control program 2P from another computer via the network N or the like and store it in the auxiliary storage unit 23.

The input unit 25 is a keyboard, a mouse, or the like. The display unit 26 includes a liquid crystal display panel or the like. The display unit 26 displays the determination results and other information output by the determination server 1. The input unit 25 and the display unit 26 may be integrated to form a touch panel display. The user terminal 2 may also output its display to an external display device.

The serial communication unit 27 is a communication interface that performs serial communication with other devices. The serial communication unit 27 performs wired communication conforming to the USB (Universal Serial Bus) standard and wireless communication conforming to the Bluetooth (registered trademark) standard or the like. The serial communication unit 27 acquires stereo images from the stereo camera 3 and position information from the position measuring device 4.

FIG. 4 is an explanatory diagram showing an example of the setting DB. The setting DB 131 stores thresholds for the clearance distance. The setting DB 131 includes an equipment column, a classification column, and a threshold column. The equipment column stores the classification of the power transmission equipment. The classification column stores the classification of the surrounding object. The threshold column stores the threshold that applies to that surrounding object. The example in FIG. 4 indicates that the clearance distance between a building and a distribution line must be 2 m or more. The setting DB 131 also stores thresholds such as the clearance distance between the ground and a distribution line, between a road and a distribution line, and between a distribution line or utility pole and a signboard installed on a building.
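The lookup that the setting DB supports can be sketched as follows. This is a minimal illustration, not the patent's implementation: the dictionary `SETTING_DB`, the function `lookup_threshold`, and the string labels are hypothetical names, and only the building / distribution line value of 2 m comes from the text.

```python
from typing import Optional

# Hypothetical in-memory stand-in for the setting DB 131.
# Key: (equipment, surrounding-object classification); value: threshold in metres.
# Only the building / distribution line row (2 m) is taken from the description.
SETTING_DB = {
    ("distribution line", "building"): 2.0,
}


def lookup_threshold(equipment: str, classification: str) -> Optional[float]:
    """Return the clearance threshold in metres, or None if no row is registered."""
    return SETTING_DB.get((equipment, classification))
```

Returning `None` for an unregistered pair leaves it to the caller to decide whether a missing row means "no requirement" or "configuration error"; the description does not say which applies.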

The captured image DB 132 (not shown) stores the stereo images captured by the stereo camera 3. The stereo images stored in the captured image DB 132 may be still images or video. The image data format is not particularly limited, but the data must allow each left image to be matched with its right image. For still images, the data format may conform to DC-006, for example. DC-006 is a stereo still image format for digital still cameras established by CIPA (Camera & Imaging Products Association). For video, the right and left images are captured with synchronized time codes. The right and left video data are stored in the captured image DB 132 as independent data, but the time code makes it possible to obtain, frame by frame, an image pair consisting of a corresponding left image and right image, that is, a stereo image.
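As a rough sketch of how time codes could pair independently stored left and right video frames, the function below matches frames that share a time code. The name `pair_frames_by_timecode` and the dictionary-based frame store are assumptions made for illustration; the description only states that frames are matched by time code.

```python
def pair_frames_by_timecode(left_frames, right_frames):
    """Pair left and right frames that share a time code.

    Each argument maps a time code string (for example "00:00:01:05") to a
    frame object. Returns a list of (time_code, left_frame, right_frame)
    tuples ordered by time code. Frames without a partner are skipped.
    Ordering relies on zero-padded, fixed-width time code strings.
    """
    common = sorted(set(left_frames) & set(right_frames))
    return [(tc, left_frames[tc], right_frames[tc]) for tc in common]
```

Each returned tuple corresponds to one stereo image in the sense used above.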

FIG. 5 is an explanatory diagram showing an example of the image position DB. The image position DB 133 stores the geographic coordinates of the position where each image was captured. The image position DB 133 includes an image name column, a time code column, a longitude column, and a latitude column. The image name column stores a name that identifies the image. A numerical value may be used as long as it identifies the image. The time code column stores the time code when the image is video. When the image is a still image, the value of the time code column is undefined. The longitude column stores the longitude of the shooting position. The latitude column stores the latitude of the shooting position. The latitude and longitude of the shooting position are obtained from the position measuring device 4. The data on which the image position DB 133 is based is generated by the user terminal 2. For still images, each time the user terminal 2 receives a still image from the stereo camera 3, it stores the geographic coordinates output by the position measuring device 4 in association with that still image. For video, the user terminal 2 stores the time code acquired from the stereo camera 3 in association with the geographic coordinates output by the position measuring device 4.
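One way to picture a row of the image position DB is the record type below. It is a sketch under assumptions: the class `ImagePosition` and the helpers `record_still` and `record_video_frame` are hypothetical, and the undefined time code of a still image is represented here as `None`.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ImagePosition:
    """One row of a hypothetical image position table."""
    image_name: str
    time_code: Optional[str]  # None for still images (value is undefined in the DB)
    longitude: float
    latitude: float


def record_still(image_name: str, fix: Tuple[float, float]) -> ImagePosition:
    """Associate a still image with a (longitude, latitude) fix."""
    lon, lat = fix
    return ImagePosition(image_name, None, lon, lat)


def record_video_frame(image_name: str, time_code: str,
                       fix: Tuple[float, float]) -> ImagePosition:
    """Associate a video time code with a (longitude, latitude) fix."""
    lon, lat = fix
    return ImagePosition(image_name, time_code, lon, lat)
```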

FIG. 6 is an explanatory diagram showing an example of the determination result DB. The determination result DB 134 stores the determination results for the clearance distance between power distribution equipment and surrounding objects. The determination result DB 134 includes an image name column, a time code column, an object ID column, an object classification column, a distribution line ID column, a clearance distance column, and a determination column. The image name column stores the name of the processed image. The time code column stores the time code of the image used for the determination when the image is video. When the image is a still image, the value of the time code column is undefined. The object ID column stores the ID of an object that appears in the image and was subject to the clearance distance determination. The object classification column stores the classification of that object. The distribution line ID column stores the ID of a distribution line that appears in the image and was subject to the clearance distance determination. The clearance distance column stores the derived clearance distance. The determination column stores the result of determining whether the clearance distance is equal to or greater than the threshold. For example, if the clearance distance is equal to or greater than the threshold, the determination column stores OK. If the clearance distance is less than the threshold, the determination column stores NG.
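The OK/NG rule for the determination column is simple enough to state as code. The function name `judge` is an assumption made for illustration; the comparison itself follows the text, including the boundary case where the distance equals the threshold.

```python
def judge(clearance_m: float, threshold_m: float) -> str:
    """Return "OK" if the clearance is at or above the threshold, else "NG"."""
    return "OK" if clearance_m >= threshold_m else "NG"
```

For the 2 m building threshold mentioned for the setting DB, `judge(2.0, 2.0)` yields OK because the rule is "equal to or greater than".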

FIG. 7 is an explanatory diagram showing an example of the configuration of the first model. The first model 141 is a learning model trained to output a disparity map when a stereo image is input. The first model 141 is a modified version of PSMNet (Pyramid Stereo Matching Network). The first model 141 includes two feature extraction modules 1411 and a disparity regression module 1412. The feature extraction module 1411 detects objects included in the input image. One feature extraction module 1411 processes the right image and the other processes the left image, and the two modules share weights. The first model 141 creates a four-dimensional cost volume from the outputs of the two feature extraction modules 1411. The cost indicates, for each image, the degree of match between the left and right images that make up the stereo image. The cost volume is created by shifting the right image relative to the left image one pixel at a time in the width direction, up to the maximum disparity, and concatenating the resulting images. The cost volume therefore has four dimensions: D (depth) × H (height) × W (width) × C (cost). The disparity regression module 1412 creates a disparity map (depth information) from the four-dimensional cost volume and outputs it. The first model 141 is not limited to the configuration shown in FIG. 7 and may have any other configuration as long as it is a learning model trained to output a disparity map when a stereo image is input.
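The shift-and-concatenate construction of the cost volume can be illustrated with NumPy. This is a sketch under assumptions, not the first model's actual code: the function name `build_cost_volume`, the (H, W, C) feature layout, zero filling of columns that have no counterpart, and the use of disparities 0 to `max_disparity - 1` are all choices made for this example.

```python
import numpy as np


def build_cost_volume(left_feat, right_feat, max_disparity):
    """Build a 4-D cost volume by shifting and concatenating feature maps.

    left_feat, right_feat: arrays of shape (H, W, C).
    Returns an array of shape (D, H, W, 2 * C) with D = max_disparity.
    """
    h, w, c = left_feat.shape
    volume = np.zeros((max_disparity, h, w, 2 * c), dtype=left_feat.dtype)
    for d in range(max_disparity):
        if d == 0:
            volume[d, :, :, :c] = left_feat
            volume[d, :, :, c:] = right_feat
        else:
            # A left-image pixel at column x is compared with column x - d of
            # the right image, so columns 0..d-1 have no counterpart and stay zero.
            volume[d, :, d:, :c] = left_feat[:, d:, :]
            volume[d, :, d:, c:] = right_feat[:, :-d, :]
    return volume
```

In this layout the last axis holds the concatenated left and right features, so its length is twice the channel count of a single feature map.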

FIG. 8 is an explanatory diagram showing an example of the configuration of the feature extraction module. The feature extraction module 1411 includes a modified FPN (Feature Pyramid Network) 14111, a concatenation layer 14112, and a convolution layer 14113. Like a conventional FPN, the modified FPN 14111 includes a multi-scale CNN (Convolutional Neural Network) encoder (extraction encoder EC) followed, in its latter half, by a CNN decoder (aggregation decoder DC) that aggregates image features at multiple scales in the multi-scale feature pyramid manner. In the CNN decoder, the layer that aggregates image features at each scale is an upsampling layer UP. A conventional FPN includes skip connections that link the CNN encoder and the CNN decoder at each scale, so that the FPN as a whole forms an hourglass network. A conventional FPN outputs features at every scale, whereas the modified FPN 14111 uses only the features output by the bottom layer. In the aggregation decoder DC, only the bottom upsampling layer BUP outputs features to the outside of the aggregation decoder DC. MobileNet is adopted as the backbone to lighten the processing. The concatenation layer 14112 concatenates the output of the bottom layer of the encoder (MobileNet) with the output of the bottom layer BUP of the decoder. The convolution layer 14113 performs a convolution operation on the data output by the concatenation layer 14112. The feature extraction module 1411 outputs features that combine the output of the convolution layer 14113 with the output of the bottom layer of the encoder. The feature extraction module 1411 is not limited to the configuration shown in FIG. 8 and may have any other configuration as long as it is a learning model trained to output, when a right image and a left image are input, features from which a four-dimensional cost volume can be created.

FIG. 9 is an explanatory diagram showing an example of the configuration of the second model. The second model 142 is a model that combines multiple learning models. The second model 142 includes an encoder 1421, a first decoder 1422, a second decoder 1423, and a third decoder 1424.

The encoder 1421 and the first decoder 1422 constitute a U-Net. U-Net is a learning model that performs semantic segmentation. In this embodiment, the U-Net classifies subjects into electric wires, including distribution lines; utility poles; and other objects. Other objects include buildings, trees, roads, footbridges, cableways, and the like. Hereinafter, the U-Net in this embodiment is referred to as the segmentation model. The segmentation model is not limited to U-Net and may be any other learning model capable of classifying objects in an image.

The encoder 1421 and the second decoder 1423 constitute YOLinO. In some layers, the second decoder 1423 has skip connections with the first decoder 1422. YOLinO is a model that detects polylines in real time. In this embodiment, YOLinO is used specifically to detect distribution lines. Hereinafter, the YOLinO in this embodiment is referred to as the linear object detection model. The linear object detection model is not limited to YOLinO and may be any other learning model capable of detecting linear objects in an image.

The encoder 1421 and the third decoder 1424 constitute YOLO. YOLO is a model that excels at detecting cuboid-shaped objects. In this embodiment, YOLO is used specifically to detect utility poles. Hereinafter, the YOLO in this embodiment is referred to as the utility pole detection model. The utility pole detection model is not limited to YOLO and may be any other learning model capable of detecting rod-shaped objects such as utility poles in an image. The second model 142 does not have to include all three of the segmentation model, the linear object detection model, and the utility pole detection model. It may consist of the segmentation model alone, of the segmentation model and the linear object detection model, or of the segmentation model and the utility pole detection model. In addition, although the segmentation model, the linear object detection model, and the utility pole detection model share the encoder 1421 in this configuration, they need not share it.

The outputs of the segmentation model, the linear object detection model, and the utility pole detection model are input to the discriminator 11b, which determines the classification of each subject. In determining the classification, the result of the linear object detection model takes priority over that of the segmentation model for power distribution lines, and the result of the utility pole detection model takes priority over that of the segmentation model for utility poles. The discriminator 11b outputs a classified image annotated with the determined classifications. The classified image and the disparity map are input to the derivation unit 11c. Based on the classified image and the disparity map, the derivation unit 11c derives the separation distance between the power distribution line and each surrounding object. The discriminator 11b outputs a result image in which the classification of each subject is associated with its separation distance.
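
The priority rule applied by the discriminator 11b can be sketched, under stated assumptions, as a per-pixel merge. The sketch assumes that the segmentation output is a grid of class labels and that the two detectors have already been rasterized into boolean masks of the same size. It further assumes that the line detector wins where both detectors fire, and that the segmentation label is kept wherever neither detector fires. The embodiment does not specify these details, and `fuse_classifications` and the label strings are hypothetical.

```python
from typing import List

LabelMap = List[List[str]]
Mask = List[List[bool]]


def fuse_classifications(seg: LabelMap, line_mask: Mask, pole_mask: Mask) -> LabelMap:
    """Merge three model outputs, letting the detectors override segmentation."""
    fused: LabelMap = []
    for r, row in enumerate(seg):
        fused_row = []
        for c, seg_label in enumerate(row):
            if line_mask[r][c]:
                # Linear object detection model takes priority for lines.
                fused_row.append("power_distribution_line")
            elif pole_mask[r][c]:
                # Utility pole detection model takes priority for poles.
                fused_row.append("utility_pole")
            else:
                # Assumption: otherwise keep the segmentation label as is.
                fused_row.append(seg_label)
        fused.append(fused_row)
    return fused
```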

Next, the information processing performed by the determination system 100 will be described. FIG. 10 is a flowchart showing an example procedure of the first model generation process, which creates the first model 141. The control unit 11 acquires training data (step S1). The training data is a data set consisting of multiple data records. The control unit 11 selects one record to be processed (step S2). The control unit 11 performs learning (step S3): it inputs the input data (a stereo image) included in the training data to the first model 141, compares the data output by the first model 141 (a disparity map) with the correct answer data included in the training data, and optimizes parameters such as the weights between the neurons constituting the first model 141 so that the output matches the correct answer data. The control unit 11 determines whether to end the learning (step S4). For example, when all records included in the training data have been used for learning, the control unit 11 determines that the learning is to end. If the control unit 11 determines not to end the learning (NO in step S4), it returns the process to step S2 and repeats the learning. If the control unit 11 determines to end the learning (YES in step S4), it stores the learning results, such as the optimized parameter values (step S5), and ends the process.
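
The control flow of steps S1 to S5 can be illustrated with a deliberately tiny stand-in. In the sketch below, the "model" is a single weight and each record is one number with its correct answer, whereas the actual first model 141 is a neural network operating on stereo images and disparity maps. The squared-error loss, the plain gradient step, and the name `train_first_model` are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Record = Tuple[float, float]  # (input value, correct answer)


def train_first_model(records: List[Record],
                      learning_rate: float = 0.1) -> Dict[str, float]:
    """Toy version of steps S1-S5; `records` plays the role of the training data (S1)."""
    weight = 0.0  # the only parameter of the toy model
    for x, target in records:  # S2: select one record; the loop ends once all are used (S4)
        prediction = weight * x  # S3: run the model on the input data
        gradient = 2.0 * (prediction - target) * x  # slope of the squared error
        weight -= learning_rate * gradient  # S3: move the output toward the correct answer
    return {"weight": weight}  # S5: the learning result to be stored
```

For example, `train_first_model([(1.0, 2.0)] * 50)` returns a weight close to 2.0, since that value makes the toy output match the correct answer.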

The training data for the first model 141 consists of stereo images captured on site and disparity maps generated from those images. When generating a disparity map, distances measured with a ranging device such as a laser range finder may be used to ensure accuracy. Publicly available data sets, such as Driving Stereo, KITTI_2015, and Scene Flow, may be used as training data for initial learning.

FIG. 11 is a flowchart showing an example procedure of the second model generation process. The control unit 11 acquires training data (step S11). The training data is a data set consisting of multiple data records. In each record, every subject in the image is labeled with its classification. The control unit 11 selects one record to be processed (step S12). The control unit 11 trains the three models (step S13): the segmentation model, the linear object detection model, and the utility pole detection model. The control unit 11 inputs the input image (unlabeled image) included in the training data to the encoder 1421, which is common to the three models. The control unit 11 acquires the images output by the first decoder 1422 of the segmentation model, the second decoder 1423 of the linear object detection model, and the third decoder 1424 of the utility pole detection model. In the image output by the segmentation model, the subjects are annotated with classifications. The image output by the linear object detection model contains the detected linear objects. The image output by the utility pole detection model contains the detected utility poles. The control unit 11 compares the output of each model with the classification labeled for each subject, and optimizes parameters such as the weights between the neurons constituting the encoder 1421, the first decoder 1422, the second decoder 1423, and the third decoder 1424 so that each model outputs the correct answer. The control unit 11 determines whether to end the learning (step S14). For example, when all records included in the training data have been used for learning, the control unit 11 determines that the learning is to end. If the control unit 11 determines not to end the learning (NO in step S14), it returns the process to step S12 and repeats the learning. If the control unit 11 determines to end the learning (YES in step S14), it stores the learning results, such as the optimized parameter values (step S15), and ends the process. Although the three models are trained simultaneously here, they may be trained individually.

The training data for the second model 142 consists of images in which the classification of each subject is added as an annotation to on-site photographs taken during surveys or when faults occurred, or to videos captured on site. Publicly available data sets, such as ImageNet, a labeled image database of color photographs, may be used as training data for initial learning.

FIGS. 12 and 13 are flowcharts showing an example procedure of the determination process. When a power distribution line is included as a subject in the input image, the determination process derives the separation distance between the power distribution line and each surrounding object and determines whether the derived separation distance is equal to or greater than a threshold. The control unit 11 of the determination server 1 acquires the video data to be processed (step S31). The video data is stereo video data. The control unit 11 selects one frame to be processed (step S32). The control unit 11 inputs the stereo image of the selected frame to the first model 141 (step S33). The control unit 11 acquires the disparity map output by the first model 141 (step S34). The control unit 11 inputs the disparity map and either the right image or the left image to the second model 142 (step S35). The right or left image input to the second model 142 (hereinafter, the input image) is input to the encoder 1421. The first decoder 1422, the second decoder 1423, and the third decoder 1424, which are connected to the encoder 1421, output images to the discriminator 11b. The discriminator 11b determines the classification of each subject in the input image based on the outputs of the three decoders, and outputs a classified image annotated with the determined classifications. The classified image and the disparity map are input to the derivation unit 11c. The derivation unit 11c derives the separation distance between the power distribution line and each surrounding object (step S36). The distance between the stereo camera 3 and a subject, that is, the power distribution line or a surrounding object, can be obtained from the baseline length of the stereo camera 3, the focal length of each camera, and the disparity. The separation distance is the distance between the points at which the power distribution line and the surrounding object are closest to each other, and can be derived from the three-dimensional coordinate values of those two points. The separation distance is derived for every combination of a power distribution line and a surrounding object. The derivation unit 11c outputs a result image in which the classification of each subject is associated with its separation distance.

The control unit 11 selects a surrounding object to be processed in the result image (step S37). Referring to the classification associated with each subject, the control unit 11 selects a subject that is classified as neither a power distribution line nor a utility pole. The control unit 11 acquires the separation distance associated with the selected subject (step S38). The control unit 11 checks the classification and separation distance of the subject against the setting DB 131 and determines whether the separation distance is less than the threshold (step S39). If the control unit 11 determines that the separation distance is not less than the threshold (NO in step S39), it advances the process to step S41. If the control unit 11 determines that the separation distance is less than the threshold (YES in step S39), it turns the flag on (step S40). The initial state of the flag is off. The control unit 11 stores the result in the determination result DB 134 (step S41). The control unit 11 determines whether to end the processing of the frame (step S42). The control unit 11 determines to end if all surrounding objects in the image have been processed, and otherwise determines not to end. If the control unit 11 determines not to end (NO in step S42), it returns the process to step S37.

If the control unit 11 determines to end (YES in step S42), it determines whether the flag is on (step S43 in FIG. 13). If the control unit 11 determines that the flag is not on (NO in step S43), it moves the process to step S46. If the control unit 11 determines that the flag is on (YES in step S43), it outputs a warning screen (step S44). The warning screen is transmitted to the user terminal 2 and displayed on the display unit 26 of the user terminal 2. The control unit 11 determines whether to resume (step S45). If the control unit 11 receives a resume instruction from the user terminal 2 after outputting the warning screen, it determines to resume. Otherwise, it determines not to resume. If the control unit 11 determines not to resume (NO in step S45), it repeats step S45. If the control unit 11 determines to resume (YES in step S45), it determines whether all frames have been processed (step S46). If the control unit 11 determines that not all frames have been processed (NO in step S46), it returns the process to step S32. If the control unit 11 determines that all frames have been processed (YES in step S46), it ends the process.
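
The distance derivation of step S36 and the threshold check of steps S37 to S40 can be sketched as follows. The sketch assumes a rectified pinhole stereo pair, for which depth follows the standard relation Z = f·B/d (focal length f in pixels, baseline B in meters, disparity d in pixels). It also assumes a known principal point (`cx`, `cy`) and one threshold per classification read from a plain dictionary in place of the setting DB 131. All function and parameter names are hypothetical, and the brute-force closest-pair search is merely the simplest way to find the two nearest points.

```python
import math
from typing import Dict, List, Tuple

Point3D = Tuple[float, float, float]


def pixel_to_3d(u: float, v: float, disparity: float, focal_px: float,
                baseline_m: float, cx: float, cy: float) -> Point3D:
    """Back-project one pixel to camera coordinates (meters)."""
    if disparity <= 0:
        raise ValueError("disparity must be positive")
    z = focal_px * baseline_m / disparity  # depth from baseline, focal length, disparity
    x = (u - cx) * z / focal_px
    y = (v - cy) * z / focal_px
    return (x, y, z)


def separation_distance(line_points: List[Point3D],
                        object_points: List[Point3D]) -> float:
    """Distance between the closest pair of points, one from each set."""
    return min(math.dist(p, q) for p in line_points for q in object_points)


def judge_frame(line_points: List[Point3D],
                surrounding_objects: List[Tuple[str, List[Point3D]]],
                thresholds_m: Dict[str, float]):
    """Return (flag, results) for one frame; the flag is on if any object is too close."""
    flag = False  # initial state of the flag is off
    results = []
    for classification, points in surrounding_objects:         # S37
        distance = separation_distance(line_points, points)    # S38
        too_close = distance < thresholds_m[classification]    # S39
        flag = flag or too_close                                # S40
        results.append((classification, distance, too_close))  # kept for storage (S41)
    return flag, results
```

With a focal length of 400 pixels, a baseline of 0.5 m, and a disparity of 20 pixels, `pixel_to_3d` places the point at a depth of 10 m. The returned flag corresponds to the condition checked in step S43 before the warning screen is output.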

FIG. 14 is an explanatory diagram showing an example of the warning screen. The warning screen d01 is displayed when a surrounding object whose separation distance is less than the threshold is detected. The warning screen d01 includes an image display area d011, a result display area d012, and a resume button d013. The image display area d011 displays the captured image. The result display area d012 displays the determination result, for example the classification of the surrounding object, the separation distance, and the threshold. When the resume button d013 is selected, processing of the other frames resumes.

When a surrounding object whose separation distance from the power distribution line is less than the threshold is detected, the power distribution line and the surrounding object may be highlighted in the image display area d011 of the warning screen d01. In the example shown in FIG. 14, line segments are superimposed, based on the recognition result for the power distribution line, so that the relevant power distribution line d0111 appears thicker. Similarly, line segments are superimposed so that the outline of the surrounding object d0112 appears thicker. Highlighting is not limited to thickening lines or outlines; the display color may be changed instead.

In this embodiment, when the subjects of a captured image include a power distribution line and other surrounding objects, the separation distance between the power distribution line and the surrounding objects can be derived. Furthermore, when a separation distance below the threshold is detected, the image in which it was detected is displayed, so that the situation can be checked.

Although the first model generation process, the second model generation process, and the determination process described above are performed by the determination server 1, they may instead be performed by the user terminal 2. Alternatively, the first model generation process and the second model generation process, which involve a relatively large amount of processing, may be performed by the determination server 1, the generated first model 141 and second model 142 may be stored in the user terminal 2, and the determination process may be performed by the user terminal 2.

The warning screen d01 may display a map showing the position at which the image displayed in the image display area d011 was captured. The control unit 11 searches the image position DB 133 using the name and time code of the displayed image and acquires the longitude and latitude of the capture position. The control unit 11 transmits the acquired longitude and latitude to a map distribution system, and displays on the warning screen d01 the map image, containing that longitude and latitude, returned from the map distribution system.

The technical features (constituent elements) described in the embodiments can be combined with one another, and new technical features can be formed by combining them.
The embodiments disclosed herein are illustrative in all respects and should not be considered restrictive. The scope of the present invention is indicated by the claims rather than by the meaning described above, and is intended to include all modifications within the meaning and scope equivalent to the claims.
The claims use a format in which a claim cites two or more other claims (multiple claim format), but the format is not limited to this. The claims may also be written in a format that includes a multiple claim citing at least one multiple claim (multi-multi claim).

REFERENCE SIGNS LIST
100 Determination system
1 Determination server
11 Control unit
11b Discriminator
11c Derivation unit
12 Main memory unit
13 Auxiliary memory unit
131 Setting DB
132 Photographed image DB
133 Image position DB
134 Determination result DB
141 First model
1411 Feature extraction module
14111 Modified FPN
EC Extraction encoder
DC Aggregation decoder
UP Upsampling layer
14112 Concatenation layer
14113 Convolution layer
1412 Disparity regression module
142 Second model
1421 Encoder
1422 First decoder
1423 Second decoder
1424 Third decoder
15 Communication unit
16 Reading unit
1P Control program
1a Portable storage medium
1b Semiconductor memory
2 User terminal
3 Stereo camera
4 Position measuring device
B Bus
N Network

Claims

An information processing program that causes a computer to execute processing of:
acquiring a plurality of images captured by a stereo camera and including power transmission and distribution equipment as a subject;
inputting the acquired plurality of images to a first model trained to generate depth information when a plurality of images are input, thereby generating depth information;
inputting an acquired image to a second model trained to classify power transmission and distribution equipment and surrounding objects when an image is input, thereby acquiring a classification of the power transmission and distribution equipment and the surrounding objects; and
deriving a distance between the power transmission and distribution equipment and the surrounding objects based on the acquired classification and the depth information.

The information processing program according to claim 1, wherein two points at which the power transmission and distribution equipment and the surrounding object approach each other are determined, three-dimensional coordinate values are obtained for each of the two points based on the baseline length and focal length of the stereo camera and the depth information, and the distance is derived from the obtained three-dimensional coordinate values of the two points.

The information processing program according to claim 1, wherein a warning is output when the distance is less than a threshold.

The information processing program according to claim 1 or claim 2, wherein the surrounding object for which the distance is less than a threshold, or the power transmission and distribution equipment, is highlighted.

The information processing program according to claim 1 or claim 2, wherein the power transmission and distribution equipment is a power transmission line, a power distribution line, a steel tower, or a utility pole, and the surrounding object is a tree, a plant vine, a road, the ground, or a signboard.

The information processing program according to claim 1 or claim 2, wherein the second model includes:
an encoder that extracts features from an acquired image;
a first decoder that identifies the power transmission and distribution equipment and the surrounding objects based on the features extracted by the encoder;
a second decoder that identifies a power transmission line or a power distribution line based on the features extracted by the encoder; and
a third decoder that identifies a steel tower or a utility pole based on the features extracted by the encoder.

The information processing program according to claim 6, wherein
in identifying a power transmission line or a power distribution line included in the power transmission and distribution equipment, the identification result of the second decoder is given priority over the identification result of the first decoder, and
in identifying a steel tower or a utility pole included in the power transmission and distribution equipment, the identification result of the third decoder is given priority over the identification result of the first decoder.

The information processing program according to claim 1 or claim 2, wherein
the first model includes an extraction encoder that extracts features from an input image and an aggregation decoder that aggregates the features, and
the aggregation decoder includes a plurality of hierarchically arranged upsampling layers, of which only the bottom layer outputs features to the outside of the aggregation decoder.

An information processing device comprising:
a first acquisition unit that acquires a plurality of images captured by a stereo camera and including power transmission and distribution equipment as a subject;
a generation unit that inputs the acquired plurality of images to a first model trained to generate depth information when a plurality of images are input, thereby generating depth information;
a second acquisition unit that inputs an acquired image to a second model trained to classify power transmission and distribution equipment and surrounding objects when an image is input, thereby acquiring a classification of the power transmission and distribution equipment and the surrounding objects; and
a derivation unit that derives a distance between the power transmission and distribution equipment and the surrounding objects based on the acquired classification and the depth information.

An information processing method in which a computer performs processing of:
acquiring a plurality of images captured by a stereo camera and including power transmission and distribution equipment as a subject;
inputting the acquired plurality of images to a first model trained to generate depth information when a plurality of images are input, thereby generating depth information;
inputting an acquired image to a second model trained to classify power transmission and distribution equipment and surrounding objects when an image is input, thereby acquiring a classification of the power transmission and distribution equipment and the surrounding objects; and
deriving a distance between the power transmission and distribution equipment and the surrounding objects based on the acquired classification and the depth information.

A model generation method comprising:
preparing an encoder, and a first decoder, a second decoder, and a third decoder each connected to the encoder;
acquiring images annotated with classifications of objects including a power transmission line or power distribution line and a steel tower or utility pole; and
generating by learning, based on the acquired images, a segmentation model that is constituted by the encoder and the first decoder and that, when an image is input, classifies the objects included as subjects in the image, a linear object detection model that is constituted by the encoder and the second decoder and that, when an image is input, classifies a power transmission line or power distribution line included as a subject in the image, and a utility pole detection model that is constituted by the encoder and the third decoder and that, when an image is input, classifies a steel tower or utility pole included as a subject in the image.