JPWO2019087383A1

JPWO2019087383A1 - Crowd density calculation device, crowd density calculation method, and crowd density calculation program

Info

Publication number: JPWO2019087383A1
Application number: JP2019550118A
Authority: JP
Inventors: 士人新井; 亮史服部; 奥村　誠司
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 2017-11-06
Filing date: 2017-11-06
Publication date: 2020-04-02
Anticipated expiration: 2037-11-06
Also published as: WO2019087383A1; SG11202002953YA; CN111279392B; CN111279392A; JP6678835B2

Abstract

In a crowd density calculation device (100) that calculates crowd density, a video acquisition unit (110) acquires a video frame (21) from a video stream (22) in which people are imaged. An analysis unit (120) associates three-dimensional coordinates with the video frame (21) and acquires, as each of a plurality of three-dimensional regions, the region on the video frame (21) that represents each of a plurality of three-dimensional spaces obtained based on the three-dimensional coordinates. The analysis unit (120) calculates the density distribution of people in the video frame as a crowd density distribution (225), based on the number of people present in each of the plurality of three-dimensional regions.

Description

The present invention relates to a crowd density calculation device, a crowd density calculation method, and a crowd density calculation program.

There are techniques for estimating the number of people or the density of people from camera video. Techniques for estimating the number of people from camera video include counting people based on person detection and estimating the number of people from the foreground area.
With the method based on person detection, the number of people can be estimated with high accuracy when the crowd density is low. However, the amount of computation grows as the number of people increases. In addition, because the crowd density rises as the number of people increases, estimation accuracy falls due to occlusion, that is, people hiding one another.
With the method that estimates the number of people from the foreground area, estimation accuracy at low crowd density is inferior to that of the method based on person detection. However, the amount of computation does not change even when the crowd density is high.
Note that a technique for estimating the density of people is equivalent to a technique for estimating the number of people in each arbitrary region of a video frame.

Patent Literature 1 and Patent Literature 2 disclose techniques that acquire video of a crowd, treat the foreground extracted by background subtraction as the person region, and estimate the number of people in the screen from the area of the person region.
In Patent Literature 1, a weight value is calculated that expresses quantitatively how much each pixel in the image contributes to the number of people. The weight value is calculated from the apparent volume of the object in the image. This addresses the problem that the foreground area per person appears differently depending on depth, and makes it possible to estimate the number of people even in an image with depth.
In Patent Literature 2, CG (computer graphics) models imitating a crowd are created in advance at several congestion levels, and a relational expression between foreground area and number of people is derived that takes occlusion within the crowd into account. Patent Literature 2 thereby makes it possible to estimate the number of people while suppressing the influence of occlusion.

Patent Literature 1: JP 2009-294755 A
Patent Literature 2: JP 2005-025328 A

The techniques disclosed in Patent Literature 1 and Patent Literature 2 calculate the number of people present in a video frame and the density of people in an arbitrary region of the video frame. However, they do not estimate the positions of people in the real-world physical space. This is because Patent Literature 1 and Patent Literature 2 associate points in the physical space with points on the video frame, but do not perform the reverse association.

An object of the present invention is to calculate, from a video frame, where a crowd is located in the real-world physical space and to output the result as a density distribution of the crowd.

A crowd density calculation device according to the present invention includes:
a video acquisition unit that acquires a video frame from a video stream in which people are imaged; and
an analysis unit that associates three-dimensional coordinates with the video frame, acquires, as each of a plurality of three-dimensional regions, the region on the video frame that represents each of a plurality of three-dimensional spaces obtained based on the three-dimensional coordinates, and calculates the density distribution of people in the video frame as a crowd density distribution based on the number of people present in each of the plurality of three-dimensional regions.

In the crowd density calculation device according to the present invention, the analysis unit associates three-dimensional coordinates with the video frame and acquires, as each of a plurality of three-dimensional regions, the region on the video frame that represents each of a plurality of three-dimensional spaces obtained based on the three-dimensional coordinates. The analysis unit also calculates the crowd density distribution in the video frame based on the number of people present in each of the plurality of three-dimensional regions. Therefore, the crowd density calculation device according to the present invention makes it possible to grasp quantitatively, from a video frame, the crowd density distribution in the real-world physical space.

FIG. 1 is a configuration diagram of the crowd density calculation device according to Embodiment 1.
FIG. 2 is a detailed configuration diagram of the analysis unit according to Embodiment 1.
FIG. 3 is a diagram explaining the definition of the crowd density distribution.
FIG. 4 is a diagram illustrating a crowd density distribution when the sizes of ΔX and ΔY are constant.
FIG. 5 is a diagram illustrating how points in the real-world physical space are projected onto the video frame coordinate system.
FIG. 6 is a flowchart of the crowd density calculation process according to Embodiment 1.
FIG. 7 is a flowchart of the analysis process according to Embodiment 1.
FIG. 8 is a diagram illustrating how the foreground area is converted into a number of people for each three-dimensional region.
FIG. 9 is a diagram illustrating how the provisional density distribution is output from the number of people in each three-dimensional region.
FIG. 10 is a diagram illustrating a ground-truth foreground image.
FIG. 11 is a diagram illustrating the congestion level expressed numerically in the ground-truth foreground image.
FIG. 12 is a diagram illustrating the presence determination process according to Embodiment 1.
FIG. 13 is a configuration diagram of a crowd density calculation device according to a modification of Embodiment 1.
FIG. 14 is a detailed configuration diagram of the analysis unit according to Embodiment 2.
FIG. 15 is a flowchart of the analysis process according to Embodiment 2.
FIG. 16 is a diagram illustrating the position correction process according to Embodiment 2.
FIG. 17 is a flowchart of the analysis process according to Embodiment 3.

Embodiments of the present invention are described below with reference to the drawings. In the drawings, identical or corresponding parts are given the same reference numerals. In the description of the embodiments, explanation of identical or corresponding parts is omitted or simplified as appropriate.

Embodiment 1.
*** Description of configuration ***
The configuration of the crowd density calculation device 100 according to the present embodiment is described with reference to FIG. 1.
The crowd density calculation device 100 is a computer. The crowd density calculation device 100 includes a processor 910 and other hardware such as a memory 921, an auxiliary storage device 922, an input interface 930, an output interface 940, and a communication device 950. The processor 910 is connected to the other hardware via signal lines and controls that hardware.

As functional elements, the crowd density calculation device 100 includes a video acquisition unit 110, an analysis unit 120, a result output unit 130, and a storage unit 140. The storage unit 140 stores the analysis parameters 141 used in the analysis process performed by the analysis unit 120.

The functions of the video acquisition unit 110, the analysis unit 120, and the result output unit 130 are implemented in software. The storage unit 140 is provided in the memory 921. The storage unit 140 may instead be provided in the auxiliary storage device 922.

The processor 910 is a device that executes the crowd density calculation program. The crowd density calculation program is a program that implements the functions of the video acquisition unit 110, the analysis unit 120, and the result output unit 130.
The processor 910 is an IC (Integrated Circuit) that performs arithmetic processing. Specific examples of the processor 910 are a CPU, a DSP (Digital Signal Processor), and a GPU (Graphics Processing Unit).

The memory 921 is a storage device that stores data temporarily. Specific examples of the memory 921 are an SRAM (Static Random Access Memory) and a DRAM (Dynamic Random Access Memory).

The auxiliary storage device 922 is a storage device that retains data. A specific example of the auxiliary storage device 922 is an HDD. The auxiliary storage device 922 may also be a portable storage medium such as an SD (registered trademark) memory card, CF, NAND flash, flexible disk, optical disc, compact disc, Blu-ray (registered trademark) disc, or DVD. HDD is an abbreviation for Hard Disk Drive. SD (registered trademark) is an abbreviation for Secure Digital. CF is an abbreviation for CompactFlash. DVD is an abbreviation for Digital Versatile Disk.

The input interface 930 is a port connected to an input device such as a mouse, a keyboard, or a touch panel. The input interface 930 may also be a port connected to the camera 200. Specifically, the input interface 930 is a USB (Universal Serial Bus) terminal. The input interface 930 may also be a port connected to a LAN (Local Area Network). The crowd density calculation device 100 may acquire the video stream 21 from the camera 200 via the input interface 930.

The output interface 940 is a port to which the cable of an output device such as a display is connected. Specifically, the output interface 940 is a USB terminal or an HDMI (registered trademark) (High Definition Multimedia Interface) terminal. Specifically, the display is an LCD (Liquid Crystal Display). The crowd density calculation device 100 shows the analysis result output by the result output unit 130 on the display via the output interface 940.

The communication device 950 communicates with other devices via a network. The communication device 950 has a receiver and a transmitter. The communication device 950 is connected, by wire or wirelessly, to a communication network such as a LAN, the Internet, or a telephone line. Specifically, the communication device 950 is a communication chip or a NIC (Network Interface Card). The crowd density calculation device 100 may receive the video stream 21 from the camera 200 via the communication device 950. The crowd density calculation device 100 may also transmit the analysis result output by the result output unit 130 to an external device via the communication device 950.

The crowd density calculation program is read into the processor 910 and executed by the processor 910. The memory 921 stores not only the crowd density calculation program but also an OS (Operating System). The processor 910 executes the crowd density calculation program while running the OS. The crowd density calculation program and the OS may be stored in the auxiliary storage device 922. In that case, the crowd density calculation program and the OS stored in the auxiliary storage device 922 are loaded into the memory 921 and executed by the processor 910. Part or all of the crowd density calculation program may be incorporated into the OS.

The crowd density calculation device 100 may include a plurality of processors in place of the processor 910. These processors share the execution of the crowd density calculation program. Each processor is, like the processor 910, a device that executes the crowd density calculation program.

Data, information, signal values, and variable values used, processed, or output by the crowd density calculation program are stored in the memory 921, the auxiliary storage device 922, or a register or cache memory in the processor 910.

The crowd density calculation program causes a computer to execute each process, procedure, or step obtained by reading the word "unit" in the video acquisition unit 110, the analysis unit 120, and the result output unit 130 as "process", "procedure", or "step". The crowd density calculation method is the method carried out when the crowd density calculation device 100 executes the crowd density calculation program.
The crowd density calculation program may be provided stored on a computer-readable recording medium. The crowd density calculation program may also be provided as a program product.

The detailed configuration of the analysis unit 120 according to the present embodiment is described with reference to FIG. 2.
The analysis unit 120 includes a foreground extraction unit 121, a provisional density calculation unit 122, a presence determination unit 123, a standardization unit 124, and a distribution output unit 125. In other words, the units making up the video acquisition unit 110, the analysis unit 120, and the result output unit 130 are the video acquisition unit 110, the foreground extraction unit 121, the provisional density calculation unit 122, the presence determination unit 123, the standardization unit 124, the distribution output unit 125, and the result output unit 130.

An overview of each functional element of the crowd density calculation device 100 is given with reference to FIG. 1 and FIG. 2.
The crowd density calculation device 100 is connected to a camera 200 that images objects and distributes the result as a video stream 21. Specifically, the objects are people. That is, the video stream 21 is video of a crowd.
The video acquisition unit 110 acquires the video stream 21 distributed from the camera 200 via the input interface 930. The video acquisition unit 110 acquires a video frame 22 from the video stream 21. Specifically, the video acquisition unit 110 decodes the video stream 21 and converts it into video frames 22.
The analysis unit 120 associates three-dimensional coordinates with the video frame 22. The analysis unit 120 acquires, as each of a plurality of three-dimensional regions, the region on the video frame 22 that represents each of a plurality of three-dimensional spaces obtained based on the three-dimensional coordinates. The analysis unit 120 then calculates the crowd density distribution in the video frame 22 based on the number of people present in each of the plurality of three-dimensional regions. That is, the analysis unit 120 uses the video frame 22 to calculate the position of the crowd in three-dimensional coordinates in the physical space as the crowd density distribution 225.
The result output unit 130 outputs the crowd density distribution 225 output from the analysis unit 120 to an output device such as a display via the output interface 940.

Next, an overview of each functional element of the analysis unit 120 is given.
The foreground extraction unit 121 extracts the part of the video frame 22 that has foreground features as a foreground image 221. The provisional density calculation unit 122 uses the foreground image 221 and the analysis parameters 141 stored in the storage unit 140 to calculate a provisional density distribution 222, which is the apparent density distribution of the crowd at each position in the physical space. The presence determination unit 123 corrects the provisional density distribution 222 by determining the positions in the physical space where no person is present. The presence determination unit 123 outputs the corrected provisional density distribution 222 as a corrected density distribution 223. The standardization unit 124 standardizes the corrected density distribution 223 using the total number of people present in the image represented by the video frame 22, and outputs a final density distribution 224. The distribution output unit 125 converts the final density distribution 224 into the output format and outputs it as the crowd density distribution 225.

The definition of the crowd density distribution is explained with reference to FIG. 3. The crowd density distribution is the distribution of the number of people in the real-world physical space.
As shown in FIG. 3, in the physical space coordinate system X_r-Y_r-Z_r, which gives the three-dimensional coordinates of the real world, let ΔS_ij be the region bounded by the width ΔX and the depth ΔY at a position (X_i, Y_i) on the X_r-Y_r plane, which corresponds to the real-world ground. Let the three-dimensional space V_ij be the region enclosed by a prism of height H with ΔS_ij as its base. Let h_tij be the number of people present in the three-dimensional space V_ij. The crowd density distribution is obtained by assigning the number of people h_tij to ΔS_ij and arranging these values over the whole analysis area. The height H is about the height of a person.

FIG. 4 is a diagram illustrating a crowd density distribution when the sizes of ΔX and ΔY are constant. Suppose that one person is in the region ΔS_02 and one person is at the midpoint between the regions ΔS_10 and ΔS_20. The crowd density distribution is then 1 person at ΔS_02, 0.5 persons at ΔS_10, 0.5 persons at ΔS_20, and 0 persons at every other ΔS_ij.
The sizes of ΔX and ΔY are not prescribed, and they may be variable.
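The FIG. 4 example can be reproduced with a short sketch. The source does not state how a person located between two regions is divided among them; the code below assumes bilinear weighting about the cell centres, which yields the 1 / 0.5 / 0.5 split of the example. The function name `crowd_density_grid`, its parameters, and the convention that i indexes X_r and j indexes Y_r are illustrative assumptions, not part of the source.

```python
import math


def crowd_density_grid(positions, x0, y0, dx, dy, nx, ny):
    """Distribute people over the ground cells ΔS_ij (hypothetical helper).

    positions: iterable of (X, Y) ground coordinates in the physical space.
    (x0, y0):  corner of cell ΔS_00; dx, dy: cell width ΔX and depth ΔY.
    Returns grid[i][j], the number of people attributed to ΔS_ij.
    """
    grid = [[0.0] * ny for _ in range(nx)]
    for X, Y in positions:
        # Continuous cell coordinates measured from the centre of ΔS_00.
        u = (X - x0) / dx - 0.5
        v = (Y - y0) / dy - 0.5
        i0, j0 = math.floor(u), math.floor(v)
        fu, fv = u - i0, v - j0
        # Assumed rule: share each person among the four nearest cell
        # centres with bilinear weights. Weight that would fall outside
        # the analysis area is dropped.
        for i, wi in ((i0, 1.0 - fu), (i0 + 1, fu)):
            for j, wj in ((j0, 1.0 - fv), (j0 + 1, fv)):
                if wi * wj > 0.0 and 0 <= i < nx and 0 <= j < ny:
                    grid[i][j] += wi * wj
    return grid


# FIG. 4 style example with ΔX = ΔY = 1: one person at the centre of ΔS_02
# and one person on the boundary between ΔS_10 and ΔS_20.
grid = crowd_density_grid([(0.5, 2.5), (2.0, 0.5)], 0.0, 0.0, 1.0, 1.0, 3, 3)
```

With these inputs, `grid[0][2]` holds the whole first person and `grid[1][0]` and `grid[2][0]` each hold half of the second. Other sharing rules, such as overlap area of a body footprint, would fit the same definition.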

FIG. 5 is a diagram illustrating how points in the real-world physical space are projected onto the video frame coordinate system. That is, FIG. 5 illustrates how three-dimensional coordinates are associated with a video frame. Let a point on the ground in the real-world physical space be P_gij = (X_ij, Y_ij, 0), and let a point on the plane at height H be P_hij = (X_ij, Y_ij, H). Let p_gij and p_hij be the points obtained by projecting P_gij and P_hij onto the video frame coordinate system x_img-y_img. Information associating P_gij with p_gij and P_hij with p_hij is stored in the storage unit 140 as the analysis parameters 141. The region obtained by projecting the three-dimensional space V_ij in the physical space onto the video frame coordinate system is called the three-dimensional region v_ij. The three-dimensional region v_ij is the two-dimensional region shown hatched in FIG. 5. In other words, when the three-dimensional space V_ij is drawn on the video frame, the three-dimensional region v_ij is the two-dimensional region bounded by the outline of V_ij on the video frame. The three-dimensional region v_ij is also called a prism region or a rectangular parallelepiped region.

Each three-dimensional region v_ij has a head region, which corresponds to a person's head when the person is standing in the three-dimensional space V_ij, and a ground region, which corresponds to the ground on which the person stands. The head region is the part of the three-dimensional region v_ij bounded by p_hij, p_h(i+1)j, p_hi(j+1), and p_h(i+1)(j+1), which correspond to head height. The ground region is the part of the three-dimensional region v_ij bounded by p_gij, p_g(i+1)j, p_gi(j+1), and p_g(i+1)(j+1), which correspond to the ground.
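As an illustration of how the corner points of a head region and a ground region could be obtained, the sketch below projects the corners of one prism V_ij onto the frame. It assumes a pinhole camera described by a 3x4 projection matrix `P`; the source only requires some association between physical and frame coordinates, so the matrix form and the names `project` and `region_corners` are assumptions.

```python
def project(P, point):
    """Project a physical-space point (X, Y, Z) to frame coordinates.

    P is a 3x4 projection matrix given as three rows of four numbers
    (pinhole-camera assumption, not prescribed by the source).
    """
    X, Y, Z = point
    h = (X, Y, Z, 1.0)  # homogeneous coordinates
    u, v, w = (sum(row[k] * h[k] for k in range(4)) for row in P)
    return (u / w, v / w)


def region_corners(P, X, Y, dX, dY, H):
    """Return (ground, head): four frame points each for one prism V_ij.

    (X, Y) is one ground corner of ΔS_ij, dX and dY are ΔX and ΔY,
    and H is the prism height.
    """
    base = [(X, Y), (X + dX, Y), (X + dX, Y + dY), (X, Y + dY)]
    ground = [project(P, (x, y, 0.0)) for x, y in base]  # p_g points
    head = [project(P, (x, y, H)) for x, y in base]      # p_h points
    return ground, head
```

The full three-dimensional region v_ij would then be the outline enclosing all eight projected points. A table of measured coordinate pairs could replace `project` without changing `region_corners`.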

The information associating coordinates in the physical space with coordinates on the video frame may be a coordinate transformation formula, or it may be a set of pairs of corresponding physical-space coordinates and video-frame coordinates.
The points P_gij, and likewise the points P_hij, need not lie on a single plane. As long as the three-dimensional space V_ij can be defined, the surface represented by the points P_gij or the points P_hij may be curved or stepped.

*** Description of operation ***
The crowd density calculation process S100 performed by the crowd density calculation device 100 according to the present embodiment is described with reference to FIG. 6.

<Analysis parameter reading process>
In step ST01, the crowd density calculation device 100 reads the analysis parameters 141 into the storage unit 140. The analysis parameters 141 may be stored in the auxiliary storage device 922, or may be input from outside via the input interface 930 or the communication device 950. The analysis parameters 141 that have been read are used by the analysis unit 120.

<Video acquisition process>
In step ST02, the video acquisition unit 110 waits to receive the video stream 21 from the camera 200. On receiving the video stream 21 from the camera 200, the video acquisition unit 110 decodes at least one frame of the received video stream 21. The video stream to be received is, for example, encoded video data compressed with a video compression coding scheme and delivered over IP with a video distribution protocol. Specific examples of the video compression coding scheme are H.262/MPEG-2 Video, H.264/AVC, H.265/HEVC, and JPEG. Specific examples of the video distribution protocol are MPEG-2 TS, RTP/RTSP, MMT, and DASH. MPEG-2 TS is an abbreviation for Moving Picture Experts Group 2 Transport Stream. RTP/RTSP is an abbreviation for Real-time Transport Protocol/Real Time Streaming Protocol. MMT is an abbreviation for MPEG Media Transport. DASH is an abbreviation for Dynamic Adaptive Streaming over HTTP. The video stream to be received may use a coding or distribution format other than those listed above, or an uncompressed transmission standard such as SDI or HD-SDI. SDI is an abbreviation for Serial Digital Interface. HD-SDI is an abbreviation for High Definition-Serial Digital Interface.

<Analysis process>
In step ST03, the analysis unit 120 acquires the video frame 22 from the video acquisition unit 110. The analysis unit 120 analyzes the video frame 22 using the analysis parameters 141. By analyzing the video frame 22, the analysis unit 120 calculates the crowd density distribution of the crowd shown in the video frame 22. The analysis unit 120 also converts the calculated crowd density distribution into the output format and outputs it as the crowd density distribution 225.

<Result output processing>
In step ST04, the result output unit 130 outputs the crowd density distribution 225 received from the analysis unit 120 to the outside of the crowd density calculation device 100 via the output interface 940. Examples of the output form include display on a monitor, output to a log file, output to an externally connected device, and transmission to a network. The result output unit 130 may output the crowd density distribution 225 in a form other than these. The result output unit 130 may output the crowd density distribution 225 to the outside each time the analysis unit 120 outputs one. Alternatively, the result output unit 130 may output intermittently, for example by aggregating or statistically processing the crowd density distributions 225 of a specific period or a specific count before outputting them. After step ST04, the crowd density calculation device 100 returns to step ST02 and processes the next video frame 22.

<<Details of analysis processing>>
The details of the analysis processing according to the present embodiment will be described with reference to FIG. 7.
In step ST11, the foreground extraction unit 121 extracts the images of people in the video frame 22, that is, the foreground, as the foreground image 221. The foreground extraction unit 121 outputs the foreground image 221 to the provisional density calculation unit 122.

FIG. 8 is a diagram illustrating how the number of people is converted from the foreground area for each three-dimensional region. FIG. 9 is a diagram illustrating how the provisional density distribution is output from the number of people in each three-dimensional region. In FIG. 8, the images of two people are extracted as the foreground image 221.
One method of foreground extraction is background subtraction, in which a background image is registered in advance and its difference from the input image is calculated. Another is adaptive background subtraction, which automatically updates the background image from successively input video frames using a model such as MOG (Mixture of Gaussian Distribution). A further option is a dense optical flow algorithm, which obtains motion information in the image on a per-pixel basis.
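As a rough illustration of the plain background subtraction method mentioned above, the following sketch marks a pixel as foreground when it differs from a pre-registered background image by more than a threshold. The function name `extract_foreground`, the grayscale list-of-lists frame representation, and the threshold value are assumptions made for this example and do not come from the embodiment.

```python
def extract_foreground(frame, background, threshold=30):
    """Return a binary mask: 1 where |frame - background| > threshold, else 0.

    frame and background are grayscale images given as lists of rows
    (hypothetical representation chosen for this sketch).
    """
    if len(frame) != len(background):
        raise ValueError("frame and background must have the same height")
    mask = []
    for row_f, row_b in zip(frame, background):
        if len(row_f) != len(row_b):
            raise ValueError("frame and background must have the same width")
        mask.append([1 if abs(f - b) > threshold else 0
                     for f, b in zip(row_f, row_b)])
    return mask


# Tiny example: one bright column appears in front of a dark background.
background = [[10, 10, 10],
              [10, 10, 10]]
frame = [[10, 200, 12],
         [11, 190, 10]]
mask = extract_foreground(frame, background)
```

An adaptive model such as MOG would replace the fixed `background` with a per-pixel model updated on every frame; that part is not shown here.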

In step ST12, the provisional density calculation unit 122 calculates, based on the foreground image 221, the number of people apparently present in each of the plurality of three-dimensional regions as the provisional density distribution 222. Specifically, the provisional density calculation unit 122 projects points in the physical space onto the video frame coordinate system and then uses the foreground image 221 and the relational expression 142 to calculate the number of people present in each three-dimensional region v_ij. The provisional density calculation unit 122 uses provisional crowd density estimation as the method of calculating the number of people present in each three-dimensional region v_ij.

As shown in FIG. 8, the provisional density calculation unit 122 totals the foreground area for each three-dimensional region v_ij in the video frame 22. The provisional density calculation unit 122 calculates the number of people in each three-dimensional region v_ij based on the foreground area in that region. To do so, it uses the relational expression 142 between the foreground area and the number of people, which is obtained in advance. The provisional density calculation unit 122 takes the number of people in each three-dimensional region v_ij as the number of people h_ij of the region ΔS_ij corresponding to that three-dimensional region v_ij. As shown in FIG. 9, the provisional density calculation unit 122 outputs the crowd density distribution in which the number of people h_ij has been calculated for all regions ΔS_ij as the provisional density distribution 222.
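The tallying described above can be sketched as follows. This is a simplified illustration only: each three-dimensional region v_ij is approximated by an axis-aligned rectangle on the frame, and the relational expression 142 is stood in for by a single constant `area_per_person`. Both simplifications, and all names below, are assumptions of this sketch rather than details given in the embodiment.

```python
def foreground_area(mask, rect):
    """Count foreground pixels inside rect = (top, left, bottom, right), end-exclusive."""
    top, left, bottom, right = rect
    return sum(sum(row[left:right]) for row in mask[top:bottom])


def provisional_density(mask, regions, area_per_person):
    """Map each region index (i, j) to an apparent number of people h_ij.

    regions: dict {(i, j): rect}; rectangles may overlap, as the projected
    three-dimensional regions do on the frame.
    area_per_person: hypothetical stand-in for the relational expression 142.
    """
    return {ij: foreground_area(mask, rect) / area_per_person
            for ij, rect in regions.items()}


mask = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
regions = {(0, 0): (0, 0, 2, 3), (1, 0): (1, 1, 4, 4)}
h = provisional_density(mask, regions, area_per_person=4.0)
```

Because the two rectangles overlap, the same foreground pixels contribute to both counts, which is why the result is only a provisional (apparent) distribution.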

<<<How to obtain the relational expression between foreground area and number of people>>>
The relational expression 142 between the foreground area and the number of people takes occlusion among people in the crowd into account. The relational expression 142 is stored in the storage unit 140. A method for deriving the relational expression 142 is described below.
FIG. 10 is a diagram illustrating a ground-truth foreground image.
FIG. 11 is a diagram illustrating the ground-truth foreground image with the congestion level quantified.
Ground-truth images are prepared in which the number of people appearing in the video frame is known and the ground contact point of each person in the physical coordinate system is known. Foreground extraction is performed on each ground-truth image to create a ground-truth foreground image as shown in FIG. 10.
As shown in FIG. 11, the ground-truth foreground image is divided into a plurality of small areas, and the foreground area per person is calculated in each small area for each congestion level. The same processing is applied to a large number of ground-truth foreground images with different congestion levels and arrangement patterns, and the foreground area per person in each small area is aggregated. In this way, the relationship between the number of people and the foreground in each small area at each congestion level is derived as the relational expression 142 between the foreground area and the number of people. The relational expression 142 is stored in the storage unit 140. The proportion of each small area occupied by foreground at each congestion level is stored in the storage unit 140 as the level threshold 143 used to determine the congestion level.
When using the relational expression 142, the provisional density calculation unit 122 determines the congestion level for each small area and calculates the number of people from the foreground area using the relational expression 142 corresponding to that congestion level. The provisional density calculation unit 122 determines the congestion level by comparing the proportion of the small area occupied by foreground with the level threshold 143.
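A minimal sketch of this level-dependent conversion is given below. It assumes that the level thresholds 143 are upper bounds on the foreground occupancy ratio and that the relational expression 142 reduces to one "foreground area per person" constant per congestion level. The numeric values are invented for illustration; in the embodiment they would be derived from ground-truth images.

```python
def congestion_level(occupancy, level_thresholds):
    """Return the first level whose upper occupancy bound is not exceeded."""
    for level, upper in enumerate(level_thresholds):
        if occupancy <= upper:
            return level
    return len(level_thresholds)  # above every bound: most congested level


def count_people(fg_area, region_area, level_thresholds, area_per_person_by_level):
    """Convert foreground area to a head count using the level-specific constant."""
    occupancy = fg_area / region_area
    level = congestion_level(occupancy, level_thresholds)
    return fg_area / area_per_person_by_level[level]


# Hypothetical values: the visible area per person shrinks as occlusion grows.
LEVEL_THRESHOLDS = [0.2, 0.5]          # stand-in for level thresholds 143
AREA_PER_PERSON = [50.0, 40.0, 25.0]   # stand-in for relational expression 142

sparse = count_people(100, 1000, LEVEL_THRESHOLDS, AREA_PER_PERSON)
dense = count_people(600, 1000, LEVEL_THRESHOLDS, AREA_PER_PERSON)
```

Using a smaller per-person area at higher congestion levels is one way to reflect occlusion: the more crowded a small area is, the less of each person is visible.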

<<Presence determination processing>>
In step ST13, the presence determination unit 123 determines whether a person is present in each three-dimensional region v_ij of the plurality of three-dimensional regions. That is, the presence determination unit 123 determines whether a person is present in each three-dimensional space V_ij of the plurality of three-dimensional spaces, and determines that no person is present in the three-dimensional region v_ij corresponding to a three-dimensional space determined to contain no person. The presence determination unit 123 performs this determination for each three-dimensional region v_ij and sets the number of people to 0 in every three-dimensional region v_ij determined to contain no person. In other words, the presence determination unit 123 sets to 0 the number of people of the region ΔS_ij corresponding to such a three-dimensional region v_ij. The presence determination unit 123 outputs, as the corrected density distribution 223, the provisional density distribution 222 in which the number of people has been corrected to 0 for every three-dimensional region v_ij corresponding to a three-dimensional space determined to contain no person.

FIG. 12 is a conceptual diagram of the presence determination processing according to the present embodiment.
As described above, each three-dimensional region v_ij of the plurality of three-dimensional regions includes a head region, which corresponds to the head of a person standing in the three-dimensional space V_ij. Each three-dimensional region v_ij also includes a ground region, which corresponds to the ground on which the person stands. When a person is present in both the head region and the ground region of a three-dimensional region v_ij, the presence determination unit 123 determines that a person is present in the three-dimensional space V_ij corresponding to that three-dimensional region v_ij. Specifically, as shown in FIG. 12, the presence determination unit 123 determines that no person is present, and sets h_ij = 0, when the respective foreground areas in the head region and the ground region of a three-dimensional region v_ij are equal to or smaller than a specified value. The presence determination unit 123 uses the value of the provisional density distribution 222 as is only for three-dimensional regions v_ij in which both the head region and the ground region contain a foreground area equal to or larger than the specified value.
The presence determination unit 123 outputs the provisional density distribution 222 that has undergone the presence determination processing as the corrected density distribution 223.
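The correction can be sketched as a simple filter over the provisional distribution. The sketch keeps h_ij only when both the head region and the ground region reach the specified foreground area and zeroes it otherwise; treating "exactly equal to the specified value" as present is a choice made here, since the text leaves that boundary case open. The dictionary layout and the name `min_area` are assumptions of this example.

```python
def apply_presence_check(provisional, head_area, ground_area, min_area):
    """Return the corrected distribution: h_ij is kept only if both the head
    region and the ground region contain at least min_area foreground pixels."""
    corrected = {}
    for ij, h in provisional.items():
        present = head_area[ij] >= min_area and ground_area[ij] >= min_area
        corrected[ij] = h if present else 0.0
    return corrected


provisional = {(0, 0): 1.2, (0, 1): 0.8, (1, 0): 0.5}
head_area = {(0, 0): 12, (0, 1): 0, (1, 0): 9}      # foreground pixels in head regions
ground_area = {(0, 0): 15, (0, 1): 20, (1, 0): 3}   # foreground pixels in ground regions
corrected = apply_presence_check(provisional, head_area, ground_area, min_area=5)
```

Region (0, 1) has foreground only near the ground and region (1, 0) only near the head, so both are treated as spill-over from people standing in neighboring spaces.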

<<Standardization unit>>
In step ST14, the standardization unit 124 obtains the total number of people in the video frame 22 based on the foreground image 221. The standardization unit 124 standardizes the number of people in each three-dimensional region v_ij of the corrected density distribution 223 based on that total. Specifically, the standardization unit 124 uses the following Expressions (1) and (2) to standardize the crowd density by the total number of people h_total present in the video frame. The standardization unit 124 calculates the total number of people h_total in the video frame 22 by applying the relational expression 142 between the foreground area and the number of people to the entire foreground image 221. Rows in Expression (2) is the total number of values of i, and Cols in Expression (2) is the total number of values of j.

The standardization unit 124 applies the standardization processing to all three-dimensional regions v_ij of the corrected density distribution 223 and outputs the result as the definite density distribution 224.
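Expressions (1) and (2) are not reproduced in this text. One plausible reading, used in the sketch below purely as an assumption, is that every h_ij is rescaled by the ratio of h_total to the sum of all h_ij over the Rows × Cols grid, so that the standardized distribution sums to h_total. The function name and the handling of an all-zero distribution are also choices made for this example.

```python
def standardize(corrected, h_total):
    """Rescale the corrected distribution so that its values sum to h_total.

    Assumed form: h_ij <- h_ij * h_total / sum of all h_ij.
    """
    apparent_total = sum(corrected.values())
    if apparent_total == 0:
        return dict(corrected)  # nothing to rescale
    scale = h_total / apparent_total
    return {ij: h * scale for ij, h in corrected.items()}


corrected = {(0, 0): 1.0, (0, 1): 3.0, (1, 0): 0.0}
final = standardize(corrected, h_total=2.0)
```

In this example the apparent total of 4.0 is double the frame-wide estimate of 2.0, so every cell is halved while the relative shape of the distribution is preserved.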

<<Distribution output processing>>
In step ST15, the distribution output unit 125 acquires the definite density distribution 224 from the standardization unit 124. The distribution output unit 125 converts the definite density distribution 224 into an output format and outputs it to the result output unit 130 as the crowd density distribution 225.

***Other configurations***
In the present embodiment, the functions of the video acquisition unit 110, the analysis unit 120, and the result output unit 130 are realized by software. As a modification, the functions of the video acquisition unit 110, the analysis unit 120, and the result output unit 130 may be realized by hardware.

FIG. 13 is a diagram illustrating the configuration of the crowd density calculation device 100 according to a modification of the present embodiment.
The crowd density calculation device 100 includes an electronic circuit 909, a memory 921, an auxiliary storage device 922, an input interface 930, an output interface 940, and a communication device 950.

The electronic circuit 909 is a dedicated electronic circuit that realizes the functions of the video acquisition unit 110, the analysis unit 120, and the result output unit 130.
Specifically, the electronic circuit 909 is a single circuit, a composite circuit, a programmed processor, a parallel-programmed processor, a logic IC, a GA, an ASIC, or an FPGA. GA is an abbreviation for Gate Array. ASIC is an abbreviation for Application Specific Integrated Circuit. FPGA is an abbreviation for Field-Programmable Gate Array.
The functions of the video acquisition unit 110, the analysis unit 120, and the result output unit 130 may be realized by one electronic circuit or distributed over a plurality of electronic circuits.
As another modification, some of the functions of the video acquisition unit 110, the analysis unit 120, and the result output unit 130 may be realized by an electronic circuit, and the remaining functions may be realized by software.

Each of the processor and the electronic circuit is also called processing circuitry. That is, in the crowd density calculation device 100, the functions of the video acquisition unit 110, the analysis unit 120, and the result output unit 130 are realized by processing circuitry.

In the crowd density calculation device 100, the “unit” in each of the video acquisition unit 110, the foreground extraction unit 121, the provisional density calculation unit 122, the presence determination unit 123, the standardization unit 124, the distribution output unit 125, and the result output unit 130 may be read as “step” or “processing”. Likewise, the “processing” in each of the video acquisition processing, foreground extraction processing, provisional density calculation processing, presence determination processing, standardization processing, distribution output processing, and result output processing may be read as “program”, “program product”, or “computer-readable storage medium on which a program is recorded”.

***Description of effects of the present embodiment***
In the crowd density calculation device 100 according to the present embodiment, the foreground extraction unit extracts a foreground image from a video frame of video captured by a camera. The provisional density calculation unit calculates the number of people present in each region obtained by projecting a region of the physical space onto the video frame, that is, the apparent number of people, and generates a provisional density distribution. The presence determination unit identifies regions of the physical space in which no person is present and corrects the provisional density distribution based on the result. The result output unit outputs the corrected and standardized provisional density distribution as the crowd density distribution to an output device such as a display.
Therefore, the crowd density calculation device 100 according to the present embodiment can calculate, from a video frame input from a camera, the positions of people in the real-world physical space and output them as a crowd density distribution.

Embodiment 2.
In the present embodiment, differences from Embodiment 1 will be described. Components that are the same as in Embodiment 1 are denoted by the same reference signs, and their description may be omitted.

In Embodiment 1, the three-dimensional regions v_ij of the plurality of three-dimensional regions overlap one another. Therefore, when the provisional density distribution is calculated, foreground may appear in three-dimensional regions other than the region v_ij in which the person is actually present, and the calculated provisional density distribution may become inaccurate. The present embodiment describes a mode that eliminates the influence of overlapping three-dimensional regions and makes the crowd density distribution more accurate.
In the present embodiment, the standardization unit 124 of Embodiment 1 is omitted, and a position correction unit 126 is newly added between the provisional density calculation unit 122 and the presence determination unit 123.

The detailed configuration of the analysis unit 120a according to the present embodiment will be described with reference to FIG. 14.
The analysis unit 120a includes the foreground extraction unit 121, the provisional density calculation unit 122, the position correction unit 126, the presence determination unit 123, and the distribution output unit 125.

The details of the analysis processing by the analysis unit 120a according to the present embodiment will be described with reference to FIG. 15.
In FIG. 15, steps ST11 and ST12 are the same as in Embodiment 1.
In step ST16, the position correction unit 126 acquires the provisional density distribution 222 from the provisional density calculation unit 122. The position correction unit 126 corrects the provisional density distribution 222 based on the number of people in the overlap regions, each of which represents the overlapping portion of adjacent three-dimensional spaces among the plurality of three-dimensional spaces, and outputs the corrected provisional density distribution 222. That is, the position correction unit 126 uses the provisional density distribution 222 to correct the positions of people while taking into account the influence of overlapping three-dimensional regions v_ij.

FIG. 16 is a diagram illustrating the position correction processing performed by the position correction unit 126 according to the present embodiment.
As shown in FIG. 16, the overlap region A_dup_lm is the region in which the three-dimensional region v_l and the three-dimensional region v_m overlap.
As shown in FIG. 16, the position correction processing by the position correction unit 126 acts like a kind of filter that sharpens values that have spread to the surrounding cells.
The subscripts l and m are the two-variable subscript ij rewritten as a single variable for convenience. The subscripts l and m are used in the rest of the description of the present embodiment. The relationship between the subscript l and ij is l = i × Cols + j. The same applies to the relationship between the subscript m and ij.

Here, let h_t be the vector of the numbers of people in the regions ΔS_l that are to be obtained, let A be the coefficient matrix indicating the overlap relationship between cells, and let h be the vector of the output obtained as the provisional density distribution 222. The number of people in each region ΔS_l can be calculated by Expressions (3) and (4).
By using Expressions (3) and (4), the position correction unit 126 can output a highly accurate crowd density distribution 225 from which the influence of the overlap regions has been removed.
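Recovering the actual head counts from the apparent ones amounts to solving the linear system A h_t = h. The sketch below does this with Gaussian elimination instead of forming the inverse of A explicitly; this is a numerical substitution chosen for the example and gives the same result when A is invertible. The 2 × 2 matrix and all names are hypothetical.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [A | b], copied so the inputs are not modified.
    M = [[float(v) for v in A[r]] + [float(b[r])] for r in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("coefficient matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - known) / M[r][r]
    return x


# Hypothetical two-cell example: 2 and 4 people actually present.
A = [[1.0, 0.5],
     [0.25, 1.0]]
h_apparent = [4.0, 4.5]   # equals A applied to [2.0, 4.0]
h_t = solve_linear(A, h_apparent)
```

The apparent counts overstate both cells because each cell also sees part of its neighbor; solving the system removes that spill-over.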

The derivation of Expression (3) is described in detail below. Let h_tl be the number of people present in the three-dimensional space V_l, and let h_l be the number of people appearing in the three-dimensional region v_l. Let h_com_l→m be the number of people, out of the h_tl people present in the three-dimensional space V_l, who appear in the three-dimensional region v_m. The apparent number of people h_l appearing in the three-dimensional region v_l can then be expressed by Expression (5). h_com_l→m can be expressed as the number of people h_tl present in cell l multiplied by a coefficient α_lm. The coefficient matrix representing the overlap relationship between cells is therefore given by Expression (4). Setting α_lm = 1 when l = m, Expressions (5) and (6) show that the apparent number of people can be expressed by Expression (7), in which the numbers of people actually present are multiplied by coefficients.

Letting N be the total number of values of l, Expression (7) yields N simultaneous linear equations in N unknowns, which can be rewritten in matrix form as Expression (8). Expression (3) is derived by multiplying both sides of Expression (8) from the left by the inverse of the coefficient matrix A.
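The numbered expressions themselves are not reproduced in this text. One reading that is consistent with the verbal derivation above is sketched here; the exact form of each expression, and in particular the placement of the indices in A, are assumptions.

```latex
\begin{align}
h_l &= h_{tl} + \sum_{m \neq l} h_{\mathrm{com}\_m \to l} \tag{5}\\
h_{\mathrm{com}\_l \to m} &= \alpha_{lm}\, h_{tl} \tag{6}\\
h_l &= \sum_{m} \alpha_{ml}\, h_{tm}, \qquad \alpha_{ll} = 1 \tag{7}\\
\mathbf{h} &= A\,\mathbf{h}_t, \qquad (A)_{lm} = \alpha_{ml} \tag{8}\\
\mathbf{h}_t &= A^{-1}\,\mathbf{h} \tag{3}
\end{align}
```

Under this reading, row l of A collects the contributions that every space V_m makes to the apparent count in region v_l, and the sums run over all N cells.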

The method of obtaining the coefficient α_lm is not limited. For example, the coefficient α_lm may be obtained as in Expression (9), using the area A_l of the three-dimensional region v_l shown in FIG. 15 and the overlap region A_dup_lm between the three-dimensional region v_l and the three-dimensional region v_m.
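Expression (9) is not reproduced in this text. A natural candidate, assumed in the sketch below, is the ratio of the overlap area to the area of region v_l, that is, α_lm = A_dup_lm / A_l with α_ll = 1. The sketch also approximates each region by an axis-aligned rectangle on the frame, which is a simplification made only for this example.

```python
def rect_area(rect):
    """Area of rect = (top, left, bottom, right), end-exclusive; 0 if empty."""
    top, left, bottom, right = rect
    return max(0, bottom - top) * max(0, right - left)


def overlap_area(a, b):
    """Area of the intersection of two rectangles."""
    return rect_area((max(a[0], b[0]), max(a[1], b[1]),
                      min(a[2], b[2]), min(a[3], b[3])))


def overlap_coefficients(regions):
    """Return alpha[l][m] = overlap(v_l, v_m) / area(v_l), with alpha[l][l] = 1."""
    n = len(regions)
    alpha = [[0.0] * n for _ in range(n)]
    for l in range(n):
        area_l = rect_area(regions[l])
        if area_l == 0:
            raise ValueError("region %d has zero area" % l)
        for m in range(n):
            if l == m:
                alpha[l][m] = 1.0
            else:
                alpha[l][m] = overlap_area(regions[l], regions[m]) / area_l
    return alpha


regions = [(0, 0, 4, 4), (2, 0, 6, 2)]   # hypothetical projected regions
alpha = overlap_coefficients(regions)
```

The coefficients are generally not symmetric: a small region that lies mostly inside a large one shares a larger fraction of its own area than the large region does.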

The coefficient matrix A is calculated using Expression (9) only once, in the analysis parameter reading processing of step ST01.
The subsequent steps ST13 and ST15 are the same as in Embodiment 1.

As described above, the crowd density calculation device using the analysis unit 120a according to the present embodiment can calculate the crowd density distribution with higher accuracy by removing the influence of the overlap regions.

Embodiment 3.
In the present embodiment, differences from Embodiment 2 will be described. Components that are the same as in Embodiments 1 and 2 are denoted by the same reference signs, and their description may be omitted.

In Embodiment 2, the coefficient matrix A indicating the overlap relationship, calculated using Expression (9), assumes that foreground appears uniformly over each three-dimensional region v_ij. In practice, the overlap relationship is affected by where a person is actually located within the three-dimensional space V_ij. Depending on that location, the calculated crowd density distribution may therefore contain an error. In the present embodiment, the coefficient matrix A is optimized by numerical calculation. The total number of people in the screen is used as the total number of people h_total of the crowd density distribution at that time. The total number of people h_total in the screen is calculated by applying the relational expression between the foreground area and the number of people to the entire foreground image.

The details of the analysis processing according to the present embodiment will be described with reference to FIG. 17.
In FIG. 17, steps ST11, ST12, and ST16 are the same as in Embodiments 1 and 2.

In step ST17, the position correction unit 126 optimizes the coefficient matrix A by recalculating it. The position correction unit 126 repeats the correction of the provisional density distribution until the error in the total number of people in the video frame becomes equal to or smaller than a threshold.

Expressions (2) and (7) are used to calculate h'_total.

The evaluation function for the correction of the provisional density distribution is defined by Expression (10). The position correction unit 126 repeats the calculation using the steepest descent method until the error E calculated by Expression (10) becomes equal to or smaller than the threshold.

The optimization method used by the position correction unit 126 is not limited to the steepest descent method.

As described above, the crowd density calculation device according to the present embodiment updates the coefficient matrix A every frame using Expression (10). The crowd density calculation device according to the present embodiment can therefore calculate the crowd density distribution with higher accuracy than in Embodiment 2.

In Embodiments 1 to 3 above, each unit of the crowd density calculation device has been described as an independent functional block. However, the crowd density calculation device need not be configured as in the embodiments described above. The functional blocks of the crowd density calculation device may have any configuration as long as they can realize the functions described in the embodiments. The crowd density calculation device may also be a system composed of a plurality of devices rather than a single device.
A plurality of parts of Embodiments 1 to 3 may be implemented in combination. Alternatively, only one part of these embodiments may be implemented. These embodiments may otherwise be implemented in any combination, in whole or in part.
That is, in Embodiments 1 to 3, the embodiments may be freely combined, any component of each embodiment may be modified, and any component of each embodiment may be omitted.

The embodiments described above are essentially preferred examples and are not intended to limit the scope of the present invention, the scope of its applications, or the scope of its uses. Various changes can be made to the embodiments described above as necessary. The crowd density calculation device according to the embodiments described above can be applied to a crowd density estimation device and a crowd density estimation system that estimate the density of a crowd.

Reference Signs List: 21 video stream, 22 video frame, 100 crowd density calculation device, 110 video acquisition unit, 120, 120a analysis unit, 121 foreground extraction unit, 122 provisional density calculation unit, 123 existence determination unit, 124 standardization unit, 125 distribution output unit, 126 position correction unit, 130 result output unit, 140 storage unit, 141 analysis parameter, 142 relational expression, 143 level threshold, 200 camera, 221 foreground image, 222 provisional density distribution, 223 corrected density distribution, 224 fixed density distribution, 225 crowd density distribution, 909 electronic circuit, 910 processor, 921 memory, 922 auxiliary storage device, 930 input interface, 940 output interface, 950 communication device, S100 crowd density calculation processing.

Claims

A crowd density calculation device comprising:
a video acquisition unit that acquires a video frame from a video stream in which a person is imaged; and
an analysis unit that associates three-dimensional coordinates with the video frame, acquires, as each of a plurality of three-dimensional regions, a region on the video frame representing each of a plurality of three-dimensional spaces obtained based on the three-dimensional coordinates, and calculates a density distribution of people in the video frame as a crowd density distribution based on the number of people present in each of the plurality of three-dimensional regions.

The crowd density calculation device according to claim 1, wherein the analysis unit comprises:
a foreground extraction unit that extracts an image of a person in the video frame as a foreground image;
a provisional density calculation unit that calculates, based on the foreground image, the number of people apparently present in each of the plurality of three-dimensional regions as a provisional density distribution; and
an existence determination unit that determines whether a person is present in each of the plurality of three-dimensional spaces, the existence determination unit outputting, as a corrected density distribution, the provisional density distribution in which the number of people in each three-dimensional region corresponding to a three-dimensional space determined to contain no person is corrected to 0.
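As a minimal sketch of the correction recited in this claim, the following Python function takes a provisional density distribution and a list of presence flags, and sets the count to 0 for every three-dimensional region whose space was judged to contain no person. The function name `correct_density`, the list representation, and the sample values are assumptions made for illustration and do not come from the specification.

```python
from typing import List


def correct_density(provisional: List[float], present: List[bool]) -> List[float]:
    """Return a corrected density distribution (illustrative sketch).

    provisional[i] is the apparent number of people in region i.
    present[i] is True when a person was judged to exist in the
    three-dimensional space that region i represents.
    """
    if len(provisional) != len(present):
        raise ValueError("one presence flag is required per region")
    # Keep the count where a person exists; force it to 0 elsewhere.
    return [count if exists else 0.0
            for count, exists in zip(provisional, present)]


# Hypothetical sample values: region 1 was judged to be empty.
corrected = correct_density([1.5, 0.4, 2.0], [True, False, True])
print(corrected)  # [1.5, 0.0, 2.0]
```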

The crowd density calculation device according to claim 2, wherein
each of the plurality of three-dimensional regions comprises, for a case where a person is standing in the corresponding three-dimensional space, a head region corresponding to the head of the person and a ground region corresponding to the ground on which the person stands, and
the existence determination unit determines that a person is present in the three-dimensional space corresponding to a three-dimensional region when a person is present in both the head region and the ground region of that three-dimensional region.
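The head-and-ground test in this claim can be sketched as follows. The claim does not state how presence within a single region is decided, so this sketch assumes a binary foreground mask and treats a region as occupied when its share of foreground pixels reaches a hypothetical `min_ratio`. The rectangle representation, the names `foreground_ratio` and `person_exists`, and the threshold value are all assumptions for illustration.

```python
from typing import List, Tuple

# (top, left, bottom, right); bottom and right are exclusive. Assumed layout.
Rect = Tuple[int, int, int, int]


def foreground_ratio(mask: List[List[int]], rect: Rect) -> float:
    """Fraction of pixels inside rect that are foreground (value 1)."""
    top, left, bottom, right = rect
    area = (bottom - top) * (right - left)
    if area <= 0:
        return 0.0
    hits = sum(mask[y][x] for y in range(top, bottom) for x in range(left, right))
    return hits / area


def person_exists(mask: List[List[int]], head: Rect, ground: Rect,
                  min_ratio: float = 0.5) -> bool:
    """True only when both the head region and the ground region are occupied."""
    return (foreground_ratio(mask, head) >= min_ratio
            and foreground_ratio(mask, ground) >= min_ratio)


# Hypothetical 4x4 mask: a full-height silhouette in the two middle columns.
standing = [[0, 1, 1, 0]] * 4
# Same silhouette with the bottom row cleared (feet not visible on the ground).
no_feet = [[0, 1, 1, 0]] * 3 + [[0, 0, 0, 0]]

head_rect, ground_rect = (0, 1, 1, 3), (3, 1, 4, 3)
print(person_exists(standing, head_rect, ground_rect))  # True
print(person_exists(no_feet, head_rect, ground_rect))   # False
```

Requiring both regions means that foreground covering only the head region, for example part of a person who actually stands in a neighbouring space, does not count as presence in this space.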

The crowd density calculation device according to claim 2 or 3, wherein the analysis unit comprises:
a standardization unit that acquires the total number of people in the video frame based on the foreground image and standardizes, based on the total number of people, the number of people in each of the plurality of three-dimensional regions in the corrected density distribution; and
a distribution output unit that acquires, from the standardization unit, the standardized corrected density distribution as a fixed density distribution and converts the fixed density distribution into an output format.
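One way to read the standardization step is as a proportional rescaling so that the per-region counts sum to the total number of people estimated for the frame. The claim does not fix the formula, so the proportional scaling below, the function name `standardize`, and the sample values are assumptions for illustration only.

```python
from typing import List


def standardize(corrected: List[float], total_people: float) -> List[float]:
    """Rescale per-region counts so that they sum to total_people.

    Proportional scaling is an assumed interpretation of "standardize".
    """
    current = sum(corrected)
    if current == 0:
        # Nothing to distribute the total over; return an all-zero distribution.
        return [0.0 for _ in corrected]
    scale = total_people / current
    return [count * scale for count in corrected]


# Hypothetical values: the counts sum to 4.0 but the frame total is 8.0.
fixed = standardize([1.5, 0.0, 2.5], 8.0)
print(fixed)  # [3.0, 0.0, 5.0]
```

Regions corrected to 0 stay at 0 under this scaling, so the result of the existence determination is preserved.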

The crowd density calculation device according to claim 2 or 3, wherein the analysis unit comprises a position correction unit that corrects the provisional density distribution based on the number of people in an overlapping region, which represents the overlapping portion between adjacent three-dimensional regions among the plurality of three-dimensional regions, and outputs the corrected provisional density distribution.

The crowd density calculation device according to claim 5, wherein the position correction unit repeats the correction of the provisional density distribution until the error in the total number of people in the video frame becomes equal to or smaller than a threshold.

A crowd density calculation method comprising:
acquiring, by a video acquisition unit, a video frame from a video stream in which a person is imaged; and
by an analysis unit, associating three-dimensional coordinates with the video frame, acquiring, as each of a plurality of three-dimensional regions, a region on the video frame representing each of a plurality of three-dimensional spaces obtained based on the three-dimensional coordinates, and calculating a crowd density distribution, which is a density distribution of people in the video frame, based on the number of people present in each of the plurality of three-dimensional regions.

A crowd density calculation program that causes a computer to execute:
a video acquisition process of acquiring a video frame from a video stream in which a person is imaged; and
an analysis process of associating three-dimensional coordinates with the video frame, acquiring, as each of a plurality of three-dimensional regions, a region on the video frame representing each of a plurality of three-dimensional spaces obtained based on the three-dimensional coordinates, and calculating a crowd density distribution, which is a density distribution of people in the video frame, based on the number of people present in each of the plurality of three-dimensional regions.