WO2019087383A1 - Crowd density calculation device, crowd density calculation method and crowd density calculation program - Google Patents


Info

Publication number
WO2019087383A1
WO2019087383A1 (PCT/JP2017/039901)
Authority
WO
WIPO (PCT)
Prior art keywords
dimensional
unit
density distribution
people
video frame
Prior art date
Application number
PCT/JP2017/039901
Other languages
French (fr)
Japanese (ja)
Inventor
士人 新井
亮史 服部
奥村 誠司
Original Assignee
Mitsubishi Electric Corporation (三菱電機株式会社)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Mitsubishi Electric Corporation
Priority to PCT/JP2017/039901 priority Critical patent/WO2019087383A1/en
Priority to CN201780096261.XA priority patent/CN111279392B/en
Priority to JP2019550118A priority patent/JP6678835B2/en
Priority to SG11202002953YA priority patent/SG11202002953YA/en
Publication of WO2019087383A1 publication Critical patent/WO2019087383A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/60 Analysis of geometric attributes

Definitions

  • the present invention relates to a crowd density calculation device, a crowd density calculation method, and a crowd density calculation program.
  • There are techniques for estimating the number of people, or the density of people, from camera images.
  • Techniques for estimating the number of people from camera images include methods that count people based on human detection and methods that estimate the number of people from the foreground area.
  • With the method based on human detection, the number of people can be estimated with high accuracy when the crowd density is low.
  • However, its amount of computation increases as the number of people increases.
  • Moreover, as the number of people increases, so does the crowd density, and the estimation accuracy decreases due to occlusion between persons, that is, concealment.
  • The method based on the foreground area is inferior in estimation accuracy to the method based on human detection.
  • On the other hand, its amount of computation does not change even when the crowd density is high.
  • The technique for estimating the density of people is equivalent to estimating the number of people for each arbitrary area of the video frame.
  • Patent Document 1 and Patent Document 2 disclose techniques that acquire an image capturing a crowd, treat the foreground extracted by background subtraction as the person area, and estimate the number of persons in the screen from the area of that person area.
  • A load value is calculated that quantitatively represents how much each pixel in the image contributes to the number of people.
  • The load value is calculated from the apparent volume of the object in the image. This solves the problem that the foreground area per unit number of people varies with depth, and makes it possible to estimate the number of people even in images with depth.
  • In Patent Document 1 and Patent Document 2, the number of people present in a video frame and the density of people in an arbitrary area of the video frame are calculated. However, these techniques do not estimate the positions of people in the physical space of the real world. This is because Patent Document 1 and Patent Document 2 map points in the physical space to points on the video frame, but not the reverse.
  • An object of the present invention is to calculate the positions at which a crowd exists in the physical space of the real world from a video frame, and to output them as the density distribution of the crowd.
  • The crowd density calculation device includes: a video acquisition unit that acquires a video frame from a video stream in which people are imaged; and an analysis unit that associates three-dimensional coordinates with the video frame, acquires, as each three-dimensional region of a plurality of three-dimensional regions, the region on the video frame representing each three-dimensional space of a plurality of three-dimensional spaces obtained based on the three-dimensional coordinates, and calculates the density distribution of persons in the video frame as a crowd density distribution based on the number of people present in each of the plurality of three-dimensional regions.
  • The analysis unit associates three-dimensional coordinates with the video frame, and acquires, as each three-dimensional region of a plurality of three-dimensional regions, the region on the video frame representing each three-dimensional space of a plurality of three-dimensional spaces obtained based on the three-dimensional coordinates. Further, the analysis unit calculates the crowd density distribution in the video frame based on the number of people present in each of the plurality of three-dimensional regions. Therefore, according to the crowd density calculation device of the present invention, the crowd density distribution in the physical space of the real world can be grasped quantitatively from the video frame.
  • FIG. 1 is a configuration diagram of a crowd density calculation device according to Embodiment 1.
  • FIG. 2 is a detailed configuration diagram of an analysis unit according to Embodiment 1.
  • FIG. 3 is a diagram explaining the definition of the crowd density distribution.
  • FIG. 4 is a diagram showing an image of the crowd density distribution when the sizes of ΔX and ΔY are constant.
  • FIG. 5 is a diagram showing an image of projecting points in the physical space of the real world onto the video frame coordinate system.
  • FIG. 6 is a flowchart of crowd density calculation processing according to Embodiment 1.
  • FIG. 7 is a flowchart of analysis processing according to Embodiment 1.
  • FIG. 8 is a diagram showing an image of converting the foreground area into a number of people for each three-dimensional region.
  • FIG. 9 is a diagram showing an image of outputting a provisional density distribution from the number of people for each three-dimensional region.
  • FIG. 12 is a diagram showing an image of presence determination processing according to Embodiment 1.
  • FIG. 8 is a detailed configuration diagram of an analysis unit according to a second embodiment.
  • FIG. 7 is a flow diagram of analysis processing according to the second embodiment.
  • FIG. 8 is a view showing an image of position correction processing according to the second embodiment.
  • FIG. 14 is a flowchart of analysis processing according to the third embodiment.
  • the crowd density calculation device 100 is a computer.
  • the crowd density calculation device 100 includes a processor 910 and other hardware such as a memory 921, an auxiliary storage device 922, an input interface 930, an output interface 940, and a communication device 950.
  • the processor 910 is connected to other hardware via signal lines to control these other hardware.
  • the crowd density calculation device 100 includes an image acquisition unit 110, an analysis unit 120, a result output unit 130, and a storage unit 140 as functional elements.
  • the storage unit 140 stores analysis parameters 141 used in analysis processing by the analysis unit 120.
  • the functions of the image acquisition unit 110, the analysis unit 120, and the result output unit 130 are realized by software.
  • the storage unit 140 is provided in the memory 921.
  • the storage unit 140 may be included in the auxiliary storage device 922.
  • the processor 910 is a device that executes a crowd density calculation program.
  • the crowd density calculation program is a program for realizing the functions of the image acquisition unit 110, the analysis unit 120, and the result output unit 130.
  • the processor 910 is an IC (Integrated Circuit) that performs arithmetic processing. Specific examples of the processor 910 are a CPU, a digital signal processor (DSP), and a graphics processing unit (GPU).
  • the memory 921 is a storage device that temporarily stores data.
  • a specific example of the memory 921 is a static random access memory (SRAM) or a dynamic random access memory (DRAM).
  • the auxiliary storage device 922 is a storage device for storing data.
  • a specific example of the auxiliary storage device 922 is an HDD.
  • the auxiliary storage device 922 may also be a portable storage medium such as an SD (registered trademark) memory card, a CF, a NAND flash, a flexible disk, an optical disk, a compact disk, a Blu-ray (registered trademark) disk, and a DVD.
  • HDD is an abbreviation of Hard Disk Drive.
  • SD (registered trademark) is an abbreviation of Secure Digital.
  • CF is an abbreviation of Compact Flash.
  • DVD is an abbreviation of Digital Versatile Disk.
  • the input interface 930 is a port connected to an input device such as a mouse, a keyboard, or a touch panel. Also, the input interface 930 may be a port connected to the camera 200. Specifically, the input interface 930 is a USB (Universal Serial Bus) terminal. The input interface 930 may be a port connected to a LAN (Local Area Network). The crowd density calculation apparatus 100 may acquire the video stream 21 from the camera 200 via the input interface 930.
  • the output interface 940 is a port to which a cable of an output device such as a display is connected.
  • the output interface 940 is a USB terminal or an HDMI (registered trademark) (High Definition Multimedia Interface) terminal.
  • the display is specifically an LCD (Liquid Crystal Display).
  • the crowd density calculation apparatus 100 displays the analysis result output by the result output unit 130 on the display via the output interface 940.
  • Communication device 950 communicates with other devices via a network.
  • Communication device 950 has a receiver and a transmitter.
  • the communication device 950 is connected to a communication network such as a LAN, the Internet, or a telephone line by wire or wirelessly.
  • the communication device 950 is specifically a communication chip or a NIC (Network Interface Card).
  • the crowd density calculation device 100 may receive the video stream 22 from the camera 200 via the communication device 950. In addition, the crowd density calculation device 100 may transmit the analysis result output by the result output unit 130 to an external device via the communication device 950.
  • the crowd density calculation program is read into the processor 910 and executed by the processor 910.
  • the memory 921 stores not only the crowd density calculation program but also an operating system (OS).
  • the processor 910 executes the crowd density calculation program while executing the OS.
  • the crowd density calculation program and the OS may be stored in the auxiliary storage device 922.
  • the crowd density calculation program and the OS stored in the auxiliary storage device 922 are loaded into the memory 921 and executed by the processor 910. Note that part or all of the crowd density calculation program may be incorporated into the OS.
  • the crowd density calculation apparatus 100 may include a plurality of processors that replace the processor 910.
  • the plurality of processors share execution of the crowd density calculation program.
  • Each processor is an apparatus which executes a crowd density calculation program in the same manner as the processor 910.
  • Data, information, signal values, and variable values used, processed or output by the crowd density calculation program are stored in the memory 921, the auxiliary storage device 922, or a register or cache memory in the processor 910.
  • The "unit" of each of the video acquisition unit 110, the analysis unit 120, and the result output unit 130 may be read as the "process", "procedure", or "step" of that unit, and the crowd density calculation program causes a computer to execute each of these processes, procedures, or steps.
  • the crowd density calculation method is a method performed by the crowd density calculation device 100 executing a crowd density calculation program.
  • The crowd density calculation program may be stored in a computer-readable recording medium and provided. The crowd density calculation program may also be provided as a program product.
  • The analysis unit 120 includes a foreground extraction unit 121, a provisional density calculation unit 122, a presence determination unit 123, a standardization unit 124, and a distribution output unit 125. That is, the "units" of the crowd density calculation device 100 are the video acquisition unit 110, the foreground extraction unit 121, the provisional density calculation unit 122, the presence determination unit 123, the standardization unit 124, the distribution output unit 125, and the result output unit 130.
  • the outline of each functional element of the crowd density calculation device 100 will be described using FIGS. 1 and 2.
  • the crowd density calculation device 100 is connected to a camera 200 that captures an object and delivers it as a video stream 21.
  • the object is specifically a person. That is, the video stream 21 is a crowd video.
  • the video acquisition unit 110 acquires the video stream 21 distributed from the camera 200 via the input interface 930.
  • the video acquisition unit 110 acquires the video frame 22 from the video stream 21.
  • the video acquisition unit 110 decodes the video stream 21 and converts it into a video frame 22.
  • the analysis unit 120 associates the video frame 22 with three-dimensional coordinates.
  • The analysis unit 120 acquires, as each three-dimensional region of a plurality of three-dimensional regions, the region on the video frame 22 representing each three-dimensional space of a plurality of three-dimensional spaces obtained based on the three-dimensional coordinates. Then, the analysis unit 120 calculates the crowd density distribution in the video frame 22 based on the number of people present in each of the plurality of three-dimensional regions. That is, using the video frame 22, the analysis unit 120 calculates the positions of the crowd in the three-dimensional coordinates of the physical space as the crowd density distribution 225.
  • the result output unit 130 outputs the crowd density distribution 225 output from the analysis unit 120 to an output device such as a display via the output interface 940.
  • the foreground extraction unit 121 extracts a portion having foreground features from the video frame 22 as a foreground image 221.
  • The provisional density calculation unit 122 calculates a provisional density distribution 222, which is the density distribution of the apparent crowd for each position in the physical space, using the foreground image 221 and the analysis parameters 141 stored in the storage unit 140.
  • The presence determination unit 123 corrects the provisional density distribution 222 by determining positions in the physical space where no person is present.
  • The presence determination unit 123 outputs the corrected provisional density distribution 222 as a corrected density distribution 223.
  • The standardization unit 124 standardizes the corrected density distribution 223 using the total number of people present in the image represented by the video frame 22, and outputs a definite density distribution 224.
  • The distribution output unit 125 converts the definite density distribution 224 into an output format and finally outputs it as the crowd density distribution 225.
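The chain of functional units described above can be sketched as follows. This is an assumed structure for illustration only, not the patent's implementation; the function names are hypothetical stand-ins, and only the data flow (foreground image 221 → provisional density 222 → corrected density 223 → definite density 224 → crowd density distribution 225) mirrors the text.

```python
# Hypothetical sketch of the analysis unit 120 pipeline. Each stage is
# passed in as a callable; only the order of the stages follows the text.
def analyze(frame,
            extract_foreground,      # foreground extraction unit 121
            provisional_density,     # provisional density calculation unit 122
            presence_correction,     # presence determination unit 123
            standardize,             # standardization unit 124
            to_output_format):       # distribution output unit 125
    foreground = extract_foreground(frame)          # foreground image 221
    provisional = provisional_density(foreground)   # provisional density 222
    corrected = presence_correction(provisional)    # corrected density 223
    definite = standardize(corrected, foreground)   # definite density 224
    return to_output_format(definite)               # crowd density distribution 225

# Trivial stand-ins just to exercise the data flow:
result = analyze(
    frame=[[0, 1], [1, 0]],
    extract_foreground=lambda f: f,
    provisional_density=lambda fg: {(0, 0): sum(map(sum, fg))},
    presence_correction=lambda d: d,
    standardize=lambda d, fg: d,
    to_output_format=lambda d: d,
)
```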
  • The crowd density distribution is defined as the distribution of the number of people in the physical space of the real world.
  • The physical space coordinate system X_r-Y_r-Z_r is the three-dimensional coordinate system of the real world.
  • On the X_r-Y_r plane, which corresponds to the ground in the real world, the area surrounded by the width ΔX and the depth ΔY at the position (X_i, Y_j) is denoted ΔS_ij.
  • The region of three-dimensional space enclosed by a prism of height H with ΔS_ij as its bottom face is the three-dimensional space V_ij.
  • The number of people present in the three-dimensional space V_ij is h_ij.
  • The crowd density distribution is obtained by arranging the numbers of people h_ij corresponding to ΔS_ij over all analysis regions.
  • the height H is about the height of a person.
  • FIG. 4 is a diagram showing an image of the crowd density distribution when the magnitudes of ⁇ X and ⁇ Y are constant.
  • The crowd density distribution in the case where one person exists in the area ΔS_02 and one person exists at the midpoint between the area ΔS_10 and the area ΔS_20 is as follows.
  • The crowd density distribution at this time has one person at the position of ΔS_02, 0.5 person at the position of ΔS_10, 0.5 person at the position of ΔS_20, and 0 people at every other position ΔS_ij.
  • the magnitudes of ⁇ X and ⁇ Y are not defined.
  • the magnitudes of ⁇ X and ⁇ Y may be variable.
  • FIG. 5 is a view showing an image of projecting points in the physical space of the real world onto the video frame coordinate system. That is, FIG. 5 is an image in which three-dimensional coordinates are associated with the video frame.
  • Let P_gij (X_ij, Y_ij, 0) be a point on the ground in the physical space of the real world.
  • The point P_gij on the ground and the point P_hij on the plane at height H are projected onto the video frame coordinate system x_img-y_img as p_gij and p_hij.
  • The area obtained by projecting the three-dimensional space V_ij in the physical space onto the video frame coordinate system is defined as the three-dimensional region v_ij.
  • The three-dimensional region v_ij is the two-dimensional area indicated by oblique lines in FIG. 5. That is, the three-dimensional region v_ij is the two-dimensional area bounded by the outer periphery of the three-dimensional space V_ij when V_ij is represented on the video frame.
  • The three-dimensional region v_ij is also called a prismatic region or a rectangular parallelepiped region.
  • Each three-dimensional region v_ij of the plurality of three-dimensional regions includes a head area corresponding to the head of a person and a ground area corresponding to the ground on which the person stands. The head area is the area surrounded by p_hij, p_hi+1j, p_hij+1, and p_hi+1j+1, corresponding to the height position of the head in the three-dimensional region v_ij.
  • The ground area is the area surrounded by p_gij, p_gi+1j, p_gij+1, and p_gi+1j+1, corresponding to the position of the ground in the three-dimensional region v_ij.
  • The information correlating coordinates in the physical space with coordinates in the video frame may be a coordinate conversion formula, or may be a set of pairs of corresponding physical space coordinates and video frame coordinates. Each P_gij or each P_hij need not lie on the same plane. As long as the three-dimensional space V_ij can be defined, the surface represented by the points P_gij or the points P_hij may be a curved surface or stepped.
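One possible form of the coordinate conversion formula mentioned above is a pinhole projection matrix. The patent only requires *some* correspondence (a formula or a coordinate-pair table); the camera intrinsics, pose, and point values below are hypothetical and the orientation is chosen purely for simplicity.

```python
import numpy as np

def project(P_matrix, point):
    """Project a 3-D physical-space point onto the video frame coordinate
    system x_img-y_img using a 3x4 projection matrix (homogeneous divide)."""
    X, Y, Z = point
    x, y, w = P_matrix @ np.array([X, Y, Z, 1.0])
    return x / w, y / w

# Hypothetical camera: focal length 1000 px, principal point (640, 360),
# axis-aligned, placed 10 units from the ground plane.
K = np.array([[1000.0, 0.0, 640.0],
              [0.0, 1000.0, 360.0],
              [0.0, 0.0, 1.0]])
Rt = np.array([[1.0, 0.0, 0.0, 0.0],
               [0.0, 1.0, 0.0, 0.0],
               [0.0, 0.0, 1.0, 10.0]])
P_mat = K @ Rt

# A ground point P_gij (Z = 0) and the point P_hij above it at height H = 1.7
# project to p_gij and p_hij on the frame.
p_g = project(P_mat, (1.0, 2.0, 0.0))
p_h = project(P_mat, (1.0, 2.0, 1.7))
```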
  • the crowd density calculation device 100 reads the analysis parameter 141 into the storage unit 140.
  • the analysis parameter 141 may be stored in the auxiliary storage device 922 or may be input from the outside via the input interface 930 or the communication device 950.
  • the analysis parameter 141 read is used by the analysis unit 120.
  • the video acquisition unit 110 stands by to receive the video stream 21 from the camera 200.
  • the video acquisition unit 110 decodes at least one frame of the received video stream 21.
  • The video stream to be received is, for example, video coded data compressed by a video compression coding method and delivered over IP using a video delivery protocol.
  • Specific examples of the video compression coding method are H.262/MPEG-2 Video, H.264/AVC, H.265/HEVC, and JPEG.
  • Specific examples of the video delivery protocol are MPEG-2 TS, RTP/RTSP, MMT, and DASH.
  • MPEG-2 TS is an abbreviation of Moving Picture Experts Group 2 Transport Stream.
  • RTP / RTSP is an abbreviation of Real-time Transport Protocol / Real Time Streaming Protocol.
  • MMT is an abbreviation of MPEG Media Transport.
  • DASH is an abbreviation of Dynamic Adaptive Streaming over HTTP.
  • The video stream to be received may use an encoding or delivery format other than the above, or an uncompressed transmission standard such as SDI or HD-SDI.
  • SDI is an abbreviation of Serial Digital Interface.
  • HD-SDI is an abbreviation of High Definition-Serial Digital Interface.
  • the analysis unit 120 acquires the video frame 22 from the video acquisition unit 110.
  • the analysis unit 120 analyzes the video frame 22 using the analysis parameter 141.
  • the analysis unit 120 analyzes the video frame 22 to calculate the crowd density distribution of the crowd shown in the video frame 22. Further, the analysis unit 120 converts the calculated crowd density distribution into an output format, and outputs it as a crowd density distribution 225.
  • the result output unit 130 outputs the crowd density distribution 225 output from the analysis unit 120 to the outside of the crowd density calculation device 100 via the output interface 940.
  • the output format include display on a monitor, output to a log file, output to an externally connected device, or transmission to a network.
  • the result output unit 130 may output the crowd density distribution 225 in a format other than the above.
  • the result output unit 130 may output the crowd density distribution 225 to the outside each time the crowd density distribution 225 is output from the analysis unit 120.
  • The result output unit 130 may perform intermittent output, such as outputting only after aggregation or statistical processing of the crowd density distributions 225 over a specific period or a specific number of frames.
  • the crowd density calculation device 100 returns to step ST02 to process the next video frame 22.
  • the foreground extraction unit 121 extracts an image of a person in the video frame 22, that is, the foreground, as the foreground image 221.
  • the foreground extraction unit 121 outputs the foreground image 221 to the temporary density calculation unit 122.
  • FIG. 8 is a diagram showing an image of converting the foreground area into a number of people for each three-dimensional region.
  • FIG. 9 is a diagram of outputting a provisional density distribution from the number of people for each three-dimensional region.
  • images of two people are extracted as the foreground image 221.
  • One extraction method is a background subtraction method, in which a background image is registered in advance and the difference from the input image is calculated.
  • Another is an adaptive background subtraction method, in which the background image is automatically updated from continuously input video frames using a model such as MOG (Mixture of Gaussians).
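The simple background subtraction method described above can be sketched in a few lines. This is a minimal illustration, not the patent's extractor: the threshold value and array shapes are assumptions, and a real system would more likely use an adaptive model such as MOG.

```python
import numpy as np

def extract_foreground(frame, background, threshold=30):
    """Return a boolean foreground mask: True where the absolute difference
    between the input frame and the registered background exceeds a threshold."""
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16))
    return diff > threshold

background = np.full((4, 4), 100, dtype=np.uint8)   # registered background image
frame = background.copy()
frame[1:3, 1:3] = 200                               # a bright "person" region
mask = extract_foreground(frame, background)        # foreground image 221 (as a mask)
```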
  • The provisional density calculation unit 122 calculates, as the provisional density distribution 222, the number of people apparently present in each three-dimensional region of the plurality of three-dimensional regions. Specifically, after the points in the physical space are projected onto the video frame coordinate system, the provisional density calculation unit 122 calculates the number of people present in each three-dimensional region v_ij using the foreground image 221 and the relational expression 142. The provisional density calculation unit 122 uses provisional crowd density estimation as the method of calculating the number of people present in each three-dimensional region v_ij.
  • The provisional density calculation unit 122 counts the foreground area in each three-dimensional region v_ij of the video frame 22. Based on the foreground area in each three-dimensional region v_ij, the provisional density calculation unit 122 computes the number of people in that region. At this time, the provisional density calculation unit 122 calculates the number of people in each three-dimensional region v_ij using the relational expression 142 between foreground area and number of people, obtained in advance.
  • The provisional density calculation unit 122 takes the number of people in each three-dimensional region v_ij as the number of people h_ij in the region ΔS_ij corresponding to that three-dimensional region.
  • As shown in FIG. 9, the provisional density calculation unit 122 calculates the number of people h_ij for all regions ΔS_ij, and outputs the result as the provisional density distribution 222.
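The per-region conversion from foreground area to head count can be sketched as below. All names and values are hypothetical: a fixed pixels-per-person constant stands in for the patent's relational expression 142, and the regions v_ij are given as boolean masks on the frame.

```python
import numpy as np

def provisional_density(mask, regions, pixels_per_person):
    """Count foreground pixels inside each region v_ij and convert the
    area to a head count h_ij via a simple linear area-to-people relation.

    regions: {(i, j): boolean mask of region v_ij on the frame}."""
    return {ij: (mask & region).sum() / pixels_per_person
            for ij, region in regions.items()}

# A 4x4 foreground mask with 4 foreground pixels in the top-left corner.
mask = np.zeros((4, 4), dtype=bool)
mask[0:2, 0:2] = True

# Two hypothetical regions: left half and right half of the frame.
left = np.zeros((4, 4), dtype=bool);  left[:, :2] = True
right = np.zeros((4, 4), dtype=bool); right[:, 2:] = True
regions = {(0, 0): left, (0, 1): right}

h = provisional_density(mask, regions, pixels_per_person=4)
```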
  • FIG. 10 is a diagram showing an image of a correct foreground image.
  • FIG. 11 is a diagram showing an image obtained by digitizing the congestion level in the correct foreground image.
  • A correct image is prepared in which the number of people appearing in the video frame is known and the ground point of each person in the physical coordinate system is known.
  • Foreground extraction is performed on the correct image, and a correct foreground image is created as shown in FIG. 10.
  • The correct foreground image is divided into a plurality of small areas, and the foreground area per person in each small area is calculated for each congestion level.
  • The same processing is applied to a large number of correct foreground images with varied congestion levels and arrangement patterns, and the foreground area per person in each small area is tallied.
  • From this, the relationship between the number of people and the foreground in each small area at each congestion level can be derived as the relational expression 142 between foreground area and number of people.
  • the relational expression 142 between the foreground area and the number of persons is stored in the storage unit 140.
  • the occupancy ratio of the foreground area in the small area at each congestion level is stored in the storage unit 140 as the level threshold 143 for determining the congestion level.
  • The provisional density calculation unit 122 determines the congestion level for each small area, and calculates the number of people from the foreground area using the relational expression 142 corresponding to that congestion level.
  • the provisional density calculation unit 122 determines the congestion level by comparing the occupancy ratio of the foreground in the small area with the level threshold 143.
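The congestion-level lookup described above can be sketched as follows. All numeric values are hypothetical: the occupancy-ratio thresholds stand in for the level threshold 143, and the per-level pixels-per-person constants stand in for the congestion-level-dependent relational expression 142 (the visible area per person shrinks as occlusion increases).

```python
LEVEL_THRESHOLDS = [0.2, 0.5]                     # hypothetical level threshold 143
PIXELS_PER_PERSON = {0: 50.0, 1: 35.0, 2: 25.0}   # hypothetical relation 142 per level

def congestion_level(occupancy):
    """Return the congestion level for a small area's foreground occupancy ratio."""
    level = 0
    for t in LEVEL_THRESHOLDS:
        if occupancy > t:
            level += 1
    return level

def people_from_area(foreground_pixels, area_pixels):
    """Convert a small area's foreground area to a head count using the
    relation selected by its congestion level."""
    level = congestion_level(foreground_pixels / area_pixels)
    return foreground_pixels / PIXELS_PER_PERSON[level]
```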
  • The presence determination unit 123 determines, for each three-dimensional region v_ij of the plurality of three-dimensional regions, whether a person is present. That is, the presence determination unit 123 determines whether a person is present in each three-dimensional space V_ij of the plurality of three-dimensional spaces, and decides that no person is present in the three-dimensional region v_ij corresponding to any three-dimensional space determined to contain no person.
  • The presence determination unit 123 determines the presence of a person for each three-dimensional region v_ij, and sets the number of people in any three-dimensional region v_ij determined to contain no person to 0.
  • The presence determination unit 123 likewise sets the number of people in the region ΔS_ij corresponding to such a three-dimensional region v_ij to 0. That is, the presence determination unit 123 outputs, as the corrected density distribution 223, the provisional density distribution 222 in which the number of people in each three-dimensional region v_ij corresponding to a three-dimensional space determined to contain no person has been corrected to 0.
  • FIG. 12 is an image diagram of the presence determination process according to the present embodiment.
  • each three-dimensional area v ij of the plurality of three-dimensional areas includes a head area corresponding to the head of the person.
  • each three-dimensional area v ij of the plurality of three-dimensional areas comprises a ground area corresponding to the ground on which a person stands.
  • The presence determination unit 123 determines whether a person is present in the three-dimensional space V_ij corresponding to each three-dimensional region v_ij, specifically as illustrated in the presence determination figure.
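The correction step above can be sketched as follows. The presence test itself is abstracted here as a given set of occupied regions (the patent derives it from the head and ground areas of each region); the names and values are hypothetical.

```python
def correct_density(provisional, occupied):
    """Force the head count of every region judged to contain no person to 0.

    provisional: {(i, j): h_ij} from the provisional density distribution 222.
    occupied: set of (i, j) indices whose region passed the presence test."""
    return {ij: (h if ij in occupied else 0.0)
            for ij, h in provisional.items()}

provisional = {(0, 0): 0.4, (0, 1): 1.2, (1, 0): 0.3}
occupied = {(0, 1)}                     # only v_01 passes the presence test
corrected = correct_density(provisional, occupied)   # corrected density 223
```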
  • The standardization unit 124 acquires the total number of people in the video frame 22 based on the foreground image 221. Based on this total, the standardization unit 124 standardizes the number of people in each three-dimensional region v_ij of the plurality of three-dimensional regions in the corrected density distribution 223. Specifically, the standardization unit 124 standardizes the crowd density with the total number of people h_total present in the video frame, using Equation (1) and Equation (2). The standardization unit 124 calculates the total number of people h_total in the video frame 22 by applying the relational expression 142 between foreground area and number of people to the entire foreground image 221. Rows in Equation (2) is the total number of indices i, and Cols in Equation (2) is the total number of indices j.
  • The standardization unit 124 performs the standardization processing on all three-dimensional regions v_ij of the corrected density distribution, and outputs the result as the definite density distribution 224.
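A plausible reading of the standardization step is a proportional rescaling so that the counts over all regions ΔS_ij sum to h_total. The exact forms of Equations (1) and (2) are not reproduced in this text, so the formula below is an assumption, not the patent's equation.

```python
def standardize(corrected, h_total):
    """Rescale the corrected head counts so their total equals h_total,
    the frame-wide count estimated from the entire foreground image."""
    s = sum(corrected.values())
    if s == 0:
        return {ij: 0.0 for ij in corrected}
    return {ij: h * h_total / s for ij, h in corrected.items()}

corrected = {(0, 0): 0.0, (0, 1): 1.2, (1, 0): 0.4}   # corrected density 223
definite = standardize(corrected, h_total=2.0)        # definite density 224
```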
  • In step ST15, the distribution output unit 125 acquires the definite density distribution 224 from the standardization unit 124.
  • The distribution output unit 125 converts the definite density distribution 224 into an output format, and outputs the result to the result output unit 130 as the crowd density distribution 225.
  • the functions of the video acquisition unit 110, the analysis unit 120, and the result output unit 130 are realized by software.
  • the functions of the video acquisition unit 110, the analysis unit 120, and the result output unit 130 may be realized by hardware.
  • FIG. 13 is a diagram showing a configuration of a crowd density calculation device 100 according to a modification of the present embodiment.
  • the crowd density calculation device 100 includes an electronic circuit 909, a memory 921, an auxiliary storage device 922, an input interface 930, an output interface 940, and a communication device 950.
  • the electronic circuit 909 is a dedicated electronic circuit that implements the functions of the image acquisition unit 110, the analysis unit 120, and the result output unit 130.
  • the electronic circuit 909 is a single circuit, a complex circuit, a programmed processor, a parallel programmed processor, a logic IC, a GA, an ASIC, or an FPGA.
  • GA is an abbreviation of Gate Array.
  • ASIC is an abbreviation of Application Specific Integrated Circuit.
  • FPGA is an abbreviation of Field-Programmable Gate Array.
  • the functions of the image acquisition unit 110, the analysis unit 120, and the result output unit 130 may be realized by one electronic circuit or distributed across a plurality of electronic circuits. As another modification, some functions of the image acquisition unit 110, the analysis unit 120, and the result output unit 130 may be realized by an electronic circuit, and the remaining functions may be realized by software.
  • each of the processor and the electronic circuit is also referred to as processing circuitry. That is, in the crowd density calculation device 100, the functions of the image acquisition unit 110, the analysis unit 120, and the result output unit 130 are realized by processing circuitry.
  • the “unit” of each of the image acquisition unit 110, the foreground extraction unit 121, the provisional density calculation unit 122, the presence determination unit 123, the standardization unit 124, the distribution output unit 125, and the result output unit 130 may be read as “step” or “process”.
  • the "processing" of the image acquisition processing, foreground extraction processing, provisional density calculation processing, presence determination processing, standardization processing, distribution output processing, and result output processing may be read as "program", "program product", or "computer-readable storage medium on which the program is recorded".
  • the foreground extraction unit extracts the foreground image from the video frame of the video captured by the camera.
  • the provisional density calculation unit generates the provisional density distribution by calculating the apparent number of people, that is, the number of people present in each region obtained by projecting a region of physical space onto the video frame.
  • the presence determination unit determines a region in the physical space where no person is present, and corrects the provisional density distribution based on the determination result.
  • the result output unit outputs the corrected and standardized provisional density distribution as a crowd density distribution to an output device such as a display. Therefore, according to the crowd density calculation device 100 of the present embodiment, the positions of people in the physical space of the real world can be calculated from the video frame input from the camera and output as a crowd density distribution.
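The flow summarized in the bullets above can be sketched end-to-end. Every function and parameter name below is an illustrative placeholder rather than the device's actual interface, and each stage is stubbed with the simplest behavior consistent with the description: background subtraction for foreground extraction, per-cell foreground-pixel counts for the provisional density, a mask for the presence determination, and total-count scaling for the standardization.

```python
import numpy as np

def extract_foreground(frame, background):
    # Foreground extraction; plain background subtraction is one common choice.
    return (np.abs(frame - background) > 0.1).astype(float)

def provisional_density(foreground, rows, cols):
    # Apparent per-cell counts: here, simply the foreground pixels falling in
    # each cell of a rows x cols grid laid over the frame.
    h, w = foreground.shape
    return foreground.reshape(rows, h // rows, cols, w // cols).sum(axis=(1, 3))

def crowd_density_pipeline(frame, background, presence_mask, h_total):
    # Full per-frame flow: foreground -> provisional density -> presence
    # correction -> standardization against the frame-wide total h_total.
    fg = extract_foreground(frame, background)
    grid = provisional_density(fg, *presence_mask.shape)
    grid = grid * presence_mask  # zero out cells judged to hold no person
    s = grid.sum()
    return grid * (h_total / s) if s else grid
```

The stubbed stages deliberately mirror the unit boundaries of the device, so any one of them could be swapped for a more faithful implementation without touching the others.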
  • the three-dimensional regions v ij of the plurality of three-dimensional regions overlap each other. For this reason, when calculating the provisional density distribution, the foreground may appear in three-dimensional regions other than the region v ij in which a person actually exists. This may cause the calculated provisional density distribution to be inaccurate.
  • an embodiment will be described in which the influence of the overlap of three-dimensional regions is eliminated and the accuracy of the crowd density distribution is further enhanced.
  • the standardization unit 124 in the first embodiment is omitted, and a position correction unit 126 is newly added between the provisional density calculation unit 122 and the presence determination unit 123.
  • the analysis unit 120 a includes a foreground extraction unit 121, a provisional density calculation unit 122, a position correction unit 126, a presence determination unit 123, and a distribution output unit 125.
  • in step ST16, the position correction unit 126 acquires the provisional density distribution 222 from the provisional density calculation unit 122.
  • the position correction unit 126 corrects the provisional density distribution 222 based on the number of people in the overlapping areas, each representing the part where adjacent three-dimensional spaces among the plurality of three-dimensional spaces overlap, and outputs the corrected provisional density distribution 222. That is, using the provisional density distribution 222, the position correction unit 126 corrects the positions of people while taking into account the influence of the overlap of the three-dimensional regions v ij .
  • FIG. 16 is a diagram showing an image of position correction processing by the position correction unit 126 according to the present embodiment.
  • the overlapping area A duplm is, as shown in FIG. 16, an area in which the three-dimensional area v l and the three-dimensional area v m overlap each other.
  • the position correction processing by the position correction unit 126 functions as a kind of filter that, as shown in FIG. 16, sharpens values that have been dispersed into the surrounding regions.
  • the subscripts l and m are obtained by rewriting the two-variable subscript ij as a single variable for convenience.
  • by using Equation (3), the position correction unit 126 can output the highly accurate crowd density distribution 225 from which the influence of the overlapping areas has been removed.
  • the derivation of Equation (3) will be described in detail below.
  • let h tl be the number of people present in the three-dimensional space V l , and let h l be the number of people appearing in the three-dimensional region v l .
  • among the h tl people present in the three-dimensional space V l , let h com_l→m be the number of people who appear in the three-dimensional region v m .
  • the apparent number of people h l appearing in the three-dimensional region v l can be expressed by Equation (5).
  • h com_l→m can be expressed, as in Equation (6), as the number of people h tl present in cell l multiplied by the coefficient α lm .
  • the coefficient matrix representing the overlapping relationship between cells is expressed by Equation (4).
  • from Equations (5) and (6), the apparent number of people can be expressed by Equation (7), in which the number of people present is multiplied by the coefficients.
  • Equation (8) can be obtained by rewriting Equation (7) in matrix form.
  • Equation (3) can be derived by multiplying both sides of Equation (8) by the inverse of the coefficient matrix A.
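Given the derivation above, the correction itself reduces to a linear solve. The sketch below assumes the matrix form of Equation (8) is h = A·h t (apparent counts equal the coefficient matrix times the actual counts), so that Equation (3) recovers h t ; the function name is an illustrative placeholder.

```python
import numpy as np

def correct_for_overlap(apparent_counts, A):
    """Recover the actual per-cell counts h_t from the apparent counts h,
    assuming h = A @ h_t (Equation (8)); this computes Equation (3)."""
    # Solving the linear system is numerically preferable to forming A^-1.
    return np.linalg.solve(np.asarray(A, dtype=float),
                           np.asarray(apparent_counts, dtype=float))
```

Solving the system directly rather than inverting A explicitly is a standard numerical choice; the result is the same as multiplying by the inverse matrix as in the derivation.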
  • the method of obtaining the coefficient ⁇ lm is not limited.
  • for example, the coefficient α lm may be calculated, as in Equation (9), using the area A l of the three-dimensional region v l shown in FIG. 15 and the overlapping area A duplm between the three-dimensional region v l and the three-dimensional region v m .
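Equation (9) is not reproduced in this text, so the form below is only a plausible reading, stated as an assumption: the coefficient α lm taken as the overlap area A duplm divided by the area A l of the region v l , i.e. the fraction of v l 's image area shared with v m .

```python
def overlap_coefficient(area_l, area_dup_lm):
    """One plausible form of Equation (9) (an assumption, since the
    equation body is not reproduced here): the fraction of region v_l's
    image area that it shares with region v_m."""
    if area_l == 0.0:
        return 0.0  # degenerate region: no overlap contribution
    return area_dup_lm / area_l
```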
  • the crowd density distribution can be calculated with higher accuracy by removing the influence of the overlapping region.
  • the coefficient matrix A indicating the overlapping relationship, when calculated using Equation (9), assumes that the foreground appears uniformly over the three-dimensional region v ij .
  • in practice, the coefficient matrix A indicating the overlapping relationship is influenced by where people actually are within the three-dimensional space V ij . For this reason, an error may occur in the calculated crowd density distribution depending on the positions where people actually exist in the three-dimensional space V ij .
  • in the present embodiment, the coefficient matrix A is therefore optimized by numerical calculation.
  • the total number h total used for the crowd density distribution at that time is the total number of people present in the screen.
  • the total number h total in the screen is calculated by applying the relational expression between foreground area and number of people to the entire foreground image.
  • steps ST11, ST12 and ST16 are the same as in the first embodiment.
  • in step ST17, the position correction unit 126 optimizes the coefficient matrix A by recalculation. The position correction unit 126 repeats the correction of the provisional density distribution until the error in the total number of people in the video frame becomes equal to or less than a threshold.
  • Equations (2) and (7) are used to calculate h'total .
  • An evaluation function regarding correction of the provisional density distribution is defined by equation (10).
  • the position correction unit 126 repeatedly performs calculation using the steepest descent method until the error E calculated by equation (10) becomes equal to or less than the threshold.
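The per-frame iteration can be sketched with a numerical gradient. Equation (10) is not reproduced here, so the error E below is assumed to be the squared difference between the frame total h total and the total h' total implied by the current coefficient matrix; the learning rate, tolerance, finite-difference step, and function names are likewise illustrative assumptions.

```python
import numpy as np

def total_error(A, apparent, h_total):
    # Assumed Equation (10): E = (h_total - h'_total)^2, where h'_total is
    # the total of the corrected counts implied by the current matrix A.
    return (h_total - np.linalg.solve(A, apparent).sum()) ** 2

def optimize_matrix(A, apparent, h_total, lr=1e-3, tol=1e-6, max_iter=200):
    """Steepest descent on the entries of A using a finite-difference
    gradient, stopping once the error falls to or below the threshold."""
    A = np.asarray(A, dtype=float).copy()
    eps = 1e-6
    for _ in range(max_iter):
        err = total_error(A, apparent, h_total)
        if err <= tol:
            break
        # Finite-difference gradient over every matrix entry.
        grad = np.zeros_like(A)
        for idx in np.ndindex(A.shape):
            A_eps = A.copy()
            A_eps[idx] += eps
            grad[idx] = (total_error(A_eps, apparent, h_total) - err) / eps
        A -= lr * grad  # steepest-descent update
    return A
```

As the surrounding text notes, the steepest descent method is only one choice; any optimizer that drives E below the threshold would serve.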
  • the optimization method by the position correction unit 126 is not limited to the steepest descent method.
  • the crowd density calculation device according to the present embodiment updates the coefficient matrix A every frame using Equation (10). Therefore, according to the crowd density calculation device of the present embodiment, the crowd density distribution can be calculated with higher accuracy than in the second embodiment.
  • each part of the crowd density calculation device has been described as an independent functional block.
  • the configuration of the crowd density calculation device may not be the configuration as in the embodiment described above.
  • the functional block of the crowd density calculation apparatus may have any configuration as long as the function described in the above-described embodiment can be realized.
  • the crowd density calculation device may be a system configured of a plurality of devices instead of one device.
  • a plurality of parts of Embodiments 1 to 3 may be implemented in combination. Alternatively, one portion of these embodiments may be implemented.
  • these embodiments may be implemented in any combination in whole or in part. That is, in the first to third embodiments, free combinations of the respective embodiments, or modifications of any components of the respective embodiments, or omissions of any components in the respective embodiments are possible.
  • the embodiments described above are essentially preferable examples, and are not intended to limit the scope of the present invention, the scope of application of the present invention, or the scope of use of the present invention.
  • the embodiment described above can be variously modified as needed.
  • the crowd density calculation device according to the above-described embodiment can be applied to a crowd density estimation device that estimates crowd density and a crowd density estimation system.


Abstract

In this crowd density calculation device (100) that calculates a crowd density, a video acquisition unit (110) acquires a video frame (22) from a video stream (21) in which people are imaged. An analysis unit (120) associates three-dimensional coordinates with the video frame (22), and acquires, as each three-dimensional region of a plurality of three-dimensional regions, a region representing each three-dimensional space of a plurality of three-dimensional spaces obtained on the video frame (22) on the basis of the three-dimensional coordinates. The analysis unit (120) calculates the density distribution of people in the video frame (22) as a crowd density distribution (225), on the basis of the number of people present in each three-dimensional region of the plurality of three-dimensional regions.

Description

Crowd density calculation device, crowd density calculation method and crowd density calculation program
The present invention relates to a crowd density calculation device, a crowd density calculation method, and a crowd density calculation program.
There are techniques for estimating the number of people or the density of people from camera images. Techniques for estimating the number of people from camera images include a method of counting people based on person detection and a method of estimating the number of people from the foreground area.
In the method based on human detection, when the crowd density is low, the number of people can be estimated with high accuracy. However, with this method, the amount of computation increases as the number of people increases. Furthermore, in this method, the crowd density increases as the number of people increases, and therefore the estimation accuracy decreases due to the influence of occlusion between persons, that is, concealment.
In the method of estimating the number of people from the foreground area, when the crowd density is low, the estimation accuracy is inferior to that of the method based on person detection. However, with this method, the amount of computation does not change even if the crowd density is high.
The technique for estimating the density of people is equivalent to the technique for estimating the number of people for each arbitrary area of the video frame.
Patent Document 1 and Patent Document 2 disclose techniques of acquiring video of a crowd, treating the foreground extracted by background subtraction as a person region, and estimating the number of people in the screen from the area of the person region.
In Patent Document 1, a weight value is calculated that quantitatively represents how much each pixel in an image contributes to the number of people. The weight value is calculated from the apparent volume of the object in the image. This solves the problem that the foreground area per person appears differently depending on depth, and makes it possible to estimate the number of people even in images with depth.
In Patent Document 2, CG (computer graphics) models imitating a crowd are created in advance at a plurality of congestion levels, and a relational expression between foreground area and number of people that takes occlusion within the crowd into account is derived. Patent Document 2 thereby enables estimation of the number of people while suppressing the influence of occlusion.
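The relational expression between foreground area and people count that these documents rely on can be illustrated in its simplest linear form. The per-person pixel area below is a made-up illustrative value, and the cited documents derive more elaborate, depth- and occlusion-aware relations rather than this constant-ratio sketch.

```python
def people_from_foreground_area(foreground_pixels, pixels_per_person):
    """Simplest linear relational expression between foreground area and
    people count (illustrative only; Patent Documents 1 and 2 use more
    elaborate, depth- and occlusion-aware relations)."""
    return foreground_pixels / pixels_per_person
```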
JP 2009-294755 A
JP 2005-025328 A
In the techniques disclosed in Patent Document 1 and Patent Document 2, the number of people present in a video frame and the density of people in an arbitrary area in the video frame are calculated. However, the positions of people in the physical space of the real world are not estimated. This is because, in Patent Document 1 and Patent Document 2, points in physical space are mapped to points on the video frame, but the reverse mapping is not performed.
An object of the present invention is to calculate the positions of a crowd in the physical space of the real world from a video frame and output them as a crowd density distribution.
The crowd density calculation device according to the present invention includes:
a video acquisition unit that acquires a video frame from a video stream in which people are imaged; and
an analysis unit that associates three-dimensional coordinates with the video frame, acquires, as each three-dimensional region of a plurality of three-dimensional regions, a region representing each three-dimensional space of a plurality of three-dimensional spaces obtained on the video frame based on the three-dimensional coordinates, and calculates, as a crowd density distribution, the density distribution of people in the video frame based on the number of people present in each three-dimensional region of the plurality of three-dimensional regions.
In the crowd density calculation device according to the present invention, the analysis unit associates three-dimensional coordinates with the video frame and acquires, as each three-dimensional region of a plurality of three-dimensional regions, a region representing each three-dimensional space of a plurality of three-dimensional spaces obtained on the video frame based on the three-dimensional coordinates. The analysis unit also calculates the crowd density distribution in the video frame based on the number of people present in each of the plurality of three-dimensional regions. Therefore, according to the crowd density calculation device of the present invention, the crowd density distribution in the physical space of the real world can be quantitatively grasped from the video frame.
FIG. 1 is a configuration diagram of a crowd density calculation device according to Embodiment 1.
FIG. 2 is a detailed configuration diagram of an analysis unit according to Embodiment 1.
FIG. 3 is a diagram explaining the definition of the crowd density distribution.
FIG. 4 is a diagram showing an image of the crowd density distribution when the sizes of ΔX and ΔY are constant.
FIG. 5 is a diagram showing an image of projecting points in the physical space of the real world onto the video frame coordinate system.
FIG. 6 is a flowchart of crowd density calculation processing according to Embodiment 1.
FIG. 7 is a flowchart of analysis processing according to Embodiment 1.
FIG. 8 is a diagram showing an image of converting the foreground area into a number of people for each three-dimensional region.
FIG. 9 is a diagram showing an image of outputting a provisional density distribution from the number of people in each three-dimensional region.
FIG. 10 is a diagram showing an image of a correct foreground image.
FIG. 11 is a diagram showing an image in which congestion levels are quantified in the correct foreground image.
FIG. 12 is a diagram showing an image of presence determination processing according to Embodiment 1.
FIG. 13 is a configuration diagram of a crowd density calculation device according to a modification of Embodiment 1.
FIG. 14 is a detailed configuration diagram of an analysis unit according to Embodiment 2.
FIG. 15 is a flowchart of analysis processing according to Embodiment 2.
FIG. 16 is a diagram showing an image of position correction processing according to Embodiment 2.
FIG. 17 is a flowchart of analysis processing according to Embodiment 3.
Hereinafter, embodiments of the present invention will be described with reference to the drawings. In the drawings, the same or corresponding parts are denoted by the same reference numerals. In the description of the embodiments, the description of the same or corresponding parts will be omitted or simplified as appropriate.
Embodiment 1
*** Description of the configuration ***
The configuration of the crowd density calculation device 100 according to the present embodiment will be described with reference to FIG. 1.
The crowd density calculation device 100 is a computer. The crowd density calculation device 100 includes a processor 910 and other hardware such as a memory 921, an auxiliary storage device 922, an input interface 930, an output interface 940, and a communication device 950. The processor 910 is connected to other hardware via signal lines to control these other hardware.
The crowd density calculation device 100 includes an image acquisition unit 110, an analysis unit 120, a result output unit 130, and a storage unit 140 as functional elements. The storage unit 140 stores analysis parameters 141 used in analysis processing by the analysis unit 120.
The functions of the image acquisition unit 110, the analysis unit 120, and the result output unit 130 are realized by software. The storage unit 140 is provided in the memory 921. The storage unit 140 may be included in the auxiliary storage device 922.
The processor 910 is a device that executes a crowd density calculation program. The crowd density calculation program is a program for realizing the functions of the image acquisition unit 110, the analysis unit 120, and the result output unit 130.
The processor 910 is an IC (Integrated Circuit) that performs arithmetic processing. Specific examples of the processor 910 are a CPU, a digital signal processor (DSP), and a graphics processing unit (GPU).
The memory 921 is a storage device that temporarily stores data. A specific example of the memory 921 is a static random access memory (SRAM) or a dynamic random access memory (DRAM).
The auxiliary storage device 922 is a storage device for storing data. A specific example of the auxiliary storage device 922 is an HDD. The auxiliary storage device 922 may also be a portable storage medium such as an SD (registered trademark) memory card, a CF, a NAND flash, a flexible disk, an optical disk, a compact disk, a Blu-ray (registered trademark) disk, or a DVD. HDD is an abbreviation of Hard Disk Drive. SD (registered trademark) is an abbreviation of Secure Digital. CF is an abbreviation of CompactFlash. DVD is an abbreviation of Digital Versatile Disk.
The input interface 930 is a port connected to an input device such as a mouse, a keyboard, or a touch panel. The input interface 930 may also be a port connected to the camera 200. Specifically, the input interface 930 is a USB (Universal Serial Bus) terminal. The input interface 930 may be a port connected to a LAN (Local Area Network). The crowd density calculation device 100 may acquire the video stream 21 from the camera 200 via the input interface 930.
The output interface 940 is a port to which a cable of an output device such as a display is connected. Specifically, the output interface 940 is a USB terminal or an HDMI (registered trademark) (High Definition Multimedia Interface) terminal. The display is specifically an LCD (Liquid Crystal Display). The crowd density calculation device 100 displays the analysis result output by the result output unit 130 on the display via the output interface 940.
The communication device 950 communicates with other devices via a network. The communication device 950 has a receiver and a transmitter. The communication device 950 is connected to a communication network such as a LAN, the Internet, or a telephone line by wire or wirelessly. The communication device 950 is specifically a communication chip or a NIC (Network Interface Card). The crowd density calculation device 100 may receive the video stream 22 from the camera 200 via the communication device 950. The crowd density calculation device 100 may also transmit the analysis result output by the result output unit 130 to an external device via the communication device 950.
The crowd density calculation program is read into the processor 910 and executed by the processor 910. The memory 921 stores not only the crowd density calculation program but also an operating system (OS). The processor 910 executes the crowd density calculation program while executing the OS. The crowd density calculation program and the OS may be stored in the auxiliary storage device 922. The crowd density calculation program and the OS stored in the auxiliary storage device 922 are loaded into the memory 921 and executed by the processor 910. Note that part or all of the crowd density calculation program may be incorporated into the OS.
The crowd density calculation device 100 may include a plurality of processors that replace the processor 910. The plurality of processors share execution of the crowd density calculation program. Each processor is, like the processor 910, a device that executes the crowd density calculation program.
Data, information, signal values, and variable values used, processed, or output by the crowd density calculation program are stored in the memory 921, the auxiliary storage device 922, or a register or cache memory in the processor 910.
The crowd density calculation program causes a computer to execute each process, each procedure, or each step obtained by replacing the "unit" of each of the video acquisition unit 110, the analysis unit 120, and the result output unit 130 with "process", "procedure", or "step". The crowd density calculation method is a method performed by the crowd density calculation device 100 executing the crowd density calculation program.
The crowd density calculation program may be stored in a computer-readable recording medium and provided. The crowd density calculation program may also be provided as a program product.
The detailed configuration of the analysis unit 120 according to the present embodiment will be described with reference to FIG. 2.
The analysis unit 120 includes a foreground extraction unit 121, a provisional density calculation unit 122, a presence determination unit 123, a standardization unit 124, and a distribution output unit 125. That is, the "units" of the video acquisition unit 110, the analysis unit 120, and the result output unit 130 are the video acquisition unit 110, the foreground extraction unit 121, the provisional density calculation unit 122, the presence determination unit 123, the standardization unit 124, the distribution output unit 125, and the result output unit 130.
The outline of each functional element of the crowd density calculation device 100 will be described using FIGS. 1 and 2.
The crowd density calculation device 100 is connected to a camera 200 that captures an object and delivers it as a video stream 21. The object is specifically a person. That is, the video stream 21 is a crowd video.
The video acquisition unit 110 acquires the video stream 21 distributed from the camera 200 via the input interface 930. The video acquisition unit 110 acquires the video frame 22 from the video stream 21. Specifically, the video acquisition unit 110 decodes the video stream 21 and converts it into a video frame 22.
The analysis unit 120 associates three-dimensional coordinates with the video frame 22. The analysis unit 120 acquires, as each three-dimensional region of a plurality of three-dimensional regions, a region representing each three-dimensional space of a plurality of three-dimensional spaces obtained on the video frame 22 based on the three-dimensional coordinates. Then, the analysis unit 120 calculates the crowd density distribution in the video frame 22 based on the number of people present in each three-dimensional region of the plurality of three-dimensional regions. That is, the analysis unit 120 uses the video frame 22 to calculate the positions of the crowd in the three-dimensional coordinates of physical space as the crowd density distribution 225.
The result output unit 130 outputs the crowd density distribution 225 output from the analysis unit 120 to an output device such as a display via the output interface 940.
Next, an outline of each functional element of the analysis unit 120 will be described.
The foreground extraction unit 121 extracts the portions having foreground features from the video frame 22 as a foreground image 221. The provisional density calculation unit 122 calculates a provisional density distribution 222, which is the apparent crowd density distribution at each position in the physical space, using the foreground image 221 and the analysis parameter 141 stored in the storage unit 140. The presence determination unit 123 corrects the provisional density distribution 222 by determining the positions in the physical space where no people are present, and outputs the corrected provisional density distribution 222 as a corrected density distribution 223. The standardization unit 124 standardizes the corrected density distribution 223 using the total number of people present in the image represented by the video frame 22, and outputs a definite density distribution 224. The distribution output unit 125 converts the definite density distribution 224 into an output format and finally outputs it as a crowd density distribution 225.
The definition of the crowd density distribution will be described with reference to FIG. 3. The crowd density distribution is the distribution of the number of people in the physical space of the real world.
As shown in FIG. 3, in the physical-space coordinate system Xr-Yr-Zr, which is the three-dimensional coordinate system of the real world, let ΔSij be the region bounded by a width ΔX and a depth ΔY at a position (X, Y) on the Xr-Yr plane corresponding to the ground of the real world. Let Vij be the three-dimensional space enclosed by a prism of height H whose base is ΔSij, and let htij be the number of people present in the three-dimensional space Vij. The crowd density distribution is obtained by associating each number of people htij with the corresponding ΔSij and arranging these values over the entire analysis region. The height H is set to roughly the height of a person.
FIG. 4 is a diagram showing an image of the crowd density distribution when the magnitudes of ΔX and ΔY are constant. When one person is located in the region ΔS02 and another at the midpoint between the regions ΔS10 and ΔS20, the crowd density distribution is as follows: 1 person at the position of ΔS02, 0.5 person at the position of ΔS10, 0.5 person at the position of ΔS20, and 0 people at all other positions ΔSij.
Note that the magnitudes of ΔX and ΔY are not prescribed, and may be variable.
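This fractional assignment can be sketched in code. The following helper is illustrative only (the function name, the list-based grid, and the linear weighting between cell centers are assumptions, not part of the patent): it splits each person's unit count between neighboring cells ΔSij by linear interpolation of the ground position, reproducing the 1 / 0.5 / 0.5 example above.

```python
def crowd_density_distribution(positions, dx, dy, rows, cols):
    """Accumulate a per-cell people count h[i][j] from ground positions.

    A person standing between two cell centers contributes fractionally
    to each cell, by linear interpolation along X and Y.
    """
    h = [[0.0] * cols for _ in range(rows)]
    for (x, y) in positions:
        # continuous cell index of the person, relative to cell centers
        fi, fj = x / dx - 0.5, y / dy - 0.5
        i0, j0 = int(fi // 1), int(fj // 1)
        wi, wj = fi - i0, fj - j0
        for di, wx in ((0, 1 - wi), (1, wi)):
            for dj, wy in ((0, 1 - wj), (1, wj)):
                i, j = i0 + di, j0 + dj
                if 0 <= i < rows and 0 <= j < cols and wx * wy > 0:
                    h[i][j] += wx * wy
    return h
```

With dx = dy = 1, a person at the center of ΔS02 and another midway between the centers of ΔS10 and ΔS20 yield exactly the distribution of FIG. 4.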
FIG. 5 is a diagram showing an image of projecting points in the physical space of the real world onto the video frame coordinate system; that is, FIG. 5 illustrates associating three-dimensional coordinates with the video frame. Let Pgij = (Xij, Yij, 0) be a point on the ground in the physical space of the real world, and Phij = (Xij, Yij, H) a point on the plane of height H. Let pgij and phij be the points obtained by projecting the ground point Pgij and the height-H point Phij onto the video frame coordinate system ximg-yimg. The information that associates Pgij with pgij and Phij with phij is stored in the storage unit 140 as the analysis parameter 141. In addition, the region obtained by projecting the three-dimensional space Vij in the physical space onto the video frame coordinate system is defined as the three-dimensional area vij. The three-dimensional area vij is the two-dimensional area shown hatched in FIG. 5; that is, it is the two-dimensional area bounded by the outline of the three-dimensional space Vij when that space of the three-dimensional coordinates is represented on the video frame. The three-dimensional area vij is also called a prismatic area or a rectangular parallelepiped area.
Each three-dimensional area vij of the plurality of three-dimensional areas has a head region, to which a person's head corresponds when the person stands in the corresponding three-dimensional space Vij, and a ground region corresponding to the ground on which the person stands. The head region is the region enclosed by phij, phi+1j, phij+1, and phi+1j+1, which correspond to the height of the head in the three-dimensional area vij. The ground region is the region enclosed by pgij, pgi+1j, pgij+1, and pgi+1j+1, which correspond to the position of the ground in the three-dimensional area vij.
The information that associates coordinates in the physical space with coordinates on the video frame may be a coordinate transformation formula, or a set of corresponding pairs of physical-space coordinates and video-frame coordinates.
Furthermore, the points Pgij, and likewise the points Phij, need not lie on a single plane. As long as the region Vij of the three-dimensional space can be defined, the surface represented by the points Pgij or the points Phij may be curved or stepped.
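As one plausible form of such a coordinate transformation formula, a pinhole-camera projection with a 3x4 matrix maps a physical-space point to video-frame coordinates. The sketch below is an assumption for illustration (the function name, the example matrix values, and the depth-along-the-third-axis convention are not taken from the patent; in practice the matrix would encode the camera's calibration):

```python
def project(P, X, Y, Z):
    """Project a physical-space point (X, Y, Z) to video-frame
    coordinates (x_img, y_img) using a 3x4 projection matrix P,
    given as row-major nested lists (homogeneous pinhole model)."""
    u = [P[r][0] * X + P[r][1] * Y + P[r][2] * Z + P[r][3] for r in range(3)]
    return u[0] / u[2], u[1] / u[2]

# Hypothetical intrinsics: focal length 100, principal point (320, 240),
# with the third coordinate acting as depth from the camera.
P_demo = [[100.0, 0.0, 320.0, 0.0],
          [0.0, 100.0, 240.0, 0.0],
          [0.0, 0.0, 1.0, 0.0]]
```

Storing the pairs (Pgij, pgij) and (Phij, phij) produced by such a projection would be the tabulated alternative the paragraph above describes.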
*** Description of operation ***
A crowd density calculation process S100 performed by the crowd density calculation device 100 according to the present embodiment will be described with reference to FIG.
<Analysis parameter reading process>
In step ST01, the crowd density calculation device 100 reads the analysis parameter 141 into the storage unit 140. The analysis parameter 141 may be stored in the auxiliary storage device 922, or may be input from the outside via the input interface 930 or the communication device 950. The read analysis parameter 141 is used by the analysis unit 120.
<Video acquisition processing>
In step ST02, the video acquisition unit 110 stands by to receive the video stream 21 from the camera 200. When it receives the video stream 21, the video acquisition unit 110 decodes at least one frame of the received video stream 21. The video stream to be received here is, for example, video coded data compressed by a video compression coding method and delivered over IP using a video delivery protocol. Specific examples of video compression coding methods are H.262/MPEG-2 Video, H.264/AVC, H.265/HEVC, and JPEG. Specific examples of video delivery protocols are MPEG-2 TS, RTP/RTSP, MMT, and DASH. MPEG-2 TS is an abbreviation of Moving Picture Experts Group 2 Transport Stream. RTP/RTSP is an abbreviation of Real-time Transport Protocol/Real Time Streaming Protocol. MMT is an abbreviation of MPEG Media Transport. DASH is an abbreviation of Dynamic Adaptive Streaming over HTTP. The video stream to be received may use an encoding or delivery format other than the above, or an uncompressed transmission standard such as SDI or HD-SDI. SDI is an abbreviation of Serial Digital Interface. HD-SDI is an abbreviation of High Definition-Serial Digital Interface.
<Analysis process>
In step ST03, the analysis unit 120 acquires the video frame 22 from the video acquisition unit 110. The analysis unit 120 analyzes the video frame 22 using the analysis parameter 141. The analysis unit 120 analyzes the video frame 22 to calculate the crowd density distribution of the crowd shown in the video frame 22. Further, the analysis unit 120 converts the calculated crowd density distribution into an output format, and outputs it as a crowd density distribution 225.
<Result output process>
In step ST04, the result output unit 130 outputs the crowd density distribution 225 output from the analysis unit 120 to the outside of the crowd density calculation device 100 via the output interface 940. Examples of the output format include display on a monitor, output to a log file, output to an externally connected device, or transmission to a network. The result output unit 130 may output the crowd density distribution 225 in a format other than the above. In addition, the result output unit 130 may output the crowd density distribution 225 to the outside each time the crowd density distribution 225 is output from the analysis unit 120. Alternatively, the result output unit 130 may perform intermittent output such as output after totalization or statistical processing of the crowd density distribution 225 for a specific period or a specific number. After step ST04, the crowd density calculation device 100 returns to step ST02 to process the next video frame 22.
<< Details of Analysis Processing >>
Details of the analysis processing according to the present embodiment will be described with reference to FIG.
In step ST11, the foreground extraction unit 121 extracts the images of people in the video frame 22, that is, the foreground, as the foreground image 221. The foreground extraction unit 121 outputs the foreground image 221 to the provisional density calculation unit 122.
FIG. 8 is a view showing an image in which the number of people is converted from the foreground area for each three-dimensional area. FIG. 9 is a diagram of outputting a provisional density distribution from the number of people for each three-dimensional region. In FIG. 8, images of two people are extracted as the foreground image 221.
As a method of foreground extraction processing, there is a background subtraction method in which a background image is registered in advance and a difference from an input image is calculated. In addition, there is an adaptive background subtraction method in which a background image is automatically updated from a continuously input video frame using a model such as MOG (Mixture of Gaussian Distribution). In addition, there is a dense optical flow derivation algorithm for acquiring motion information in an image in pixel units.
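A minimal sketch of the basic (non-adaptive) background subtraction method mentioned above, operating on grayscale frames held as nested lists. The function name and threshold value are illustrative assumptions, not the patent's implementation:

```python
def extract_foreground(frame, background, threshold=30):
    """Basic background subtraction: mark a pixel as foreground (1)
    when its absolute difference from the pre-registered background
    image exceeds the threshold; otherwise background (0)."""
    return [[1 if abs(p - b) > threshold else 0
             for p, b in zip(frow, brow)]
            for frow, brow in zip(frame, background)]
```

The adaptive variant would additionally update `background` from the incoming frames (e.g., with a per-pixel mixture model such as MOG) instead of keeping it fixed.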
In step ST12, based on the foreground image 221, the provisional density calculation unit 122 calculates, as the provisional density distribution 222, the number of people apparently present in each three-dimensional area of the plurality of three-dimensional areas. Specifically, the provisional density calculation unit 122 projects points in the physical space onto the video frame coordinate system and then calculates the number of people present in each three-dimensional area vij using the foreground image 221 and the relational expression 142. As the method of calculating the number of people present in each three-dimensional area vij, the provisional density calculation unit 122 uses provisional crowd density estimation.
As shown in FIG. 8, the provisional density calculation unit 122 totals the foreground area for each three-dimensional area vij in the video frame 22, and calculates the number of people in each vij from that foreground area. In doing so, the provisional density calculation unit 122 uses the relational expression 142 between foreground area and number of people, obtained in advance. The number of people calculated for each three-dimensional area vij is taken as the number of people hij in the corresponding region ΔSij. As shown in FIG. 9, the provisional density calculation unit 122 outputs the crowd density distribution in which hij has been calculated for all regions ΔSij as the provisional density distribution 222.
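The per-area aggregation can be sketched as follows. Here each three-dimensional area vij is given as a list of pixel coordinates, and relational expression 142 is simplified to a single fixed area-per-person coefficient; both the simplification and all names are assumptions for illustration only:

```python
def provisional_density(foreground, regions, area_per_person):
    """For each three-dimensional area v_ij (given as a pixel mask),
    total the foreground pixels inside it and convert that area to a
    people count h_ij via a fixed area-per-person coefficient, a
    stand-in for the foreground-area/people relational expression."""
    h = {}
    for key, mask in regions.items():
        area = sum(foreground[y][x] for (x, y) in mask)
        h[key] = area / area_per_person
    return h
```

The dictionary keyed by (i, j) plays the role of the provisional density distribution over the regions ΔSij.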
<<< Deriving the Relational Expression Between Foreground Area and Number of People >>>
Here, the relational expression 142 between foreground area and number of people takes into consideration occlusion among people in the crowd. The relational expression 142 is assumed to be stored in the storage unit 140. A method of deriving the relational expression 142 is described below.
FIG. 10 is a diagram showing an image of a correct foreground image.
FIG. 11 is a diagram showing an image obtained by digitizing the congestion level in the correct foreground image.
A correct image is prepared in which the number of people appearing in the video frame is known and the ground contact points of the people in the physical coordinate system are also known. Foreground extraction is performed on the correct image to create a correct foreground image as shown in FIG. 10.
As shown in FIG. 11, the correct foreground image is divided into a plurality of small areas, and the foreground area per person in each small area is calculated for each congestion level. The same processing is applied to a large number of correct foreground images with different congestion levels and arrangement patterns, and the foreground area per person in each small area is totaled. In this way, the relationship between the number of people and the foreground in each small area for each congestion level can be derived as the relational expression 142 between foreground area and number of people. The relational expression 142 is stored in the storage unit 140. The occupancy ratio of the foreground area within a small area at each congestion level is stored in the storage unit 140 as the level threshold 143 for determining the congestion level.
When using the relational expression 142, the provisional density calculation unit 122 determines the congestion level for each small area and calculates the number of people from the foreground area using the relational expression 142 corresponding to that congestion level. The provisional density calculation unit 122 determines the congestion level by comparing the occupancy ratio of the foreground within the small area with the level threshold 143.
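A sketch of the level-dependent lookup: the occupancy ratio selects a congestion level against ascending thresholds (standing in for level threshold 143), and the level selects an area-per-person coefficient (standing in for the per-level relational expression 142). All names and numeric values below are illustrative assumptions:

```python
def people_from_area(area, region_size, level_thresholds, area_per_person):
    """Convert the foreground area of one small region to a people count.

    level_thresholds: ascending occupancy-ratio cutoffs between levels.
    area_per_person:  mapping from congestion level to the foreground
                      area one person occupies at that level.
    """
    occupancy = area / region_size
    # each threshold met bumps the congestion level by one
    level = sum(occupancy >= t for t in level_thresholds)
    return area / area_per_person[level]
```

At higher congestion levels the area-per-person coefficient shrinks, which is how the derived relation compensates for occlusion: the same foreground area represents more people in a denser crowd.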
<< Presence Determination Processing >>
In step ST13, the presence determination unit 123 determines whether a person is present in each three-dimensional area vij of the plurality of three-dimensional areas. That is, the presence determination unit 123 determines whether a person is present in each three-dimensional space Vij of the plurality of three-dimensional spaces, and determines that no person is present in the three-dimensional area vij corresponding to a three-dimensional space determined to contain no person. The presence determination unit 123 performs this determination for each three-dimensional area vij and sets the number of people in a three-dimensional area vij determined to contain no person to 0; that is, it sets the number of people in the corresponding region ΔSij to 0. The presence determination unit 123 outputs, as the corrected density distribution 223, the provisional density distribution 222 in which the number of people in each such three-dimensional area vij has been corrected to 0.
FIG. 12 is an image diagram of the presence determination process according to the present embodiment.
As described above, each three-dimensional area vij of the plurality of three-dimensional areas has a head region, to which a person's head corresponds when the person stands in the three-dimensional space Vij, and a ground region corresponding to the ground on which the person stands. The presence determination unit 123 determines that a person is present in the three-dimensional space Vij corresponding to a three-dimensional area vij when people are present in both the head region and the ground region of that area. Specifically, as shown in FIG. 12, the presence determination unit 123 determines that no person is present when the foreground areas in the head region and the ground region of a three-dimensional area vij are each equal to or less than a specified value, and sets hij = 0. The presence determination unit 123 uses the value of the provisional density distribution 222 as it is only for those three-dimensional areas vij in which a foreground area equal to or greater than the specified value is present in both the head region and the ground region.
The presence determination unit 123 outputs the provisional density distribution 222 subjected to this presence determination processing as the corrected density distribution 223.
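The correction can be sketched as a dictionary transform: a cell keeps its provisional value only when both its head region and its ground region contain at least a specified foreground area. The function name, data layout, and threshold are illustrative assumptions:

```python
def presence_correction(h, head_area, ground_area, min_area):
    """Zero out h_ij when either the head region or the ground region
    of v_ij contains less than min_area foreground pixels, keeping the
    provisional count only where both regions have enough foreground."""
    return {key: (count if head_area[key] >= min_area
                  and ground_area[key] >= min_area else 0.0)
            for key, count in h.items()}
```

This filters out cells where foreground spilled in from a person standing in a neighboring space (foreground in the head region but not the ground region, or vice versa).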
<< Standardization Unit >>
In step ST14, the standardization unit 124 acquires the total number of people in the video frame 22 based on the foreground image 221. Based on this total, the standardization unit 124 standardizes the number of people in each three-dimensional area vij of the plurality of three-dimensional areas in the corrected density distribution 223. Specifically, the standardization unit 124 standardizes the crowd density with the total number of people htotal present in the video frame, using Equation (1) and Equation (2) below. The standardization unit 124 calculates the total number of people htotal in the video frame 22 by applying the relational expression 142 between foreground area and number of people to the entire foreground image 221. In Equation (2), Rows is the total number of indices i, and Cols is the total number of indices j.
[Equation (1)]
[Equation (2)]
The standardization unit 124 applies the standardization processing to all three-dimensional areas vij of the corrected density distribution 223, and outputs the result as the definite density distribution 224.
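Since the typeset Equations (1) and (2) are not reproduced in this text, the following sketch shows one plausible reading of the standardization step: the corrected density distribution is rescaled so that its values sum to htotal over all Rows × Cols cells. This interpretation and the function name are assumptions, not the patent's exact formulas:

```python
def standardize(h, h_total):
    """Scale the corrected density distribution so that its values sum
    to the total number of people h_total estimated over the whole
    foreground image (one plausible reading of Equations (1)-(2))."""
    s = sum(h.values())  # the double sum over i and j in Equation (2)
    if s == 0:
        return {key: 0.0 for key in h}
    return {key: value * h_total / s for key, value in h.items()}
```

After this step the distribution is consistent with the frame-wide people count even if the per-cell estimates were individually biased.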
<< Distribution Output Processing >>
In step ST15, the distribution output unit 125 acquires the determined density distribution 224 from the standardization unit 124. The distribution output unit 125 converts the definite density distribution 224 into an output format, and outputs the result as the crowd density distribution 225 to the result output unit 130.
*** Other configuration ***
In the present embodiment, the functions of the video acquisition unit 110, the analysis unit 120, and the result output unit 130 are realized by software. In the following, as a modification, the functions of the video acquisition unit 110, the analysis unit 120, and the result output unit 130 may be realized by hardware.
FIG. 13 is a diagram showing a configuration of a crowd density calculation device 100 according to a modification of the present embodiment.
The crowd density calculation device 100 includes an electronic circuit 909, a memory 921, an auxiliary storage device 922, an input interface 930, an output interface 940, and a communication device 950.
The electronic circuit 909 is a dedicated electronic circuit that implements the functions of the video acquisition unit 110, the analysis unit 120, and the result output unit 130.
Specifically, the electronic circuit 909 is a single circuit, a complex circuit, a programmed processor, a parallel programmed processor, a logic IC, a GA, an ASIC, or an FPGA. GA is an abbreviation of Gate Array. ASIC is an abbreviation of Application Specific Integrated Circuit. FPGA is an abbreviation of Field-Programmable Gate Array.
The functions of the video acquisition unit 110, the analysis unit 120, and the result output unit 130 may be realized by one electronic circuit, or may be distributed over a plurality of electronic circuits.
As another modification, some of the functions of the video acquisition unit 110, the analysis unit 120, and the result output unit 130 may be realized by an electronic circuit, and the remaining functions may be realized by software.
Each of the processor and the electronic circuit is also referred to as processing circuitry. That is, in the crowd density calculation device 100, the functions of the video acquisition unit 110, the analysis unit 120, and the result output unit 130 are realized by processing circuitry.
In the crowd density calculation device 100, the "unit" of the video acquisition unit 110, the foreground extraction unit 121, the provisional density calculation unit 122, the presence determination unit 123, the standardization unit 124, the distribution output unit 125, and the result output unit 130 may be read as "step" or "process". Likewise, the "process" of the video acquisition process, the foreground extraction process, the provisional density calculation process, the presence determination process, the standardization process, the distribution output process, and the result output process may be read as "program", "program product", or "computer-readable storage medium on which the program is recorded".
*** Explanation of the effect of the present embodiment ***
In the crowd density calculation device 100 according to the present embodiment, the foreground extraction unit extracts a foreground image from a video frame of the video captured by the camera. The provisional density calculation unit calculates the number of people present in each area obtained by projecting a region of the physical space onto the video frame, that is, the apparent number of people, and generates the provisional density distribution. The presence determination unit determines the regions of the physical space where no people are present, and corrects the provisional density distribution based on the determination result. The result output unit outputs the corrected and standardized provisional density distribution as the crowd density distribution to an output device such as a display.
Therefore, according to the crowd density calculation device 100 of the present embodiment, the positions of people in the physical space of the real world can be calculated from the video frames input from the camera and output as a crowd density distribution.
Second Embodiment
In this embodiment, points different from the first embodiment will be described. Components similar to those in the first embodiment are denoted by the same reference signs, and their description may be omitted.
In the first embodiment, the three-dimensional areas vij of the plurality of three-dimensional areas overlap each other. For this reason, when the provisional density distribution is calculated, the foreground may appear in three-dimensional areas vij other than the one in which a person actually is, which can make the calculated provisional density distribution inaccurate. In the present embodiment, a configuration is described that eliminates the influence of this overlap and makes the crowd density distribution more accurate.
In the present embodiment, the standardization unit 124 of the first embodiment is omitted, and a position correction unit 126 is newly added between the provisional density calculation unit 122 and the presence determination unit 123.
The detailed configuration of the analysis unit 120a according to the present embodiment will be described with reference to FIG.
The analysis unit 120 a includes a foreground extraction unit 121, a provisional density calculation unit 122, a position correction unit 126, a presence determination unit 123, and a distribution output unit 125.
Details of analysis processing by the analysis unit 120a according to the present embodiment will be described using FIG.
In FIG. 15, steps ST11 and ST12 are the same as in the first embodiment.
In step ST16, the position correction unit 126 acquires the provisional density distribution 222 from the provisional density calculation unit 122. The position correction unit 126 corrects the provisional density distribution 222 based on the number of people in overlapping areas, that is, the portions where adjacent three-dimensional spaces among the plurality of three-dimensional spaces overlap, and outputs the corrected provisional density distribution 222. In other words, using the provisional density distribution 222, the position correction unit 126 corrects the positions of people in consideration of the influence of the overlapping of the three-dimensional areas vij.
FIG. 16 is a diagram conceptually illustrating the position correction processing performed by the position correction unit 126 according to the present embodiment.
The overlap area A_dup,lm is, as shown in FIG. 16, the area in which the three-dimensional region v_l and the three-dimensional region v_m overlap each other.
As shown in FIG. 16, the position correction processing by the position correction unit 126 acts as a kind of filter that sharpens values that have been dispersed into the surrounding regions.
Here, the subscripts l and m are the two-variable subscript ij rewritten as a single variable for convenience; the description of the present embodiment below uses l and m. The relationship between l and ij is l = i × Cols + j, and the relationship between m and ij is the same.
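The subscript conversion l = i × Cols + j can be sketched as a pair of helper functions; the 4-column grid below is an arbitrary illustration, not a value from the patent:

```python
def flatten(i, j, cols):
    """Rewrite the two-variable subscript (i, j) as the single subscript l = i * Cols + j."""
    return i * cols + j

def unflatten(l, cols):
    """Recover the two-variable subscript (i, j) from the single subscript l."""
    return divmod(l, cols)

# Round trip on a grid with Cols = 4: cell (i, j) = (2, 3) maps to l = 2 * 4 + 3 = 11.
cols = 4
l = flatten(2, 3, cols)
assert l == 11
assert unflatten(l, cols) == (2, 3)
```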
Here, let h_t be the vector of the numbers of people to be determined in the regions ΔS_l, let A be the coefficient matrix representing the overlap relationship between cells, and let h be the vector of the output obtained as the provisional density distribution 222. The number of people in each region ΔS_l can then be calculated by Equations (3) and (4).
By using Equations (3) and (4), the position correction unit 126 can output a highly accurate crowd density distribution 225 from which the influence of the overlap areas has been removed.
The derivation of Equation (3) is explained in detail below. Let h_tl be the number of people actually present in the three-dimensional space V_l, and let h_l be the number of people appearing in the three-dimensional region v_l. Let h_com,l→m be the number of people, out of the h_tl people present in V_l, who appear in the three-dimensional region v_m. The apparent number of people h_l appearing in v_l can then be expressed by Equation (5). h_com,l→m can be written as the number of people h_tl present in cell l multiplied by a coefficient α_lm, as in Equation (6), so the coefficient matrix representing the overlap relationship between cells is given by Equation (4). Setting α_lm = 1 when l = m, Equations (5) and (6) yield Equation (7), which expresses the apparent number of people as the actual number of people multiplied by the coefficients.
Letting N be the total number of values of l, Equation (7) gives N simultaneous linear equations in N unknowns; rewriting them in matrix form gives Equation (8). Multiplying both sides of Equation (8) by the inverse of the coefficient matrix A yields Equation (3).
Equation (3):  h_t = A⁻¹ h
Equation (4):  A = (α_ml), l, m = 1, …, N (the (l, m) entry of A is α_ml)
Equation (5):  h_l = Σ_m h_com,m→l
Equation (6):  h_com,l→m = α_lm h_tl
Equation (7):  h_l = Σ_m α_ml h_tm (with α_ll = 1)
Equation (8):  h = A h_t
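The correction expressed by Equations (3) and (8) amounts to solving a linear system: given the apparent per-cell counts h and the overlap coefficient matrix A, the actual counts h_t are recovered as h_t = A⁻¹ h. The sketch below uses a toy two-cell case with an illustrative overlap coefficient of 0.3 (not a value from the patent), and a small Gaussian-elimination routine in place of a linear-algebra library:

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix [A | b]
    for col in range(n):
        # Pivot on the largest entry in this column for numerical stability.
        p = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[p] = M[p], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

# Two overlapping cells; A's (l, m) entry is the fraction of cell m's people
# that also appear in cell l (alpha_ml), with 1 on the diagonal.
A = [[1.0, 0.3],
     [0.3, 1.0]]
h_true = [2.0, 4.0]                  # actual people per cell, h_t
h = [A[0][0] * 2.0 + A[0][1] * 4.0,  # apparent counts, h = A h_t
     A[1][0] * 2.0 + A[1][1] * 4.0]
h_corrected = solve(A, h)            # position correction: h_t = A^-1 h
assert all(abs(a - b) < 1e-9 for a, b in zip(h_corrected, h_true))
```

Each apparent count is inflated by the people of the neighbouring cell whose foreground spills into the overlap; inverting A removes exactly that spill.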
Note that the method of obtaining the coefficient α_lm is not limited. For example, α_lm may be calculated as in Equation (9), using the area A_l of the three-dimensional region v_l shown in FIG. 16 and the overlap area A_dup,lm of the three-dimensional regions v_l and v_m.
Equation (9):  α_lm = A_dup,lm / A_l
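A minimal sketch of Equation (9): the coefficient α_lm is the overlap area divided by the area of region v_l. The area values below are arbitrary illustrations, not values from the patent:

```python
def alpha(area_l, area_dup_lm):
    """Equation (9): alpha_lm = A_dup,lm / A_l, i.e. the fraction of region v_l
    covered by its overlap with region v_m (foreground assumed uniform over v_l)."""
    return area_dup_lm / area_l

# Region v_l has area 50 (arbitrary units), of which 10 overlaps v_m.
assert abs(alpha(50.0, 10.0) - 0.2) < 1e-12
# The coefficient is not symmetric: seen from a larger region v_m (area 100),
# the same overlap area yields a smaller coefficient.
assert abs(alpha(100.0, 10.0) - 0.1) < 1e-12
```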
Note that the calculation of the coefficient matrix A using Equation (9) is performed only once, in the analysis parameter reading process of step ST01.
The subsequent steps ST13 and ST15 are the same as in the first embodiment.
As described above, according to the crowd density calculation device using the analysis unit 120a of the present embodiment, the crowd density distribution can be calculated with higher accuracy by removing the influence of the overlap areas.
Third Embodiment
In the present embodiment, points that differ from the second embodiment will be described. Configurations similar to those of Embodiments 1 and 2 are denoted by the same reference signs, and their description may be omitted.
In the second embodiment, the coefficient matrix A representing the overlap relationship, calculated using Equation (9), assumes that the foreground appears uniformly over the three-dimensional region v_ij. In practice, the coefficient matrix A is affected by where people actually are within the three-dimensional space V_ij, so the calculated crowd density distribution may contain errors depending on those positions. In the present embodiment, the coefficient matrix A is therefore optimized by numerical calculation. The total number of people h_total used for the crowd density distribution at that time is the total number of people present in the frame; it is calculated by applying the relational expression between foreground area and number of people to the entire foreground image.
Details of the analysis processing according to the present embodiment will be described with reference to FIG.
In FIG. 17, steps ST11, ST12 and ST16 are the same as in the first embodiment.
In step ST17, the position correction unit 126 optimizes the coefficient matrix A by recalculating it. The position correction unit 126 repeats the correction of the provisional density distribution until the error in the total number of people in the video frame becomes equal to or less than a threshold.
Equations (2) and (7) are used to calculate h'_total.
The evaluation function for the correction of the provisional density distribution is defined by Equation (10). The position correction unit 126 repeats the calculation, using the steepest descent method, until the error E calculated by Equation (10) becomes equal to or less than the threshold.
Equation (10):  E = (h_total − h'_total)²
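The iteration of step ST17 can be sketched in much-simplified form. Here a single scalar s stands in for the per-frame refinement of the coefficient matrix A, and the evaluation function is taken to be the squared error between the corrected total and the reference total h_total obtained from the foreground image; both choices are simplifying assumptions of this sketch, not the patent's exact formulation:

```python
def optimize_scale(h_apparent, h_total_ref, lr=0.01, tol=1e-6, max_iter=10000):
    """Steepest descent on E = (s * sum(h_apparent) - h_total_ref) ** 2.
    The scalar s is a stand-in for re-estimating the coefficient matrix A."""
    s = 1.0
    total = sum(h_apparent)
    for _ in range(max_iter):
        err = s * total - h_total_ref
        if err ** 2 <= tol:            # stop once E falls below the threshold
            break
        s -= lr * 2.0 * err * total    # dE/ds = 2 * err * total
    return s, [s * h for h in h_apparent]

# Illustrative apparent counts whose sum (7.8) exceeds the reference total (6.0).
s, h_corrected = optimize_scale([3.2, 4.6], h_total_ref=6.0)
assert abs(sum(h_corrected) - 6.0) < 1e-2
```

The loop structure mirrors step ST17: evaluate the error, stop if it is below the threshold, otherwise take a gradient step and repeat.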
Note that the optimization method used by the position correction unit 126 is not limited to the steepest descent method.
As described above, the crowd density calculation device according to the present embodiment updates the coefficient matrix A every frame using Equation (10). The crowd density calculation device according to the present embodiment can therefore calculate the crowd density distribution with higher accuracy than in the second embodiment.
In the first to third embodiments above, each part of the crowd density calculation device has been described as an independent functional block. However, the crowd density calculation device need not have the configuration of the embodiments described above; its functional blocks may have any configuration capable of realizing the functions described in the embodiments. The crowd density calculation device may also be a system composed of a plurality of devices rather than a single device.
In addition, several parts of Embodiments 1 to 3 may be implemented in combination, or only one part of these embodiments may be implemented. These embodiments may be combined in any way, in whole or in part.
That is, in Embodiments 1 to 3, the embodiments may be freely combined, any component of an embodiment may be modified, and any component of an embodiment may be omitted.
The embodiments described above are essentially preferable examples, and are not intended to limit the scope of the present invention, of its applications, or of its uses. The embodiments described above can be modified in various ways as needed. The crowd density calculation device according to the embodiments described above can be applied to a crowd density estimation device that estimates the density of a crowd, and to a crowd density estimation system.
Reference Signs List: 21 video stream, 22 video frame, 100 crowd density calculation device, 110 video acquisition unit, 120, 120a analysis unit, 121 foreground extraction unit, 122 provisional density calculation unit, 123 presence determination unit, 124 standardization unit, 125 distribution output unit, 126 position correction unit, 130 result output unit, 140 storage unit, 141 analysis parameter, 142 relational expression, 143 level threshold, 200 camera, 221 foreground image, 222 provisional density distribution, 223 corrected density distribution, 224 definite density distribution, 225 crowd density distribution, 909 electronic circuit, 910 processor, 921 memory, 922 auxiliary storage device, 930 input interface, 940 output interface, 950 communication device, S100 crowd density calculation processing.

Claims (8)

1.  A crowd density calculation device comprising:
    a video acquisition unit that acquires a video frame from a video stream in which people are imaged; and
    an analysis unit that associates three-dimensional coordinates with the video frame, acquires, as each of a plurality of three-dimensional regions, a region representing each of a plurality of three-dimensional spaces obtained on the video frame based on the three-dimensional coordinates, and calculates, as a crowd density distribution, a density distribution of people in the video frame based on a number of people present in each of the plurality of three-dimensional regions.
2.  The crowd density calculation device according to claim 1, wherein the analysis unit comprises:
    a foreground extraction unit that extracts images of people in the video frame as a foreground image;
    a provisional density calculation unit that calculates, as a provisional density distribution, a number of people apparently present in each of the plurality of three-dimensional regions based on the foreground image; and
    a presence determination unit that determines whether a person is present in each of the plurality of three-dimensional spaces, and that outputs, as a corrected density distribution, the provisional density distribution in which the number of people in each three-dimensional region corresponding to a three-dimensional space determined to contain no person is corrected to zero.
3.  The crowd density calculation device according to claim 2, wherein each of the plurality of three-dimensional regions comprises a head area to which the head of a person corresponds when the person stands in the corresponding three-dimensional space, and a ground area corresponding to the ground on which the person stands, and
    wherein the presence determination unit determines that a person is present in the three-dimensional space corresponding to a three-dimensional region when a person is present in both the head area and the ground area of the three-dimensional region.
4.  The crowd density calculation device according to claim 2 or 3, wherein the analysis unit comprises:
    a standardization unit that acquires a total number of people in the video frame based on the foreground image and standardizes, based on the total number of people, the number of people in each of the plurality of three-dimensional regions in the corrected density distribution; and
    a distribution output unit that acquires the corrected density distribution standardized by the standardization unit as a definite density distribution and converts the definite density distribution into an output format.
5.  The crowd density calculation device according to claim 2 or 3, wherein the analysis unit comprises a position correction unit that corrects the provisional density distribution based on a number of people in overlap areas each representing an overlapping portion of adjacent three-dimensional regions among the plurality of three-dimensional regions, and that outputs the corrected provisional density distribution.
6.  The crowd density calculation device according to claim 5, wherein the position correction unit repeats the correction of the provisional density distribution until an error in the total number of people in the video frame becomes equal to or less than a threshold.
7.  A crowd density calculation method comprising:
    acquiring, by a video acquisition unit, a video frame from a video stream in which people are imaged; and
    associating, by an analysis unit, three-dimensional coordinates with the video frame, acquiring, as each of a plurality of three-dimensional regions, a region representing each of a plurality of three-dimensional spaces obtained on the video frame based on the three-dimensional coordinates, and calculating a crowd density distribution, which is a density distribution of people in the video frame, based on a number of people present in each of the plurality of three-dimensional regions.
8.  A crowd density calculation program that causes a computer to execute:
    a video acquisition process of acquiring a video frame from a video stream in which people are imaged; and
    an analysis process of associating three-dimensional coordinates with the video frame, acquiring, as each of a plurality of three-dimensional regions, a region representing each of a plurality of three-dimensional spaces obtained on the video frame based on the three-dimensional coordinates, and calculating a crowd density distribution, which is a density distribution of people in the video frame, based on a number of people present in each of the plurality of three-dimensional regions.
PCT/JP2017/039901 2017-11-06 2017-11-06 Crowd density calculation device, crowd density calculation method and crowd density calculation program WO2019087383A1 (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
PCT/JP2017/039901 WO2019087383A1 (en) 2017-11-06 2017-11-06 Crowd density calculation device, crowd density calculation method and crowd density calculation program
CN201780096261.XA CN111279392B (en) 2017-11-06 2017-11-06 Cluster density calculation device, cluster density calculation method, and computer-readable storage medium
JP2019550118A JP6678835B2 (en) 2017-11-06 2017-11-06 Crowd density calculation device, crowd density calculation method, and crowd density calculation program
SG11202002953YA SG11202002953YA (en) 2017-11-06 2017-11-06 Crowd density calculation apparatus, crowd density calculation method, and crowd density calculation program

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2017/039901 WO2019087383A1 (en) 2017-11-06 2017-11-06 Crowd density calculation device, crowd density calculation method and crowd density calculation program

Publications (1)

Publication Number Publication Date
WO2019087383A1 true WO2019087383A1 (en) 2019-05-09

Family

ID=66331546

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2017/039901 WO2019087383A1 (en) 2017-11-06 2017-11-06 Crowd density calculation device, crowd density calculation method and crowd density calculation program

Country Status (4)

Country Link
JP (1) JP6678835B2 (en)
CN (1) CN111279392B (en)
SG (1) SG11202002953YA (en)
WO (1) WO2019087383A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110287929A (en) * 2019-07-01 2019-09-27 腾讯科技(深圳)有限公司 The quantity of target determines method, apparatus, equipment and storage medium in group region
CN112749589A (en) * 2019-10-30 2021-05-04 中移(苏州)软件技术有限公司 Method and device for determining routing inspection path and storage medium
JP2023501690A (en) * 2019-11-20 2023-01-18 オムロン株式会社 Methods and systems for predicting crowd dynamics

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2016091326A (en) * 2014-11-05 2016-05-23 日本電信電話株式会社 Camera image person counting method and camera image person counting apparatus
JP2016163075A (en) * 2015-02-26 2016-09-05 キヤノン株式会社 Video processing device, video processing method, and program
JP2017041869A (en) * 2015-08-20 2017-02-23 株式会社東芝 Image processing system, image processing method, and program

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101464944B (en) * 2007-12-19 2011-03-16 中国科学院自动化研究所 Crowd density analysis method based on statistical characteristics
CN101714293A (en) * 2009-12-16 2010-05-26 上海交通投资信息科技有限公司 Stereoscopic vision based acquisition method of congestion degree of bus passenger flow
CN102982341B (en) * 2012-11-01 2015-06-24 南京师范大学 Self-intended crowd density estimation method for camera capable of straddling
JP6276519B2 (en) * 2013-05-22 2018-02-07 株式会社 日立産業制御ソリューションズ Person counting device and human flow line analyzing device
CN104504688A (en) * 2014-12-10 2015-04-08 上海大学 Method and system based on binocular stereoscopic vision for passenger flow density estimation
US20170053172A1 (en) * 2015-08-20 2017-02-23 Kabushiki Kaisha Toshiba Image processing apparatus, and image processing method
CN106326937B (en) * 2016-08-31 2019-08-09 郑州金惠计算机***工程有限公司 Crowd density distribution estimation method based on convolutional neural networks
CN107256225B (en) * 2017-04-28 2020-09-01 济南中维世纪科技有限公司 Method and device for generating heat map based on video analysis



Also Published As

Publication number Publication date
JPWO2019087383A1 (en) 2020-04-02
SG11202002953YA (en) 2020-05-28
CN111279392B (en) 2023-12-15
CN111279392A (en) 2020-06-12
JP6678835B2 (en) 2020-04-08


Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 17930872; country of ref document: EP; kind code of ref document: A1.
ENP Entry into the national phase. Ref document number: 2019550118; country of ref document: JP; kind code of ref document: A.
NENP Non-entry into the national phase. Ref country code: DE.
122 Ep: PCT application non-entry in European phase. Ref document number: 17930872; country of ref document: EP; kind code of ref document: A1.