JP7309392B2

JP7309392B2 - Image processing device, image processing method and program

Info

Publication number: JP7309392B2
Application number: JP2019048472A
Authority: JP
Inventors: 哲広船城
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2019-03-15
Filing date: 2019-03-15
Publication date: 2023-07-18
Anticipated expiration: 2039-03-15
Also published as: JP2020149565A

Description

The present invention relates to an image processing device, an image processing method, and a program.

Systems are known that capture a predetermined area with an imaging device and analyze the captured image to count the number of people in the image. Such systems are expected to be used to detect congestion in public spaces and, by grasping the flow of people during congestion, to relieve crowding at events and to guide evacuation in the event of a disaster.

As a method of counting the number of people in an image, Patent Document 1 discloses a method of counting the number of persons detected by a human body detection means. Patent Document 2 discloses a method of directly estimating the number of people present in a predetermined region of an image using a recognition model obtained by machine learning.

Patent Document 1: Japanese Patent Application Laid-Open No. 2015-70359
Patent Document 2: Japanese Patent Application Laid-Open No. 2018-22340

The method of Patent Document 1 can estimate the number of people with high accuracy when people are sparsely distributed and each person is at least a predetermined size in the image. However, when people are densely packed and most of each body is occluded, or when a human body is smaller than the predetermined size, the accuracy of the human body detection means degrades.

The method of Patent Document 2 can estimate the number of people with high accuracy, even when people are densely packed, provided that the size of the people in the image matches the size assumed during training. However, when the size of the people deviates greatly from the assumed size, the accuracy of the estimate degrades.

An object of the present invention is to enable the number of people in an image to be estimated with high accuracy.

According to one aspect of the present invention, there is provided an image processing device comprising: dividing means for determining a plurality of divided regions by dividing an image; and estimating means for estimating the number of people in each of the plurality of divided regions determined by the dividing means, wherein the dividing means determines the position and size of each of the plurality of divided regions based on a human body size assumed at each position in the image and a range of human body sizes for which the estimating means can estimate with an accuracy rate equal to or higher than a target value, and wherein at least two of the plurality of divided regions partially overlap each other.

According to the present invention, the number of people in an image can be estimated with high accuracy.

FIG. 1 is a block diagram showing an example of the hardware configuration of the image processing device.
FIG. 2 is a block diagram showing an example of the functional configuration of the image processing device.
FIG. 3 is a diagram showing an example of an image.
FIG. 4 is a diagram showing an example of divided regions.
FIG. 5 is a diagram showing the distribution of the estimation accuracy rate.
FIG. 6 is a flowchart showing a processing method of the region dividing unit.
FIG. 7 is a flowchart showing a processing method of the region dividing unit.

Preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings. However, embodiments of the present invention are not limited to those described below. Identical or equivalent components, members, and processes shown in the drawings are denoted by the same reference numerals, and redundant description is omitted as appropriate. In each drawing, some members that are not important to the explanation are omitted.

FIG. 1 is a block diagram showing an example of the hardware configuration of an image processing device 100 according to an embodiment of the present invention. As its hardware configuration, the image processing device 100 has a CPU 101, a memory 102, a network interface 103, a display device 104, and an input device 105.

The CPU 101 controls the entire image processing device 100. The memory 102 stores data, programs, and the like processed by the CPU 101. The input device 105 is a mouse, buttons, or the like, and receives user operations. The display device 104 is a liquid crystal display or the like, and displays results of processing by the CPU 101. The network interface 103 is an interface for connecting the image processing device 100 to a network. The processing of FIGS. 2 to 5 described later is realized by the CPU 101 executing a program stored in the memory 102.

FIG. 2 is a block diagram showing an example of the functional configuration of the image processing device 100. As its functional configuration, the image processing device 100 has an image acquisition unit 201, a distribution acquisition unit 202, a region dividing unit 203, a number-of-people estimation unit 204, a number-of-people integration unit 205, and a display unit 206.

The image acquisition unit 201 acquires an image in which the number of people is to be estimated.

The distribution acquisition unit 202 acquires the distribution of human body sizes assumed at each pixel position in the image acquired by the image acquisition unit 201. The distribution acquisition unit 202 may acquire the distribution by having the user specify human body sizes at several positions in the image and estimating, by interpolation, the average human body size at an arbitrary position in the image. Alternatively, the distribution acquisition unit 202 may detect human bodies, estimate the average human body size at an arbitrary position in the image by interpolation from the detection results, and thereby acquire the distribution of human body sizes.

In the estimation by interpolation, for example, letting s be the size of the human body frame at coordinates (x, y) in the image, s is assumed to be expressible in terms of x, y, and one or more unknown parameters. For example, assume s = ax + by + c. In this example, the unknown parameters are a, b, and c. The distribution acquisition unit 202 can obtain the unknown parameters by statistical processing such as the least-squares method, using either the set of human body positions and sizes specified by the user or the set of human body positions and sizes detected by human body detection.
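As a non-limiting illustration, the least-squares fit of s = ax + by + c may be sketched in Python as follows. The function names `fit_body_size_plane` and `body_size_at`, and the representation of each sample as an (x, y, s) tuple, are assumptions introduced only for this sketch and do not appear in this description. The sketch solves the 3 × 3 normal equations by Gauss-Jordan elimination; any other least-squares solver could be used instead.

```python
def fit_body_size_plane(samples):
    """Fit s = a*x + b*y + c to (x, y, s) samples by ordinary least squares.

    samples: iterable of (x, y, s) tuples (hypothetical input format).
    Returns the parameters (a, b, c).
    """
    samples = list(samples)
    n = len(samples)
    if n < 3:
        raise ValueError("at least three samples are needed")
    # Build the augmented normal equations (A^T A | A^T s) for p = (a, b, c).
    sxx = sum(x * x for x, y, s in samples)
    sxy = sum(x * y for x, y, s in samples)
    syy = sum(y * y for x, y, s in samples)
    sx = sum(x for x, y, s in samples)
    sy = sum(y for x, y, s in samples)
    sxs = sum(x * s for x, y, s in samples)
    sys_ = sum(y * s for x, y, s in samples)
    ss = sum(s for x, y, s in samples)
    m = [
        [sxx, sxy, sx, sxs],
        [sxy, syy, sy, sys_],
        [sx, sy, float(n), ss],
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("samples are degenerate (for example, collinear)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    a, b, c = (m[i][3] / m[i][i] for i in range(3))
    return a, b, c


def body_size_at(params, x, y):
    """Evaluate the fitted model at pixel (x, y)."""
    a, b, c = params
    return a * x + b * y + c
```

With the fitted parameters, `body_size_at` gives the assumed human body size at any pixel, which is the kind of per-pixel distribution the distribution acquisition unit 202 is described as acquiring.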

FIG. 3 is a diagram showing an example of an image 300 acquired by the image acquisition unit 201. The image 300 is captured by an imaging device installed at a high position. The image 300 includes human bodies 301, 302, and 303. When the imaging device is installed at an oblique angle, as in FIG. 3, the human body 303 in the foreground, close to the imaging device, appears large, and the human body 301 in the background, far from the imaging device, appears small. The distribution acquisition unit 202 acquires the distribution of human body sizes assumed at each pixel position in the image 300 acquired by the image acquisition unit 201.

The region dividing unit 203 in FIG. 2 determines (generates) a plurality of divided regions by dividing the image acquired by the image acquisition unit 201.

FIG. 4 is a diagram showing an example of divided regions 401 to 403 determined by the region dividing unit 203. As in FIG. 3, the image 300 includes the human bodies 301 to 303. The human body 301, located in the background far from the imaging device, appears small in the image 300. The region dividing unit 203 therefore determines a divided region 401 corresponding to the position of the human body 301 based on the size of the human body 301 in the image 300. Similarly, the region dividing unit 203 determines a divided region 402 corresponding to the position of the human body 302 based on the size of the human body 302 in the image 300, and a divided region 403 corresponding to the position of the human body 303 based on the size of the human body 303 in the image 300. The divided regions 401 to 403 are rectangular, for example square.

The number-of-people estimation unit 204 in FIG. 2 uses a regressor obtained by machine learning to sequentially estimate the number of human bodies (the number of people) in each of the plurality of divided regions determined by the region dividing unit 203 in the image acquired by the image acquisition unit 201. As one example of a method for estimating the number of objects in an image, a technique is described that uses a regressor taking a small image of a fixed size S as input and outputting the number of objects appearing in that small image. When the objects are human bodies, the regressor is trained in advance with a known machine learning technique, such as a support vector machine or deep learning, using as training data a large number of small images in which the positions of human bodies (for example, heads) are known. To improve the accuracy of the regressor, it is desirable that the ratio between the size of the small image and the size of the people appearing in it be approximately constant across the training data.

For each of the plurality of divided regions determined by the region dividing unit 203, the number-of-people estimation unit 204 resizes the image within the divided region to the fixed size S to obtain a small image, and inputs the small image to the regressor to obtain the positions of human bodies within the divided region as the output of the regressor. The number-of-people estimation unit 204 takes the number of such positions output by the regressor (that is, the number of human bodies) as the estimated number of people in the divided region. The number of objects estimated by the number-of-people estimation unit 204 is not necessarily an integer and may be a real number. The number-of-people estimation unit 204 may round the real number to the nearest integer or may use the real number as is. By constraining the ratio between the size of the divided region and the size of the objects to be estimated so that it approximately matches the corresponding ratio in the training data, the number-of-people estimation unit 204 can improve the accuracy rate of the estimation.

FIG. 5 is a graph showing an example of the estimation accuracy rate of the number-of-people estimation unit 204. The horizontal axis indicates the ratio of the width in pixels of the human bodies 301 to 303 to the width in pixels of the divided regions 401 to 403. The vertical axis indicates the accuracy rate of the estimation by the number-of-people estimation unit 204. For example, when a divided region is 100 pixels × 100 pixels and the width of a human body is 50 pixels, the ratio on the horizontal axis is 50 pixels / 100 pixels = 50%. The accuracy rate of the number-of-people estimation unit 204 generally follows a normal distribution centered on the ratio between the size of the small images and the size of the people appearing in them in the training data used to train the regressor. A ratio range 502 is the range of the ratio of the width in pixels of the human bodies 301 to 303 to the width in pixels of the divided regions 401 to 403 over which the estimation accuracy rate is equal to or higher than a target value 501 (hereinafter, the ideal ratio range). The number-of-people estimation unit 204 can estimate the number of people whose body sizes fall within the ratio range 502 with an accuracy rate equal to or higher than the target value 501. The width of the ratio range 502, which is the ideal ratio range, changes according to the target value 501.
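As a non-limiting illustration of how the ratio range 502 depends on the target value 501, the following Python sketch assumes that the accuracy curve of FIG. 5 is exactly a scaled Gaussian, accuracy(r) = peak · exp(−(r − center)² / (2σ²)). The parameters `center`, `sigma`, and `peak` and both function names are hypothetical; in practice the curve would be measured for the trained regressor rather than assumed.

```python
import math


def accuracy_at_ratio(ratio, center, sigma, peak):
    """Assumed Gaussian-shaped accuracy curve over the body/region width ratio."""
    return peak * math.exp(-((ratio - center) ** 2) / (2.0 * sigma ** 2))


def ideal_ratio_range(center, sigma, peak, target):
    """Return (r_min, r_max) where the assumed curve is at or above `target`.

    Solving peak * exp(-(r - center)^2 / (2 sigma^2)) >= target for r gives
    |r - center| <= sigma * sqrt(2 * ln(peak / target)).
    """
    if not 0.0 < target <= peak:
        raise ValueError("target must be positive and not exceed the peak accuracy")
    half_width = sigma * math.sqrt(2.0 * math.log(peak / target))
    return center - half_width, center + half_width
```

Under this assumed model, raising the target value narrows the range and lowering it widens the range, which matches the statement that the width of the ratio range 502 changes according to the target value 501.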

The region dividing unit 203 shown in FIG. 2 needs to determine divided regions whose sizes correspond to the ratio range 502, which is the ideal ratio range. When the region dividing unit 203 determines appropriate divided regions according to the human body size at each pixel position in the image, the number-of-people estimation unit 204 can achieve a higher estimation accuracy rate.

The number-of-people integration unit 205 integrates the numbers of people in the divided regions estimated by the number-of-people estimation unit 204. For example, the number-of-people integration unit 205 sums the numbers of people estimated for the respective divided regions. The number-of-people integration unit 205 may instead sum only the numbers of people in the divided regions lying inside a region of the image set by the user. The number-of-people integration unit 205 may also superimpose the number of people on the image acquired by the image acquisition unit 201.
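As a non-limiting illustration, the summation performed by the number-of-people integration unit 205 may be sketched in Python as follows. The function name `integrate_counts`, the representation of a divided region as a (left, bottom, side) tuple in image coordinates with y increasing downward, the representation of the user-set region as (left, top, right, bottom), and the rule that a divided region counts only when it lies entirely inside the user-set region are all assumptions of this sketch.

```python
def integrate_counts(regions, counts, roi=None):
    """Sum per-region people counts, optionally restricted to a user-set region.

    regions: list of (left, bottom, side) square regions (hypothetical format).
    counts:  list of estimated counts, one per region; real numbers are allowed.
    roi:     optional (left, top, right, bottom) rectangle set by the user.
    """
    if len(regions) != len(counts):
        raise ValueError("regions and counts must have the same length")
    total = 0.0
    for (left, bottom, side), count in zip(regions, counts):
        if roi is not None:
            roi_left, roi_top, roi_right, roi_bottom = roi
            # Assumption: a region is counted only if fully inside the ROI.
            inside = (
                left >= roi_left
                and bottom - side >= roi_top
                and left + side <= roi_right
                and bottom <= roi_bottom
            )
            if not inside:
                continue
        total += count
    return total
```

The total is kept as a real number here; it could equally be rounded to an integer before display, as the estimation step permits either treatment.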

The display unit 206 displays the number of people integrated by the number-of-people integration unit 205 on the display device 104. The display unit 206 may display the number of people as a numerical value, or may display an image on which the number of people is superimposed. The display unit 206 may also output the number of people to a file or transmit it using a network protocol.

FIG. 6 is a flowchart showing an example of a processing method of the region dividing unit 203 in FIG. 2. In step S601, based on the distribution of human body sizes acquired by the distribution acquisition unit 202 and the divided regions already determined, the region dividing unit 203 determines the position at which the assumed human body size is largest among the pixels other than those whose human body size falls within the ideal ratio range of an already determined divided region. Details of step S601 are described with reference to FIG. 7.

FIG. 7 is a flowchart showing details of the processing in step S601 of FIG. 6. In step S701, the region dividing unit 203 initializes a candidate position of a person in the image and the human body size assumed at the candidate position. In this embodiment, in step S701, the region dividing unit 203 initializes the candidate position to the leftmost, uppermost pixel of the image and initializes the human body size assumed at the candidate position to zero.

In step S702, the region dividing unit 203 repeats the processing of steps S703 to S708, described later, for each pixel in the image acquired by the image acquisition unit 201. In this embodiment, the region dividing unit 203 first takes the leftmost, uppermost pixel of the image as the target, and then repeats steps S703 to S708 while changing the target pixel sequentially so as to raster-scan the image from the upper left to the lower right.

In step S703, based on the distribution of human body sizes acquired by the distribution acquisition unit 202, the region dividing unit 203 acquires the human body size assumed at the position of the target pixel selected in step S702.

In step S704, if any divided regions have already been determined in step S602 of FIG. 6, the region dividing unit 203 repeats the processing of steps S705 and S706, described later, for each of those divided regions. If no divided region has yet been determined, the processing proceeds to step S707.

In step S705, the region dividing unit 203 determines whether the target pixel selected in step S702 is contained in the divided region selected in step S704 (hereinafter, the target divided region). If the target pixel is not contained in the target divided region, the processing returns to the loop of step S704; if the target pixel is contained in the target divided region, the processing proceeds to step S706.

In step S706, based on the size of the target divided region of step S705 and the ratio range 502, which is the ideal ratio range, the region dividing unit 203 calculates an ideal size range, which is the range of human body sizes for which the number-of-people estimation unit 204 can estimate the number of people in the divided region with an accuracy rate equal to or higher than the target value 501.

The region dividing unit 203 then determines whether the human body size at the target pixel acquired in step S703 falls within the ideal size range calculated in step S706. If the human body size does not fall within the ideal size range, the processing returns to the loop of step S704. If the human body size falls within the ideal size range, the processing returns to step S702 and the next target pixel is selected.
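As a non-limiting illustration, the containment test of step S705 and the ideal size range test of step S706 may be sketched in Python as follows. The sketch assumes square divided regions represented as (left, bottom, side) tuples in image coordinates with y increasing downward, closed bounds on all four sides, and an ideal size range obtained by multiplying the region width by the two ends of the ratio range 502. The function names are hypothetical.

```python
def ideal_size_range(region_side, ratio_range):
    """Body-size range a region of width `region_side` can handle (S706).

    ratio_range: (r_min, r_max), the ideal ratio range of body width to
    region width. The multiplication rule is an assumption of this sketch.
    """
    r_min, r_max = ratio_range
    return region_side * r_min, region_side * r_max


def is_pixel_covered(x, y, body_size, regions, ratio_range):
    """True if some region contains (x, y) and suits the body size there."""
    for left, bottom, side in regions:          # loop of S704
        inside = left <= x <= left + side and bottom - side <= y <= bottom
        if not inside:                          # S705: pixel not in region
            continue
        low, high = ideal_size_range(side, ratio_range)
        if low <= body_size <= high:            # S706: size within range
            return True
    return False
```

For example, with a 100-pixel-wide region and a ratio range of 40% to 60%, body widths of roughly 40 to 60 pixels would be treated as within the ideal size range of that region.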

After completing the processing of step S704 for all the divided regions, the region dividing unit 203 proceeds to step S707. In step S707, the region dividing unit 203 determines whether the human body size assumed at the target pixel, acquired in step S703, is larger than the human body size at the current candidate position. If the human body size acquired in step S703 is larger than the human body size at the current candidate position, the processing proceeds to step S708. Otherwise, the processing returns to step S702 and the next target pixel is selected.

In step S708, the region dividing unit 203 sets the current target pixel as the candidate position and sets the human body size at the current target pixel as the human body size assumed at the candidate position.

After the loop of steps S702 to S708 has finished for all pixels in the image, the region dividing unit 203 determines the resulting candidate position as the position at which the human body size is largest within the area other than the pixels whose human body size falls within the ideal size range of an already determined divided region. The position determined in this way is referred to as the maximum size position.
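As a non-limiting illustration, the search of steps S701 to S708 may be sketched in Python as follows. The function name, the callable `size_at(x, y)` that returns the assumed human body size at a pixel, and the (left, bottom, side) region tuples with closed bounds are assumptions of this sketch. Unlike the flowchart, which always leaves a candidate position, the sketch returns `None` when no uncovered pixel with a positive size is found, so that the caller can tell the two cases apart.

```python
def find_max_size_position(width, height, size_at, regions, ratio_range):
    """Return ((x, y), size) of the largest uncovered body size, or None."""
    r_min, r_max = ratio_range
    candidate, candidate_size = (0, 0), 0.0              # S701
    found = False
    for y in range(height):                              # S702: raster scan
        for x in range(width):
            size = size_at(x, y)                         # S703
            covered = any(                               # S704 to S706
                left <= x <= left + side
                and bottom - side <= y <= bottom
                and side * r_min <= size <= side * r_max
                for left, bottom, side in regions
            )
            if covered:
                continue                                 # next target pixel
            if size > candidate_size:                    # S707
                candidate, candidate_size = (x, y), size  # S708
                found = True
    return (candidate, candidate_size) if found else None
```

Because the comparison in step S707 is strict, the first pixel in raster order wins when several uncovered pixels share the largest size.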

Next, in step S602 of FIG. 6, the region dividing unit 203 determines the position and size of a divided region based on the ratio range 502 in FIG. 5, the maximum size position determined in step S601, and the human body size assumed at that maximum size position. Based on the determined position and size, the region dividing unit 203 then determines a divided region that includes the position determined in step S601. In this embodiment, the region dividing unit 203 determines the divided region such that the ratio between the human body size assumed at the maximum size position determined in step S601 and the size of the divided region falls within the ideal ratio range, and such that the maximum size position is located at the lower-left corner of the divided region. The divided region may instead be a region centered on the position determined in step S601, or a region having that position at its lower-right, upper-left, or upper-right corner.

The maximum size position determined in step S601 may lie inside an already determined divided region. Specifically, among the pixels in a divided region A that has already been determined, the maximum size position may be one of the pixels whose human body size falls outside the ideal size range of the divided region A. In that case, in step S602, the region dividing unit 203 determines a divided region B such that the ratio between the human body size assumed at the maximum size position within the divided region A and the size of the divided region B falls within the ideal ratio range, and such that the maximum size position is located at the lower-left corner of the divided region B. The divided region A and the divided region B determined in this way by the region dividing unit 203 partially overlap.

Next, in step S603 of FIG. 6, the region dividing unit 203 determines whether every pixel in the image acquired by the image acquisition unit 201 falls within the ideal size range of at least one of the divided regions determined in step S602. If every pixel falls within the ideal size range of some divided region, the processing of FIG. 6 ends; if at least one pixel does not fall within the ideal size range of any divided region, the processing returns to step S601.
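As a non-limiting illustration, the overall loop of steps S601 to S603 may be sketched in Python as follows. Several choices here are assumptions of the sketch and are not stated in this description: regions are (left, bottom, side) squares with closed bounds, the region width is chosen so that the body size at the maximum size position sits at the midpoint of the ratio range, all assumed body sizes are positive, and the check of step S603 is folded into the search (the loop ends when the search finds no uncovered pixel). The function names are hypothetical.

```python
def is_covered(x, y, size, regions, ratio_range):
    """True if a region contains (x, y) and `size` is in its ideal size range."""
    r_min, r_max = ratio_range
    return any(
        left <= x <= left + side
        and bottom - side <= y <= bottom
        and side * r_min <= size <= side * r_max
        for left, bottom, side in regions
    )


def divide_into_regions(width, height, size_at, ratio_range):
    """Determine divided regions until every pixel is covered (S601 to S603)."""
    r_min, r_max = ratio_range
    if not 0.0 < r_min < r_max:
        raise ValueError("ratio_range must satisfy 0 < r_min < r_max")
    r_target = (r_min + r_max) / 2.0   # assumption: aim for the range midpoint
    regions = []
    # Each pass covers at least the selected pixel, so width * height passes
    # are always enough; the bound guards against an endless loop.
    for _ in range(width * height + 1):
        best, best_size = None, 0.0
        for y in range(height):                          # S601: raster scan
            for x in range(width):
                size = size_at(x, y)
                if size > best_size and not is_covered(
                    x, y, size, regions, ratio_range
                ):
                    best, best_size = (x, y), size
        if best is None:                                 # S603: all covered
            return regions
        side = best_size / r_target                      # S602: region size
        regions.append((best[0], best[1], side))         # lower-left anchor
    raise RuntimeError("region division did not converge")
```

With a size map that grows toward the bottom of the image, such as `size_at = lambda x, y: 2.0 + 0.5 * y`, the sketch produces large regions for the foreground first and progressively smaller regions toward the background, and regions anchored at the same column in different bands overlap.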

As described above, the region dividing unit 203 determines the position and size of each of the plurality of divided regions based on the human body size acquired in step S703 and the ratio range 502 in FIG. 5. A single divided region may contain both pixels that fall within its ideal size range and pixels that do not. However, by making some of the divided regions partially overlap, the region dividing unit 203 of this embodiment can ensure that every pixel in the image falls within the ideal size range of at least one divided region. This improves the accuracy of estimating the number of people from the image.

The ratio range 502 is the range of the ratio of the width of a human body to the width of a divided region for which the number-of-people estimation unit 204 can estimate with an accuracy rate equal to or higher than the target value. The ratio range 502, which is the ideal ratio range, corresponds to a range of human body sizes according to the estimation characteristics of the number-of-people estimation unit 204, or to a range of human body sizes for which the number-of-people estimation unit 204 can estimate with an accuracy rate equal to or higher than the target value.

The region dividing unit 203 determines the plurality of divided regions such that the human body size acquired in step S703 falls within the ideal size range, which is the range of human body sizes for which the number-of-people estimation unit 204 can estimate the number of people with an accuracy rate equal to or higher than the target value.

The region dividing unit 203 determines appropriate divided regions according to the ratio range 502 over which the number-of-people estimation unit 204 can estimate with a high accuracy rate. The number-of-people estimation unit 204 estimates the number of people in each divided region determined by the region dividing unit 203. This allows the number-of-people estimation unit 204 to increase the accuracy rate of the estimated number of people.

The user may set an estimation target region in the image. The region dividing unit 203 then determines the plurality of divided regions by dividing the estimation target region set by the user. In that case, in step S601, the region dividing unit 203 determines the above-described position within the estimation target region set by the user. In step S603, the region dividing unit 203 determines whether every pixel in the estimation target region set by the user is included in one of the plurality of divided regions.

The user may also set the target value 501 in FIG. 5. In that case, in steps S706 and S806, the region dividing unit 203 uses the ratio range 502 corresponding to the target value 501 set by the user.

(Other Embodiments)
The present invention can also be realized by processing in which a program that implements one or more functions of the above-described embodiment is supplied to a system or apparatus via a network or a storage medium, and one or more processors in a computer of the system or apparatus read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.

以上、本発明の好ましい実施形態について説明したが、本発明はこれらの実施形態に限定されず、その要旨の範囲内で種々の変形及び変更が可能である。 Preferred embodiments of the present invention have been described above; however, the present invention is not limited to these embodiments, and various modifications and changes are possible within the scope of its gist.

１００画像処理装置、２０１画像取得部、２０２分布取得部、２０３領域分割部、２０４人数推定部、２０５人数統合部、２０６表示部 100 image processing apparatus, 201 image acquisition unit, 202 distribution acquisition unit, 203 region dividing unit, 204 number-of-people estimation unit, 205 number-of-people integration unit, 206 display unit

Claims

画像を分割することにより複数の分割領域を決定する分割手段と、
前記分割手段により決定された複数の分割領域の各々の中の人数を推定する推定手段とを有し、
前記分割手段は、前記画像の中の各位置で想定される人体のサイズと、前記推定手段が目標値以上の正解率で推定することができる人体のサイズの範囲とを基に、前記複数の分割領域の各々の位置とサイズを決定し、
前記複数の分割領域のうち、少なくとも２つの分割領域の一部が重複することを特徴とする画像処理装置。
1. An image processing apparatus comprising:
dividing means for determining a plurality of divided regions by dividing an image; and
estimating means for estimating the number of people in each of the plurality of divided regions determined by the dividing means,
wherein the dividing means determines the position and size of each of the plurality of divided regions based on the size of a human body assumed at each position in the image and a range of human body sizes for which the estimating means can estimate with an accuracy rate equal to or higher than a target value, and
wherein at least two of the plurality of divided regions partially overlap each other.

前記分割手段は、前記画像の中の各画素の位置で想定される人体のサイズと、前記推定手段が目標値以上の正解率で推定することができる前記分割領域のサイズに対する人体のサイズの割合の範囲とを基に、前記複数の分割領域の各々の位置とサイズを決定することを特徴とする請求項１に記載の画像処理装置。 2. The image processing apparatus according to claim 1, wherein the dividing means determines the position and size of each of the plurality of divided regions based on the size of the human body assumed at the position of each pixel in the image and a range of ratios of the human body size to the divided region size for which the estimating means can estimate with an accuracy rate equal to or higher than a target value.

前記分割手段は、前記画像の中の各画素の位置で想定される人体のサイズと、前記推定手段が目標値以上の正解率で推定することができる前記分割領域の横幅のサイズに対する人体の横幅のサイズの割合の範囲とを基に、前記複数の分割領域の各々の位置とサイズを決定することを特徴とする請求項１または２に記載の画像処理装置。 3. The image processing apparatus according to claim 1 or 2, wherein the dividing means determines the position and size of each of the plurality of divided regions based on the size of the human body assumed at the position of each pixel in the image and a range of ratios of the width of the human body to the width of the divided region for which the estimating means can estimate with an accuracy rate equal to or higher than a target value.

前記分割手段は、前記画像の中の各画素の位置で想定される人体のサイズが、前記推定手段が目標値以上の正解率で人数を推定することができる人体のサイズの範囲に含まれるように、前記複数の分割領域を決定することを特徴とする請求項１～３のいずれか１項に記載の画像処理装置。 4. The image processing apparatus according to any one of claims 1 to 3, wherein the dividing means determines the plurality of divided regions such that the size of the human body assumed at the position of each pixel in the image falls within the range of human body sizes for which the estimating means can estimate the number of people with an accuracy rate equal to or higher than a target value.

前記画像の中の全画素は、前記複数の分割領域のうちのいずれかに含まれていることを特徴とする請求項１～４のいずれか１項に記載の画像処理装置。 5. The image processing apparatus according to any one of claims 1 to 4, wherein every pixel in the image is included in at least one of the plurality of divided regions.

前記分割手段は、前記画像の中の推定対象領域を分割することにより複数の分割領域を決定することを特徴とする請求項１～５のいずれか１項に記載の画像処理装置。 6. The image processing apparatus according to any one of claims 1 to 5, wherein the dividing means determines the plurality of divided regions by dividing an estimation target region in the image.

前記推定手段は、機械学習によって得られた回帰器を用いて、人数を推定することを特徴とする請求項１～６のいずれか１項に記載の画像処理装置。 7. The image processing apparatus according to any one of claims 1 to 6, wherein the estimating means estimates the number of people using a regressor obtained by machine learning.

画像を分割することにより複数の分割領域を決定する分割ステップと、
前記分割ステップで決定された複数の分割領域の各々の中の人数を推定する推定ステップとを有し、
前記分割ステップでは、前記画像の中の各位置で想定される人体のサイズと、前記推定ステップにおいて目標値以上の正解率で推定することができる人体のサイズの範囲とを基に、前記複数の分割領域の各々の位置とサイズを決定し、
前記複数の分割領域のうち、少なくとも２つの分割領域の一部が重複することを特徴とする画像処理装置の画像処理方法。
8. An image processing method for an image processing apparatus, the method comprising:
a dividing step of determining a plurality of divided regions by dividing an image; and
an estimating step of estimating the number of people in each of the plurality of divided regions determined in the dividing step,
wherein, in the dividing step, the position and size of each of the plurality of divided regions are determined based on the size of a human body assumed at each position in the image and a range of human body sizes for which the estimating step can estimate with an accuracy rate equal to or higher than a target value, and
wherein at least two of the plurality of divided regions partially overlap each other.

コンピュータを、請求項１～７のいずれか１項に記載された画像処理装置の各手段として機能させるためのプログラム。 9. A program for causing a computer to function as each means of the image processing apparatus according to any one of claims 1 to 7.