JP2010257451A

JP2010257451A - Device, method and program for searching person

Info

Publication number: JP2010257451A
Application number: JP2010084175A
Authority: JP
Inventors: Masaki Murakawa; 正貴村川; Takehiro Mabuchi; 健宏馬渕; Takuo Moriguchi; 拓雄森口; Satoshi Futami; 聡二見
Original assignee: Sohgo Security Services Co Ltd
Current assignee: Sohgo Security Services Co Ltd
Priority date: 2009-03-31
Filing date: 2010-03-31
Publication date: 2010-11-11
Anticipated expiration: 2030-03-31
Also published as: JP5523900B2

Abstract

PROBLEM TO BE SOLVED: To efficiently perform a highly accurate person search on captured images by a simple operation.

SOLUTION: A person search device, which extracts information about persons included in video captured by an imaging means and displays it on a screen, includes: an accumulation means that accumulates person feature information for each of the time-series images included in the video; a search means that searches the accumulation means for a matching person based on a preset condition; a screen creation means that extracts from the accumulation means the information corresponding to the search result obtained by the search means, and creates a screen for displaying a person feature or action preset for the extracted person, or patrol position information of a security robot provided with the imaging means; and an output means that displays the screen created by the screen creation means and outputs the information displayed on the screen to an external device.

COPYRIGHT: (C)2011, JPO&INPIT

Description

The present invention relates to a person search device, a person search method, and a person search program, and more particularly to a person search device, a person search method, and a person search program for efficiently performing a highly accurate person search on captured images with a simple operation.

Conventionally, there are techniques that apply image processing or the like to video captured by an imaging means such as a camera and automatically detect persons included in the images that make up the video. Among these, techniques for detecting a person's face in an image have developed rapidly. For example, to improve face discrimination accuracy, there is an image processing apparatus that extracts a skin color region from an input image, detects the positions of facial feature points such as the top of the head, the eyes, and the mouth based on that skin color region, and determines from the detection result whether the skin color region is a face (see, for example, Patent Document 1).

Such face detection techniques are applied, for example, to video obtained by surveillance cameras installed in guarded facilities such as banks, department stores, and convenience stores, and are used to identify a person quickly when a crime occurs, or to detect whether a person is suspicious and thereby prevent crimes before they happen.

Furthermore, there is a technique that accumulates detected face information together with feature information and, when the facial features of the person to be found are specified as a search condition, retrieves face images having the specified features (see, for example, Patent Document 2).

Patent Document 1: JP 2004-5384 A
Patent Document 2: JP 2006-318375 A

Incidentally, for persons detected by the conventional techniques described above, it is preferable to accumulate the many kinds of information that can be read from the camera so that image search processing can be performed efficiently and with high accuracy. However, the information that can be extracted from a person's face information alone is limited. More detailed person information is needed, for example, when determining whether a person is a specific person such as a suspicious person, or when detecting a specific person such as a suspicious person in video. In addition, because a huge amount of data is searched, the search conditions, data accumulation, search processing, and so on must be designed so that the searcher can perform an efficient, highly accurate person search with a simple operation and so that the results are output effectively.

The present invention has been made in view of the above problems, and its object is to provide a person search device, a person search method, and a person search program for efficiently performing a highly accurate person search on captured images with a simple operation.

To solve the above problems, the present invention employs means having the following features.

The present invention is a person search device that extracts information about persons included in video captured by an imaging means and displays it on a screen, characterized by having: a storage means that stores person feature information for each of the time-series images included in the video; a search means that searches the storage means for a matching person based on a preset condition; a screen generation means that extracts from the storage means the information corresponding to the search result obtained by the search means, and generates a screen for displaying a person feature or action preset for the extracted person, or patrol position information of a security robot provided with the imaging means; and an output means that displays the screen generated by the screen generation means and outputs the information displayed on the screen to an external device. This makes it possible to efficiently perform a highly accurate person search on captured images with a simple operation.

Further, the present invention has: a face region detection means that detects a face region for a person included in the video; a human body region detection means that detects a human body region for a person included in the video; an identification means that identifies a person across a plurality of images captured at different times, using the detection result obtained by the face region detection means and the detection result obtained by the human body region detection means; a tracking means that tracks the movement of the person obtained by the identification means; a suspicious person detection means that detects a suspicious person from the tracking result obtained by the tracking means; a person information integration means that integrates the feature information of the person identified by the identification means; and a frame information generation means that generates the integration result obtained by the person information integration means for each image frame constituting the video, wherein the storage means stores person feature information for each of the time-series images included in the video based on the frame information obtained by the frame information generation means.

Further, in the present invention, the frame information generation means discretizes all feature data of a person included in the video, counts the number of occurrences, and generates person feature data in which the discretized data is normalized by the number of frames.
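The discretize, count, and normalize steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the method as implemented in the patent: the feature (height in centimeters), the value range, the bin count, and the function names `discretize` and `normalized_histogram` are all hypothetical.

```python
from collections import Counter


def discretize(value, lo, hi, bins):
    """Map a continuous value in [lo, hi] to a bin index in 0..bins-1."""
    if hi <= lo or bins < 1:
        raise ValueError("need hi > lo and at least one bin")
    clipped = min(max(value, lo), hi)
    index = int((clipped - lo) / (hi - lo) * bins)
    return min(index, bins - 1)  # the upper edge falls into the last bin


def normalized_histogram(per_frame_values, lo, hi, bins):
    """Count bin occurrences over all frames, then divide by the frame count."""
    if not per_frame_values:
        return [0.0] * bins
    counts = Counter(discretize(v, lo, hi, bins) for v in per_frame_values)
    frame_count = len(per_frame_values)
    return [counts.get(b, 0) / frame_count for b in range(bins)]


# Assumed example: estimated height of one tracked person in five frames.
heights_cm = [171.0, 172.5, 169.0, 173.0, 181.0]
hist = normalized_histogram(heights_cm, lo=140.0, hi=200.0, bins=6)
```

Because each entry is divided by the number of frames in which the person appears, histograms of persons tracked for different durations remain comparable.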

Further, in the present invention, the human body region detection means extracts color information from a predetermined range of the detected human body and converts the extracted color information into a color space set in advance based on human visual characteristics, and the storage means stores the color information converted into that color space.

Further, in the present invention, the search means converts a search request entered by a searcher as the search condition into a feature amount according to the content of the person's feature, takes the maximum value of the normalized histogram as the search request feature amount, searches the person information stored in the storage means using the search request feature amount, calculates the distance from the feature amount assigned to each matching person, and scores the correct answer rate.
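One possible reading of this scoring step is sketched below. It assumes that each stored person is represented by the bin holding the maximum of their normalized histogram, that the searcher's request has already been converted to a bin index, and that the score decreases linearly with the bin distance. None of these details are fixed by the passage, and the names `representative_bin`, `score`, and `search` are hypothetical.

```python
def representative_bin(hist):
    """Index of the largest entry of a normalized histogram."""
    return max(range(len(hist)), key=lambda i: hist[i])


def score(query_bin, hist):
    """Score in [0, 1]; 1.0 means the query matches the person's dominant bin."""
    if len(hist) < 2:
        raise ValueError("histogram needs at least two bins")
    distance = abs(query_bin - representative_bin(hist))
    return 1.0 - distance / (len(hist) - 1)


def search(query_bin, people):
    """Return (person_id, score) pairs sorted by descending score."""
    scored = [(pid, score(query_bin, hist)) for pid, hist in people.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Assumed example: height histograms with six bins for two stored persons.
people = {
    "person_a": [0.0, 0.0, 0.2, 0.6, 0.2, 0.0],
    "person_b": [0.0, 0.7, 0.3, 0.0, 0.0, 0.0],
}
ranking = search(3, people)  # the request was converted to bin 3
```

With several features (height, clothes color, and so on), per-feature scores of this kind could be combined, for example by averaging, before ranking.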

Furthermore, the present invention is a person search method that extracts information about persons included in video captured by an imaging means and displays it on a screen, characterized by having: a storage procedure that stores, in a storage means, person feature information for each of the time-series images included in the video; a search procedure that searches the storage means for a matching person based on a preset condition; a screen generation procedure that extracts from the storage means the information corresponding to the search result obtained by the search procedure, and generates a screen for displaying a person feature or action preset for the extracted person, or patrol position information of a security robot provided with the imaging means; and an output procedure that displays the screen generated by the screen generation means and outputs the information displayed on the screen to an external device. This makes it possible to efficiently perform a highly accurate person search on captured images with a simple operation.

Further, the present invention has: a face region detection procedure that detects a face region for a person included in the video; a human body region detection procedure that detects a human body region for a person included in the video; an identification procedure that identifies a person across a plurality of images captured at different times, using the detection result obtained by the face region detection procedure and the detection result obtained by the human body region detection procedure; a tracking procedure that tracks the movement of the person obtained by the identification procedure; a suspicious person detection procedure that detects a suspicious person from the tracking result obtained by the tracking procedure; a person information integration procedure that integrates the feature information of the person identified by the identification procedure; and a frame information generation procedure that generates the integration result obtained by the person information integration procedure for each image frame constituting the video, wherein the storage procedure stores person feature information for each of the time-series images included in the video based on the frame information obtained by the frame information generation procedure.

Further, in the present invention, the frame information generation procedure discretizes all feature data of a person included in the video, counts the number of occurrences, and generates person feature data in which the discretized data is normalized by the number of frames.

Further, in the present invention, the human body region detection procedure extracts color information from a predetermined range of the detected human body and converts the extracted color information into a color space set in advance based on human visual characteristics, and the storage procedure stores the color information converted into that color space.

Further, in the present invention, the search procedure converts a search request entered by a searcher as the search condition into a feature amount according to the content of the person's feature, takes the maximum value of the normalized histogram as the search request feature amount, searches the person information stored in the storage means using the search request feature amount, calculates the distance from the feature amount assigned to each matching person, and scores the correct answer rate.

Furthermore, the present invention is a person search program for a person search device that extracts information about persons included in video captured by an imaging means and displays it on a screen, the program causing a computer to function as: a storage means that stores person feature information for each of the time-series images included in the video; a search means that searches the storage means for a matching person based on a preset condition; a screen generation means that extracts from the storage means the information corresponding to the search result obtained by the search means, and generates a screen for displaying a person feature or action preset for the extracted person, or patrol position information of a security robot provided with the imaging means; and an output means that displays the screen generated by the screen generation means and outputs the information displayed on the screen to an external device. This makes it possible to efficiently perform a highly accurate person search on captured images with a simple operation. In addition, by installing the program, the person search processing of the present invention can easily be realized on a general-purpose personal computer or the like.

According to the present invention, a highly accurate person search can be performed efficiently on captured images with a simple operation.

A diagram showing an example of the schematic configuration of the person search device in this embodiment.
A diagram showing an example for explaining the additional functions of the face region detection means and the human body region detection means.
A diagram showing an example of the frame information in this embodiment.
A diagram for explaining the outline of the steps up to acquiring the person features used for search in this embodiment.
A diagram showing an example for explaining the search technique in this embodiment.
A diagram showing an example of screen transitions in this embodiment.
A diagram showing an example of the initial screen in this embodiment.
A diagram showing the date and time designation screen in this embodiment.
A diagram showing the feature and action condition designation screen in this embodiment.
A diagram showing an example of a color map.
A diagram showing the search result display screen in this embodiment.
A diagram for specifically explaining the search result display area.
A diagram showing the video playback screen in this embodiment.
A diagram showing the shared screen in this embodiment.
A diagram showing an example of a hardware configuration capable of implementing the person search processing in this embodiment.
A flowchart showing an example of the storage processing procedure in this embodiment.
A flowchart showing an example of the search processing procedure in this embodiment.

<About the present invention>
The present invention makes it easy to search for and check video from the time at which a person close to the person image the searcher has in mind was captured, either in a huge amount of previously accumulated video acquired by an imaging means such as a camera or in real-time video from the imaging means. Specifically, video is searched on the basis of a person's features and actions. For example, the search combines various types of person features and actions that could not previously be combined, such as numerical values like height and subjectivity-dependent information like the presence or absence of sunglasses or the color of clothes. The imaging means in the present invention may be, for example, a fixed type installed at a preset location such as a wall or ceiling, or a mobile type mounted on a security robot or the like. An appropriate person detection technique that supports both types is provided.

Embodiments that suitably implement the person search device, person search method, and person search program of the present invention are described below with reference to the drawings.

<Schematic configuration example of person search device>
FIG. 1 is a diagram showing an example of the schematic configuration of the person search device in this embodiment. The person search device 10 shown in FIG. 1 is configured to have an input means 11, an output means 12, a storage means 13, a face region detection means 14, a human body region detection means 15, an identification means 16, a tracking means 17, a suspicious person detection means 18, a person information integration means 19, a frame information generation means 20, a search means 21, a screen generation means 22, a notification means 23, a transmission/reception means 24, and a control means 25. An imaging means 26, such as a camera that captures the persons from whom person information is to be extracted, is connected to the transmission/reception means 24, so that each of the time-series images included in the video captured by the imaging means 26 can be acquired. The imaging means 26 may also be configured integrally with the person search device 10.

The input means 11 accepts various instructions from a user or the like, such as a face region detection instruction, a human body region detection instruction, an identification instruction, a tracking instruction, a suspicious person detection instruction, a person information integration instruction, a frame information generation instruction, a search instruction, a screen generation instruction, a notification instruction, and a transmission/reception instruction. The input means 11 consists of, for example, a keyboard, a pointing device such as a mouse, and a voice input device such as a microphone.

The output means 12 displays, or outputs as audio, the instructions entered through the input means 11, the control data generated based on each instruction, and the various information obtained from the progress or results of processing executed by components such as the face region detection means 14, the human body region detection means 15, the identification means 16, the tracking means 17, the suspicious person detection means 18, the person information integration means 19, the frame information generation means 20, the search means 21, the screen generation means 22, the notification means 23, and the transmission/reception means 24. The output means 12 has a screen display function such as a display and an audio output function such as a speaker.

Furthermore, the output means 12 outputs to external devices the search results obtained by the search means 21 described later, the information displayed on the screen generated by the screen generation means 22, and the like. That is, the output means 12 has printing and output functions for external devices: for example, it outputs to a printer, generates a file and outputs it to a recording device such as a database or to a recording medium, outputs to a security robot a control signal generated based on the search results, switches sensors in the guarded facility ON/OFF or lights on/off, or outputs to a mobile terminal carried by a security guard a control signal for displaying related information based on the search results (such as the location and nature of an abnormality). The output means 12 can also output simultaneously to one or more of the external devices described above.

The storage means 13 can store the various information needed to realize the present embodiment, and is read from and written to as necessary. Specifically, the storage means 13 stores information such as: various feature amount data used for face authentication and for estimating gender, age group, and so on; face region detection results from the face region detection means 14; human body region detection results from the human body region detection means 15; person identification results from the identification means 16; tracking results from the tracking means 17; suspicious person detection results from the suspicious person detection means 18; person information integration results from the person information integration means 19; frame information generation results from the frame information generation means 20; transmission/reception information from the transmission/reception means 24; information controlled by the control means 25; error information recorded when an error occurs; log information; and the program for realizing the present invention.

For real-time video captured by the imaging means 26 such as a camera, or for the huge amount of surveillance video already stored in the storage means 13, the face region detection means 14 detects a person's face region when it determines that an image in the video contains a person. That is, the face region detection means 14 acquires, via the transmission/reception means 24, video captured by the imaging means 26 such as a camera, and detects the faces of one or more persons in predetermined images (every frame image, images spaced several frames apart, or the like) among the time-series images included in the acquired video.

Specifically, the face region detection means 14 acquires the feature amount of a face from, for example, position information of the eyes, nose, mouth, and so on of the face in a captured image, and detects a person's face by performing matching processing or the like using preset feature amount matching patterns for detecting a face. The face region detection means 14 is not limited to this face detection processing; it can also use, for example, face detection by edge detection or shape pattern detection, or face detection by hue extraction or skin color extraction. After detecting a face, the face region detection means 14 detects, for example, a rectangular face region defined by the vertical and horizontal extent of the face in the image.

The face region detection means 14 also detects the center coordinates (position information) of the face region and the size of the region on the image, acquires the various information needed to composite the face region onto the original image in a predetermined shape so that it is clearly visible on the screen, and stores this information in the storage means 13. In the present invention, the shape of the face region may be a rectangle, a circle, an ellipse, another polygon, a silhouette shape obtained by enlarging the outline of the person's face by a predetermined magnification, or the like. In addition to the above, the face region detection means 14 has several additional functions, which are described later.

For real-time video captured by the imaging means 26 such as a camera, or for the huge amount of surveillance video already stored in the storage means 13, the human body region detection means 15 detects a person's human body region when it determines that an image in the video contains a person. That is, the human body region detection means 15 acquires, via the transmission/reception means 24, video captured by the imaging means 26 such as a camera, and detects one or more human body regions in predetermined images (every frame image, images spaced several frames apart, or the like) among the time-series images included in the acquired video.

Specifically, the human body region detection means 15 compares, for example, consecutive image frames and detects a human body region when there are locations where the color information (luminance, chromaticity, etc.) changes within a predetermined time and, in addition, the region enclosed by those locations is at least a predetermined size or its range of movement over time is within a predetermined range. The human body detection technique is not limited to this in the present invention.
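The frame comparison described above can be illustrated with a simple frame-difference sketch. It assumes grayscale frames given as nested lists of luminance values, a fixed change threshold, and a minimum bounding-box area; the function names and parameter values are hypothetical, and the movement-range condition from the text is not modeled.

```python
def changed_pixels(prev, curr, threshold):
    """Set of (row, col) positions whose luminance changed by more than threshold."""
    return {
        (r, c)
        for r, row in enumerate(curr)
        for c, value in enumerate(row)
        if abs(value - prev[r][c]) > threshold
    }


def detect_body_region(prev, curr, threshold=30, min_area=4):
    """Return the bounding box (top, left, bottom, right) of the change, or None."""
    pixels = changed_pixels(prev, curr, threshold)
    if not pixels:
        return None
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    top, left, bottom, right = min(rows), min(cols), max(rows), max(cols)
    area = (bottom - top + 1) * (right - left + 1)
    # Regions smaller than min_area are treated as noise, not as a person.
    return (top, left, bottom, right) if area >= min_area else None


# Assumed example: a 5x5 dark frame, then a bright 2x3 blob appears.
prev_frame = [[0] * 5 for _ in range(5)]
curr_frame = [row[:] for row in prev_frame]
for r in (1, 2):
    for c in (1, 2, 3):
        curr_frame[r][c] = 200
region = detect_body_region(prev_frame, curr_frame)
```

The center coordinates and size of the detected region follow directly from the returned bounding box.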

The human body region detection means 15 also detects the center coordinates of the human body region and the size of the human body region on the image, acquires the various information needed to composite the human body region onto the original image in a predetermined shape so that it is clearly visible on the screen, and stores this information in the storage means 13. As with the face region described above, the shape of the human body region may be a rectangle, a circle, an ellipse, another polygon, a silhouette shape obtained by enlarging the outline of the person by a predetermined magnification, or the like. In addition to the above, the human body region detection means 15 has several additional functions, which are described later.

The identification means 16 performs identification processing to determine whether the same person is included among the one or more persons detected in two images extracted from video captured by the same camera at different times. Specifically, as a method of determining whether a person in the video being captured is someone who has already been captured or is a new person, the identification means 16 can, for example, determine whether there was a person in the preceding video; if there was, determine whether the current person is within the movable range preset for that person; and further determine whether the same person is included based on the presence or absence of similarity when the facial feature amounts before and after are compared.
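The three checks in this paragraph (was a person present before, is the new position within the movable range, are the features similar) can be sketched as below. The use of Euclidean distance for the movable range, cosine similarity for the feature comparison, the threshold values, and the dictionary layout are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def is_same_person(prev, curr, max_move=50.0, min_similarity=0.9):
    """prev and curr are dicts with 'center' (x, y) and 'feature' (list of floats).

    prev is None when no person was present in the preceding image.
    """
    if prev is None:
        return False  # nobody was there before, so this is a new person
    if math.dist(prev["center"], curr["center"]) > max_move:
        return False  # outside the preset movable range
    return cosine_similarity(prev["feature"], curr["feature"]) >= min_similarity


# Assumed example: one earlier detection and two later candidates.
earlier = {"center": (100.0, 100.0), "feature": [0.20, 0.50, 0.30]}
nearby = {"center": (110.0, 105.0), "feature": [0.21, 0.48, 0.31]}
far_away = {"center": (400.0, 100.0), "feature": [0.21, 0.48, 0.31]}

same_nearby = is_same_person(earlier, nearby)
same_far = is_same_person(earlier, far_away)
```

Checking the movable range before the feature comparison keeps the more expensive similarity computation for candidates that are physically plausible.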

The identification means 16 can also compare the face pattern detected by the face region detection means 14 with face patterns stored in advance in the storage means 13 together with person information (name, gender, age, which celebrity (person) the face resembles, and so on) to determine whose face it is, that is, who the person is. The identification means 16 can further obtain a similarity indicating how closely the face resembles which person. This allows the screen generation means 22 and the like, described later, to display a predetermined number of images showing people with high similarity to a given person, in descending order of similarity.

The identification means 16 may also identify a person using appropriate parameters chosen by selecting feature amounts according to the conditions in the image. For example, the identification means 16 determines the state of the person and scene (standing still or walking, whether the sunlight has changed, and so on) and, based on the result, can use various feature amounts (gait, including stride, walking speed, and posture; body shape; gender; clothing color; hair color; and so on) for person identification. The identification means 16 further calculates a degree of separation only for the feature amounts in use, weights them according to that degree of separation, and thereby varies the contribution of each feature amount to person identification. This enables highly accurate person determination using feature amounts suited to the situation.
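The weighting step above can be illustrated with a minimal sketch. The description does not define how the degree of separation is computed, so the sketch assumes a Fisher-style ratio of between-person variance to within-person variance. The function names `separation_degree` and `contribution_weights` and the sample values are hypothetical.

```python
from statistics import mean, pvariance

def separation_degree(samples_by_person):
    """Assumed measure: between-person variance divided by within-person variance."""
    person_means = [mean(values) for values in samples_by_person.values()]
    between = pvariance(person_means)
    within = mean([pvariance(values) for values in samples_by_person.values()])
    return between / (within + 1e-9)  # small constant avoids division by zero

def contribution_weights(features_in_use):
    """Normalize the separation degrees of the selected features so they sum to 1."""
    degrees = {name: separation_degree(s) for name, s in features_in_use.items()}
    total = sum(degrees.values())
    return {name: degree / total for name, degree in degrees.items()}

# Only the features selected for the current situation are passed in (hypothetical data).
features = {
    "clothes_hue": {"A": [10.0, 12.0, 11.0], "B": [200.0, 198.0, 202.0]},
    "walking_speed": {"A": [1.2, 1.4, 1.3], "B": [1.3, 1.2, 1.4]},
}
weights = contribution_weights(features)
```

In this example the clothing hue separates the two persons far better than walking speed, so it receives nearly all of the weight.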

In the identification means 16, the human body detection result from the human body region detection means 15 and the face detection result from the face region detection means 14 may differ. In that case, if the detection information in use is switched partway through processing, a comparison will not match, and the identification means 16 may be unable to determine that the persons are the same. The identification means 16 therefore performs both face detection by the face region detection means 14 and human body detection by the human body region detection means 15 on the acquired video, for example before and after the switch, and associates the respective detection results with each other according to whether they belong to the same person, as described above. As a result, the identification means 16 can readily carry out person identification even when the detection information is switched.

The tracking means 17 tracks the behavior of the same person identified by the identification means 16 over time using the captured video. The tracking means 17 can track multiple persons in the video in parallel. When tracking a person over time, the tracking means 17 can also estimate the next movable range from the person's orientation, posture, past behavior, and so on obtained from the images. In that case, the tracking means 17 can generate information for compositing the movable range onto the image and output it to the screen generation means 22 described later.

Here, the tracking means 17 tracks a person from the positions and features of the feet and head of the human body region in the image and, when the size of a person judged to be the same changes, determines whether part of the person has been hidden by a building or the like. The hidden part may be, for example, the lower body, upper body, head, arms, or torso. In other words, in this embodiment, once height information for the body from the toes to the head has been acquired on screen at least once, the person can subsequently be tracked by estimating the hidden part, provided the head and feet are not both hidden at the same time.

The suspicious person detection means 18 compares the tracking result from the tracking means 17 with preset time-series suspicious behavior patterns, behavior data, and the like stored in the storage means 13, and thereby detects which of the one or more persons captured by the imaging means 26 qualify as suspicious persons.

That is, the suspicious person detection means 18 detects a tracked person as a suspicious person when the person exhibits at least one suspicious behavior, such as hiding behind an obstruction, watching the camera for a predetermined time or longer, looking around restlessly, or staying for a long time. The suspicious person detection means 18 also detects a tracked person as a suspicious person when the face is hidden by a mask or sunglasses.

When it detects a suspicious person, the suspicious person detection means 18 stores all of the details in the storage means 13, including which suspicious behavior was matched. This allows matching results to be extracted easily with a simple operation when the searcher specifies "corresponds to a suspicious person" as a search keyword. The suspicious person detection means 18 may also have the screen generation means 22 generate a screen giving notice of the detection as soon as a suspicious person is detected.
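The "at least one suspicious behavior" rule, together with recording which behavior was matched, might be sketched as follows. The field names, behavior labels, and the two thresholds are assumptions made for illustration; the description only speaks of a "predetermined time" and a "long time".

```python
from dataclasses import dataclass

@dataclass
class TrackObservation:
    """Hypothetical summary of one tracked person's behavior."""
    hid_behind_obstacle: bool = False
    camera_gaze_seconds: float = 0.0
    looking_around: bool = False
    stay_seconds: float = 0.0
    wears_mask: bool = False
    wears_sunglasses: bool = False

GAZE_LIMIT_S = 5.0    # assumed value for the "predetermined time"
STAY_LIMIT_S = 600.0  # assumed value for a "long" stay

def matched_suspicious_behaviors(obs):
    """Return the labels of every suspicious behavior the observation matches."""
    checks = {
        "hiding": obs.hid_behind_obstacle,
        "watching_camera": obs.camera_gaze_seconds >= GAZE_LIMIT_S,
        "looking_around": obs.looking_around,
        "long_stay": obs.stay_seconds >= STAY_LIMIT_S,
        "face_hidden": obs.wears_mask or obs.wears_sunglasses,
    }
    return [label for label, hit in checks.items() if hit]

def is_suspicious(obs):
    """A person is suspicious when at least one behavior matches."""
    return len(matched_suspicious_behaviors(obs)) > 0
```

Keeping the list of matched labels, rather than a single flag, mirrors the requirement that the matched behavior be stored for later keyword search.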

The person information integration means 19 associates a face region and a human body region as belonging to the same person and integrates that person's features. Specifically, the person information integration means 19 obtains the centroid coordinates of the face region in the image and, if there is a human body region containing those coordinates, associates the face region and the human body region as belonging to the same person.

Further, when one human body region contains two or more face regions, the person information integration means 19 associates the same human body region with every one of those face regions. As a result, tracking can continue without interruption even when multiple persons temporarily overlap on screen partway through the video. Persons are managed using identification information such as IDs. The processing that associates a human body region and a face region as belonging to the same person is not limited to this in the present invention; for example, the person's posture, orientation, and so on may be extracted and used for the association.
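The centroid-containment rule above can be sketched as follows, assuming both regions are axis-aligned rectangles given as `(x, y, width, height)`. The function names are hypothetical, and the description also allows other region shapes.

```python
def centroid(box):
    """Center of an axis-aligned rectangle (x, y, width, height)."""
    x, y, w, h = box
    return (x + w / 2.0, y + h / 2.0)

def contains(box, point):
    """True when the point lies inside the rectangle (edges included)."""
    x, y, w, h = box
    px, py = point
    return x <= px <= x + w and y <= py <= y + h

def associate(face_boxes, body_boxes):
    """Map each face index to the first body index containing its centroid, else None."""
    result = {}
    for face_index, face in enumerate(face_boxes):
        c = centroid(face)
        result[face_index] = next(
            (body_index for body_index, body in enumerate(body_boxes) if contains(body, c)),
            None,
        )
    return result
```

Because each face is matched independently, two faces inside one body region both map to that region, which is the overlapping-persons case described above.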

The frame information generation means 20 generates frame information that holds, for each frame of the images in the video, the per-person information integrated by the person information integration means 19, and stores the generated frame information in the storage means 13 as person feature data for search, in a database or the like. This allows person features to be managed frame by frame. Because each frame carries the time at which it was captured, a wide range of searches can be carried out accurately using many kinds of search keywords, such as which kinds of people, and how many, were present at which time. The specific structure of the frame information in this embodiment is described later.

The search means 21 extracts search results, such as images containing a matching person, from various information such as real-time video captured by the imaging means 26 (a camera or the like) and surveillance video already stored in the storage means 13. It does so either by applying the selectable keywords, extraction conditions, and the like specified by the searcher according to this embodiment, or by applying suspicious person detection conditions set in advance.

The search means 21 can also automatically set extraction conditions for a search result, such as features of the face and attire (including not only clothing but also hats, masks, glasses, and so on) and behavior, perform a similarity search for persons resembling that search result, and output a predetermined number of results in descending order of similarity.

As described later, when a monitoring robot is used to search for a lost child or the like in real time, the search means 21 also has a function that searches the video from the imaging means 26 mounted on the robot for images showing a person who satisfies preset conditions (height, clothing color, location information, and so on) and has the screen generation means 22 display the results in real time.

For the search screen of the search means 21 and the output displayed as its results, the screen generation means 22 generates the corresponding screens and the output means 12 outputs them. In this embodiment, a search satisfying predetermined conditions can thus be run against previously stored video or real-time video, and the search results displayed. The search method used by the search means 21 is described later.

The screen generation means 22 generates screens for outputting the video captured by the imaging means 26, the face regions detected by the face region detection means 14, the human body regions detected by the human body region detection means 15, the menu screen for person search in this embodiment, the input screen used for searches by the search means 21, the search result screen, the notification results from the notification means 23, and so on. The screen generation means 22 can also display numerical data such as position information corresponding to the region of a captured person.

The screen generation means 22 can also generate and display screens that make it easy to monitor various data, such as the movable range used when tracking a person in the video and the tracking route obtained by the tracking means 17. In addition, when a person's behavior matches a preset suspicious person behavior pattern, the screen generation means 22 generates a screen giving notice of that fact.

Further, when search results obtained by the search means 21 are being displayed in real time, the screen generation means 22 can temporarily stop updating the displayed results. For example, when search results are continuously refreshed in real time, pausing keeps the needed information on screen so that it is not missed. The screen generation means 22 can also resume a paused display from where it left off.

The screen generation means 22 does not only display the features or behavior of persons in the video. When, for example, one or more security robots equipped with the imaging means 26 are patrolling the facility under guard, it can also display the patrol position information (three-dimensional coordinates, latitude, longitude, and so on) transmitted by each security robot together with its surveillance video. This allows the current position of each security robot to be grasped accurately.

The various information needed for screen generation can be read as appropriate from the information stored in advance in the storage means 13. The screen generation means 22 can display a generated screen on a display serving as the output means 12, and can output audio and the like through a speaker or the like.

The notification means 23 has the screen generation means 22 generate a screen containing an image in which the suspicious person detection means 18 detected a suspicious person, together with information about that image (detection date and time, detection location, video for a predetermined period beforehand, and so on), and has the output means 12 display it. The notification means 23 also has an alert function that, in response to the problem occurrence signal raised by such suspicious behavior detection, notifies an administrator, the security company's guard responsible for the building, monitoring staff, the representative in charge, a security robot, and so on.

When a security robot is notified, for example, the robot is either face to face with the subject or close enough for its imaging means 26 to capture the subject. The notification means 23 can therefore send the security robot a control signal that makes it output a voice message to the subject or alert the surroundings with a warning lamp, an emergency sound, or the like.

The transmission/reception means 24 receives surveillance video from one or more imaging means 26 via a communication network such as a LAN (Local Area Network) or the Internet. The transmission/reception means 24 need not receive the surveillance video directly from the imaging means 26; for example, video acquired in advance by the imaging means 26 may be saved temporarily elsewhere, and the human body detection of this embodiment may be performed on the saved data.

The transmission/reception means 24 can also serve as a communication interface for sending various data to, and receiving various data from, the other terminals that make up the person search device 10.

The control means 25 controls all of the functional components of the person search device 10. Specifically, based on user input entered through the input means 11, the control means 25 controls face region detection, human body region detection, identification processing, tracking processing, suspicious person detection processing, person information integration processing, frame information generation, search processing, screen generation, notification, transmission and reception, and so on.

When, for example, a security robot is patrolling the facility under guard, the control means 25 also controls the robot's patrol route, its actions on finding a subject, and so on. Specifically, when a subject is found, the control means 25 has the notification means 23 output a voice message, an alarm, or the like to the subject and the surroundings, or, after the subject is found, forcibly ends the robot's patrol and sets a return route. In other words, a route can be set so that the security robot ends its patrol as soon as it finds the subject and guides the subject to a predetermined location such as a control room.

Further, in response to a warning signal obtained from a security robot (for example, fire, water leak, or person detection), the control means 25 can generate a control signal that makes the robot perform emergency actions and send that signal to the robot via the output means 12, the transmission/reception means 24, or the like, thereby controlling the robot.

<Additional functions of the face region detection means 14 and the human body region detection means 15>
Next, the additional functions of the face region detection means 14 and the human body region detection means 15 are described with reference to the drawings. FIG. 2 is a diagram illustrating an example of these additional functions. FIG. 2(a) shows an example functional configuration of the face region detection means 14, and FIG. 2(b) shows an example functional configuration of the human body region detection means 15.

As shown in FIG. 2(a), the face region detection means 14 has a tracking means 31, a face authentication means 32, a gender/age group estimation means 33, and a face hiding determination means 34. The tracking means 31 tracks persons using the detected face regions and assigns identification information (a tracking ID) to each of the face regions in an image frame so that they can be distinguished and stored. For each assigned tracking ID, the tracking means 31 also outputs one of four states depending on the tracking status, for example "unused", "frame-in", "frame-out", or "tracking".
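The four tracking ID states can be modeled as a small state machine. The description names the states but not the transitions between them, so the transition rule in `next_state` is purely an assumption for illustration, as are the identifier names.

```python
from enum import Enum

class TrackState(Enum):
    UNUSED = "unused"
    FRAME_IN = "frame-in"
    TRACKING = "tracking"
    FRAME_OUT = "frame-out"

def next_state(state, face_visible):
    """Assumed transition rule for one tracking ID, applied once per frame."""
    if face_visible:
        # A face that appears on a free or just-released ID enters the frame;
        # otherwise it is simply still being tracked.
        if state in (TrackState.UNUSED, TrackState.FRAME_OUT):
            return TrackState.FRAME_IN
        return TrackState.TRACKING
    # The face is no longer visible: report frame-out once, then release the ID.
    if state in (TrackState.FRAME_IN, TrackState.TRACKING):
        return TrackState.FRAME_OUT
    return TrackState.UNUSED

def run(visibility_per_frame, state=TrackState.UNUSED):
    """Return the state after each frame for a sequence of visibility flags."""
    history = []
    for visible in visibility_per_frame:
        state = next_state(state, visible)
        history.append(state)
    return history
```

For example, a face visible for two frames and then absent for two would pass through frame-in, tracking, frame-out, and back to unused under this rule.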

The face authentication means 32 obtains from the detected face region a facial feature amount made up of the arrangement of the eyes, nose, mouth, and so on, performs face authentication using that feature amount and the facial feature amounts of already registered persons (registered persons), celebrities, and other persons stored in advance as a database (DB) or the like in the storage means 13 or elsewhere, and outputs the identification information (ID) of the authenticated person. The face authentication means 32 may also convert the ID of a face image stored in the storage means 13 into the name of the person (registered person or celebrity) and output that name.

The gender/age group estimation means 33 obtains a facial feature amount from the detected face region, estimates gender and age group from that feature amount with reference to the storage means 13, and outputs the result.

The face hiding determination means 34 detects whether the detected face region is being concealed by something such as a mask or sunglasses. Specifically, the face hiding determination means 34 obtains color information for the eye region and the mouth region estimated from the face region. When the color of the eye region is one expected when sunglasses are worn (for example, blackish), it determines that the face is hidden by sunglasses; when the color of the mouth region is one expected when a mask is worn (for example, whitish), it determines that the face is hidden by a mask. It then outputs the determination result.
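A minimal sketch of this color-based determination follows. It assumes each region is supplied as a list of `(R, G, B)` pixels and that "blackish" and "whitish" are decided from the mean color with fixed limits. The limits and function names are hypothetical and are not given in the description.

```python
def mean_rgb(pixels):
    """Average color of a list of (R, G, B) pixels."""
    n = len(pixels)
    return tuple(sum(p[i] for p in pixels) / n for i in range(3))

def is_blackish(rgb, limit=60):
    """Assumed test: every channel is at or below the limit."""
    return max(rgb) <= limit

def is_whitish(rgb, limit=200):
    """Assumed test: every channel is at or above the limit."""
    return min(rgb) >= limit

def judge_face_hiding(eye_pixels, mouth_pixels):
    """Return the sunglasses and mask determinations for one face region."""
    return {
        "sunglasses": is_blackish(mean_rgb(eye_pixels)),
        "mask": is_whitish(mean_rgb(mouth_pixels)),
    }
```

A fixed RGB limit is sensitive to lighting, so a practical system would probably need something more robust; the sketch only shows the decision structure.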

As shown in FIG. 2(b), the human body region detection means 15 has a height estimation means 35, a color information extraction means 36, and a person position estimation means 37.

The height estimation means 35 calculates a person's height for each detected human body region and face region. For example, the height estimation means 35 detects one or more persons in the image and, when a person is present, determines whether the person is new. For a new person, it calculates the person's position in real space from the tips of the feet in the human body region and the vanishing point corresponding to the image, and combines this with the person's apparent size in the image to estimate the actual height. The vanishing point may lie within the image, but depending on the camera angle and the like it may not. In that case, a vanishing point is set in a virtual space and used instead. The height estimation method is not limited to this in the present invention.

The color information extraction means 36 uses the positional relationship between the detected human body region and face region to estimate the person's head region, skin region, upper body region, and lower body region, and extracts color information for each. Specifically, it extracts color information for the hair, upper garment, lower garment, and so on, and determines a representative color (such as the average color) of the extracted color information as the color of the hair, upper garment, lower garment, and so on.

In calculating the color information for each region, the total number of pixels in the region and the frequency of each color are counted, and, for example, the RGB values of the ten most frequent colors are output. Specifically, the head region is extracted from above the face region; the cheek (skin) region from the left and right of the face region; the upper body region from above the center of the human body region and below the face region; and the lower body region from below the center of the human body region, or upward from the ground. Each region may also be set with reference to, for example, the position and size of the face region (for example, the side length when the face region is represented as a rectangle). The settings for the extracted regions are not limited to these in the present invention. The color information extraction means 36 can also crop and output, as is, the image data of at least one of the face region, human body region, head region, skin region, upper body region, and lower body region.
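The per-region counting described above (total pixel count plus the most frequent colors) can be sketched with a frequency counter. The sketch assumes the region is already available as a list of `(R, G, B)` tuples; the function name and the returned keys are hypothetical.

```python
from collections import Counter

def region_color_info(pixels, top_n=10):
    """Count exact RGB values in a region and keep the top_n most frequent."""
    counts = Counter(pixels)
    return {
        "total_pixels": len(pixels),
        # List of ((R, G, B), frequency) pairs, most frequent first.
        "top_colors": counts.most_common(top_n),
    }

# Hypothetical upper body region: mostly red, some blue, one green pixel.
upper_body = [(255, 0, 0)] * 5 + [(0, 0, 255)] * 3 + [(0, 255, 0)]
info = region_color_info(upper_body)
```

Counting exact RGB values spreads real-world clothing over many near-identical colors, so a practical system would likely quantize colors first. That step is an assumption and is not stated in the description.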

The person position estimation means 37 calculates the person's position coordinates in real space for each detected human body region and face region. To do so, the person position estimation means 37 applies a preset conversion formula to the two-dimensional coordinates obtained from the camera image to obtain the position coordinates (X, Y, Z) in three-dimensional real space.

<Configuration example of frame information in the frame information generation means 20>
Next, a configuration example of the frame information in the frame information generation means 20 is described. FIG. 3 is a diagram showing an example of the frame information in this embodiment. The frame information generation means 20 processes each frame and extracts the information shown in FIG. 3 for each detected person. The frame information shown in FIG. 3 includes, as common items, "file name" and "detection date and time". As information extracted from the face region, it includes "person position coordinates (X, Y, Z)", "height information", "various color information", "registrant information", "similar celebrity information", "age group information", "gender information", "face orientation (PAN, TILT)", "mask", "sunglasses", and "acquired face image information". It further includes "person position coordinates (X, Y, Z)", "height information", "various color information", and so on.

"File name" holds the file name of the video file currently being processed. "Detection date and time" holds the date and time at which the person was detected. Because the video file name corresponds to the actual recording start time, the detection date and time is stored as the actual time obtained by adding the time within the video to the recording start time.
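The calculation of the detection date and time can be sketched as follows. The description says only that the file name corresponds to the recording start time, so the file name pattern `YYYYMMDD_HHMMSS` used here is an assumption, as is the function name.

```python
from datetime import datetime, timedelta
from pathlib import Path

def detection_datetime(file_name, offset_seconds, fmt="%Y%m%d_%H%M%S"):
    """Add the in-video offset to the recording start time parsed from the file name."""
    # The stem drops the extension, e.g. "20090331_101500.avi" -> "20090331_101500".
    start = datetime.strptime(Path(file_name).stem, fmt)
    return start + timedelta(seconds=offset_seconds)

# A person detected 75.5 seconds into a recording that started at 10:15:00.
detected_at = detection_datetime("20090331_101500.avi", 75.5)
```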

"Person position coordinates (X, Y, Z)" extracted from the face region holds the XYZ coordinates of the person's feet in real space, taking the XY coordinates of the camera position and the ground as the origin. That is, it holds the person's position coordinates in real space as calculated from the face region. "Height information" holds the height calculated from the face region. "Various color information" holds, for the head and the upper body as calculated from the face region, the total number of pixels, the ten most frequent colors, and their frequencies.

The head color information holds the RGB values and frequencies of the first to tenth most frequent colors of the head as calculated from the face region, and the head total pixel count holds the total number of pixels in the head region calculated from the face region. Likewise, the upper body color information holds the RGB values and frequencies of the first to tenth most frequent colors of the upper body as calculated from the face region, and the upper body total pixel count holds the total number of pixels in the upper body region calculated from the face region.

The “registrant information” field holds the closest person name and authentication score obtained by matching against the registered person DB. Specifically, when the person is authenticated as a registrant (that is, the authentication threshold is exceeded), the names of the top 10 matching registrants are stored. If the person is not authenticated, “unregistered person” is stored. The authentication result score holds the authentication scores of the top 10 candidates as integers from 0 to 1000; if the matching result is “unregistered person”, -1 is stored.

The “similar celebrity information” field stores the closest person name and authentication score obtained by matching against a DB in which celebrities are registered. Specifically, the name of the celebrity with the highest authentication score is stored, provided that the score is above the threshold. If no celebrity is matched, “not similar” is stored. The authentication result score is handled in the same way as described above.

The “age group information” field stores the estimated age group of the detected person and its reliability. The “gender information” field stores the gender (male or female) and its reliability. The “face orientation” field stores the face orientation angles, expressed in terms of camera movements such as pan and tilt, with the camera as the origin. Specifically, the angle of the face as seen from the camera is stored, with the upward and leftward directions as seen from the camera taken as positive.

The “mask” field stores whether the area around the mouth is hidden by a mask or the like; one of “Yes”, “No”, or “Unknown” is stored. The “sunglasses” field stores whether the area around the eyes is hidden by sunglasses or the like; again, one of “Yes”, “No”, or “Unknown” is stored.

The “acquired face image information” field stores the file path to the detected face image and its reliability. Specifically, it stores the file path to the location where the face image detected in this frame is saved, the length of one side of the detected face region, the name of the authentication result with the highest authentication score at or above the threshold, the reliability of that result, and so on.

The “person position coordinates (X, Y, Z)” extracted from the human body region store the XY coordinates of the camera position and the XYZ coordinates of the person's feet in real space, with the ground as the origin; that is, the person's real-space position calculated from the human body region.

The “height information” field stores the height calculated from the human body region and the face region. The “various color information” field stores, for the head, upper body, and lower body, the total pixel count, the 10 most frequent colors, and their frequencies, all calculated from the human body region. Specifically, at least one of the following is stored: the RGB values and frequencies of the 1st to 10th most frequent colors of the head; the total pixel count of the head region; the RGB values and frequencies of the 1st to 10th most frequent colors of the upper body; the total pixel count of the upper-body region; the RGB values and frequencies of the 1st to 10th most frequent colors of the lower body; and the total pixel count of the lower-body region.

In this embodiment, the information described above is output frame by frame as the result of the various processes. If neither a person region nor a face region is detected, nothing is output. Depending on the detection state of the person region and face region, some items cannot be obtained; in that case, a value indicating “unknown” is output for them.

In this embodiment, the information described above is generated for each frame, and a tag is attached to each data item output as frame information. This makes it clear which piece of information each data item represents. When multiple persons are detected in one frame, a separate tracking ID (identification information) is assigned to each person, the same date and time are attached to each tracking ID, and multiple lines of frame information are output.

FIG. 4 is a diagram outlining the steps up to the acquisition of the person features used for search in this embodiment. It illustrates how the search person feature data are generated from the person features obtained by the face region detection and human body region detection described above. The person feature data are generated by the frame information generation means 20 and the like, and are accumulated in the accumulation means 13 as a database or the like.

Specifically, in FIG. 4(a), frame information generation acquires the features (for example, height and the presence or absence of sunglasses) of each person (for example, person A and person B) detected in each frame (F). These features are then grouped per person and counted, as shown in FIG. 4(b). That is, in FIG. 4(b), all data are discretized and their numbers of occurrences are counted. For example, person A is extracted in three frames of the video; the height is estimated as 170 cm twice and as 175 cm once, and sunglasses are detected as worn twice and as not worn once. In this example the height is discretized in 5 cm steps, but the present invention is not limited to this.
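The per-person counting of FIG. 4(b) can be sketched as below. The record layout (`person`, `height`, `sunglasses` keys) and the floor-based 5 cm binning are assumptions for illustration; the source states only that heights are discretized in 5 cm steps.

```python
from collections import Counter, defaultdict


def discretize_height(height_cm, step=5):
    """Map a height to the lower edge of its bin (assumed floor binning)."""
    return int(height_cm // step) * step


def count_features(frame_records):
    """Group frame records per person and count discretized feature values."""
    counts = defaultdict(Counter)   # person -> {(feature, value): occurrences}
    n_frames = Counter()            # person -> number of frames detected in
    for rec in frame_records:
        person = rec["person"]
        n_frames[person] += 1
        counts[person][("height", discretize_height(rec["height"]))] += 1
        counts[person][("sunglasses", rec["sunglasses"])] += 1
    return counts, n_frames


# Person A appears in three frames, as in the example of FIG. 4(b).
records = [
    {"person": "A", "height": 171.0, "sunglasses": True},
    {"person": "A", "height": 173.5, "sunglasses": True},
    {"person": "A", "height": 176.2, "sunglasses": False},
]
counts, n_frames = count_features(records)
print(n_frames["A"], dict(counts["A"]))
```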

Next, as shown in FIG. 4(c), the counts are normalized by the number of frames, and the normalized data are used for the search. Specifically, when the search person feature data are created, person information that is output in various forms (for example, height, gender, and the HSV values of clothing color) is described in a unified manner according to the following equations, so that a person can be searched for by a combination of various features and actions.

For example, when creating data on a feature X of a person A, if feature X takes digital (discrete) values (gender (male/female), sunglasses (present/absent), mask (present/absent), etc.), the total frequency F_AX of the value X for person A is calculated by equation (1).

When feature X takes analog (continuous) values (height, authentication score, HSV values of clothing color, etc.), the total frequency for person A within a specific range d of the continuous value X is calculated by equation (2).

In this way, all information about a person can be converted into histograms. Because these raw histograms cannot be used as they are for comparison between persons, they are normalized by the number of frames in which each person was detected, which yields normalized histograms. The reliability P_AX of feature X for person A is calculated by equation (3).
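The images of equations (1) to (3) are not reproduced in this text. One reading that is consistent with the surrounding description is sketched below. The symbols $T_A$ (the set of frames in which person A is detected), $N_A = |T_A|$, and $x_A(t)$ (the value of the feature observed for person A in frame $t$) are notation assumed here and are not taken from the original equations.

```latex
\begin{align}
F_{AX}    &= \sum_{t \in T_A} \mathbf{1}\bigl[x_A(t) = X\bigr]   \tag{1}\\
F_{AX}(d) &= \sum_{t \in T_A} \mathbf{1}\bigl[x_A(t) \in d\bigr] \tag{2}\\
P_{AX}    &= \frac{F_{AX}}{N_A}                                  \tag{3}
\end{align}
```

Under this reading, the example of FIG. 4(b) gives $N_A = 3$ and a frequency of 2 for "sunglasses worn", so the corresponding reliability would be $2/3$.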

In this way, the search person feature data can be obtained as shown in FIG. 4(c).

<Search method in this embodiment>
Next, the search method used by the search means 21 described above will be explained with reference to the drawings. The search method in this embodiment uses a search scoring technique based on the person feature data described above, so that persons appearing in the video can be searched for by combinations of various person features and actions. For any combination, the search results can be output in descending order of agreement with the search request.

Here, FIG. 5 is a diagram illustrating an example of the search method in this embodiment. As shown in FIG. 5(a), the search request entered by the searcher, “a 175 cm person wearing sunglasses”, is converted into the feature representation described above, with the maximum value of the normalized histogram (= 1) used as the search request feature (FIG. 5(b)). The features of each person that correspond to the search request feature are then looked up and extracted (FIG. 5(c)), the distance between the search request feature and those features is calculated and used as the search score (FIG. 5(d)), and the search results are output in descending order of score.
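The scoring flow of FIG. 5 can be sketched as follows. The source does not specify the distance measure, so this sketch assumes a mean absolute difference between the request value (1) and each person's normalized frequency, and reports `1 - distance` so that a higher score means a closer match. The data layout and the `search` function are hypothetical.

```python
def search(people, query):
    """Rank people by agreement with the requested feature values.

    people: {name: {(feature, value): normalized frequency in [0, 1]}}
    query:  list of (feature, value) keys; each is requested at the
            maximum of the normalized histogram, i.e. 1.0.
    """
    results = []
    for name, histogram in people.items():
        # Assumed metric: mean absolute difference from the request value 1.0.
        distance = sum(abs(1.0 - histogram.get(key, 0.0)) for key in query) / len(query)
        results.append((name, 1.0 - distance))
    return sorted(results, key=lambda item: item[1], reverse=True)


people = {
    "A": {("height", 170): 2 / 3, ("height", 175): 1 / 3,
          ("sunglasses", True): 2 / 3, ("sunglasses", False): 1 / 3},
    "B": {("height", 175): 1.0,
          ("sunglasses", True): 0.5, ("sunglasses", False): 0.5},
}
# "A 175 cm person wearing sunglasses"
query = [("height", 175), ("sunglasses", True)]
for name, score in search(people, query):
    print(name, round(score, 3))
```

Features that are never observed for a person count as frequency 0, so they pull the score down rather than being ignored.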

Regarding the handling of color information suited to search in this embodiment: among person features, the colors of clothing and hair are particularly important, but because the way a color is described varies with subjective impression, the system is designed to absorb this subjective variation. Specifically, to match the search results to the colors that people perceive, processing is performed in a color space set in advance on the basis of human visual characteristics, such as the HSV space. The color space is not limited to HSV; for example, HLS, LUV, or LAB can also be used. In addition, subjective variation is absorbed by reducing the color resolution.
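A minimal sketch of reducing color resolution in HSV space, using Python's standard `colorsys` module. The bin counts (12 hue bins, 3 saturation bins, 3 value bins) are arbitrary assumptions; the source does not state how far the resolution is reduced.

```python
import colorsys


def quantize_color(rgb, h_bins=12, s_bins=3, v_bins=3):
    """Convert an 8-bit RGB color to a coarse (hue, saturation, value) bin index."""
    r, g, b = (channel / 255.0 for channel in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)  # each component is in [0, 1]
    return (
        min(int(h * h_bins), h_bins - 1),
        min(int(s * s_bins), s_bins - 1),
        min(int(v * v_bins), v_bins - 1),
    )


# Two slightly different reds fall into the same coarse bin,
# so small differences in perceived color do not affect matching.
print(quantize_color((255, 0, 0)), quantize_color((250, 10, 10)))
```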

When searching by free words, the entered words are converted into a search request and the search is then performed. Conventionally, when search keywords such as “red jacket blue pants” were entered, there was a problem that the system could not determine which adjective modifies which noun. In this embodiment, therefore, the search request is generated with the modification relationships of the entered words taken into account, which further improves search accuracy.
When entering a search request in this embodiment, the searcher can either enter free words that describe a person or select from a list of searchable person features.
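As a rough illustration of resolving modification relationships, the sketch below pairs each color word with the garment word that follows it. The vocabularies, the condition names, and the simple left-to-right pairing rule are all assumptions for an English example; the source does not describe its parsing method, and Japanese input would need proper morphological analysis.

```python
# Hypothetical vocabularies for illustration only.
COLORS = {"red", "blue", "black", "white", "green"}
GARMENTS = {"hat": "head", "jacket": "upper_body", "shirt": "upper_body",
            "pants": "lower_body", "skirt": "lower_body"}


def parse_free_words(text):
    """Turn e.g. 'red jacket blue pants' into {body part: color} conditions."""
    conditions = {}
    pending_color = None
    for word in text.lower().split():
        if word in COLORS:
            pending_color = word          # remember the modifier
        elif word in GARMENTS and pending_color is not None:
            conditions[GARMENTS[word]] = pending_color  # attach it to the noun
            pending_color = None
    return conditions


print(parse_free_words("red jacket blue pants"))
```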

As a result, from the huge amount of accumulated video, the searcher can easily find the video from the time at which a person close to the intended description was captured, and can obtain the search results. In this embodiment, when the video is checked, the person's appearance features and behavior are shown as text in addition to the video of the person. Information on a desired person can also be output to various media as suspicious person information. This makes it possible to perform accurate person search on captured images efficiently with simple operations.

<Example of screens generated by the screen generation means 22>
Next, examples of the screens generated by the screen generation means 22 will be described with reference to the drawings. FIG. 6 is a diagram showing an example of screen transitions in this embodiment. The example in FIG. 6 consists of an initial screen 41, a date/time designation screen 42, a feature/action condition designation screen 43, a search result display screen 44, a video playback screen 45, and a shared screen 46. When the system according to this embodiment starts, the initial screen 41 is displayed first. For the transitions shown in FIG. 6, transition buttons leading to predetermined screens are set in advance, and selecting a transition button outputs the corresponding screen to the output means 12, such as a display. Each of these screens is described in detail below. The screens generated in the present invention are not limited to these examples, nor are the screen layouts and display contents limited to those shown below.

<Initial screen 41>
FIG. 7 is a diagram showing an example of the initial screen in this embodiment. The initial screen 41 shown in FIG. 7 offers five kinds of search, through a visitor search area 51, a suspicious person search area 52, a keyword search area 53, a condition-designated search (search from person features) area 54-1, and a real-time search area 54-2.

In FIG. 7, to check the persons who have entered the target facility in which the imaging means 26 described above is installed (for example, persons who have visited a store), pressing the “visitor search” button in the visitor search area 51 causes a transition to the date/time designation screen 42. To search for suspicious persons from the initial screen 41, pressing the “search” button in the suspicious person search area 52 likewise causes a transition to the date/time designation screen 42.

To perform a keyword search from the initial screen 41, the searcher enters the keyword in the edit box of the keyword search area 53 and presses the “keyword search” button. The search means 21 then performs a search using the preset date/time and location information together with the text entered in the edit box as the keyword (search request conversion processing). At the same time, the screen transitions to the search result display screen 44, where the search results are displayed.

To search by person features from the initial screen 41, pressing the “search from person features” button in the condition-designated search area 54-1 causes a transition to the feature/action condition designation screen 43.

Furthermore, to perform a real-time search from the initial screen 41, for example a search for a lost child using a monitoring robot, pressing the “real-time search” button in the real-time search area 54-2 causes a transition to the feature/action condition designation screen 43.
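The transitions out of the initial screen 41 described above can be summarized as a lookup table. The key and value strings below are hypothetical identifiers chosen for this sketch; only the button-to-screen pairs come from the text.

```python
# Button pressed on the initial screen 41 -> screen shown next.
INITIAL_SCREEN_TRANSITIONS = {
    "visitor_search": "date_time_designation_42",
    "suspicious_person_search": "date_time_designation_42",
    "keyword_search": "search_result_display_44",
    "search_from_person_features": "feature_action_condition_43",
    "real_time_search": "feature_action_condition_43",
}


def next_screen(button):
    """Return the screen that follows a button press on the initial screen."""
    return INITIAL_SCREEN_TRANSITIONS[button]


print(next_screen("keyword_search"))
```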

<Date/time designation screen 42>
FIG. 8 is a diagram showing the date/time designation screen in this embodiment. When one of the buttons under “Search” is pressed on the initial screen 41 described above, the date/time condition is specified on the date/time designation screen 42. A search location condition can also be specified on the date/time designation screen 42 shown in FIG. 8. As a result, even when the video comes from a mobile imaging means such as a security robot, the search can be performed efficiently on only the video captured at the target location, based on the patrol position information of the security robot that is obtained together with the video. This location designation is also useful for video from fixed imaging means, because it narrows down the video to be searched.

Next, the specified date/time and location conditions are passed to the search processing; when the search processing finishes, the search results are retrieved and the screen transitions to the search result display screen 44.

The date/time designation screen 42 shown in FIG. 8 has a radio button and date/time designation box 55-1, a location designation box 55-2, a search execution button group 56, and a hyperlink display area 57. In the radio button and date/time designation box 55-1, the initial state is “from today's images”, and the date/time boxes are covered with semi-transparent shading and cannot be operated. When “specify time” is selected with the radio button, the date/time boxes become operable. In the location designation box 55-2, a preset location or range is specified. The location may be specified by, for example, a place name, a postal code, the name of a store or commercial facility, or a floor number; it can also be specified as an area, such as within a radius of 100 m of the security robot that shot the video.

The search execution button group 56 has an “OK” button and a “Cancel” button. The label of the “OK” button changes according to the transition source information (which button was pressed) passed from the initial screen 41, and reads “visitor search”, “suspicious person search”, or “keyword search” accordingly. When the button is pressed, the date/time and location information are stored in storage variables, and the search processing function is called with the date/time designation and location designation as arguments. As soon as the search processing finishes (sequential processing), the screen transitions to the search result display screen 44. If “from today's images” is selected in the radio button and date/time designation box 55-1, a flag is passed as an argument when the search processing function is called. In the hyperlink display area 57, selecting the displayed text causes a transition to the first screen (initial screen 41).

<Feature/action condition designation screen 43>
FIG. 9 is a diagram showing the feature/action condition designation screen in this embodiment. When “search from person features” or “real-time search” is selected on the initial screen 41 described above, the search conditions are specified on the feature/action condition designation screen 43. A storage function is called for every condition. When the search button is pressed, the search processing function is called and the screen transitions to the search result display screen 44.

The feature/action condition designation screen 43 has a date designation area 58, a location designation area 59-1, a height designation area 59-2, a color information designation area 60, a gender designation area 61, a mask designation area 62, a sunglasses designation area 63, an age group designation area 64, a character designation area 65, a similar celebrity designation area 66, a search execution button group 67, and a hyperlink display area 68.

In the date designation area 58, the date and time range to be searched is set. The start and end times are checked to confirm that the start is not later than the end, and the searcher is notified if there is an error. In the location designation area 59-1, the location in which to search for the target person is set. Specifically, the location to be searched is specified on the basis of the position information of the imaging means 26 that captured the video. If the imaging means 26 is of a fixed type, the position information can easily be obtained from the identification information of that imaging means 26 included in the video data; if it is of a mobile type, such as a monitoring robot, the patrol position information of the monitoring robot included in the video data is used.

In the height designation area 59-2, the height of the person to be searched for is set. A minimum value (for example, 100) and a maximum value (for example, 200) are set as limits, and the searcher is notified if a value outside this range is specified or if non-numeric characters are entered.
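A minimal sketch of this input check, using the example limits of 100 and 200 from the text. The function name, the integer-only parsing, and the message wording are assumptions.

```python
def validate_height(text, minimum=100, maximum=200):
    """Return (height, None) if valid, or (None, message) to notify the searcher."""
    try:
        value = int(text)
    except ValueError:
        # Non-numeric input such as "abc" (or a decimal such as "17.5").
        return None, "Please enter the height as a whole number."
    if not minimum <= value <= maximum:
        return None, f"Please enter a height between {minimum} and {maximum}."
    return value, None


for entry in ("175", "250", "abc"):
    print(entry, validate_height(entry))
```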

In the color information designation area 60, the color of the head, the color of the upper garment (upper-body clothing), and the color of the lower garment (lower-body clothing) of the person to be searched for are set. In this embodiment, the colors are specified with a color map, as shown in FIG. 9. FIG. 10 is a diagram showing an example of the color map. The color map used here is obtained by converting RGB values (0 to 255 each) into HSV values, with H (0 to 360) plotted on an outer ring and S and V (0 to 1 each) plotted in an inner rectangle. The outer ring is simply a pasted bitmap (BMP) image, and only angle information is obtained from it. The H value is selected with the circle button on the outer ring, and the SV map corresponding to that H value is displayed in the inner rectangle. The H value and the position of the circle button inside the rectangle (the S and V values) are then converted to RGB, which gives the color information (RGB values) to search for. The currently selected color is displayed on the inner and outer circle buttons. With this color map, the searcher can easily set a wide variety of colors. When no color is specified, the HSV value is set to “-1”.
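The final conversion step, from the selected H value and the S and V position to the RGB search color, can be sketched with the standard `colorsys` module. Representing "not specified" by a negative hue and returning `None` is an assumption modeled on the "-1" convention in the text; the function name is hypothetical.

```python
import colorsys


def selected_color_to_rgb(h_degrees, s, v):
    """Convert H (0 to 360) and S, V (0 to 1) to 8-bit RGB.

    A negative hue is treated as "no color specified" and yields None.
    """
    if h_degrees < 0:
        return None
    r, g, b = colorsys.hsv_to_rgb((h_degrees % 360) / 360.0, s, v)
    return tuple(round(channel * 255) for channel in (r, g, b))


print(selected_color_to_rgb(0, 1.0, 1.0))    # pure red
print(selected_color_to_rgb(-1, 0.0, 0.0))   # no color specified
```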

In the gender designation area 61, one of “male”, “female”, or “neither” is set. In the mask designation area 62, one of “wearing”, “not wearing”, or “either” is set for mask wearing. In the sunglasses designation area 63, one of “wearing”, “not wearing”, or “either” is set for sunglasses wearing.

In the age group designation area 64, one of the preset age groups is selected, for example “9 and under”, “10 to 19”, “20 to 39”, “40 to 59”, or “60 and over”. In the character designation area 65, a registered person name is selected from a list based on the registered person information accumulated in advance. In the similar celebrity designation area 66, a celebrity name is selected from a list based on the celebrity information accumulated in advance.

The search execution button group 67 has a “search for persons matching these conditions” button and a “clear” button. When the “search for persons matching these conditions” button is pressed, the storage functions for all conditions corresponding to the “search from person features” or “real-time search” selected on the initial screen 41 are called in sequence. The search function is then called last to execute the search processing, and when the processing finishes, the screen transitions to the search result display screen 44. If any input does not satisfy the conditions, a message such as “Please enter again” is shown. When the “clear” button of the search execution button group 67 is pressed, all condition settings on the feature/action condition designation screen 43 are cleared (initialized). In the hyperlink display area 68, selecting the displayed text causes a transition to the first screen (initial screen 41).

<Search result display screen 44>
FIG. 11 is a diagram showing the search result display screen in this embodiment. After the various search conditions have been specified on the initial screen 41, the date/time designation screen 42, or the feature/action condition designation screen 43 described above, the search results are displayed on the search result display screen 44. A predetermined search result output function is called to display the various items of information. In the example of FIG. 11, at most five results are displayed per page, but the present invention is not limited to this, and the number can be set as needed.

The search result display screen 44 shown in FIG. 11 has various search areas 69, a search result count display area 70, an image display area 71 showing the matching person at the time of appearance, a face image 72 of the matching person, a search result display area 73, a video confirmation button 74, page selection icons 75, and a hyperlink display area 76.

In the various search areas 69, pressing the button that corresponds to the event selected on the initial screen 41 or elsewhere triggers the search processing and the screen transition. The various search areas 69 also include a “pause” button, which temporarily stops the display of search results from being updated, for example during a real-time search. In a real-time search, results are added to the search result display area 73 as they arrive, either in descending order of similarity to the person being sought (such as a lost child) or whenever a result exceeds a certain similarity, so the displayed content keeps changing and the searcher may miss what is shown on the screen. Pausing the update of the search results allows the results to be checked accurately.

The search result count display area 70 shows the number of search results and the range currently displayed (for example, “results 1-5 of 62”). The image display area 71 shows a thumbnail body image of the matching person at the time of appearance. The face image 72 shows a thumbnail face image of the matching person. If there is no image to display in the image display area 71 or the face image 72, no image is shown (for example, “NO IMAGE” is displayed instead).
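The range label shown in the search result count display area can be computed as in the sketch below. The default page size of five follows the example of FIG. 11; the exact label wording and the function name are assumptions, and at least one result is assumed.

```python
def result_range_label(total, page, per_page=5):
    """Build a label such as 'results 1-5 of 62' for a 1-based page number."""
    first = (page - 1) * per_page + 1
    last = min(page * per_page, total)  # the last page may be shorter
    return f"results {first}-{last} of {total}"


print(result_range_label(62, 1))
print(result_range_label(62, 13))
```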

The search result display area 73 displays, for each search result, icons corresponding to information about the matching person (appearance features and behavior) or to detection by a security robot and the like. FIG. 12 is a diagram for specifically explaining the search result display area. In the search result display area in FIG. 12, shooting time / person information 73-1, an icon display area 73-2, shooting location information 73-3, and search score information 73-4 are displayed for each search result.

The shooting time / person information 73-1 shows the shooting time of the video extracted by the search and, if the person is a registrant, the registrant's name; if not, it indicates that the person is unregistered. The icon display area 73-2 displays, as an example, eight types of icons. Icons that make conditions easy to recognize at a glance, such as, from the left, "age", "height", "sunglasses worn or not", "similar entertainer", "mask worn or not", "looking around restlessly", "long stay", "checking the camera", and "detection by a security robot", are arranged at fixed, predetermined positions.

The security robot icon shown in the icon display area 73-2 represents a patrolling security robot that cooperates with the imaging means 26. The icon display area 73-2 may therefore display the position information of the security robot in addition to its icon. The types and number of icons displayed in the icon display area 73-2 are not limited to those above; icons of devices that cooperate with the imaging means 26, such as various sensors and security equipment, may also be displayed.

The icons described above are set in advance for each of the conditions described above and can further be displayed in preset colors such as "red", "black", and "gray". Here, "red" indicates a condition that was included in the search conditions; "black" indicates a condition that was not included in the search conditions but that the video satisfies; and "gray" indicates a condition that applies to neither the search conditions nor the video analysis result. The colors are not particularly limited in the present invention: other colors may be used, and the icons may instead be distinguished and highlighted by blinking, hatching, or the like. The icon display area 73-2 can also display text instead of icons, in which case the distinction can be made by, for example, changing the font type, size, weight, or underlining of the characters, making them blink, or changing their color.
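The three-color rule can be sketched as follows. This is only an illustration: the function names, the condition labels, and the representation of the search conditions and analysis results as sets are assumptions, and a condition that appears in the search conditions is treated as "red" without a further check against the video.

```python
def icon_color(condition, search_conditions, analysis_hits):
    """Pick the display color for one icon.

    search_conditions: set of conditions the searcher specified.
    analysis_hits: set of conditions the video analysis found to apply.
    """
    if condition in search_conditions:
        return "red"    # part of the search conditions
    if condition in analysis_hits:
        return "black"  # not searched for, but the video satisfies it
    return "gray"       # applies to neither


def icon_row(conditions, search_conditions, analysis_hits):
    """Return (condition, color) pairs in the fixed icon order."""
    return [(c, icon_color(c, search_conditions, analysis_hits)) for c in conditions]


if __name__ == "__main__":
    order = ["mask", "sunglasses", "long stay"]
    print(icon_row(order, {"mask"}, {"mask", "long stay"}))
```

Because the icon order is passed in explicitly, the icons stay at fixed positions regardless of which conditions matched.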

The shooting location information 73-3 shows the shooting location of the video extracted by the search, and the search score information 73-4 shows the search scoring result obtained by the search means 21 described above. The search result display area 73 can therefore list results starting from the highest search score. Outputting search results under various conditions in order of their degree of match with the search request makes it easy to confirm the requested person. The present invention is not limited to display in search score order; for example, results can also be displayed in order of shooting date and time.
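The two display orders can be illustrated with a short sketch. The `SearchResult` record and its field names are hypothetical, and ascending order for the shooting date and time is an assumption, since the description does not state a direction.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class SearchResult:
    shot_at: datetime   # shooting time (73-1)
    location: str       # shooting location (73-3)
    score: float        # search score (73-4)


def order_results(results, by="score"):
    """Order results by descending score or by ascending shooting time."""
    if by == "score":
        return sorted(results, key=lambda r: r.score, reverse=True)
    if by == "shot_at":
        return sorted(results, key=lambda r: r.shot_at)
    raise ValueError(f"unknown sort key: {by}")


if __name__ == "__main__":
    results = [
        SearchResult(datetime(2010, 3, 31, 10, 0), "entrance", 0.42),
        SearchResult(datetime(2010, 3, 31, 9, 30), "register", 0.91),
    ]
    for r in order_results(results):
        print(r.location, r.score)
```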

Furthermore, when the cursor is moved onto the icon display area 73-2 of the search result display area 73 and held still for a predetermined time (for example, 0.5 seconds), preset additional information 73-5 corresponding to the icon is displayed as a pop-up.

Clicking the video confirmation button 74 calls the video playback process and transitions to the video playback screen 45. The page selection icon 75 jumps to the selected page when the search results span multiple pages. In the hyperlink display area 76, selecting the displayed text transitions to the first screen (initial screen 41). The screen described above lets the searcher easily see which conditions each extracted search result satisfies.

In other words, on the search result display screen 44, displaying information such as a person's features and behavior, as well as devices that cooperate with the imaging means 26, as icons makes it easy to confirm the requested person.

<Video playback screen 45>
FIG. 13 is a diagram showing the video playback screen in the present embodiment. When the video confirmation button 74 is pressed on the search result display screen 44 described above, the corresponding video is displayed on the video playback screen 45. The video playback screen 45 shown in FIG. 13 has a video display area 77, a text display area 78, a feature display area 79, and a button area 80.

The video display area 77 plays back and displays the video using the input file path, playback start time, and playback end time as arguments of the video playback process. The video display area 77 also has a slider, a play button, a stop button, and the like, so that the video can be viewed from any desired position and easily played or stopped. In addition to AVI files, the video display area 77 can display data in file formats such as MPEG2, MPEG4, H.264, WMV, and MP4, and can also display image data such as JPEG, GIF, TIFF, and BMP.

In the text display area 78, when the video playback time reaches a time recorded in the text, the sentence describing the corresponding behavior in the video is displayed in emphasized characters such as red or bold. Past information remains displayed, and when new information is displayed, the past information changes to black characters. This makes it easy to grasp both the current behavior and the behavior history so far.

The feature display area 79 displays feature information on the persons included in the displayed video. When multiple people have been captured, the feature information of each is displayed.

The button area 80 provides a "share button" and a "similarity search button". Pressing the "share button" transitions to the shared screen 46; in this case, video playback in the video display area is stopped to reduce the load on the device. Pressing the "similarity search button" runs a similarity search over other videos based on the features of the displayed person and the like, checks whether the person has been captured by other cameras, and outputs the result after transitioning to the search result display screen 44 described above.

In other words, the text passed from the search result display screen 44 to the video playback screen 45 describes the behavior of the person in the video together with the times at which it occurs. While the video is being displayed in the video display area 77, when the video time reaches the time associated with a text entry, the text display area 78 displays that entry in red (highlighted). After that, when newer information appears, the past information is displayed in black.
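The highlighting behavior of the text display area 78 can be sketched as below. The representation of the text as `(time_in_seconds, sentence)` pairs and the function name are assumptions for illustration; the description does not define the text format.

```python
def render_behavior_log(entries, playback_seconds):
    """Return (sentence, color) pairs for entries whose time has been reached.

    entries: iterable of (time_in_seconds, sentence) pairs (assumed format).
    The most recently reached entry is shown in red; earlier ones in black.
    """
    reached = sorted(
        (e for e in entries if e[0] <= playback_seconds),
        key=lambda e: e[0],
    )
    rendered = []
    for index, (_, sentence) in enumerate(reached):
        is_latest = index == len(reached) - 1
        rendered.append((sentence, "red" if is_latest else "black"))
    return rendered


if __name__ == "__main__":
    log = [(3.0, "enters store"), (10.0, "looks around"), (25.0, "stays near shelf")]
    for sentence, color in render_behavior_log(log, playback_seconds=12.0):
        print(color, sentence)
```

Calling the function again whenever the playback position changes (including when the slider is moved backward) yields the display state for that position.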

<Shared screen 46>
FIG. 14 is a diagram showing the shared screen in the present embodiment. When the "share button" in the button area 80 is pressed on the video playback screen 45 described above, person information is displayed on the shared screen 46.

The shared screen 46 has an image display area 81, a text display area 82, and a comment input area 83. The image display area 81 displays an image from when the corresponding person appeared and a face image of that person. The text display area 82 displays suspicious-behavior information in emphasized characters such as red, along with time information (date, store entry time, store exit time, etc.) and features (height, sex, age group, etc.). In the comment input area 83, a comment is entered using the input means 11. The comment input area 83 shows "Please enter a comment." by default and accepts text entered by the searcher or others via the input means 11. This allows the searcher to set and manage person information from their own point of view.

Furthermore, the device has a print/output function that outputs the various contents displayed on the shared screen 46 to a printer, or generates a file and sends it by e-mail to a predetermined mobile phone. Because information about a person can be confirmed from both text and video on the video display screen and can also be output to external devices, for example as printed matter, the requested person can easily be confirmed and the information can easily be communicated (shared).

<Hardware configuration example>
The person search device 10 described above can be implemented as a dedicated device having the functions described above. Alternatively, the person search processing of the present invention can be realized by creating an execution program (person search program) that causes a computer to execute each function and installing that program on, for example, a general-purpose personal computer or server.

An example of the hardware configuration of a computer capable of realizing the person search processing in the present embodiment is described with reference to the drawings. FIG. 15 is a diagram showing an example of such a hardware configuration.

The computer main body in FIG. 15 includes an input device 91, an output device 92, a drive device 93, an auxiliary storage device 94, a memory device 95, a CPU (Central Processing Unit) 96 that performs various controls, and a network connection device 97, which are connected to one another by a system bus B.

The input device 91 has a keyboard and a pointing device such as a mouse operated by the user, and inputs various operation signals from the user, such as an instruction to execute a program.

The output device 92 has a monitor that displays the various windows, data, and the like needed to operate the computer main body that performs the processing of the present invention, and can display the progress and results of program execution under the control program of the CPU 96.

The input device 91 and the output device 92 may be an integrated input/output means such as a touch panel, in which case input is performed by touching a predetermined position with the user's finger, a pen-type input device, or the like.

The execution program installed on the computer main body in the present invention is provided on, for example, a portable recording medium 98 such as a USB (Universal Serial Bus) memory or a CD-ROM. The recording medium 98 on which the program is recorded can be set in the drive device 93, and the execution program contained in the recording medium 98 is installed from the recording medium 98 into the auxiliary storage device 94 via the drive device 93.

The auxiliary storage device 94 is a storage means such as a hard disk; it stores the execution program of the present invention, the control programs provided on the computer, and the like, and inputs and outputs them as needed.

The memory device 95 stores the execution program and the like read from the auxiliary storage device 94 by the CPU 96. The memory device 95 consists of ROM (Read Only Memory), RAM (Random Access Memory), and the like.

The CPU 96 controls the processing of the entire computer, including various calculations and data input/output with each hardware component, based on a control program such as an OS (Operating System) and the execution program loaded into and stored in the memory device 95, and thereby realizes each process. Various information needed during program execution can be obtained from the auxiliary storage device 94, and execution results and the like can be stored there.

By connecting to a communication network or the like, the network connection device 97 can acquire the execution program from another terminal connected to the communication network, and can provide the execution results obtained by running the program, or the execution program of the present invention itself, to other terminals. The hardware configuration described above makes it possible to execute the person search processing of the present invention. Moreover, installing the program allows the person search processing of the present invention to be realized easily on a general-purpose personal computer or the like. Next, the specific contents of the person search processing are described.

<Person search processing>
Next, the person search processing procedure executed by the execution program (person search program) of the present invention is described using flowcharts. The following description covers the accumulation processing procedure, which runs up to the point where person feature information for searching is accumulated, and the search processing procedure, which uses the accumulated data.

<Person search processing: accumulation processing procedure>
FIG. 16 is a flowchart showing an example of the accumulation processing procedure in the present embodiment. In FIG. 16, video captured by an imaging means such as a camera is first input (S01). Next, when one or more persons are present in an image included in the video, face area detection processing (S02) and human body area detection processing (S03) are performed. In S03, whether a person is present can be determined, for example, by comparing time-series image frames and checking whether the area in which the color information has changed is at least a predetermined size; it can also be determined, in combination with S02, by whether a face area was detected.
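The presence check described for S03 can be sketched as a simple frame comparison. The sketch below is a simplification under stated assumptions: frames are nested lists of `(r, g, b)` tuples, the per-channel threshold of 30 is arbitrary, and the "size of the changed area" is approximated by the total number of changed pixels rather than by a connected region. All names are hypothetical.

```python
def changed_pixel_count(prev_frame, curr_frame, color_threshold=30):
    """Count pixels whose color differs by more than color_threshold in any channel."""
    changed = 0
    for prev_row, curr_row in zip(prev_frame, curr_frame):
        for prev_px, curr_px in zip(prev_row, curr_row):
            if max(abs(a - b) for a, b in zip(prev_px, curr_px)) > color_threshold:
                changed += 1
    return changed


def person_may_be_present(prev_frame, curr_frame, min_changed_pixels,
                          color_threshold=30, face_detected=False):
    """Return True if a face was detected (S02) or enough pixels changed color."""
    if face_detected:
        return True
    count = changed_pixel_count(prev_frame, curr_frame, color_threshold)
    return count >= min_changed_pixels


if __name__ == "__main__":
    black, white = (0, 0, 0), (255, 255, 255)
    prev = [[black] * 4 for _ in range(4)]
    curr = [row[:] for row in prev]
    for y in (1, 2):
        for x in (1, 2):
            curr[y][x] = white  # a 2x2 block changes between frames
    print(person_may_be_present(prev, curr, min_changed_pixels=4))
```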

After the processing of S02 and S03 is complete, the identification processing is performed as described above (S04), the identified person is tracked (S05), and a suspicious person is detected based on, for example, the person's behavior obtained from the tracking result (S06).
Next, after the processing described above, person information integration processing is performed (S07), frame information is generated from the integrated information (S08), and the generated information is accumulated (S09). The accumulation processing procedure above need not be performed when searching in real time.

<Person search processing: search processing procedure>
FIG. 17 is a flowchart showing an example of the search processing procedure in the present embodiment. In FIG. 17, the system is first started and the search screen is displayed (S11). It is then determined whether a search instruction has been given (S12). If so (YES in S12), a search is executed on the previously accumulated video or on real-time video using the search method described above with the input search conditions (S13), and the output content corresponding to the search result is extracted from the accumulation means or the like (S14). A screen to be shown to the searcher, as described above, is generated from the extracted content (S15), and the generated screen is output (S16). It is then determined whether to end the search (S17). If the search is not to be ended (NO in S17), the process returns to S12 and the subsequent processing is performed. If the search is to be ended (YES in S17), the search screen is closed and the person search processing ends.
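The control flow of S12 to S17 can be sketched as a loop over user events. Everything here is illustrative: the event dictionaries, the function name, and the four callables standing in for the search, extraction, screen generation, and output steps are assumptions. The description does not say what happens when no search instruction is given in S12; this sketch assumes the flow proceeds directly to the end check in S17.

```python
def run_search_session(events, search, extract, build_screen, show):
    """Process user events until one of them asks to end the search.

    events: iterable of dicts with optional keys "conditions" and "quit".
    search, extract, build_screen, show: callables for S13, S14, S15, S16.
    """
    for event in events:
        conditions = event.get("conditions")
        if conditions is not None:           # S12: search instruction given?
            hits = search(conditions)        # S13: run the search
            content = extract(hits)          # S14: pull the output content
            show(build_screen(content))      # S15 and S16: build and output the screen
        if event.get("quit", False):         # S17: end the search?
            break


if __name__ == "__main__":
    shown = []
    run_search_session(
        [{"conditions": {"mask": True}}, {"quit": True}],
        search=lambda c: sorted(c),
        extract=lambda hits: [h.upper() for h in hits],
        build_screen=lambda content: ", ".join(content),
        show=shown.append,
    )
    print(shown)
```

Passing the steps in as callables keeps the loop independent of whether the search runs on accumulated video or on real-time video.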

As described above, according to the present invention, a highly accurate person search can be performed efficiently on captured images with a simple operation.

Person search may be performed on previously accumulated video or on video obtained in real time, and the imaging means that captures the video may be a fixed type installed at a predetermined location such as a wall or ceiling, or a mobile type mounted on a security robot or the like. According to the present invention, an appropriate person search can be realized in any of these cases.

Further, according to the present invention, for example with the search scoring method described above, a person appearing in accumulated video or real-time video can be searched for by combinations of various personal features and behaviors. For any combination, the search results are output in descending order of their degree of match with the search request. As a result, search results are output effectively and the work of checking video can be greatly reduced. Furthermore, according to the present invention, a similarity search for a person can be performed across videos and the result displayed.

Although preferred embodiments of the present invention have been described in detail above, the present invention is not limited to these specific embodiments, and various modifications and changes are possible within the scope of the gist of the present invention described in the claims.

DESCRIPTION OF SYMBOLS
10 Person search device
11 Input means
12 Output means
13 Accumulation means
14 Face area detection means
15 Human body area detection means
16 Identification means
17 Tracking means
18 Suspicious person detection means
19 Person information integration means
20 Frame information generation means
21 Search means
22 Screen generation means
23 Notification means
24 Transmission/reception means
25 Control means
26 Imaging means
31 Tracking means
32 Face authentication means
33 Gender/age estimation means
34 Face hiding determination means
35 Height estimation means
36 Color information extraction means
41 Initial screen
42 Date/time designation screen
43 Feature/behavior condition designation screen
44 Search result display screen
45 Video playback screen
46 Shared screen
51 Visitor search area
52 Suspicious person search area
53 Keyword search area
54-1 Condition designation search (search by person features) area
54-2 Real-time search area
55-1 Radio button / date and time designation box
55-2 Location designation box
56 Search execution button group
57 Hyperlink display area
58 Date designation area
59-1 Location designation area
59-2 Height designation area
60 Color information designation area
61 Gender designation area
62 Mask designation area
63 Sunglasses designation area
64 Age designation area
65 Character designation area
66 Similar entertainer designation area
67 Search execution button group
68 Hyperlink display area
69 Various search areas
70 Search result number display area
71 Image display area when the corresponding person appears
72 Face image of the corresponding person
73 Search result display area
74 Video confirmation button
75 Page selection icon
76 Hyperlink display area
77 Video display area
78 Text display area
79 Feature display area
80 Button area
81 Image display area
82 Text display area
83 Comment input area
91 Input device
92 Output device
93 Drive device
94 Auxiliary storage device
95 Memory device
96 CPU (Central Processing Unit)
97 Network connection device
98 Recording medium

Claims

A person search device that extracts information about a person included in video captured by an imaging means and displays it on a screen, the device comprising:
accumulation means for accumulating person feature information for each time-series image included in the video;
search means for searching for a corresponding person based on conditions preset for the accumulation means;
screen generation means for extracting, from the accumulation means, information corresponding to the search result obtained by the search means, and generating a screen for displaying person features or behavior preset in correspondence with the extracted person, or patrol position information of a security robot provided with the imaging means; and
output means for displaying the screen generated by the screen generation means and outputting the information displayed on the screen to an external device.

The person search device according to claim 1, further comprising:
face area detection means for detecting a face area of a person included in the video;
human body area detection means for detecting a human body area of a person included in the video;
identification means for identifying persons across a plurality of images captured at different times, using the detection result obtained by the face area detection means and the detection result obtained by the human body area detection means;
tracking means for tracking the movement of the person obtained by the identification means;
suspicious person detection means for detecting a suspicious person from the tracking result obtained by the tracking means;
person information integration means for integrating the feature information of the person identified by the identification means; and
frame information generation means for generating the integration result obtained by the person information integration means for each image frame constituting the video,
wherein the accumulation means accumulates person feature information for each time-series image included in the video based on the frame information obtained by the frame information generation means.

The person search device according to claim 2, wherein the frame information generation means discretizes all feature data of a person included in the video, counts the number of occurrences, and generates person feature data in which the discretized data is normalized by the number of frames.

The person search device according to any one of claims 1 to 3, wherein the human body area detection means extracts color information of a predetermined range of the detected human body and converts the extracted color information into a color space set in advance based on human visual characteristics, and
the accumulation means accumulates the color information converted into the color space.

The person search device according to any one of claims 1 to 4, wherein the search means converts a search request input by a searcher as the search condition into a feature amount corresponding to the content of the person's features, uses the maximum value of the normalized histogram as the search request feature amount, searches the person information accumulated by the accumulation means using the search request feature amount, calculates the distance to the feature amount assigned to the corresponding person, and scores the correct answer rate.

A person search method for extracting information about a person included in video captured by an imaging means and displaying it on a screen, the method comprising:
an accumulation procedure of accumulating, in accumulation means, person feature information for each time-series image included in the video;
a search procedure of searching for a corresponding person based on conditions preset for the accumulation means;
a screen generation procedure of extracting, from the accumulation means, information corresponding to the search result obtained by the search procedure, and generating a screen for displaying person features or behavior preset in correspondence with the extracted person, or patrol position information of a security robot provided with the imaging means; and
an output procedure of displaying the screen generated by the screen generation procedure and outputting the information displayed on the screen to an external device.

The person search method according to claim 6, further comprising:
a face region detection procedure of detecting a face region of a person included in the video;
a human body region detection procedure of detecting a human body region of a person included in the video;
an identification procedure of identifying a person across a plurality of images captured at different times, using the detection result obtained by the face region detection procedure and the detection result obtained by the human body region detection procedure;
a tracking procedure of tracking the movement of the person identified by the identification procedure;
a suspicious person detection procedure of detecting a suspicious person from the tracking result obtained by the tracking procedure;
a person information integration procedure of integrating the feature information of the person identified by the identification procedure; and
a frame information generation procedure of generating the integration result obtained by the person information integration procedure for each image frame constituting the video,
wherein the accumulation procedure accumulates person feature information for each time-series image included in the video based on the frame information obtained by the frame information generation procedure.

The person search method according to claim 7, wherein the frame information generation procedure discretizes all feature data of a person included in the video, counts the number of occurrences, and generates person feature data by normalizing the discretized data by the respective number of frames.
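As an illustration only, the discretize, count, and normalize steps recited above can be sketched as follows. The function name `normalized_histogram`, the `bin_width` parameter, and the sample height values are assumptions introduced for this sketch; the claim does not specify how features are binned.

```python
from collections import Counter


def normalized_histogram(per_frame_values, bin_width):
    """Discretize one feature observed over several frames and
    normalize the occurrence counts by the number of frames."""
    if not per_frame_values:
        return {}
    # Discretize: map each observed value to the index of its bin.
    bins = [int(v // bin_width) for v in per_frame_values]
    # Count how often each bin occurs across the frames.
    counts = Counter(bins)
    # Normalize by the number of frames in which the person appears.
    n_frames = len(per_frame_values)
    return {b: c / n_frames for b, c in counts.items()}


# Hypothetical example: estimated height (cm) of one tracked person in 4 frames.
heights_cm = [171.0, 173.5, 168.0, 172.2]
hist = normalized_histogram(heights_cm, bin_width=10)
print(hist)
```

Normalizing by the frame count makes histograms comparable between a person seen in a few frames and one seen in many.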

The person search method according to any one of claims 6 to 8, wherein the human body region detection procedure extracts color information from a predetermined range of the detected human body and converts the extracted color information into a color space set in advance based on human visual characteristics, and
the accumulation procedure accumulates the color information converted into the color space.
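The claim above does not name the color space. The following minimal sketch assumes HSV purely as a stand-in for a perceptually motivated space and uses Python's standard `colorsys` module. The function name `region_color_hsv` and the choice to average the region's pixels before conversion are likewise assumptions made for illustration.

```python
import colorsys


def region_color_hsv(pixels):
    """Average the 8-bit RGB pixels of a body region and convert the
    mean color to HSV (each component in the range 0.0 to 1.0)."""
    n = len(pixels)
    if n == 0:
        raise ValueError("region contains no pixels")
    # Mean of each channel, scaled from 0..255 to 0..1 for colorsys.
    r = sum(p[0] for p in pixels) / n / 255.0
    g = sum(p[1] for p in pixels) / n / 255.0
    b = sum(p[2] for p in pixels) / n / 255.0
    return colorsys.rgb_to_hsv(r, g, b)


# Hypothetical upper-body region consisting of pure red pixels.
upper_body = [(255, 0, 0), (255, 0, 0)]
print(region_color_hsv(upper_body))  # (0.0, 1.0, 1.0)
```

Storing the converted value, rather than raw RGB, lets a later query such as "red clothing" be compared in a space where distance is closer to perceived color difference.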

The person search method according to any one of claims 6 to 9, wherein the search procedure converts, as the search condition, a search request input by a searcher into a feature amount corresponding to the content of the person's features, sets the maximum value of a normalized histogram as a search request feature amount, searches the person information accumulated by the accumulation means using the search request feature amount, calculates the distance to the feature amount assigned to each corresponding person, and scores the correct answer rate.
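One possible reading of the scoring step above is sketched below. It assumes that the search request is mapped to a single histogram bin and given the maximum normalized value (1.0), that the distance is the absolute difference from the stored person's value in that bin, and that the score is one minus that distance. The names `score_person` and `search` and the sample database are hypothetical; the claim does not define the distance or the scoring formula.

```python
def score_person(query_bin, person_hist, max_value=1.0):
    """Score one stored person against a query expressed as a histogram bin."""
    # The query feature amount is the maximum of a normalized histogram (1.0).
    distance = abs(max_value - person_hist.get(query_bin, 0.0))
    # A smaller distance gives a higher score.
    return 1.0 - distance


def search(query_bin, database):
    """Return (person_id, score) pairs sorted from best to worst match."""
    scored = [(pid, score_person(query_bin, hist)) for pid, hist in database.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical stored data: normalized clothing-color histograms per person.
database = {
    "A": {"red": 0.75, "blue": 0.25},
    "B": {"blue": 1.0},
}
print(search("red", database))  # [('A', 0.75), ('B', 0.0)]
```

Under these assumptions, a person observed wearing the requested color in most frames ranks above one who never matched.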

A person search program for a person search device that extracts information about a person included in a video captured by an imaging means and displays the information on a screen, the program causing a computer to function as:
an accumulation means for accumulating person feature information for each time-series image included in the video;
a search means for searching for a corresponding person based on a condition preset for the accumulation means;
a screen generation means for extracting, from the accumulation means, information corresponding to a search result obtained by the search means, and generating a screen for displaying person features or behaviors preset for the extracted person, or patrol position information of a security robot provided with the imaging means; and
an output means for displaying the screen generated by the screen generation means and outputting the information displayed on the screen to an external device.

The person search program according to claim 11, further comprising:
a face region detection means for detecting a face region of a person included in the video;
a human body region detection means for detecting a human body region of a person included in the video;
an identification means for identifying a person across a plurality of images captured at different times, using the detection result obtained by the face region detection means and the detection result obtained by the human body region detection means;
a tracking means for tracking the movement of the person identified by the identification means;
a suspicious person detection means for detecting a suspicious person from the tracking result obtained by the tracking means;
a person information integration means for integrating the feature information of the person identified by the identification means; and
a frame information generation means for generating the integration result obtained by the person information integration means for each image frame constituting the video,
wherein the accumulation means accumulates person feature information for each time-series image included in the video based on the frame information obtained by the frame information generation means.

The person search program according to claim 12, wherein the frame information generation means discretizes all feature data of a person included in the video, counts the number of occurrences, and generates person feature data by normalizing the discretized data by the respective number of frames.

The person search program according to any one of claims 11 to 13, wherein the human body region detection means extracts color information from a predetermined range of the detected human body and converts the extracted color information into a color space set in advance based on human visual characteristics, and
the accumulation means accumulates the color information converted into the color space.

The person search program according to any one of claims 11 to 14, wherein the search means converts, as the search condition, a search request input by a searcher into a feature amount corresponding to the content of the person's features, sets the maximum value of a normalized histogram as a search request feature amount, searches the person information accumulated by the accumulation means using the search request feature amount, calculates the distance to the feature amount assigned to each corresponding person, and scores the correct answer rate.