JP6393424B2

JP6393424B2 - Image processing system, image processing method, and storage medium

Info

Publication number: JP6393424B2
Application number: JP2017530538A
Authority: JP
Inventors: 森田　健一; 健一森田; 俊明垂井; 裕樹渡邉; 健一米司; 廣池　敦; 敦廣池; 英克高田
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2015-07-29
Filing date: 2015-07-29
Publication date: 2018-09-19
Anticipated expiration: 2035-07-29
Also published as: WO2017017808A1; JPWO2017017808A1

Description

The present invention relates to an image processing system.

Japanese Unexamined Patent Application Publication No. 2015-114685 (Patent Document 1) is background art in this technical field. Patent Document 1 discloses "a video search device that detects the movement paths of one or more moving bodies from each of a first video consisting of a plurality of frames shot at a first location and a second video consisting of a plurality of frames shot at a second location, and stores them in a storage device; extracts the per-frame image feature amounts of a moving body selected from the one or more moving bodies detected in the first video, and stores them in the storage device; selects, from the extracted image feature amounts, a query image feature amount to be used as a search query, based on the movement path of the selected moving body detected in the first video and the movement paths of the one or more moving bodies detected in the second video; searches the image feature amounts of the one or more moving bodies extracted from the second video using the query image feature amount; and outputs the search result" (Abstract).

Patent Document 1: Japanese Unexamined Patent Application Publication No. 2015-114685

To search for a moving object in video data, the feature amounts of the moving objects in the video data must be extracted in advance and registered in a database. For example, to search for a person (one kind of moving object) using information on the person's movement path, face, and clothing, the system executes feature registration processing for the movement path, face, and clothing of each person appearing in the video data. The feature registration processing detects the person's movement path and the face and clothing regions, extracts feature amounts (feature vectors), and writes the feature amounts to the database.

If the feature registration processing includes a high-load step, the time it takes may reach several times the real-time duration of the video. If the frame rate of the video data to be registered is lowered to shorten that time, fewer feature amounts are registered and the accuracy of moving object search decreases.

To solve the above problem, for example, the configurations described in the claims are adopted. The present application includes a plurality of means for solving the problem. One example is an image processing system including a processor and a storage device that stores a program executed by the processor, wherein the processor: creates a plurality of frames from video data; detects moving objects in the plurality of frames; extracts a trajectory feature amount of each detected moving object from the plurality of frames and records it in a database; determines, according to a predetermined condition, the content of feature registration processing that includes extracting feature amounts from an image of a moving object in each of the plurality of frames and recording them in the database; and executes the determined content of the feature registration processing in each of the plurality of frames.

According to one aspect of the present invention, the time required to register the feature amounts of moving objects in video data can be reduced while keeping the loss of moving object search accuracy small.

Problems, configurations, and effects other than those described above will become apparent from the following description of embodiments.

An overall configuration diagram of the image search system of Embodiment 1.
A hardware configuration diagram of the image search system of Embodiment 1.
A first explanatory diagram of the configuration and example data of the video database of Embodiment 1.
A second explanatory diagram of the configuration and example data of the video database of Embodiment 1.
A flowchart illustrating the processing by which the server computer of Embodiment 1 registers input video.
An explanatory diagram of the setting screen for the feature registration processing conditions input to the server computer of Embodiment 1.
A flowchart illustrating search processing by the server computer of Embodiment 1.
An explanatory diagram of the search screen output by the server computer of Embodiment 1.
A flowchart illustrating the processing by which the server computer of Embodiment 2 additionally registers feature amounts for frames already registered in the database.
An explanatory diagram of the setting screen for the additional feature amount registration conditions input to the server computer of Embodiment 2.
A flowchart illustrating the processing by which the server computer of Embodiment 2 registers input video.
A flowchart of the feature registration processing that the server computer of Embodiment 2 executes on frames registered in the video database.
An explanatory diagram of the setting screen for setting the conditions under which the server computer of Embodiment 2 additionally registers feature amounts for frame images that are already registered in the video database but whose feature amounts have not yet been registered.

Mode for Carrying Out the Invention

Embodiments of the present invention are described below with reference to the accompanying drawings. Note that the embodiments are merely examples for realizing the present invention and do not limit its technical scope. Components common to the figures are given the same reference numerals.

FIG. 1 is an overall configuration diagram of the image search system 100 of Embodiment 1. The image search system 100 includes a video storage device 101, a video shooting device 102, input devices 103 and 105, display devices 104 and 106, and a server computer 107.

The video storage device 101 includes a storage medium that stores video data, and outputs the video data on request. The video storage device 101 can be configured using a hard disk drive built into a computer, or a network-connected storage system such as NAS (Network Attached Storage) or SAN (Storage Area Network).

The video shooting device 102 shoots video, creates video data, and outputs it. The video data output from the video storage device 101 and the video shooting device 102 is input to the video input unit 108 (described later) of the server computer 107. The image search system 100 may include only one of the video storage device 101 and the video shooting device 102.

When the image search system 100 includes both the video storage device 101 and the video shooting device 102, the source of the video data input to the video input unit 108 may be switched between the video storage device 101 and the video shooting device 102 as necessary. The video data output from the video shooting device 102 may first be stored in the video storage device 101 and then input to the video input unit 108 from there. In that case, the video storage device 101 may be, for example, a cache memory that temporarily holds the video data continuously input from the video shooting device 102.

The video data stored in the video storage device 101 and the video data created by the video shooting device 102 may be in any format, as long as it can be used to track the moving objects that were shot. For example, the video shooting device 102 may be a video camera, and the moving image data it shoots may be output as video data or stored in the video storage device 101.

Alternatively, the video shooting device 102 may be a still camera, and a series of still images shot at a predetermined interval (at least short enough that the shot objects can be tracked) may be output as video data or stored in the video storage device 101. The video shooting device 102 may also consist of a plurality of video cameras, a plurality of still cameras, or both.

The input devices 103 and 105 are input interfaces, such as a mouse, a keyboard, or a touch device, for conveying user operations to the server computer 107. The display devices 104 and 106 are output interfaces such as liquid crystal displays, and are used to display the feature registration processing conditions of the server computer 107, to display search results, and for interactive operation with the user.

For example, by using a so-called touch panel, the input device 103 and the display device 104 may be integrated with each other, as may the input device 105 and the display device 106, or all four may be integrated. The pair of the input device 103 and the display device 104 and the pair of the input device 105 and the display device 106 may each be included in a client terminal connected to the server computer 107 via a network.

The server computer 107 operates as an image registration device: it extracts information contained in the images of the input video data according to prescribed processing conditions (for example, conditions preset by a system administrator or conditions specified by the user), and holds the extracted information together with the frame images. The server computer 107 also functions as an image search device that searches for images of a search target object based on search conditions specified by the user.

Specifically, the server computer 107 tracks the moving objects contained in the frames of the given video data and accumulates information about those moving objects. When the user specifies search conditions for a moving object to be found in the accumulated frames, the server computer 107 searches for images using the accumulated information. The functions of the server computer 107 may be distributed across a plurality of computers.

In the examples described below, each video handled by the server computer 107 is assumed to be fixed-point observation video shot at a single location. The search target is any moving object, such as a person or a vehicle. Embodiment 1 shows an example of the image search system 100 in which the search target is a person.

The server computer 107 includes a video input unit 108, a moving object tracking unit 110, a trajectory feature extraction unit 111, a trajectory feature recording unit 112, a frame recording unit 113, a video database 161, a feature registration processing condition input unit 109, a feature registration processing content determination unit 121, a feature registration processing time holding unit 122, a feature registration processing time calculation unit 123, a face detection unit 131, a face feature extraction unit 132, and a face feature recording unit 133.

The server computer 107 further includes a head detection unit 141, a head feature extraction unit 142, a head feature recording unit 143, a clothing detection unit 151, a clothing feature extraction unit 152, a clothing feature recording unit 153, a feature vector input unit 171, a similar feature vector search unit 172, and a search result integration unit 173. If the feature registration processing content determination unit 121 does not use the information held by the feature registration processing time holding unit 122, the server computer 107 need not include the feature registration processing time holding unit 122 and the feature registration processing time calculation unit 123.

The video input unit 108 reads video data from the video storage device 101, or receives video data shot by the video shooting device 102, and converts it into the data format used inside the server computer 107. Specifically, the video input unit 108 performs video decoding that breaks the video (moving image data format) down into frames (still image data format). The resulting frames are sent to the moving object tracking unit 110, the frame recording unit 113, and the feature registration processing content determination unit 121. They may also be sent to the face detection unit 131, the head detection unit 141, and the clothing detection unit 151.

The moving object tracking unit 110 tracks moving objects by detecting the moving objects in a frame and associating them with the moving objects detected in the previous frame. Detection and tracking can be implemented by any method, for example the method described in S. Baker and I. Matthews, "Lucas-Kanade 20 years on: A unifying framework", International Journal of Computer Vision, vol. 53, no. 3, 2004. The trajectory of a moving object obtained by tracking (that is, its movement path) is represented as a vector having one or more start points and end points, and the trajectory information consists of an ID uniquely assigned to each trajectory (tracking ID) and the coordinates of the moving object in each frame.
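The frame-to-frame association and the resulting trajectory information (tracking ID plus per-frame coordinates) can be illustrated with a minimal sketch. It assumes that detection has already produced one (x, y) point per moving object, and it uses simple nearest-neighbour matching as a stand-in, not the Lucas-Kanade method cited above. The names `Trajectory`, `NearestNeighbourTracker`, and `max_distance` are hypothetical.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Trajectory:
    tracking_id: int
    # (frame_index, x, y) for every frame in which the object was seen
    points: list = field(default_factory=list)


class NearestNeighbourTracker:
    """Hypothetical tracker: matches each detection to the closest object
    of the previous frame, or starts a new trajectory if none is close."""

    def __init__(self, max_distance=50.0):
        self.max_distance = max_distance  # assumed matching threshold (pixels)
        self.trajectories = {}            # tracking_id -> Trajectory
        self._active = {}                 # tracking_id -> (x, y) in previous frame
        self._next_id = 1

    def update(self, frame_index, detections):
        """Associate detections [(x, y), ...] with the previous frame.
        Returns the tracking ID assigned to each detection, in order."""
        unmatched = dict(self._active)
        current = {}
        assigned = []
        for x, y in detections:
            best_id, best_dist = None, self.max_distance
            for tid, (px, py) in unmatched.items():
                dist = math.hypot(x - px, y - py)
                if dist <= best_dist:
                    best_id, best_dist = tid, dist
            if best_id is None:
                # No previous object is close enough: start a new trajectory.
                best_id = self._next_id
                self._next_id += 1
                self.trajectories[best_id] = Trajectory(best_id)
            else:
                del unmatched[best_id]
            self.trajectories[best_id].points.append((frame_index, x, y))
            current[best_id] = (x, y)
            assigned.append(best_id)
        self._active = current
        return assigned
```

Calling `update` once per decoded frame accumulates, for each tracking ID, the coordinate list from which trajectory features can later be computed.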

The trajectory feature extraction unit 111 extracts a feature amount of the shape of the trajectory (hereinafter also referred to as a trajectory feature amount) from the coordinates of the moving object to which the same tracking ID is assigned. The trajectory feature recording unit 112 records the extracted trajectory feature amount in the video database 161.
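The text does not specify which shape features are extracted, so the following is only an illustrative sketch of turning the coordinates of one tracking ID into a fixed-length vector. The chosen components (normalized net displacement, path length, straightness) and the function name `trajectory_feature` are assumptions.

```python
import math


def trajectory_feature(points):
    """Hypothetical trajectory shape feature for one tracking ID.

    points: list of (x, y) coordinates in frame order.
    Returns [dx/L, dy/L, L, straightness], where L is the path length and
    straightness is the net displacement divided by L.
    """
    if len(points) < 2:
        return [0.0, 0.0, 0.0, 0.0]
    path_length = sum(
        math.hypot(x2 - x1, y2 - y1)
        for (x1, y1), (x2, y2) in zip(points, points[1:])
    )
    if path_length == 0.0:
        return [0.0, 0.0, 0.0, 0.0]
    dx = points[-1][0] - points[0][0]
    dy = points[-1][1] - points[0][1]
    net = math.hypot(dx, dy)
    return [dx / path_length, dy / path_length, path_length, net / path_length]
```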

The face detection unit 131, the head detection unit 141, and the clothing detection unit 151 detect a person's face region, head region, and clothing region, respectively, from a frame. The face region and the head region defined as detection targets need not overlap, or may partly overlap. The head region is used, for example, to identify the color of the hair or a hat. For example, the face region may be the area of the front of the part above the neck from the eyebrows down, and the head region may be the rest of the part above the neck.

The clothing region is used to identify the clothing a person is wearing. The clothing region is, for example, the area of the person's upper body below the neck, or the area of the person's lower body. One or more clothing regions are detected per person.

The face feature extraction unit 132, the head feature extraction unit 142, and the clothing feature extraction unit 152 extract the image feature amounts of the face region, the head region, and the clothing region, respectively. These are hereinafter also referred to as the face feature amount, the head feature amount, and the clothing feature amount. Any kind of feature amount that can be extracted from an image, such as edge information or color information, may be used. The image feature amounts can be extracted by any method, including known methods. The image feature amounts extracted from one region constitute the feature vector of that region. Each feature amount is an element of a feature vector, and a feature vector consists of one or more feature amounts.
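As one concrete instance of "color information", a region's feature vector could be a normalized per-channel color histogram. This is a sketch under that assumption only; the text leaves the feature type open, and `color_histogram` and `bins_per_channel` are hypothetical names.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Hypothetical color feature vector for one detected region.

    pixels: iterable of (r, g, b) tuples with values in 0..255.
    Returns a list of 3 * bins_per_channel values; each channel's
    bins sum to 1.0 when the region is non-empty.
    """
    pixels = list(pixels)
    step = 256 // bins_per_channel
    hist = [0.0] * (3 * bins_per_channel)
    for r, g, b in pixels:
        for channel, value in enumerate((r, g, b)):
            # Clamp so that value 255 falls into the last bin.
            bin_index = min(value // step, bins_per_channel - 1)
            hist[channel * bins_per_channel + bin_index] += 1.0
    if not pixels:
        return hist
    return [count / len(pixels) for count in hist]
```

Each element of the returned list plays the role of one feature amount, and the list as a whole is the region's feature vector.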

The face feature recording unit 133, the head feature recording unit 143, and the clothing feature recording unit 153 record the image feature amounts (feature vectors) extracted from the face region, the head region, and the clothing region, respectively, in the video database 161.

The face detection unit 131, the face feature extraction unit 132, the face feature recording unit 133, the head detection unit 141, the head feature extraction unit 142, the head feature recording unit 143, the clothing detection unit 151, the clothing feature extraction unit 152, and the clothing feature recording unit 153 each measure their processing time for each frame and report it to the feature registration processing time calculation unit 123. Only some of these units may measure and report their processing time.

The feature registration processing time calculation unit 123 performs a predetermined calculation on the processing times received from the face detection unit 131, the face feature extraction unit 132, the face feature recording unit 133, the head detection unit 141, the head feature extraction unit 142, the head feature recording unit 143, the clothing detection unit 151, the clothing feature extraction unit 152, and the clothing feature recording unit 153. The result of the calculation is sent to the feature registration processing time holding unit 122, which holds the information it receives from the feature registration processing time calculation unit 123.

In addition to the processing times newly received from the detection, extraction, and recording units, the feature registration processing time calculation unit 123 may use in its calculation its own results for past frames, which are held in the feature registration processing time holding unit 122.
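The "predetermined calculation" is not specified here. One plausible choice that combines a newly reported time with the held result for past frames is an exponential moving average, sketched below. The class names, the `alpha` smoothing parameter, and the string stage keys are all assumptions.

```python
class ProcessingTimeHolder:
    """Plays the role of the holding unit 122: stores the latest result per stage."""

    def __init__(self):
        self.values = {}  # stage name -> smoothed processing time in seconds


class ProcessingTimeCalculator:
    """Plays the role of the calculation unit 123, using an assumed
    exponential moving average over the times reported for each frame."""

    def __init__(self, holder, alpha=0.2):
        self.holder = holder
        self.alpha = alpha  # weight of the newest measurement (assumption)

    def report(self, stage, seconds):
        previous = self.holder.values.get(stage)
        if previous is None:
            smoothed = seconds  # first measurement for this stage
        else:
            smoothed = self.alpha * seconds + (1.0 - self.alpha) * previous
        self.holder.values[stage] = smoothed
        return smoothed
```

A detection, extraction, or recording unit would call `report("face_detection", elapsed)` after each frame, with `elapsed` measured by, for example, `time.perf_counter()`.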

The feature registration processing condition input unit 109 accepts the feature registration processing conditions that the user enters by operating the input device 103. The feature registration processing conditions include, for example, the detection methods used by the face detection unit 131, the head detection unit 141, and the clothing detection unit 151; the image feature extraction methods used by the face feature extraction unit 132, the head feature extraction unit 142, and the clothing feature extraction unit 152; the target frame rate of the feature registration processing; and the processing content determination method executed by the feature registration processing content determination unit 121.

The face detection unit 131, the face feature extraction unit 132, the face feature recording unit 133, the head detection unit 141, the head feature extraction unit 142, the head feature recording unit 143, the clothing detection unit 151, the clothing feature extraction unit 152, and the clothing feature recording unit 153 change their processing content frame by frame according to the determination result of the feature registration processing content determination unit 121. The processing content determination method executed by the feature registration processing content determination unit 121 is described later (see FIG. 4).
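The actual determination method is the one described with FIG. 4 and is not reproduced here. Purely to illustrate how a target frame rate and held processing times could drive a per-frame decision, the following sketch greedily selects the regions whose estimated cost fits in the per-frame time budget. The function name, the priority order, and the greedy rule are assumptions, not the patent's method.

```python
def decide_registration_content(estimated_seconds, target_fps,
                                priority=("face", "head", "clothing")):
    """Hypothetical per-frame decision of which region features to register.

    estimated_seconds: dict mapping a region name to the estimated time
        (detection + extraction + recording) for one frame.
    target_fps: target frame rate of the feature registration processing.
    Returns the regions to process for this frame, in priority order.
    """
    budget = 1.0 / target_fps  # time available for one frame
    selected = []
    for region in priority:
        cost = estimated_seconds.get(region, 0.0)
        if cost <= budget:
            selected.append(region)
            budget -= cost
    return selected
```

Regions left out for a frame are simply skipped for that frame, while trajectory features are still recorded from tracking, which is how registration time can be cut without lowering the frame rate of the tracked video.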

The frame recording unit 113 records, in the video database 161, the frames extracted from the input video, information on the source video, and the results determined by the feature registration processing content determination unit 121.

The video database 161 is a database for storing the frames extracted from video, the trajectory information of moving objects, the face feature amounts, the head feature amounts, the clothing feature amounts, the feature extraction status of each frame, and so on. The video database 161 is accessed for writes from the trajectory feature recording unit 112, the frame recording unit 113, the face feature recording unit 133, the head feature recording unit 143, and the clothing feature recording unit 153, and for searches from the similar feature vector search unit 172. The data stored in the video database 161 is described in detail later with reference to FIGS. 3A and 3B.

The feature vector input unit 171 accepts information, entered by the user operating the input device 105, that specifies one or more types of feature vector to be used as search keys (for example, a trajectory feature vector, a face feature vector, a head feature vector, and a clothing feature vector). A feature vector consists of one or more feature amounts.

The similar feature vector search unit 172 searches the video database 161 for feature vectors similar to each feature vector specified as a search key. The search result integration unit 173 combines and outputs the search results obtained with the search keys of one or more types of feature vector. These processes are described in detail later (see FIG. 6).
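The search and integration steps can be sketched as follows, assuming Euclidean distance as the similarity measure and assuming that results from different feature types are merged by tracking ID and ranked by summed distance. Neither choice is stated in the text (the details are given with FIG. 6), and the function names are hypothetical.

```python
import math


def search_similar(database, query, top_k=3):
    """Return the top_k (distance, tracking_id) pairs closest to query.

    database: list of (tracking_id, feature_vector) for one feature type.
    """
    scored = sorted((math.dist(query, vector), tid) for tid, vector in database)
    return scored[:top_k]


def integrate_results(result_lists):
    """Hypothetical integration: keep tracking IDs present in every result
    list and rank them by the sum of their distances (smaller is better)."""
    totals, counts = {}, {}
    for results in result_lists:
        for distance, tid in results:
            totals[tid] = totals.get(tid, 0.0) + distance
            counts[tid] = counts.get(tid, 0) + 1
    common = [tid for tid, n in counts.items() if n == len(result_lists)]
    return sorted(common, key=lambda tid: totals[tid])
```

For example, a face query and a clothing query can each be run with `search_similar` against their own feature store, and the two result lists passed to `integrate_results` to obtain one ranked list of tracking IDs.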

Embodiment 1 shows an example of the image search system 100 in which the moving object to be tracked is a person. The face region, the head region, and the clothing region are examples of regions considered to contain image information that can be used to identify a person (hereinafter also referred to as salient regions).

Accordingly, in addition to or instead of the salient regions above, the server computer 107 may include a salient region detection unit that detects other salient regions, an image feature extraction unit that extracts image feature amounts from the detected regions, and an image feature recording unit that records those image feature amounts. One or more types of salient region are detected, depending on the design.

Examples of other salient regions include a mouth region that is part of the face region, a hair region that is part of the head region, an upper-body region or a lower-body region that is part of the clothing region, and a region of the person's belongings (for example, a bag region).

When the moving object to be tracked is something other than a person, salient regions appropriate to the type of object are detected and their feature amounts are registered. For example, when the tracking target is an automobile, the front, rear, or sides of the automobile, its tires, or its license plate may be detected as salient regions.

FIG. 2 is a hardware configuration diagram of the image search system 100 of Embodiment 1. The server computer 107 is, for example, a general-purpose computer having a processor 201 and a storage device 202 connected to each other. The storage device 202 may include any type of storage medium. For example, the storage device 202 may include a semiconductor memory and a hard disk drive. The storage device 202 may include a non-transitory computer-readable storage medium that stores data.

In this example, the functional units shown in FIG. 1, namely the video input unit 108, the moving object tracking unit 110, the trajectory feature extraction unit 111, the trajectory feature recording unit 112, the frame recording unit 113, the feature registration processing condition input unit 109, the feature registration processing content determination unit 121, the face detection unit 131, the face feature extraction unit 132, the face feature recording unit 133, the head detection unit 141, the head feature extraction unit 142, the head feature recording unit 143, the clothing detection unit 151, the clothing feature extraction unit 152, the clothing feature recording unit 153, the feature registration processing time calculation unit 123, the feature vector input unit 171, the similar feature vector search unit 172, and the search result integration unit 173, are implemented by the processor 201 executing the processing program 203 stored in the storage device 202.

In other words, the processing performed by each of these functional units is actually performed by the processor 201 following the instructions written in the processing program 203. The feature registration processing time holding unit 122 and the video database 161 are storage areas of the storage device 202. The video database 161 may instead be contained in a storage device connected via a network.

サーバ計算機１０７は、さらに、プロセッサ２０１に接続されたネットワークインターフェース装置（ＮＩＦ）２０４を含む。映像撮影装置１０２は、例えば、ネットワークインターフェース装置２０４を介してサーバ計算機１０７に接続される。映像記憶装置１０１は、ネットワークインターフェース装置２０４を介してサーバ計算機１０７に接続されたＮＡＳまたはＳＡＮであってもよいし、記憶装置２０２に含まれてもよい。 The server computer 107 further includes a network interface device (NIF) 204 connected to the processor 201. The video photographing apparatus 102 is connected to the server computer 107 via the network interface apparatus 204, for example. The video storage device 101 may be a NAS or a SAN connected to the server computer 107 via the network interface device 204, or may be included in the storage device 202.

図３Ａ及び図３Ｂは、実施例１の映像データベース１６１の構成例の説明図である。ここではテーブル形式の構成例を示すが、映像データベース１６１のデータ形式は任意でよい。映像データベース１６１は、図３Ａに示す映像データ管理情報３００、背景画像データ管理情報３１０、軌跡特徴管理情報３２０、移動物体管理情報３３０及びフレーム画像管理情報３４０を含む。 3A and 3B are explanatory diagrams of a configuration example of the video database 161 according to the first embodiment. Here, a configuration example of a table format is shown, but the data format of the video database 161 may be arbitrary. The video database 161 includes video data management information 300, background image data management information 310, trajectory feature management information 320, moving object management information 330, and frame image management information 340 shown in FIG. 3A.

さらに、映像データベース１６１は、図３Ｂに示す顔特徴管理情報３５０、頭部特徴管理情報３６０及び服飾特徴管理情報３７０を含む。図３Ａ及び図３Ｂのテーブル構成及び各テーブルのフィールド構成は一例であり、アプリケーションに応じてテーブル及びフィールドを追加、変更してもよい。 Further, the video database 161 includes face feature management information 350, head feature management information 360, and clothing feature management information 370 shown in FIG. 3B. The table configurations in FIGS. 3A and 3B and the field configurations of each table are examples, and the tables and fields may be added or changed according to the application.

映像データ管理情報３００は、映像ＩＤフィールド３０１、ファイル名フィールド３０２、及び撮影場所ＩＤフィールド３０３を有する。映像ＩＤフィールド３０１は、映像データファイルの識別子（以下、映像ＩＤ）を保持する。ファイル名フィールド３０２は、映像記憶装置１０１から読み込まれた映像データファイルのファイル名を保持し、映像データベース１６１内の映像データ（フレーム）と映像記憶装置１０１内のファイルとを対応づける。映像データが映像撮影装置１０２から入力される場合、ファイル名を省略してもよい。 The video data management information 300 includes a video ID field 301, a file name field 302, and a shooting location ID field 303. The video ID field 301 holds an identifier of a video data file (hereinafter referred to as video ID). The file name field 302 holds the file name of the video data file read from the video storage device 101 and associates the video data (frame) in the video database 161 with the file in the video storage device 101. When video data is input from the video shooting device 102, the file name may be omitted.

撮影場所ＩＤフィールド３０３は、定点観測された場所の識別子（以下、撮影場所ＩＤ）を保持する。映像データファイルと撮影場所とを対応付けるための管理情報は、映像入力部１０８により保持されていてもよいし、映像データベース１６１に含まれてもよい。 The shooting location ID field 303 holds an identifier of a fixed-point observation location (hereinafter referred to as a shooting location ID). Management information for associating video data files with shooting locations may be held by the video input unit 108 or may be included in the video database 161.

入力された映像データが固定カメラによって撮影された場合は、撮影場所ＩＤをカメラＩＤと読み替えてもよい。図３Ａの例のように、一つの撮影場所に対して、複数の映像データファイルが登録されてもよい。複数の映像データファイルには、例えば、設置場所及び撮影方向が固定された一つのカメラがそれぞれ異なる時間帯に撮影した映像データが含まれる。 When the input video data is shot by a fixed camera, the shooting location ID may be read as the camera ID. As in the example of FIG. 3A, a plurality of video data files may be registered for one shooting location. The plurality of video data files include, for example, video data shot at different time zones by one camera whose installation location and shooting direction are fixed.

背景画像データ管理情報３１０は、撮影場所ＩＤフィールド３１１及び背景画像データフィールド３１２を有する。背景画像データ管理情報３１０は、システム管理者によって予め作成され、映像データベース１６１に登録される。撮影場所ＩＤフィールド３１１は、背景画像の撮影場所の識別子を保持するフィールドであり、この識別子は、映像データ管理情報３００に保持される撮影場所ＩＤと対応する。 The background image data management information 310 includes a shooting location ID field 311 and a background image data field 312. The background image data management information 310 is created in advance by the system administrator and registered in the video database 161. The shooting location ID field 311 is a field that holds an identifier of the shooting location of the background image, and this identifier corresponds to the shooting location ID held in the video data management information 300.

背景画像データフィールド３１２は、各撮影場所で撮影された背景画像のデータを保持する。ここに保持される背景画像は、後述するように、ユーザが検索しようとする移動物体の軌跡を入力するときに表示される。したがって、それぞれの撮影場所において移動物体を撮影するカメラと同じカメラによって撮影された、いずれの移動物体も含まない画像であることが望ましい。 The background image data field 312 holds background image data taken at each shooting location. The background image held here is displayed when the user inputs the locus of the moving object to be searched, as will be described later. Therefore, it is desirable that the image is one that does not include any moving object, which is captured by the same camera that captures the moving object at each shooting location.

軌跡特徴管理情報３２０は、追跡ＩＤフィールド３２１、映像ＩＤフィールド３２２、移動物体ＩＤフィールド３２３及び軌跡特徴ベクトル（特徴量）フィールド３２４を有する。追跡ＩＤフィールド３２１は、移動物体追跡部１１０が各移動物体を追跡するために用いる識別子（以下、追跡ＩＤ）を保持する。移動物体を追跡することで得られた各軌跡にユニークな追跡ＩＤが与えられる。 The trajectory feature management information 320 includes a tracking ID field 321, a video ID field 322, a moving object ID field 323, and a trajectory feature vector (feature amount) field 324. The tracking ID field 321 holds an identifier (hereinafter, tracking ID) used by the moving object tracking unit 110 to track each moving object. A unique tracking ID is given to each trajectory obtained by tracking a moving object.

移動物体が映像に表れてから消えるまでの軌跡に対して、一つの追跡ＩＤが与えられる。一つの移動物体が、消えた後に再度現れると、新しい追跡ＩＤが与えられる。この例において、追跡ＩＤの値は、異なる映像ＩＤの値において一意である。 One tracking ID is given to the trajectory of a moving object from when it appears in the video until it disappears. If a moving object reappears after disappearing, it is given a new tracking ID. In this example, tracking ID values are unique across different video IDs.

映像ＩＤフィールド３２２は、追跡対象の移動物体の画像を含む映像データファイルの識別子を保持する。この識別子は、映像データ管理情報３００に保持される映像ＩＤと対応する。 The video ID field 322 holds an identifier of a video data file including an image of a moving object to be tracked. This identifier corresponds to the video ID held in the video data management information 300.

移動物体ＩＤフィールド３２３は、それぞれの軌跡を構成する、フレームそれぞれから検出された移動物体の識別子（以下、移動物体ＩＤ）のリストを保持する。移動物体ＩＤは、移動物体そのものを識別するものではなく、各フレームから検出された移動物体の画像を識別する。同一の移動物体の画像が複数のフレームから検出された場合、それらの移動物体の画像の各々に別の（一意の）移動物体ＩＤが与えられる。同一の移動物体の画像が連続するフレームで検出された場合、それらの異なる移動物体ＩＤが、一つの追跡ＩＤに対応付けられる。 The moving object ID field 323 holds a list of identifiers of moving objects (hereinafter referred to as moving object IDs) that are detected from the respective frames constituting the respective trajectories. The moving object ID does not identify the moving object itself, but identifies an image of the moving object detected from each frame. When images of the same moving object are detected from a plurality of frames, a different (unique) moving object ID is given to each of the moving object images. When images of the same moving object are detected in successive frames, these different moving object IDs are associated with one tracking ID.

例えば、図３Ａにおいて、追跡ＩＤ「１」に対応する移動物体ＩＤフィールド３２３に「１、２、４、５、６、・・・」が登録されている。これは、それぞれ異なるフレームから検出された移動物体ＩＤ「１」、「２」、「４」、「５」、「６」によって識別される移動物体の画像が、移動物体追跡部１１０によって相互に対応付けられていることを意味する。すなわち、それらが同一の移動物体の画像と判定されたことを意味する。 For example, in FIG. 3A, “1, 2, 4, 5, 6, ...” is registered in the moving object ID field 323 corresponding to the tracking ID “1”. This means that the moving object images identified by the moving object IDs “1”, “2”, “4”, “5”, and “6”, each detected from a different frame, have been associated with one another by the moving object tracking unit 110. That is, they have been determined to be images of the same moving object.

軌跡特徴ベクトルフィールド３２４は、映像中の移動物体の座標の時系列変化（軌跡）から抽出された、軌跡特徴量を保持する。軌跡特徴量は、例えば一つまたは複数の固定長のベクトルによって表現される。軌跡特徴量は、任意の公知の方法によって抽出することができる。具体的には、同一の追跡ＩＤに対応付けられた移動物体ＩＤの画像のフレーム内の座標の時系列変化から、当該追跡ＩＤの軌跡特徴量が計算される。 The trajectory feature vector field 324 holds the trajectory feature amount extracted from the time-series change (trajectory) of the coordinates of the moving object in the video. The trajectory feature amount is expressed by, for example, one or a plurality of fixed length vectors. The trajectory feature amount can be extracted by any known method. Specifically, the trajectory feature amount of the tracking ID is calculated from the time-series change of the coordinates in the frame of the image of the moving object ID associated with the same tracking ID.

移動物体管理情報３３０は、移動物体ＩＤフィールド３３１、矩形座標フィールド３３２及び撮影日時フィールド３３３を含む。移動物体ＩＤフィールド３３１は、各フレームから検出された移動物体ＩＤを保持する。移動物体ＩＤは、軌跡特徴管理情報３２０の移動物体ＩＤフィールド３２３に保持されるものと対応する。 The moving object management information 330 includes a moving object ID field 331, a rectangular coordinate field 332, and a shooting date / time field 333. The moving object ID field 331 holds the moving object ID detected from each frame. The moving object ID corresponds to that held in the moving object ID field 323 of the trajectory feature management information 320.

矩形座標フィールド３３２は、各フレームから検出された移動物体の画像の当該フレーム中に占める範囲を示す矩形座標を保持する。この座標は、例えば、移動物体の外接矩形の「左上隅の水平座標、左上隅の垂直座標、右下隅の水平座標、右下隅の垂直座標」という形式で表現されてもよいし、矩形の中心の座標、幅及び高さによって表現されてもよい。後述する矩形座標フィールド３５３、３６３及び３７３に保持される矩形座標の表現も同様であってよい。 The rectangular coordinate field 332 holds rectangular coordinates indicating the range that the image of a moving object detected from each frame occupies in that frame. These coordinates may be expressed, for example, in the form “horizontal coordinate of the upper left corner, vertical coordinate of the upper left corner, horizontal coordinate of the lower right corner, vertical coordinate of the lower right corner” of the circumscribed rectangle of the moving object, or by the coordinates of the center of the rectangle together with its width and height. The rectangular coordinates held in the rectangular coordinate fields 353, 363, and 373 described later may be expressed in the same way.
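
The two rectangle representations mentioned above carry the same information and can be converted into each other. The following is a minimal sketch of that conversion; the type names `CornerRect` and `CenterRect` and both function names are hypothetical and do not appear in the specification.

```python
from typing import NamedTuple


class CornerRect(NamedTuple):
    """Upper-left and lower-right corner form (hypothetical name)."""
    left: float
    top: float
    right: float
    bottom: float


class CenterRect(NamedTuple):
    """Center, width and height form (hypothetical name)."""
    cx: float
    cy: float
    width: float
    height: float


def corners_to_center(r: CornerRect) -> CenterRect:
    # The center is the midpoint of the two corners.
    return CenterRect((r.left + r.right) / 2, (r.top + r.bottom) / 2,
                      r.right - r.left, r.bottom - r.top)


def center_to_corners(r: CenterRect) -> CornerRect:
    half_w, half_h = r.width / 2, r.height / 2
    return CornerRect(r.cx - half_w, r.cy - half_h,
                      r.cx + half_w, r.cy + half_h)
```

Either form could be stored in the rectangular coordinate fields, as long as every reader of the field applies the same convention.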

撮影日時フィールド３３３は、各移動物体の画像を含むフレームが撮影された日時を保持する。 The shooting date / time field 333 holds the date / time when the frame including the image of each moving object was shot.

フレーム画像管理情報３４０は、フレームＩＤフィールド３４１、映像ＩＤフィールド３４２及び画像データフィールド３４３、顔処理フィールド３４４、頭部処理フィールド３４５、服飾処理フィールド３４６を含む。 The frame image management information 340 includes a frame ID field 341, a video ID field 342 and an image data field 343, a face processing field 344, a head processing field 345, and a clothing processing field 346.

フレームＩＤフィールド３４１は、映像データから抽出された各フレームの識別子（以下、フレームＩＤ）を保持する。フレームＩＤは、異なる映像データファイルにおいて一意である。映像ＩＤフィールド３４２は、フレームの抽出元の映像データファイルを識別する映像ＩＤを保持するフィールドであり、この映像ＩＤは、映像データ管理情報３００の映像ＩＤフィールド３０１に保持される値に対応する。画像データフィールド３４３は、フレームの静止画像のバイナリデータであり、検索結果などを表示装置１０６に表示する際に用いられるデータを、保持する。 The frame ID field 341 holds an identifier (hereinafter referred to as a frame ID) of each frame extracted from the video data. Frame IDs are unique across different video data files. The video ID field 342 holds a video ID identifying the video data file from which the frame was extracted, and this video ID corresponds to the value held in the video ID field 301 of the video data management information 300. The image data field 343 holds the binary data of the still image of the frame, which is used when displaying search results and the like on the display device 106.

顔処理フィールド３４４は、フレームにおける顔検出処理及び顔特徴抽出処理の実行有無と、顔検出処理が実行された場合の検出処理方法と、顔特徴抽出が実行された場合の顔特徴抽出方法と、の情報を保持する。頭部処理フィールド３４５は、フレームにおける頭部検出処理及び頭部特徴抽出処理の実行有無と、頭部検出処理が実行された場合の検出処理方法と、頭部特徴抽出処理が実行された場合の頭部特徴抽出方法と、の情報を保持する。服飾処理フィールド３４６は、フレームにおける服飾検出処理及び服飾特徴抽出処理の実行有無と、服飾検出処理が実行された場合の検出処理方法と、服飾特徴抽出処理が実行された場合の服飾特徴抽出方法と、の情報を保持する。 The face processing field 344 holds information on whether face detection processing and face feature extraction processing were executed for the frame, the detection method used when face detection processing was executed, and the face feature extraction method used when face feature extraction was executed. The head processing field 345 holds information on whether head detection processing and head feature extraction processing were executed for the frame, the detection method used when head detection processing was executed, and the head feature extraction method used when head feature extraction processing was executed. The clothing processing field 346 holds information on whether clothing detection processing and clothing feature extraction processing were executed for the frame, the detection method used when clothing detection processing was executed, and the clothing feature extraction method used when clothing feature extraction processing was executed.

本例において、「ＮＯＮＥ」は処理が実行されなかったことを意味する。「ＤＥＴＥＣＴＯＲｋ」（ｋは自然数）及び「ＥＸＴＲＡＣＴＯＲｋ」は、それぞれ、顔処理、頭部処理、または服飾処理における、検出処理方法及び抽出処理方法の識別子である。 In this example, “NONE” means that the process has not been executed. “DETECTORk” (k is a natural number) and “EXTRACTORk” are identifiers of a detection processing method and an extraction processing method in face processing, head processing, or clothing processing, respectively.
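
As an illustration of how one processing field value could be represented, the sketch below models a detector and extractor pair in which a missing method is rendered as `NONE`. The class name `ProcessingRecord` and the comma-separated encoding are assumptions made for this sketch; the specification states only the identifiers “NONE”, “DETECTORk”, and “EXTRACTORk”, not how they are combined in a field.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProcessingRecord:
    """One face/head/clothing processing field value (hypothetical layout)."""
    detector: Optional[int] = None   # k of DETECTORk, None if not executed
    extractor: Optional[int] = None  # k of EXTRACTORk, None if not executed

    def encode(self) -> str:
        # "NONE" marks processing that was not executed for the frame.
        det = "NONE" if self.detector is None else f"DETECTOR{self.detector}"
        ext = "NONE" if self.extractor is None else f"EXTRACTOR{self.extractor}"
        return f"{det},{ext}"
```

Recording the method identifier per frame lets a later search step know which frames were processed and with which method.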

顔特徴管理情報３５０は、顔ＩＤフィールド３５１、フレームＩＤフィールド３５２、矩形座標フィールド３５３及び顔特徴ベクトルフィールド３５４を含む。顔ＩＤフィールド３５１は、フレームの画像から顔検出部１３１によって検出された顔領域の識別子（以下、顔ＩＤ）を保持する。フレームＩＤフィールド３５２は、顔領域が検出されたフレームのフレームＩＤを保持する。このフレームＩＤは、フレーム画像管理情報３４０のフレームＩＤフィールド３４１に保持されるものと対応する。 The face feature management information 350 includes a face ID field 351, a frame ID field 352, a rectangular coordinate field 353, and a face feature vector field 354. The face ID field 351 holds an identifier (hereinafter referred to as a face ID) of a face area detected by the face detection unit 131 from the frame image. The frame ID field 352 holds the frame ID of the frame in which the face area is detected. This frame ID corresponds to that held in the frame ID field 341 of the frame image management information 340.

矩形座標フィールド３５３は、検出された顔領域のフレームに占める範囲を示す座標を保持する。顔特徴ベクトルフィールド３５４は、検出された顔領域から顔特徴抽出部１３２によって抽出された画像特徴量の特徴ベクトルを保持する。 The rectangular coordinate field 353 holds coordinates indicating the range of the detected face area in the frame. The face feature vector field 354 holds the feature vector of the image feature amount extracted by the face feature extraction unit 132 from the detected face area.

頭部特徴管理情報３６０は、頭部ＩＤフィールド３６１、フレームＩＤフィールド３６２、矩形座標フィールド３６３及び頭部特徴ベクトルフィールド３６４を含む。頭部ＩＤフィールド３６１は、フレームから頭部検出部１４１によって検出された頭部領域の識別子（以下、頭部ＩＤ）を保持する。フレームＩＤフィールド３６２は、頭部領域が検出されたフレームのフレームＩＤを保持する。このフレームＩＤは、フレーム画像管理情報３４０のフレームＩＤフィールド３４１に保持されるものと対応する。 The head feature management information 360 includes a head ID field 361, a frame ID field 362, a rectangular coordinate field 363, and a head feature vector field 364. The head ID field 361 holds an identifier (hereinafter referred to as a head ID) of the head region detected by the head detection unit 141 from the frame. The frame ID field 362 holds the frame ID of the frame in which the head region is detected. This frame ID corresponds to that held in the frame ID field 341 of the frame image management information 340.

矩形座標フィールド３６３は、検出された頭部領域のフレームに占める範囲を示す座標を保持する。頭部特徴ベクトルフィールド３６４は、検出された頭部領域から頭部特徴抽出部１４２によって抽出された画像特徴量の特徴ベクトルを保持する。 The rectangular coordinate field 363 holds coordinates indicating the range of the detected head region in the frame. The head feature vector field 364 holds a feature vector of an image feature amount extracted by the head feature extraction unit 142 from the detected head region.

服飾特徴管理情報３７０は、服飾ＩＤフィールド３７１、フレームＩＤフィールド３７２、矩形座標フィールド３７３及び服飾特徴ベクトルフィールド３７４を含む。服飾ＩＤフィールド３７１は、フレームから服飾検出部１５１によって検出された服飾領域の識別子（以下、服飾ＩＤ）を保持する。フレームＩＤフィールド３７２は、服飾領域が検出されたフレームのフレームＩＤを保持する。このフレームＩＤは、フレーム画像管理情報３４０のフレームＩＤフィールド３４１に保持されるものと対応する。 The clothing feature management information 370 includes a clothing ID field 371, a frame ID field 372, a rectangular coordinate field 373, and a clothing feature vector field 374. The clothing ID field 371 holds an identifier of a clothing area (hereinafter referred to as clothing ID) detected by the clothing detection unit 151 from the frame. The frame ID field 372 holds the frame ID of the frame in which the clothing area is detected. This frame ID corresponds to that held in the frame ID field 341 of the frame image management information 340.

矩形座標フィールド３７３は、検出された服飾領域のフレームに占める範囲を示す座標を保持する。服飾特徴ベクトルフィールド３７４は、検出された服飾領域から服飾特徴抽出部１５２によって抽出された画像特徴量の特徴ベクトルを保持する。 The rectangular coordinate field 373 holds coordinates indicating the range of the detected clothing area in the frame. The clothing feature vector field 374 holds the feature vector of the image feature amount extracted by the clothing feature extraction unit 152 from the detected clothing region.

上記以外の顕著領域が検出され、その特徴量が抽出された場合には、当該顕著領域に関する上記と同様の情報が映像データベース１６１に保持される。 When a saliency area other than the above is detected and the feature amount is extracted, the same information as the above regarding the saliency area is held in the video database 161.

図４は、実施例１のサーバ計算機１０７が入力された映像を登録する処理を説明するフローチャートである。最初に、映像入力部１０８が、映像記憶装置１０１または映像撮影装置１０２から入力された映像データファイルを取得する（ステップＳ４００）。具体的には、映像入力部１０８は、映像データ管理情報３００に取得した映像データファイルのエントリを追加する。映像データファイルは、映像を撮影した撮影場所または撮影装置の情報を含む。映像入力部１０８は、不図示の管理情報を参照することで、撮影装置に対応付けられた撮影場所の情報を取得する。 FIG. 4 is a flowchart for explaining processing for registering the input video by the server computer 107 according to the first embodiment. First, the video input unit 108 acquires a video data file input from the video storage device 101 or the video shooting device 102 (step S400). Specifically, the video input unit 108 adds an entry of the acquired video data file to the video data management information 300. The video data file includes information on the shooting location or the shooting device that shot the video. The video input unit 108 refers to management information (not shown) to acquire information on the shooting location associated with the shooting device.

次に、映像入力部１０８が、入力された映像データファイルをデコードし、フレームを静止画として抽出する（ステップＳ４０１）。映像データファイルのフレームレートは、映像データファイルの種類によって異なってもよい。 Next, the video input unit 108 decodes the input video data file and extracts a frame as a still image (step S401). The frame rate of the video data file may vary depending on the type of the video data file.

次に、サーバ計算機１０７内の各部が、ステップＳ４０１で抽出された各フレームに対して、ステップＳ４０２〜Ｓ４１９を実行する。 Next, each unit in the server computer 107 executes steps S402 to S419 for each frame extracted in step S401.

フレーム記録部１１３は、抽出されたフレームの画像データを、フレームＩＤ及び映像ＩＤと共に映像データベース１６１のフレーム画像管理情報３４０に記録する（ステップＳ４０３）。 The frame recording unit 113 records the extracted frame image data in the frame image management information 340 of the video database 161 together with the frame ID and the video ID (step S403).

移動物体追跡部１１０は、処理対象である現在フレームから移動物体を検出し、検出した移動物体と直前の時刻のフレームから検出された移動物体との間の対応関係を決定し、その情報を保持する（ステップＳ４０４）。移動物体追跡部１１０は、現在フレームにおいて検出された移動物体それぞれに移動物体ＩＤを与え、さらに、現在フレームにおいて検出された移動物体と同じ移動物体が直前フレームで検出されている場合、それらの移動物体ＩＤを同一移動物体として対応付ける。 The moving object tracking unit 110 detects moving objects from the current frame being processed, determines the correspondence between each detected moving object and the moving objects detected from the immediately preceding frame, and holds that information (step S404). The moving object tracking unit 110 gives a moving object ID to each moving object detected in the current frame. Further, when the same moving object as one detected in the current frame was also detected in the immediately preceding frame, it associates their moving object IDs with each other as belonging to the same moving object.

次に、軌跡特徴記録部１１２は、現在フレームから検出された移動物体の情報を映像データベース１６１の移動物体管理情報３３０に記録する（ステップＳ４０５）。具体的には、軌跡特徴記録部１１２は、現在フレームから検出された移動物体に付与した移動物体ＩＤ、対応する移動物体の矩形座標及び現在フレームの撮影日時の情報を記録する。撮影日時の情報は映像データファイルの属性情報に含まれており、軌跡特徴記録部１１２は、それを映像入力部１０８から取得する。 Next, the trajectory feature recording unit 112 records information on the moving object detected from the current frame in the moving object management information 330 of the video database 161 (step S405). Specifically, the trajectory feature recording unit 112 records the moving object ID given to the moving object detected from the current frame, the rectangular coordinates of the corresponding moving object, and information on the shooting date and time of the current frame. The shooting date / time information is included in the attribute information of the video data file, and the trajectory feature recording unit 112 acquires it from the video input unit 108.

次に、移動物体追跡部１１０は、現在フレームにおいて新しい移動物体が出現しているか否かを判定する（ステップＳ４０６）。具体的には、移動物体追跡部１１０は、現在フレームから検出された移動物体が直前の時刻のフレームから検出されたいずれの移動物体とも対応付けられない、言い換えると、現在フレームから検出された移動物体と同一の移動物体の画像が直前の時刻のフレームに含まれていない場合、新しい移動物体が出現したと判定する。 Next, the moving object tracking unit 110 determines whether or not a new moving object has appeared in the current frame (step S406). Specifically, the moving object tracking unit 110 determines that a new moving object has appeared when a moving object detected from the current frame is not associated with any moving object detected from the immediately preceding frame, in other words, when no image of the same moving object is included in the immediately preceding frame.

新しい移動物体が出現している場合（ステップＳ４０６：ＹＥＳ）、移動物体追跡部１１０は、当該新しい移動物体に新しい追跡ＩＤを付与する。軌跡特徴記録部１１２は、当該新しい追跡ＩＤと、当該新しい移動物体が検出された映像の映像ＩＤと、を含むエントリを、軌跡特徴管理情報３２０に記録する（ステップＳ４０７）。一方、新しい移動物体が出現していない場合（ステップＳ４０６：ＮＯ）、ステップＳ４０７は省略される。 When a new moving object appears (step S406: YES), the moving object tracking unit 110 assigns a new tracking ID to the new moving object. The trajectory feature recording unit 112 records an entry including the new tracking ID and the video ID of the video in which the new moving object is detected in the trajectory feature management information 320 (step S407). On the other hand, when no new moving object has appeared (step S406: NO), step S407 is omitted.

次に、軌跡特徴記録部１１２は、ステップＳ４０４における対応関係に従って、軌跡特徴管理情報３２０における移動物体ＩＤフィールド３２３を更新する（ステップＳ４０８）。 Next, the trajectory feature recording unit 112 updates the moving object ID field 323 in the trajectory feature management information 320 according to the correspondence relationship in Step S404 (Step S408).

具体的には、軌跡特徴記録部１１２は、新たに出現した移動物体の移動物体ＩＤを、ステップＳ４０７で記録された追跡ＩＤのエントリの移動物体ＩＤフィールド３２３に記録する。さらに、軌跡特徴記録部１１２は、直前の時刻のフレームから検出された移動物体と同一の移動物体の移動物体ＩＤを、当該直前の時刻の移動物体と同一のエントリの移動物体ＩＤフィールド３２３に記録する。 Specifically, the trajectory feature recording unit 112 records the moving object ID of a newly appearing moving object in the moving object ID field 323 of the tracking ID entry recorded in step S407. Further, for a moving object that is the same as one detected from the immediately preceding frame, the trajectory feature recording unit 112 records its moving object ID in the moving object ID field 323 of the same entry as the moving object of the immediately preceding frame.
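
The bookkeeping of steps S404 to S408 can be sketched as follows. This is an illustrative toy, not the specification's implementation: the class `TrackTable`, its attributes, and the `matches` input are hypothetical, and the frame-to-frame association itself (how a detection is matched to the preceding frame) is assumed to be supplied by the caller.

```python
from itertools import count


class TrackTable:
    """Toy stand-in for the trajectory feature management information 320."""

    def __init__(self):
        self._next_tracking_id = count(1)
        self._next_object_id = count(1)
        self.tracks = {}           # tracking ID -> list of moving object IDs
        self.object_to_track = {}  # moving object ID -> tracking ID

    def register_frame(self, matches):
        """Register the detections of one frame.

        matches holds one item per detection in the current frame: the
        moving object ID matched in the preceding frame, or None when the
        detection matches nothing (a newly appearing moving object).
        """
        new_ids = []
        for prev_object_id in matches:
            object_id = next(self._next_object_id)          # S404: new ID per image
            if prev_object_id is None:                      # S406: YES
                tracking_id = next(self._next_tracking_id)  # S407: new entry
                self.tracks[tracking_id] = []
            else:
                tracking_id = self.object_to_track[prev_object_id]
            self.tracks[tracking_id].append(object_id)      # S408: update list
            self.object_to_track[object_id] = tracking_id
            new_ids.append(object_id)
        return new_ids
```

Each detection receives its own moving object ID even when it belongs to an existing trajectory, which mirrors the statement that a moving object ID identifies an image of a moving object rather than the object itself.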

次に、移動物体追跡部１１０は、消失した移動物体が存在するか否かを判定する（ステップＳ４０９）。具体的には、ステップＳ４０４において直前の時刻のフレームから検出された移動物体が現在フレームのいずれの移動物体とも対応付けられない、言い換えると、直前の時刻のフレームから検出された移動物体と同一の移動物体の画像が現在フレームに含まれていない場合、移動物体追跡部１１０は、直前の時刻のフレームから検出された当該移動物体が消失したと判定する。 Next, the moving object tracking unit 110 determines whether or not there is a lost moving object (step S409). Specifically, in step S404, the moving object detected from the frame at the previous time is not associated with any moving object in the current frame, in other words, the same as the moving object detected from the frame at the previous time. When the image of the moving object is not included in the current frame, the moving object tracking unit 110 determines that the moving object detected from the frame at the immediately preceding time has disappeared.

消失した移動物体が存在する場合（Ｓ４０９：ＹＥＳ）、軌跡特徴抽出部１１１は、消失した移動物体それぞれの軌跡から、軌跡特徴量を抽出する（ステップＳ４１０）。具体的には、軌跡特徴抽出部１１１は、軌跡特徴管理情報３２０における、消失した移動物体のエントリの移動物体ＩＤフィールド３２３から、移動物体ＩＤを取得する。 When there is a lost moving object (S409: YES), the trajectory feature extraction unit 111 extracts a trajectory feature amount from the trajectory of each of the lost moving objects (step S410). Specifically, the trajectory feature extraction unit 111 acquires the moving object ID from the moving object ID field 323 of the disappeared moving object entry in the trajectory feature management information 320.

さらに、軌跡特徴抽出部１１１は、移動物体管理情報３３０から、取得した移動物体ＩＤそれぞれの矩形座標を取得し、当該移動物体の軌跡を決定する。各フレームにおける移動物体の座標は、例えば、矩形の中心位置である。軌跡特徴抽出部１１１は、決定した軌跡から軌跡特徴量を算出する。軌跡特徴抽出部１１１は、抽出された軌跡特徴量から軌跡特徴ベクトルを生成する。 Further, the trajectory feature extraction unit 111 acquires the rectangular coordinates of each acquired moving object ID from the moving object management information 330, and determines the trajectory of the moving object. The coordinates of the moving object in each frame are, for example, the center position of the rectangle. The trajectory feature extraction unit 111 calculates a trajectory feature amount from the determined trajectory. The trajectory feature extraction unit 111 generates a trajectory feature vector from the extracted trajectory feature amount.

軌跡特徴記録部１１２は、消失した移動物体それぞれの軌跡特徴ベクトルを、軌跡特徴管理情報３２０内の軌跡特徴ベクトルフィールド３２４に記録する（ステップＳ４１１）。消失していない移動物体に対しては、軌跡がさらに延長される可能性があるため、ステップＳ４１０、Ｓ４１１は実行されない。消失した移動物体が存在しない場合（Ｓ４０９：ＮＯ）、ステップＳ４１０及びＳ４１１は省略される。 The trajectory feature recording unit 112 records the trajectory feature vector of each lost moving object in the trajectory feature vector field 324 in the trajectory feature management information 320 (step S411). For a moving object that has not disappeared, the trajectory may be further extended, and thus steps S410 and S411 are not executed. When there is no disappearing moving object (S409: NO), steps S410 and S411 are omitted.
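
The specification leaves the trajectory feature extraction method open (any known method may be used). Purely as an assumed example, the sketch below takes the rectangle centers of a finished trajectory, resamples them to a fixed number of points by linear interpolation, and flattens them into a fixed-length vector. The function names and the `num_points` parameter are hypothetical.

```python
def rect_center(rect):
    """Center of a (left, top, right, bottom) rectangle."""
    left, top, right, bottom = rect
    return ((left + right) / 2.0, (top + bottom) / 2.0)


def trajectory_feature(rects, num_points=4):
    """Fixed-length vector [x0, y0, x1, y1, ...] from a rectangle sequence.

    Assumed method: resample the sequence of rectangle centers to
    num_points points, evenly spaced over the frame index.
    """
    if not rects or num_points < 2:
        raise ValueError("need at least one rectangle and num_points >= 2")
    centers = [rect_center(r) for r in rects]
    last = len(centers) - 1
    vector = []
    for i in range(num_points):
        pos = i * last / (num_points - 1)  # fractional index into centers
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        x = centers[lo][0] + frac * (centers[hi][0] - centers[lo][0])
        y = centers[lo][1] + frac * (centers[hi][1] - centers[lo][1])
        vector.extend([x, y])
    return vector
```

Because the vector length depends only on `num_points`, trajectories of different durations become directly comparable, which is what a fixed-length trajectory feature vector requires.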

次に、特徴登録処理内容判定部１２１は、顔検出部１３１、頭部検出部１４１、服飾検出部１５１それぞれにおいて検出処理を実行するか、または、実行しないか判定する（ステップＳ４１２）。検出処理を実行する場合、検出処理内容を切り替える判定が含まれてもよい。判定は、特徴登録処理条件入力部１０９が保持する特徴登録処理条件と、特徴登録処理時間保持部１２２が保持する各検出部、各抽出部、各記録部の過去の処理時間とに、基づいてもよい。 Next, the feature registration processing content determination unit 121 determines, for each of the face detection unit 131, the head detection unit 141, and the clothing detection unit 151, whether or not to execute the detection processing (step S412). When the detection processing is to be executed, the determination may include a decision to switch the content of the detection processing. The determination may be based on the feature registration processing conditions held by the feature registration processing condition input unit 109 and on the past processing times of each detection unit, each extraction unit, and each recording unit held by the feature registration processing time holding unit 122.

特徴登録処理時間保持部１２２が処理中の映像について保持する処理時間は、例えば、各検出部、各抽出部、各記録部の、過去のフレームそれぞれの処理時間、直前フレームの処理時間、過去所定数のフレームにおける１フレームあたりの平均処理時間、当該映像の最初のフレームから直前のフレームまでの１フレームあたりの平均処理時間等である。ステップＳ４１２において、特徴登録処理内容判定部１２１が検出処理を実行しないという判定となった場合、ステップＳ４１３〜ステップＳ４１７は実行しない。 The processing times that the feature registration processing time holding unit 122 holds for the video being processed are, for example, for each detection unit, each extraction unit, and each recording unit: the processing time of each past frame, the processing time of the immediately preceding frame, the average processing time per frame over a predetermined number of past frames, and the average processing time per frame from the first frame of the video to the immediately preceding frame. When the feature registration processing content determination unit 121 determines in step S412 that the detection processing is not to be executed, steps S413 to S417 are not executed.
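
The processing-time statistics listed above can be kept with a very small log per detection, extraction, or recording unit. The following sketch is illustrative only; the class `ProcessingTimeLog` and its method names are hypothetical.

```python
class ProcessingTimeLog:
    """Per-unit processing times for the video being processed (sketch)."""

    def __init__(self, recent_frames=3):
        self.recent_frames = recent_frames  # the "predetermined number" of frames
        self.times = []                     # processing time of each past frame

    def record(self, seconds):
        self.times.append(seconds)

    def last(self):
        """Processing time of the immediately preceding frame."""
        return self.times[-1]

    def recent_average(self):
        """Average per frame over the most recent recent_frames frames."""
        window = self.times[-self.recent_frames:]
        return sum(window) / len(window)

    def overall_average(self):
        """Average per frame from the first frame to the preceding frame."""
        return sum(self.times) / len(self.times)
```

All three accessors assume at least one recorded frame; a caller would need to handle the first frame of a video separately.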

ステップＳ４１２における特徴登録処理内容判定部１２１の判定は、例えば、以下の通りである。特徴登録処理内容判定部１２１は、ユーザ入力または設定ファイルに従い、所定数フレームに１回の頻度で、特定の検出処理を省略する、または、特定の検出処理において複数の方法から負荷の小さい方法を選択する。これにより、特徴登録処理内容の判定が容易となる。頻度は、検出処理の種類毎に決定されてもよく、撮影場所毎に設定されてもよい。 The determination by the feature registration processing content determination unit 121 in step S412 is, for example, as follows. According to user input or a setting file, the feature registration processing content determination unit 121, once every predetermined number of frames, either omits a specific detection process or selects a low-load method from among a plurality of methods for that detection process. This makes the feature registration processing content easy to determine. The frequency may be determined for each type of detection processing, or may be set for each shooting location.

例えば、特徴登録処理内容判定部１２１は、顔検出はフレーム毎に実行し、頭部検出及び服飾検出は、それぞれに対して設定された規定数フレームに１回の頻度で省略すると、判定する。これにより、顔検出が実行されるフレームレートは、頭部検出が実行されるフレームレートよりも大きい。 For example, the feature registration process content determination unit 121 determines that face detection is performed for each frame, and that head detection and clothing detection are omitted once every specified number of frames set for each. Thereby, the frame rate at which face detection is performed is greater than the frame rate at which head detection is performed.

当該判定は、過去のフレームの処理時間を参照しない。省略する処理及び省略頻度は、ユーザによって指定される、または設定ファイルに予め設定されている。検索精度に影響が小さい処理の省略頻度を大きくすることで、検索精度への影響を小さくしつつ、特徴登録処理の時間を短縮することができる。 This determination does not refer to the processing time of past frames. The process to be omitted and the frequency of omission are designated by the user or preset in the setting file. By increasing the omission frequency of the process that has a small influence on the search accuracy, it is possible to reduce the time for the feature registration process while reducing the influence on the search accuracy.
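
The fixed-frequency determination described above needs only the frame index and a per-process setting. Below is a minimal sketch under assumed settings: the dictionary `SKIP_EVERY`, its values, and the function names are hypothetical, and the choice of which frame in each cycle is skipped is arbitrary.

```python
from typing import Optional

# Hypothetical settings: omit head detection once every 3 frames and
# clothing detection once every 2 frames; never omit face detection.
SKIP_EVERY = {"face": None, "head": 3, "clothing": 2}


def should_skip(frame_index: int, skip_every: Optional[int]) -> bool:
    """True when the process is omitted on this frame."""
    if skip_every is None:
        return False
    # Omit the last frame of every group of skip_every frames.
    return frame_index % skip_every == skip_every - 1


def plan(frame_index: int) -> dict:
    """Map each process name to whether it is executed on this frame."""
    return {name: not should_skip(frame_index, n)
            for name, n in SKIP_EVERY.items()}
```

With these assumed values, face detection runs on every frame while head detection runs on two frames out of three, so the face detection frame rate exceeds the head detection frame rate, as in the example above.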

特徴登録処理内容判定部１２１は、当該映像における過去のフレームの処理時間に基づいて、特徴登録処理内容を決定してもよい。これにより、状況に応じて適切に登録処理内容を変更できる。 The feature registration processing content determination unit 121 may determine the feature registration processing content based on the processing time of the past frame in the video. Thereby, the registration process content can be appropriately changed according to the situation.

例えば、特徴登録処理内容判定部１２１は、特徴登録処理の目標フレームレートを元に、１フレームあたりの目標平均処理時間を算出する。特徴登録処理の目標フレームレートは、顔処理、頭部処理及び服飾処理それぞれに対して設定されている。顔処理、頭部処理及び服飾処理は並列に実行されてもよいし、直列に実行されてもよい。顔処理、頭部処理及び服飾処理に対して独立に目標フレームレートを設定することで、検索精度の影響を小さくしつつ、登録処理時間を短縮できる。 For example, the feature registration processing content determination unit 121 calculates a target average processing time per frame from the target frame rate of the feature registration processing. A target frame rate of the feature registration processing is set for each of the face processing, the head processing, and the clothing processing. The face processing, the head processing, and the clothing processing may be executed in parallel or in series. Setting the target frame rates for the face processing, the head processing, and the clothing processing independently makes it possible to shorten the registration processing time while limiting the effect on search accuracy.

特徴登録処理内容判定部１２１は、顔処理、頭部処理、及び服飾処理それぞれの、過去の１フレームあたりの特徴登録処理時間の平均値を算出する。特徴登録処理時間は、検出部、抽出部、記録部の処理時間の合計である。 The feature registration process content determination unit 121 calculates the average value of the feature registration process time per frame in the past for each of the face process, the head process, and the clothing process. The feature registration processing time is the total processing time of the detection unit, the extraction unit, and the recording unit.

特徴登録処理内容判定部１２１は、顔処理、頭部処理及び服飾処理それぞれについて、特徴登録処理時間の平均値と目標平均処理時間の差分に基づいて、現在フレームの当該処理を省略するか否かまたは処理方法を変更するか否か判定する。 For each of the face processing, the head processing, and the clothing processing, the feature registration processing content determination unit 121 determines whether to omit that processing for the current frame, or whether to change the processing method, based on the difference between the average feature registration processing time and the target average processing time.

たとえば、ある処理の特徴登録処理時間の平均値が当該処理の目標平均処理時間よりも長い場合、特徴登録処理内容判定部１２１は、当該処理の検出処理の実行頻度を減少させる（省略頻度を増加させる）、または現在フレームで当該処理を省略すると判定する。現在フレームで当該処理を省略することは、省略頻度を増加させることになる。 For example, when the average feature registration processing time of a given process is longer than the target average processing time of that process, the feature registration processing content determination unit 121 determines to decrease the execution frequency of the detection processing of that process (increase the omission frequency), or to omit that process in the current frame. Omitting the process in the current frame also increases the omission frequency.

実行頻度の減少量は、ユーザによって指定される、または設定ファイルに予め設定されてもよい。これにより、検出処理の負荷を低減し、特徴処理時間の平均値を小さくして目標値に近づけることができる。 The amount of decrease in the execution frequency may be designated by the user or set in advance in the setting file. As a result, the load of the detection process can be reduced, and the average value of the feature processing time can be reduced to approach the target value.
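
The processing-time-based determination can be sketched as a small controller per process. The sketch makes several assumptions that the specification does not state: the target average processing time is taken as the reciprocal of the target frame rate, the average is a simple mean over a sliding window, and an omitted frame is logged with a processing time of zero. The class `SkipController` and its parameters are hypothetical.

```python
from collections import deque


class SkipController:
    """Decides per frame whether one process (e.g. head processing) runs."""

    def __init__(self, target_fps, window=5):
        # Assumption: target average time per frame = 1 / target frame rate.
        self.target_time = 1.0 / target_fps
        self.history = deque(maxlen=window)  # recent per-frame times

    def should_run(self):
        """Omit the process while the recent average exceeds the target."""
        if not self.history:
            return True
        average = sum(self.history) / len(self.history)
        return average <= self.target_time

    def record(self, elapsed_seconds):
        """Log the time spent on this frame (0.0 when the process was omitted)."""
        self.history.append(elapsed_seconds)
```

A caller would invoke `should_run()` before each frame and `record()` afterwards. One slow frame then causes the following frames to be omitted until the windowed average falls back to the target, which corresponds to increasing the omission frequency.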

特徴登録処理内容判定部１２１は、目標平均処理時間と特徴登録処理時間の平均値とに基づき、それぞれの検出処理の処理方法を決定してもよい。たとえば、特徴登録処理内容判定部１２１は、フレーム内で検出処理を実行する領域を変化させてもよい。特徴登録処理時間の平均値が目標平均処理時間よりも長い場合、特徴登録処理内容判定部１２１は、検出処理を実行する領域を小さくすると判定する。検出処理を実行する領域は、たとえば、撮影場所毎に予め設定されている。 The feature registration process content determination unit 121 may determine a processing method for each detection process based on the target average processing time and the average value of the feature registration processing time. For example, the feature registration process content determination unit 121 may change the area in which the detection process is executed in the frame. When the average value of the feature registration processing time is longer than the target average processing time, the feature registration processing content determination unit 121 determines to reduce the area in which the detection processing is executed. The area where the detection process is performed is set in advance for each shooting location, for example.

これにより、検出処理の負荷を低減し、特徴処理時間の平均値を小さくして目標値に近づけることができる。特徴処理時間の平均値は、単純平均値または加重平均値でもよい。平均値と異なる統計値が使用されてもよい。 As a result, the load of the detection process can be reduced, and the average value of the feature processing time can be reduced to approach the target value. The average value of the feature processing time may be a simple average value or a weighted average value. A statistical value different from the average value may be used.
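
As a small illustration of the two averaging options mentioned above, the following sketch computes a simple average and a weighted average of past processing times. The weights, which here favor more recent frames, are an assumption made for the example; the specification does not state how weights would be chosen.

```python
def simple_average(times: list[float]) -> float:
    """Unweighted mean of past per-frame processing times."""
    return sum(times) / len(times)


def weighted_average(times: list[float], weights: list[float]) -> float:
    """Weighted mean; each time is multiplied by its weight before averaging."""
    if len(times) != len(weights):
        raise ValueError("times and weights must have the same length")
    return sum(t * w for t, w in zip(times, weights)) / sum(weights)


# Hypothetical processing times in seconds, ordered oldest to newest.
times = [0.10, 0.20, 0.30]
# Hypothetical weights that emphasize recent frames.
weights = [1.0, 2.0, 3.0]

simple = simple_average(times)
weighted = weighted_average(times, weights)
```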

Alternatively, a single target frame rate for the feature registration process may be set for a shooting location or a video. In this case, the feature registration process content determination unit 121 selects, for each past frame, the maximum of the feature registration processing times of face processing, head processing, and clothing processing, and takes the average of those maxima as the past average feature registration processing time per frame. Here, face processing, head processing, and clothing processing are executed in parallel, as described above.
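
The per-frame maximum followed by averaging can be sketched as below. The data layout (one dictionary of times per past frame) and the numbers are assumptions made for illustration.

```python
def average_frame_time(per_frame_times: list[dict[str, float]]) -> float:
    """Average, over past frames, of the slowest of the parallel processes."""
    if not per_frame_times:
        raise ValueError("at least one past frame is required")
    # With parallel execution, a frame takes as long as its slowest process.
    slowest = [max(frame.values()) for frame in per_frame_times]
    return sum(slowest) / len(slowest)


# Hypothetical registration times in seconds for two past frames.
past = [
    {"face": 0.05, "head": 0.12, "clothing": 0.10},
    {"face": 0.06, "head": 0.08, "clothing": 0.14},
]

average = average_frame_time(past)
```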

When the average feature registration processing time is longer than the target average processing time, the feature registration process content determination unit 121 decreases the execution frequency of a predetermined detection process, for example the head detection process, or omits that detection process in the current frame. As the process whose execution frequency is to be reduced, the feature registration process content determination unit 121 may select the detection process with the longest average feature registration processing time.

Based on the difference between the target average processing time and the average feature registration processing time, the feature registration process content determination unit 121 may determine which detection process is to have its execution frequency reduced and what that frequency should be, or may determine which detection process to omit in the current frame. For example, the feature registration process content determination unit 121 may determine to omit the clothing detection process when the difference is larger than a first threshold, and to omit both the head detection process and the clothing detection process when the difference is larger than a second threshold that is larger than the first threshold.
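
The two-threshold rule in this example can be written as a short sketch. The function name and the numeric values are hypothetical; the difference is taken here as the average time minus the target time.

```python
def detections_to_skip(
    avg_time: float,
    target_time: float,
    first_threshold: float,
    second_threshold: float,
) -> set[str]:
    """Choose the detection processes to omit from how far the average exceeds the target."""
    if second_threshold <= first_threshold:
        raise ValueError("second_threshold must be larger than first_threshold")
    diff = avg_time - target_time
    if diff > second_threshold:
        return {"head", "clothing"}  # large overrun: omit both
    if diff > first_threshold:
        return {"clothing"}  # moderate overrun: omit clothing detection only
    return set()  # within tolerance: omit nothing


small = detections_to_skip(0.11, 0.10, 0.02, 0.08)
moderate = detections_to_skip(0.15, 0.10, 0.02, 0.08)
large = detections_to_skip(0.25, 0.10, 0.02, 0.08)
```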

特徴登録処理内容判定部１２１は、目標平均処理時間と特徴登録処理時間の平均値とに基づき、それぞれの検出処理の処理方法を決定してもよい。たとえば、特徴登録処理内容判定部１２１は、顔処理、頭部処理、及び服飾処理のうちの１又は複数の処理において、検出処理を実行する領域を変化させてもよい。 The feature registration process content determination unit 121 may determine a processing method for each detection process based on the target average processing time and the average value of the feature registration processing time. For example, the feature registration process content determination unit 121 may change the region in which the detection process is executed in one or more of the face process, the head process, and the clothing process.

次に、顔検出部１３１、頭部検出部１４１及び服飾検出部１５１は、それぞれ、特徴登録処理内容判定部１２１の判定結果に従い、処理対象の現在フレームの画像から顔領域、頭部領域及び服飾領域を検出する、または検出を省略する（ステップＳ４１３）。実行頻度が決められている場合、過去のフレームにおける検出処理の有無と実行頻度とから、現在フレームにおける検出処理の実行の有無が決定される。検出は、公知の方法を含む任意の方法によって実行することができる。 Next, the face detection unit 131, the head detection unit 141, and the clothing detection unit 151 respectively determine the face region, the head region, and the clothing from the image of the current frame to be processed according to the determination result of the feature registration processing content determination unit 121. The region is detected or the detection is omitted (step S413). When the execution frequency is determined, the presence / absence of the detection process in the current frame is determined from the presence / absence of the detection process in the past frame and the execution frequency. Detection can be performed by any method including known methods.
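
One possible way to derive the current frame's decision from past frames and an execution frequency is sketched below. It assumes the execution frequency is expressed as the fraction of frames on which detection should run; the specification does not fix this representation, and the function name is hypothetical.

```python
def run_detection_now(history: list[bool], execution_frequency: float) -> bool:
    """Decide whether to run detection on the current frame.

    history holds one flag per past frame (True if detection ran).
    Detection runs when the number of past executions is below the
    share allowed once the current frame is counted.
    """
    executed = sum(history)
    return executed < execution_frequency * (len(history) + 1)


# Simulate 8 frames with detection allowed on a quarter of the frames.
history: list[bool] = []
for _ in range(8):
    history.append(run_detection_now(history, 0.25))
```

With a frequency of 0.25 this runs detection on the first frame and then on every fourth frame.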

次に、顔検出部１３１、頭部検出部１４１及び服飾検出部１５１は、それぞれの領域の検出に成功したか否かを判定する（ステップＳ４１４）。検出が成功している処理については（ステップＳ４１４：ＹＥＳ）、特徴登録処理内容判定部１２１は、ステップＳ４１５に進む。検出が成功していない処理については（ステップＳ４１４：ＮＯ）、特徴登録処理内容判定部１２１は、ステップＳ４１８に進む。 Next, the face detection unit 131, the head detection unit 141, and the clothing detection unit 151 determine whether or not each region has been successfully detected (step S414). For the process that has been successfully detected (step S414: YES), the feature registration process content determination unit 121 proceeds to step S415. For the process that has not been successfully detected (step S414: NO), the feature registration process content determination unit 121 proceeds to step S418.

ステップＳ４１５において、顔特徴抽出部１３２、頭部特徴抽出部１４２、服飾特徴抽出部１５２において特徴抽出処理を実行するか否かの判定を行う（ステップＳ４１５）。特徴抽出を省略すると判定する場合（Ｓ４１５：ＮＯ）、特徴登録処理内容判定部１２１は、ステップＳ４１６を省略し、ステップＳ４１７に進む。 In step S415, it is determined whether or not feature extraction processing is executed in the face feature extraction unit 132, the head feature extraction unit 142, and the clothing feature extraction unit 152 (step S415). When it is determined that feature extraction is omitted (S415: NO), the feature registration process content determination unit 121 omits step S416 and proceeds to step S417.

When it is determined that feature extraction is to be executed (S415: YES), the feature registration process content determination unit 121 proceeds to step S416. When the feature extraction process is executed, the determination may also include whether to switch the feature extraction process. The determination may use the registration processing conditions held by the feature registration processing condition input unit 109 and the processing times of the detection units, extraction units, and recording units held by the feature registration processing time holding unit 122.

The determination made by the feature registration process content determination unit 121 in step S415 is, for example, as follows. The feature registration process content determination unit 121 may determine the content of the feature extraction process based on the configuration of the salient regions detected by the detection process. This reduces the load and time of the feature extraction process while limiting the impact on search accuracy.

For example, when the size of the rectangular salient regions detected by the detection process, such as the maximum size or the average size within the frame, is smaller than a size specified by the user or described in advance in a setting file, it is determined that the corresponding feature extraction process is omitted. The size used for the comparison may be the maximum size, the minimum size, or the average size of the salient regions of each type in the frame. The determination based on a comparison between the target average processing time and the average feature registration processing time is made in the same way as for the detection process.

The feature registration process content determination unit 121 may determine whether to omit the feature extraction process based on the number of detected salient regions, or may switch the feature extraction method based on that number. For example, when the number of salient regions (for example, face regions) detected by a specific detection process (for example, face detection) exceeds a threshold, the feature registration process content determination unit 121 omits the corresponding feature extraction process (for example, the face feature extraction process). The threshold may be set separately for face processing, head processing, and clothing processing, or may be common to them. The threshold may change according to the difference between the average feature registration processing time and the target average processing time.
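
The size-based and count-based conditions for omitting feature extraction can be combined in a sketch such as the following. The `Rect` type, the use of area as the size measure, the choice of the largest region for the size test, and the threshold values are all assumptions for illustration.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    """Axis-aligned rectangle of a detected salient region (hypothetical layout)."""
    x: int
    y: int
    width: int
    height: int


def skip_extraction(regions: list[Rect], min_area: int, max_count: int) -> bool:
    """Return True when feature extraction should be omitted for this frame."""
    if not regions:
        return True  # nothing was detected, so there is nothing to extract
    if len(regions) > max_count:
        return True  # too many regions: extraction would be too costly
    largest = max(r.width * r.height for r in regions)
    return largest < min_area  # every region is smaller than the configured size


too_small = skip_extraction([Rect(0, 0, 10, 10)], min_area=400, max_count=5)
large_enough = skip_extraction([Rect(0, 0, 30, 30)], min_area=400, max_count=5)
too_many = skip_extraction([Rect(0, 0, 30, 30)] * 6, min_area=400, max_count=5)
```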

顔特徴抽出部１３２は、特徴登録処理内容判定部１２１における判定結果（ステップＳ４１５）に応じて、検出された顔領域から特徴量を抽出する（ステップＳ４１６）。顔特徴記録部１３３は、ステップＳ４１３が実行された場合、検出された顔領域の範囲を示す座標を処理対象の現在フレームのフレームＩＤに対応する矩形座標フィールド３５３に登録し、ステップＳ４１６が実行された場合、抽出された特徴量からなる特徴ベクトルを処理対象の現在フレームのフレームＩＤに対応する顔特徴ベクトルフィールド３５４に登録する（ステップＳ４１７）。 The face feature extraction unit 132 extracts a feature amount from the detected face area according to the determination result (step S415) in the feature registration processing content determination unit 121 (step S416). When step S413 is executed, the face feature recording unit 133 registers the coordinates indicating the range of the detected face area in the rectangular coordinate field 353 corresponding to the frame ID of the current frame to be processed, and step S416 is executed. In this case, the feature vector composed of the extracted feature quantity is registered in the face feature vector field 354 corresponding to the frame ID of the current frame to be processed (step S417).

頭部特徴抽出部１４２は、特徴登録処理内容判定部１２１における判定結果（ステップＳ４１５）に応じて、検出された頭部領域から特徴量を抽出する（ステップＳ４１６）。頭部特徴記録部１４３は、ステップＳ４１３が実行された場合、検出された頭部領域の範囲を示す座標を処理対象の現在フレームのフレームＩＤに対応する矩形座標フィールド３６３に記録し、ステップＳ４１６が実行された場合、抽出された特徴量からなる特徴ベクトルを処理対象の現在フレームのフレームＩＤに対応する頭部特徴ベクトルフィールド３６４に記録する（ステップＳ４１７）。 The head feature extraction unit 142 extracts a feature amount from the detected head region according to the determination result (step S415) in the feature registration process content determination unit 121 (step S416). When step S413 is executed, the head feature recording unit 143 records the coordinates indicating the range of the detected head region in the rectangular coordinate field 363 corresponding to the frame ID of the current frame to be processed, and step S416 If executed, the feature vector composed of the extracted feature quantity is recorded in the head feature vector field 364 corresponding to the frame ID of the current frame to be processed (step S417).

服飾特徴抽出部１５２は、特徴登録処理内容判定部１２１における判定結果（ステップＳ４１５）に応じて、検出された服飾領域から特徴量を抽出する（ステップＳ４１６）。服飾特徴記録部１５３は、ステップＳ４１３が実行された場合、検出された服飾領域の範囲を示す座標を処理対象の現在フレームのフレームＩＤに対応する矩形座標フィールド３７３に記録し、ステップＳ４１６が実行された場合、抽出された特徴量からなる特徴ベクトルを処理対象の現在フレームのフレームＩＤに対応する服飾特徴ベクトルフィールド３７４に記録する（ステップＳ４１７）。 The clothing feature extraction unit 152 extracts a feature amount from the detected clothing region according to the determination result (step S415) in the feature registration processing content determination unit 121 (step S416). When step S413 is executed, the clothing feature recording unit 153 records the coordinates indicating the range of the detected clothing area in the rectangular coordinate field 373 corresponding to the frame ID of the current frame to be processed, and step S416 is executed. In the case, the feature vector composed of the extracted feature quantity is recorded in the clothing feature vector field 374 corresponding to the frame ID of the current frame to be processed (step S417).

The face feature recording unit 133 records the processing performed by the face detection unit 131 and the face feature extraction unit 132 in the face processing field 344 of the frame image management information 340 (step S418). The face processing field 344 registers, for example, information on whether face detection and face feature extraction were executed, the detection method used for face detection, and the method used to extract the image feature amount of the face region.

The head feature recording unit 143 records the processing performed by the head detection unit 141 and the head feature extraction unit 142 in the head processing field 345 of the frame image management information 340 (step S418). The head processing field 345 registers, for example, information on whether head detection and head feature extraction were executed, the detection method used for head detection, and the method used to extract the image feature amount of the head region.

The clothing feature recording unit 153 records the processing performed by the clothing detection unit 151 and the clothing feature extraction unit 152 in the clothing processing field 346 of the frame image management information 340 (step S418). The clothing processing field 346 registers, for example, information on whether clothing detection and clothing feature extraction were executed, the detection method used for clothing detection, and the method used to extract the image feature amount of the clothing region.

The feature registration processing time calculation unit 123 receives all or some of the processing times used by the face detection unit 131, the face feature extraction unit 132, the face feature recording unit 133, the head detection unit 141, the head feature extraction unit 142, the head feature recording unit 143, the clothing detection unit 151, the clothing feature extraction unit 152, and the clothing feature recording unit 153, and stores them in the feature registration processing time holding unit 122 (step S419). In step S419, the feature registration processing time calculation unit 123 may store the processing times received from the detection units, extraction units, and recording units in the feature registration processing time holding unit 122 as they are, or may store the result of a calculation performed on them.

Furthermore, the calculation may use processing times already held in the feature registration processing time holding unit 122. For example, the sum of the times used by the face detection unit 131, the face feature extraction unit 132, and the face feature recording unit 133 in steps S413, S416, and S418 may be calculated, or the processing times of frames processed before the current frame, which are stored in the feature registration processing time holding unit 122, may be used to calculate the average processing time per frame since the start of processing.

全てのフレームについて上記の処理が終了すると、入力された映像を登録する処理が終了する。なお、ステップＳ４０３、Ｓ４０４〜４１１、Ｓ４１２〜４１９については、並列に実行されてもよい。 When the above process is completed for all frames, the process of registering the input video is completed. Note that steps S403, S404 to 411, and S412 to 419 may be executed in parallel.

図５は、実施例１のサーバ計算機１０７が入力された映像を登録する処理の条件を設定するための設定画面の説明図である。図５を参照して、ユーザからの登録処理条件の入力方法の一例を説明する。図５の画面での設定内容は、図４のステップＳ４１２及びＳ４１５における判定に使用される。 FIG. 5 is an explanatory diagram of a setting screen for setting processing conditions for registering the input video by the server computer 107 according to the first embodiment. With reference to FIG. 5, an example of a method for inputting registration processing conditions from the user will be described. The setting contents on the screen of FIG. 5 are used for determination in steps S412 and S415 of FIG.

登録処理条件設定画面は、表示装置１０４によって表示され、チェックボックス５０１、５０２、５０４、５０６と、特徴登録処理時間比率表示エリア５０３と、フレームレート比率表示エリア５０５と、設定ボタン５０７と、を含む。 The registration processing condition setting screen is displayed by the display device 104 and includes check boxes 501, 502, 504, and 506, a feature registration processing time ratio display area 503, a frame rate ratio display area 505, and a setting button 507. .

チェックボックス５０１は、特徴登録処理におけるフレームレートの変更を禁止するか許可するかの選択状態を表示する。特徴登録処理におけるフレームレートの変更を禁止する設定の場合、ステップＳ４１２及びステップＳ４１５では判定処理は実行されず、後続のステップが実行される。 A check box 501 displays a selection state of whether to prohibit or permit change of the frame rate in the feature registration process. In the case of setting for prohibiting the change of the frame rate in the feature registration process, the determination process is not executed in steps S412 and S415, and the subsequent steps are executed.

チェックボックス５０２は、特徴登録処理におけるフレームレートを自動的に設定するか否かの選択状態を表示する。チェックボックス５０２が選択状態の場合、特徴登録処理時間比率表示エリア５０３の入力値が有効となる。特徴登録処理時間比率表示エリア５０３には、ユーザがキーボードを用いて入力した特徴登録処理時間比率が表示される。特徴登録処理時間比率は、処理対象の実映像時間に対する特徴登録処理時間の目標時間であり、この値によって、特徴登録処理における顕著領域に共通の目標フレームレートが決まる。 A check box 502 displays a selection state as to whether or not to automatically set the frame rate in the feature registration process. When the check box 502 is selected, the input value in the feature registration processing time ratio display area 503 is valid. The feature registration processing time ratio display area 503 displays the feature registration processing time ratio input by the user using the keyboard. The feature registration processing time ratio is a target time of the feature registration processing time with respect to the actual video time to be processed, and this value determines a common target frame rate for the salient region in the feature registration processing.

チェックボックス５０４は、顕著領域別に特徴登録処理のフレームレート比率を指定するか否かの選択状態を表示する。チェックボックス５０４が選択状態の場合、フレームレート比率表示エリア５０５に表示されている各顕著領域別のフレームレート比率が有効となる。 A check box 504 displays a selection state as to whether or not to specify the frame rate ratio of the feature registration process for each saliency area. When the check box 504 is selected, the frame rate ratio for each remarkable area displayed in the frame rate ratio display area 505 is valid.

The frame rate ratio display area 505 displays the frame rate ratio for each salient region type, as entered by the user with the keyboard. For each salient region type, the determination to skip processing in step S412 is made so that the frame rate of the feature registration process equals the product of the frame rate of the video received by the video input unit 108 and the frame rate ratio.
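
One way to realize such a ratio is a running-credit scheme, sketched below: each incoming frame adds the ratio to a credit counter, and a frame is processed whenever the credit reaches one. This scheme and the function name are assumptions for illustration; the specification only states the resulting frame rate.

```python
def select_frames(num_frames: int, ratio: float) -> list[int]:
    """Return the indices of frames to process so that about ratio * num_frames are kept."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0 and 1")
    selected = []
    credit = 0.0
    for index in range(num_frames):
        credit += ratio  # each input frame earns a fraction of one processing slot
        if credit >= 1.0:
            credit -= 1.0
            selected.append(index)
    return selected


# Hypothetical example: a 30 fps input with a ratio of 0.5 gives 15 processed frames per second.
one_second = select_frames(30, 0.5)
quarter = select_frames(8, 0.25)
```

The ratios used here are exactly representable in binary floating point; for ratios such as 0.1 the accumulated credit is only approximate, so an integer or rational counter may be preferable.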

チェックボックス５０６は、特徴抽出方法の切替許可の選択状態を表示する。チェックボックス５０６が選択状態の場合、ステップＳ４１５において、特徴登録処理内容判定部１２１は、処理状況に応じて、ステップＳ４１５における該当する顕著領域の画像特徴量の抽出方法を切り替える。図５では省略しているが、登録処理条件設定画面には、切替対象となる抽出方法を指定するための表示エリアが含まれてもよい。 A check box 506 displays a selection state of permission to switch the feature extraction method. When the check box 506 is selected, in step S415, the feature registration process content determination unit 121 switches the image feature amount extraction method of the corresponding saliency area in step S415 according to the processing status. Although omitted in FIG. 5, the registration processing condition setting screen may include a display area for designating an extraction method to be switched.

チェックボックス５０１、５０２、５０４、５０６は、マウスクリックすることで状態を切り替えることが可能である。設定ボタン５０７をクリックすることで、登録条件設定画面における登録処理条件が特徴登録処理条件入力部１０９に送られる。 Check boxes 501, 502, 504, and 506 can be switched by clicking the mouse. By clicking the setting button 507, the registration processing condition on the registration condition setting screen is sent to the feature registration processing condition input unit 109.

As described above, according to the first embodiment, the method and frequency of the detection process and feature extraction process for each salient region, and the frequency of the feature recording process to the video database, can be changed in the video feature registration process. This makes it possible to reduce the frequency of high-load processing and to switch a processing method to a lower-load method, so that the feature registration processing time for moving objects is reduced while the loss of moving object search accuracy is kept small.

According to the first embodiment, the degree of freedom in extracting feature amounts from the input video is improved, and both the registration of the frames of the input video data file and the feature registration process can be performed in real time. In addition, the amount of recorded information and the processing time can be adjusted.

図６は、実施例１のサーバ計算機１０７による検索処理を説明するフローチャートである。最初に、ユーザが検索に使用する特徴ベクトル（特徴量）を決定し、入力する（ステップＳ６０１）。ユーザからの入力は、特徴ベクトル入力部１７１によって受け付けられる。ここでは、軌跡特徴ベクトル、顔特徴ベクトル、頭部特徴ベクトル及び服飾特徴ベクトルを検索に使用することが決定された場合について説明する。 FIG. 6 is a flowchart for explaining search processing by the server computer 107 according to the first embodiment. First, a feature vector (feature amount) used by the user for search is determined and input (step S601). Input from the user is received by the feature vector input unit 171. Here, a case where it is determined to use a trajectory feature vector, a face feature vector, a head feature vector, and a clothing feature vector for a search will be described.

上述のように検索に使用する特徴ベクトルが決定された場合、サーバ計算機１０７は、軌跡特徴ベクトルを用いた検索処理（ステップＳ６１１〜Ｓ６１６）、顔特徴ベクトルを用いた検索処理（ステップＳ６２１〜Ｓ６２３）、頭部特徴ベクトルを用いた検索処理（ステップＳ６３１〜Ｓ６３３）、及び服飾特徴ベクトルを用いた検索処理（ステップＳ６４１〜Ｓ６４３）を、任意の順に、または並列に実行する。 When the feature vector used for the search is determined as described above, the server computer 107 performs a search process using the trajectory feature vector (steps S611 to S616) and a search process using the face feature vector (steps S621 to S623). The search process using the head feature vector (steps S631 to S633) and the search process using the clothing feature vector (steps S641 to S643) are executed in any order or in parallel.

最初に、軌跡特徴ベクトルを用いた検索処理（ステップＳ６１１〜Ｓ６１６）を説明する。サーバ計算機１０７は、ステップＳ６１１〜Ｓ６１４を順次実行する。ユーザが入力装置１０５を用いて撮影場所ＩＤをサーバ計算機１０７に入力すると（ステップＳ６１１）、サーバ計算機１０７は背景画像データ管理情報３１０を参照して、入力された撮影場所ＩＤに対応する背景画像データを読み出し、そのデータに基づいて背景画像を表示装置１０６に表示する（ステップＳ６１２）。 First, the search process (steps S611 to S616) using the trajectory feature vector will be described. The server computer 107 sequentially executes steps S611 to S614. When the user inputs the shooting location ID to the server computer 107 using the input device 105 (step S611), the server computer 107 refers to the background image data management information 310 and the background image data corresponding to the input shooting location ID. And a background image is displayed on the display device 106 based on the data (step S612).

次に、ユーザが入力装置１０５を用いて特徴ベクトル入力部１７１に軌跡を入力すると（ステップＳ６１３）、類似特徴ベクトル検索部１７２が入力された軌跡を軌跡特徴ベクトルに変換する（ステップＳ６１４）。この変換は、軌跡特徴抽出部１１１が図４のステップＳ４１０において実行する軌跡特徴ベクトルの抽出と同様の方法で行われる。 Next, when the user inputs a trajectory to the feature vector input unit 171 using the input device 105 (step S613), the similar feature vector search unit 172 converts the input trajectory into a trajectory feature vector (step S614). This conversion is performed by the same method as the extraction of the trajectory feature vector performed by the trajectory feature extraction unit 111 in step S410 in FIG.

次に、ユーザが入力装置１０５を操作して軌跡特徴ベクトルの重みを決定する（ステップＳ６１５）。次に、類似特徴ベクトル検索部１７２が、ステップＳ６１３で入力された軌跡と類似する軌跡を映像データベース１６１の軌跡特徴管理情報３２０から検索する（ステップＳ６１６）。軌跡特徴ベクトルの検索機能は、例えば、クエリとして入力された軌跡特徴ベクトルと近い順にデータを並び替えて出力する。軌跡特徴ベクトルの比較には、例えば、軌跡特徴ベクトル間のユークリッド距離を用いることができる。 Next, the user operates the input device 105 to determine the weight of the trajectory feature vector (step S615). Next, the similar feature vector search unit 172 searches the trajectory feature management information 320 of the video database 161 for a trajectory similar to the trajectory input in step S613 (step S616). The trajectory feature vector search function, for example, rearranges and outputs data in the order closest to the trajectory feature vector input as a query. For comparison of the trajectory feature vectors, for example, the Euclidean distance between the trajectory feature vectors can be used.
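
Ranking stored trajectory feature vectors by Euclidean distance to the query can be sketched as follows. The identifiers and the two-dimensional vectors are hypothetical; real trajectory feature vectors would have whatever dimensionality the extraction in step S410 produces.

```python
import math


def rank_by_distance(
    query: tuple[float, ...],
    candidates: dict[str, tuple[float, ...]],
) -> list[tuple[str, float]]:
    """Return (id, distance) pairs sorted from nearest to farthest from the query."""
    scored = [(cid, math.dist(query, vec)) for cid, vec in candidates.items()]
    return sorted(scored, key=lambda item: item[1])


# Hypothetical stored trajectory feature vectors keyed by trajectory ID.
stored = {"a": (3.0, 4.0), "b": (1.0, 0.0), "c": (0.0, 2.0)}
ranking = rank_by_distance((0.0, 0.0), stored)
```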

次に、顔特徴ベクトルを用いた検索処理（ステップＳ６２１〜Ｓ６２３）を説明する。まず、ユーザが入力装置１０５を操作して検索キーとなる顔を設定する（ステップＳ６２１）。次に、ユーザが入力装置１０５を操作して顔特徴ベクトルの重みを決定する（ステップＳ６２２）。 Next, search processing (steps S621 to S623) using a face feature vector will be described. First, the user operates the input device 105 to set a face as a search key (step S621). Next, the user operates the input device 105 to determine the weight of the face feature vector (step S622).

Next, the similar feature vector search unit 172 searches the face feature management information 350 for faces similar to the face set as the search key in step S621 (step S623). Specifically, the similar feature vector search unit 172 can search for similar faces using the Euclidean distance between the image feature vector of the face set as the search key in step S621 and the face feature vectors held in the face feature management information 350. The same applies to the similar head search (step S633) and the similar clothing search (step S643) described later.

Next, the search process using the head feature vector (steps S631 to S633) will be described. First, the user operates the input device 105 to set a head as a search key (step S631). Next, the user operates the input device 105 to determine the weight of the head feature vector (step S632). Next, the similar feature vector search unit 172 searches the head feature management information 360 for heads similar to the head set as the search key in step S631 (step S633).

Next, the search process using the clothing feature vector (steps S641 to S643) will be described. First, the user operates the input device 105 to set clothing as a search key (step S641). Next, the user operates the input device 105 to determine the weight of the clothing feature vector (step S642). Next, the similar feature vector search unit 172 searches the clothing feature management information 370 for clothing similar to the clothing set as the search key in step S641 (step S643).

次に、検索結果統合部１７３が、ステップＳ６１６、Ｓ６２３、Ｓ６３３及びＳ６４３の検索結果を統合する（ステップＳ６５１）。具体的には、検索結果統合部１７３は、検索によって得られた軌跡特徴ベクトル、顔特徴ベクトル、頭部特徴ベクトル及び服飾特徴ベクトルの類似度に、それぞれ、ステップＳ６１５、Ｓ６２２、Ｓ６３２及びＳ６４２で決定された重み係数を掛けた値を合計することで総合的な類似度のスコアを得る。検索結果統合部１７３は、設定された重み係数をそのまま使用してもよいし、全ての特徴ベクトルの重み係数の合計値が１となるように正規化してもよい。 Next, the search result integration unit 173 integrates the search results of steps S616, S623, S633, and S643 (step S651). Specifically, the search result integration unit 173 determines the similarity of the trajectory feature vector, the face feature vector, the head feature vector, and the clothing feature vector obtained by the search in steps S615, S622, S632, and S642, respectively. A total similarity score is obtained by summing the values multiplied by the weighting factors. The search result integration unit 173 may use the set weight coefficient as it is, or may normalize so that the total value of the weight coefficients of all feature vectors becomes 1.
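
The weighted integration can be sketched as below. The sketch assumes that each per-feature similarity has already been computed as a number where larger means more similar (how a distance is converted to a similarity is not specified here), and the candidate IDs, weights, and similarity values are hypothetical.

```python
def integrate_scores(
    similarities: dict[str, dict[str, float]],
    weights: dict[str, float],
    normalize: bool = True,
) -> list[tuple[str, float]]:
    """Combine per-feature similarities into one score per candidate, best first."""
    if normalize:
        total = sum(weights.values())
        if total == 0:
            raise ValueError("weights must not sum to zero")
        # Rescale the weights so that they sum to 1.
        weights = {name: w / total for name, w in weights.items()}
    scores = {
        cid: sum(weights[name] * sim for name, sim in sims.items())
        for cid, sims in similarities.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical weights set by the user and similarities returned by the searches.
weights = {"trajectory": 1.0, "face": 3.0}
similarities = {
    "p1": {"trajectory": 0.8, "face": 0.4},
    "p2": {"trajectory": 0.4, "face": 0.8},
}
ranked = integrate_scores(similarities, weights)
```

Because the face weight dominates in this example, candidate `p2` ranks above `p1` even though `p1` has the closer trajectory.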

Note that which person registered in the moving object management information 330 a face feature vector registered in the face feature management information 350 belongs to can be determined, for example, based on whether (or to what extent) the rectangular coordinates registered in the face feature management information 350 overlap the rectangular coordinates registered in the moving object management information 330. The same applies to the head feature vector and the clothing feature vector. To facilitate this determination, the moving object management information 330 may further include a frame ID that identifies the frame from which the image of each moving object was extracted.
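
A minimal sketch of the overlap test follows. It assumes rectangles are given as (x1, y1, x2, y2) corner coordinates within the same frame and measures overlap as the fraction of the face rectangle covered by the moving object rectangle; both the coordinate layout and the 0.5 cutoff are assumptions for illustration.

```python
from typing import Optional

Box = tuple[float, float, float, float]  # (x1, y1, x2, y2), with x1 < x2 and y1 < y2


def overlap_ratio(inner: Box, outer: Box) -> float:
    """Fraction of the inner rectangle's area that lies inside the outer rectangle."""
    width = min(inner[2], outer[2]) - max(inner[0], outer[0])
    height = min(inner[3], outer[3]) - max(inner[1], outer[1])
    if width <= 0 or height <= 0:
        return 0.0  # the rectangles do not intersect
    inner_area = (inner[2] - inner[0]) * (inner[3] - inner[1])
    return (width * height) / inner_area


def assign_to_object(face: Box, objects: dict[str, Box], min_ratio: float = 0.5) -> Optional[str]:
    """Return the ID of the moving object that best covers the face, or None."""
    best_id, best_ratio = None, 0.0
    for object_id, rect in objects.items():
        ratio = overlap_ratio(face, rect)
        if ratio >= min_ratio and ratio > best_ratio:
            best_id, best_ratio = object_id, ratio
    return best_id


# Hypothetical moving object rectangles in one frame.
objects = {"A": (0.0, 0.0, 30.0, 60.0), "B": (15.0, 0.0, 50.0, 60.0)}
owner = assign_to_object((10.0, 10.0, 20.0, 20.0), objects)
nobody = assign_to_object((100.0, 100.0, 110.0, 110.0), objects)
```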

次に、検索結果統合部１７３は、検索結果及びスコアを表示装置１０６に出力し、表示装置１０６が検索結果をスコアが高い順に表示する（ステップＳ６６１）。以上で検索処理が終了する。 Next, the search result integration unit 173 outputs the search result and the score to the display device 106, and the display device 106 displays the search result in descending order of score (step S661). This completes the search process.

図７は、実施例１のサーバ計算機１０７によって出力される検索画面の説明図である。図７を参照して、図６の処理におけるユーザからの情報の入力方法及び検索結果の表示方法の一例を説明する。 FIG. 7 is an explanatory diagram of a search screen output by the server computer 107 according to the first embodiment. With reference to FIG. 7, an example of a method for inputting information from a user and a method for displaying a search result in the processing of FIG. 6 will be described.

表示装置１０６によって表示される検索画面は、映像再生エリア７０１、カメラ内追跡結果表示エリア７０２、全体像表示エリア７０３、重み設定エリア７０４、特徴ベクトル設定エリア７０５、検索ボタン７０６及び検索結果表示エリア７０７を含む。 The search screen displayed by the display device 106 includes a video playback area 701, an in-camera tracking result display area 702, an overall image display area 703, a weight setting area 704, a feature vector setting area 705, a search button 706, and a search result display area 707. including.

映像再生エリア７０１には、ユーザが選択した撮影場所で撮影された映像が再生され、表示される。ユーザが表示された映像に含まれる移動物体のいずれかを指定すると、指定された移動物体を当該映像内で追跡した結果がカメラ内追跡結果表示エリア７０２に表示される。ここでは、移動物体が人物である例について説明する。例えば当該映像の複数のフレーム画像から切り出された、指定された人物の複数の画像が、撮影日時の順に並べて表示されてもよい。 In the video playback area 701, video shot at the shooting location selected by the user is played back and displayed. When the user designates any of the moving objects included in the displayed video, the result of tracking the designated moving object in the video is displayed in the in-camera tracking result display area 702. Here, an example in which the moving object is a person will be described. For example, a plurality of images of a designated person cut out from a plurality of frame images of the video may be displayed side by side in order of shooting date and time.

実行される検索の目的は、ユーザが指定した特徴に近い特徴を有する人物の画像を、当該映像、当該映像と同じ場所で撮影された別の映像、または別の場所で撮影された映像から検索することであり、その人物と同一の人物（または同一の人物ではないが類似する特徴を有する人物）の画像が映像再生エリア７０１に表示された映像から発見された場合には、ユーザはその画像を指定することができる。 The purpose of the search is to retrieve images of a person having features close to those specified by the user from the video in question, from another video shot at the same location, or from a video shot at another location. If an image of that same person (or of a different person with similar features) is found in the video displayed in the video playback area 701, the user can designate that image.

ただし、例えば目撃情報から設定された特徴ベクトル（後述）のみが検索キーとして設定される場合のように、検索キーを設定するために移動物体の画像を参照する必要がない場合、映像再生エリア７０１に表示された画像をユーザが指定する必要がない。全体像表示エリア７０３には、移動物体の全体像が模式的に表示される。図７の例では、人物の頭部、顔、上半身、下半身、鞄等の領域を含む人物の模式図が表示される。 However, when there is no need to refer to an image of the moving object in order to set the search key, for example when only a feature vector set from eyewitness information (described later) is used as the search key, the user does not need to designate an image displayed in the video playback area 701. The overall image display area 703 schematically displays the whole of the moving object. In the example of FIG. 7, a schematic diagram of a person is displayed that includes regions such as the head, face, upper body, lower body, and bag.

重み設定エリア７０４には、それぞれの特徴量の重みを設定するためのスライドバーまたはその他の設定手段が表示される。図７の例では、全体像表示エリア７０３に表示された領域及び軌跡のそれぞれの特徴ベクトルの重みを設定するためのスライドバーが表示される。 In the weight setting area 704, a slide bar or other setting means for setting the weight of each feature amount is displayed. In the example of FIG. 7, a slide bar for setting the weight of each feature vector of the region and the trajectory displayed in the overall image display area 703 is displayed.

図７の例では頭部、顔、上半身、下半身、鞄及び軌跡のそれぞれの特徴ベクトルに対応するスライドバーが表示されているが、頭部、顔、服飾及び軌跡に対応するスライドバーが表示されてもよいし、その他の任意の顕著領域の特徴ベクトルに対応するスライドバーが表示されてもよい。 In the example of FIG. 7, slide bars corresponding to the feature vectors of the head, face, upper body, lower body, bag, and trajectory are displayed. Alternatively, slide bars corresponding to the head, face, clothing, and trajectory may be displayed, or a slide bar corresponding to the feature vector of any other salient region may be displayed.

特徴ベクトル設定エリア７０５には、以下に具体例を説明するように、特徴ベクトルの設定画面が表示される。例えば、ユーザは、軌跡特徴ベクトルを検索に使用することを決定した場合（ステップＳ６０１）、重み設定エリア７０４から「軌跡」を選択し、撮影場所ＩＤを指定すると（ステップＳ６１１）、指定された撮影場所の背景画像が特徴ベクトル設定エリア７０５に表示される（ステップＳ６１２）。 In the feature vector setting area 705, a feature vector setting screen is displayed as will be described below. For example, when the user decides to use the trajectory feature vector for the search (step S601), the user selects “trajectory” from the weight setting area 704 and designates the photographing location ID (step S611). The background image of the place is displayed in the feature vector setting area 705 (step S612).

続いて、ユーザは特徴ベクトル設定エリア７０５に表示された背景画像上で、検索キーとなる軌跡を入力する（ステップＳ６１３）。例えば、ユーザは、入力装置１０５であるマウスを操作してポインタ（マウスカーソル）を入力しようとする軌跡の始点に置き、マウスボタンを押して、ポインタを軌跡に沿って動かすようにドラッグし、軌跡の終点でマウスボタンを離すことで、軌跡を入力する。 Subsequently, the user inputs a locus serving as a search key on the background image displayed in the feature vector setting area 705 (step S613). For example, the user operates the mouse as the input device 105 to place the pointer (mouse cursor) at the start point of the locus to be input, presses the mouse button, drags the pointer to move along the locus, Enter the trajectory by releasing the mouse button at the end point.

または、表示装置１０６が例えば入力装置１０５としての機能も有するタッチパネルである場合、ユーザが特徴ベクトル設定エリア７０５に表示された背景画像上の軌跡の始点を指またはペン等によってタッチし、軌跡に沿って終点までスワイプしてもよい。ユーザがマウスクリックまたはタッチによって背景画像上のいくつかの点を指定し、サーバ計算機１０７がベジエ曲線等によってそれらの点を補間することによって軌跡を生成してもよい。 Alternatively, when the display device 106 is a touch panel that also functions as the input device 105, for example, the user touches the start point of the locus on the background image displayed in the feature vector setting area 705 with a finger or a pen, and follows the locus. You can swipe to the end point. The user may specify some points on the background image by mouse click or touch, and the server computer 107 may generate a trajectory by interpolating these points using a Bezier curve or the like.
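As an illustration of generating a trajectory from a few clicked or touched points, the sketch below evaluates a Bezier curve with de Casteljau's algorithm. It assumes the designated points are used directly as control points, so the curve passes through the first and last points and is only pulled toward the intermediate ones; the embodiment says only "a Bezier curve or the like", and the function names and sample count here are hypothetical.

```python
def bezier_point(ctrl, t):
    """Evaluate a Bezier curve at parameter t (0..1) by de Casteljau's algorithm.

    ctrl is a sequence of (x, y) control points; any number >= 1 is accepted.
    """
    pts = [(float(x), float(y)) for x, y in ctrl]
    while len(pts) > 1:
        # Repeatedly interpolate between neighbouring points until one remains.
        pts = [((1 - t) * x0 + t * x1, (1 - t) * y0 + t * y1)
               for (x0, y0), (x1, y1) in zip(pts, pts[1:])]
    return pts[0]


def trajectory_from_points(ctrl, samples=20):
    """Sample the curve at `samples` evenly spaced parameters (samples >= 2)."""
    return [bezier_point(ctrl, i / (samples - 1)) for i in range(samples)]
```

The sampled point list could then be handed to the trajectory feature extraction in the same way as a trajectory drawn by dragging or swiping.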

軌跡特徴抽出部１１１は、入力された軌跡を軌跡特徴量に変換する（ステップＳ６１４）。ユーザは、さらに、重み設定エリア７０４に表示された「軌跡」に対応するスライドバーを操作することで、軌跡特徴ベクトルの重みを設定する（ステップＳ６１５）。 The trajectory feature extraction unit 111 converts the input trajectory into a trajectory feature amount (step S614). The user further sets the weight of the trajectory feature vector by operating the slide bar corresponding to the “trajectory” displayed in the weight setting area 704 (step S615).

または、ユーザは、顔特徴ベクトルを検索に使用することを決定した場合（ステップＳ６０１）、重み設定エリア７０４から「顔」を選択し、検索キーとなる顔を設定する（ステップＳ６２１）。例えば、カメラ内追跡結果表示エリア７０２に表示された複数の画像から切り出された複数の顔画像が特徴ベクトル設定エリア７０５に表示され、ユーザがそれらの顔画像のいずれかを選択すると、選択された顔画像が検索キーとして設定されてもよい。ユーザは、さらに、重み設定エリア７０４に表示された「顔」に対応するスライドバーを操作することで、顔特徴ベクトルの重みを設定する（ステップＳ６２２）。 Alternatively, when the user decides to use the face feature vector for the search (step S601), the user selects "face" from the weight setting area 704 and sets the face to be used as the search key (step S621). For example, a plurality of face images cut out from the images displayed in the in-camera tracking result display area 702 may be displayed in the feature vector setting area 705, and when the user selects one of those face images, the selected face image may be set as the search key. The user further sets the weight of the face feature vector by operating the slide bar corresponding to "face" displayed in the weight setting area 704 (step S622).

同様に、ユーザは、頭部特徴ベクトルを検索に使用することを決定した場合（ステップＳ６０１）、重み設定エリア７０４から「頭部」を選択し、検索キーとなる頭部を設定し（ステップＳ６３１）、頭部特徴ベクトルの重みを設定する（ステップＳ６３２）。これらの手順は、例えば、顔特徴量及びその重みの設定と同様に実行することができる。 Similarly, when the user decides to use the head feature vector for the search (step S601), the user selects "head" from the weight setting area 704, sets the head to be used as the search key (step S631), and sets the weight of the head feature vector (step S632). These procedures can be carried out, for example, in the same manner as the setting of the face feature amount and its weight.

さらに、ユーザは、服飾特徴量を検索に使用することを決定した場合、重み設定エリア７０４から「服飾」（図７の例では「上半身」または「下半身」でもよい）を選択し、検索キーとなる服飾を設定し（ステップＳ６４１）、服飾特徴ベクトルの重みを設定する（ステップＳ６４２）。これらの手順は、例えば、顔特徴ベクトル及びその重みの設定と同様に実行することができる。 Furthermore, when the user decides to use the clothing feature amount for the search, the user selects "clothing" (which may be "upper body" or "lower body" in the example of FIG. 7) from the weight setting area 704, sets the clothing to be used as the search key (step S641), and sets the weight of the clothing feature vector (step S642). These procedures can be carried out, for example, in the same manner as the setting of the face feature vector and its weight.

または、例えば服飾特徴量として色特徴量が設定される場合には、カメラ内追跡結果表示エリア７０２に表示された画像から切り出された服飾画像ではなく、色見本またはカラーパレット等が特徴ベクトル設定エリア７０５に表示され、ユーザがいずれかの色を選択してもよい。 Alternatively, for example when a color feature amount is set as the clothing feature amount, a color sample, a color palette, or the like may be displayed in the feature vector setting area 705 instead of a clothing image cut out from the images displayed in the in-camera tracking result display area 702, and the user may select one of the colors.

図６では省略されているが、他の顕著領域（例えば鞄領域）の特徴量が設定される場合も、上記と同様に行うことができる。設定された特徴量（例えば色等）が全体像表示エリア７０３に表示された人物の模式図に反映されてもよい。 Although omitted from FIG. 6, the feature amount of another salient region (for example, a bag region) can be set in the same manner as described above. The set feature amount (for example, a color) may be reflected in the schematic diagram of the person displayed in the overall image display area 703.

ユーザが検索ボタン７０６を操作すると、検索に使用することが決定された特徴量について類似特徴ベクトル検索が行われ（ステップＳ６１６、Ｓ６２３、Ｓ６３３及びＳ６４３）、設定された重みに基づいて検索結果が統合され（ステップＳ６５１）、スコアの順に検索結果が検索結果表示エリア７０７に表示される（ステップＳ６６１）。 When the user operates the search button 706, a similar feature vector search is performed for the feature amount determined to be used for the search (steps S616, S623, S633, and S643), and the search results are integrated based on the set weight. Then, the search results are displayed in the search result display area 707 in the order of the scores (step S661).
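The integration and ranking in steps S651 and S661 can be pictured with the following sketch. It assumes a weight-normalized sum of per-feature similarities, which is one plausible reading and is not stated in this passage; the function name and the dictionary layout are likewise hypothetical.

```python
def integrate_results(per_feature, weights):
    """Combine per-feature similarity results into one ranked list.

    per_feature: {feature_name: {object_id: similarity}} from each similar
                 feature vector search (e.g. "face", "trajectory").
    weights:     {feature_name: weight} as set with the slide bars.
    Returns [(object_id, score), ...] sorted by descending score.
    """
    total = sum(weights[name] for name in per_feature)
    if total <= 0:
        return []  # no feature carries any weight
    scores = {}
    for name, hits in per_feature.items():
        for obj_id, similarity in hits.items():
            scores[obj_id] = scores.get(obj_id, 0.0) + weights[name] * similarity
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [(obj_id, score / total) for obj_id, score in ranked]
```

In this sketch an object that is missing from one feature's results simply contributes nothing for that feature, so raising a slide bar pushes objects that match that feature toward the top of the list.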

画像７０８ａには、指定された撮影場所の背景画像上に、検索結果として得られた軌跡７１０ａが表示され、さらに、当該軌跡７１０ａに対応する移動物体（この例では人物）７０９ａの画像が表示される。すなわち、軌跡７１０ａは移動物体７０９ａが移動した軌跡である。 In the image 708a, a trajectory 710a obtained as a search result is displayed on the background image of the designated shooting location, together with an image of the moving object (a person in this example) 709a corresponding to the trajectory 710a. That is, the trajectory 710a is the path along which the moving object 709a moved.

例えば、検索された軌跡の開始日時から終了日時までのいずれかの時点のフレームの画像上に、検索された軌跡を示す矢印を重畳表示することによって画像７０８ａが生成されてもよい。この場合、図７に示すように、表示されたフレームが撮影された日時が表示されてもよい。画像７０８ｂも同様に、別の検索された軌跡７１０ｂ及びそれに対応する移動物体７０９ｂの画像を含む。 For example, the image 708a may be generated by superimposing an arrow indicating the searched trajectory on the image of the frame at any point in time from the start date and time to the end date and time of the searched trajectory. In this case, as shown in FIG. 7, the date and time when the displayed frame was captured may be displayed. The image 708b similarly includes another searched trajectory 710b and a corresponding moving object 709b image.

なお、検索結果表示エリア７０７には、背景画像及び移動物体の画像を表示せずに、検索された軌跡を示す矢印のみを表示してもよい。または、いずれかの時点のフレームの画像（すなわち静止画像）ではなく、検索された軌跡の開始日時から終了日時までの映像を再生し、それを検索結果表示エリア７０７に表示してもよい。 In the search result display area 707, only the arrow indicating the searched trajectory may be displayed without displaying the background image and the moving object image. Alternatively, a video from the start date and time to the end date and time of the searched trajectory may be reproduced instead of the frame image (that is, still image) at any time point and displayed in the search result display area 707.

また、図６及び図７の例では、検索条件として撮影場所及び軌跡が指定されるが、さらに時刻が指定されてもよい。この場合、図７に示す検索画面にさらに時刻指定エリア（図示省略）が設けられ、ユーザが入力装置１０５を操作して時刻（例えば検索対象の時間帯）を入力する。この場合、ユーザに指定された時間帯に開始時刻から終了時刻までの時間が含まれる軌跡が検索される。または、検索条件として撮影場所が指定されなくてもよい。その場合、撮影場所にかかわらず、類似する軌跡が検索される。 In the examples of FIGS. 6 and 7, the shooting location and the trajectory are specified as search conditions, but a time may additionally be specified. In this case, a time designation area (not shown) is further provided on the search screen shown in FIG. 7, and the user operates the input device 105 to input a time (for example, the time period to be searched). Trajectories whose span from start time to end time falls within the time period specified by the user are then searched. Alternatively, the shooting location need not be specified as a search condition, in which case similar trajectories are searched regardless of the shooting location.
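The time condition above amounts to an interval containment test, as in the sketch below. The record layout (dictionaries with `start` and `end` keys) and the function name are assumptions made for illustration only.

```python
from datetime import datetime


def tracks_in_window(tracks, window_start, window_end):
    """Keep trajectories whose whole start..end span lies inside the window.

    tracks: iterable of dicts with datetime values under "start" and "end".
    """
    return [t for t in tracks
            if window_start <= t["start"] and t["end"] <= window_end]
```

A trajectory that begins inside the window but ends after it is excluded under this reading, since its span is not fully contained in the specified time period.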

上述のように、実施例１によれば、１種類以上の特徴ベクトルに基づいて移動物体の画像を検索することができる。実際に撮影された画像から取得された特徴ベクトル、例えば映像再生エリア７０１に表示された映像から切り出された顔画像の特徴ベクトル、だけでなく、ユーザが直接入力した特徴ベクトル、例えばユーザによって色見本から選択された色特徴ベクトルを検索キーとして指定することができる。 As described above, according to the first embodiment, images of a moving object can be searched based on one or more types of feature vectors. The search key can be not only a feature vector obtained from an actually captured image, such as the feature vector of a face image cut out from the video displayed in the video playback area 701, but also a feature vector entered directly by the user, such as a color feature vector selected by the user from a color sample.

検索しようとする人物が所持していると推定される所持品（例えば鞄）と類似する所持品を所持する別の人物の画像が得られる場合、その画像から抽出された所持品の特徴ベクトルを検索キーとして指定することもできる。 When an image of another person who possesses possession similar to the possession that is estimated to be possessed by the person to be searched (for example, bag) is obtained, the feature vector of the possession extracted from the image is obtained. It can also be specified as a search key.

複数種類の特徴ベクトルを検索キーとして検索が行われる場合、それらの検索結果は重み付けをした上で統合される。例えばユーザが特に重視したい特徴ベクトルがある場合、その特徴ベクトルの重みを大きくすることができる。ユーザは、設定された検索キーの確度またはそれを用いた移動物体の識別のしやすさ等に基づいて重みを設定してもよい。 When a search is performed using a plurality of types of feature vectors as search keys, the search results are weighted and then integrated. For example, when there is a feature vector that the user wants to emphasize in particular, the weight of that feature vector can be increased. The user may set the weights based on, for example, the reliability of each search key or how easily a moving object can be identified with it.

サーバ計算機１０７は、例えば顔領域に含まれる画像の「顔らしさ」を判定して、顔らしさが高いほど重みが大きくなるように重みを自動設定してもよい。他の顕著領域についても同様である。 For example, the server computer 107 may determine “face-likeness” of an image included in the face area and automatically set the weight so that the higher the face-likeness, the larger the weight. The same applies to other salient regions.

実施例１によれば、実際に撮影された検索対象の移動物体の画像の特徴ベクトルだけでなく、例えば目撃情報またはその他の情報に基づいて推定される１種類以上の特徴ベクトルを検索キーとして用い、さらにそれらの重みを任意に設定することによって、種々の情報源からの情報を統合した検索キーを用いた検索など、自由度の高い検索を実現することができる。 According to the first embodiment, not only the feature vector of the image of the moving object that is actually photographed but also one or more types of feature vectors estimated based on, for example, sighting information or other information are used as the search key. Further, by arbitrarily setting these weights, it is possible to realize a search with a high degree of freedom such as a search using a search key that integrates information from various information sources.

なお、画像検索システム１００は、特徴ベクトル入力部１７１、類似特徴ベクトル検索部１７２、検索結果統合部１７３、入力装置１０５、表示装置１０６を持たない情報処理装置として機能してもよい。この場合、サーバ計算機１０７は、映像撮影装置１０２に組み込まれてもよい。 Note that the image search system 100 may function as an information processing apparatus that does not include the feature vector input unit 171, the similar feature vector search unit 172, the search result integration unit 173, the input device 105, and the display device 106. In this case, the server computer 107 may be incorporated in the video shooting device 102.

次に、実施例２の画像検索システム１００について説明する。実施例１においては、図４のステップＳ４１２〜Ｓ４１８において検出及び抽出の処理が実行されない場合、図６のステップＳ６１６、Ｓ６２３、Ｓ６３３、Ｓ６４３における検索の対象となるデータ量が減少する場合がある。 Next, the image search system 100 according to the second embodiment will be described. In the first embodiment, when the detection and extraction processes are not executed in steps S412 to S418 in FIG. 4, the amount of data to be searched in steps S616, S623, S633, and S643 in FIG. 6 may decrease.

このようなデータ量減少を解消するため、実施例２においては、特徴量がデータベースに登録されていないフレームについて、追加で特徴登録処理を行う。実施例２の画像検索システム１００について、図８〜図１２を用いて説明する。 In order to eliminate such a decrease in the data amount, in the second embodiment, an additional feature registration process is performed for frames whose feature amounts are not registered in the database. An image search system 100 according to the second embodiment will be described with reference to FIGS.

図８及び図９は、実施例２の画像検索システム１００の全体構成図である。実施例１記載の画像検索システム１００と異なる点を説明する。実施例２の画像検索システム１００は、図８記載の追加登録実行判定部８０１及び追加登録条件入力部８０２を含む。さらに、実施例２の画像検索システム１００は、図９記載の、追加登録処理内容判定部９０１、フレーム記録部９０３、顔検出部９１１、顔特徴抽出部９１２、顔特徴記録部９１３、頭部検出部９２１、頭部特徴抽出部９２２、頭部特徴記録部９２３、服飾検出部９３１、服飾特徴抽出部９３２、服飾特徴記録部９３３を含む。 FIGS. 8 and 9 are overall configuration diagrams of the image search system 100 according to the second embodiment. The differences from the image search system 100 described in the first embodiment are explained below. The image search system 100 according to the second embodiment includes the additional registration execution determination unit 801 and the additional registration condition input unit 802 shown in FIG. 8. It further includes the units shown in FIG. 9: an additional registration processing content determination unit 901, a frame recording unit 903, a face detection unit 911, a face feature extraction unit 912, a face feature recording unit 913, a head detection unit 921, a head feature extraction unit 922, a head feature recording unit 923, a clothing detection unit 931, a clothing feature extraction unit 932, and a clothing feature recording unit 933.

追加登録実行判定部８０１は、追加登録条件入力部８０２から受け付けた特徴ベクトルと、軌跡特徴記録部１１２、顔特徴記録部１３３、頭部特徴記録部１４３、服飾特徴記録部１５３から受けつけた各特徴ベクトルを照合し、照合結果に応じて、後述する特徴追加登録のフロー（図１１）を開始する。追加登録条件入力部８０２は、ユーザが入力装置１０３を用いて入力した内容を受け付ける。 The additional registration execution determination unit 801 receives the feature vector received from the additional registration condition input unit 802, and the features received from the trajectory feature recording unit 112, the face feature recording unit 133, the head feature recording unit 143, and the clothing feature recording unit 153. The vectors are collated, and a feature addition registration flow (FIG. 11) described later is started according to the collation result. The additional registration condition input unit 802 receives content input by the user using the input device 103.

フレーム記録部９０３、顔検出部９１１、顔特徴抽出部９１２、顔特徴記録部９１３、頭部検出部９２１、頭部特徴抽出部９２２、頭部特徴記録部９２３、服飾検出部９３１、服飾特徴抽出部９３２、服飾特徴記録部９３３は、図１記載のフレーム記録部１１３、顔検出部１３１、顔特徴抽出部１３２、顔特徴記録部１３３、頭部検出部１４１、頭部特徴抽出部１４２、頭部特徴記録部１４３、服飾検出部１５１、服飾特徴抽出部１５２、服飾特徴記録部１５３と同様であるが、それぞれ独立に動作することが可能である。 The frame recording unit 903, face detection unit 911, face feature extraction unit 912, face feature recording unit 913, head detection unit 921, head feature extraction unit 922, head feature recording unit 923, clothing detection unit 931, clothing feature extraction unit 932, and clothing feature recording unit 933 are the same as the frame recording unit 113, face detection unit 131, face feature extraction unit 132, face feature recording unit 133, head detection unit 141, head feature extraction unit 142, head feature recording unit 143, clothing detection unit 151, clothing feature extraction unit 152, and clothing feature recording unit 153 shown in FIG. 1, respectively, but each of them can operate independently.

実施例２の画像検索システム１００のハードウェア構成は、実施例１における画像検索システム１００のハードウェア構成（図２）と同様であり、説明を省略する。実施例２の映像データベース１６１の構成及びデータ例については、実施例１の映像データベース１６１の構成及びデータ例（図３）と同様であり、説明を省略する。 The hardware configuration of the image search system 100 according to the second embodiment is the same as the hardware configuration (FIG. 2) of the image search system 100 according to the first embodiment, and a description thereof will be omitted. The configuration and data example of the video database 161 of the second embodiment are the same as the configuration and data example (FIG. 3) of the video database 161 of the first embodiment, and the description thereof is omitted.

図１０は、実施例２のサーバ計算機１０７が、入力された映像を登録する処理を説明するフローチャートである。以下、図４との相違点を中心に説明する。ステップＳ４０１〜ステップＳ４１９については、図４と同様である。 FIG. 10 is a flowchart illustrating a process in which the server computer 107 according to the second embodiment registers an input video. Hereinafter, the difference from FIG. 4 will be mainly described. Steps S401 to S419 are the same as those in FIG.

次に、追加登録実行判定部８０１は、特徴量を追加で登録するかを判定する追加特徴量登録判定を実行する（ステップＳ１００１）。具体的には、追加登録実行判定部８０１は、追加登録条件入力部８０２より特徴ベクトルを受け付け、軌跡特徴記録部１１２、顔特徴記録部１３３、頭部特徴記録部１４３、服飾特徴記録部１５３から受け付けた特徴ベクトルのうち該当する種類の特徴ベクトルとの照合を行い、特徴ベクトルの類似度が閾値より高い場合には、後述する特徴追加登録のフロー（図１１）を開始する（ステップＳ１００１）。 Next, the additional registration execution determination unit 801 executes additional feature amount registration determination for determining whether to additionally register a feature amount (step S1001). Specifically, the additional registration execution determination unit 801 receives a feature vector from the additional registration condition input unit 802, and from the trajectory feature recording unit 112, the face feature recording unit 133, the head feature recording unit 143, and the clothing feature recording unit 153. Among the received feature vectors, matching is performed with a corresponding type of feature vector, and when the similarity between the feature vectors is higher than a threshold value, a later-described feature addition registration flow (FIG. 11) is started (step S1001).

例えば、追加登録実行判定部８０１が追加登録条件入力部８０２より軌跡特徴ベクトルを受け付けた場合、追加登録実行判定部８０１は、軌跡特徴記録部１１２から受け付けた軌跡特徴ベクトルと追加登録条件入力部８０２より受け付けた軌跡特徴ベクトルの類似度が規定値を超える場合、特徴追加登録のフローを開始する。これにより、問題となる動きを示す移動物体の未登録の特徴量を登録できる。 For example, when the additional registration execution determination unit 801 receives a trajectory feature vector from the additional registration condition input unit 802, the additional registration execution determination unit 801 receives the trajectory feature vector received from the trajectory feature recording unit 112 and the additional registration condition input unit 802. If the degree of similarity of the trajectory feature vector accepted exceeds the specified value, the flow of feature addition registration is started. Thereby, the unregistered feature amount of the moving object showing the problematic motion can be registered.

または、追加登録実行判定部８０１が追加登録条件入力部８０２より顔特徴ベクトルを受け付けた場合、追加登録実行判定部８０１は、顔特徴記録部１３３から受け付けた顔特徴ベクトルと追加登録条件入力部８０２より受け付けた顔特徴ベクトルの類似度が規定値を超える場合、特徴追加登録のフローを開始させる。問題となる人物の未登録の特徴量を登録できる。 Alternatively, when the additional registration execution determination unit 801 receives a face feature vector from the additional registration condition input unit 802, the additional registration execution determination unit 801 receives the face feature vector received from the face feature recording unit 133 and the additional registration condition input unit 802. If the similarity of the received face feature vector exceeds the specified value, the feature addition registration flow is started. Unregistered feature quantities of the person in question can be registered.

なお、追加登録実行判定部８０１は、追加登録条件入力部８０２より特徴ベクトルを受け付ける以外に、特徴ベクトルを決定するための情報を受け付けてもよい。追加登録実行判定部８０１は、例えば、追加登録条件入力部８０２より人物の顔画像が含まれる画像と画像からの特徴抽出の対象とする顕著領域の指定情報を受け取った場合、顔検出部９１１、顔特徴抽出部９１２を使用して画像から判定に使用する特徴ベクトルを抽出する。 Note that the additional registration execution determination unit 801 may receive information for determining a feature vector in addition to receiving a feature vector from the additional registration condition input unit 802. For example, when the additional registration execution determination unit 801 receives from the additional registration condition input unit 802 the image including the face image of the person and the designation information of the saliency area to be subjected to feature extraction from the image, the face detection unit 911, A feature vector used for determination is extracted from the image using the face feature extraction unit 912.

そのほか、追加登録実行判定部８０１が軌跡特徴記録部１１２から受け付けた軌跡特徴ベクトルに含まれるデータ長が閾値より小さい場合に、追加登録実行判定部８０１は、特徴追加登録のフロー（図１１）を開始してもよい。上記処理により、映像内に映っている時間の短い人物の特徴ベクトルの登録が未実行となることを回避できる。 In addition, the additional registration execution determination unit 801 may start the feature addition registration flow (FIG. 11) when the data length contained in the trajectory feature vector received from the trajectory feature recording unit 112 is smaller than a threshold. This processing prevents the feature vectors of a person who appears in the video only briefly from being left unregistered.
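The two triggers for step S1001 described above (similarity to a condition vector above a threshold, or a trajectory that is too short) can be sketched as follows. Cosine similarity, both threshold values, and the function names are assumptions made for the example; the embodiment does not name a similarity measure.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def should_start_additional_registration(condition_vec, recorded_vec,
                                         similarity_threshold=0.8,
                                         track_length=None, min_length=None):
    """Decide whether to start the feature addition registration flow.

    Returns True when the recorded vector resembles the condition vector,
    or when an optional trajectory length is below an optional minimum.
    """
    if track_length is not None and min_length is not None:
        if track_length < min_length:
            return True  # short appearance: register remaining features anyway
    return cosine_similarity(condition_vec, recorded_vec) > similarity_threshold
```

Only vectors of the same type would be compared in practice, for example a trajectory condition against a recorded trajectory feature vector.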

図１１は、実施例２のサーバ計算機１０７が、映像データベース１６１に登録されているフレームに対して実行する特徴追加登録処理についてのフローチャートである。図１１記載のフローは、図１０記載のステップＳ１００１における判定結果に基づき実行される。以下、図１１の各ステップについて説明する。 FIG. 11 is a flowchart of a feature addition registration process executed by the server computer 107 according to the second embodiment for a frame registered in the video database 161. The flow illustrated in FIG. 11 is executed based on the determination result in step S1001 illustrated in FIG. Hereinafter, each step of FIG. 11 will be described.

まず、追加登録処理内容判定部９０１は、追加登録実行判定部８０１より処理中のフレームＩＤを受け付け、フレーム画像管理情報３４０を参照して、同一映像における処理中のフレームの直前フレームのフレームＩＤを決定する。追加登録処理内容判定部９０１は、さらに、フレーム画像管理情報３４０から、直前フレームのフレームＩＤから追加登録条件入力部８０２より受けつけたフレーム数だけ遡ったフレームのフレームＩＤを決定し、それらフレームＩＤの範囲内に含まれる同映像ＩＤのフレームＩＤを選択する（ステップＳ１１０１）。 First, the additional registration processing content determination unit 901 receives the ID of the frame being processed from the additional registration execution determination unit 801 and, referring to the frame image management information 340, determines the frame ID of the frame immediately preceding it in the same video. The additional registration processing content determination unit 901 then determines, from the frame image management information 340, the frame ID of the frame that lies the number of frames received from the additional registration condition input unit 802 before that immediately preceding frame, and selects the frame IDs with the same video ID that fall within the range between those two frame IDs (step S1101).
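The frame selection in step S1101 might look like the sketch below. It assumes the frame image management information can be read as a list of (frame ID, video ID) pairs in recording order, and it interprets the received frame count as the maximum number of earlier frames to return, ending at the immediately preceding frame; both points are assumptions, as is the function name.

```python
def select_previous_frames(frames, current_frame_id, count):
    """Return up to `count` frame IDs that precede the current frame in its video.

    frames: list of (frame_id, video_id) tuples in recording order.
    """
    position = {frame_id: i for i, (frame_id, _) in enumerate(frames)}
    current = position[current_frame_id]
    video_id = frames[current][1]
    # Earlier frames of the same video only; other videos may be interleaved.
    earlier = [fid for fid, vid in frames[:current] if vid == video_id]
    return earlier[-count:] if count > 0 else []
```

The returned IDs are in recording order, oldest first, so the last element is the frame immediately before the one being processed.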

次に、サーバ計算機１０７内の各部が、ステップＳ１１０１で抽出された各フレームに対して、ステップＳ１１０３〜Ｓ１１０７を実行する。追加登録処理内容判定部９０１は、フレーム画像管理情報３４０における該当フレームＩＤに対応する顔処理フィールド３４４、頭部処理フィールド３４５、服飾処理フィールド３４６を確認し、顔検出部１３１、頭部検出部１４１、服飾検出部１５１、顔特徴抽出部１３２、頭部特徴抽出部１４２、服飾特徴抽出部１５２による検出処理及び特徴抽出処理のうち未実行の処理があるか確認する（ステップＳ１１０３）。 Next, each unit in the server computer 107 executes steps S1103 to S1107 for each frame extracted in step S1101. The additional registration processing content determination unit 901 confirms the face processing field 344, the head processing field 345, and the clothing processing field 346 corresponding to the corresponding frame ID in the frame image management information 340, and the face detection unit 131 and the head detection unit 141. Then, it is confirmed whether there is any unexecuted processing among the detection processing and the feature extraction processing by the clothing detection unit 151, the face feature extraction unit 132, the head feature extraction unit 142, and the clothing feature extraction unit 152 (step S1103).

ステップＳ１１０３において確認された未実行の検出処理について（ステップＳ１１０３：ＥＬＳＥ）、顔検出部９１１、頭部検出部９２１、服飾検出部９３１は、それぞれ、検出処理を実行する（ステップＳ１１０４）。 For each detection process found in step S1103 to be unexecuted (step S1103: ELSE), the face detection unit 911, the head detection unit 921, and the clothing detection unit 931 each execute the corresponding detection process (step S1104).

顔検出部９１１、頭部検出部９２１、服飾検出部９３１は、ステップＳ１１０４で実行した検出処理が成功しているか判定する（ステップＳ１１０５）。ステップＳ１１０３の未実行の処理の確認の結果、未実行の検出処理が無い場合（ステップＳ１１０３：ＳＫＩＰ）、ステップＳ１１０４、Ｓ１１０５は実行されない。 The face detection unit 911, the head detection unit 921, and the clothing detection unit 931 determine whether the detection process executed in step S1104 is successful (step S1105). As a result of checking the unexecuted process in step S1103, if there is no unexecuted detection process (step S1103: SKIP), steps S1104 and S1105 are not executed.

ステップＳ１１０５においてステップＳ１１０４における検出処理が成功していると判定された場合（ステップＳ１１０５：ＹＥＳ）、または、ステップＳ１１０３において検出処理が実行済みであるが特徴抽出処理が未実行であると判定された場合（ステップＳ１１０３：ＳＫＩＰ）、顔特徴抽出部９１２、頭部特徴抽出部９２２、服飾特徴抽出部９３２のうち該当する抽出部は、特徴抽出処理を実行する（ステップＳ１１０６）。ステップＳ１１０６における特徴抽出処理は、ステップＳ４１６と同様である。 If it is determined in step S1105 that the detection process in step S1104 is successful (step S1105: YES), or it is determined in step S1103 that the detection process has been performed but the feature extraction process has not been performed. In the case (step S1103: SKIP), the corresponding extraction unit among the face feature extraction unit 912, the head feature extraction unit 922, and the clothing feature extraction unit 932 executes feature extraction processing (step S1106). The feature extraction process in step S1106 is the same as that in step S416.

ステップＳ１１０５においてステップＳ１１０４における検出処理が失敗と判定された場合（ステップＳ１１０５：ＮＯ）、当該領域における特徴抽出処理（ステップＳ１１０６）は省略される。 If it is determined in step S1105 that the detection process in step S1104 has failed (step S1105: NO), the feature extraction process (step S1106) in the area is omitted.

顔特徴記録部９１３、頭部特徴記録部９２３、服飾特徴記録部９３３は、ステップＳ１１０６において該当する特徴量が抽出された場合、特徴ベクトル（特徴量）を記録する（ステップＳ１１０７）。ステップＳ１１０７における処理は、ステップＳ４１８における処理内容と同様である。 The face feature recording unit 913, head feature recording unit 923, and clothing feature recording unit 933 record a feature vector (feature amount) when the corresponding feature amount is extracted in step S1106 (step S1107). The processing in step S1107 is the same as the processing content in step S418.
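Steps S1103 to S1107 for a single frame can be summarized by the following sketch. The three-value state per region ("none", "detected", "extracted") stands in for the processing fields 344 to 346, whose actual encoding is not given in this passage; the `detect`, `extract`, and `record` callables and the frame dictionary layout are hypothetical.

```python
def add_missing_features(frame, detect, extract, record):
    """Run only the pending detection and extraction for one frame.

    frame["state"] maps a region name to "none", "detected" or "extracted";
    frame["boxes"] holds detection results from earlier passes.
    """
    for region, state in list(frame["state"].items()):
        if state == "extracted":
            continue                      # S1103: nothing pending for this region
        if state == "none":
            box = detect(region, frame["image"])           # S1104
            if box is None:
                continue                  # S1105: NO, so extraction is skipped
            frame["boxes"][region] = box
        box = frame["boxes"][region]      # detected now or in an earlier pass
        vector = extract(region, frame["image"], box)      # S1106
        record(frame["id"], region, vector)                # S1107
        frame["state"][region] = "extracted"
```

Because each region is handled on its own, a failed face detection does not stop the head or clothing features of the same frame from being registered.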

図１１記載の特徴追加登録のフローは、図１０記載の映像登録のフローとは、独立、または、並列に実行可能である。このため、実施例２においては、図１０記載の映像登録のフローの実行時間を遅延させることなく、図６記載の類似軌跡検索Ｓ６１６、類似顔検索Ｓ６２３、類似頭部検索Ｓ６３３、類似服飾検索Ｓ６４３の検索対象とする特徴ベクトル（特徴量）のデータ量を拡大することが可能となり、検索精度の低下を回避することが可能となる。 The feature addition registration flow shown in FIG. 11 can be executed independently or in parallel with the video registration flow shown in FIG. Therefore, in the second embodiment, the similar locus search S616, the similar face search S623, the similar head search S633, and the similar clothing search S643 illustrated in FIG. 6 are performed without delaying the execution time of the video registration flow illustrated in FIG. It is possible to expand the data amount of the feature vector (feature amount) to be searched, and to avoid a decrease in search accuracy.

図１２は、実施例２のサーバ計算機１０７が映像データベース１６１に登録済みのフレーム画像のうち特徴ベクトルの登録が未実行のフレーム画像に対して、特徴ベクトルの登録を追加で行うための条件を設定するための設定画面の説明図である。 FIG. 12 illustrates a condition for additionally registering a feature vector for a frame image for which feature vector registration has not been performed among the frame images registered in the video database 161 by the server computer 107 according to the second embodiment. It is explanatory drawing of the setting screen for doing.

特徴ベクトルの追加登録条件の設定画面は、表示装置１０４によって表示され、撮影場所ＩＤ設定セクション１２０１、禁止軌跡設定セクション１２０２、許可対象特徴設定セクション１２０５、ブラックリスト特徴設定セクション１２０６、追加登録フレーム数設定セクション１２０７を含む。 The feature vector additional registration condition setting screen is displayed by the display device 104, and includes a shooting location ID setting section 1201, a prohibited locus setting section 1202, a permission target feature setting section 1205, a blacklist feature setting section 1206, and an additional registration frame number setting. Section 1207 is included.

禁止軌跡設定セクション１２０２は、ユーザが撮影場所ＩＤ設定セクション１２０１により選択した撮影場所ＩＤに対応する背景画像データを表示する。ユーザは、入力装置１０３であるマウスをドラッグすることで、禁止対象とする禁止軌跡１２０４を指定することが可能である。禁止軌跡設定セクション１２０２で設定した禁止軌跡１２０４は、追加登録条件入力部８０２に受け付けられる。 The prohibited locus setting section 1202 displays background image data corresponding to the shooting location ID selected by the user in the shooting location ID setting section 1201. The user can specify a prohibited locus 1204 to be prohibited by dragging the mouse which is the input device 103. The prohibited locus 1204 set in the prohibited locus setting section 1202 is received by the additional registration condition input unit 802.

禁止軌跡１２０４は、追加登録実行判定部８０１が追加登録条件入力部８０２から禁止軌跡を受け付けた際、軌跡ベクトルに変換される。禁止軌跡設定セクション１２０２において禁止軌跡１２０４を入力した場合、図１０記載のステップＳ１００１において、類似する軌跡が存在するかの判定が行われ、存在する場合、図１１記載のベクトルの特徴追加登録のフローが開始される。 The prohibited locus 1204 is converted into a locus vector when the additional registration execution determination unit 801 receives a prohibited locus from the additional registration condition input unit 802. When the prohibited locus 1204 is input in the prohibited locus setting section 1202, it is determined in step S1001 shown in FIG. 10 whether or not a similar locus exists, and if there is, the flow of vector feature addition registration shown in FIG. Is started.

許可対象特徴設定セクション１２０５では、許可対象とする人物の画像ファイルと、その画像ファイルから抽出対象とする特徴ベクトルを設定することが可能である。例えば、特徴ベクトルの入力セクションで顔を指定し、ファイル入力セクションで人物の全身が写った画像ファイルを指定し、登録処理ボタンを押すと、指定した情報が許可対象一覧に追加されるとともに、追加登録条件入力部８０２に送られる。追加登録実行判定部８０１は、追加登録条件入力部８０２に送られた情報を受けつけた際、顔検出部９１１と顔特徴抽出部９１２により、顔特徴ベクトルに変換して判定に使用する。 In the permission target feature setting section 1205, the user can set an image file of a person to be permitted and the feature vector to be extracted from that image file. For example, when the user specifies the face in the feature vector input section, specifies an image file showing the person's whole body in the file input section, and presses the registration button, the specified information is added to the permitted target list and sent to the additional registration condition input unit 802. When the additional registration execution determination unit 801 receives the information sent to the additional registration condition input unit 802, it converts the information into a face feature vector using the face detection unit 911 and the face feature extraction unit 912 and uses that vector for the determination.

許可対象特徴設定セクション１２０５において許可対象とする特徴ベクトル（許可対象特徴ベクトル）を登録した場合、図１０記載のステップＳ１００１において、禁止軌跡１２０４と許可対象特徴設定セクション１２０５で設定された許可対象特徴ベクトルの両方が判定に使用される。このように、許可対象の特徴ベクトルは、禁止軌跡１２０４の判定に付随して使用される。 When a feature vector to be permitted (permitted target feature vector) is registered in the permitted target feature setting section 1205, both the prohibited locus 1204 and the permitted target feature vector set in the permitted target feature setting section 1205 are used for the determination in step S1001 of FIG. 10. In this way, the permitted target feature vector is used as an adjunct to the determination of the prohibited locus 1204.

一例として、禁止軌跡１２０４を設定し、さらに、許可対象特徴ベクトルとして特定の人物の顔を設定した場合について説明する。ステップＳ１００１において、追加登録実行判定部８０１により、禁止軌跡１２０４に類似する軌跡が存在すると判定された場合、追加登録実行判定部８０１は、さらに、映像データベース１６１からその軌跡を移動している人物の顔特徴量（顔特徴ベクトル）を抽出し、抽出した顔特徴ベクトルと、許可対象特徴ベクトルとして設定された顔特徴ベクトルの比較を行い、類似度が高いと判定した場合は、特徴追加登録のフロー（図１１）を開始しない。 As an example, consider the case where a prohibited locus 1204 is set and the face of a specific person is set as the permitted target feature vector. When the additional registration execution determination unit 801 determines in step S1001 that a locus similar to the prohibited locus 1204 exists, it further extracts from the video database 161 the facial feature amount (facial feature vector) of the person moving along that locus and compares the extracted facial feature vector with the facial feature vector set as the permitted target feature vector. If it determines that the similarity is high, it does not start the feature addition registration flow (FIG. 11).

以上のように、禁止軌跡１２０４と許可対象特徴ベクトルを組み合わせて使用すれば、例えば、以下のような運用が可能である。ユーザは、特定の人物以外立ち入り禁止の進入禁止エリア１２０３が存在する場合、禁止軌跡設定セクション１２０２で進入禁止エリア１２０３に侵入する禁止軌跡１２０４を指定し、かつ、許可対象特徴設定セクション１２０５で進入禁止エリア１２０３に進入してもよい人物の特徴ベクトルを設定する。これにより、許可されていない人物が進入禁止エリア１２０３に進入した場合のみ、図１１記載の特徴追加登録のフローが開始される。 As described above, using the prohibited locus 1204 in combination with the permitted target feature vector enables, for example, the following operation. When there is an entry prohibition area 1203 that is off limits to everyone except specific persons, the user specifies, in the prohibited locus setting section 1202, a prohibited locus 1204 that enters the entry prohibition area 1203 and sets, in the permitted target feature setting section 1205, the feature vectors of the persons who may enter the entry prohibition area 1203. As a result, the feature addition registration flow shown in FIG. 11 is started only when an unauthorized person enters the entry prohibition area 1203.
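As a non-limiting illustration, the combined determination described above can be sketched as follows. The function names, the use of cosine similarity, and the threshold 0.8 are assumptions made for this sketch; the embodiment only states that the flow is not started when the similarity to a permitted face is judged to be high. How a frame with no detectable face is handled is not addressed here.

```python
import math
from typing import Iterable, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def should_start_additional_registration(
        trajectory_matches_prohibited: bool,
        face_vector: Sequence[float],
        permitted_vectors: Iterable[Sequence[float]],
        face_threshold: float = 0.8) -> bool:
    """Start the flow only for a prohibited locus walked by a non-permitted person."""
    if not trajectory_matches_prohibited:
        # The permitted list is consulted only as an adjunct to the locus test.
        return False
    for permitted in permitted_vectors:
        if cosine_similarity(face_vector, permitted) >= face_threshold:
            return False  # a permitted person: do not start the flow
    return True
```

With an empty permitted list the function reduces to the plain prohibited-locus test, which matches the case where no permitted target feature vector has been registered.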

ブラックリスト特徴設定セクション１２０６では、追加の特徴登録を実施すべき対象の特徴ベクトル（特徴量）を設定することが可能である。例えば、特徴ベクトルの入力セクションで服飾を指定し、ファイル入力セクションで人物の全身が写った画像ファイルを指定し、登録処理ボタンを押すと、指定した情報がブラックリスト一覧に追加されるとともに、追加登録条件入力部８０２に送られる。追加登録実行判定部８０１は、追加登録条件入力部８０２に送られた情報を受けつけた際、服飾検出部９３１と服飾特徴抽出部９３２により、服飾特徴ベクトルに変換して判定に使用する。 In the blacklist feature setting section 1206, the user can set the feature vector (feature amount) of a target for which additional feature registration should be performed. For example, when the user specifies clothing in the feature vector input section, specifies an image file showing the whole body of a person in the file input section, and presses the registration process button, the specified information is added to the blacklist and sent to the additional registration condition input unit 802. On receiving the information sent to the additional registration condition input unit 802, the additional registration execution determination unit 801 converts it into a clothing feature vector using the clothing detection unit 931 and the clothing feature extraction unit 932 and uses that vector for the determination.

ブラックリスト特徴設定セクション１２０６においてブラックリスト対象とする特徴ベクトル（ブラックリスト特徴ベクトル）を登録した場合、図１０記載のステップＳ１００１において、類似する特徴ベクトルが存在するかの判定が行われ、類似する特徴量が存在する場合、図１１記載の特徴追加登録のフローが開始される。この判定は、上述の禁止軌跡１２０４の判定とは独立に実行される。 When a feature vector to be blacklisted (blacklist feature vector) is registered in the blacklist feature setting section 1206, step S1001 of FIG. 10 determines whether a similar feature vector exists; if a similar feature amount exists, the feature addition registration flow shown in FIG. 11 is started. This determination is performed independently of the determination of the prohibited locus 1204 described above.

ブラックリスト特徴設定セクション１２０６におけるブラックリスト特徴ベクトルの設定により、以下の運用が可能となる。進入禁止エリア１２０３によらず監視すべき人物が存在する場合、ユーザは、ブラックリスト特徴ベクトルに該当する人物の特徴ベクトルを設定する。これにより、該当する人物が映像内に存在する場合、図１１記載の特徴追加登録のフローが開始される。 Setting a blacklist feature vector in the blacklist feature setting section 1206 enables the following operation. When there is a person who should be monitored regardless of the entry prohibition area 1203, the user sets that person's feature vector as the blacklist feature vector. Then, when that person appears in the video, the feature addition registration flow shown in FIG. 11 is started.
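As a non-limiting illustration, the blacklist determination and its independence from the prohibited-locus determination might be sketched as follows. The `BlacklistEntry` type, the feature kind labels such as "clothing", the cosine measure, and the threshold 0.8 are hypothetical names and values chosen for this sketch only.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass(frozen=True)
class BlacklistEntry:
    kind: str                # feature type chosen by the user, e.g. "clothing"
    vector: Sequence[float]  # feature vector extracted from the registered image


def _cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return 0.0 if na == 0.0 or nb == 0.0 else dot / (na * nb)


def matches_blacklist(person_features: Dict[str, Sequence[float]],
                      blacklist: List[BlacklistEntry],
                      threshold: float = 0.8) -> bool:
    """True if any feature of the observed person resembles a blacklist entry
    of the same feature kind."""
    for entry in blacklist:
        observed = person_features.get(entry.kind)
        if observed is not None and _cosine(observed, entry.vector) >= threshold:
            return True
    return False


def step_s1001(prohibited_branch: bool, blacklist_branch: bool) -> bool:
    """The two determinations are independent; either one starts the flow."""
    return prohibited_branch or blacklist_branch
```

Comparing only features of the same kind keeps a clothing vector from ever being matched against a face vector, which would be meaningless even if the dimensions happened to agree.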

なお、許可対象特徴ベクトルやブラックリスト特徴ベクトルには、たとえば、顔、服の色、帽子の色、ヘルメットの色や形状、服やヘルメットのロゴマークなどを指定してもよい。追加登録フレーム数設定セクション１２０７では、図１１記載の特徴追加登録のフローにおいて登録の対象とするフレーム数を設定することが可能である。 Note that the permitted target feature vector and the blacklist feature vector may specify, for example, a face, the color of clothes, the color of a hat, the color or shape of a helmet, or a logo mark on clothes or a helmet. In the additional registration frame number setting section 1207, the user can set the number of frames to be registered in the feature addition registration flow shown in FIG. 11.

実施例２において、図１１記載の特徴追加登録のフローチャートが図１０記載のステップにＳ１００１における判定によって開始する場合について説明したが、図１１記載の特徴追加登録のフローは、図１０記載のフローとは無関係に周期的に開始してもよい。たとえば、１０分おきに開始するなどであってもよい。この場合、図１０記載のステップＳ１００１は実行しなくてもよい。 The second embodiment describes the case where the feature addition registration flow shown in FIG. 11 is started by the determination in step S1001 of FIG. 10; however, the feature addition registration flow shown in FIG. 11 may instead be started periodically, independently of the flow shown in FIG. 10. For example, it may be started every 10 minutes. In this case, step S1001 of FIG. 10 need not be executed.
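As a non-limiting illustration, the periodic start could be sketched with a small trigger object. The class name `PeriodicTrigger` and the choice to pass the current time in explicitly are assumptions of this sketch; 600 seconds corresponds to the 10-minute example above.

```python
from typing import Optional


class PeriodicTrigger:
    """Decides when the feature addition registration flow should start,
    without consulting the step S1001 determination at all."""

    def __init__(self, interval_seconds: float = 600.0) -> None:
        if interval_seconds <= 0:
            raise ValueError("interval must be positive")
        self.interval_seconds = interval_seconds
        self._last_start: Optional[float] = None

    def should_start(self, now: float) -> bool:
        """Return True (and remember `now`) if the interval has elapsed."""
        if (self._last_start is None
                or now - self._last_start >= self.interval_seconds):
            self._last_start = now
            return True
        return False
```

In a running system `now` would typically come from a monotonic clock such as `time.monotonic()`, so that wall-clock adjustments do not shorten or lengthen the period.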

なお、本発明は上記した実施例に限定されるものではなく、様々な変形例が含まれる。例えば、上記した実施例は本発明を分かりやすく説明するために詳細に説明したものであり、必ずしも説明したすべての構成を備えるものに限定されるものではない。また、ある実施例の構成の一部を他の実施例の構成に置き換えることが可能であり、また、ある実施例の構成に他の実施例の構成を加えることも可能である。また、各実施例の構成の一部について、他の構成の追加・削除・置換をすることが可能である。 The present invention is not limited to the embodiments described above and includes various modifications. For example, the embodiments above are described in detail in order to explain the present invention clearly, and the invention is not necessarily limited to configurations that include every element described. Part of the configuration of one embodiment can be replaced with the configuration of another embodiment, and the configuration of another embodiment can be added to the configuration of one embodiment. For part of the configuration of each embodiment, other configurations can be added, deleted, or substituted.

また、上記の各構成・機能・処理部等は、それらの一部又は全部を、例えば集積回路で設計する等によりハードウェアで実現してもよい。また、上記の各構成、機能等は、プロセッサがそれぞれの機能を実現するプログラムを解釈し、実行することによりソフトウェアで実現してもよい。各機能を実現するプログラム、テーブル、ファイル等の情報は、メモリや、ハードディスク、ＳＳＤ（ＳｏｌｉｄＳｔａｔｅＤｒｉｖｅ）等の記録装置、または、ＩＣカード、ＳＤカード等の記録媒体に置くことができる。 Each of the above configurations, functions, processing units, and the like may be realized partly or wholly in hardware, for example by designing them as an integrated circuit. Each of the above configurations, functions, and the like may also be realized in software, with a processor interpreting and executing a program that realizes each function. Information such as programs, tables, and files that realize each function can be stored in a memory, in a recording device such as a hard disk or an SSD (Solid State Drive), or on a recording medium such as an IC card or an SD card.

また、制御線や情報線は説明上必要と考えられるものを示しており、製品上必ずしもすべての制御線や情報線を示しているとは限らない。実際には殆どすべての構成が相互に接続されていると考えてもよい。 In addition, the control lines and information lines are those that are considered necessary for the explanation, and not all the control lines and information lines on the product are necessarily shown. In practice, it may be considered that almost all the components are connected to each other.

Claims

プロセッサと前記プロセッサが実行するプログラムを格納する記憶装置とを含む、画像処理システムであって、  An image processing system including a processor and a storage device that stores a program executed by the processor,
前記プロセッサは、  wherein the processor performs the following:
映像データから複数フレームを作成し、  Creating a plurality of frames from video data,
前記複数フレームにおいて移動物体を検出し、  Detecting moving objects in the plurality of frames,
検出した前記移動物体それぞれの軌跡の特徴量を前記複数フレームから抽出してデータベースに記録し、  Extracting a feature amount of the trajectory of each detected moving object from the plurality of frames and recording it in a database,
前記複数フレームのそれぞれにおいて、移動物体の画像から特徴量を抽出して前記データベースに記録することを含む特徴登録処理、の内容を、予め定められた条件に従って決定し、  Determining, according to a predetermined condition, the content of a feature registration process for each of the plurality of frames, the feature registration process including extracting a feature amount from an image of a moving object and recording it in the database,
前記複数フレームのそれぞれにおいて、決定した前記特徴登録処理の内容を実行し、  Executing the determined content of the feature registration process for each of the plurality of frames,
前記複数フレームの移動物体の画像から複数種類の特徴量を抽出し、  Extracting a plurality of types of feature amounts from the images of moving objects in the plurality of frames,
前記複数フレームから、第１フレームレートにおいて、第１種類の特徴量を抽出し、  Extracting a first type of feature amount from the plurality of frames at a first frame rate, and
前記複数フレームから、第１フレームレートより小さい第２フレームレートにおいて、第２種類の特徴量を抽出する、画像処理システム。  Extracting a second type of feature amount from the plurality of frames at a second frame rate lower than the first frame rate.
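
For illustration only, extracting two types of feature amounts at different frame rates can be sketched as a per-frame extraction plan. The function names, the feature type labels "face" and "clothing", and the rates (30, 10, and 2 frames per second) are assumptions of this sketch and are not taken from the claim.

```python
from typing import Dict, List


def interval_for(source_fps: float, target_fps: float) -> int:
    """Number of source frames between two extractions at `target_fps`."""
    if source_fps <= 0 or target_fps <= 0:
        raise ValueError("frame rates must be positive")
    return max(1, round(source_fps / target_fps))


def extraction_plan(frame_index: int, intervals: Dict[str, int]) -> List[str]:
    """Feature types to extract from this frame; each type is extracted
    once every `intervals[type]` frames."""
    return [name for name, every in intervals.items()
            if frame_index % every == 0]


# Hypothetical configuration: first type at 10 fps, second type at 2 fps,
# both taken from a 30 fps stream.
INTERVALS = {
    "face": interval_for(30, 10),
    "clothing": interval_for(30, 2),
}
```

Over 30 consecutive frames this plan extracts the first type 10 times and the second type twice, so the more expensive or slower-changing feature can be given the lower rate.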

プロセッサと前記プロセッサが実行するプログラムを格納する記憶装置とを含む、画像処理システムであって、  An image processing system including a processor and a storage device that stores a program executed by the processor,
前記プロセッサは、  wherein the processor performs the following:
映像データから複数フレームを作成し、  Create multiple frames from video data,
前記複数フレームにおいて移動物体を検出し、  Detecting a moving object in the plurality of frames;
検出した前記移動物体それぞれの軌跡の特徴量を前記複数フレームから抽出してデータベースに記録し、  Extracting the feature amount of the detected trajectory of each moving object from the plurality of frames and recording it in a database,
前記複数フレームのそれぞれにおいて、移動物体の画像から特徴量を抽出して前記データベースに記録することを含む特徴登録処理、の内容を、予め定められた条件に従って決定し、  In each of the plurality of frames, the content of a feature registration process including extracting a feature amount from an image of a moving object and recording it in the database is determined according to a predetermined condition,
前記複数フレームのそれぞれにおいて、決定した前記特徴登録処理の内容を実行し、  In each of the plurality of frames, execute the content of the determined feature registration process,
前記複数フレームから処理対象フレームを順次選択して前記特徴登録処理を実行し、  Sequentially selecting the processing target frame from the plurality of frames and executing the feature registration process;
所定頻度において、前記処理対象フレームからの第１種類の特徴量の抽出を省略する、画像処理システム。  An image processing system that omits extraction of a first type of feature amount from the processing target frame at a predetermined frequency.

プロセッサと前記プロセッサが実行するプログラムを格納する記憶装置とを含む、画像処理システムであって、  An image processing system including a processor and a storage device that stores a program executed by the processor,
前記プロセッサは、  wherein the processor performs the following:
映像データから複数フレームを作成し、  Create multiple frames from video data,
前記複数フレームにおいて移動物体を検出し、  Detecting a moving object in the plurality of frames;
検出した前記移動物体それぞれの軌跡の特徴量を前記複数フレームから抽出してデータベースに記録し、  Extracting the feature amount of the detected trajectory of each moving object from the plurality of frames and recording it in a database,
前記複数フレームのそれぞれにおいて、移動物体の画像から特徴量を抽出して前記データベースに記録することを含む特徴登録処理、の内容を、予め定められた条件に従って決定し、  In each of the plurality of frames, the content of a feature registration process including extracting a feature amount from an image of a moving object and recording it in the database is determined according to a predetermined condition,
前記複数フレームのそれぞれにおいて、決定した前記特徴登録処理の内容を実行し、  In each of the plurality of frames, execute the content of the determined feature registration process,
前記複数フレームから処理対象フレームを順次選択して前記特徴登録処理を実行し、  Sequentially selecting the processing target frame from the plurality of frames and executing the feature registration process;
過去フレームにおける前記特徴登録処理の処理時間と、予め設定された目標値とに基づいて、選択されている処理対象フレームにおける前記特徴登録処理の内容を決定する、画像処理システム。  An image processing system that determines the content of the feature registration process in a selected processing target frame based on a processing time of the feature registration process in a past frame and a preset target value.

請求項３に記載の画像処理システムであって、 The image processing system according to claim 3,
前記プロセッサは、過去フレームにおける第１種類の特徴量の特徴登録処理の処理時間の統計値が目標値より長い場合、前記第１種類の特徴量の抽出を省略する頻度を増加させる、画像処理システム。 wherein the processor increases the frequency of omitting extraction of the first type of feature amount when a statistic of the processing time of the feature registration process for the first type of feature amount in past frames is longer than the target value.
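
For illustration only, adjusting the omission frequency from past processing times can be sketched as follows. The class name `SkipController`, the use of a mean over a sliding window as the statistic, the step size of one frame, the upper bound `max_skip`, and the rule for lowering the frequency again are all assumptions of this sketch; the claim itself only recites increasing the omission frequency when the statistic exceeds the target value.

```python
from collections import deque
from statistics import mean
from typing import Deque


class SkipController:
    """Tracks recent processing times for one feature type and decides how
    many frames to skip between two extractions of that type."""

    def __init__(self, target_seconds: float, window: int = 30,
                 max_skip: int = 10) -> None:
        self.target_seconds = target_seconds
        self.max_skip = max_skip
        self.skip = 0  # frames skipped between extractions
        self._times: Deque[float] = deque(maxlen=window)

    def record(self, processing_seconds: float) -> None:
        """Store the processing time measured for a past frame."""
        self._times.append(processing_seconds)

    def update(self) -> int:
        """Raise the skip count when the mean exceeds the target; lower it
        again (an assumption of this sketch) when there is headroom."""
        if self._times:
            average = mean(self._times)
            if average > self.target_seconds and self.skip < self.max_skip:
                self.skip += 1
            elif average <= self.target_seconds and self.skip > 0:
                self.skip -= 1
        return self.skip

    def should_extract(self, frame_index: int) -> bool:
        """Extract on every (skip + 1)-th frame, omit on the others."""
        return frame_index % (self.skip + 1) == 0
```

A bounded window keeps the statistic responsive: a burst of slow frames raises the omission frequency, and the effect fades once those samples leave the window.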

プロセッサと前記プロセッサが実行するプログラムを格納する記憶装置とを含む、画像処理システムであって、  An image processing system including a processor and a storage device that stores a program executed by the processor,
前記プロセッサは、  wherein the processor performs the following:
映像データから複数フレームを作成し、  Create multiple frames from video data,
前記複数フレームにおいて移動物体を検出し、  Detecting a moving object in the plurality of frames;
検出した前記移動物体それぞれの軌跡の特徴量を前記複数フレームから抽出してデータベースに記録し、  Extracting the feature amount of the detected trajectory of each moving object from the plurality of frames and recording it in a database,
前記複数フレームのそれぞれにおいて、移動物体の画像から特徴量を抽出して前記データベースに記録することを含む特徴登録処理、の内容を、予め定められた条件に従って決定し、  In each of the plurality of frames, the content of a feature registration process including extracting a feature amount from an image of a moving object and recording it in the database is determined according to a predetermined condition,
前記複数フレームのそれぞれにおいて、決定した前記特徴登録処理の内容を実行し、  In each of the plurality of frames, execute the content of the determined feature registration process,
前記特徴登録処理において、処理対象フレームから１または複数の領域を検出し、前記１または複数の領域から特徴量を抽出し、  In the feature registration process, one or more areas are detected from the processing target frame, and feature quantities are extracted from the one or more areas,
前記１または複数の領域の構成に基づいて前記１または複数の領域からの特徴量抽出の内容を決定する、画像処理システム。  An image processing system that determines contents of feature quantity extraction from the one or more regions based on the configuration of the one or more regions.

プロセッサと前記プロセッサが実行するプログラムを格納する記憶装置とを含む、画像処理システムであって、  An image processing system including a processor and a storage device that stores a program executed by the processor,
前記プロセッサは、  wherein the processor performs the following:
映像データから複数フレームを作成し、  Create multiple frames from video data,
前記複数フレームにおいて移動物体を検出し、  Detecting a moving object in the plurality of frames;
検出した前記移動物体それぞれの軌跡の特徴量を前記複数フレームから抽出してデータベースに記録し、  Extracting the feature amount of the detected trajectory of each moving object from the plurality of frames and recording it in a database,
前記複数フレームのそれぞれにおいて、移動物体の画像から特徴量を抽出して前記データベースに記録することを含む特徴登録処理、の内容を、予め定められた条件に従って決定し、  In each of the plurality of frames, the content of a feature registration process including extracting a feature amount from an image of a moving object and recording it in the database is determined according to a predetermined condition,
前記複数フレームのそれぞれにおいて、決定した前記特徴登録処理の内容を実行し、  In each of the plurality of frames, execute the content of the determined feature registration process,
前記データベースは前記複数フレームを格納し、  The database stores the plurality of frames;
前記プロセッサは、  wherein the processor further performs the following:
特徴登録処理の少なくとも一部が省略されたフレームを前記データベースから選択し、  Selecting a frame from which at least a part of the feature registration process is omitted from the database;
選択した前記フレームの特徴登録処理を実行する、画像処理システム。  An image processing system for executing feature registration processing of the selected frame.

請求項６に記載の画像処理システムであって、  The image processing system according to claim 6,
前記プロセッサは、  wherein the processor performs the following:
特徴登録処理の少なくとも一部が省略されたフレームのうち、指定された特徴量との類似度が閾値を超える移動物体の画像を含むフレームを、前記データベースから選択する、画像処理システム。  An image processing system that selects, from the database, a frame including an image of a moving object whose similarity with a specified feature amount exceeds a threshold among frames from which at least part of the feature registration processing is omitted.

請求項７に記載の画像処理システムであって、 The image processing system according to claim 7,
前記指定された特徴量は移動物体の軌跡の特徴量を示す、画像処理システム。 The image processing system, wherein the specified feature amount indicates a feature amount of a locus of a moving object.
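
For illustration only, selecting stored frames whose feature registration was partly omitted and then registering their features afterwards can be sketched as follows. The `FrameRecord` type, the in-memory list standing in for the database, and the caller-supplied `similarity` and `extract` functions are hypothetical stand-ins chosen for this sketch.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Sequence


@dataclass
class FrameRecord:
    frame_id: int
    trajectory: Sequence[float]  # trajectory feature amount stored for the frame
    features: Dict[str, Sequence[float]] = field(default_factory=dict)
    omitted: bool = True         # True if part of feature registration was skipped


def select_frames_for_backfill(
        records: List[FrameRecord],
        query_trajectory: Sequence[float],
        similarity: Callable[[Sequence[float], Sequence[float]], float],
        threshold: float) -> List[FrameRecord]:
    """Frames with omitted registration whose trajectory is similar to the
    specified feature amount (similarity above the threshold)."""
    return [r for r in records
            if r.omitted and similarity(r.trajectory, query_trajectory) > threshold]


def backfill(records: List[FrameRecord],
             query_trajectory: Sequence[float],
             similarity: Callable[[Sequence[float], Sequence[float]], float],
             threshold: float,
             extract: Callable[[FrameRecord], Dict[str, Sequence[float]]]
             ) -> List[int]:
    """Run the omitted feature registration on the selected frames and
    return the IDs of the frames that were completed."""
    completed: List[int] = []
    for record in select_frames_for_backfill(records, query_trajectory,
                                             similarity, threshold):
        record.features.update(extract(record))
        record.omitted = False
        completed.append(record.frame_id)
    return completed
```

Filtering on the `omitted` flag before computing similarity avoids re-extracting features for frames that were already fully registered.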

映像データから複数フレームを作成し、  Create multiple frames from video data,
前記複数フレームにおいて移動物体を検出し、  Detecting a moving object in the plurality of frames;
検出した前記移動物体それぞれの軌跡の特徴量を前記複数フレームから抽出してデータベースに記録し、  Extracting the feature amount of the detected trajectory of each moving object from the plurality of frames and recording it in a database,
前記複数フレームのそれぞれにおいて、移動物体の画像から特徴量を抽出して前記データベースに記録することを含む特徴登録処理、の内容を、予め定められた条件に従って決定し、  In each of the plurality of frames, the content of a feature registration process including extracting a feature amount from an image of a moving object and recording it in the database is determined according to a predetermined condition,
前記複数フレームのそれぞれにおいて、決定した前記特徴登録処理の内容を実行し、  In each of the plurality of frames, execute the content of the determined feature registration process,
前記複数フレームの移動物体の画像から複数種類の特徴量を抽出し、  Extracting a plurality of types of feature quantities from the images of moving objects of the plurality of frames,
前記複数フレームから、第１フレームレートにおいて、第１種類の特徴量を抽出し、  Extracting a first type of feature quantity from the plurality of frames at a first frame rate;
前記複数フレームから、第１フレームレートより小さい第２フレームレートにおいて、第２種類の特徴量を抽出する、ことを含む画像処理方法。  An image processing method comprising: extracting a second type of feature amount from the plurality of frames at a second frame rate smaller than the first frame rate.

映像データから複数フレームを作成し、  Create multiple frames from video data,
前記複数フレームにおいて移動物体を検出し、  Detecting a moving object in the plurality of frames;
検出した前記移動物体それぞれの軌跡の特徴量を前記複数フレームから抽出してデータベースに記録し、  Extracting the feature amount of the detected trajectory of each moving object from the plurality of frames and recording it in a database,
前記複数フレームのそれぞれにおいて、移動物体の画像から特徴量を抽出して前記データベースに記録することを含む特徴登録処理、の内容を、予め定められた条件に従って決定し、  In each of the plurality of frames, the content of a feature registration process including extracting a feature amount from an image of a moving object and recording it in the database is determined according to a predetermined condition,
前記複数フレームのそれぞれにおいて、決定した前記特徴登録処理の内容を実行し、  In each of the plurality of frames, execute the content of the determined feature registration process,
前記複数フレームから処理対象フレームを順次選択して前記特徴登録処理を実行し、  Sequentially selecting the processing target frame from the plurality of frames and executing the feature registration process;
所定頻度において、前記処理対象フレームからの第１種類の特徴量の抽出を省略する、ことを含む画像処理方法。  An image processing method including omitting extraction of the first type of feature amount from the processing target frame at a predetermined frequency.

映像データから複数フレームを作成し、  Create multiple frames from video data,
前記複数フレームにおいて移動物体を検出し、  Detecting a moving object in the plurality of frames;
検出した前記移動物体それぞれの軌跡の特徴量を前記複数フレームから抽出してデータベースに記録し、  Extracting the feature amount of the detected trajectory of each moving object from the plurality of frames and recording it in a database,
前記複数フレームのそれぞれにおいて、移動物体の画像から特徴量を抽出して前記データベースに記録することを含む特徴登録処理、の内容を、予め定められた条件に従って決定し、  In each of the plurality of frames, the content of a feature registration process including extracting a feature amount from an image of a moving object and recording it in the database is determined according to a predetermined condition,
前記複数フレームのそれぞれにおいて、決定した前記特徴登録処理の内容を実行し、  In each of the plurality of frames, execute the content of the determined feature registration process,
前記複数フレームから処理対象フレームを順次選択して前記特徴登録処理を実行し、  Sequentially selecting the processing target frame from the plurality of frames and executing the feature registration process;
過去フレームにおける前記特徴登録処理の処理時間と、予め設定された目標値とに基づいて、選択されている処理対象フレームにおける前記特徴登録処理の内容を決定する、  Determining the content of the feature registration process in the selected processing target frame based on the processing time of the feature registration process in the past frame and a preset target value;
ことを含む画像処理方法。  An image processing method including the above steps.

映像データから複数フレームを作成し、  Create multiple frames from video data,
前記複数フレームにおいて移動物体を検出し、  Detecting a moving object in the plurality of frames;
検出した前記移動物体それぞれの軌跡の特徴量を前記複数フレームから抽出してデータベースに記録し、  Extracting the feature amount of the detected trajectory of each moving object from the plurality of frames and recording it in a database,
前記複数フレームのそれぞれにおいて、移動物体の画像から特徴量を抽出して前記データベースに記録することを含む特徴登録処理、の内容を、予め定められた条件に従って決定し、  In each of the plurality of frames, the content of a feature registration process including extracting a feature amount from an image of a moving object and recording it in the database is determined according to a predetermined condition,
前記複数フレームのそれぞれにおいて、決定した前記特徴登録処理の内容を実行し、  In each of the plurality of frames, execute the content of the determined feature registration process,
前記特徴登録処理において、処理対象フレームから１または複数の領域を検出し、前記１または複数の領域から特徴量を抽出し、  In the feature registration process, one or more areas are detected from the processing target frame, and feature quantities are extracted from the one or more areas,
前記１または複数の領域の構成に基づいて前記１または複数の領域からの特徴量抽出の内容を決定する、ことを含む画像処理方法。  An image processing method comprising: determining content of feature amount extraction from the one or more regions based on a configuration of the one or more regions.

映像データから複数フレームを作成し、  Create multiple frames from video data,
前記複数フレームにおいて移動物体を検出し、  Detecting a moving object in the plurality of frames;
前記複数フレーム及び検出した前記移動物体それぞれの軌跡の特徴量を前記複数フレームから抽出してデータベースに記録し、  The feature amount of the trajectory of each of the plurality of frames and the detected moving object is extracted from the plurality of frames and recorded in a database,
前記複数フレームのそれぞれにおいて、移動物体の画像から特徴量を抽出して前記データベースに記録することを含む特徴登録処理、の内容を、予め定められた条件に従って決定し、  In each of the plurality of frames, the content of a feature registration process including extracting a feature amount from an image of a moving object and recording it in the database is determined according to a predetermined condition,
前記複数フレームのそれぞれにおいて、決定した前記特徴登録処理の内容を実行し、  In each of the plurality of frames, execute the content of the determined feature registration process,
特徴登録処理の少なくとも一部が省略されたフレームを前記データベースから選択し、  Selecting a frame from which at least a part of the feature registration process is omitted from the database;
選択した前記フレームの特徴登録処理を実行する、ことを含む画像処理方法。  An image processing method including executing a feature registration process of the selected frame.

計算機に処理に実行されるコードを格納する、非一時的な計算機読み取り可能な記憶媒体であって、前記コードは前記計算機に画像処理を実行させ、前記画像処理は、  A non-transitory computer-readable storage medium storing code to be executed by a computer, the code causing the computer to perform image processing, the image processing including:
映像データから複数フレームを作成し、  Create multiple frames from video data,
前記複数フレームにおいて移動物体を検出し、  Detecting a moving object in the plurality of frames;
検出した前記移動物体それぞれの軌跡の特徴量を前記複数フレームから抽出してデータベースに記録し、  Extracting the feature amount of the detected trajectory of each moving object from the plurality of frames and recording it in a database,
前記複数フレームのそれぞれにおいて、移動物体の画像から特徴量を抽出して前記データベースに記録することを含む特徴登録処理、の内容を、予め定められた条件に従って決定し、  In each of the plurality of frames, the content of a feature registration process including extracting a feature amount from an image of a moving object and recording it in the database is determined according to a predetermined condition,
前記複数フレームのそれぞれにおいて、決定した前記特徴登録処理の内容を実行し、  In each of the plurality of frames, execute the content of the determined feature registration process,
前記複数フレームの移動物体の画像から複数種類の特徴量を抽出し、  Extracting a plurality of types of feature quantities from the images of moving objects of the plurality of frames,
前記複数フレームから、第１フレームレートにおいて、第１種類の特徴量を抽出し、  Extracting a first type of feature quantity from the plurality of frames at a first frame rate;
前記複数フレームから、第１フレームレートより小さい第２フレームレートにおいて、第２種類の特徴量を抽出する、ことを含む、記憶媒体。  A storage medium comprising: extracting a second type of feature amount from the plurality of frames at a second frame rate smaller than the first frame rate.

計算機に処理に実行されるコードを格納する、非一時的な計算機読み取り可能な記憶媒体であって、前記コードは前記計算機に画像処理を実行させ、前記画像処理は、  A non-transitory computer-readable storage medium storing code to be executed by a computer, the code causing the computer to perform image processing, the image processing including:
映像データから複数フレームを作成し、  Create multiple frames from video data,
前記複数フレームにおいて移動物体を検出し、  Detecting a moving object in the plurality of frames;
検出した前記移動物体それぞれの軌跡の特徴量を前記複数フレームから抽出してデータベースに記録し、  Extracting the feature amount of the detected trajectory of each moving object from the plurality of frames and recording it in a database,
前記複数フレームのそれぞれにおいて、移動物体の画像から特徴量を抽出して前記データベースに記録することを含む特徴登録処理、の内容を、予め定められた条件に従って決定し、  In each of the plurality of frames, the content of a feature registration process including extracting a feature amount from an image of a moving object and recording it in the database is determined according to a predetermined condition,
前記複数フレームのそれぞれにおいて、決定した前記特徴登録処理の内容を実行し、  In each of the plurality of frames, execute the content of the determined feature registration process,
前記複数フレームから処理対象フレームを順次選択して前記特徴登録処理を実行し、  Sequentially selecting the processing target frame from the plurality of frames and executing the feature registration process;
所定頻度において、前記処理対象フレームからの第１種類の特徴量の抽出を省略する、ことを含む、記憶媒体。  A storage medium including: omitting extraction of the first type of feature amount from the processing target frame at a predetermined frequency.

計算機に処理に実行されるコードを格納する、非一時的な計算機読み取り可能な記憶媒体であって、前記コードは前記計算機に画像処理を実行させ、前記画像処理は、  A non-transitory computer-readable storage medium storing code to be executed by a computer, the code causing the computer to perform image processing, the image processing including:
映像データから複数フレームを作成し、  Create multiple frames from video data,
前記複数フレームにおいて移動物体を検出し、  Detecting a moving object in the plurality of frames;
検出した前記移動物体それぞれの軌跡の特徴量を前記複数フレームから抽出してデータベースに記録し、  Extracting the feature amount of the detected trajectory of each moving object from the plurality of frames and recording it in a database,
前記複数フレームのそれぞれにおいて、移動物体の画像から特徴量を抽出して前記データベースに記録することを含む特徴登録処理、の内容を、予め定められた条件に従って決定し、  In each of the plurality of frames, the content of a feature registration process including extracting a feature amount from an image of a moving object and recording it in the database is determined according to a predetermined condition,
前記複数フレームのそれぞれにおいて、決定した前記特徴登録処理の内容を実行し、  In each of the plurality of frames, execute the content of the determined feature registration process,
前記複数フレームから処理対象フレームを順次選択して前記特徴登録処理を実行し、  Sequentially selecting the processing target frame from the plurality of frames and executing the feature registration process;
過去フレームにおける前記特徴登録処理の処理時間と、予め設定された目標値とに基づいて、選択されている処理対象フレームにおける前記特徴登録処理の内容を決定する、ことを含む、記憶媒体。  A storage medium comprising: determining content of the feature registration process in a selected processing target frame based on a processing time of the feature registration process in a past frame and a preset target value.

計算機に処理に実行されるコードを格納する、非一時的な計算機読み取り可能な記憶媒体であって、前記コードは前記計算機に画像処理を実行させ、前記画像処理は、  A non-transitory computer-readable storage medium storing code to be executed by a computer, the code causing the computer to perform image processing, the image processing including:
映像データから複数フレームを作成し、  Create multiple frames from video data,
前記複数フレームにおいて移動物体を検出し、  Detecting a moving object in the plurality of frames;
検出した前記移動物体それぞれの軌跡の特徴量を前記複数フレームから抽出してデータベースに記録し、  Extracting the feature amount of the detected trajectory of each moving object from the plurality of frames and recording it in a database,
前記複数フレームのそれぞれにおいて、移動物体の画像から特徴量を抽出して前記データベースに記録することを含む特徴登録処理、の内容を、予め定められた条件に従って決定し、  In each of the plurality of frames, the content of a feature registration process including extracting a feature amount from an image of a moving object and recording it in the database is determined according to a predetermined condition,
前記複数フレームのそれぞれにおいて、決定した前記特徴登録処理の内容を実行し、  In each of the plurality of frames, execute the content of the determined feature registration process,
前記特徴登録処理において、処理対象フレームから１または複数の領域を検出し、前記１または複数の領域から特徴量を抽出し、  In the feature registration process, one or more areas are detected from the processing target frame, and feature quantities are extracted from the one or more areas,
前記１または複数の領域の構成に基づいて前記１または複数の領域からの特徴量抽出の内容を決定する、ことを含む、記憶媒体。  A storage medium including: determining the content of feature amount extraction from the one or more regions based on the configuration of the one or more regions.

計算機に処理に実行されるコードを格納する、非一時的な計算機読み取り可能な記憶媒体であって、前記コードは前記計算機に画像処理を実行させ、前記画像処理は、  A non-transitory computer-readable storage medium storing code to be executed by a computer, the code causing the computer to perform image processing, the image processing including:
映像データから複数フレームを作成し、  Create multiple frames from video data,
前記複数フレームにおいて移動物体を検出し、  Detecting a moving object in the plurality of frames;
前記複数フレーム及び検出した前記移動物体それぞれの軌跡の特徴量を前記複数フレームから抽出してデータベースに記録し、  The feature amount of the trajectory of each of the plurality of frames and the detected moving object is extracted from the plurality of frames and recorded in a database,
前記複数フレームのそれぞれにおいて、移動物体の画像から特徴量を抽出して前記データベースに記録することを含む特徴登録処理、の内容を、予め定められた条件に従って決定し、  In each of the plurality of frames, the content of a feature registration process including extracting a feature amount from an image of a moving object and recording it in the database is determined according to a predetermined condition,
前記複数フレームのそれぞれにおいて、決定した前記特徴登録処理の内容を実行し、  In each of the plurality of frames, execute the content of the determined feature registration process,
特徴登録処理の少なくとも一部が省略されたフレームを前記データベースから選択し、  Selecting a frame from which at least a part of the feature registration process is omitted from the database;
選択した前記フレームの特徴登録処理を実行する、ことを含む、記憶媒体。  A storage medium including: executing the feature registration process on the selected frame.