WO2018147089A1 - Information processing device and method - Google Patents


Publication number
WO2018147089A1
Authority
WO
WIPO (PCT)
Prior art keywords
captured image
imaging
image data
information
imaging target
Prior art date
Application number
PCT/JP2018/002379
Other languages
French (fr)
Japanese (ja)
Inventor
高林 和彦 (Kazuhiko Takabayashi)
Original Assignee
Sony Corporation (ソニー株式会社)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sony Corporation (ソニー株式会社)
Publication of WO2018147089A1 publication Critical patent/WO2018147089A1/en

Classifications

    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00 Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/02 Editing, e.g. varying the order of information signals recorded on, or reproduced from, record carriers
    • G11B27/031 Electronic editing of digitised analogue information signals, e.g. audio or video signals
    • G11B27/034 Electronic editing of digitised analogue information signals, e.g. audio or video signals on discs
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/25 Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/262 Content or additional data distribution scheduling, e.g. sending additional data at off-peak times, updating software modules, calculating the carousel transmission frequency, delaying a video stream transmission, generating play-lists
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85 Assembly of content; Generation of multimedia applications
    • H04N21/854 Content authoring
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60 Control of cameras or camera modules
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/18 Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast

Definitions

  • the present disclosure relates to an information processing apparatus and method, and more particularly, to an information processing apparatus and method that allow a user to select desired content more easily.
  • MPEG-DASH: Moving Picture Experts Group - Dynamic Adaptive Streaming over HTTP
  • MPD: Media Presentation Description
  • this MPD corresponds to a “play list” that describes a plurality of different video streams in an integrated manner.
  • SRD: Spatial Relationship Description
  • ISO/IEC 23009-1 Information technology - Dynamic adaptive streaming over HTTP (DASH) - Part 1: Media presentation description and segment formats, http://standards.iso.org/ittf/PubliclyAvailableStandards/c065274_ISO_IEC_23009-1_2014.zip, and ISO/IEC 23009-1 Amendment 2: Spatial Relationship Description, Generalized URL parameters and other extensions
  • This disclosure has been made in view of such a situation, and enables a user to select desired content more easily.
  • One aspect of the present technology is an information processing apparatus including a generation unit that generates a multi-view viewing playlist including a list of information that can be used to reproduce a plurality of captured image data generated by imaging the same imaging target from different positions.
  • the information that can be used to reproduce the captured image data can be information that can be used to select the captured image data to be reproduced.
  • the information that can be used to reproduce the captured image data can include information indicating the position of the imaging target.
  • the information that can be used for reproduction of the captured image data can include information indicating an imaging direction viewed from the imaging target.
  • the information that can be used for reproduction of the captured image data can include information indicating the acquisition destination of the captured image data.
  • the information usable for reproduction of the captured image data can include information indicating a period not included in the captured image data.
  • the multi-view viewing playlist may include a list for each imaging target of information that can be used to reproduce the captured image data for a plurality of imaging targets.
  • the multi-view viewing playlist may be generated for a predetermined period and include information indicating a start time and a length of the period.
  • the multi-view viewing playlist may include a list for each imaging target of information that can be used for reproducing the captured image data for each predetermined period for a plurality of periods.
  • the generation unit may generate the multi-view viewing playlist as an MPD (Media Presentation Description) of MPEG-DASH (Moving Picture Experts Group - Dynamic Adaptive Streaming over HTTP).
  • the multi-view appreciation playlist can include information indicating the imaging direction viewed from the imaging target, listed for each area to be imaged.
  • the multi-view viewing playlist can include information indicating a center position and a radius of the area to be imaged.
  • the multi-view appreciation playlist can manage information on each captured image data with AdaptationSet.
  • the generation unit can group the captured image data based on the imaging target and generate a list of information that can be used for reproduction of the captured image data for each group.
  • the generation unit collects all the captured image data into one group, groups the captured image data using a preset area, or selects the captured image according to the position of each imaging target. Data can be grouped.
  • the image processing apparatus may further include an analysis unit that analyzes an imaging state, and the generation unit may be configured to generate the multi-view viewing playlist based on a result of analysis by the analysis unit.
  • the analysis unit can obtain information related to imaging, information related to the imaging target, and information related to the imaging direction viewed from the imaging target.
  • the analysis unit can analyze the imaging state based on the metadata of the captured image data and information on the venue.
  • a providing unit that provides the multi-view viewing playlist generated by the generating unit can be further provided.
  • One aspect of the present technology is also an information processing method that generates a multi-view viewing playlist including a list of information that can be used to reproduce a plurality of captured image data generated by imaging the same imaging target from different positions.
  • a multi-view viewing playlist including a list of information that can be used to reproduce a plurality of captured image data generated by imaging the same imaging target from different positions is generated.
  • According to one aspect of the present technology, information can be processed; in particular, the user can select desired content more easily.
  • <Image sharing and selection> It is possible to share information, such as moving images, still images, and audio captured at the same event, among friends using SNS and data sharing services.
  • The shared data is classified and managed in units such as the date and time of the event and the location, or for each period of a small event within the event.
  • metadata such as an imaging location (position), a focus distance, and a zoom level may be added to the captured image data.
  • This metadata is added to each captured image data and is not managed collectively. Therefore, when the user wants to identify a desired captured image using the metadata, the user needs to check the metadata of each piece of captured data one by one, which is a complicated operation. Furthermore, as the amount of shared data increases, the amount of work increases further.
  • Such metadata consists of information related to the captured image data to which the metadata itself is added, and does not include information related to other captured image data. It is therefore difficult to determine relationships between captured images, for example that they show the same subject, from this metadata alone. For example, it has been difficult to perform a more advanced search such as finding other captured image data that shows the same imaging target.
  • package media such as DVD-ROM (Digital Versatile Disc-Read-Only Memory) and BD-ROM (Blu-ray (registered trademark) Disc-read-Only Memory)
  • An extension called Spatial Relationship Description (SRD) has been defined as a mechanism for associating a partial video stream within one video with the original (whole) video.
  • <Generation of multi-view viewing playlist> Therefore, a multi-view viewing playlist including a list of information that can be used to reproduce a plurality of captured image data generated by imaging the same imaging target from different positions is generated. In this way, a user who views the captured images can perform a more advanced search of the captured image data based on the multi-view viewing playlist. That is, the user can more easily select and acquire desired content.
  • FIG. 1 is a block diagram illustrating a main configuration example of an image providing system which is an embodiment of an image processing system to which the present technology is applied.
  • An image providing system 100 shown in FIG. 1 is a system that provides an image sharing service.
  • The image sharing service is a service that enables viewers to share and browse image data (captured image data) taken at any event, such as concerts, plays, sports games and competitions, and local and school events (e.g., festivals, ceremonies, flea markets, school performances, athletic meets).
  • The event may be anything as long as its time and place can be specified, such as a concert, a stage performance, or a professional sports game. It may be public, such as a festival, a ceremony, an amateur sports competition, a flea market, or a school event such as an athletic meet, or private, such as a birthday party.
  • The provision (distribution) of the captured image data may be performed immediately after imaging (so-called real time, ignoring time lags such as transmission delay and processing delay), or the accumulated captured image data may be distributed upon request (so-called on-demand).
  • the provider of this image sharing service may be the same as or different from the event operator.
  • the photographer may be an event operator (including staff on the operation side), an image sharing service provider, or a content broadcast / distributor other than those. It may be an event participant (actor, player, spectator, etc.). This photographer may or may not be registered in advance in the image sharing service. That is, only a specific person registered as a member in advance may be allowed to share captured image data, or an unspecified number of persons may be allowed to share captured image data.
  • the viewer who shares the captured image data may be an event participant (performer, player, spectator, etc.), or may be a person who has not participated in the event. Of course, it may be an event operator. It is also possible for the photographer to be a viewer. The viewer may or may not be registered in advance in the image sharing service.
  • the image providing system 100 includes an imaging device 101, an integrated server 102, and a terminal device 103.
  • the imaging device 101 and the terminal device 103 are communicably connected to the integrated server 102 via an arbitrary communication medium such as the Internet.
  • one imaging device 101, one integrated server 102, and one terminal device 103 are shown, but these numbers are arbitrary and may be plural.
  • The image providing system 100 is a system in which captured image data captured by the imaging device 101 is accumulated by the accumulation server 102 and acquired and reproduced by the terminal device 103. That is, the accumulation server 102 provides a service for sharing the captured image data captured with the imaging device 101 (used by the photographer) with the terminal device 103 (used by the viewer).
  • the imaging device 101 performs processing related to imaging and communication. For example, the imaging device 101 captures a subject (also referred to as an imaging target) and generates captured image data that is data of the captured image. Further, for example, the imaging apparatus 101 supplies the generated captured image data to the integration server 102 (so-called uploading).
  • the accumulation server 102 performs processing related to accumulation and provision of captured image data. For example, the accumulation server 102 acquires and stores captured image data supplied from the imaging device 101. Further, for example, the integration server 102 generates a multi-view viewing playlist that is information used for reproducing captured image data.
  • the multi-view viewing playlist is information used for retrieval (selection) and reproduction control of captured image data, and retrieval (selection) of a plurality of captured image data generated by imaging the same imaging target from different positions. ) And a list of information available for playback control. More specifically, in the multi-view appreciation playlist, information regarding captured image data is collected for each imaging target (for example, a person, an object, an area, and the like). Therefore, for example, when searching (selecting) and playing back captured image data in which a desired imaging target is reflected, the user uses this multi-view viewing playlist to perform search (selection) and playback control. It can be made easier.
  • a user who views a captured image views a captured image showing a desired imaging target. That is, in many cases, what is reflected in a captured image is most important for a user who views the captured image.
  • a large number of imaging data obtained by imaging a large number of imaging targets is often shared.
  • conventional playlists, metadata, and the like hardly contain information about the imaging target, and it is difficult to perform a search based on the imaging target.
  • Since search and reproduction control based on the imaging target can be performed more easily by using the above-described multi-view viewing playlist, user satisfaction can be improved.
  • the accumulation server 102 provides captured image data to the terminal device 103 (downloading, streaming, etc.). For example, the accumulation server 102 supplies the generated multi-view viewing playlist to the terminal device 103, and performs playback control based on the multi-view viewing playlist. That is, the accumulation server 102 supplies the captured image data requested based on the supplied multi-view viewing playlist to the terminal device 103.
  • the terminal device 103 performs processing related to reproduction of captured image data shared using the accumulation server 102. For example, the terminal device 103 acquires a multi-view viewing playlist from the integration server 102, specifies desired captured image data based on the multi-view viewing playlist, and requests the specified captured image data from the integration server 102. To do. When desired captured image data is supplied from the accumulation server 102 in response to the request, the terminal device 103 acquires the captured image data, reproduces it, and displays it on a monitor or the like.
  • FIG. 2 is a block diagram illustrating a main configuration example of the imaging apparatus 101.
  • the imaging apparatus 101 includes an imaging unit 121, a metadata generation unit 122, a metadata addition unit 123, and a communication unit 124.
  • the imaging unit 121 includes a captured image generation function such as an image sensor, for example, and performs processing related to imaging of an imaging target.
  • the metadata generation unit 122 performs processing related to generation of metadata of captured image data.
  • the metadata adding unit 123 performs a process related to adding metadata to captured image data.
  • For example, the communication unit 124 performs processing related to uploading captured image data to which metadata has been added.
  • FIG. 3 is a block diagram illustrating a main configuration example of the integrated server 102.
  • the integrated server 102 includes a CPU (Central Processing Unit) 151, a ROM (Read Only Memory) 152, a RAM (Random Access Memory) 153, a bus 154, an input/output interface 160, an input unit 161, an output unit 162, a storage unit 163, a communication unit 164, and a drive 165.
  • the CPU 151, the ROM 152, and the RAM 153 are connected to each other via a bus 154.
  • An input / output interface 160 is also connected to the bus 154.
  • An input unit 161 to a drive 165 are connected to the input / output interface 160.
  • the input unit 161 includes arbitrary input devices such as a keyboard, a mouse, a touch panel, an image sensor, a microphone, a switch, and an input terminal.
  • the output unit 162 includes an arbitrary output device such as a display, a speaker, and an output terminal.
  • the storage unit 163 includes an arbitrary storage medium such as a hard disk, a RAM disk, a nonvolatile memory such as an SSD (Solid State Drive) or a USB (Universal Serial Bus) (registered trademark) memory.
  • the communication unit 164 is a communication interface of an arbitrary communication standard, for example Ethernet (registered trademark), Bluetooth (registered trademark), USB, HDMI (registered trademark) (High-Definition Multimedia Interface), or IrDA, and may be wired, wireless, or both.
  • the drive 165 drives a removable medium 171 having an arbitrary storage medium such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory.
  • Processing is performed by the CPU 151 loading, for example, a program stored in the ROM 152 or the storage unit 163 into the RAM 153 and executing it.
  • the RAM 153 also appropriately stores data necessary for the CPU 151 to execute various processes.
  • FIG. 4 is a functional block diagram illustrating an example of functions that the integrated server 102 has.
  • the integrated server 102 implements various functions as shown in FIG. 4 when the CPU 151 executes a program or the like.
  • the accumulation server 102 includes, as functional blocks, an accumulation unit 181, a captured image database 182, an imaging state analysis unit 183, a playlist generation unit 184, a multi-view viewing playlist database 185, a playlist providing unit 186, and a captured image providing unit 187.
  • the accumulation unit 181 performs processing related to the accumulation of captured image data and its metadata.
  • the accumulation unit 181 is realized by the CPU 151 executing a program or the like or controlling the communication unit 164 or the like via the input / output interface 160.
  • the captured image database 182 performs processing related to storage and management of information such as captured image data and metadata.
  • the captured image database 182 is realized, for example, when the CPU 151 executes a program or the like or controls the storage unit 163 or the like via the input / output interface 160.
  • the imaging state analysis unit 183 performs processing related to analysis of the imaging state.
  • the imaging state analysis unit 183 is realized by, for example, the CPU 151 executing a program or the like.
  • the playlist generation unit 184 performs processing related to generation of a multi-view viewing playlist.
  • the playlist generation unit 184 is realized, for example, when the CPU 151 executes a program or the like.
  • the multi-view viewing playlist database 185 performs processing related to storage and management of the multi-view viewing playlist.
  • the multi-view viewing playlist database 185 is realized, for example, when the CPU 151 executes a program or the like, or controls the storage unit 163 or the like via the input / output interface 160.
  • the playlist providing unit 186 performs processing related to providing a multi-view viewing playlist.
  • the playlist providing unit 186 is realized, for example, when the CPU 151 executes a program or the like, or controls the communication unit 164 or the like via the input / output interface 160.
  • the captured image providing unit 187 performs processing related to provision of captured image data.
  • the captured image providing unit 187 is realized, for example, when the CPU 151 executes a program or the like or controls the communication unit 164 or the like via the input / output interface 160.
  • FIG. 5 is a block diagram illustrating a main configuration example of the terminal device 103.
  • the terminal device 103 includes a CPU 201, a ROM 202, a RAM 203, a bus 204, an input / output interface 210, an input unit 211, an output unit 212, a storage unit 213, a communication unit 214, and a drive 215.
  • the CPU 201, the ROM 202, and the RAM 203 are connected to each other via a bus 204.
  • An input / output interface 210 is also connected to the bus 204.
  • An input unit 211 to a drive 215 are connected to the input/output interface 210.
  • the input unit 211 includes arbitrary input devices such as a keyboard, a mouse, a touch panel, an image sensor, a microphone, a switch, and an input terminal.
  • the output unit 212 includes an arbitrary output device such as a display, a speaker, and an output terminal, for example.
  • the storage unit 213 includes an arbitrary storage medium such as a hard disk, a RAM disk, a non-volatile memory such as an SSD (Solid State Drive) or a USB (Universal Serial Bus) (registered trademark) memory.
  • the communication unit 214 is a communication interface of an arbitrary communication standard, for example Ethernet (registered trademark), Bluetooth (registered trademark), USB, HDMI (registered trademark) (High-Definition Multimedia Interface), or IrDA, and may be wired, wireless, or both.
  • the drive 215 drives a removable medium 221 having an arbitrary storage medium such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory.
  • Processing is performed by the CPU 201 loading, for example, a program stored in the ROM 202 or the storage unit 213 into the RAM 203 and executing the program.
  • the RAM 203 also appropriately stores data necessary for the CPU 201 to execute various processes.
  • FIG. 6 is a functional block diagram illustrating an example of functions that the terminal device 103 has.
  • the terminal device 103 implements various functions as shown in FIG. 6 when the CPU 201 executes a program or the like.
  • the terminal device 103 includes a playlist acquisition unit 231, an image selection processing unit 232, a captured image request unit 233, a captured image acquisition unit 234, and a playback unit 235 as functional blocks.
  • the playlist acquisition unit 231 performs processing related to acquisition of a multi-view viewing playlist.
  • the playlist acquisition unit 231 is realized, for example, when the CPU 201 executes a program or the like, or controls the communication unit 214 or the like via the input / output interface 210.
  • the image selection processing unit 232 performs processing related to selection of captured images based on the multi-view viewing playlist.
  • the image selection processing unit 232 is realized, for example, when the CPU 201 executes a program or the like, or controls the communication unit 214 or the like via the input / output interface 210.
  • the captured image request unit 233 performs processing related to a request for a captured image to be reproduced.
  • the captured image request unit 233 is realized, for example, when the CPU 201 executes a program or the like, or controls the communication unit 214 or the like via the input / output interface 210.
  • the captured image acquisition unit 234 performs processing related to acquisition of a captured image.
  • the captured image acquisition unit 234 is realized, for example, when the CPU 201 executes a program or the like, or controls the communication unit 214 or the like via the input / output interface 210.
  • the playback unit 235 is a device that performs processing related to playback of captured image data.
  • the reproducing unit 235 is realized by the CPU 201 executing a program or the like or controlling the output unit 212 or the like via the input / output interface 210.
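  • As a rough, non-normative illustration of this client-side flow (acquire a multi-view viewing playlist, select a captured image based on it, request it, and play it back), the following Python sketch is included. The playlist URL, the helper names, and the way the angle attribute is parsed are assumptions for illustration; the element names loosely follow the playlist structure described later and this is not the actual implementation of the terminal device 103.

```python
# Hypothetical sketch of the terminal device 103 flow (playlist acquisition unit 231,
# image selection processing unit 232, captured image request/acquisition units 233/234).
# Element and attribute names mirror the multi-view viewing playlist described later
# (MultiviewPlaylist, BaseURL, Representation@angle, URL) but are assumptions.
import urllib.request
import xml.etree.ElementTree as ET

PLAYLIST_URL = "http://example.com/event/multiview_playlist.xml"  # hypothetical

def fetch_playlist(url: str) -> ET.Element:
    with urllib.request.urlopen(url) as resp:
        return ET.fromstring(resp.read())

def select_representation(playlist: ET.Element, desired_angle_xy: float) -> str:
    """Pick the Representation whose horizontal imaging angle (as seen from the
    imaging target) is closest to the desired one, and return its download URL
    (BaseURL + relative URL)."""
    base_url = playlist.findtext("BaseURL", default="")
    best_url, best_diff = None, float("inf")
    for rep in playlist.iter("Representation"):
        xy = float(rep.get("angle", "0").split(",")[0])  # "xy,z" format assumed
        diff = abs(xy - desired_angle_xy) % 360
        diff = min(diff, 360 - diff)
        if diff < best_diff:
            best_diff, best_url = diff, base_url + rep.findtext("URL", default="")
    return best_url

# Usage: download the selected captured image data and hand it to the playback unit 235.
# playlist = fetch_playlist(PLAYLIST_URL)
# url = select_representation(playlist, desired_angle_xy=90.0)
```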
  • the imaging unit 121 of the imaging apparatus 101 is operated by the photographer at an event venue or the like in step S101 to capture an imaging target (subject).
  • the imaging unit 121 photoelectrically converts incident light with an image sensor or the like to generate captured image data, and supplies it to the metadata adding unit 123.
  • the metadata generation unit 122 collects information such as imaging settings and surrounding states (imaging environment), and generates metadata including them.
  • the content of this metadata is arbitrary.
  • the metadata generation unit 122 supplies the generated metadata to the metadata addition unit 123.
  • step S103 the metadata adding unit 123 adds the metadata generated by the process of step S102 to the captured image data generated by the process of step S101.
  • the metadata adding unit 123 supplies the captured image data to which the metadata is added to the communication unit 124.
  • step S104 the communication unit 124 communicates with the accumulation server 102, and supplies (transmits) the captured image data to which the metadata is added by the process in step S103 to the accumulation server 102 at a predetermined timing.
  • the captured image data to which the metadata is added is transmitted to the integrated server 102 via a transmission path such as the Internet.
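  • The following is a minimal Python sketch of the imaging-device-side steps S101 to S104 described above (generate captured image data, attach metadata such as imaging position, time, direction, and zoom level, then upload). The field names and the upload endpoint are hypothetical; as noted above, the actual content of the metadata is arbitrary.

```python
# Minimal sketch of steps S101-S104: attach metadata to captured image data and upload it.
# The metadata fields and the server interface are assumptions made for illustration.
import json
import urllib.request

def build_metadata(gps, direction_deg, zoom, start_ts, end_ts):
    return {
        "imaging_position": gps,            # e.g. {"lat": ..., "lon": ...} or a seat number
        "imaging_direction": direction_deg, # e.g. from an acceleration sensor / electronic compass
        "zoom_level": zoom,                 # angle-of-view setting
        "start_time": start_ts,             # e.g. UTC seconds
        "end_time": end_ts,
    }

def upload_captured_image(server_url, image_bytes, metadata):
    # The accumulation server 102 is assumed, for this sketch only, to accept the
    # image bytes in the request body and the metadata as a JSON header field.
    req = urllib.request.Request(
        server_url,
        data=image_bytes,
        headers={"X-Capture-Metadata": json.dumps(metadata),
                 "Content-Type": "application/octet-stream"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status
```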
  • In step S121, the accumulation unit 181 of the accumulation server 102 controls the communication unit 164 to communicate with the imaging apparatus 101, and acquires (receives) the captured image data supplied (transmitted) from the imaging apparatus 101 (the captured image data with metadata added). That is, the accumulation unit 181 collects captured image data to which metadata has been added.
  • step S122 the captured image database 182 stores and manages the captured image data (and its metadata) acquired by the processing in step S121 in the storage unit 163.
  • the captured image accumulation process ends.
  • the captured image integration process as described above is executed for each imaging device used for imaging at the event venue.
  • the execution timing of this captured image integration process is arbitrary. For example, it may be after the event ends or during the event. Further, for example, it may be executed at a predetermined time, may be periodically executed at a predetermined time interval, or may be executed based on an instruction from a photographer or the like. Also good. Further, for example, it may be executed when a predetermined event occurs, such as generating new captured image data.
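  • As a hedged counterpart on the server side, the sketch below outlines steps S121 and S122 (receive captured image data with attached metadata and store both in the captured image database). The HTTP interface and the X-Capture-Metadata header mirror the upload sketch above and are assumptions, not a protocol defined by the present technology.

```python
# Sketch of the accumulation side: receive captured image data plus metadata and keep
# both, keyed by an assigned image id. Stand-in for the accumulation unit 181 and the
# captured image database 182; all interface details are assumptions.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

CAPTURED_IMAGE_DB = {}   # stand-in for the captured image database 182

class AccumulationHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        image_bytes = self.rfile.read(length)
        metadata = json.loads(self.headers.get("X-Capture-Metadata", "{}"))
        image_id = len(CAPTURED_IMAGE_DB) + 1
        CAPTURED_IMAGE_DB[image_id] = {"data": image_bytes, "metadata": metadata}
        self.send_response(200)
        self.end_headers()

# HTTPServer(("", 8080), AccumulationHandler).serve_forever()  # example startup
```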
  • the accumulation server 102 executes a multi-view appreciation playlist generation process to generate a multi-view appreciation playlist.
  • An example of the flow of the multi-view appreciation playlist generation process will be described with reference to the flowchart of FIG.
  • the imaging state analysis unit 183 of the integrated server 102 analyzes the imaging state in step S141. Details of this analysis processing will be described later.
  • step S142 the playlist generation unit 184 generates a multi-view viewing playlist based on the analysis result. Details of the playlist generation processing will be described later.
  • this multi-view appreciation playlist generation process ends.
  • The execution timing of this multi-view appreciation playlist generation process is arbitrary. For example, it may be executed at a predetermined time, may be executed periodically at predetermined time intervals, or may be executed based on an instruction from a server administrator or the like. Further, for example, it may be executed when a predetermined event occurs, such as acquiring new captured image data.
  • the imaging situation analysis unit 183 analyzes the imaging situation such as the imaging position at the time of imaging, the position of the imaging target, the imaging direction viewed from the imaging target, and the like on the captured image data. This analysis can be performed based on arbitrary information.
  • the imaging state analysis unit 183 refers to the following information.
  • Imaging position: This is information regarding the position (imaging position) of the imaging apparatus 101 at the time of imaging. For example, the imaging apparatus 101 may measure its own position using a GPS (Global Positioning System) system at the time of imaging. Further, for example, positioning may be performed using communication with a wireless LAN (Local Area Network) access point or a base station, or various sensors such as an acceleration sensor may be used. The positioning method is arbitrary. In this case, the accumulation server 102 performs analysis with reference to the positioning result (coordinates and the like).
  • the information related to the imaging position may be other than the positioning result (coordinates, etc.).
  • it may be identification information (seat number or the like) or position information (coordinates or the like) of the seat where the image is taken.
  • For example, the position (seat) from which the photographer or the like has taken the image may be specified based on a seating chart indicating the seat layout.
  • the accumulation server 102 performs analysis with reference to seat identification information (such as a seat number) and position information (such as coordinates).
  • Such information regarding the imaging position may or may not be included in the metadata of the captured image data.
  • it may be supplied to the accumulation server 102 as data different from the captured image data and metadata.
  • the integrated server 102 may analyze the captured image and specify the imaging position.
  • the accumulation server 102 may identify the imaging position by comparing the captured image with a reference image prepared in advance.
  • a marker for specifying a position may be installed at the event venue, and the accumulation server 102 may specify the imaging position based on the marker reflected in the captured image.
  • the accumulation server 102 performs analysis with reference to the imaging position (coordinates, etc.) specified by itself.
  • devices other than the imaging device 101 and the integrated server 102 may specify the imaging position.
  • Imaging time: This is information relating to the time at which imaging was performed (such as the start time and the end time) and the duration from start to end. For example, time information for the start and end of imaging can be generated using a timer function built into the imaging apparatus 101.
  • the imaging device 101 may acquire time information from an external device such as a server.
  • the accumulation server 102 performs analysis with reference to this time information.
  • By using this time information, the captured image data can be synchronized more accurately.
  • Note that accurate synchronization is not required unless the reproduction of a plurality of captured image data is to be switched in a time-synchronized manner; for example, when simply searching for captured image data with similar imaging times, there is no problem in many cases even if the times are not accurately synchronized. That is, it is only necessary to synchronize the time information between the imaging devices 101 with an accuracy suited to how it is used.
  • For example, communication with a server may be performed at the time of imaging. Such information regarding the time and duration of imaging may or may not be included in the metadata of the captured image data.
  • Distance to the imaging target: For example, the imaging device 101 may measure the distance from the imaging device 101 to the imaging target.
  • the method of distance measurement is arbitrary.
  • Such information regarding the distance to the imaging target may or may not be included in the metadata of the captured image data.
  • it may be supplied to the accumulation server 102 as data different from the captured image data and metadata.
  • the integrated server 102 may analyze the captured image and measure the distance from the imaging device 101 to the imaging target.
  • the accumulation server 102 may perform distance measurement by comparing a captured image with a reference image prepared in advance. Of course, devices other than the imaging device 101 and the integrated server 102 may perform this distance measurement.
  • the accumulation server 102 performs analysis with reference to the distance measurement results (distance information, etc.) obtained in this way.
  • Imaging direction: This is information regarding the imaging direction of the imaging apparatus 101. For example, the imaging apparatus 101 may specify the imaging direction using an acceleration sensor, an electronic compass, or the like. Such information regarding the imaging direction may or may not be included in the metadata of the captured image data. For example, it may be supplied to the accumulation server 102 as data different from the captured image data and metadata. Further, for example, the accumulation server 102 may compare the captured image with a reference image prepared in advance to specify the imaging direction. The accumulation server 102 performs analysis with reference to the imaging direction specified in this way.
  • Zoom level: This is information related to the zoom setting (angle-of-view setting) used when the imaging apparatus 101 performs imaging. Such information regarding the zoom level may or may not be included in the metadata of the captured image data. For example, it may be supplied to the accumulation server 102 as data different from the captured image data and metadata. Further, for example, the accumulation server 102 may compare the captured image with a reference image prepared in advance to specify the zoom level. The accumulation server 102 performs analysis with reference to the zoom level specified in this way.
  • The information that the imaging state analysis unit 183 refers to at the time of analysis is arbitrary. Any one of the various types of information described above may be referred to, or information other than the above may be referred to. A plurality of pieces of information may also be referred to.
  • the imaging state analysis unit 183 may refer to venue information that is information about an event venue where imaging is performed.
  • The content of the venue information is arbitrary, but it may include, for example, imaging target positional relationship information indicating the positional relationship of various areas in the event venue, venue shape information indicating the shape and layout of the event venue, and the like.
  • The imaging target positional relationship information is information indicating, for example, the positional relationship of an imaging position area, which is an area in the event venue where the imaging apparatus 101 can be placed, an imaging target area, which is an area that can be an imaging target, and an imaging target detail area, which is an area where a specific imaging target exists.
  • FIG. 9 shows an example of an event such as an athletic meet held in the school yard.
  • The imaging target positional relationship information 250 shown in FIG. 9 indicates an imaging position area 251, an imaging target area 252, an imaging target detail area 253-1, an imaging target detail area 253-2, an imaging target detail area 253-3, and the like.
  • The imaging target detail area 253-1, the imaging target detail area 253-2, and the imaging target detail area 253-3 are referred to as the imaging target detail area 253 when it is not necessary to distinguish between them. To what extent imaging target detail areas 253 are assumed in advance depends on the nature of the event.
  • By using the imaging target positional relationship information 250, the positions of the imaging device 101 and the imaging target in the venue can be limited to some extent, so the imaging position, the position of the imaging target, the imaging direction, and the like can be specified more easily.
  • The imaging target positional relationship information 250 includes the positions, imaging directions, and the like of the imaging device 261-1 and the imaging device 261-2 that capture the imaging target region 252 and generate a reference image.
  • Venue shape information is information about the shape of the venue. For example, the size, shape, coordinates, etc. of the venue are shown. Further, the layout and coordinates of the seat may be shown. By using such venue shape information, it is possible to specify an imaging position, an imaging target position, an imaging direction, and the like in the venue. For example, when coordinates are indicated as the imaging position, the imaging position in the venue can be specified by comparing the coordinates with the coordinates of the venue shape information. Further, for example, when a seat number is indicated as the imaging position, the imaging position (seat position) in the venue can be specified by comparing the seat number with the seating chart of the venue shape information.
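  • As an illustration of the idea just described, the following sketch shows one possible way to resolve an imaging position inside the venue from venue shape information: raw coordinates are checked against the venue outline, or a seat number is looked up in a seating chart. The data layout (bounds, seating_chart) is an assumption made only for this example.

```python
# Hedged sketch: resolve an imaging position inside the venue either from coordinates
# (checked against the venue bounds) or from a seat number (looked up in a seating chart).
def resolve_imaging_position(position_info, venue_shape):
    """position_info: {"coords": (x, y, z)} or {"seat": "A-12"}.
    venue_shape: {"bounds": ((xmin, ymin), (xmax, ymax)),
                  "seating_chart": {"A-12": (x, y, z), ...}}."""
    if "coords" in position_info:
        x, y, z = position_info["coords"]
        (xmin, ymin), (xmax, ymax) = venue_shape["bounds"]
        if xmin <= x <= xmax and ymin <= y <= ymax:
            return (x, y, z)              # coordinates fall inside the venue
        return None                       # outside the venue: cannot resolve
    if "seat" in position_info:
        return venue_shape["seating_chart"].get(position_info["seat"])
    return None
```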
  • step S141 in FIG. 8 the imaging state analysis unit 183 analyzes the imaging state such as the imaging position, the imaging target, and the imaging direction as viewed from the imaging target from the above-described various information.
  • the imaging state analysis unit 183 acquires data to be analyzed (for example, metadata of the captured image data to be analyzed) from the captured image database 182 in step S161. In step S162, the imaging state analysis unit 183 determines whether the metadata includes imaging position information indicating the imaging position. If it is determined that it is included, the process proceeds to step S163.
  • In step S163, the imaging state analysis unit 183 specifies the imaging position in the "imaging target positional relationship", that is, the imaging position in the event venue, based on the information related to the imaging position (for example, GPS information) included in the metadata. Further, the venue information may be referred to. For example, the imaging state analysis unit 183 may identify the imaging position in the event venue by comparing coordinates indicating the imaging position with the venue shape information (the layout and coordinates of the venue). When the process of step S163 ends, the process proceeds to step S165.
  • step S164 the imaging state analysis unit 183 specifies the imaging position in the “imaging target position relationship”, that is, the imaging position in the event hall, based on various information.
  • the imaging state analysis unit 183 specifies the imaging position based on imaging position information (for example, coordinates and seat number) input by the photographer or the like.
  • the venue information may be referred to.
  • the imaging state analysis unit 183 may identify the imaging position in the event venue by comparing the captured image with a reference image whose imaging position and imaging direction are known.
  • step S165 the imaging state analysis unit 183 determines whether or not imaging target position information indicating the position of the imaging target is included in the metadata. If it is determined that it is included, the process proceeds to step S166.
  • In step S166, the imaging state analysis unit 183 specifies the position (or range) of the imaging target in the "imaging target positional relationship", that is, the position (or range) of the imaging target in the event venue, based on the information related to the position of the imaging target included in the metadata (for example, the imaging direction and the distance to the imaging target).
  • For example, the imaging state analysis unit 183 identifies the imaging target (an object, a person, or a region) based on information such as the imaging direction and the distance to the imaging target included in the captured image data and metadata, and specifies the position of that imaging target in the event venue. Further, the venue information may be referred to.
  • the imaging state analysis unit 183 identifies the position (or range) of the imaging target in the event venue by comparing the imaging direction and the distance to the imaging target with venue shape information (layout and coordinates in the venue). You may do it.
  • When the process of step S166 ends, the process proceeds to step S168.
  • step S167 the imaging state analysis unit 183 specifies the position (or range) of the imaging target in the “imaging target positional relationship” based on various information.
  • For example, the imaging state analysis unit 183 identifies the position (or range) of the imaging target, that is, the position (or range) of the imaging target in the event venue, based on imaging target position information (for example, coordinates) input by the photographer or the like. Further, the venue information may be referred to.
  • the imaging state analysis unit 183 may identify the position of the imaging target in the event venue by comparing the captured image with a reference image whose imaging position and imaging direction are known.
  • In step S168, the imaging state analysis unit 183 specifies the positional relationship between the imaging apparatus 101 and the imaging target based on the imaging position specified by the process of step S163 or step S164 and the position (or the center of the range) of the imaging target specified by the process of step S166 or step S167, and further specifies the imaging direction viewed from the imaging target from that positional relationship. Further, the venue information may be referred to. For example, the imaging state analysis unit 183 may identify the position of the imaging target in the event venue by comparing the captured image with a reference image whose imaging position and imaging direction are known.
  • The captured image database 182 stores and manages the information obtained as described above, that is, the information indicating the imaging position, the position (or range) of the imaging target, and the imaging direction viewed from the imaging target, as metadata of the captured image data to be processed.
  • step S170 the imaging state analysis unit 183 determines whether or not all captured image data has been analyzed. If it is determined that there is unprocessed captured image data, the process proceeds to step S171. In step S171, the imaging state analysis unit 183 updates the analysis target to the next captured image data (changes the analysis target). When the process of step S171 ends, the process returns to step S161, and the subsequent processes are repeated.
  • the integration server 102 analyzes the imaging state of all the captured image data stored in the captured image database 182 by executing each process of step S161 to step S171 on each captured image data. If it is determined in step S170 that all captured image data has been analyzed, the analysis process ends, and the process returns to FIG.
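  • The per-image analysis loop (steps S161 to S171) can be pictured roughly as follows. This Python sketch takes the imaging position and the imaging-target position from metadata when present, otherwise calls estimation helpers (left as placeholders here), derives the imaging direction as seen from the imaging target, and stores the result back as metadata; the database interface and helper names are hypothetical.

```python
# Rough sketch of the analysis loop over all captured image data (steps S161-S171).
import math

def estimate_imaging_position(record, venue_info):
    # Placeholder: the text describes comparing the captured image with reference
    # images or markers; that logic is not implemented in this sketch.
    raise NotImplementedError

def estimate_target_position(record, cam, venue_info):
    raise NotImplementedError

def analyze_all(captured_image_db, venue_info):
    for record in captured_image_db.all_records():            # hypothetical DB API
        meta = record.metadata
        cam = meta.get("imaging_position") or estimate_imaging_position(record, venue_info)
        target = meta.get("imaging_target_position") or estimate_target_position(record, cam, venue_info)
        # Imaging direction seen from the imaging target = direction of the camera
        # position as viewed from the target position.
        dx, dy, dz = (cam[i] - target[i] for i in range(3))
        xy_angle = math.degrees(math.atan2(dx, dy)) % 360      # clockwise from the drawing's "up" axis
        z_angle = math.degrees(math.atan2(dz, math.hypot(dx, dy)))
        meta.update({"imaging_position": cam,
                     "imaging_target_position": target,
                     "imaging_direction_from_target": (xy_angle, z_angle)})
        captured_image_db.store_metadata(record.id, meta)      # hypothetical DB API
```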
  • EndX (Y) indicates the end time (for example, UTC time).
  • xxX (Y), yyX (Y), and zzX (Y) are coordinates in the “imaging target positional relationship” (FIG. 9)
  • xyX(Y) and zX(Y) are the imaging direction (the direction of the camera position) viewed from those coordinates.
  • The coordinate origin and the reference direction of the angles are determined according to the "imaging target positional relationship". For example, in the example of FIG. 9, the center is the coordinate origin, the reference of xyX(Y) is the upward direction of the drawing, and the reference of zX(Y) is the horizontal plane.
  • the numerical value representing the coordinates is an integer value in millimeters.
  • the playlist generation unit 184 groups the captured image data for each imaging target detail area in step S191.
  • the playlist generation unit 184 generates a multi-view viewing playlist for each group, that is, for each imaging target detail area.
  • the playlist generation process ends, and the process returns to FIG.
  • the imaging target detail area 253 may be set in addition to the imaging target area 252 as a part of the imaging target positional relationship information 250 in advance.
  • the imaging target detail area 253 may be formed automatically from the metadata given to each video as a result of specifying the imaging target as described above.
  • three types of grouping are described. It is assumed that the imaging target detail area 253 is defined as a circle (or a hemisphere), and the position expression on the playlist is based on the center position.
  • the imaging target detail area 282 includes all the imaging target positions of the imaging devices 101-1 to 101-8. That is, in this case, all of the captured image data captured by the imaging devices 101-1 to 101-8 are grouped into one group.
  • each imaging device 101 performs imaging at the same position and in the same direction as in FIG. In this case, as shown in FIG. 15, since the imaging targets of the imaging device 101-4, the imaging device 101-6, and the imaging device 101-8 are in the vicinity of each other, an imaging target detail area 284-1 may be set so as to include all of these imaging targets, and grouping may be performed by the imaging target detail area 284-1.
  • Similarly, for other imaging devices whose imaging targets are in the vicinity of each other, an imaging target detail area 284-2 may be set so as to include all of those imaging targets, and grouping may be performed by the imaging target detail area 284-2. Note that the size of an imaging target detail area set in this way is arbitrary. As in the example of FIG. 15, when a plurality of imaging target detail areas are set, their sizes do not have to be unified (an imaging target detail area with a size different from the others can be set).
  • In (1) and (2), grouping is performed based on a pre-designated imaging target detail area.
  • In (3), captured image data whose imaging target positions fall within a certain range of each other are grouped together.
  • Such grouping of captured image data is performed at certain time intervals, for example for every small event within the entire event. In that case, a given imaging device 101 does not always capture one imaging target (imaging target detail area) during that entire time; if it captures the area targeted for grouping even during part of that period, its captured image data is included in the group. In that case, an identifier of the group it belongs to in each predetermined period is added as metadata to each captured image data. In this way, group information (information indicating a group of captured image data) is generated. An example of this group information is shown in FIG.
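  • The containment test used for cases (1) and (2), applied per time section as just described, might look like the following sketch; the shapes of the section and area records are assumptions for illustration.

```python
# Illustrative sketch: a time section of a captured image is assigned to a group when
# its imaging-target position falls inside that group's circular imaging target detail area.
import math

def assign_sections_to_groups(sections, detail_areas):
    """sections: list of {"image_id", "start", "end", "target_pos": (x, y, z)}.
    detail_areas: list of {"group_id", "center": (cx, cy, cz), "radius": r}."""
    group_info = []
    for sec in sections:
        x, y, _ = sec["target_pos"]
        for area in detail_areas:
            cx, cy, _ = area["center"]
            if math.hypot(x - cx, y - cy) <= area["radius"]:
                group_info.append({"image_id": sec["image_id"],
                                   "gpref": area["group_id"],
                                   "start": sec["start"],
                                   "end": sec["end"]})
                break  # a section belongs to at most one group in this sketch
    return group_info
```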
  • The group element is a preset imaging target detail area (the imaging target detail area 282 in FIG. 13 and the imaging target detail area 253 in FIG. 14), or an imaging target detail area formed from the metadata of each captured image data (for example, the imaging target detail area 284-1 and the imaging target detail area 284-2 in FIG. 15).
  • cx(n), cy(n), and xz(n) indicate the coordinates of the center of the circular area (that is, the imaging target detail area) that includes the imaging target positions of the grouped captured image data, and radius indicates its radius.
  • the gpref attribute of object_group indicates the id of the ⁇ group> element to which each captured image data belongs as a result of grouping.
  • startX (Y) and endX (Y) indicate the start and end time (for example, UTC time) of the section in which the captured image data belongs to each group.
  • The s-angle values (xyX(Y), zX(Y)) indicate the imaging direction (the direction of the imaging position: the angle on the horizontal plane and the angle above or below the horizontal plane) as seen from the center of the imaging target detail area of the group. Therefore, they do not necessarily match the value of the angle attribute given in advance to the <period(n)> element of each captured image data.
  • the <period> element and the <object_group> element described above are child elements of the same <photographing data> element and can be described together, but are omitted here for simplification.
  • step S212 the playlist generation unit 184 acquires the metadata of the image (i) to be processed (for example, the metadata of the i-th registered captured image data) from the captured image database 182.
  • step S213 the playlist generation unit 184 determines whether there is an imaging target detail area including the coordinates of the jth section. If it is determined that it exists, the process proceeds to step S214.
  • step S214 the playlist generation unit 184 assigns the jth section of the image (i) to the corresponding group.
  • the process proceeds to step S216. If it is determined in step S213 that there is no imaging target detailed area including the coordinates of the jth section, the process proceeds to step S215.
  • step S215 the playlist generation unit 184 records the position information of the jth section of the image (i) as the other imaging target (n). When the process of step S215 ends, the process proceeds to step S216.
  • step S216 the playlist generation unit 184 determines whether the value of the variable j is the number of sections of the image (i). If it is determined that the value of the variable j has not reached the number of sections of the image (i), the process proceeds to step S217.
  • step S217 the playlist generation unit 184 increments the values of the variable j and the variable n by “+1” (j ++, n ++). When the process of step S217 ends, the process returns to step S213, and the subsequent processes are repeated. As described above, the processing from step S213 to step S217 is repeatedly executed, and when it is determined in step S216 that the value of the variable j has reached the number of sections of the image (i), the processing proceeds to step S218.
  • step S218 the playlist generation unit 184 determines whether the value of the variable i is the number of captured image data. If it is determined that the value of the variable i has not reached the number of captured image data, the process proceeds to step S219.
  • In step S219, the playlist generation unit 184 increments the value of the variable i by "+1" (i++). When the process of step S219 ends, the process returns to step S212, and the subsequent processes are repeated. As described above, the processing from step S212 to step S219 is repeatedly executed, and when it is determined in step S218 that the value of the variable i has reached the number of captured image data, the processing proceeds to FIG. 18.
  • step S221 of FIG. 18 the playlist generation unit 184 groups other target positions (1) to (n) in the vicinity range and sets an unspecified imaging target detail area.
  • step S223 the playlist generation unit 184 acquires the metadata of the image (i) to be processed (for example, the metadata of the captured image data registered for the i-th) from the captured image database 182.
  • In step S224, the playlist generation unit 184 determines whether or not the jth section of the image (i) has already been assigned to a group. If it is determined that it has not been assigned, the process proceeds to step S225.
  • step S225 the playlist generation unit 184 determines whether there is an unspecified imaging target detail area including the coordinates of the jth section. If it is determined that it exists, the process proceeds to step S226.
  • step S226 the playlist generation unit 184 assigns the jth section of the image (i) to the corresponding group. When the process of step S226 ends, the process proceeds to step S227.
  • If it is determined in step S224 that the jth section has already been assigned to a group, the process proceeds to step S227. If it is determined in step S225 that there is no unspecified imaging target detail area including the coordinates of the jth section, the process also proceeds to step S227.
  • step S227 the playlist generation unit 184 determines whether or not the value of the variable j is the number of sections of the image (i). If it is determined that the value of the variable j has not reached the number of sections of the image (i), the process proceeds to step S228. In step S228, the playlist generation unit 184 increments the value of the variable j by “+1” (j ++). When the process of step S228 ends, the process returns to step S224 and the subsequent processes are repeated. As described above, the processing from step S224 to step S228 is repeatedly executed, and when it is determined in step S227 that the value of the variable j has reached the number of sections of the image (i), the processing proceeds to step S229.
  • step S229 the playlist generation unit 184 determines whether the value of the variable i is the number of captured image data. If it is determined that the value of the variable i has not reached the number of captured image data, the process proceeds to step S230. In step S230, the playlist generation unit 184 increments the value of the variable i by “+1” (i ++). When the process of step S230 ends, the process returns to step S223, and the subsequent processes are repeated. As described above, the processing from step S223 to step S230 is repeatedly executed. When it is determined in step S229 that the value of the variable i has reached the number of captured image data, the image grouping processing ends, and the processing is as shown in FIG. Return to.
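  • For the second pass of FIG. 18 (grouping the remaining imaging target positions in a vicinity range into unspecified imaging target detail areas), one possible interpretation is the simple proximity clustering sketched below; the single-link clustering and the vicinity_radius parameter are assumptions, not a method prescribed by the present technology.

```python
# Complementary sketch: cluster imaging-target positions left unassigned by the first
# pass into "unspecified" circular imaging target detail areas; their sections can then
# be assigned with the same containment test as in the previous sketch.
import math

def cluster_leftover_targets(leftover_positions, vicinity_radius):
    """Greedy single-link clustering of (x, y) target positions; returns circular
    unspecified detail areas as {"center": (cx, cy), "radius": r} dictionaries."""
    clusters = []
    for pos in leftover_positions:
        for cluster in clusters:
            if any(math.dist(pos, p) <= vicinity_radius for p in cluster):
                cluster.append(pos)
                break
        else:
            clusters.append([pos])
    areas = []
    for cluster in clusters:
        cx = sum(p[0] for p in cluster) / len(cluster)
        cy = sum(p[1] for p in cluster) / len(cluster)
        radius = max(math.dist((cx, cy), p) for p in cluster) or vicinity_radius / 2
        areas.append({"center": (cx, cy), "radius": radius})
    return areas
```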
  • When the grouping of the captured image data has been performed as described above, the playlist generation unit 184 generates a multi-view viewing playlist for each group by the process of step S192 in FIG. For example, in the case of (3) above, the videos belonging to a video group are selected by the group identifier (the gpref attribute value of object_group) added to each captured image data in the video metadata 271 of FIG.
  • the imaging direction in the section which is imaging the imaging target detailed area of the group of each video is determined.
  • The direction in which each captured image is captured (the imaging direction) is expressed as the direction viewed from the center of the area.
  • As the direction reference (the 0-degree direction), an orientation based on the earth's axis (a compass bearing) can be used, but the reference can also be defined on the drawing of the imaging target positional relationship information.
  • For example, the imaging direction of each captured image on the horizontal plane is expressed as a clockwise angle, with the upward direction in FIG. 9 taken as 0 degrees.
  • In the example described above, the imaging positions of all the captured images are within a certain height range from the plane; however, when the imaging positions are arranged more three-dimensionally, the information indicating the imaging direction may be expressed by two values: an angle on the horizontal plane and an angle from the horizontal plane in the up-down direction.
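To make the pieces referred to in this passage concrete, the following is a minimal, hypothetical sketch of a per-video metadata fragment after grouping. Only the gpref attribute of object_group and the radius attribute of the group element are actually named in this description; the element nesting, the remaining attribute names, and all values are illustrative assumptions rather than the actual format of the video metadata 271.

```xml
<!-- Hypothetical per-video metadata fragment after grouping.
     Only object_group@gpref and group@radius are named in the text;
     everything else here is an illustrative assumption. -->
<video_metadata video="cam_A_clip_01">
  <!-- Group generated for an imaging target detail area: center coordinates
       on the imaging target positional relationship drawing and its radius -->
  <group id="g1" position="120,45,0" radius="5"/>
  <!-- Section of this video assigned to the group; gpref points at the group id.
       start/duration mark the section (hypothetical attribute names). -->
  <object_group gpref="g1" start="190" duration="80"/>
</video_metadata>
```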
  • <Multi-view viewing playlist> A multi-view viewing playlist is generated as described above. This multi-view viewing playlist may be generated for the entire event or for each small event. When a playlist is generated for the entire event and the above-described group configuration changes for each small event, the playlist may be divided in time for each small event.
  • An example of the multi-view viewing playlist is shown in FIG. 20.
  • the multi-view viewing playlist 321 shown in FIG. 20 is described in XML (Extensible Markup Language) format.
  • This multi-view viewing playlist 321 includes, for example, the following elements and attributes as information that can be used for reproducing the captured image data (information that can be used to select the captured image data to be reproduced).
  • MultiviewPlaylist: Indicates the multi-view viewing playlist.
  • BaseURL: Indicates the base URL from which the captured image data is acquired.
  • Period@start: Indicates the imaging start time of the section (small event time).
  • Period@duration: Indicates the length of the section (small event time).
  • ObjectGroup@position: Indicates the position (X, Y, Z coordinates) of the imaging target in the imaging target positional relationship drawing.
  • Representation@angle: Indicates the imaging angle (imaging direction) of this captured image as viewed from the imaging target. Here, xy represents the angle on the horizontal plane, and z represents the angle from the horizontal plane.
  • URL: Indicates the URL from which each captured image data is acquired. It is used in combination with the base URL specified by MultiviewPlaylist.BaseURL.
  • Inactive: Specifies the times at which the captured image data is not capturing the imaging target (that is, the periods not included in the captured image data), separated by commas. These are specified relative to Time@start, based on the timescale.
  • In the multi-view viewing playlist 321, the information (Representation) that can be used for reproducing each captured image data obtained by imaging each imaging target is grouped for each imaging target (ObjectGroup), for a plurality of imaging targets. These pieces of information are generated for a predetermined period (duration) such as a small event, and include information indicating the start time (Period@start) and the length (Period@duration) of the period. Furthermore, the multi-view viewing playlist 321 includes, for a plurality of periods (the event), a list of information that can be used for searching for, selecting, and reproducing the captured image data for each predetermined period (small event).
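Although FIG. 20 itself is not reproduced here, the structure described by the elements and attributes listed above can be pictured with a minimal, hypothetical sketch such as the following. Only the element and attribute names follow the description above; the nesting, the value formats, the timescale placement, and the example URLs are illustrative assumptions.

```xml
<!-- Hypothetical sketch of a multi-view viewing playlist (not FIG. 20 itself). -->
<MultiviewPlaylist>
  <BaseURL>http://example.com/event123/</BaseURL>
  <!-- One Period per small event: start time and length of the section -->
  <Period start="2017-06-10T14:00:00" duration="PT15M" timescale="1000">
    <!-- One ObjectGroup per imaging target; position = X,Y,Z coordinates
         on the imaging target positional relationship drawing -->
    <ObjectGroup position="120,45,0">
      <!-- One Representation per captured image of this target.
           angle: xy = angle on the horizontal plane, z = angle from it -->
      <Representation angle="xy=30,z=10">
        <!-- Resolved against MultiviewPlaylist.BaseURL -->
        <URL>camA/video1.mp4</URL>
        <!-- Comma-separated periods not included in this captured image data,
             relative to the start time and based on the timescale
             (the range notation is an assumption) -->
        <Inactive>120000-180000,300000-330000</Inactive>
      </Representation>
      <Representation angle="xy=210,z=0">
        <URL>camB/video7.mp4</URL>
      </Representation>
    </ObjectGroup>
  </Period>
</MultiviewPlaylist>
```

A playback client would scan the ObjectGroup entries for the imaging target it is interested in, pick the Representation whose angle best matches the desired viewpoint, and resolve its URL against the base URL.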
  • The multi-view viewing playlist can also be expressed as an extension of the MPEG-DASH MPD.
  • Examples are shown in FIGS. 21 to 23.
  • The contents of the multi-view viewing playlist 331 shown in FIG. 21, the multi-view viewing playlist 332 shown in FIG. 22, and the multi-view viewing playlist 333 shown in FIG. 23 are the same.
  • In these examples, a <MultiviewGroup> element, which includes the information indicating the imaging direction viewed from the imaging target listed for each area to be imaged, is newly defined, and by enumerating the <AdaptationSet> elements belonging to the group in the <views> element of that element, the group of Adaptation Sets to be referenced for each imaging target (area) can be indicated.
  • Each <views> element has the coordinates of the center of the target imaging area (the position attribute) and the r attribute representing its range, that is, its radius.
  • The value of the r attribute is the value of the radius attribute of the <group> element generated as part of each video's metadata.
  • Since each Adaptation Set has a plurality of Representations with different encodings and bit rates generated from the same captured image data, adaptive bit rate streaming playback of each image captured from a plurality of angles is possible.
  • Each <MultiviewGroup> element can refer to a different Adaptation Set.
  • The Inactive element is described as a child element (or attribute) of the Adaptation Set, for a section in which the imaging target of the specific MultiviewGroup is not shown.
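As a rough illustration of this MPD extension (FIGS. 21 to 23 are not reproduced here), a hypothetical fragment might look like the following. The MultiviewGroup and views elements and the position, r, and Inactive names come from the description above; how the Adaptation Sets are enumerated from views (here, by id through a view child element), the angle notation, and all values are assumptions made for illustration.

```xml
<!-- Hypothetical MPD fragment extended with MultiviewGroup (not FIGS. 21 to 23 themselves). -->
<MPD xmlns="urn:mpeg:dash:schema:mpd:2011" type="static">
  <!-- Newly defined grouping per imaging target (area) -->
  <MultiviewGroup>
    <!-- position: center coordinates of the target imaging area, r: its radius
         (taken from the radius attribute of the group element in the video metadata) -->
    <views position="120,45,0" r="5">
      <!-- Each entry carries the imaging direction seen from the target and
           points at the Adaptation Set to be referenced (assumed referencing style) -->
      <view angle="xy=30,z=10" adaptationSet="as1"/>
      <view angle="xy=210,z=0" adaptationSet="as2"/>
    </views>
  </MultiviewGroup>
  <Period start="PT0S" duration="PT15M">
    <AdaptationSet id="as1" mimeType="video/mp4">
      <!-- Section in which this source does not show the target of this MultiviewGroup -->
      <Inactive>120-180</Inactive>
      <!-- Multiple Representations with different bit rates generated from the same
           captured image data enable adaptive bit rate streaming playback -->
      <Representation id="as1-hi" bandwidth="4000000"><BaseURL>camA_hi.mp4</BaseURL></Representation>
      <Representation id="as1-lo" bandwidth="1000000"><BaseURL>camA_lo.mp4</BaseURL></Representation>
    </AdaptationSet>
    <AdaptationSet id="as2" mimeType="video/mp4">
      <Representation id="as2-hi" bandwidth="4000000"><BaseURL>camB_hi.mp4</BaseURL></Representation>
    </AdaptationSet>
  </Period>
</MPD>
```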
  • the multi-view viewing playlist database 185 stores and manages the multi-view viewing playlist.
  • The captured image data stored in the captured image database 182 is shared by the terminal device 103 (its user). That is, the accumulation server 102 provides captured image data to the terminal device 103. At that time, the accumulation server 102 provides the above-described multi-view viewing playlist to the terminal device 103 and has the captured image data selected based on that playlist. In other words, the terminal device 103 (the user) selects desired captured image data based on the multi-view viewing playlist supplied from the accumulation server 102 and requests it from the accumulation server 102.
  • For example, in step S251, the playlist providing unit 186 of the accumulation server 102 acquires the multi-view viewing playlist from the multi-view viewing playlist database 185 and supplies it to the terminal device 103 via the communication unit 164.
  • the playlist acquisition unit 231 of the terminal device 103 controls the communication unit 214 to acquire the multi-view viewing playlist in step S261.
  • the exchange of the multi-view viewing playlist may be performed at an arbitrary timing or may be performed based on a request from the terminal device 103.
  • For example, the playlist providing unit 186 may select a desired multi-view viewing playlist and supply it to the terminal device 103.
  • the playlist providing unit 186 may select a multi-view viewing playlist recommended for the terminal device 103 (user) and supply the playlist to the terminal device 103.
  • Alternatively, the terminal device 103 (the user) may request a desired multi-view viewing playlist, and the playlist providing unit 186 may read the requested multi-view viewing playlist from the multi-view viewing playlist database 185 and supply it to the terminal device 103.
  • In step S262, the image selection processing unit 232 controls the output unit 212 to display, on the monitor, a GUI (Graphical User Interface) that uses the multi-view viewing playlist supplied from the accumulation server 102, and allows the user to select a captured image based on the multi-view viewing playlist.
  • an image selection GUI 350 as shown in FIG. 25 may be displayed.
  • The image selection GUI 350 shows the imaging target detail area 351 in the captured image provided by the user, and each captured image whose imaging target is the imaging target detail area 351 is shown in the imaging direction in which it was captured, as viewed from the center of the imaging target detail area 351. In the example of FIG. 25, the captured image 352 is a captured image uploaded by the user, and the captured image 353 and the captured image 354 are captured images uploaded by others.
  • When the user selects any of the captured images, the image selection processing unit 232 sets the selected captured image as the captured image to be viewed (to be requested from the integrated server 102).
  • Note that a captured image whose imaging target is an area different from the imaging target detail area in the uploaded captured image may also be selected. In that case, for example, an area selection GUI configured as shown in FIGS. 13 and 14 may be displayed, and an image selection GUI like that of FIG. 25 may then be displayed for the selected area. For example, an image selection GUI like that of FIG. 25 may be displayed for the imaging target detail area 282.
  • In step S263, the captured image request unit 233 controls the communication unit 214 to request the captured image data of the selected captured image from the integrated server 102.
  • When the captured image providing unit 187 of the integrated server 102 controls the communication unit 164 and accepts that request in step S252, it reads the captured image data from the captured image database 182 and supplies it to the terminal device 103 via the communication unit 164.
  • In step S263, the captured image acquisition unit 234 of the terminal device 103 controls the communication unit 214 to acquire the captured image data.
  • In step S264, the reproduction unit 235 reproduces the captured image data.
  • As described above, by using the metadata added to the captured image data at the time of imaging together with the positional relationship between the imaging-capable area and the imaging targets prepared in advance, a large number of accumulated captured image data can be classified for each imaging target (a person, an object, or a fixed range around it), and a multi-view viewing playlist can be generated that carries information indicating from which direction each of those captured images captures a specific imaging target (or its center, if the target has a fixed range). Based on this playlist, it becomes possible to easily select a desired image from among the captured images captured by the plurality of imaging devices 101, including the ones captured by the user. Thereby, the value of collecting and sharing the captured image data captured by a large number of viewers can be increased.
  • image data has been described as an example, but information to be collected and shared is arbitrary.
  • audio data may be collected and shared.
  • a plurality of types of data such as audio data and image data may be accumulated and shared. That is, by applying the present technology, the user can more easily select desired content.
  • In the above, the captured image data is grouped for each imaging target detail area; however, this grouping can be performed on any basis as long as it relates to the imaging target. For example, the captured image data may be grouped for each imaging target (a person, an object, or a region).
  • When the series of processes described above is executed by software, each processing unit (the imaging unit 121 to the communication unit 124) of the imaging device 101, the CPU 151 of the integrated server 102, the CPU 201 of the terminal device 103, and the like only need to be configured as a computer that can execute that software. Examples of the computer include a computer incorporated in dedicated hardware and a general-purpose computer capable of executing an arbitrary function by installing various programs.
  • For example, the integrated server 102 has the configuration described with reference to FIG. 3; as described above, the CPU 151 loads a program stored in, for example, the storage unit 163 into the RAM 153 via the input/output interface 160 and the bus 154 and executes it, whereby the above-described series of processing is performed by software.
  • This program can be provided by being recorded on a removable medium 171 as a package medium, for example.
  • the program can be installed in the storage unit 163 via the input / output interface 160 by attaching the removable medium 171 to the drive 165.
  • This program can also be provided via a wired or wireless transmission medium such as a local area network, the Internet, or digital satellite broadcasting.
  • the program can be received by the communication unit 164 and installed in the storage unit 163.
  • this program can be installed in the ROM 152 or the storage unit 163 in advance.
  • In the above, the integrated server 102 has been described as an example, but the same applies to the other devices: they may have a similar configuration, and the program may be executed in the same manner. Note that a part of the series of processes described above can be executed by hardware and the rest by software.
  • the various metadata described above may be transmitted or recorded in any form as long as it is associated with the captured image data.
  • the term “associate” means, for example, that one data can be used (linked) when one data is processed. That is, the data associated with each other may be collected as one data, or may be individual data.
  • the information associated with the captured image data may be transmitted on a transmission path different from the captured image data (or at a different timing). Further, for example, information associated with captured image data may be recorded on a recording medium different from the captured image data (or another recording area of the same recording medium).
  • the “association” may be a part of the data, not the entire data. For example, an image and information corresponding to the image may be associated with each other in an arbitrary unit such as a plurality of frames, one frame, or a part of the frame.
  • The present technology can also be implemented as any configuration that constitutes an apparatus or a system, for example, a processor as a system LSI (Large Scale Integration), a module using a plurality of processors, a unit using a plurality of modules, or a set in which other functions are further added to a unit (that is, a partial configuration of an apparatus).
  • In this specification, the term "system" means a set of a plurality of constituent elements (devices, modules (parts), and the like), and it does not matter whether or not all the constituent elements are in the same housing. Accordingly, a plurality of devices housed in separate housings and connected via a network, and a single device in which a plurality of modules are housed in one housing, are both systems.
  • the configuration described as one device (or processing unit) may be divided and configured as a plurality of devices (or processing units).
  • the configurations described above as a plurality of devices (or processing units) may be combined into a single device (or processing unit).
  • a configuration other than that described above may be added to the configuration of each device (or each processing unit).
  • A part of the configuration of a certain device (or processing unit) may be included in the configuration of another device (or another processing unit).
  • the present technology can take a configuration of cloud computing in which one function is shared and processed by a plurality of devices via a network.
  • the above-described program can be executed in an arbitrary device.
  • the device may have necessary functions (functional blocks and the like) so that necessary information can be obtained.
  • each step described in the above flowchart can be executed by one device or can be executed by a plurality of devices.
  • the plurality of processes included in the one step can be executed by being shared by a plurality of apparatuses in addition to being executed by one apparatus.
  • a plurality of processes included in one step can be executed as a process of a plurality of steps.
  • the processing described as a plurality of steps can be collectively executed as one step.
  • Note that the program executed by the computer may be a program in which the processing of the steps describing the program is executed in time series in the order described in this specification, or a program in which the processing is executed in parallel or individually at a necessary timing, such as when a call is made. That is, as long as no contradiction occurs, the processing of each step may be executed in an order different from the order described above. Furthermore, the processing of the steps describing this program may be executed in parallel with the processing of another program, or may be executed in combination with the processing of another program.
  • An information processing apparatus including a generation unit that generates a multi-view viewing playlist including a list of information that can be used to reproduce a plurality of captured image data generated by imaging the same imaging target from different positions.
  • the information processing apparatus according to (1) wherein the information usable for reproducing the captured image data is information usable for selecting the captured image data to be reproduced.
  • the information processing apparatus according to (1) or (2), wherein information usable for reproducing the captured image data includes information indicating a position of the imaging target.
  • The information processing apparatus described above, wherein the information usable for reproducing the captured image data includes information indicating an imaging direction viewed from the imaging target.
  • The information processing apparatus described above, wherein the information usable for reproducing the captured image data includes information indicating an acquisition destination of the captured image data.
  • information usable for reproducing the captured image data includes information indicating a period not included in the captured image data.
  • The information processing apparatus described above, wherein the multi-view viewing playlist includes, for a plurality of imaging targets, a list, for each imaging target, of information that can be used for reproducing the captured image data.
  • the information processing apparatus according to any one of (1) to (7), wherein the multi-view viewing playlist is generated for a predetermined period and includes information indicating a start time and a length of the period.
  • The information processing apparatus described above, wherein the multi-view viewing playlist includes, for a plurality of periods, a list for each imaging target of information that can be used for reproduction of the captured image data for each predetermined period.
  • the generation unit generates the multi-view viewing playlist as an MPD (Media Presentation Description) of MPEG-DASH (Moving Picture Experts Group phase-Dynamic Adaptive Streaming over HTTP).
  • The information processing apparatus according to any one of (1) to (13), wherein the generation unit groups the captured image data based on the imaging target and generates, for each group, a list of information that can be used for reproduction of the captured image data.
  • The information processing apparatus according to (14), wherein the generation unit groups all the captured image data into one group, groups the captured image data using a preset area, or groups the captured image data according to the position of each imaging target.
  • The information processing apparatus according to any one of (1) to (15), further comprising an analysis unit configured to analyze an imaging state, wherein the generation unit is configured to generate the multi-view viewing playlist based on a result of the analysis by the analysis unit.
  • The information processing apparatus described above, wherein the analysis unit analyzes the state of imaging based on the metadata of the captured image data and information related to the venue.
  • An information processing method for generating a multi-view viewing playlist including a list of information that can be used to reproduce a plurality of captured image data generated by imaging the same imaging target from different positions.
  • 100 image providing system, 101 imaging device, 102 integration server, 103 terminal device, 121 imaging unit, 122 metadata generation unit, 123 metadata addition unit, 124 communication unit, 181 integration unit, 182 captured image database, 183 imaging status analysis unit, 184 playlist generation unit, 185 multi-view viewing playlist database, 186 playlist providing unit, 187 captured image providing unit, 231 playlist acquisition unit, 232 image selection processing unit, 233 captured image request unit, 234 captured image acquisition unit, 235 reproduction unit

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Television Signal Processing For Recording (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The disclosure of the present invention relates to an information processing device and method which enable a user to more easily select desired contents. The present invention enables the generation of a multi-viewpoint viewing playlist which includes a list of information usable in the reproduction of a plurality of captured image data items generated by imaging the same imaging subject from mutually different positions. For example, information usable in the reproduction of captured image data may be information usable in the selection of the captured image data to be reproduced. The disclosed invention can be applied to an information processing device, a communication device, an image distribution server, an encoding device and the like.

Description

情報処理装置および方法Information processing apparatus and method
 本開示は、情報処理装置および方法に関し、特に、ユーザが所望のコンテンツをより容易に選択することができるようにした情報処理装置および方法に関する。 The present disclosure relates to an information processing apparatus and method, and more particularly, to an information processing apparatus and method that allow a user to select desired content more easily.
 従来、ビットレート適応ストリーミングに用いられるMPEG-DASH(Moving Picture Experts Group phase - Dynamic Adaptive Streaming over HTTP)のMPD(Media Presentation Description)があった。MPEG-DASHにおいては、このMPDが、複数の異なる映像ストリームについて統合的に記述する「プレイリスト」に相当する。そして、MPEG-DASHでは、ある1つの映像の中の一部分のみの映像のストリームを元の(全体)映像と関連付ける仕組みとしてSRD(Spatial Relation Description)という拡張が定義されていた(例えば非特許文献1参照)。 Previously, there was MPEG-DASH (Moving Picture Experts Group Phase-Dynamic Dynamic Adaptive Streaming over HTTP) MPD (Media Presentation Description) used for bit rate adaptive streaming. In MPEG-DASH, this MPD corresponds to a “play list” that describes a plurality of different video streams in an integrated manner. In MPEG-DASH, an extension called SRD (Spatial Relation と し て Description) has been defined as a mechanism for associating a partial video stream in one video with the original (whole) video (for example, Non-Patent Document 1). reference).
 しかしながら、MPDでは、異なる方向から撮像された同一の対象(人、物、あるいはその周辺の領域)を記述する方法は規定されていなかった。そのため、ユーザが、所望の撮像対象について、その撮像対象が映っている映像のストリームを、このMPDを用いて選択し、視聴することは困難であった。 However, in MPD, a method for describing the same object (a person, an object, or an area around it) taken from different directions has not been defined. For this reason, it has been difficult for the user to select and view a video stream in which the imaging target is captured using the MPD for the desired imaging target.
 本開示は、このような状況に鑑みてなされたものであり、ユーザが所望のコンテンツをより容易に選択することができるようにするものである。 This disclosure has been made in view of such a situation, and enables a user to select desired content more easily.
 本技術の一側面は、同一の撮像対象を互いに異なる位置から撮像して生成された複数の撮像画像データの再生に利用可能な情報のリストを含む多視点鑑賞プレイリストを生成する生成部を備える情報処理装置である。 One aspect of the present technology includes a generation unit that generates a multi-view viewing playlist including a list of information that can be used to reproduce a plurality of captured image data generated by imaging the same imaging target from different positions. Information processing apparatus.
 前記撮像画像データの再生に利用可能な情報は、再生する前記撮像画像データの選択に利用可能な情報であるようにすることができる。 The information that can be used to reproduce the captured image data can be information that can be used to select the captured image data to be reproduced.
 前記撮像画像データの再生に利用可能な情報は、前記撮像対象の位置を示す情報を含むようにすることができる。 The information that can be used to reproduce the captured image data can include information indicating the position of the imaging target.
 前記撮像画像データの再生に利用可能な情報は、前記撮像対象からみた撮像方向を示す情報を含むようにすることができる。 The information that can be used for reproduction of the captured image data can include information indicating an imaging direction viewed from the imaging target.
 前記撮像画像データの再生に利用可能な情報は、前記撮像画像データの取得先を示す情報を含むようにすることができる。 The information that can be used for reproduction of the captured image data can include information indicating the acquisition destination of the captured image data.
 前記撮像画像データの再生に利用可能な情報は、前記撮像画像データに含まれない期間を示す情報を含むようにすることができる。 The information usable for reproduction of the captured image data can include information indicating a period not included in the captured image data.
 前記多視点鑑賞プレイリストは、前記撮像画像データの再生に利用可能な情報の撮像対象毎のリストを、複数の撮像対象について含むようにすることができる。 The multi-view viewing playlist may include a list for each imaging target of information that can be used to reproduce the captured image data for a plurality of imaging targets.
 前記多視点鑑賞プレイリストは、所定の期間について生成され、前記期間の開始時刻と長さを示す情報を含むようにすることができる。 The multi-view viewing playlist may be generated for a predetermined period and include information indicating a start time and a length of the period.
 前記多視点鑑賞プレイリストは、所定の期間毎の、前記撮像画像データの再生に利用可能な情報の撮像対象毎のリストを、複数の期間について含むようにすることができる。 The multi-view viewing playlist may include a list for each imaging target of information that can be used for reproducing the captured image data for each predetermined period for a plurality of periods.
 前記生成部は、前記多視点鑑賞プレイリストを、MPEG-DASH(Moving Picture Experts Group phase - Dynamic Adaptive Streaming over HTTP)のMPD(Media Presentation Description)として生成するようにすることができる。 The generation unit may generate the multi-view viewing playlist as an MPEG-DASH (Moving Picture Experts Group phase-Dynamic Dynamic Adaptive Streaming over HTTP) MPD (Media Presentation Description).
 前記多視点鑑賞プレイリストは、撮像対象となる領域毎にリスト化された、撮像対象からみた撮像方向を示す情報を含むようにすることができる。 The multi-view appreciation playlist can include information indicating the imaging direction viewed from the imaging target, listed for each area to be imaged.
 前記多視点鑑賞プレイリストは、前記撮像対象となる領域の中心位置および半径を示す情報を含むようにすることができる。 The multi-view viewing playlist can include information indicating a center position and a radius of the area to be imaged.
 前記多視点鑑賞プレイリストは、各撮像画像データに関する情報を、AdaptationSetで管理するようにすることができる。 The multi-view appreciation playlist can manage information on each captured image data with AdaptationSet.
 前記生成部は、撮像対象に基づいて前記撮像画像データをグループ化し、グループ毎に前記撮像画像データの再生に利用可能な情報のリストを生成することができる。 The generation unit can group the captured image data based on the imaging target and generate a list of information that can be used for reproduction of the captured image data for each group.
 前記生成部は、全ての撮像画像データを1グループにまとめるか、予め設定された領域を用いて、前記撮像画像データをグループ化するか、または、各撮像対象の位置に応じて、前記撮像画像データをグループ化することができる。 The generation unit collects all the captured image data into one group, groups the captured image data using a preset area, or selects the captured image according to the position of each imaging target. Data can be grouped.
 撮像の状況を解析する解析部をさらに備え、前記生成部は、前記解析部による解析の結果に基づいて、前記多視点鑑賞プレイリストを生成するように構成されるようにすることができる。 The image processing apparatus may further include an analysis unit that analyzes an imaging state, and the generation unit may be configured to generate the multi-view viewing playlist based on a result of analysis by the analysis unit.
 前記解析部は、解析の結果、撮像に関する情報、撮像対象に関する情報、撮像対象からみた撮像方向に関する情報を得ることができる。 As a result of the analysis, the analysis unit can obtain information related to imaging, information related to the imaging target, and information related to the imaging direction viewed from the imaging target.
 前記解析部は、撮像画像データのメタデータと会場に関する情報とに基づいて、撮像の状況を解析することができる。 The analysis unit can analyze the imaging state based on the metadata of the captured image data and information on the venue.
 前記生成部により生成された前記多視点鑑賞プレイリストを提供する提供部をさらに備えることができる。 A providing unit that provides the multi-view viewing playlist generated by the generating unit can be further provided.
 本技術の一側面は、また、同一の撮像対象を互いに異なる位置から撮像して生成された複数の撮像画像データの再生に利用可能な情報のリストを含む多視点鑑賞プレイリストを生成する情報処理方法である。 One aspect of the present technology is also an information processing that generates a multi-view viewing playlist including a list of information that can be used to reproduce a plurality of captured image data generated by imaging the same imaging target from different positions. Is the method.
 本技術の一側面においては、同一の撮像対象を互いに異なる位置から撮像して生成された複数の撮像画像データの再生に利用可能な情報のリストを含む多視点鑑賞プレイリストが生成される。 In one aspect of the present technology, a multi-view viewing playlist including a list of information that can be used to reproduce a plurality of captured image data generated by imaging the same imaging target from different positions is generated.
 本開示によれば、情報を処理することができる。特に、ユーザが所望のコンテンツをより容易に選択することができる。 According to the present disclosure, information can be processed. In particular, the user can select desired content more easily.
画像提供システムの主な構成例を示す図である。It is a figure which shows the main structural examples of an image provision system. 撮像装置の主な構成例を示す図である。It is a figure which shows the main structural examples of an imaging device. 集積サーバの主な構成例を示すブロック図である。It is a block diagram which shows the main structural examples of an integrated server. 集積サーバが実現する主な機能の例を示す機能ブロック図である。It is a functional block diagram which shows the example of the main functions which an integrated server implement | achieves. 端末装置の主な構成例を示すブロック図である。It is a block diagram which shows the main structural examples of a terminal device. 端末装置が実現する主な機能の例を示す機能ブロック図である。It is a functional block diagram which shows the example of the main functions which a terminal device implement | achieves. 撮像画像集積処理の流れの例を説明するフローチャートである。It is a flowchart explaining the example of the flow of a captured image integration process. 多視点鑑賞プレイリスト生成処理の流れの例を説明するフローチャートである。It is a flowchart explaining the example of the flow of a multiview viewing play list production | generation process. 撮像対象位置関係情報の例を示す図である。It is a figure which shows the example of imaging target positional relationship information. 解析処理の流れの例を説明するフローチャートである。It is a flowchart explaining the example of the flow of an analysis process. 映像メタデータの例を示す図である。It is a figure which shows the example of video metadata. プレイリスト生成処理の流れの例を説明するフローチャートである。It is a flowchart explaining the example of the flow of a play list production | generation process. 撮像対象詳細範囲の設定の様子の例を示す図である。It is a figure which shows the example of the mode of a setting of the imaging target detailed range. 撮像対象詳細範囲の設定の様子の例を示す図である。It is a figure which shows the example of the mode of a setting of the imaging target detailed range. 撮像対象詳細範囲の設定の様子の例を示す図である。It is a figure which shows the example of the mode of a setting of the imaging target detailed range. グループ情報の例を示す図である。It is a figure which shows the example of group information. 画像グループ化処理の流れの例を説明するフローチャートである。It is a flowchart explaining the example of the flow of an image grouping process. 画像グループ化処理の流れの例を説明する図17に続くフローチャートである。It is a flowchart following FIG. 17 explaining the example of the flow of an image grouping process. 撮像方向の設定の様子の例を示す図である。It is a figure which shows the example of the mode of a setting of an imaging direction. 多視点鑑賞プレイリストの例を示す図である。It is a figure which shows the example of a multi-view appreciation play list. MPDを用いた多視点鑑賞プレイリストの例を示す図である。It is a figure which shows the example of the multi-view appreciation play list using MPD. MPDを用いた多視点鑑賞プレイリストの例を示す図である。It is a figure which shows the example of the multi-view appreciation play list using MPD. MPDを用いた多視点鑑賞プレイリストの例を示す図である。It is a figure which shows the example of the multi-view appreciation play list using MPD. 画像提供処理の流れの例を説明するフローチャートである。It is a flowchart explaining the example of the flow of an image provision process. 画像選択用GUIの例を示す図である。It is a figure which shows the example of GUI for image selection.
 以下、本開示を実施するための形態(以下実施の形態とする)について説明する。なお、説明は以下の順序で行う。
 1.画像共有と選択
 2.第1の実施の形態(画像提供システム)
 3.ソフトウエア
 4.その他
Hereinafter, modes for carrying out the present disclosure (hereinafter referred to as embodiments) will be described. The description will be given in the following order.
1. Image sharing and selection First embodiment (image providing system)
3. Software 4. Other
 <1.画像共有と選択>
 従来より、同一のイベント等で撮像された動画や静止画の撮像画像や音声等の情報をSNSやデータ共有サービスを利用して仲間内などで共有することが可能である。このような場合、共有するデータは、イベントの日時、場所等、あるいはイベント内の小イベントの期間ごと等の単位で分類管理されていた。しかしながら、実際にどのような内容であるかをこれらの分類から判断するのは困難であり、ユーザが実際に視聴してみないとわからなかった。つまり、ユーザは所望の情報を得るために各情報を1つ1つ確認しなければならず、煩雑な作業が必要であった。さらに、共有する情報の数が増大すると、その作業量はさらに増大した。
<1. Image sharing and selection>
Conventionally, it is possible to share information, such as moving images and still images captured at the same event, etc., and audio, etc., among friends using SNS and data sharing services. In such a case, the shared data is classified and managed in units such as the date and time of the event, the location, etc., or every period of a small event in the event. However, it is difficult to determine what kind of content is actually from these classifications, and it has been difficult to understand unless the user actually watches. That is, in order to obtain desired information, the user has to check each piece of information one by one, which requires complicated work. Furthermore, as the number of shared information increases, the amount of work further increases.
 なお、撮像画像データには、例えば、撮像場所(位置)、フォーカス距離、ズームレベルなどのメタデータが付加されている場合もあるが、やはりこれらの情報のみでは、どのような被写体が撮像されているかを識別することは困難であった。また、このメタデータは各撮像画像データに付加されており、まとめて管理されていなかった。したがって、ユーザは、このメタデータを利用して所望の撮像画像を特定する場合、ユーザは、各撮像データのメタデータを1つずつ確認する必要があり、煩雑な作業が必要であった。さらに、共有する情報の数が増大すると、その作業量はさらに増大した。 Note that, for example, metadata such as an imaging location (position), a focus distance, and a zoom level may be added to the captured image data. However, what kind of subject is imaged with only this information. It was difficult to identify. Further, this metadata is added to each captured image data and is not managed collectively. Therefore, when the user specifies a desired captured image using the metadata, the user needs to check the metadata of each captured data one by one, and a complicated operation is required. Furthermore, as the number of shared information increases, the amount of work further increases.
 また、一般的には、このようなメタデータは、自身が付加されている撮像画像データに関する情報により構成され、他の撮像画像データに関する情報は含まれていなかった。したがって、例えば被写体が同一である等の撮像画像間の関連性を、このメタデータから特定することは困難であった。したがって、例えば、同一の撮像対象とする他の撮像画像データを特定するといったより複雑な検索を行うことは困難であった。 In general, such metadata is composed of information related to captured image data to which the metadata itself is added, and information related to other captured image data is not included. Therefore, it is difficult to specify the relationship between captured images, for example, the subject is the same, from this metadata. Therefore, for example, it has been difficult to perform a more complicated search such as specifying other captured image data that is the same imaging target.
 なお、複数の映像を統合したコンテンツ提供のフォーマットとしてはDVD-ROM(Digital Versatile Disc - Read Only Memory)やBD-ROM(Blu-ray(登録商標) Disc - Read Only Memory)などのパッケージ・メディアにおいて、例えば同一の区間に対して複数の撮像アングルが異なる映像を提供し、ユーザがその中から1つを選択して視聴するようなコンテンツ形態があった。しかしながら、この場合、再生するデータを順に切り替えているのみであり、例えば、所定の撮像対象を撮像した撮像画像データを検索する等のより複雑な検索を行うことは困難であった。 In addition, as a format for providing content that integrates multiple images, package media such as DVD-ROM (Digital Versatile Disc-Read-Only Memory) and BD-ROM (Blu-ray (registered trademark) Disc-read-Only Memory) For example, there has been a content form in which a plurality of images with different imaging angles are provided for the same section, and the user selects and views one of them. However, in this case, the data to be reproduced is simply switched in order, and it has been difficult to perform a more complex search such as searching for captured image data obtained by imaging a predetermined imaging target.
 また、複数の異なる映像ストリームについて統合的に記述する”プレイリスト”相当のものとして、ビットレート適応ストリーミングに用いられるMPEG-DASH(Moving Picture Experts Group phase - Dynamic Adaptive Streaming over HTTP)のMPD(Media Presentation Description)があった。これまでにある1つの映像の中の一部分のみの映像のストリームを元の(全体)映像と関連付ける仕組みとしてSpatial Relation Description (SRD) という拡張が定義されていた。しかしながら、これにおいても、異なる方向から撮像された同一の対象(人、物、あるいはその周辺の領域)を記述する方法は規定されておらず、より複雑な検索を行うことは困難であった。 In addition, MPEG-DASH (Moving / Picture / Experts / Group / phase / -Dynamic / Adaptive / Streaming / over / HTTP) MPD (Media / Presentation) used for bit rate adaptive streaming is equivalent to a "playlist" that describes a plurality of different video streams in an integrated manner. Description). Up to now, an extension called Spatial Relation Description (SRD) 関 連 付 け る has been defined as a mechanism for associating a partial video stream in one video with the original (whole) video. However, even in this case, a method for describing the same object (a person, an object, or an area around it) taken from different directions is not defined, and it is difficult to perform a more complicated search.
 <2.第1の実施の形態>
  <多視点鑑賞プレイリストの生成>
 そこで、同一の撮像対象を互いに異なる位置から撮像して生成された複数の撮像画像データの再生に利用可能な情報のリストを含む多視点鑑賞プレイリストを生成するようにする。このようにすることにより、撮像画像を視聴するユーザは、その多視点鑑賞プレイリストに基づいて、撮像画像データのより高度な検索を行うことができる。すなわち、ユーザが所望のコンテンツをより容易に選択し、取得することができる。
<2. First Embodiment>
<Generation of multi-view viewing playlist>
Therefore, a multi-view viewing playlist including a list of information that can be used to reproduce a plurality of captured image data generated by imaging the same imaging target from different positions is generated. By doing in this way, the user who views the captured image can perform a more advanced search of the captured image data based on the multi-view viewing playlist. That is, the user can more easily select and acquire desired content.
  <画像提供システム>
 図1は、本技術を適用した画像処理システムの一実施の形態である画像提供システムの主な構成例を示すブロック図である。図1に示される画像提供システム100は、画像共有サービスを提供するシステムである。画像共有サービスは、例えば、コンサート、お芝居、スポーツの試合や大会、地域や学校の行事(例えば、お祭り、式典、フリーマーケット、学芸会、運動会等)等、任意のイベント等において撮像された撮像画像のデータ(撮像画像データ)を、視聴者が共有し、閲覧することができるようにするサービスである。イベントは、時間と場所とが特定可能なものであればどのようなものであってもよく、例えばコンサート、舞台、プロスポーツの試合等のように商業的なものであってもよいし、例えばお祭り、式典、アマチュアスポーツ大会、フリーマーケット、学芸会や運動会等の学校行事のように公共的なものであってもよいし、誕生日会等のように私的なものであってもよい。
<Image provision system>
FIG. 1 is a block diagram illustrating a main configuration example of an image providing system which is an embodiment of an image processing system to which the present technology is applied. An image providing system 100 shown in FIG. 1 is a system that provides an image sharing service. Image sharing services include images taken at any event, such as concerts, plays, sports games and competitions, local and school events (eg festivals, ceremonies, flea markets, school performances, sports events, etc.) This is a service that enables viewers to share and browse image data (captured image data). The event may be anything as long as the time and place can be specified, such as a concert, a stage, a professional sports game, etc. It may be public such as festivals, ceremonies, amateur sports competitions, flea markets, school events such as athletic meet and athletic meet, or private such as birthday parties.
 なお、例えば公園での撮像や観光地での撮像等のように、具体的な行事として成立していなくてもよいが、以下においては説明の便宜上、何らかのイベントにおいて撮像が行われるものとする。また、撮像画像データの提供(配信)は、撮像に対して即時的(所謂リアルタイム(伝送遅延や処理遅延等のタイムラグは無視する))に行われるようにしてもよいし、蓄積されたものが要求に応じて配信される形式にしてもよい(所謂オンデマンド)。 It should be noted that although it does not have to be established as a specific event, such as imaging in a park or sightseeing, for example, it is assumed that imaging is performed at some event for convenience of explanation below. Further, the provision (distribution) of the captured image data may be performed immediately (so-called real time (ignoring time lag such as transmission delay and processing delay)) with respect to imaging, or the accumulated image data may be stored. It may be in a format that is distributed upon request (so-called on-demand).
 この画像共有サービスの提供者は、イベントの運営者と同一であってもよいし、異なっていてもよい。 The provider of this image sharing service may be the same as or different from the event operator.
 撮像者は、イベントの運営者(運営側のスタッフも含む)であってもよいし、画像共有サービスの提供者であってもよいし、それらとは別のコンテンツ放送・配信業者であってもよいし、イベント参加者(演者、選手、観客等)であってもよい。この撮像者は予め画像共有サービスに登録されていてもよいし、されていなくてもよい。つまり、予め会員登録された特定の人物のみが撮像画像データを共有させることができるようにしてもよいし、不特定多数の人物が撮像画像データを共有させることができるようにしてもよい。 The photographer may be an event operator (including staff on the operation side), an image sharing service provider, or a content broadcast / distributor other than those. It may be an event participant (actor, player, spectator, etc.). This photographer may or may not be registered in advance in the image sharing service. That is, only a specific person registered as a member in advance may be allowed to share captured image data, or an unspecified number of persons may be allowed to share captured image data.
 撮像画像データを共有する視聴者は、イベント参加者(演者、選手、観客等)であってもよいし、イベントには参加していない者であってもよい。もちろん、イベント運営者であってもよい。また、撮像者が視聴者となることも可能である。視聴者は予め画像共有サービスに登録されていてもよいし、されていなくてもよい。 The viewer who shares the captured image data may be an event participant (performer, player, spectator, etc.), or may be a person who has not participated in the event. Of course, it may be an event operator. It is also possible for the photographer to be a viewer. The viewer may or may not be registered in advance in the image sharing service.
 図1に示されるように、画像提供システム100は、撮像装置101、集積サーバ102、および端末装置103を有する。撮像装置101や端末装置103は、例えばインターネット等の任意の通信媒体を介して集積サーバ102と通信可能に接続されている。図1においては、撮像装置101、集積サーバ102、端末装置103をそれぞれ1台ずつ示しているが、これらの数はそれぞれ任意であり、複数であってもよい。 As shown in FIG. 1, the image providing system 100 includes an imaging device 101, an integrated server 102, and a terminal device 103. The imaging device 101 and the terminal device 103 are communicably connected to the integrated server 102 via an arbitrary communication medium such as the Internet. In FIG. 1, one imaging device 101, one integrated server 102, and one terminal device 103 are shown, but these numbers are arbitrary and may be plural.
 画像提供システム100は、撮像装置101により撮像された撮像画像のデータ(撮像画像データ)等を、集積サーバ102が集積し、端末装置103がそれを取得して再生するシステムである。つまり、集積サーバ102が、端末装置103(のユーザである視聴者)に対して、撮像装置101(のユーザである撮像者)が撮像した撮像画像データを共有するサービスを提供する。 The image providing system 100 is a system in which captured image data (captured image data) and the like captured by the image capturing apparatus 101 are accumulated by the accumulation server 102, and the terminal apparatus 103 acquires and reproduces it. That is, the integrated server 102 provides a service for sharing the captured image data captured by the imaging device 101 (the user who is the user) to the terminal device 103 (the viewer who is the user).
 撮像装置101は、撮像や通信に関する処理を行う。例えば、撮像装置101は、被写体(撮像対象とも称する)を撮像し、撮像画像のデータである撮像画像データを生成する。また、例えば、撮像装置101は、生成した撮像画像データを集積サーバ102に供給する(所謂アップロードする)。 The imaging device 101 performs processing related to imaging and communication. For example, the imaging device 101 captures a subject (also referred to as an imaging target) and generates captured image data that is data of the captured image. Further, for example, the imaging apparatus 101 supplies the generated captured image data to the integration server 102 (so-called uploading).
 集積サーバ102は、撮像画像データの集積と提供に関する処理を行う。例えば、集積サーバ102は、撮像装置101から供給される撮像画像データを取得し、蓄積する。また、例えば、集積サーバ102は、撮像画像データの再生に利用される情報である多視点鑑賞プレイリストを生成する。 The accumulation server 102 performs processing related to accumulation and provision of captured image data. For example, the accumulation server 102 acquires and stores captured image data supplied from the imaging device 101. Further, for example, the integration server 102 generates a multi-view viewing playlist that is information used for reproducing captured image data.
 多視点鑑賞プレイリストは、撮像画像データの検索(選択)や再生制御に利用される情報であり、同一の撮像対象を互いに異なる位置から撮像して生成された複数の撮像画像データの検索(選択)や再生制御に利用可能な情報のリストを含む。より具体的には、多視点鑑賞プレイリストにおいては、撮像画像データに関する情報が撮像対象(例えば人、物、領域等)毎にまとめられている。したがって、ユーザは、例えば、所望の撮像対象が映っている撮像画像データを検索(選択)し、再生させる場合に、この多視点鑑賞プレイリストを用いることにより、その検索(選択)や再生制御をより容易にすることができる。 The multi-view viewing playlist is information used for retrieval (selection) and reproduction control of captured image data, and retrieval (selection) of a plurality of captured image data generated by imaging the same imaging target from different positions. ) And a list of information available for playback control. More specifically, in the multi-view appreciation playlist, information regarding captured image data is collected for each imaging target (for example, a person, an object, an area, and the like). Therefore, for example, when searching (selecting) and playing back captured image data in which a desired imaging target is reflected, the user uses this multi-view viewing playlist to perform search (selection) and playback control. It can be made easier.
 一般的に、撮像画像を視聴するユーザは、所望の撮像対象が映っている撮像画像を視聴する。つまり、撮像画像を視聴するユーザにとって、その撮像画像に何が映っているかが最も重要である場合が多い。特に、上述したような画像共有サービスの場合、多数の撮像対象を撮像した多数の撮像データが共有されることも多い。しかしながら、従来のプレイリストやメタデータ等には撮像対象についての情報はほとんど含まれておらず、撮像対象に基づいた検索を行うことが困難であった。これに対して、上述の多視点鑑賞プレイリストを用いることにより、撮像対象に基づく検索や再生制御をより容易に行うことができるようになるので、ユーザの満足度を向上させることができる。 Generally, a user who views a captured image views a captured image showing a desired imaging target. That is, in many cases, what is reflected in a captured image is most important for a user who views the captured image. In particular, in the case of the image sharing service as described above, a large number of imaging data obtained by imaging a large number of imaging targets is often shared. However, conventional playlists, metadata, and the like hardly contain information about the imaging target, and it is difficult to perform a search based on the imaging target. On the other hand, since the search and reproduction control based on the imaging target can be performed more easily by using the above-described multi-view viewing playlist, user satisfaction can be improved.
 また、例えば、集積サーバ102は、端末装置103に対して撮像画像データを提供する(ダウンロードやストリーミング等)。例えば、集積サーバ102は、生成した多視点鑑賞プレイリストを端末装置103に供給し、その多視点鑑賞プレイリストに基づいた再生制御を行わせる。つまり、集積サーバ102は、供給した多視点鑑賞プレイリストに基づいて要求される撮像画像データを端末装置103に供給する。 Also, for example, the accumulation server 102 provides captured image data to the terminal device 103 (downloading, streaming, etc.). For example, the accumulation server 102 supplies the generated multi-view viewing playlist to the terminal device 103, and performs playback control based on the multi-view viewing playlist. That is, the accumulation server 102 supplies the captured image data requested based on the supplied multi-view viewing playlist to the terminal device 103.
 端末装置103は、集積サーバ102を用いて共有される撮像画像データの再生に関する処理を行う。例えば、端末装置103は、集積サーバ102より多視点鑑賞プレイリストを取得し、その多視点鑑賞プレイリストに基づいて所望の撮像画像データを特定し、その特定した撮像画像データを集積サーバ102に要求する。その要求に応じて集積サーバ102から所望の撮像画像データが供給されると、端末装置103は、その撮像画像データを取得し、再生し、モニタ等に表示する。 The terminal device 103 performs processing related to reproduction of captured image data shared using the accumulation server 102. For example, the terminal device 103 acquires a multi-view viewing playlist from the integration server 102, specifies desired captured image data based on the multi-view viewing playlist, and requests the specified captured image data from the integration server 102. To do. When desired captured image data is supplied from the accumulation server 102 in response to the request, the terminal device 103 acquires the captured image data, reproduces it, and displays it on a monitor or the like.
  <撮像装置>
 図2は、撮像装置101の主な構成例を示すブロック図である。図2に示されるように、撮像装置101は、撮像部121、メタデータ生成部122、メタデータ付加部123、および通信部124を有する。
<Imaging device>
FIG. 2 is a block diagram illustrating a main configuration example of the imaging apparatus 101. As illustrated in FIG. 2, the imaging apparatus 101 includes an imaging unit 121, a metadata generation unit 122, a metadata addition unit 123, and a communication unit 124.
 撮像部121は、例えばイメージセンサ等の撮像画像生成機能を備え、撮像対象の撮像に関する処理を行う。メタデータ生成部122は、撮像画像データのメタデータの生成に関する処理を行う。メタデータ付加部123は、撮像画像データに対するメタデータの付加に関する処理を行う。メタデータ付加部123は、メタデータが付加された撮像画像データのアップロードに関する処理を行う。 The imaging unit 121 includes a captured image generation function such as an image sensor, for example, and performs processing related to imaging of an imaging target. The metadata generation unit 122 performs processing related to generation of metadata of captured image data. The metadata adding unit 123 performs a process related to adding metadata to captured image data. The metadata adding unit 123 performs processing related to upload of captured image data to which metadata is added.
  <集積サーバ>
 図3は、集積サーバ102の主な構成例を示すブロック図である。図3に示されるように、集積サーバ102は、CPU(Central Processing Unit)151、ROM(Read Only Memory)152、RAM(Random Access Memory)153、バス154、入出力インタフェース160、入力部161、出力部162、記憶部163、通信部164、およびドライブ165を有する。
<Integration server>
FIG. 3 is a block diagram illustrating a main configuration example of the integrated server 102. As shown in FIG. 3, the integrated server 102 includes a CPU (Central Processing Unit) 151, a ROM (Read Only Memory) 152, a RAM (Random Access Memory) 153, a bus 154, an input / output interface 160, an input unit 161, and an output. Unit 162, storage unit 163, communication unit 164, and drive 165.
 CPU151、ROM152、およびRAM153は、バス154を介して相互に接続されている。バス154にはまた、入出力インタフェース160も接続されている。入出力インタフェース160には、入力部161乃至ドライブ165が接続されている。 The CPU 151, the ROM 152, and the RAM 153 are connected to each other via a bus 154. An input / output interface 160 is also connected to the bus 154. An input unit 161 to a drive 165 are connected to the input / output interface 160.
 入力部161は、例えば、キーボード、マウス、タッチパネル、イメージセンサ、マイクロホン、スイッチ、入力端子等の任意の入力デバイスを有する。出力部162は、例えば、ディスプレイ、スピーカ、出力端子等の任意の出力デバイスを有する。記憶部163は、例えば、ハードディスク、RAMディスク、SSD(Solid State Drive)やUSB(Universal Serial Bus)(登録商標)メモリ等のような不揮発性のメモリ等、任意の記憶媒体を有する。通信部164は、例えば、イーサネット(登録商標)、Bluetooth(登録商標)、USB、HDMI(登録商標)(High-Definition Multimedia Interface)、IrDA等の、有線若しくは無線、または両方の、任意の通信規格の通信インタフェースを有する。ドライブ165は、磁気ディスク、光ディスク、光磁気ディスク、または半導体メモリ等の任意の記憶媒体を有するリムーバブルメディア171を駆動する。 The input unit 161 includes arbitrary input devices such as a keyboard, a mouse, a touch panel, an image sensor, a microphone, a switch, and an input terminal. The output unit 162 includes an arbitrary output device such as a display, a speaker, and an output terminal. The storage unit 163 includes an arbitrary storage medium such as a hard disk, a RAM disk, a nonvolatile memory such as an SSD (Solid State Drive) or a USB (Universal Serial Bus) (registered trademark) memory. The communication unit 164 is, for example, any communication standard such as Ethernet (registered trademark), Bluetooth (registered trademark), USB, HDMI (registered trademark) (High-Definition Multimedia Interface), IrDA or the like, wired or wireless, or both. Communication interface. The drive 165 drives a removable medium 171 having an arbitrary storage medium such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory.
 CPU151が、例えば、ROM152や記憶部163に記憶されているプログラムを、RAM153にロードして実行することにより処理が行われる。RAM153にはまた、CPU151が各種の処理を実行する上において必要なデータなども適宜記憶される。 Processing is performed by the CPU 151 loading, for example, a program stored in the ROM 152 or the storage unit 163 into the RAM 153 and executing it. The RAM 153 also appropriately stores data necessary for the CPU 151 to execute various processes.
  <機能ブロック>
 図4は、集積サーバ102が有する機能の例を示す機能ブロック図である。集積サーバ102は、CPU151がプログラム等を実行することにより、図4に示されるような各種機能を実現する。図4に示されるように、集積サーバ102は、機能ブロックとして、集積部181、撮像画像データベース182、撮像状況解析部183、プレイリスト生成部184、多視点鑑賞プレイリストデータベース185、プレイリスト提供部186、および撮像画像提供部187を有する。
<Functional block>
FIG. 4 is a functional block diagram illustrating an example of functions that the integrated server 102 has. The integrated server 102 implements various functions as shown in FIG. 4 when the CPU 151 executes a program or the like. As illustrated in FIG. 4, the accumulation server 102 includes, as functional blocks, an accumulation unit 181, a captured image database 182, an imaging state analysis unit 183, a playlist generation unit 184, a multi-view viewing playlist database 185, and a playlist provision unit. 186 and a captured image providing unit 187.
 集積部181は、撮像画像データやそのメタデータの集積に関する処理を行う。この集積部181は、例えば、CPU151がプログラム等を実行したり、入出力インタフェース160を介して通信部164等を制御することによって実現される。撮像画像データベース182は、撮像画像データやメタデータ等の情報の記憶・管理に関する処理を行う。この撮像画像データベース182は、例えば、CPU151がプログラム等を実行したり、入出力インタフェース160を介して記憶部163等を制御することによって実現される。撮像状況解析部183は、撮像の状況についての解析に関する処理を行う。この撮像状況解析部183は、例えば、CPU151がプログラム等を実行することによって実現される。プレイリスト生成部184は、多視点鑑賞プレイリストの生成に関する処理を行う。このプレイリスト生成部184は、例えば、CPU151がプログラム等を実行することによって実現される。 The accumulation unit 181 performs processing related to the accumulation of captured image data and its metadata. For example, the accumulation unit 181 is realized by the CPU 151 executing a program or the like or controlling the communication unit 164 or the like via the input / output interface 160. The captured image database 182 performs processing related to storage and management of information such as captured image data and metadata. The captured image database 182 is realized, for example, when the CPU 151 executes a program or the like or controls the storage unit 163 or the like via the input / output interface 160. The imaging state analysis unit 183 performs processing related to analysis of the imaging state. The imaging state analysis unit 183 is realized by, for example, the CPU 151 executing a program or the like. The playlist generation unit 184 performs processing related to generation of a multi-view viewing playlist. The playlist generation unit 184 is realized, for example, when the CPU 151 executes a program or the like.
 多視点鑑賞プレイリストデータベース185は、多視点鑑賞プレイリストの記憶・管理に関する処理を行う。この多視点鑑賞プレイリストデータベース185は、例えば、CPU151がプログラム等を実行したり、入出力インタフェース160を介して記憶部163等を制御することによって実現される。プレイリスト提供部186は、多視点鑑賞プレイリストの提供に関する処理を行う。このプレイリスト提供部186は、例えば、CPU151がプログラム等を実行したり、入出力インタフェース160を介して通信部164等を制御することによって実現される。撮像画像提供部187は、撮像画像データの提供に関する処理を行う。この撮像画像提供部187は、例えば、CPU151がプログラム等を実行したり、入出力インタフェース160を介して通信部164等を制御することによって実現される。 The multi-view viewing playlist database 185 performs processing related to storage and management of the multi-view viewing playlist. The multi-view viewing playlist database 185 is realized, for example, when the CPU 151 executes a program or the like, or controls the storage unit 163 or the like via the input / output interface 160. The playlist providing unit 186 performs processing related to providing a multi-view viewing playlist. The playlist providing unit 186 is realized, for example, when the CPU 151 executes a program or the like, or controls the communication unit 164 or the like via the input / output interface 160. The captured image providing unit 187 performs processing related to provision of captured image data. The captured image providing unit 187 is realized, for example, when the CPU 151 executes a program or the like or controls the communication unit 164 or the like via the input / output interface 160.
  <端末装置>
 図5は、端末装置103の主な構成例を示すブロック図である。図5に示されるように、端末装置103は、CPU201、ROM202、RAM203、バス204、入出力インタフェース210、入力部211、出力部212、記憶部213、通信部214、およびドライブ215を有する。
<Terminal device>
FIG. 5 is a block diagram illustrating a main configuration example of the terminal device 103. As illustrated in FIG. 5, the terminal device 103 includes a CPU 201, a ROM 202, a RAM 203, a bus 204, an input / output interface 210, an input unit 211, an output unit 212, a storage unit 213, a communication unit 214, and a drive 215.
 CPU201、ROM202、およびRAM203は、バス204を介して相互に接続されている。バス204にはまた、入出力インタフェース210も接続されている。入出力インタフェース160には、入力部211乃至ドライブ215が接続されている。 The CPU 201, the ROM 202, and the RAM 203 are connected to each other via a bus 204. An input / output interface 210 is also connected to the bus 204. An input unit 211 to a drive 215 are connected to the input / output interface 160.
 入力部211は、例えば、キーボード、マウス、タッチパネル、イメージセンサ、マイクロホン、スイッチ、入力端子等の任意の入力デバイスを有する。出力部212は、例えば、ディスプレイ、スピーカ、出力端子等の任意の出力デバイスを有する。記憶部213は、例えば、ハードディスク、RAMディスク、SSD(Solid State Drive)やUSB(Universal Serial Bus)(登録商標)メモリ等のような不揮発性のメモリ等、任意の記憶媒体を有する。通信部164は、例えば、イーサネット(登録商標)、Bluetooth(登録商標)、USB、HDMI(登録商標)(High-Definition Multimedia Interface)、IrDA等の、有線若しくは無線、または両方の、任意の通信規格の通信インタフェースを有する。ドライブ215は、磁気ディスク、光ディスク、光磁気ディスク、または半導体メモリ等の任意の記憶媒体を有するリムーバブルメディア221を駆動する。 The input unit 211 includes arbitrary input devices such as a keyboard, a mouse, a touch panel, an image sensor, a microphone, a switch, and an input terminal. The output unit 212 includes an arbitrary output device such as a display, a speaker, and an output terminal, for example. The storage unit 213 includes an arbitrary storage medium such as a hard disk, a RAM disk, a non-volatile memory such as an SSD (Solid State Drive) or a USB (Universal Serial Bus) (registered trademark) memory. The communication unit 164 is, for example, any communication standard such as Ethernet (registered trademark), Bluetooth (registered trademark), USB, HDMI (registered trademark) (High-Definition Multimedia Interface), IrDA or the like, wired or wireless, or both. Communication interface. The drive 215 drives a removable medium 221 having an arbitrary storage medium such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory.
 CPU201が、例えば、ROM202や記憶部213に記憶されているプログラムを、RAM203にロードして実行することにより処理が行われる。RAM203にはまた、CPU201が各種の処理を実行する上において必要なデータなども適宜記憶される。 Processing is performed by the CPU 201 loading, for example, a program stored in the ROM 202 or the storage unit 213 into the RAM 203 and executing the program. The RAM 203 also appropriately stores data necessary for the CPU 201 to execute various processes.
  <機能ブロック>
 図6は、端末装置103が有する機能の例を示す機能ブロック図である。端末装置103は、CPU201がプログラム等を実行することにより、図6に示されるような各種機能を実現する。図6に示されるように、端末装置103は、機能ブロックとして、プレイリスト取得部231、画像選択処理部232、撮像画像要求部233、撮像画像取得部234、および再生部235を有する。
<Functional block>
FIG. 6 is a functional block diagram illustrating an example of functions that the terminal device 103 has. The terminal device 103 implements various functions as shown in FIG. 6 when the CPU 201 executes a program or the like. As illustrated in FIG. 6, the terminal device 103 includes a playlist acquisition unit 231, an image selection processing unit 232, a captured image request unit 233, a captured image acquisition unit 234, and a playback unit 235 as functional blocks.
 プレイリスト取得部231は、多視点鑑賞プレイリストの取得に関する処理を行う。このプレイリスト取得部231は、例えば、CPU201がプログラム等を実行したり、入出力インタフェース210を介して通信部214等を制御することによって実現される。画像選択処理部232は、多視点鑑賞プレイリストに基づく撮像画像の選択に関する処理を行う。この画像選択処理部232は、例えば、CPU201がプログラム等を実行したり、入出力インタフェース210を介して通信部214等を制御することによって実現される。撮像画像要求部233は、再生する撮像画像の要求に関する処理を行う。この撮像画像要求部233は、例えば、CPU201がプログラム等を実行したり、入出力インタフェース210を介して通信部214等を制御することによって実現される。 The playlist acquisition unit 231 performs processing related to acquisition of a multi-view viewing playlist. The playlist acquisition unit 231 is realized, for example, when the CPU 201 executes a program or the like, or controls the communication unit 214 or the like via the input / output interface 210. The image selection processing unit 232 performs processing related to selection of captured images based on the multi-view viewing playlist. The image selection processing unit 232 is realized, for example, when the CPU 201 executes a program or the like, or controls the communication unit 214 or the like via the input / output interface 210. The captured image request unit 233 performs processing related to a request for a captured image to be reproduced. The captured image request unit 233 is realized, for example, when the CPU 201 executes a program or the like, or controls the communication unit 214 or the like via the input / output interface 210.
 撮像画像取得部234は、撮像画像の取得に関する処理を行う。この撮像画像取得部234は、例えば、CPU201がプログラム等を実行したり、入出力インタフェース210を介して通信部214等を制御することによって実現される。再生部235は、撮像画像データの再生に関する処理を行うデバイスである。この再生部235は、例えば、CPU201がプログラム等を実行したり、入出力インタフェース210を介して出力部212等を制御することによって実現される。 The captured image acquisition unit 234 performs processing related to acquisition of a captured image. The captured image acquisition unit 234 is realized, for example, when the CPU 201 executes a program or the like, or controls the communication unit 214 or the like via the input / output interface 210. The playback unit 235 is a device that performs processing related to playback of captured image data. For example, the reproducing unit 235 is realized by the CPU 201 executing a program or the like or controlling the output unit 212 or the like via the input / output interface 210.
  <撮像画像集積処理の流れ>
 次に、画像提供システム100により実行される各種処理について説明する。最初に、撮像装置101による撮像や集積サーバ102による撮像画像データの集積が行われる撮像画像集積処理の流れの例を、図7のフローチャートを参照して説明する。
<Flow of captured image integration processing>
Next, various processes executed by the image providing system 100 will be described. First, an example of the flow of the captured image integration process, in which imaging by the imaging apparatuses 101 and accumulation of the captured image data by the integration server 102 are performed, will be described with reference to the flowchart of FIG. 7.
 撮像画像集積処理が開始されると、撮像装置101の撮像部121は、ステップS101において、イベント会場等において、撮像者に操作され、撮像対象(被写体)を撮像する。撮像部121はイメージセンサ等により入射光を光電変換して撮像画像データを生成し、それをメタデータ付加部123に供給する。 When the captured image accumulation process is started, the imaging unit 121 of the imaging apparatus 101 is operated by the photographer at an event venue or the like in step S101 to capture an imaging target (subject). The imaging unit 121 photoelectrically converts incident light with an image sensor or the like to generate captured image data, and supplies it to the metadata adding unit 123.
 ステップS102において、メタデータ生成部122は、撮像の設定や周辺の状態(撮像環境)等の情報を収集し、それらを含むメタデータを生成する。このメタデータの内容は任意である。メタデータ生成部122は、生成したメタデータをメタデータ付加部123に供給する。 In step S102, the metadata generation unit 122 collects information such as imaging settings and surrounding states (imaging environment), and generates metadata including them. The content of this metadata is arbitrary. The metadata generation unit 122 supplies the generated metadata to the metadata addition unit 123.
 ステップS103において、メタデータ付加部123は、ステップS101の処理により生成された撮像画像データに、ステップS102の処理により生成されたメタデータを付加する。メタデータ付加部123は、そのメタデータが付加された撮像画像データを通信部124に供給する。 In step S103, the metadata adding unit 123 adds the metadata generated by the process of step S102 to the captured image data generated by the process of step S101. The metadata adding unit 123 supplies the captured image data to which the metadata is added to the communication unit 124.
 ステップS104において、通信部124は、集積サーバ102と通信を行い、所定のタイミングにおいて、ステップS103の処理によりメタデータが付加された撮像画像データを集積サーバ102に供給(送信)する。メタデータが付加された撮像画像データは、インターネット等の伝送路を介して集積サーバ102に伝送される。 In step S104, the communication unit 124 communicates with the accumulation server 102, and supplies (transmits) the captured image data to which the metadata is added by the process in step S103 to the accumulation server 102 at a predetermined timing. The captured image data to which the metadata is added is transmitted to the integrated server 102 via a transmission path such as the Internet.
In step S121, the accumulation unit 181 of the accumulation server 102 controls the communication unit 164 to communicate with the imaging apparatus 101, and acquires (receives) the captured image data supplied (transmitted) from that imaging apparatus 101 (the captured image data to which metadata has been added). That is, the accumulation unit 181 collects captured image data to which metadata is added.
 ステップS122において、撮像画像データベース182は、ステップS121の処理により取得された撮像画像データ(およびそのメタデータ)を記憶部163に記憶し、管理する。 In step S122, the captured image database 182 stores and manages the captured image data (and its metadata) acquired by the processing in step S121 in the storage unit 163.
When the process of step S122 ends, the captured image integration process ends. The captured image integration process described above is executed for each imaging device used for imaging at the event venue. The timing at which this captured image integration process is executed is arbitrary. For example, it may be performed after the event ends or during the event. It may also be executed at a predetermined time, executed periodically at predetermined time intervals, or executed based on an instruction from the photographer or the like. It may also be executed when a predetermined event occurs, for example when new captured image data is generated.
  <多視点鑑賞プレイリスト生成処理の流れ>
 以上のように集積した撮像画像データに対して、集積サーバ102は、多視点鑑賞プレイリスト生成処理を実行して多視点鑑賞プレイリストを生成する。この多視点鑑賞プレイリスト生成処理の流れの例を図8のフローチャートを参照して説明する。
<Flow of multi-view viewing playlist generation processing>
For the captured image data accumulated as described above, the accumulation server 102 executes a multi-view appreciation playlist generation process to generate a multi-view appreciation playlist. An example of the flow of the multi-view appreciation playlist generation process will be described with reference to the flowchart of FIG.
 多視点鑑賞プレイリスト生成処理が開始されると、集積サーバ102の撮像状況解析部183は、ステップS141において、撮像の状況についての解析を行う。この解析処理の詳細については後述する。 When the multi-view viewing playlist generation process is started, the imaging state analysis unit 183 of the integrated server 102 analyzes the imaging state in step S141. Details of this analysis processing will be described later.
 撮像の状況が解析されると、ステップS142において、プレイリスト生成部184は、その解析結果に基づいて多視点鑑賞プレイリストを生成する。このプレイリスト生成処理の詳細については後述する。 When the imaging state is analyzed, in step S142, the playlist generation unit 184 generates a multi-view viewing playlist based on the analysis result. Details of the playlist generation processing will be described later.
When the multi-view viewing playlist has been generated as described above, the multi-view viewing playlist generation process ends. The timing at which this multi-view viewing playlist generation process is executed is arbitrary. For example, it may be executed at a predetermined time, executed periodically at predetermined time intervals, or executed based on an instruction from a server administrator or the like. It may also be executed when a predetermined event occurs, for example when new captured image data is acquired.
  <撮像状況の解析>
 上述のように、撮像状況解析部183は、撮像画像データに対して、例えばその撮像時の撮像位置、撮像対象の位置、撮像対象からみた撮像方向等の撮像の状況について解析を行う。この解析は任意の情報に基づいて行うことができる。例えば、撮像状況解析部183は、以下のような情報を参照する。
<Analysis of imaging conditions>
As described above, the imaging state analysis unit 183 analyzes, for the captured image data, the imaging state such as the imaging position at the time of imaging, the position of the imaging target, and the imaging direction as viewed from the imaging target. This analysis can be performed based on arbitrary information. For example, the imaging state analysis unit 183 refers to the following kinds of information.
 ・撮像位置
 撮像時の撮像装置101の位置(撮像位置)に関する情報である。例えば、撮像装置101が撮像の際にGPS(Global Positioning System)システムを利用して自身の位置を測位するようにしてもよい。また例えば、無線LAN(Local Area Network)のアクセスポイントや基地局等との通信を利用して測位してもよいし、加速度センサ等の各種センサを利用してもよい。測位の方法は任意である。この場合、集積サーバ102は、測位結果(座標等)を参照して解析を行う。
・Imaging position
Information regarding the position (imaging position) of the imaging apparatus 101 at the time of imaging. For example, the imaging apparatus 101 may measure its own position using a GPS (Global Positioning System) system at the time of imaging. Further, for example, positioning may be performed using communication with a wireless LAN (Local Area Network) access point or a base station, or various sensors such as an acceleration sensor may be used. The positioning method is arbitrary. In this case, the accumulation server 102 performs analysis with reference to the positioning result (coordinates and the like).
 なお、撮像位置に関する情報は、測位結果(座標等)以外であってもよい。例えば、撮像が行われた座席の識別情報(座席番号等)や位置情報(座標等)であってもよい。例えば、撮像者等が座席のレイアウトを示す座席表に基づいて撮像を行った位置(座席)を指定するようにしてもよい。この場合、集積サーバ102は、座席の識別情報(座席番号等)や位置情報(座標等)を参照して解析を行う。 Note that the information related to the imaging position may be other than the positioning result (coordinates, etc.). For example, it may be identification information (seat number or the like) or position information (coordinates or the like) of the seat where the image is taken. For example, a position (seat) where an imager or the like has taken an image may be specified based on a seating chart indicating a seat layout. In this case, the accumulation server 102 performs analysis with reference to seat identification information (such as a seat number) and position information (such as coordinates).
 このような撮像位置に関する情報は、撮像画像データのメタデータに含まれるようにしてもよいし、含まれないようにしてもよい。例えば、撮像画像データやメタデータとは別のデータとして集積サーバ102に供給されるようにしてもよい。 Such information regarding the imaging position may or may not be included in the metadata of the captured image data. For example, it may be supplied to the accumulation server 102 as data different from the captured image data and metadata.
 また、例えば、集積サーバ102が撮像画像を解析して撮像位置を特定するようにしてもよい。例えば、集積サーバ102が、撮像画像と予め用意されたリファレンス画像とを比較して撮像位置を特定するようにしてもよい。また、例えば、イベント会場に位置特定用のマーカを設置しておき、集積サーバ102が、撮像画像に映りこんだそのマーカに基づいて撮像位置を特定するようにしてもよい。この場合、集積サーバ102は、自身が特定した撮像位置(座標等)を参照して解析を行う。もちろん、撮像装置101および集積サーバ102以外の装置が撮像位置を特定するようにしてもよい。 Further, for example, the integrated server 102 may analyze the captured image and specify the imaging position. For example, the accumulation server 102 may identify the imaging position by comparing the captured image with a reference image prepared in advance. Further, for example, a marker for specifying a position may be installed at the event venue, and the accumulation server 102 may specify the imaging position based on the marker reflected in the captured image. In this case, the accumulation server 102 performs analysis with reference to the imaging position (coordinates, etc.) specified by itself. Of course, devices other than the imaging device 101 and the integrated server 102 may specify the imaging position.
 ・撮像時間(期間)
 撮像が行われた時刻(開始時刻や終了時刻など)や開始から終了までの時間に関する情報である。例えば、撮像装置101が内蔵されるタイマ機能を利用して撮像開始や終了の時刻情報を生成するようにしてもよい。また、例えば撮像装置101が、サーバ等の外部の装置から時刻情報を取得するようにしてもよい。集積サーバ102は、この時刻情報を参照して解析を行う。
・ Imaging time (period)
This is information relating to the time at which imaging was performed (such as the start time and the end time) and the length of time from start to end. For example, time information for the start and end of imaging may be generated using a timer function built into the imaging apparatus 101. Alternatively, for example, the imaging device 101 may acquire time information from an external device such as a server. The accumulation server 102 performs analysis with reference to this time information.
 なお、撮像を行う撮像装置101間でこの時刻情報の同期がより正確に取れている方が、撮像画像データ同士においても同期をより正確にとることができる。ただし、例えば、単に撮像時刻が類似する撮像画像データを検索する場合等のように、時刻を同期させて複数の撮像画像データの再生を切り替えるような場合でなければ、正確に同期が取れていなくても問題ない場合も多い。つまり、撮像装置101間の時刻情報の同期は、利用方法に応じた精度で取れていればよい。撮像装置101間で時刻情報の同期をとる場合は、例えば撮像時にサーバと交信して同期させるようにしてもよい。このような撮像が行われた時刻や時間に関する情報は、撮像画像データのメタデータに含まれるようにしてもよいし、含まれないようにしてもよい。 It should be noted that if the time information is more accurately synchronized between the imaging devices 101 that perform imaging, the captured image data can be more accurately synchronized. However, the synchronization is not accurately achieved unless the times are synchronized and the reproduction of a plurality of captured image data is switched, for example, when simply searching for captured image data having similar imaging times. In many cases, there is no problem. That is, it is only necessary to synchronize the time information between the imaging devices 101 with accuracy according to the usage method. When synchronizing the time information between the imaging apparatuses 101, for example, communication with a server may be performed at the time of imaging. Information regarding the time and time when such imaging is performed may or may not be included in the metadata of the captured image data.
 ・撮像対象までの距離
 撮像時の撮像装置101と撮像対象との間の距離に関する情報である。例えば、撮像装置101が、撮像装置101から撮像対象までの距離を測定するようにしてもよい。測距の方法は任意である。このような撮像対象までの距離に関する情報は、撮像画像データのメタデータに含まれるようにしてもよいし、含まれないようにしてもよい。例えば、撮像画像データやメタデータとは別のデータとして集積サーバ102に供給されるようにしてもよい。また、例えば、集積サーバ102が撮像画像を解析して撮像装置101から撮像対象までの距離を測定するようにしてもよい。例えば、集積サーバ102が、撮像画像と予め用意されたリファレンス画像とを比較して測距を行うようにしてもよい。もちろん、撮像装置101および集積サーバ102以外の装置がこの測距を行うようにしてもよい。集積サーバ102は、このようにして得られた測距結果(距離情報等)を参照して解析を行う。
・Distance to the imaging target
This is information regarding the distance between the imaging device 101 and the imaging target at the time of imaging. For example, the imaging device 101 may measure the distance from the imaging device 101 to the imaging target. The method of distance measurement is arbitrary. Such information regarding the distance to the imaging target may or may not be included in the metadata of the captured image data. For example, it may be supplied to the accumulation server 102 as data separate from the captured image data and metadata. Alternatively, for example, the accumulation server 102 may analyze the captured image and measure the distance from the imaging device 101 to the imaging target. For example, the accumulation server 102 may perform distance measurement by comparing the captured image with a reference image prepared in advance. Of course, devices other than the imaging device 101 and the accumulation server 102 may perform this distance measurement. The accumulation server 102 performs analysis with reference to the distance measurement results (distance information and the like) obtained in this way.
 ・撮像方向
 撮像装置101が撮像した方向に関する情報である。この撮像方向の特定方法は、任意である。例えば、撮像装置101が、加速度センサや電子コンパス等を用いて撮像方向を特定するようにしてもよい。このような撮像方向に関する情報は、撮像画像データのメタデータに含まれるようにしてもよいし、含まれないようにしてもよい。例えば、撮像画像データやメタデータとは別のデータとして集積サーバ102に供給されるようにしてもよい。また、例えば、集積サーバ102が撮像画像と予め用意されたリファレンス画像とを比較して撮像方向を特定するようにしてもよい。集積サーバ102は、このように特定された撮像方向を参照して解析を行う。
・Imaging direction
This is information regarding the direction in which the imaging apparatus 101 performed imaging. The method for specifying the imaging direction is arbitrary. For example, the imaging device 101 may specify the imaging direction using an acceleration sensor, an electronic compass, or the like. Such information regarding the imaging direction may or may not be included in the metadata of the captured image data. For example, it may be supplied to the accumulation server 102 as data separate from the captured image data and metadata. Alternatively, for example, the accumulation server 102 may specify the imaging direction by comparing the captured image with a reference image prepared in advance. The accumulation server 102 performs analysis with reference to the imaging direction specified in this way.
 ・ズームレベル(画角)
 撮像装置101が撮像を行った際のズームの設定(画角の設定)に関する情報である。このような撮像方向に関する情報は、撮像画像データのメタデータに含まれるようにしてもよいし、含まれないようにしてもよい。例えば、撮像画像データやメタデータとは別のデータとして集積サーバ102に供給されるようにしてもよい。また、例えば、集積サーバ102が撮像画像と予め用意されたリファレンス画像とを比較してこのズームレベルを特定するようにしてもよい。集積サーバ102は、このように特定されたズームレベルを参照して解析を行う。
・Zoom level (angle of view)
This is information related to the zoom setting (angle-of-view setting) used when the imaging apparatus 101 performed imaging. Such information regarding the zoom level may or may not be included in the metadata of the captured image data. For example, it may be supplied to the accumulation server 102 as data separate from the captured image data and metadata. Alternatively, for example, the accumulation server 102 may specify the zoom level by comparing the captured image with a reference image prepared in advance. The accumulation server 102 performs analysis with reference to the zoom level specified in this way.
 撮像状況解析部183が解析の際に参照する情報は任意である。上述した各種情報のいずれかを参照するようにしてもよいし、上述した以外の情報を参照するようにしてもよい。また、複数の情報を参照するようにしてもよい。 Information that the imaging state analysis unit 183 refers to at the time of analysis is arbitrary. Any one of the various types of information described above may be referred to, or information other than those described above may be referred to. A plurality of pieces of information may be referred to.
For example, the imaging state analysis unit 183 may refer to venue information, which is information about the event venue where the imaging is performed. The content of the venue information is arbitrary, but it may include, for example, imaging target positional relationship information indicating the positional relationship of various areas in the event venue, venue shape information indicating the shape, layout, and the like of the event venue, and so on.
 ・撮像対象位置関係情報
 撮像対象位置関係情報は、例えば、イベント会場内の、撮像装置101が配置されうる領域である撮像位置領域、撮像対象となりうる領域である撮像対象領域、その撮像対象領域のうち特定の撮像対象が存在する領域である撮像対象詳細領域等の位置関係を表す情報である。学校の校庭で行われる運動会などのイベント時の一例を図9に示す。図9に示される撮像対象位置関係情報250では、撮像位置領域251、撮像対象領域252、撮像対象詳細領域253-1、撮像対象詳細領域253-2、撮像対象詳細領域253-3等が示されている。なお、撮像対象詳細領域253-1、撮像対象詳細領域253-2、撮像対象詳細領域253-3をたがいに区別して説明する必要がない場合、撮像対象詳細領域253と称する。この撮像対象詳細領域253の情報をどの程度予め想定されるかはイベントの性質等による。
・Imaging target positional relationship information
The imaging target positional relationship information is information indicating the positional relationship among, for example, an imaging position area, which is an area within the event venue where the imaging apparatuses 101 can be placed, an imaging target area, which is an area that can become an imaging target, and imaging target detail areas, which are areas within the imaging target area where specific imaging targets exist. FIG. 9 shows an example for an event such as an athletic meet held in a school yard. The imaging target positional relationship information 250 shown in FIG. 9 indicates an imaging position area 251, an imaging target area 252, an imaging target detail area 253-1, an imaging target detail area 253-2, an imaging target detail area 253-3, and the like. Note that when the imaging target detail areas 253-1 to 253-3 do not need to be distinguished from one another, they are referred to as the imaging target detail areas 253. How much of the information on the imaging target detail areas 253 can be assumed in advance depends on the nature of the event and the like.
By using such imaging target positional relationship information 250, the positions of the imaging devices 101 and the imaging targets in the venue can be narrowed down to some extent, which makes it easier to specify the imaging position, the position of the imaging target, the imaging direction, and the like.
 また、リファレンス画像を用意する場合、そのリファレンス画像を撮像する撮像装置の位置や向き等を示す情報もこの撮像対象位置関係情報250に含めるようにしてもよい。図9の例の場合、撮像対象領域252を撮像してリファレンス画像を生成する撮像装置261-1および撮像装置261-2の位置や撮像方向等が、撮像対象位置関係情報250に含まれている。なお、リファレンス画像を生成する場合、撮像対象領域252全体をカバーし、かつ、複数の方向から撮像するのが望ましい。 Further, when a reference image is prepared, information indicating the position and orientation of the imaging device that captures the reference image may be included in the imaging target positional relationship information 250. In the case of the example in FIG. 9, the imaging target position relationship information 250 includes the positions, imaging directions, and the like of the imaging device 261-1 and the imaging device 261-2 that capture the imaging target region 252 and generate a reference image. . When generating a reference image, it is desirable to cover the entire imaging target area 252 and capture images from a plurality of directions.
・会場形状情報
 会場形状情報は、会場の形状などに関する情報である。例えば、会場の大きさ、形、座標等が示される。また、座席のレイアウトや座標等が示されるようにしてもよい。このような会場形状情報を用いることにより、会場内における撮像位置、撮像対象の位置、撮像方向等を特定することができる。例えば、撮像位置として座標が示される場合、その座標と会場形状情報の座標とを照らし合わせることにより、会場内における撮像位置を特定することができる。また、例えば、撮像位置として座席番号が示される場合、その座席番号と会場形状情報の座席表とを照らし合わせることにより、会場内における撮像位置(座席の位置)を特定することができる。
・Venue shape information
Venue shape information is information about the shape and the like of the venue. For example, the size, shape, coordinates, and so on of the venue are indicated. The layout and coordinates of the seats may also be indicated. By using such venue shape information, it is possible to specify the imaging position, the position of the imaging target, the imaging direction, and the like within the venue. For example, when coordinates are indicated as the imaging position, the imaging position in the venue can be specified by comparing those coordinates with the coordinates in the venue shape information. Further, for example, when a seat number is indicated as the imaging position, the imaging position (seat position) in the venue can be specified by comparing that seat number with the seating chart in the venue shape information.
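As a concrete illustration of the seat-number case, the following minimal sketch converts a seat identifier into venue coordinates. The grid layout, the seat ID format, and the function name seat_to_position are purely illustrative assumptions and are not part of the venue shape information defined in this disclosure.

import re
from typing import Tuple

# Hypothetical venue shape information: a rectangular seat grid whose rows are
# labeled A, B, C, ... from front to back and whose seats are numbered from 1.
SEAT_PITCH_X_MM = 500              # spacing between neighboring seats in a row
SEAT_PITCH_Y_MM = 800              # spacing between rows
GRID_ORIGIN_MM = (-10000, 6000, 0)  # venue coordinates of seat "A-1"

def seat_to_position(seat_id: str) -> Tuple[int, int, int]:
    """Convert a seat ID such as "C-12" into venue coordinates (millimeters)."""
    m = re.fullmatch(r"([A-Z])-(\d+)", seat_id)
    if m is None:
        raise ValueError(f"unrecognized seat id: {seat_id}")
    row = ord(m.group(1)) - ord("A")
    col = int(m.group(2)) - 1
    ox, oy, oz = GRID_ORIGIN_MM
    return (ox + col * SEAT_PITCH_X_MM, oy + row * SEAT_PITCH_Y_MM, oz)

print(seat_to_position("C-12"))  # (-4500, 7600, 0)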
  <解析処理の流れ>
 撮像状況解析部183は、図8のステップS141において、以上のような各種情報から、撮像位置、撮像対象、撮像対象からみた撮像方向等の、撮像の状況を解析する。図10のフローチャートを参照して、この解析処理の流れの例を説明する。
<Flow of analysis processing>
In step S141 of FIG. 8, the imaging state analysis unit 183 analyzes, from the various kinds of information described above, the imaging state such as the imaging position, the imaging target, and the imaging direction as viewed from the imaging target. An example of the flow of this analysis process will be described with reference to the flowchart of FIG. 10.
 解析処理が開始されると、撮像状況解析部183は、ステップS161において、撮像画像データベース182から、解析対象のデータ(例えば解析を行う撮像画像データのメタデータ等)を取得する。ステップS162において、撮像状況解析部183は、そのメタデータに撮像位置を示す撮像位置情報が含まれているか否かを判定する。含まれていると判定された場合、処理はステップS163に進む。 When the analysis process is started, the imaging state analysis unit 183 acquires data to be analyzed (for example, metadata of the captured image data to be analyzed) from the captured image database 182 in step S161. In step S162, the imaging state analysis unit 183 determines whether the metadata includes imaging position information indicating the imaging position. If it is determined that it is included, the process proceeds to step S163.
 ステップS163において、撮像状況解析部183は、メタデータに含まれる撮像位置に関する情報(例えばGPS情報等)に基づいて”撮像対象位置関係”における撮像位置、すなわち、イベント会場内における撮像位置を特定する。また、会場情報を参照するようにしてもよい。例えば、撮像状況解析部183が、撮像位置を示す座標と会場形状情報(会場内のレイアウトや座標)とを照らし合わせることによりイベント会場内における撮像位置を特定するようにしてもよい。ステップS163の処理が終了すると、処理はステップS165に進む。 In step S163, the imaging state analysis unit 183 specifies the imaging position in the “imaging target position relationship”, that is, the imaging position in the event hall, based on information (for example, GPS information) related to the imaging position included in the metadata. . Further, the venue information may be referred to. For example, the imaging state analysis unit 183 may identify the imaging position in the event venue by comparing coordinates indicating the imaging position with venue shape information (layout and coordinates in the venue). When the process of step S163 ends, the process proceeds to step S165.
 また、ステップS162において、メタデータに撮像位置情報が含まれていないと判定された場合、処理はステップS164に進む。ステップS164において、撮像状況解析部183は、各種情報に基づいて、”撮像対象位置関係”における撮像位置、すなわち、イベント会場内における撮像位置を特定する。例えば、撮像状況解析部183は、撮像者等が入力した撮像位置情報(例えば座標や座席番号等)に基づいて、撮像位置を特定する。また、会場情報を参照するようにしてもよい。例えば、撮像状況解析部183が、撮像画像を、撮像位置や撮像方向が既知のリファレンス画像と比較することにより、イベント会場内における撮像位置を特定するようにしてもよい。ステップS164の処理が終了すると、処理はステップS165に進む。 If it is determined in step S162 that the imaging position information is not included in the metadata, the process proceeds to step S164. In step S164, the imaging state analysis unit 183 specifies the imaging position in the “imaging target position relationship”, that is, the imaging position in the event hall, based on various information. For example, the imaging state analysis unit 183 specifies the imaging position based on imaging position information (for example, coordinates and seat number) input by the photographer or the like. Further, the venue information may be referred to. For example, the imaging state analysis unit 183 may identify the imaging position in the event venue by comparing the captured image with a reference image whose imaging position and imaging direction are known. When the process of step S164 ends, the process proceeds to step S165.
 ステップS165において、撮像状況解析部183は、メタデータに撮像対象の位置を示す撮像対象位置情報が含まれているか否かを判定する。含まれていると判定された場合、処理はステップS166に進む。 In step S165, the imaging state analysis unit 183 determines whether or not imaging target position information indicating the position of the imaging target is included in the metadata. If it is determined that it is included, the process proceeds to step S166.
In step S166, the imaging state analysis unit 183 specifies the position (or range) of the imaging target in the "imaging target positional relationship", that is, the position (or range) of the imaging target in the event venue, based on information related to the position of the imaging target included in the metadata (for example, the imaging direction and the distance to the imaging target). For example, the imaging state analysis unit 183 identifies the imaging target (an object, a person, or an area) based on information such as the imaging direction and the distance to the imaging target included in the captured image data and its metadata, and specifies the position of that imaging target within the event venue. The venue information may also be referred to. For example, the imaging state analysis unit 183 may specify the position (or range) of the imaging target in the event venue by comparing the imaging direction and the distance to the imaging target with the venue shape information (the layout and coordinates in the venue). When the process of step S166 ends, the process proceeds to step S168.
If it is determined in step S165 that the imaging target position information is not included in the metadata, the process proceeds to step S167. In step S167, the imaging state analysis unit 183 specifies the position (or range) of the imaging target in the "imaging target positional relationship" based on various information. For example, the imaging state analysis unit 183 specifies the position (or range) of the imaging target, that is, the position (or range) of the imaging target in the event venue, based on imaging target position information (for example, coordinates) input by the photographer or the like. The venue information may also be referred to. For example, the imaging state analysis unit 183 may specify the position of the imaging target in the event venue by comparing the captured image with a reference image whose imaging position and imaging direction are known. When the process of step S167 ends, the process proceeds to step S168.
In step S168, based on the imaging position specified by the process of step S163 or step S164 and the position of the imaging target (or the center of its range) specified by the process of step S166 or step S167, the imaging state analysis unit 183 specifies the positional relationship between the imaging apparatus 101 and the imaging target, and further specifies, from that positional relationship, the imaging direction as viewed from the imaging target. The venue information may also be referred to. For example, the imaging state analysis unit 183 may specify the position of the imaging target in the event venue by comparing the captured image with a reference image whose imaging position and imaging direction are known.
In step S169, the captured image database 182 adds information indicating the imaging position, the position (or range) of the imaging target, and the imaging direction as viewed from the imaging target, obtained as described above, to the captured image data being processed, and stores and manages that information as metadata of the captured image data.
 ステップS170において、撮像状況解析部183は、全ての撮像画像データを解析したか否かが判定される。未処理の撮像画像データが存在すると判定された場合、処理はステップS171に進む。ステップS171において、撮像状況解析部183は、解析対象を次の撮像画像データに更新する(解析対象を変更する)。ステップS171の処理が終了すると、処理はステップS161に戻り、それ以降の処理を繰り返す。 In step S170, the imaging state analysis unit 183 determines whether or not all captured image data has been analyzed. If it is determined that there is unprocessed captured image data, the process proceeds to step S171. In step S171, the imaging state analysis unit 183 updates the analysis target to the next captured image data (changes the analysis target). When the process of step S171 ends, the process returns to step S161, and the subsequent processes are repeated.
 ステップS161乃至ステップS171の各処理を各撮像画像データに対して実行することにより、集積サーバ102は、撮像画像データベース182に記憶されている全ての撮像画像データの撮像状況を解析する。そして、ステップS170において、全ての撮像画像データを解析したと判定された場合、解析処理が終了し、処理は図8に戻る。 The integration server 102 analyzes the imaging state of all the captured image data stored in the captured image database 182 by executing each process of step S161 to step S171 on each captured image data. If it is determined in step S170 that all captured image data has been analyzed, the analysis process ends, and the process returns to FIG.
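The per-image analysis loop of steps S161 to S171 can be summarized roughly as follows. This is only a sketch under simplifying assumptions: the record fields and the fallback_position, fallback_target, and store_metadata callbacks are hypothetical stand-ins for the metadata fields, reference-image comparison, operator input, and database storage described above.

import math
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Point = Tuple[float, float]  # (x, y) coordinates within the venue

@dataclass
class CapturedImage:
    image_id: str
    imaging_position: Optional[Point]  # from metadata (e.g. GPS), if present
    target_position: Optional[Point]   # from metadata (direction/distance), if present

def analyze_all(images: List[CapturedImage],
                fallback_position: Callable[[CapturedImage], Point],
                fallback_target: Callable[[CapturedImage], Point],
                store_metadata: Callable[[str, Point, Point, float], None]) -> None:
    """Rough equivalent of steps S161 to S171: for each captured image, determine
    the imaging position, the imaging target position, and the imaging direction
    as viewed from the imaging target, then store them as metadata."""
    for img in images:                                        # S161 / S170 / S171
        pos = img.imaging_position or fallback_position(img)  # S162 -> S163 / S164
        target = img.target_position or fallback_target(img)  # S165 -> S166 / S167
        dx, dy = pos[0] - target[0], pos[1] - target[1]
        # S168: direction from the imaging target toward the camera, clockwise
        # from the venue's reference ("up") direction, in degrees.
        direction_deg = math.degrees(math.atan2(dx, dy)) % 360.0
        store_metadata(img.image_id, pos, target, direction_deg)  # S169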
 以上のように解析処理を行うことにより、多視点鑑賞プレイリストの生成に必要な映像メタデータが生成される。 By performing the analysis process as described above, video metadata necessary for generating a multi-view appreciation playlist is generated.
An example of this video metadata is shown in FIG. 11. In FIG. 11, startX(Y) indicates the start time (for example, a UTC time) of the Y-th section of the captured image data (id = d_X), and endX(Y) indicates its end time (for example, a UTC time). Further, xxX(Y), yyX(Y), and zzX(Y) are coordinates in the "imaging target positional relationship" (FIG. 9), and xyX(Y) and zX(Y) represent the imaging direction (the direction of the camera position) as viewed from those coordinates. (The coordinate origin and the reference direction for the angles are defined in accordance with the "imaging target positional relationship". For example, in the example of FIG. 9, the center is the coordinate origin, the reference for xyX(Y) is straight up in the drawing, and the reference for zX(Y) is the horizontal plane. The numerical values representing coordinates are integer values in millimeters.)
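Because FIG. 11 itself is not reproduced here, the following sketch shows one possible in-memory equivalent of the per-section video metadata just described; the class and field names are illustrative only and simply mirror startX(Y), endX(Y), the coordinates, and the angle pair.

from dataclasses import dataclass
from typing import List

@dataclass
class SectionMetadata:
    start_utc: str       # startX(Y): start time of the Y-th section (e.g. a UTC time)
    end_utc: str         # endX(Y): end time of the Y-th section
    x_mm: int            # xxX(Y): coordinates in the "imaging target positional
    y_mm: int            # yyX(Y):  relationship" of FIG. 9, integer millimeters
    z_mm: int            # zzX(Y)
    angle_xy_deg: float  # xyX(Y): direction of the camera seen from the coordinates,
                         #         angle in the horizontal plane (0 = "up" in FIG. 9)
    angle_z_deg: float   # zX(Y):  angle above or below the horizontal plane

@dataclass
class VideoMetadata:
    video_id: str                   # e.g. "d_1" for id=d_1
    sections: List[SectionMetadata]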
  <プレイリスト生成処理の流れ>
 以上のように解析処理が終了すると、プレイリスト生成部184は、図8のステップS142において、プレイリスト生成処理を開始する。このプレイリスト生成処理の流れの例を、図12のフローチャートを参照して説明する。
<Flow of playlist generation processing>
When the analysis process is completed as described above, the playlist generation unit 184 starts the playlist generation process in step S142 of FIG. An example of the flow of the playlist generation process will be described with reference to the flowchart of FIG.
When the playlist generation process is started, the playlist generation unit 184 groups the captured image data for each imaging target detail area in step S191. In step S192, the playlist generation unit 184 generates a multi-view viewing playlist for each group, that is, for each imaging target detail area. When the multi-view viewing playlists have been generated, the playlist generation process ends, and the process returns to FIG. 8.
  <撮像画像データのグループ化>
 次に、図12のステップS191の処理である撮像画像データのグループ化について説明する。上述のとおり、予め撮像対象位置関係情報250の一部として、撮像対象領域252に加えて撮像対象詳細領域253が設定される場合がある。また、撮像対象詳細領域253は、上述の撮像対象の特定の結果として各映像に与えられたメタデータから自動的に形成される場合がある。ここでは3種類のグループ化について記述する。なお、撮像対象詳細領域253は円形(または半球形)で定義されるものとし、プレイリスト上での位置表現はその中心の位置によるものとする。
<Grouping of captured image data>
Next, the grouping of captured image data, which is the process of step S191 in FIG. 12, will be described. As described above, an imaging target detail area 253 may be set in advance, in addition to the imaging target area 252, as part of the imaging target positional relationship information 250. An imaging target detail area 253 may also be formed automatically from the metadata given to each video as a result of the identification of the imaging target described above. Three types of grouping are described here. It is assumed that an imaging target detail area 253 is defined as a circle (or a hemisphere), and that its position in the playlist is expressed by the position of its center.
 (1)撮像対象領域内を撮像したすべての映像をグループ化
 例えば、各撮像位置と撮像対象の位置との関係が図13に示すようなものであった場合、全ての撮像画像データを1つのグループとしてもよい。図13の例の場合、撮像位置領域251に撮像装置101-1乃至撮像装置101-8が設置され、それぞれ、撮像方向281-1乃至撮像方向281-8の方向の撮像対象を撮像している。また、撮像対象領域252には、1つの大きな撮像対象詳細領域282が設定されている。この撮像対象詳細領域282には、撮像装置101-1乃至撮像装置101-8の撮像対象の位置が全て含まれる。つまり、この場合、撮像装置101-1乃至撮像装置101-8が撮像した撮像画像データの全てが1つのグループにグループ化されている。
(1) Grouping all images that capture the imaging target area
For example, when the relationship between the imaging positions and the positions of the imaging targets is as shown in FIG. 13, all of the captured image data may be treated as one group. In the example of FIG. 13, the imaging devices 101-1 to 101-8 are installed in the imaging position area 251 and capture imaging targets in the imaging directions 281-1 to 281-8, respectively. In addition, one large imaging target detail area 282 is set in the imaging target area 252. This imaging target detail area 282 contains all of the imaging target positions of the imaging devices 101-1 to 101-8. In this case, therefore, all of the captured image data captured by the imaging devices 101-1 to 101-8 are grouped into a single group.
 (2)撮像対象領域内に予め設定された撮像対象詳細領域に応じてグループ化
 図13の場合と同じ位置および方向で各撮像装置101が撮像を行う場合に、図9と同様の撮像対象詳細領域253(撮像対象詳細領域253-1乃至撮像対象詳細領域253-3)が設定されていたとする。その場合、この既存の撮像対象詳細領域253によってグループ化を行うようにしてもよい。例えば、図14に示されるように、撮像装置101-4、撮像装置101-6、および撮像装置101-8は撮像対象詳細領域253-1内を撮像対象としているので、これらの撮像画像データを1つのグループにグループ化するようにしてもよい。ただし、この例の場合、撮像装置101-1、撮像装置101-2、撮像装置101-3、撮像装置101-5、および撮像装置101-7は、既存の3つの撮像対象詳細領域253のいずれにも含まれない領域を撮像しているので、これらの撮像画像データは、この方法によるグループ化は望ましくない。
(2) Grouping according to imaging target detail areas set in advance within the imaging target area
Suppose that each imaging apparatus 101 performs imaging at the same positions and in the same directions as in FIG. 13, and that imaging target detail areas 253 like those in FIG. 9 (the imaging target detail areas 253-1 to 253-3) have been set. In that case, grouping may be performed according to these existing imaging target detail areas 253. For example, as shown in FIG. 14, the imaging devices 101-4, 101-6, and 101-8 capture the inside of the imaging target detail area 253-1, so their captured image data may be grouped into one group. In this example, however, the imaging devices 101-1, 101-2, 101-3, 101-5, and 101-7 capture areas that are not included in any of the three existing imaging target detail areas 253, so grouping their captured image data by this method is not desirable.
 (3)各カメラの撮像対象位置に応じたグループ化
 撮像対象詳細領域253が予め設定されていない場合、または予め設定された撮像対象詳細領域253外を撮像対象とする撮像画像データが複数存在する場合、互いに近傍な位置を撮像対象とする複数の撮像画像データを1つのグループにグループ化するようにしてもよい。例えば、図13の場合と同じ位置および方向で各撮像装置101が撮像を行うとする。この場合、図15に示されるように、撮像装置101-4、撮像装置101-6、および撮像装置101-8は、互いに近傍を撮像対象としているので、これらの撮像対象を全て含むように撮像対象詳細領域284-1を設定し、この撮像対象詳細領域284-1によってグループ化を行うようにしてもよい。同様に、撮像装置101-1、撮像装置101-2、撮像装置101-3、および撮像装置101-5は、互いに近傍を撮像対象としているので、これらの撮像対象を全て含むように撮像対象詳細領域284-2を設定し、この撮像対象詳細領域284-2によってグループ化を行うようにしてもよい。なお、このように設定される撮像対象詳細領域の大きさは任意である。図15の例のように、撮像対象詳細領域を複数設定する場合、それらの大きさが統一されていなくてもよい(他と異なる大きさの撮像対象詳細領域を設定することができる)。
(3) Grouping according to the imaging target position of each camera
When no imaging target detail area 253 has been set in advance, or when there are multiple pieces of captured image data whose imaging targets lie outside the preset imaging target detail areas 253, a plurality of pieces of captured image data whose imaging target positions are close to one another may be grouped into one group. For example, suppose that each imaging device 101 performs imaging at the same positions and in the same directions as in FIG. 13. In this case, as shown in FIG. 15, the imaging devices 101-4, 101-6, and 101-8 capture imaging targets near one another, so an imaging target detail area 284-1 may be set so as to contain all of these imaging targets, and grouping may be performed according to this imaging target detail area 284-1. Similarly, the imaging devices 101-1, 101-2, 101-3, and 101-5 capture imaging targets near one another, so an imaging target detail area 284-2 may be set so as to contain all of these imaging targets, and grouping may be performed according to this imaging target detail area 284-2. Note that the size of an imaging target detail area set in this way is arbitrary. As in the example of FIG. 15, when a plurality of imaging target detail areas are set, their sizes do not have to be unified (an imaging target detail area of a size different from the others can be set).
 これらの方法の内(1)および(2)は、予め指定された撮像対象詳細領域に基づいてグループ化が行われる。これに対して(3)の場合、各撮像画像データの撮像対象の位置が一定の範囲に含まれるものをグルーピングすることになる。 Of these methods, (1) and (2) are grouped based on a pre-designated imaging target detail area. On the other hand, in the case of (3), those in which the position of the imaging target of each captured image data is included in a certain range are grouped.
Note that such grouping of captured image data is performed for each fixed period of time, for example, for each small event within the overall event. In that case, a given imaging device 101 does not necessarily capture a single imaging target (imaging target detail area) throughout that period, but if it captures the grouping target area during even part of the period, its captured image data is included in the group. In that case, the identifier of the group to which the data belongs in each predetermined period is added to each piece of captured image data as metadata. In this way, group information (information indicating the groups of captured image data) is generated. An example of this group information is shown in FIG. 16.
In the group information 291 shown in FIG. 16, a group element is defined for each imaging target detail area, whether it is a preset imaging target detail area (the imaging target detail area 282 in FIG. 13 or the imaging target detail areas 253 in FIG. 14) or an imaging target detail area formed by grouping based on the metadata of each piece of captured image data (for example, the imaging target detail areas 284-1 and 284-2 in FIG. 15). Further, cx(n), cy(n), and xz(n) indicate the coordinates of the center of the circular area containing the imaging target positions of the grouped captured image data (that is, the imaging target detail area), and radius represents its radius. The gpref attribute of an object_group indicates the id of the <group> element to which each piece of captured image data belongs as a result of the grouping. startX(Y) and endX(Y) indicate the start and end times (for example, UTC times) of the section in which the captured image data belongs to the group. The s-angle values (xyX(Y), zX(Y)) indicate the imaging direction (the direction of the imaging position, expressed as an angle in the horizontal plane and an angle above or below the horizontal plane) as viewed from the center of the imaging target detail area of the group. They therefore do not necessarily match the value of the angle attribute given to the <期間(n)> (period) element added to each piece of captured image data in advance. The <期間> (period) element and the <object_group> element described above are child elements of the same <撮影データ> (capture data) element and can be written together, but they are omitted here for simplicity.
  <画像グループ化処理の流れ>
 図17および図18のフローチャートを参照して、図12のステップS191において実行される画像グループ化処理の流れの例を説明する。画像グループ化処理が開始されると、プレイリスト生成部184は、図17のステップS211において、変数i、j、nを初期化する。例えば、プレイリスト生成部184は、変数iの値を「1」に設定し(i=1)、変数jの値を「1」に設定し(j=1)、変数nの値を「1」に設定する(n=1)。
<Flow of image grouping processing>
An example of the flow of the image grouping process executed in step S191 of FIG. 12 will be described with reference to the flowcharts of FIGS. 17 and 18. When the image grouping process is started, the playlist generation unit 184 initializes the variables i, j, and n in step S211 of FIG. 17. For example, the playlist generation unit 184 sets the value of the variable i to "1" (i = 1), the value of the variable j to "1" (j = 1), and the value of the variable n to "1" (n = 1).
 ステップS212において、プレイリスト生成部184は、撮像画像データベース182から処理対象である画像(i)のメタデータ(例えばi番目に登録されている撮像画像データのメタデータ)を取得する。ステップS213において、プレイリスト生成部184は、j番目の区間の座標が含まれる撮像対象詳細領域が存在するか否かを判定する。存在すると判定された場合、処理はステップS214に進む。 In step S212, the playlist generation unit 184 acquires the metadata of the image (i) to be processed (for example, the metadata of the i-th registered captured image data) from the captured image database 182. In step S213, the playlist generation unit 184 determines whether there is an imaging target detail area including the coordinates of the jth section. If it is determined that it exists, the process proceeds to step S214.
 ステップS214において、プレイリスト生成部184は、画像(i)のj番目の区間を該当グループに割り当てる。ステップS214の処理が終了すると処理はステップS216に進む。また、ステップS213において、j番目の区間の座標が含まれる撮像対象詳細領域が存在しないと判定された場合、処理はステップS215に進む。ステップS215において、プレイリスト生成部184は、画像(i)のj番目の区間の位置情報をその他の撮像対象(n)として記録する。ステップS215の処理が終了すると処理はステップS216に進む。 In step S214, the playlist generation unit 184 assigns the jth section of the image (i) to the corresponding group. When the process of step S214 ends, the process proceeds to step S216. If it is determined in step S213 that there is no imaging target detailed area including the coordinates of the jth section, the process proceeds to step S215. In step S215, the playlist generation unit 184 records the position information of the jth section of the image (i) as the other imaging target (n). When the process of step S215 ends, the process proceeds to step S216.
 ステップS216において、プレイリスト生成部184は、変数jの値が画像(i)の区間の数であるか否かを判定する。変数jの値が画像(i)の区間の数に達していないと判定された場合、処理はステップS217に進む。ステップS217において、プレイリスト生成部184は、変数jと変数nの値をそれぞれ「+1」インクリメントする(j++、n++)。ステップS217の処理が終了すると処理はステップS213に戻り、それ以降の処理を繰り返す。以上のようにステップS213乃至ステップS217の処理が繰り返し実行され、ステップS216において、変数jの値が画像(i)の区間の数に達したと判定された場合、処理はステップS218に進む。 In step S216, the playlist generation unit 184 determines whether the value of the variable j is the number of sections of the image (i). If it is determined that the value of the variable j has not reached the number of sections of the image (i), the process proceeds to step S217. In step S217, the playlist generation unit 184 increments the values of the variable j and the variable n by “+1” (j ++, n ++). When the process of step S217 ends, the process returns to step S213, and the subsequent processes are repeated. As described above, the processing from step S213 to step S217 is repeatedly executed, and when it is determined in step S216 that the value of the variable j has reached the number of sections of the image (i), the processing proceeds to step S218.
 ステップS218において、プレイリスト生成部184は、変数iの値が撮像画像データの数であるか否かを判定する。変数iの値が撮像画像データの数に達していないと判定された場合、処理はステップS219に進む。ステップS219において、プレイリスト生成部184は、変数iの値を「+1」インクリメントする(i++)。ステップS219の処理が終了すると処理はステップS212に戻り、それ以降の処理を繰り返す。以上のようにステップS212乃至ステップS219の処理が繰り返し実行され、ステップS218において、変数iの値が撮像画像データの数に達したと判定された場合、処理は図18に進む。 In step S218, the playlist generation unit 184 determines whether the value of the variable i is the number of captured image data. If it is determined that the value of the variable i has not reached the number of captured image data, the process proceeds to step S219. In step S219, the playlist generation unit 184 increments the value of the variable i by “+1” (i ++). When the process of step S219 ends, the process returns to step S212, and the subsequent processes are repeated. As described above, the processing from step S212 to step S219 is repeatedly executed, and when it is determined in step S218 that the value of the variable i has reached the number of captured image data, the processing proceeds to FIG.
In step S221 of FIG. 18, the playlist generation unit 184 groups those of the other imaging target positions (1) to (n) that lie near one another, and sets non-predefined imaging target detail areas for them.
 ステップS222において、プレイリスト生成部184は、変数iおよび変数jに初期値を設定する。例えば、プレイリスト生成部184は、変数iの値を「1」に設定し(i=1)、変数jの値を「1」に設定する(j=1)。 In step S222, the playlist generation unit 184 sets initial values for the variable i and the variable j. For example, the playlist generation unit 184 sets the value of the variable i to “1” (i = 1) and sets the value of the variable j to “1” (j = 1).
In step S223, the playlist generation unit 184 acquires the metadata of the image (i) to be processed (for example, the metadata of the i-th registered captured image data) from the captured image database 182. In step S224, the playlist generation unit 184 determines whether or not the j-th section has already been assigned to a group. If it is determined that it has not been assigned, the process proceeds to step S225. In step S225, the playlist generation unit 184 determines whether there is a non-predefined imaging target detail area that contains the coordinates of the j-th section. If it is determined that such an area exists, the process proceeds to step S226. In step S226, the playlist generation unit 184 assigns the j-th section of the image (i) to the corresponding group. When the process of step S226 ends, the process proceeds to step S227.
If it is determined in step S224 that the j-th section has already been assigned to a group, the process proceeds to step S227. Likewise, if it is determined in step S225 that there is no non-predefined imaging target detail area containing the coordinates of the j-th section, the process proceeds to step S227.
 ステップS227において、プレイリスト生成部184は、変数jの値が画像(i)の区間数であるか否かを判定する。変数jの値が画像(i)の区間数に達していないと判定された場合、処理はステップS228に進む。ステップS228にいおいて、プレイリスト生成部184は、変数jの値を「+1」インクリメントする(j++)。ステップS228の処理が終了すると処理はステップS224に戻り、それ以降の処理を繰り返す。以上のようにステップS224乃至ステップS228の処理が繰り返し実行され、ステップS227において、変数jの値が画像(i)の区間数に達したと判定された場合、処理はステップS229に進む。 In step S227, the playlist generation unit 184 determines whether or not the value of the variable j is the number of sections of the image (i). If it is determined that the value of the variable j has not reached the number of sections of the image (i), the process proceeds to step S228. In step S228, the playlist generation unit 184 increments the value of the variable j by “+1” (j ++). When the process of step S228 ends, the process returns to step S224 and the subsequent processes are repeated. As described above, the processing from step S224 to step S228 is repeatedly executed, and when it is determined in step S227 that the value of the variable j has reached the number of sections of the image (i), the processing proceeds to step S229.
In step S229, the playlist generation unit 184 determines whether the value of the variable i has reached the number of pieces of captured image data. If it is determined that the value of the variable i has not reached the number of pieces of captured image data, the process proceeds to step S230. In step S230, the playlist generation unit 184 increments the value of the variable i by "+1" (i++). When the process of step S230 ends, the process returns to step S223, and the subsequent processes are repeated. The processing from step S223 to step S230 is repeatedly executed in this way, and when it is determined in step S229 that the value of the variable i has reached the number of pieces of captured image data, the image grouping process ends, and the process returns to FIG. 12.
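The two-pass grouping of FIGS. 17 and 18 (assignment to predefined imaging target detail areas, followed by formation of non-predefined areas from the remaining nearby target positions) can be sketched roughly as follows; the containment test and the fixed clustering radius are simplifying assumptions, not values prescribed by this disclosure.

import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]
# A detail area is (center, radius); a section is (video_id, section_index, target_position).

def inside(area: Tuple[Point, float], p: Point) -> bool:
    (cx, cy), r = area
    return math.hypot(p[0] - cx, p[1] - cy) <= r

def group_sections(sections: List[Tuple[str, int, Point]],
                   predefined: Dict[str, Tuple[Point, float]],
                   cluster_radius: float = 5000.0) -> Dict[Tuple[str, int], str]:
    assignment: Dict[Tuple[str, int], str] = {}
    leftovers: List[Tuple[str, int, Point]] = []
    # First pass (FIG. 17): assign each section to a predefined detail area if possible.
    for vid, j, target in sections:
        for gid, area in predefined.items():
            if inside(area, target):
                assignment[(vid, j)] = gid
                break
        else:
            leftovers.append((vid, j, target))   # recorded as "other imaging target (n)"
    # Second pass (FIG. 18): form non-predefined areas from nearby leftover targets.
    extra: Dict[str, Tuple[Point, float]] = {}
    for vid, j, target in leftovers:
        for gid, area in extra.items():
            if inside(area, target):
                assignment[(vid, j)] = gid
                break
        else:
            gid = f"extra_{len(extra) + 1}"
            extra[gid] = (target, cluster_radius)
            assignment[(vid, j)] = gid
    return assignment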
 以上のように撮像画像データのグループ化が行われると、プレイリスト生成部184は、図12のステップS192の処理により、そのグループ毎に多視点鑑賞プレイリストを生成する。例えば、上述の(3)の場合、図11の映像メタデータ271において各撮像画像データに付加されたグループ識別子(object_groupのgpref属性値)により、ある映像グループに属する映像を選定する。 When the grouping of the captured image data is performed as described above, the playlist generation unit 184 generates a multi-view viewing playlist for each group by the process of step S192 in FIG. For example, in the case of (3) above, a video belonging to a video group is selected by the group identifier (gpref attribute value of object_group) added to each captured image data in the video metadata 271 of FIG.
Then, for each video, the imaging direction in the sections in which it captures the imaging target detail area of the group is determined. As described above, when the imaging target is a certain area, the direction in which each captured image was captured (the imaging direction) is taken as the direction viewed from the center of that area. As the reference for directions (the 0-degree direction), a bearing based on the earth's axis can be used, but the reference can also be defined on the drawing of the imaging target positional relationship information. For example, the imaging direction of each captured image in the horizontal plane can be expressed as a clockwise angle with the upward direction in FIG. 9 taken as 0 degrees. In this example, the imaging positions of all the captured images are assumed to lie within a certain height range above the ground plane, but to allow for imaging positions arranged more three-dimensionally, for example a case like that of FIG. 19 in which the imaging device 101-8 captures images from a high position, the information indicating the imaging direction may be expressed by two values: an angle in the horizontal plane and an angle above or below the horizontal plane.
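As a worked illustration of this convention, the following sketch computes the two angle values for a camera position as seen from the center of an imaging target detail area; it assumes millimeter coordinates in the plane of FIG. 9, with the upward direction in the drawing as 0 degrees and clockwise angles, and the function name is illustrative.

import math
from typing import Tuple

def imaging_direction(group_center: Tuple[float, float, float],
                      camera_pos: Tuple[float, float, float]) -> Tuple[float, float]:
    """Direction of the camera as seen from the center of the imaging target
    detail area: (horizontal angle, elevation angle) in degrees. The horizontal
    angle is measured clockwise from the 'up' direction of FIG. 9."""
    dx = camera_pos[0] - group_center[0]
    dy = camera_pos[1] - group_center[1]
    dz = camera_pos[2] - group_center[2]
    horizontal = math.degrees(math.atan2(dx, dy)) % 360.0
    elevation = math.degrees(math.atan2(dz, math.hypot(dx, dy)))
    return horizontal, elevation

# Example: a camera 10 m to the right of and 2 m above the area center comes out
# at roughly (90 degrees, 11.3 degrees).
print(imaging_direction((0, 0, 0), (10000, 0, 2000)))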
  <多視点鑑賞プレイリスト>
 以上のようにして多視点鑑賞プレイリストが生成される。この多視点鑑賞プレイリストは、イベント全体、または小イベントごとに生成することができる。イベント全体のプレイリストを生成する場合で、小イベント毎に上述のグループ構成が変わるようなときは、小イベントごとに時間を区切るようにしてもよい。多視点鑑賞プレイリストの例を図20に示す。図20に示される多視点鑑賞プレイリスト321は、XML(Extensible Markup Language)形式で記述されたものである。
<Multi-view viewing playlist>
A multi-view viewing playlist is generated as described above. This multi-view viewing playlist can be generated for the entire event or for each small event. When a playlist for the entire event is generated and the above-described group configuration changes for each small event, the time may be divided for each small event. An example of the multi-view viewing playlist is shown in FIG. The multi-view viewing playlist 321 shown in FIG. 20 is described in XML (Extensible Markup Language) format.
As shown in FIG. 20, this multi-view viewing playlist 321 has, for example, the following elements and attributes as information that can be used for playing back captured image data (information that can be used to select the captured image data to be played back).
 MultiviewPlaylist:多視点鑑賞プレイリストであることを示す。
 MultiviewPlaylist@timescale:時間分解能(= 1/timescale)を示す。
 BaseURL:撮像画像データの取得先のbase URLを示す。
 Pereod@start:その区間(小イベント時間)の撮像開始時刻を示す。
 Pereod@duration:その区間(小イベント時間)の長さを示す。
 ObjectGroup@position:撮像対象位置関係図面内における撮像対象の位置(X,Y,Z座標)を示す。
 Representation@angle:撮像対象からみたこの撮像画像の撮像角度(撮像方向)を示す。xyは水平面の角度w示し、zは水平面との角度を示す。
 URL:各撮像画像データの取得先のURLを示す。MultiviewPlaylist.BaseURLで指定されるbase URLと結合されて用いられる。
 Inactive:その撮像画像データが撮像対象を撮像していない時間(すなわち、撮像画像データに含まれない期間)をカンマで区切って指定する。Period@startに対してtimescaleを基準として相対指定する。
MultiviewPlaylist: Indicates that this is a multi-view viewing playlist.
MultiviewPlaylist@timescale: Indicates the time resolution (= 1/timescale).
BaseURL: Indicates the base URL from which the captured image data is acquired.
Period@start: Indicates the imaging start time of the section (small event time).
Period@duration: Indicates the length of the section (small event time).
ObjectGroup@position: Indicates the position (X, Y, Z coordinates) of the imaging target in the imaging target positional relationship drawing.
Representation@angle: Indicates the imaging angle (imaging direction) of this captured image as viewed from the imaging target; xy indicates the angle in the horizontal plane, and z indicates the angle relative to the horizontal plane.
URL: Indicates the URL from which each piece of captured image data is acquired. It is used in combination with the base URL specified by MultiviewPlaylist.BaseURL.
Inactive: Specifies, as a comma-separated list, the times during which the captured image data does not capture the imaging target (that is, periods not included in the captured image data). The values are specified relative to Period@start, with timescale as the unit.
 As shown in FIG. 20, in the multi-view viewing playlist 321, for a plurality of imaging targets, the information (Representation) usable for reproducing each piece of captured image data obtained by capturing each imaging target is grouped for each imaging target (ObjectGroup). These pieces of information are generated for a predetermined period (duration) such as a small event, and include information indicating the start time (Period@start) and the length (Period@duration) of the period. Furthermore, the multi-view viewing playlist 321 includes, for a plurality of periods (the event), a list, for each imaging target, of information usable for searching for, selecting, and reproducing the captured image data within each predetermined period (small event).
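 For reference, the following sketch shows one way a client could consume a playlist shaped like the one described above. The embedded XML is an illustrative stand-in, not FIG. 20 itself; the serialization of the angle attribute, the example URLs, and the file names are assumptions.

    import xml.etree.ElementTree as ET

    PLAYLIST = """
    <MultiviewPlaylist timescale="1000">
      <BaseURL>http://example.com/event1/</BaseURL>
      <Period start="2017-02-10T13:00:00" duration="600000">
        <ObjectGroup position="10,25,0">
          <Representation angle="xy=0,z=10">
            <URL>cam1.mp4</URL>
            <Inactive>0-30000,480000-600000</Inactive>
          </Representation>
          <Representation angle="xy=135,z=5">
            <URL>cam8.mp4</URL>
          </Representation>
        </ObjectGroup>
      </Period>
    </MultiviewPlaylist>
    """

    def pick_view(xml_text, wanted_azimuth):
        # Return (absolute URL, inactive spans) of the view whose azimuth is closest
        # to the requested one; the URL is resolved against MultiviewPlaylist.BaseURL.
        root = ET.fromstring(xml_text)
        base = root.findtext("BaseURL", default="")
        best = None
        for rep in root.iter("Representation"):
            parts = dict(p.split("=") for p in rep.get("angle").split(","))
            diff = abs(float(parts["xy"]) - wanted_azimuth) % 360.0
            diff = min(diff, 360.0 - diff)
            if best is None or diff < best[0]:
                best = (diff, base + rep.findtext("URL"), rep.findtext("Inactive", default=""))
        return best[1], best[2]

    url, inactive = pick_view(PLAYLIST, wanted_azimuth=120)
    print(url)       # http://example.com/event1/cam8.mp4
    print(inactive)  # empty string: cam8 covers the whole period

 The Inactive spans, expressed in timescale units relative to Period@start, tell the client in which sections it should fall back to another view.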
 The multi-view viewing playlist can also be expressed as an extension of an MPEG-DASH MPD. Examples are shown in FIGS. 21 to 23. The contents of the multi-view viewing playlist 331 shown in FIG. 21, the multi-view viewing playlist 332 shown in FIG. 22, and the multi-view viewing playlist 333 shown in FIG. 23 are the same as those of the multi-view viewing playlist 321 of FIG. 20. In this case, a <MultiviewGroup> element is newly defined as a child element of the <Period> element; it contains information, listed for each area to be imaged, indicating the imaging direction as seen from the imaging target. By enumerating the <AdaptationSet> elements belonging to that group as its <views> elements, the group of Adaptation Sets to be referenced for each imaging target (area) can be indicated. Each <views> element has the coordinates of the center of the target imaging area (a position attribute) and an r attribute representing its range, that is, its radius. The value of the r attribute is the value of the radius attribute of the <group> element generated as part of each video metadata. If each Adaptation Set holds a plurality of Representations generated from the same captured image data at different encoding bit rates, each video captured from the respective angles can be played back with adaptive bit rate streaming. When one piece of captured image data belongs to a plurality of groups depending on the imaging period, it can be described as separate Adaptation Sets, and each <MultiviewGroup> element can refer to a different Adaptation Set. In that case, for sections in which the imaging target of a particular Multiview Group is not shown, an Inactive element is described as a child element (or attribute) of the Adaptation Set.
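 A corresponding sketch for the MPD-style variant is given below. The <MultiviewGroup> and <views> elements follow the description above, but attribute names not mentioned there (such as ref, used here to point at an AdaptationSet id) are assumptions made for illustration.

    import xml.etree.ElementTree as ET

    MPD_FRAGMENT = """
    <Period start="PT0S">
      <MultiviewGroup id="g1">
        <views position="10,25,0" r="5" ref="as1" angle="0"/>
        <views position="10,25,0" r="5" ref="as2" angle="135"/>
      </MultiviewGroup>
      <AdaptationSet id="as1">
        <Representation id="cam1-high" bandwidth="4000000"/>
        <Representation id="cam1-low" bandwidth="800000"/>
      </AdaptationSet>
      <AdaptationSet id="as2">
        <Representation id="cam8-high" bandwidth="4000000"/>
        <Inactive>0-30</Inactive>
      </AdaptationSet>
    </Period>
    """

    def adaptation_sets_for_group(xml_text, group_id):
        # Map each view angle of the named MultiviewGroup to its AdaptationSet element.
        period = ET.fromstring(xml_text)
        sets = {a.get("id"): a for a in period.findall("AdaptationSet")}
        result = {}
        for grp in period.findall("MultiviewGroup"):
            if grp.get("id") != group_id:
                continue
            for view in grp.findall("views"):
                result[float(view.get("angle"))] = sets[view.get("ref")]
        return result

    for angle, aset in adaptation_sets_for_group(MPD_FRAGMENT, "g1").items():
        reps = [r.get("id") for r in aset.findall("Representation")]
        print(angle, reps)  # e.g. 0.0 ['cam1-high', 'cam1-low'] -> adaptive bit rate choices

 Because each AdaptationSet can carry several Representations at different bit rates, the client picks the angle via the MultiviewGroup and then applies ordinary DASH rate adaptation within that AdaptationSet.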
 When the multi-view viewing playlist is generated as described above, the multi-view viewing playlist database 185 stores and manages that multi-view viewing playlist.
  <Sharing of captured image data>
 The captured image data accumulated in the captured image database 182 is shared by (the users of) the terminal devices 103. That is, the accumulation server 102 provides the captured image data to the terminal device 103. At that time, the accumulation server 102 provides the multi-view viewing playlist described above to the terminal device 103 and has the captured image data selected on the basis of that multi-view viewing playlist. In other words, (the user of) the terminal device 103 selects the desired captured image data on the basis of the multi-view viewing playlist supplied from the accumulation server 102 and requests it from the accumulation server 102.
  <Flow of image providing processing>
 An example of the flow of the image providing processing will be described with reference to the flowchart of FIG. 24. When the image providing processing is started, in step S251 the playlist providing unit 186 of the accumulation server 102 acquires a multi-view viewing playlist from the multi-view viewing playlist database 185 and supplies it to the terminal device 103 via the communication unit 164. In response to this processing, in step S261 the playlist acquisition unit 231 of the terminal device 103 controls the communication unit 214 to acquire that multi-view viewing playlist. This exchange of the multi-view viewing playlist may be performed at an arbitrary timing, or may be performed in response to a request from the terminal device 103. When a plurality of multi-view viewing playlists are registered in the multi-view viewing playlist database 185, the playlist providing unit 186 may select a desired multi-view viewing playlist and supply it to the terminal device 103. For example, the playlist providing unit 186 may select a multi-view viewing playlist recommended for (the user of) the terminal device 103 and supply it to the terminal device 103. Alternatively, for example, (the user of) the terminal device 103 may request a desired multi-view viewing playlist, and the playlist providing unit 186 may read the requested multi-view viewing playlist from the multi-view viewing playlist database 185 and supply it to the terminal device 103.
 In step S262, the image selection processing unit 232 controls the output unit 212 to display, on the monitor, a GUI (Graphical User Interface) that uses the multi-view viewing playlist supplied from the accumulation server 102, and has the user select a captured image on the basis of that multi-view viewing playlist.
 For example, when the user of the terminal device 103 who is a viewer has also, as a user of an imaging device 101 who is a photographer, uploaded captured image data to the accumulation server 102 (that is, has participated in the accumulation of captured image data by the accumulation server 102), it is likely that the user will often view the captured image data that the user himself or herself provided. Therefore, for example, an image selection GUI 350 as shown in FIG. 25 may be displayed. The image selection GUI 350 shows the detailed imaging target area 351 of the captured image provided by that user, and further shows each captured image whose imaging target is that detailed imaging target area 351, placed in its respective imaging direction as seen from the center of the detailed imaging target area 351. In the example of FIG. 25, the captured image 352 is the captured image uploaded by the user, and the captured image 353 and the captured image 354 are captured images uploaded by others. When the user selects one of the captured images displayed on the image selection GUI 350, the image selection processing unit 232 sets the selected captured image as the captured image to be viewed (to be requested from the accumulation server 102).
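 A possible way to lay out such a GUI is sketched below: each candidate view is placed on a circle around the detailed imaging target area at the azimuth from which it was captured. The pixel coordinates, radius, and data shape are assumptions; the actual GUI of FIG. 25 is not reproduced here.

    import math

    def thumbnail_positions(views, center_px=(400, 300), radius_px=200):
        # views: list of (view_id, azimuth_deg); returns view_id -> (x, y) in pixels.
        # Azimuth 0 corresponds to "up" on the drawing; screen y grows downward.
        positions = {}
        for view_id, azimuth in views:
            rad = math.radians(azimuth)
            x = center_px[0] + radius_px * math.sin(rad)
            y = center_px[1] - radius_px * math.cos(rad)
            positions[view_id] = (round(x), round(y))
        return positions

    views = [("own upload", 0), ("other A", 135), ("other B", 250)]
    print(thumbnail_positions(views))
    # {'own upload': (400, 100), 'other A': (541, 441), 'other B': (212, 368)}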
 With such a GUI, it is possible to present in a visually easy-to-understand manner from which other angles videos exist of the detailed imaging target area 351 that was the imaging target of the captured image uploaded by the user. Therefore, the user can select a desired captured image more easily. Note that only the image frame may be displayed instead of the captured image.
 Also, for example, when the user of the terminal device 103 who is a viewer has not uploaded a captured image, or when the user selects a captured image of an area different from the detailed imaging target area that was the imaging target of the uploaded captured image, an area selection GUI configured as shown in FIG. 13 or FIG. 14 may be displayed, provided that detailed imaging target areas have been set. Then, for example, when the user of the terminal device 103 selects an imaging target area using this GUI, an image selection GUI as shown in FIG. 25 may be displayed for the selected area. When no detailed imaging target area has been set, an image selection GUI as shown in FIG. 25 may be displayed for the detailed imaging target area 282 shown in FIG. 13.
 By doing so, it is possible to present, for each detailed imaging target area, from which other angles captured videos exist in a visually easy-to-understand manner. Therefore, the user can select a desired captured image more easily. That is, from among the captured images captured by a large number of imaging devices and accumulated in the accumulation server 102, a captured image that matches the user's interest can easily be selected and viewed.
 When a captured image is selected as described above, in step S263 the captured image request unit 233 controls the communication unit 214 to request the captured image data of the selected captured image from the accumulation server 102. In step S252, the captured image providing unit 187 of the accumulation server 102 controls the communication unit 164 to accept that request, reads the captured image data from the captured image database 182, and supplies it to the terminal device 103 via the communication unit 164.
 In step S263, the captured image acquisition unit 234 of the terminal device 103 controls the communication unit 214 to acquire the captured image data. In step S264, the reproduction unit 235 reproduces the captured image data. When the process of step S264 ends, the image providing processing ends.
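 A minimal client-side sketch of steps S263 and S264 follows; the server URL and the stand-in player are placeholders and not part of the disclosure.

    import urllib.request

    def fetch_selected_view(url, out_path):
        # Request and save the selected captured image data (step S263).
        with urllib.request.urlopen(url) as resp, open(out_path, "wb") as f:
            f.write(resp.read())
        return out_path

    def play(path):
        # Stand-in for the reproduction unit 235 (step S264).
        print("playing", path)

    selected_url = "http://example.com/event1/cam8.mp4"  # chosen via the playlist GUI
    play(fetch_selected_view(selected_url, "cam8.mp4"))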
 By executing each process as described above, the large amount of accumulated captured image data can be classified for each imaging target (a person, an object, or a certain surrounding area) using the metadata added to the captured image data at the time of imaging together with the positional relationship, prepared in advance, between the area in which imaging is possible and the imaging targets, and a multi-view viewing playlist can be generated that carries information on the direction from which each of those captured images was captured with respect to a specific imaging target point (or, if the target has a certain extent, its center). On the basis of that playlist, the user can easily select the desired captured image from among the captured images captured by the plurality of imaging devices 101, including the image the user himself or herself captured. This increases the value of accumulating and sharing captured image data captured by a large number of spectators.
 In the above, image data has been described as an example, but the information to be accumulated and shared is arbitrary. For example, audio data may be accumulated and shared. A plurality of types of data, for example audio data and image data, may also be accumulated and shared. That is, by applying the present technology, the user can more easily select desired content.
 In the above, the captured image data is grouped for each detailed imaging target area; however, this grouping can be performed on the basis of anything related to the imaging target. For example, the captured image data may be grouped for each imaging target (person, object, or area).
  <3. Software>
 The series of processes described above can be executed by hardware or by software. When the series of processes is executed by software, for example, each processing unit of the imaging device 101 (the imaging unit 121 through the communication unit 124), the CPU 151 of the accumulation server 102, the CPU 201 of the terminal device 103, and so on may be configured as a computer capable of executing that software. This computer includes, for example, a computer incorporated in dedicated hardware and a general-purpose computer capable of executing arbitrary functions by installing various programs.
 For example, the accumulation server 102 has the configuration described with reference to FIG. 3, and as described above, the CPU 151 loads a program stored in, for example, the storage unit 163 into the RAM 153 via the input/output interface 160 and the bus 154 and executes it, thereby executing the series of processes described above by software.
 This program can be provided, for example, by being recorded on the removable medium 171 as packaged media or the like. In that case, the program can be installed in the storage unit 163 via the input/output interface 160 by attaching the removable medium 171 to the drive 165. The program can also be provided via a wired or wireless transmission medium such as a local area network, the Internet, or digital satellite broadcasting. In that case, the program can be received by the communication unit 164 and installed in the storage unit 163. In addition, the program can be installed in advance in the ROM 152 or the storage unit 163.
 The accumulation server 102 has been described above, but the same applies to the other devices: they may have a similar configuration and execute a program as described above. Note that part of the series of processes described above may be executed by hardware and the rest by software.
  <4. Other>
 The various kinds of metadata described above may be transmitted or recorded in any form as long as they are associated with the captured image data. Here, the term "associate" means, for example, making one piece of data usable (linkable) when the other piece of data is processed. That is, data associated with each other may be combined into one piece of data or may remain separate pieces of data. For example, information associated with captured image data may be transmitted on a transmission path different from that of the captured image data (or at a different timing). Also, for example, information associated with captured image data may be recorded on a recording medium different from that of the captured image data (or in a different recording area of the same recording medium). Note that this "association" may cover a part of the data rather than the whole data. For example, an image and information corresponding to that image may be associated with each other in an arbitrary unit such as a plurality of frames, one frame, or a portion within a frame.
 In this specification, terms such as "combine", "multiplex", "add", "integrate", "include", "store", "put in", "plug in", and "insert" mean bringing a plurality of things together into one, for example combining captured image data and metadata into one piece of data, and each means one method of the "association" described above.
 The embodiments of the present technology are not limited to the embodiments described above, and various modifications can be made without departing from the gist of the present technology.
 For example, the present technology can also be implemented as any configuration constituting a device or a system, for example, a processor such as a system LSI (Large Scale Integration), a module using a plurality of processors, a unit using a plurality of modules, or a set in which other functions are further added to a unit (that is, a partial configuration of a device).
 In this specification, a system means a set of a plurality of components (devices, modules (parts), and the like), and it does not matter whether all the components are in the same housing. Therefore, a plurality of devices housed in separate housings and connected via a network, and a single device in which a plurality of modules are housed in one housing, are both systems.
 Also, for example, the configuration described as one device (or processing unit) may be divided and configured as a plurality of devices (or processing units). Conversely, the configurations described above as a plurality of devices (or processing units) may be combined and configured as one device (or processing unit). A configuration other than those described above may of course be added to the configuration of each device (or each processing unit). Furthermore, as long as the configuration and operation of the system as a whole are substantially the same, part of the configuration of a certain device (or processing unit) may be included in the configuration of another device (or another processing unit).
 Also, for example, the present technology can take a cloud computing configuration in which one function is shared and jointly processed by a plurality of devices via a network.
 Also, for example, the program described above can be executed in any device. In that case, the device only needs to have the necessary functions (functional blocks and the like) and to be able to obtain the necessary information.
 Also, for example, each step described in the flowcharts above can be executed by one device or shared and executed by a plurality of devices. Furthermore, when a plurality of processes are included in one step, the plurality of processes included in that one step can be executed by one device or shared and executed by a plurality of devices. In other words, a plurality of processes included in one step can also be executed as processes of a plurality of steps. Conversely, processes described as a plurality of steps can also be executed collectively as one step.
 Note that, in the program executed by the computer, the processes of the steps describing the program may be executed in time series in the order described in this specification, or may be executed in parallel, or individually at a necessary timing such as when a call is made. That is, as long as no contradiction arises, the processes of the steps may be executed in an order different from the order described above. Furthermore, the processes of the steps describing this program may be executed in parallel with the processes of another program, or may be executed in combination with the processes of another program.
 Note that the plurality of present technologies described in this specification can each be implemented independently and alone as long as no contradiction arises. Of course, any plurality of the present technologies can also be implemented in combination. For example, part or all of the present technology described in any of the embodiments can be implemented in combination with part or all of the present technology described in another embodiment. Also, part or all of any of the present technologies described above can be implemented in combination with another technology not described above.
 Note that the present technology can also take the following configurations.
 (1) An information processing apparatus including a generation unit that generates a multi-view viewing playlist including a list of information usable for reproducing a plurality of pieces of captured image data generated by imaging the same imaging target from mutually different positions.
 (2) The information processing apparatus according to (1), in which the information usable for reproducing the captured image data is information usable for selecting the captured image data to be reproduced.
 (3) The information processing apparatus according to (1) or (2), in which the information usable for reproducing the captured image data includes information indicating the position of the imaging target.
 (4) The information processing apparatus according to any one of (1) to (3), in which the information usable for reproducing the captured image data includes information indicating the imaging direction as seen from the imaging target.
 (5) The information processing apparatus according to any one of (1) to (4), in which the information usable for reproducing the captured image data includes information indicating the acquisition source of the captured image data.
 (6) The information processing apparatus according to any one of (1) to (5), in which the information usable for reproducing the captured image data includes information indicating a period not included in the captured image data.
 (7) The information processing apparatus according to any one of (1) to (6), in which the multi-view viewing playlist includes, for a plurality of imaging targets, a list for each imaging target of the information usable for reproducing the captured image data.
 (8) The information processing apparatus according to any one of (1) to (7), in which the multi-view viewing playlist is generated for a predetermined period and includes information indicating the start time and the length of the period.
 (9) The information processing apparatus according to (8), in which the multi-view viewing playlist includes, for a plurality of periods, a list for each imaging target of the information usable for reproducing the captured image data for each predetermined period.
 (10) The information processing apparatus according to any one of (1) to (9), in which the generation unit generates the multi-view viewing playlist as an MPD (Media Presentation Description) of MPEG-DASH (Moving Picture Experts Group phase - Dynamic Adaptive Streaming over HTTP).
 (11) The information processing apparatus according to (10), in which the multi-view viewing playlist includes information indicating the imaging direction as seen from the imaging target, listed for each area to be imaged.
 (12) The information processing apparatus according to (11), in which the multi-view viewing playlist includes information indicating the center position and the radius of the area to be imaged.
 (13) The information processing apparatus according to any one of (10) to (12), in which the multi-view viewing playlist manages information on each piece of captured image data with an AdaptationSet.
 (14) The information processing apparatus according to any one of (1) to (13), in which the generation unit groups the captured image data on the basis of the imaging target and generates, for each group, a list of the information usable for reproducing the captured image data.
 (15) The information processing apparatus according to (14), in which the generation unit collects all the captured image data into one group, groups the captured image data using a preset area, or groups the captured image data according to the position of each imaging target.
 (16) The information processing apparatus according to any one of (1) to (15), further including an analysis unit that analyzes the imaging situation, in which the generation unit is configured to generate the multi-view viewing playlist on the basis of a result of the analysis by the analysis unit.
 (17) The information processing apparatus according to (16), in which the analysis unit obtains, as a result of the analysis, information on the imaging, information on the imaging target, and information on the imaging direction as seen from the imaging target.
 (18) The information processing apparatus according to (17), in which the analysis unit analyzes the imaging situation on the basis of the metadata of the captured image data and information on the venue.
 (19) The information processing apparatus according to any one of (1) to (18), further including a providing unit that provides the multi-view viewing playlist generated by the generation unit.
 (20) An information processing method including generating a multi-view viewing playlist including a list of information usable for reproducing a plurality of pieces of captured image data generated by imaging the same imaging target from mutually different positions.
 100 image providing system, 101 imaging device, 102 accumulation server, 103 terminal device, 121 imaging unit, 122 metadata generation unit, 123 metadata addition unit, 124 communication unit, 181 accumulation unit, 182 captured image database, 183 imaging situation analysis unit, 184 playlist generation unit, 185 multi-view viewing playlist database, 186 playlist providing unit, 187 captured image providing unit, 231 playlist acquisition unit, 232 image selection processing unit, 233 captured image request unit, 234 captured image acquisition unit, 235 reproduction unit

Claims (20)

  1. An information processing apparatus comprising:
     a generation unit that generates a multi-view viewing playlist including a list of information usable for reproducing a plurality of pieces of captured image data generated by imaging the same imaging target from mutually different positions.
  2. The information processing apparatus according to claim 1, wherein the information usable for reproducing the captured image data is information usable for selecting the captured image data to be reproduced.
  3. The information processing apparatus according to claim 1, wherein the information usable for reproducing the captured image data includes information indicating the position of the imaging target.
  4. The information processing apparatus according to claim 1, wherein the information usable for reproducing the captured image data includes information indicating the imaging direction as seen from the imaging target.
  5. The information processing apparatus according to claim 1, wherein the information usable for reproducing the captured image data includes information indicating the acquisition source of the captured image data.
  6. The information processing apparatus according to claim 1, wherein the information usable for reproducing the captured image data includes information indicating a period not included in the captured image data.
  7. The information processing apparatus according to claim 1, wherein the multi-view viewing playlist includes, for a plurality of imaging targets, a list for each imaging target of the information usable for reproducing the captured image data.
  8. The information processing apparatus according to claim 1, wherein the multi-view viewing playlist is generated for a predetermined period and includes information indicating the start time and the length of the period.
  9. The information processing apparatus according to claim 8, wherein the multi-view viewing playlist includes, for a plurality of periods, a list for each imaging target of the information usable for reproducing the captured image data for each predetermined period.
  10. The information processing apparatus according to claim 1, wherein the generation unit generates the multi-view viewing playlist as an MPD (Media Presentation Description) of MPEG-DASH (Moving Picture Experts Group phase - Dynamic Adaptive Streaming over HTTP).
  11. The information processing apparatus according to claim 10, wherein the multi-view viewing playlist includes information indicating the imaging direction as seen from the imaging target, listed for each area to be imaged.
  12. The information processing apparatus according to claim 11, wherein the multi-view viewing playlist includes information indicating the center position and the radius of the area to be imaged.
  13. The information processing apparatus according to claim 10, wherein the multi-view viewing playlist manages information on each piece of captured image data with an AdaptationSet.
  14. The information processing apparatus according to claim 1, wherein the generation unit groups the captured image data on the basis of the imaging target, and generates, for each group, a list of the information usable for reproducing the captured image data.
  15. The information processing apparatus according to claim 14, wherein the generation unit collects all the captured image data into one group, groups the captured image data using a preset area, or groups the captured image data according to the position of each imaging target.
  16. The information processing apparatus according to claim 1, further comprising an analysis unit that analyzes the imaging situation, wherein the generation unit is configured to generate the multi-view viewing playlist on the basis of a result of the analysis by the analysis unit.
  17. The information processing apparatus according to claim 16, wherein the analysis unit obtains, as a result of the analysis, information on the imaging, information on the imaging target, and information on the imaging direction as seen from the imaging target.
  18. The information processing apparatus according to claim 17, wherein the analysis unit analyzes the imaging situation on the basis of metadata of the captured image data and information on the venue.
  19. The information processing apparatus according to claim 1, further comprising a providing unit that provides the multi-view viewing playlist generated by the generation unit.
  20. An information processing method comprising generating a multi-view viewing playlist including a list of information usable for reproducing a plurality of pieces of captured image data generated by imaging the same imaging target from mutually different positions.
PCT/JP2018/002379 2017-02-10 2018-01-26 Information processing device and method WO2018147089A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2017-022703 2017-02-10
JP2017022703 2017-02-10

Publications (1)

Publication Number Publication Date
WO2018147089A1 true WO2018147089A1 (en) 2018-08-16

Family

ID=63107405

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2018/002379 WO2018147089A1 (en) 2017-02-10 2018-01-26 Information processing device and method

Country Status (1)

Country Link
WO (1) WO2018147089A1 (en)


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2011078008A (en) * 2009-10-01 2011-04-14 Nippon Hoso Kyokai <Nhk> Content sharing apparatus, content editing apparatus, content sharing program, and content editing program
WO2014007083A1 (en) * 2012-07-02 2014-01-09 ソニー株式会社 Transmission apparatus, transmission method, and network apparatus
JP2017504234A (en) * 2013-11-20 2017-02-02 グーグル インコーポレイテッド Multi-view audio and video interactive playback
WO2015109228A1 (en) * 2014-01-16 2015-07-23 Qualcomm Incorporated Robust live operation of dash

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"Information technology-Dynamic adaptive streaming over HTTP (DASH)- Part 1: Media presentation description and segment formats", ISO/IEC, 2012OMICRON 04.01 , ISO/IEC 23009-1 : 2012 (E, pages 6 - 28 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020137876A1 (en) * 2018-12-26 2020-07-02 シャープ株式会社 Generation device, three-dimensional data transmission device, and three-dimensional data reproduction device
WO2022091215A1 (en) * 2020-10-27 2022-05-05 Amatelus株式会社 Video distribution device, video distribution system, video distribution method, and program
JPWO2022091215A1 (en) * 2020-10-27 2022-05-05
JP7208695B2 (en) 2020-10-27 2023-01-19 Amatelus株式会社 Video distribution device, video distribution system, video distribution method, and program

Similar Documents

Publication Publication Date Title
US11837260B2 (en) Elastic cloud video editing and multimedia search
US11381739B2 (en) Panoramic virtual reality framework providing a dynamic user experience
US10897659B2 (en) System and method for enhanced video image recognition using motion sensors
US9418703B2 (en) Method of and system for automatic compilation of crowdsourced digital media productions
US11196788B2 (en) Method and system for aggregating content streams based on sensor data
US20150312528A1 (en) Method and mechanism for coordinated capture and organization of multimedia data
US9483228B2 (en) Live engine
US20160261930A1 (en) Video-providing method and video-providing system
JP6187811B2 (en) Image processing apparatus, image processing method, and program
JP2017517789A (en) Method, apparatus and system for time-based and geographic navigation of video content
CN103999470A (en) System to merge multiple recorded video timelines
WO2018147089A1 (en) Information processing device and method
US10664225B2 (en) Multi vantage point audio player
CN105814905A (en) Method and system for synchronizing usage information between device and server
US20150304724A1 (en) Multi vantage point player
US10375456B2 (en) Providing highlights of an event recording
JP6060085B2 (en) Content management apparatus, content management method, program, and content display method
KR101843025B1 (en) System and Method for Video Editing Based on Camera Movement
US20150256762A1 (en) Event specific data capture for multi-point image capture systems
JP6640130B2 (en) Flexible cloud editing and multimedia search
US10911839B2 (en) Providing smart tags
Bailer et al. Multi-sensor concert recording dataset including professional and user-generated content
van Deventer et al. Media orchestration between streams and devices via new MPEG timed metadata
US20220053248A1 (en) Collaborative event-based multimedia system and method
EP3142116A1 (en) Method and device for capturing a video in a communal acquisition

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18750961

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18750961

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: JP