GB2540224A - Multi-apparatus distributed media capture for playback control - Google Patents

Multi-apparatus distributed media capture for playback control

Info

Publication number
GB2540224A
Authority
GB
United Kingdom
Prior art keywords
orientation
media
common datum
common
capture
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
GB1521096.6A
Other versions
GB201521096D0 (en
Inventor
Shyamsundar Mate Sujeet
Kolmonen Veli-Matti
Johannes Eronen Antti
Juhani Lehtiniemi Arto
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nokia Technologies Oy
Original Assignee
Nokia Technologies Oy
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from GB1511949.8A (GB2540175A)
Priority claimed from GB1518025.0A (GB2543276A)
Priority claimed from GB1518023.5A (GB2543275A)
Application filed by Nokia Technologies Oy
Publication of GB201521096D0
Priority to CN201680052193.2A (CN108432272A)
Priority to EP16820900.5A (EP3320682A4)
Priority to US15/742,687 (US20180213345A1)
Priority to PCT/FI2016/050496 (WO2017005980A1)
Publication of GB2540224A
Legal status: Withdrawn

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04SSTEREOPHONIC SYSTEMS 
    • H04S7/00Indicating arrangements; Control arrangements, e.g. balance control
    • H04S7/30Control circuits for electronic adaptation of the sound field
    • H04S7/302Electronic adaptation of stereophonic sound system to listener position or orientation
    • H04S7/303Tracking of listener position or orientation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04SSTEREOPHONIC SYSTEMS 
    • H04S7/00Indicating arrangements; Control arrangements, e.g. balance control
    • H04S7/30Control circuits for electronic adaptation of the sound field
    • H04S7/302Electronic adaptation of stereophonic sound system to listener position or orientation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04SSTEREOPHONIC SYSTEMS 
    • H04S7/00Indicating arrangements; Control arrangements, e.g. balance control
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04RLOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R1/00Details of transducers, loudspeakers or microphones
    • H04R1/20Arrangements for obtaining desired frequency or directional characteristics
    • H04R1/32Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only
    • H04R1/40Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers
    • H04R1/406Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers microphones
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/03Arrangements for converting the position or the displacement of a member into a coded form
    • G06F3/033Pointing devices displaced or positioned by the user, e.g. mice, trackballs, pens or joysticks; Accessories therefor
    • G06F3/0346Pointing devices displaced or positioned by the user, e.g. mice, trackballs, pens or joysticks; Accessories therefor with detection of the device orientation or free movement in a 3D space, e.g. 3D mice, 6-DOF [six degrees of freedom] pointers using gyroscopes, accelerometers or tilt-sensors
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16Sound input; Sound output
    • G06F3/162Interface to dedicated audio devices, e.g. audio drivers, interface to CODECs
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16Sound input; Sound output
    • G06F3/165Management of the audio stream, e.g. setting of volume, audio stream path
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16Sound input; Sound output
    • G06F3/167Audio in a user interface, e.g. using voice commands for navigating, audio feedback
    • HELECTRICITY
    • H01ELECTRIC ELEMENTS
    • H01QANTENNAS, i.e. RADIO AERIALS
    • H01Q21/00Antenna arrays or systems
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04HBROADCAST COMMUNICATION
    • H04H60/00Arrangements for broadcast applications with a direct linking to broadcast information or broadcast space-time; Broadcast-related systems
    • H04H60/02Arrangements for generating broadcast information; Arrangements for generating broadcast-related information with a direct linking to broadcast information or to broadcast space-time; Arrangements for simultaneous generation of broadcast information and broadcast-related information
    • H04H60/04Studio equipment; Interconnection of studios
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04RLOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R29/00Monitoring arrangements; Testing arrangements
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04RLOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R3/00Circuits for transducers, loudspeakers or microphones
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04RLOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R3/00Circuits for transducers, loudspeakers or microphones
    • H04R3/005Circuits for transducers, loudspeakers or microphones for combining the signals of two or more microphones
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04RLOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R5/00Stereophonic arrangements
    • H04R5/027Spatial or constructional arrangements of microphones, e.g. in dummy heads
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04SSTEREOPHONIC SYSTEMS 
    • H04S7/00Indicating arrangements; Control arrangements, e.g. balance control
    • H04S7/40Visual indication of stereophonic sound image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2203/00Indexing scheme relating to G06F3/00 - G06F3/048
    • G06F2203/038Indexing scheme relating to G06F3/038
    • G06F2203/0381Multimodal input, i.e. interface arrangements enabling the user to issue commands by simultaneous use of input devices of different nature, e.g. voice plus gesture on digitizer
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0481Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance
    • G06F3/0482Interaction with lists of selectable items, e.g. menus
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06KGRAPHICAL DATA READING; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K19/00Record carriers for use with machines and with at least a part designed to carry digital markings
    • G06K19/06Record carriers for use with machines and with at least a part designed to carry digital markings characterised by the kind of the digital marking, e.g. shape, nature, code
    • G06K19/067Record carriers with conductive marks, printed circuits or semiconductor circuit elements, e.g. credit or identity cards also with resonating or responding marks without active components
    • G06K19/07Record carriers with conductive marks, printed circuits or semiconductor circuit elements, e.g. credit or identity cards also with resonating or responding marks without active components with integrated circuit chips
    • G06K19/0723Record carriers with conductive marks, printed circuits or semiconductor circuit elements, e.g. credit or identity cards also with resonating or responding marks without active components with integrated circuit chips the record carrier comprising an arrangement for non-contact communication, e.g. wireless communication circuits on transponder cards, non-contact smart cards or RFIDs
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06KGRAPHICAL DATA READING; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K7/00Methods or arrangements for sensing record carriers, e.g. for reading patterns
    • G06K7/10Methods or arrangements for sensing record carriers, e.g. for reading patterns by electromagnetic radiation, e.g. optical sensing; by corpuscular radiation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10HELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2220/00Input/output interfacing specifically adapted for electrophonic musical tools or instruments
    • G10H2220/091Graphical user interface [GUI] specifically adapted for electrophonic musical instruments, e.g. interactive musical displays, musical instrument icons or menus; Details of user interactions therewith
    • G10H2220/101Graphical user interface [GUI] specifically adapted for electrophonic musical instruments, e.g. interactive musical displays, musical instrument icons or menus; Details of user interactions therewith for graphical creation, edition or control of musical data or parameters
    • G10H2220/106Graphical user interface [GUI] specifically adapted for electrophonic musical instruments, e.g. interactive musical displays, musical instrument icons or menus; Details of user interactions therewith for graphical creation, edition or control of musical data or parameters using icons, e.g. selecting, moving or linking icons, on-screen symbols, screen regions or segments representing musical elements or parameters
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10HELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2220/00Input/output interfacing specifically adapted for electrophonic musical tools or instruments
    • G10H2220/091Graphical user interface [GUI] specifically adapted for electrophonic musical instruments, e.g. interactive musical displays, musical instrument icons or menus; Details of user interactions therewith
    • G10H2220/101Graphical user interface [GUI] specifically adapted for electrophonic musical instruments, e.g. interactive musical displays, musical instrument icons or menus; Details of user interactions therewith for graphical creation, edition or control of musical data or parameters
    • G10H2220/106Graphical user interface [GUI] specifically adapted for electrophonic musical instruments, e.g. interactive musical displays, musical instrument icons or menus; Details of user interactions therewith for graphical creation, edition or control of musical data or parameters using icons, e.g. selecting, moving or linking icons, on-screen symbols, screen regions or segments representing musical elements or parameters
    • G10H2220/111Graphical user interface [GUI] specifically adapted for electrophonic musical instruments, e.g. interactive musical displays, musical instrument icons or menus; Details of user interactions therewith for graphical creation, edition or control of musical data or parameters using icons, e.g. selecting, moving or linking icons, on-screen symbols, screen regions or segments representing musical elements or parameters for graphical orchestra or soundstage control, e.g. on-screen selection or positioning of instruments in a virtual orchestra, using movable or selectable musical instrument icons
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04RLOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R2201/00Details of transducers, loudspeakers or microphones covered by H04R1/00 but not provided for in any of its subgroups
    • H04R2201/40Details of arrangements for obtaining desired directional characteristic by combining a number of identical transducers covered by H04R1/40 but not provided for in any of its subgroups
    • H04R2201/4012D or 3D arrays of transducers
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04RLOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R2430/00Signal processing covered by H04R, not provided for in its groups
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04SSTEREOPHONIC SYSTEMS 
    • H04S2400/00Details of stereophonic systems covered by H04S but not provided for in its groups
    • H04S2400/11Positioning of individual sound objects, e.g. moving airplane, within a sound field
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04SSTEREOPHONIC SYSTEMS 
    • H04S2400/00Details of stereophonic systems covered by H04S but not provided for in its groups
    • H04S2400/15Aspects of sound capture and related signal processing for recording or reproduction

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Signal Processing (AREA)
  • Acoustics & Sound (AREA)
  • Health & Medical Sciences (AREA)
  • Otolaryngology (AREA)
  • Theoretical Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Circuit For Audible Band Transducer (AREA)
  • User Interface Of Digital Computer (AREA)
  • Computer Vision & Pattern Recognition (AREA)

Abstract

Apparatus includes a first media capture device 141, such as a microphone array or camera, and a locator which receives at least one remote location signal from a tag to locate an audio source associated with the tag. The locator comprises an array of antenna elements arranged with a reference orientation from which the tag is located, and the tag may transmit a radio-based signal. A common orientation determiner determines a common datum orientation between the reference orientation and a common datum, the common datum being common to the apparatus and at least one further apparatus, and may transmit it to a server. Switching between the apparatus and the further apparatus can be controlled based on their determined common datum orientations. The common orientation determiner may be an electronic compass, radio or light beacon, or a GPS system. Also disclosed is an apparatus and method for playback control of the captured media, using the orientation information.

Description

MULTI-APPARATUS DISTRIBUTED MEDIA CAPTURE FOR PLAYBACK CONTROL
Field
The present application relates to apparatus and methods for distributed audio capture and mixing. The invention further relates to, but is not limited to, apparatus and methods for distributed audio capture and mixing for spatial processing of audio signals to enable spatial reproduction of audio signals.
Background
Capture of audio signals from multiple sources and mixing of those audio signals when these sources are moving in the spatial field requires significant manual effort. For example, the capture and mixing of an audio signal source such as a speaker or artist within an audio environment such as a theatre or lecture hall to be presented to a listener and produce an effective audio atmosphere requires significant investment in equipment and training. A commonly implemented system would be for a professional producer to utilize a close microphone, for example a Lavalier microphone worn by the user or a microphone attached to a boom pole to capture audio signals close to the speaker or other sources, and then manually mix this captured audio signal with one or more suitable spatial (or environmental or audio field) audio signals such that the produced sound comes from an intended direction.
The spatial capture apparatus or omni-directional content capture (OCC) devices should be able to capture high-quality audio signals while being able to track the close microphones.
However, a single-point omni-directional content capture (OCC) apparatus can be problematic in that it provides an all-aspect view but from only a single point in space.
Summary
According to a first aspect there is provided apparatus for capturing media comprising: a first media capture device configured to capture media; a locator configured to receive at least one remote location signal such that the apparatus is configured to locate an audio source associated with a tag generating the remote location signals, the locator comprising an array of antenna elements arranged with a reference orientation from which the tag is located; and a common orientation determiner configured to determine a common datum orientation between the reference orientation and the common datum, the common datum being common with respect to the apparatus and at least one further apparatus for capturing media, such that switching between the apparatus and the further apparatus for capturing media can be controlled based on the determined common datum orientation and a further apparatus common datum orientation.
The media capture device may comprise at least one of: a microphone array configured to capture at least one spatial audio signal comprising an audio source, the microphone array comprising at least two microphones arranged around a first axis and configured to capture an audio source along the reference orientation; and at least one camera configured to capture an image with a field of view including the reference orientation.
The locator may be a radio based positioning locator and wherein the at least one remote location signal may be a radio based positioning tag signal.
The locator may be configured to transmit the common datum orientation associated with the apparatus to a server, wherein the server may be configured to determine an offset orientation between pairs of apparatus for capturing media based on the common datum orientation of the apparatus and the further apparatus common datum orientation.
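The offset determination described above can be sketched as a simple angular difference. This is a minimal illustration, not the claimed implementation: it assumes each common datum orientation is reported as a clockwise angle in degrees measured from the common datum (for example magnetic north), and the function names wrap_degrees and offset_orientation are hypothetical.

```python
def wrap_degrees(angle: float) -> float:
    """Wrap an angle in degrees to the half-open interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def offset_orientation(datum_orientation_a: float, datum_orientation_b: float) -> float:
    """Signed rotation (degrees) from apparatus A's reference orientation to B's.

    Both inputs are assumed to be measured from the same common datum,
    so the datum itself cancels out of the difference.
    """
    return wrap_degrees(datum_orientation_b - datum_orientation_a)


if __name__ == "__main__":
    # A points 350 degrees from the datum, B points 10 degrees from it.
    print(offset_orientation(350.0, 10.0))
    # A points 90 degrees from the datum, B points 45 degrees from it.
    print(offset_orientation(90.0, 45.0))
```

Wrapping the difference keeps the offset as the shortest signed rotation, so a pair of devices straddling the datum direction yields a small angle rather than one close to 360 degrees.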
The locator may be configured to locate an audio source associated with a tag based on the reference orientation from which the tag is located and the common datum orientation so to generate an audio source location orientation relative to the common datum.
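As a rough sketch of the step above, the tag direction measured against the antenna array's reference orientation can be added to the common datum orientation to express the source direction relative to the common datum. The sign convention (clockwise degrees) and the function name source_orientation_from_datum are assumptions made for illustration only.

```python
def source_orientation_from_datum(tag_bearing_deg: float,
                                  datum_orientation_deg: float) -> float:
    """Direction of the tagged audio source relative to the common datum.

    tag_bearing_deg: tag direction measured from the locator's reference
        orientation (assumed clockwise, in degrees).
    datum_orientation_deg: reference orientation measured from the common
        datum (assumed clockwise, in degrees).
    Returns an angle in the range [0, 360).
    """
    return (datum_orientation_deg + tag_bearing_deg) % 360.0


if __name__ == "__main__":
    # Reference orientation is 350 degrees from the datum and the tag is
    # seen 30 degrees clockwise of the reference orientation.
    print(source_orientation_from_datum(30.0, 350.0))
```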
The media capture device may have a capture reference orientation which is offset with respect to the reference orientation associated with the locator antenna elements.
The common orientation determiner may comprise: an electronic compass configured to determine the common datum orientation between the reference orientation and magnetic north; a beacon orientation determiner configured to determine the common datum orientation between the reference orientation and a radio or light beacon; and a gps orientation determiner configured to determine the common datum orientation between the reference orientation and a determined gps derived position.
According to a second aspect there is provided an apparatus for playback control of the captured media, the apparatus configured to: receive, from each of the more than one apparatus for capturing media, a common datum orientation between a reference orientation of the respective apparatus for capturing media and a common datum, the common datum being common with respect to the more than one apparatus for capturing media; and determine an offset orientation between pairs of apparatus for capturing media based on the common datum orientations.
The apparatus may furthermore be configured to provide the offset orientation to a playback apparatus to enable the playback apparatus to control a switch between the more than one apparatus.
The apparatus may further be configured to receive captured media from more than one apparatus wherein the apparatus may be further configured to process the captured media from the more than one apparatus based on the offset orientation when implementing a switch from the first of the pair of apparatus for capturing media to the other.
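One way to picture the use of the offset orientation during a switch is to re-express a viewing (or listening) direction from the first apparatus's frame in the second apparatus's frame. The sketch below is an assumption-laden illustration: angles are clockwise degrees, the offset is the rotation from the first apparatus's reference orientation to the second's, only the rotational difference is compensated (the positional difference between the apparatus is ignored), and the name yaw_after_switch is hypothetical.

```python
def yaw_after_switch(yaw_in_a_deg: float, offset_a_to_b_deg: float) -> float:
    """Yaw in apparatus B's frame that faces the same datum-relative
    direction as yaw_in_a_deg did in apparatus A's frame.

    offset_a_to_b_deg is the rotation from A's reference orientation to
    B's reference orientation. Returns an angle in the range [0, 360).
    """
    return (yaw_in_a_deg - offset_a_to_b_deg) % 360.0


if __name__ == "__main__":
    # A's reference is 350 degrees from the datum, B's is 10 degrees,
    # so the offset from A to B is 20 degrees.
    # A viewer looking 30 degrees right of A's reference faces 20 degrees
    # from the datum, which is 10 degrees right of B's reference.
    print(yaw_after_switch(30.0, 20.0))
```

Applying this correction at the moment of the switch keeps the rendered scene from jumping by the offset angle.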
The apparatus may be further configured to: receive location estimates for audio sources from the more than one apparatus for capturing media; determine a switching policy associated with a switch between a pair of apparatus for capturing media; and apply the switching policy to the location estimates for audio sources.
The switching policy may comprise one or more of the following: maintain a location orientation for an object of interest after a switch; and keep an object of interest within a field of experience after a switch.
A system may comprise: a first apparatus as described herein; a further apparatus for capturing media comprising: a further media capture device configured to capture media; a further locator configured to receive at least one remote location signal such that the further apparatus is configured to locate an audio source associated with a tag generating the remote location signals, the further locator comprising an array of antenna elements arranged with a reference orientation from which the tag is located; and a further common orientation determiner configured to determine a further common datum orientation between the further apparatus reference orientation and the common datum, the common datum being common with respect to the further apparatus and the apparatus for capturing media, such that switching between the apparatus and the further apparatus for capturing media can be controlled based on the determined common datum orientation and a further apparatus common datum orientation.
The system may further comprise at least one remote media capture apparatus, the at least one remote media capture apparatus may comprise: at least one remote media capture apparatus configured to capture media associated with the audio source; and a locator tag configured to transmit a remote location signal.
The system may further comprise a playback control server, the playback control server may comprise: an offset determiner configured to determine an offset orientation between the apparatus for capturing media common datum orientation and the further apparatus for capturing media common datum orientation.
According to a third aspect there is provided a method for capturing media, the method comprising: capturing media using a first media capture device; receiving at least one remote location signal; locating an audio source associated with a tag generating the remote location signal, the location associated with a reference orientation from which the tag is located; determining a common datum orientation between the reference orientation and a common datum, the common datum being common with respect to the first capture device and at least one apparatus for capturing media; and controlling switching between the device media and the apparatus for capturing media based on the determined common datum orientation and a further apparatus common datum orientation.
Capturing media may comprise at least one of: capturing at least one spatial audio signal comprising an audio source using a microphone array comprising at least two microphones arranged around a first axis and configured to capture an audio source along the reference orientation; and capturing an image using at least one camera with a field of view including the reference orientation.
Locating an audio source may comprise radio based positioning locating and wherein the at least one remote location signal may be a radio based positioning tag signal.
Locating an audio source may comprise transmitting the common datum orientation associated with the apparatus to a server, wherein the method may further comprise determining at the server an offset orientation between pairs of apparatus for capturing media based on the common datum orientation and apparatus common datum orientation.
Locating an audio source may comprise locating an audio source associated with a tag based on the reference orientation from which the tag is located and the common datum orientation so to generate an audio source location orientation relative to the common datum.
Capturing media using a first media capture device may comprise capturing media using a first media device with a capture reference orientation which is offset with respect to the reference orientation.
Determining a common datum orientation may comprise: determining the common datum orientation between the reference orientation and magnetic north; determining the common datum orientation between the reference orientation and a radio or light beacon; and determining the common datum orientation between the reference orientation and a determined gps derived position.
According to a fourth aspect there is provided a method for playback control of the captured media, the method comprising: receiving, from each of the more than one apparatus for capturing media, a common datum orientation between a reference orientation of the respective apparatus for capturing media and a common datum, the common datum being common with respect to the more than one apparatus for capturing media; and determining an offset orientation between pairs of apparatus for capturing media based on the common datum orientations.
The method may comprise providing the offset orientation to a playback apparatus to enable the playback apparatus to control a switch between the more than one apparatus.
The method may further comprise: receiving captured media from more than one apparatus; processing the captured media from the more than one apparatus based on the offset orientation when implementing a switch from the first of the pair of apparatus for capturing media to the other.
The method may further comprise: receiving location estimates for audio sources from the more than one apparatus for capturing media; determining a switching policy associated with a switch between a pair of apparatus for capturing media; and applying the switching policy to the location estimates for audio sources.
Determining a switching policy may comprise one or more of the following: maintaining a location orientation for an object of interest after a switch; and keeping an object of interest within a field of experience after a switch.
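The two switching policies named above could be applied to location estimates along the following lines. This is only a sketch under stated assumptions: the "field of experience" is modelled as an angular window of half-width half_field_deg centred on the viewer's yaw, all angles are clockwise degrees in each apparatus's own reference frame, and SwitchingPolicy, apply_switching_policy and every parameter name are hypothetical.

```python
from enum import Enum


class SwitchingPolicy(Enum):
    MAINTAIN_ORIENTATION = "maintain_orientation"
    KEEP_IN_FIELD = "keep_in_field"


def wrap_degrees(angle: float) -> float:
    """Wrap an angle in degrees to the half-open interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def apply_switching_policy(policy: SwitchingPolicy,
                           yaw_old_deg: float,
                           source_old_deg: float,
                           source_new_deg: float,
                           offset_deg: float,
                           half_field_deg: float = 45.0) -> float:
    """Return the viewer yaw to use in the new apparatus's frame.

    source_old_deg / source_new_deg: location estimates of the object of
    interest from the old and new apparatus. offset_deg: rotation from the
    old reference orientation to the new one.
    """
    if policy is SwitchingPolicy.MAINTAIN_ORIENTATION:
        # Keep the object at the same angle relative to the viewer.
        relative = wrap_degrees(source_old_deg - yaw_old_deg)
        return (source_new_deg - relative) % 360.0

    # KEEP_IN_FIELD: keep the datum-relative view direction, then turn
    # by the smallest amount needed to bring the object inside the window.
    yaw_new = (yaw_old_deg - offset_deg) % 360.0
    relative = wrap_degrees(source_new_deg - yaw_new)
    if abs(relative) > half_field_deg:
        excess = relative - half_field_deg if relative > 0 else relative + half_field_deg
        yaw_new = (yaw_new + excess) % 360.0
    return yaw_new


if __name__ == "__main__":
    print(apply_switching_policy(SwitchingPolicy.MAINTAIN_ORIENTATION,
                                 0.0, 30.0, 300.0, 20.0))
    print(apply_switching_policy(SwitchingPolicy.KEEP_IN_FIELD,
                                 0.0, 30.0, 100.0, 20.0))
```

In the first call the object stays 30 degrees to the viewer's right after the switch; in the second the view is turned just far enough that the object sits at the edge of the assumed 45-degree half-window.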
According to a fifth aspect there is provided an apparatus for capturing media, the apparatus comprising: means for capturing media using a first media capture device; means for receiving at least one remote location signal; means for locating an audio source associated with a tag generating the remote location signal, the location associated with a reference orientation from which the tag is located; means for determining a common datum orientation between the reference orientation and a common datum, the common datum being common with respect to the first capture device and at least one apparatus for capturing media; and means for controlling switching between the device media and the apparatus for capturing media based on the determined common datum orientation and a further apparatus common datum orientation.
The means for capturing media may comprise at least one of: means for capturing at least one spatial audio signal comprising an audio source using a microphone array comprising at least two microphones arranged around a first axis and configured to capture an audio source along the reference orientation; and means for capturing an image using at least one camera with a field of view including the reference orientation.
The means for locating an audio source may comprise means for radio based positioning locating and wherein the at least one remote location signal may be a radio based positioning tag signal.
The means for locating an audio source may comprise means for transmitting the common datum orientation associated with the apparatus to a server, wherein the server is configured to determine an offset orientation between pairs of apparatus for capturing media based on the common datum orientation and apparatus common datum orientation.
The means for locating an audio source may comprise means for locating an audio source associated with a tag based on the reference orientation from which the tag is located and the common datum orientation so to generate an audio source location orientation relative to the common datum.
The means for capturing media using a first media capture device may comprise means for capturing media using a first media device with a capture reference orientation which is offset with respect to the reference orientation.
The means for determining a common datum orientation may comprise: means for determining the common datum orientation between the reference orientation and magnetic north; means for determining the common datum orientation between the reference orientation and a radio or light beacon; and means for determining the common datum orientation between the reference orientation and a determined GPS derived position.
According to a sixth aspect there is provided an apparatus for playback control of the captured media, the apparatus comprising: means for receiving, from each of the more than one apparatus for capturing media, a common datum orientation between a reference orientation of the respective apparatus for capturing media and a common datum, the common datum being common with respect to the more than one apparatus for capturing media; and means for determining an offset orientation between pairs of apparatus for capturing media based on the common datum orientations.
The apparatus may comprise means for providing the offset orientation to a playback apparatus to enable the playback apparatus to control a switch between the more than one apparatus.
The apparatus may further comprise: means for receiving captured media from more than one apparatus; means for processing the captured media from the more than one apparatus based on the offset orientation when implementing a switch from the first of the pair of apparatus for capturing media to the other.
The apparatus may further comprise: means for receiving location estimates for audio sources from the more than one apparatus for capturing media; means for determining a switching policy associated with a switch between a pair of apparatus for capturing media; and means for applying the switching policy to the location estimates for audio sources.
The means for determining a switching policy may comprise one or more of the following: means for maintaining a location orientation for an object of interest after a switch; and means for keeping an object of interest within a field of experience after a switch. A computer program product stored on a medium may cause an apparatus to perform the method as described herein.
An electronic device may comprise apparatus as described herein. A chipset may comprise apparatus as described herein.
Embodiments of the present application aim to address problems associated with the state of the art.
Summary of the Figures
For a better understanding of the present application, reference will now be made by way of example to the accompanying drawings in which:
Figures 1a to 1c show example OCC apparatus distributed over a venue according to some embodiments;
Figure 2 shows example OCC apparatus distributed and a tracked object of interest or positioning tag over a venue according to some embodiments;
Figures 3 to 5 show example OCC apparatus offset management according to some embodiments;
Figures 6 and 7 show example OCC apparatus distributions according to some embodiments;
Figure 8 shows a flow diagram of an example object of interest based switching of OCC apparatus according to some embodiments;
Figure 9 shows schematically capture and render apparatus suitable for implementing spatial audio capture and rendering according to some embodiments; and
Figure 10 shows schematically an example device suitable for implementing the capture and/or render apparatus shown in Figure 9.
Embodiments of the Application
The following describes in further detail suitable apparatus and possible mechanisms for the provision of effective capture of audio signals from multiple sources and mixing of those audio signals. In the following examples, audio signals and audio capture signals are described. However it would be appreciated that in some embodiments the apparatus may be part of any suitable electronic device or apparatus configured to capture an audio signal or receive the audio signals and other information signals.
As described previously, a conventional approach to the capturing and mixing of audio sources with respect to an audio background or environment audio field signal would be for a professional producer to utilize an external or close microphone (for example a Lavalier microphone worn by the user or a microphone attached to a boom pole) to capture audio signals close to the audio source, and further utilize an omnidirectional object capture microphone to capture an environmental audio signal. These signals or audio tracks may then be manually mixed to produce an output audio signal such that the produced sound features the audio source coming from an intended (though not necessarily the original) direction.
As would be expected, this requires significant time, effort and expertise to do correctly. Furthermore, in order to cover a large venue, multiple points of omnidirectional capture are needed to create a holistic coverage of the event. More specifically, multiple OCC apparatus are required, as described in further detail herein, to cover a large space.
Furthermore, a consequence of implementing multiple OCC apparatus to provide multiple capture points is that each of the OCC apparatus has its own reference or “Front” direction. Consequently, when switching from one OCC to another, there is a need to identify and store all the reference or “Front” directions. If this is not done, a user moving from one OCC capture point to another may experience a sudden change in orientation while consuming (for example listening to) the content.
The concept as described herein may make it possible to capture and remix an external or close audio signal and spatial or environmental audio signal more effectively and efficiently.
The concept as discussed in the following embodiments relates to a method to determine and signal the relative reference ‘Front’ orientation offsets between multiple omni-directional content capture (OCC) apparatus or devices. In the following embodiments media or media content may refer to audio, video or both. The relative orientation offsets between the multiple OCC devices may be signalled to enable media content adaptation for seamless traversal between OCC apparatus.
As described herein the reference orientation of each OCC apparatus is known to itself. The concept as discussed herein is for each OCC apparatus to determine a common datum orientation (for example by using a magnetic compass to determine magnetic north), and then determine the offset of the OCC apparatus with respect to the determined common datum reference orientation. Although the following examples show the determination of a common datum reference orientation using an electronic compass, other common datum reference methods may be employed. For example, where street view images (e.g. Navteq or Here street view images) are available, visual analysis based global camera pose estimation (CPE) can be used to determine the offset from the common datum. Furthermore common references may be provided by exploiting an artificial reference beacon over a pre-specified IP address or radio channel. Outdoor common references furthermore may use a GPS or other signal at ‘infinity’. This information can then be signalled from the OCC apparatus to a suitable device and combined to determine the relative offsets of each OCC apparatus with respect to each other. The relative offsets between each OCC apparatus may furthermore be signalled to the entity which is delivering the media content for consumption. This entity may use the offset values to adapt the content playback orientation. The sensor based orientation offset measurement may thus be used to enable fast visual analysis based camera pose estimation and consequently achieve fast visual calibration between the OCC apparatus.
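By way of a non-limiting illustration, the combination of the signalled common datum orientations into relative offsets may be sketched as follows. The sketch assumes that each OCC apparatus reports the angle between its reference orientation and the common datum in degrees; the function names and the device identifiers are hypothetical and are not taken from the application.

```python
from itertools import permutations


def wrap_degrees(angle):
    """Wrap an angle in degrees into the half-open range [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def pairwise_offsets(datum_orientations):
    """Return the relative reference offset for every ordered device pair.

    datum_orientations maps a (hypothetical) device id to the angle, in
    degrees, between that device's reference orientation and the common
    datum (for example magnetic north).
    """
    offsets = {}
    for src, dst in permutations(datum_orientations, 2):
        # Rotation to apply when moving from src to dst so that the
        # playback orientation stays fixed relative to the common datum.
        offsets[(src, dst)] = wrap_degrees(
            datum_orientations[dst] - datum_orientations[src]
        )
    return offsets


offsets = pairwise_offsets({"occ1": 30.0, "occ2": 350.0, "occ3": 120.0})
```

Because every offset is derived from the same datum, the offsets are antisymmetric: the offset from one device to another is the negative of the offset in the opposite direction.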
Furthermore in some embodiments there may be an object of interest (OOI) based switching policy. In such embodiments a common reference point can be used to determine the object or region of interest and the consequent selection of the playback starting direction for the user, which ensures that a particular object is in view when switching from one OCC apparatus to another. For example, in the case of OOI tracking with radio based positioning, such as a HAIP (High Accuracy Indoor Positioning) location determination system, the direction of arrival for a particular positioning tag for each OCC apparatus can be used to choose the playback orientation. In some embodiments visual analysis or spatial audio analysis based selection of the start playback direction when switching between OCC devices can be implemented.
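One possible form of such an object of interest based selection is sketched below. It is an assumption of this sketch that the direction of arrival of the positioning tag is available at both OCC apparatus, in degrees relative to each apparatus's own reference orientation, and that the policy is to keep the object at the same place in the user's view; the function names are hypothetical.

```python
def wrap_degrees(angle):
    """Wrap an angle in degrees into the half-open range [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def start_yaw_after_switch(view_yaw_src, doa_src, doa_dst):
    """Choose the playback start yaw for the destination OCC apparatus.

    view_yaw_src: current viewing yaw relative to the source device front.
    doa_src: direction of arrival of the tag at the source device.
    doa_dst: direction of arrival of the same tag at the destination device.
    All angles are in degrees.
    """
    # Where the object sits relative to the centre of the user's view.
    object_in_view = wrap_degrees(doa_src - view_yaw_src)
    # Pick the new yaw so that the object keeps that relative position.
    return wrap_degrees(doa_dst - object_in_view)
```

For instance, a user looking 10 degrees from the front of the first apparatus with the tag at 40 degrees sees the object 30 degrees off centre; the start yaw at the second apparatus is chosen to preserve that 30 degree separation.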
In some embodiments furthermore the OCC apparatus comprises a microphone array part comprising a microphone array. The microphone array may then be mounted on a fixed or telescopic mount which locates the microphone array, with a ‘front’ or reference orientation, relative to a locator part (a locator such as a high accuracy indoor positioning, HAIP, locator). The OCC apparatus further comprises a locator part. The locator part may comprise an array of positioning receivers. Each array element may be located and orientated on the same elevation plane (for example centred on the horizontal plane) and positioned apart in azimuth from each other (for example 120 degrees apart for a 3 element array) in order to provide 360 degree coverage with some overlap. The reference orientation of the microphone array may be coincident with the reference orientation of one of the receiver array elements. However in some embodiments the microphone reference orientation is defined relative to a reference orientation of one of the receiver array elements. Thus in some embodiments the OCC apparatus comprises a co-axially located microphone array and locator. The co-axial location as well as the aligned reference axis of the locator and the media capture system enable simple out of box usage, as the configuration shown herein may remove the need for any calibration or complicated setup. In some embodiments the relative reference orientation information between OCC apparatus may be signalled at a suitable frequency when one or more of the OCC apparatus are moving.
In some embodiments a suitable metadata description format (e.g. SDP/JSON/PROTOBUF/etc.) over a suitable transport protocol (HTTP/UDP/TCP/etc.) can be used to signal the reference information.
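As an example only, a JSON description of the reference information might be built as sketched below. The application does not define a message schema, so every field name here is a hypothetical choice made for illustration, and the transport is left out.

```python
import json


def build_orientation_message(device_id, datum, datum_orientation_deg, timestamp_s):
    """Serialise one device's common datum orientation as a JSON string.

    All field names are illustrative assumptions, not a defined format.
    """
    message = {
        "device_id": device_id,
        "common_datum": datum,
        "datum_orientation_deg": datum_orientation_deg,
        "timestamp_s": timestamp_s,
    }
    return json.dumps(message, sort_keys=True)


payload = build_orientation_message("occ1", "magnetic_north", 30.0, 12.5)
decoded = json.loads(payload)  # what a receiving server might parse
```

Including a timestamp allows a receiver to discard stale orientations when one or more of the OCC apparatus are moving.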
The concept may for example be embodied as a capture system configured to capture both an external or close (speaker, instrument or other source) audio signal and a spatial (audio field) audio signal. The capture system may furthermore be configured to determine or classify a source and/or the space within which the source is located. This information may then be stored or passed to a suitable rendering system which, having received the audio signals and the information, may use this information to generate a suitable mixing and rendering of the audio signal to a user. Furthermore in some embodiments, the render system may enable the user to input a suitable input to control the mixing, for example by use of a headtracking or other input which causes the mixing to be changed.
The concept furthermore is embodied by a broad spatial range capture device or an omni-directional content capture (OCC) apparatus or device.
Although the capture and render systems in the following examples are shown as being separate, it is understood that they may be implemented with the same apparatus or may be distributed over a series of physically separate but communication capable apparatus. For example, a presence-capturing device such as the Nokia OZO device could be equipped with an additional interface for analysing external microphone sources, and could be configured to perform the capture part. The output of the capture part could be a spatial audio capture format (e.g. as a 5.1 channel downmix), the Lavalier sources which are time-delay compensated to match the time of the spatial audio, and other information such as the classification of the source and the space within which the source is found. In some embodiments the raw spatial audio captured by the array microphones (instead of spatial audio processed into 5.1) may be transmitted to the mixer and renderer and the mixer/renderer perform spatial processing on these signals.
The playback apparatus as described herein may be a set of headphones with a motion tracker, and software capable of presenting binaural audio rendering. With head tracking, the spatial audio can be rendered in a fixed orientation with regards to the earth, instead of rotating along with the person’s head.
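The effect of head tracking on the rendering direction may be illustrated with the following minimal sketch, which assumes a single yaw angle from the motion tracker and angles expressed in degrees; the function name is hypothetical.

```python
def world_locked_azimuth(source_azimuth_deg, head_yaw_deg):
    """Return the azimuth at which to render a source relative to the head.

    source_azimuth_deg: direction of the source in the fixed (earth) frame.
    head_yaw_deg: current head yaw reported by the motion tracker.
    The result is wrapped into [-180, 180).
    """
    # Counter-rotate by the head yaw so the source stays fixed to the earth.
    return (source_azimuth_deg - head_yaw_deg + 180.0) % 360.0 - 180.0
```

When the listener turns towards a source, the rendered azimuth tends to zero, which is what keeps the audio scene fixed with regard to the earth rather than to the head.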
Furthermore it is understood that at least some elements of the following capture and render apparatus may be implemented within a distributed computing system such as that known as the ‘cloud’.
With respect to Figure 9 there is shown a system comprising local capture apparatus 101, 103 and 105, a single omni-directional content capture (OCC) apparatus 141, mixer/render apparatus 151, and content playback apparatus 181 suitable for implementing audio capture, rendering and playback according to some embodiments. In this example only three local capture apparatus 101, 103 and 105, configured to generate three local audio signals, are shown; however more or fewer than three local capture apparatus may be employed.
The first local capture apparatus 101 may comprise a first external (or Lavalier) microphone 113 for sound source 1. The external microphone is an example of a ‘close’ audio source capture apparatus and may in some embodiments be a boom microphone or similar neighbouring microphone capture system.
Although the following examples are described with respect to an external microphone as a Lavalier microphone, the concept may be extended to any microphone external or separate to the omni-directional content capture (OCC) apparatus. Thus the external microphones may be Lavalier microphones, hand held microphones, mounted microphones, or any other suitable microphones. The external microphones can be worn or carried by persons, mounted as close-up microphones for instruments, or placed in some relevant location which the designer wishes to capture accurately. The external microphone 113 may in some embodiments be a microphone array. A Lavalier microphone typically comprises a small microphone worn around the ear or otherwise close to the mouth. For other sound sources, such as musical instruments, the audio signal may be provided either by a Lavalier microphone or by an internal microphone system of the instrument (e.g. pick-up microphones in the case of an electric guitar).
The external microphone 113 may be configured to output the captured audio signals to an audio mixer and renderer 151 (and in some embodiments the audio mixer 155). The external microphone 113 may be connected to a transmitter unit (not shown), which wirelessly transmits the audio signal to a receiver unit (not shown).
Furthermore the first local capture apparatus 101 comprises a position tag 111. The position tag 111 may be configured to provide information, such as direction, range, and ID, identifying the position or location of the first capture apparatus 101 and the external microphone 113.
It is important to note that microphones worn by people can freely move in the acoustic space, and a system supporting location sensing of wearable microphones has to support continuous sensing of the user or microphone location. The position tag 111 may thus be configured to output the tag signal to a position locator 143. The positioning system may utilize any suitable radio technology, such as Bluetooth Low Energy, WiFi, or some other. In the example as shown in Figure 9, a second local capture apparatus 103 comprises a second external microphone 123 for sound source 2 and furthermore a position tag 121 for identifying the position or location of the second local capture apparatus 103 and the second external microphone 123.
Furthermore a third local capture apparatus 105 comprises a third external microphone 133 for sound source 3 and furthermore a position tag 131 for identifying the position or location of the third local capture apparatus 105 and the third external microphone 133.
In the following examples the positioning system and the tag may employ High Accuracy Indoor Positioning (HAIP) or another suitable indoor positioning technology. In the HAIP technology, as developed by Nokia, Bluetooth Low Energy is utilized. The positioning technology may also be based on other radio systems, such as WiFi, or some proprietary technology. The positioning system in the examples is based on direction of arrival estimation where antenna arrays are being utilized.
There can be various realizations of the positioning system and an example of which is the radio based location or positioning system described here. The location or positioning system may in some embodiments be configured to output a location (for example, but not restricted, in azimuth plane, or azimuth domain) and distance based location estimate.
For example, GPS is a radio based system where the time-of-flight may be determined very accurately. This, to some extent, can be reproduced in indoor environments using WiFi signaling.
The described system however may provide angular information directly, which in turn can be used very conveniently in the audio solution. In some example embodiments the location can be determined, or the location by the tag can be assisted, by using the output signals of the plurality of microphones and/or plurality of cameras.
The capture apparatus 101 comprises an omni-directional content capture (OCC) apparatus 141. The omni-directional content capture (OCC) apparatus 141 is an example of an ‘audio field’ capture apparatus. In some embodiments the omni-directional content capture (OCC) apparatus 141 may comprise a directional or omnidirectional microphone array 145. The omni-directional content capture (OCC) apparatus 141 may be configured to output the captured audio signals to the mixer/render apparatus 151 (and in some embodiments an audio mixer 155).
Furthermore the omni-directional content capture (OCC) apparatus 141 comprises a source locator 143. The source locator 143 may be configured to receive the information from the position tags 111, 121, 131 associated with the audio sources and identify the position or location of the local capture apparatus 101, 103, and 105 relative to the omni-directional content capture apparatus 141. The source locator 143 may be configured to output this determination of the position of the spatial capture microphone to the mixer/render apparatus 151 (and in some embodiments a position tracker or position server 153). In some embodiments as discussed herein the source locator receives information from the positioning tags within or associated with the external capture apparatus. In addition to these positioning tag signals, the source locator may use video content analysis and/or sound source localization to assist in the identification of the source locations relative to the OCC apparatus 141.
As shown in further detail, the source locator 143 and the microphone array 145 are co-axially located. In other words the relative position and orientation of the source locator 143 and the microphone array 145 is known and defined.
In some embodiments the source locator 143 is a common orientation reference determined position determiner. The common orientation reference determined position determiner is configured to receive the positioning locator tags from the external capture apparatus and furthermore determine the location and/or orientation of the OCC apparatus 141 in order to be able to determine a position or location from the tag information which is relative to the OCC location and the common datum orientation. In other words a (positioning) locator may provide a relative position with respect to its own mounting position. Since the (positioning) locator may be coaxially positioned with the OCC, any relative position of the external capture apparatus is available.
In some embodiments the omni-directional content capture (OCC) apparatus 141 may implement at least some of the functionality within a mobile device.
The omni-directional content capture (OCC) apparatus 141 is thus configured to capture spatial audio, which, when rendered to a listener, enables the listener to experience the sound field as if they were present in the location of the spatial audio capture device.
The local capture apparatus comprising the external microphone in such embodiments is configured to capture high quality close-up audio signals (for example from a key person’s voice, or a musical instrument).
The mixer/render apparatus 151 may comprise a position tracker (or position server) 153. The position tracker 153 may be configured to receive the relative positions from the omni-directional content capture (OCC) apparatus 141 (and in some embodiments the source locator 143) and be configured to output parameters to an audio mixer 155.
Thus in some embodiments the position or location of the OCC apparatus is determined. The location of the spatial audio capture device may be denoted (at time t = 0) as $(x_S(0), y_S(0))$.
In some embodiments
The position tracker may thus determine an azimuth angle α and the distance d with respect to the OCC and the microphone array.
For example, given an external (Lavalier) microphone position at time t
$$(x_L(t),\; y_L(t)),$$
the direction relative to the array is defined by the vector
$$\big(x_L(t) - x_S(0),\; y_L(t) - y_S(0)\big).$$
The azimuth α may then be determined as
$$\alpha = \operatorname{atan2}\big(y_L(t) - y_S(0),\; x_L(t) - x_S(0)\big) - \operatorname{atan2}\big(y_D - y_S(0),\; x_D - x_S(0)\big),$$
where atan2(y, x) is a “Four-Quadrant Inverse Tangent” which gives the angle between the positive x-axis and the point (x, y), and the common datum orientation may be denoted as
$$(x_D,\; y_D).$$
Thus, the first term gives the angle between the positive x-axis (origin at $x_S(0)$ and $y_S(0)$) and the point $(x_L(t), y_L(t))$, and the second term is the angle between the x-axis and the common datum orientation position $(x_D, y_D)$.
The azimuth angle may be obtained by subtracting the second angle from the first.
The distance d can be obtained as
$$d = \sqrt{\big(x_L(t) - x_S(0)\big)^2 + \big(y_L(t) - y_S(0)\big)^2}.$$
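The azimuth and distance determination may be sketched as follows. This is a minimal two-dimensional illustration, assuming that the common datum is represented by a point lying in the datum direction as seen from the OCC apparatus and that the azimuth is taken as the source angle minus the datum angle; the function and parameter names are hypothetical.

```python
import math


def azimuth_and_distance(source_xy, occ_xy, datum_xy):
    """Return (azimuth in radians, distance) of a source relative to the OCC.

    source_xy: external microphone position at time t.
    occ_xy: OCC apparatus position, used as the origin.
    datum_xy: a point in the common datum direction (an assumption here).
    """
    sx, sy = source_xy
    ox, oy = occ_xy
    dx, dy = datum_xy
    # Angle between the positive x-axis and the source, origin at the OCC.
    source_angle = math.atan2(sy - oy, sx - ox)
    # Angle between the positive x-axis and the common datum direction.
    datum_angle = math.atan2(dy - oy, dx - ox)
    azimuth = source_angle - datum_angle
    # Wrap the difference of two angles back into (-pi, pi].
    azimuth = math.atan2(math.sin(azimuth), math.cos(azimuth))
    distance = math.hypot(sx - ox, sy - oy)
    return azimuth, distance
```

The wrapping step is included because the difference of two atan2 results can fall outside the principal range even though each term lies within it.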
In some embodiments, since the positioning location data may be noisy, the position $(x_S(0), y_S(0))$ may be obtained by recording the positions of the positioning tags of the audio capture device and the external (Lavalier) microphone over a time window of some seconds (for example 30 seconds) and then averaging the recorded positions to obtain the inputs used in the equations above.
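The averaging over a time window may be sketched as below. It is assumed, for illustration only, that the recorded positions are available as timestamped two-dimensional samples and that the window starts at the first sample; the function name is hypothetical.

```python
def windowed_mean_position(samples, window_s=30.0):
    """Average noisy (t, x, y) position samples over an initial time window.

    samples: list of (timestamp in seconds, x, y) tuples in time order.
    window_s: window length in seconds, measured from the first sample.
    """
    if not samples:
        raise ValueError("at least one position sample is required")
    start = samples[0][0]
    # Keep only the samples recorded inside the window.
    kept = [(x, y) for t, x, y in samples if 0.0 <= t - start <= window_s]
    n = len(kept)  # never zero: the first sample is always inside the window
    mean_x = sum(x for x, _ in kept) / n
    mean_y = sum(y for _, y in kept) / n
    return mean_x, mean_y
```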
In some embodiments the calibration phase may be initialized by the OCC apparatus being configured to output a speech or other instruction to instruct the user(s) to stay in front of the array for the 30 second duration, and give a sound indication after the period has ended.
Although the examples shown above show the locator 143 generating location or position information in two dimensions, it is understood that this may be generalized to three dimensions, where the position tracker may determine an elevation angle or elevation offset as well as an azimuth angle and distance. In some embodiments other position locating or tracking means can be used for locating and tracking the moving sources. Examples of other tracking means may include inertial sensors, radar, ultrasound sensing, Lidar or laser distance meters, and so on.
In some embodiments, visual analysis and/or audio source localization are used to assist positioning.
Visual analysis, for example, may be performed in order to localize and track predefined sound sources, such as persons and musical instruments. The visual analysis may be applied on panoramic video which is captured along with the spatial audio. This analysis may thus identify and track the position of persons carrying the external microphones based on visual identification of the person. The advantage of visual tracking is that it may be used even when the sound source is silent and therefore when it is difficult to rely on audio based tracking. The visual tracking can be based on executing or running detectors trained on suitable datasets (such as datasets of images containing pedestrians) for each panoramic video frame. In some other embodiments tracking techniques such as Kalman filtering and particle filtering can be implemented to obtain the correct trajectory of persons through video frames. The location of the person with respect to the front direction of the panoramic video, coinciding with the front direction of the spatial audio capture device, can then be used as the direction of arrival for that source. In some embodiments, visual markers or detectors based on the appearance of the Lavalier microphones could be used to help or improve the accuracy of the visual tracking methods. In some embodiments visual analysis can not only provide information about the 2D position of the sound source (i.e. coordinates within the panoramic video frame), but can also provide information about the distance, which can be inferred from the apparent size of the detected sound source, assuming that a “standard” size for that sound source class is known. For example, the distance of ‘any’ person can be estimated based on an average height. Alternatively, a more precise distance estimate can be achieved by assuming that the system knows the size of the specific sound source. For example the system may know or be trained with the height of each person who needs to be tracked.
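A size based distance estimate of this kind may be sketched as follows. The sketch rests on several assumptions that are not stated in the application: the panoramic frame is equirectangular, so that the full frame height spans 180 degrees of elevation; the detected person is upright and roughly centred on the horizon; and the real height is either a class average or a known value. The function name is hypothetical.

```python
import math


def distance_from_angular_height(real_height_m, pixel_height, frame_height_px):
    """Estimate the distance to an object from its apparent height.

    real_height_m: assumed or known real height of the object in metres.
    pixel_height: height of the detection bounding box in pixels.
    frame_height_px: full height of the equirectangular frame in pixels.
    """
    if not 0 < pixel_height < frame_height_px:
        raise ValueError("pixel_height must lie between 0 and the frame height")
    # In an equirectangular frame, rows map linearly onto elevation angle.
    angular_height = math.pi * pixel_height / frame_height_px
    # An object of height H centred on the horizon at distance r subtends
    # an angle of 2 * atan(H / (2 * r)); invert that relation for r.
    return real_height_m / (2.0 * math.tan(angular_height / 2.0))
```

Under these assumptions a smaller bounding box yields a larger distance estimate, so the estimate varies inversely with apparent size.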
In some embodiments the 3D or distance information may be achieved by using depth-sensing devices. For example a ‘Kinect’ system, a time of flight camera, stereo cameras, or camera arrays can be used to generate images which may be analyzed, and from image disparity across multiple images a depth map or 3D visual scene may be created. These images may be generated by a camera.
Audio source position determination and tracking can in some embodiments be used to track the sources. The source direction can be estimated, for example, using a time difference of arrival (TDOA) method. The source position determination may in some embodiments be implemented using steered beamformers along with particle filter-based tracking algorithms.
In some embodiments audio self-localization can be used to track the sources.
There are technologies, in radio technologies and connectivity solutions, which can furthermore support high accuracy synchronization between devices which can simplify distance measurement by removing the time offset uncertainty in audio correlation analysis. These techniques have been proposed for future WiFi standardization for the multichannel audio playback systems. In some embodiments, position estimates from positioning, visual analysis, and audio source localization can be used together; for example, the estimates provided by each may be averaged to obtain improved position determination and tracking accuracy. Furthermore, in order to minimize the computational load of visual analysis (which is typically much “heavier” than the analysis of audio or positioning signals), visual analysis may be applied only on portions of the entire panoramic frame, which correspond to the spatial locations where the audio and/or positioning analysis sub-systems have estimated the presence of sound sources.
Location or position estimation can, in some embodiments, combine information from multiple sources and combination of multiple estimates has the potential for providing the most accurate position information for the proposed systems. However, it is beneficial that the system can be configured to use a subset of position sensing technologies to produce position estimates even at lower resolution.
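Where the estimates to be averaged are directions, a plain arithmetic mean can fail near the wrap-around point (for example 350 and 10 degrees). A minimal sketch using a circular mean is given below; the choice of a circular mean and the function name are assumptions made for illustration.

```python
import math


def fuse_azimuths(estimates_deg):
    """Average azimuth estimates (degrees) from several sub-systems.

    A circular mean is used: each estimate is treated as a unit vector,
    the vectors are summed, and the angle of the sum is returned.
    The result is undefined if the vectors cancel out exactly.
    """
    if not estimates_deg:
        raise ValueError("at least one estimate is required")
    s = sum(math.sin(math.radians(a)) for a in estimates_deg)
    c = sum(math.cos(math.radians(a)) for a in estimates_deg)
    return math.degrees(math.atan2(s, c))
```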
The mixer/render apparatus 151 may furthermore comprise an audio mixer 155. The audio mixer 155 may be configured to receive the audio signals from the external microphones 113, 123, and 133 and the omni-directional content capture (OCC) apparatus 141 microphone array 145 and mix these audio signals based on the parameters (spatial and otherwise) from the position tracker 153. The audio mixer 155 may therefore be configured to adjust the gain and spatial position associated with each audio signal in order to provide the listener with a much more realistic immersive experience. In addition, it is possible to produce more point-like auditory objects, thus increasing the engagement and intelligibility. The audio mixer 155 may furthermore receive additional inputs from the playback device 181 (and in some embodiments the capture and playback configuration controller 183) which can modify the mixing of the audio signals from the sources.
The audio mixer in some embodiments may comprise a variable delay compensator configured to receive the outputs of the external microphones and the OCC microphone array. The variable delay compensator may be configured to receive the position estimates and determine any potential timing mismatch or lack of synchronisation between the OCC microphone array audio signals and the external microphone audio signals and determine the timing delay which would be required to restore synchronisation between the signals. In some embodiments the variable delay compensator may be configured to apply the delay to one of the signals before outputting the signals to the renderer 157.
The timing delay may be referred to as being a positive time delay or a negative time delay with respect to an audio signal. For example, denote a first (OCC) audio signal by x, and another (external capture apparatus) audio signal by y. The variable delay compensator is configured to try to find a delay τ such that x(n) = y(n − τ). Here, the delay τ can be either positive or negative.
The variable delay compensator may in some embodiments comprise a time delay estimator. The time delay estimator may be configured to receive at least part of the OCC audio signal (for example a central channel of a 5.1 channel format spatial encoded channel). Furthermore the time delay estimator is configured to receive an output from the external capture apparatus microphone 113, 123, 133. Furthermore in some embodiments the time delay estimator can be configured to receive an input from the location tracker 153.
As the external microphone may change its location (for example because the person wearing the microphone moves while speaking), the OCC locator 143 can be configured to track the location or position of the external microphone (relative to the OCC apparatus) over time. Furthermore, the time-varying location of the external microphone relative to the OCC apparatus causes a time-varying delay between the audio signals.
In some embodiments a position or location difference estimate from the location tracker 143 can be used as the initial delay estimate. More specifically, if the distance of the external capture apparatus from the OCC apparatus is d, then an initial delay estimate can be calculated. Any audio correlation used in determining the delay estimate may be calculated such that the correlation centre corresponds with the initial delay value. In some embodiments the mixer comprises a variable delay line. The variable delay line may be configured to receive the audio signal from the external microphones and delay the audio signal by the delay value estimated by the time delay estimator. In other words when the ‘optimal’ delay is known, the signal captured by the external (Lavalier) microphone is delayed by the corresponding amount.
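The delay estimation and compensation may be sketched as follows. The sketch assumes that the initial delay is the acoustic propagation time over the distance d at a nominal speed of sound of 343 m/s, that the refinement is a simple cross-correlation search centred on that value, and that the signals are short lists of samples; the function names, the search width and the speed of sound are assumptions of the sketch.

```python
def initial_delay_samples(distance_m, sample_rate_hz, speed_of_sound=343.0):
    """Initial delay estimate (in samples) from the source to OCC distance."""
    return round(distance_m / speed_of_sound * sample_rate_hz)


def refine_delay(occ, ext, centre, search=8):
    """Find tau near `centre` that best satisfies occ[n] = ext[n - tau].

    The correlation is evaluated only for lags within `search` samples of
    the initial estimate, so the correlation centre matches that estimate.
    """
    best_tau, best_score = centre, float("-inf")
    for tau in range(centre - search, centre + search + 1):
        score = 0.0
        for n in range(len(occ)):
            m = n - tau
            if 0 <= m < len(ext):
                score += occ[n] * ext[m]
        if score > best_score:
            best_tau, best_score = tau, score
    return best_tau


def delay_signal(signal, tau):
    """Variable delay line: delay by tau >= 0 samples, keeping the length."""
    if tau < 0:
        raise ValueError("this sketch only handles non-negative delays")
    tau = min(tau, len(signal))
    return [0.0] * tau + list(signal[: len(signal) - tau])
```

Restricting the search to a small range around the distance based estimate reduces the computation and lowers the risk of locking onto a spurious correlation peak.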
In some embodiments the mixer/render apparatus 151 may furthermore comprise a renderer 157. In the example shown in Figure 9 the renderer is a binaural audio renderer configured to receive the output of the mixed audio signals and generate rendered audio signals suitable to be output to the playback apparatus 181. For example in some embodiments the audio mixer 155 is configured to output the mixed audio signals in a first multichannel format (such as a 5.1 channel or 7.1 channel format) and the renderer 157 renders the multichannel audio signal format into a binaural audio format. The renderer 157 may be configured to receive an input from the playback apparatus 181 (and in some embodiments the capture and playback configuration controller 183) which defines the output format for the playback apparatus 181. The renderer 157 may then be configured to output the rendered audio signals to the playback apparatus 181 (and in some embodiments the playback output 185).
The audio renderer 157 may thus be configured to receive the mixed or processed audio signals to generate an audio signal which can for example be passed to headphones or other suitable playback output apparatus. However the output mixed audio signal can be passed to any other suitable audio system for playback (for example a 5.1 channel audio amplifier).
In some embodiments the audio renderer 157 may be configured to perform spatial audio processing on the audio signals.
The mixing and rendering may be described initially with respect to a single (mono) channel, which can be one of the multichannel signals from the OCC apparatus or one of the external microphones. Each channel in the multichannel signal set may be processed in a similar manner, with the treatment for external microphone audio signals and OCC apparatus multichannel signals having the following differences: 1) The external microphone audio signals have time-varying location data (direction of arrival and distance) whereas the OCC signals are rendered from a fixed location. 2) The ratio between synthesized "direct" and "ambient" components may be used to control the distance perception for external microphone sources, whereas the OCC signals are rendered with a fixed ratio. 3) The gain of external microphone signals may be adjusted by the user whereas the gain for OCC signals is kept constant.
The playback apparatus 161 in some embodiments comprises a capture and playback configuration controller 163. The capture and playback configuration controller 163 may enable a user of the playback apparatus to personalise the audio experience generated by the mixer 155 and renderer 157 and furthermore enable the mixer/renderer 151 to generate an audio signal in a native format for the playback apparatus 161. The capture and playback configuration controller 163 may thus output control and configuration parameters to the mixer/renderer 151.
The playback apparatus 161 may furthermore comprise a suitable playback output 165.
In such embodiments the OCC apparatus or spatial audio capture apparatus comprises a microphone array positioned in such a way that allows omnidirectional audio scene capture.
Furthermore the multiple external audio sources may provide uncompromised audio capture quality for sound sources of interest.
As described previously, the system described above with a single OCC apparatus 141 is stable with regard to the captured audio signals, whereas systems which introduce multiple OCC apparatus in order to cover a larger area suffer from a potential switching problem.
Figures 1a to 1c show an example OCC apparatus and example OCC distributions for an example venue which may not be able to be covered using a single OCC apparatus.
Figure 1a for example shows schematically an OCC apparatus or device 141. The OCC apparatus has a 'Front' or reference orientation. In the following examples the OCC apparatus or device is configured to capture audio visual content and is equipped with an in-device magnetic compass 1105. The magnetic compass reference axis and the media capture system reference axis 1403 are shown in Figure 1a as being aligned. Consequently, the offset of the magnetic compass (and thus magnetic North) also represents the offset of the OCC device.
Figure 1b shows a distribution of several OCC devices around a large venue in such a manner as to cover a wide expanse.
Figure 1c shows the potential issue where the offsets between the reference orientations of the OCC devices are not known. In Figure 1c there are shown five OCC (OCC1 141₁ to OCC4 141₄ and OCC6 141₆) located on the periphery of the venue space looking in and a further OCC (OCC5 141₅) located within the venue. As can be seen the reference orientations of the OCC apparatus differ from each other. Thus should a user who is consuming (or listening to) the captured media change their 'viewpoint' from OCC1 141₁ to OCC5 141₅ there would be an abrupt switch in viewpoint orientation. Such a behaviour would not be acceptable to someone experiencing the media (for example the spatially resolved audio signals would likely 'click' in an artificial manner to the new viewpoint).
This effect can be visualised with respect to Figure 2. Figure 2 shows the venue 100 and the OCC distribution as shown in Figure 1c but furthermore shows an example external capture apparatus 201 (or object of interest OOI) located within the venue. In this example a user experiencing the venue and following an external capture apparatus 201 within the venue initially from OCC1 141₁ may hear the source associated with the external capture apparatus 201 as if it is coming from in front and slightly to the right of the listener. In other words the source is located in front and to the right of the reference orientation. However by switching to OCC5 141₅ the source would abruptly switch such that the listener would hear the source coming from the rear right quadrant and as such would be confused with respect to why the source has moved abruptly.
With respect to Figure 3 an example system and apparatus employed in embodiments as described herein to mitigate such switching effects are shown.
Figure 3 for example shows schematically N OCC (OCC1 141₁, OCC2 141₂, ..., OCCN 141ₙ), a playback control server 301 and a consuming entity 303. In this example the playback control server (PCS) 301 may be considered to be similar to the mixer/renderer shown in Figure 9 but with additional functionality as described herein. Furthermore the consuming entity may be considered to be similar to the playback apparatus 161 shown in Figure 9.
The OCC apparatus 141 in some embodiments is configured to determine the following characteristics. Firstly the OCC apparatus is configured to determine an OCC ID value. The OCC ID value uniquely identifies an OCC device within the full system. This value may be determined in any suitable manner. Furthermore the OCC apparatus 141 is configured to determine a time value from which a time stamp or time stamp value, associated with the time when the signals are sent, can be generated. The OCC apparatus may furthermore determine an offset value identifying the difference between the OCC apparatus reference axis and a common reference axis. In the following embodiments the common reference axis is determined by an electronic compass and thus the offset value ONi (for the i'th OCC) is the offset between the OCC reference orientation and magnetic North.
In some embodiments (and as described previously) the OCC is further configured to locate the external capture apparatus or object of interest (OOI) and furthermore determine the orientation of these OOI relative to the OCC reference orientation. This OOI orientation information and an OOI identifier value identifying the external capture apparatus may also be sent with the OCC ID value, time stamp and the offset of reference orientation ONi value to the PCS 301. In some embodiments the OCC is configured to determine the orientation of these OOI with respect to the common reference axis and transmit this information rather than the 'relative to the OCC reference' orientation value.
In other words the OCC is configured to generate or determine and output to the PCS 301 the offset position and OOI information. This is shown for OCC1 in step 330.
Furthermore this is shown in Figure 3 for OCC2 by step 332 and for OCCN by step 334.
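The signalling described above can be pictured as a small record sent by each OCC. The following Python sketch is illustrative only: the class name, field names and the use of degrees are assumptions rather than anything specified in the text, and it assumes the OCC offset and the OOI orientations are measured in the same rotational direction.

```python
from dataclasses import dataclass, field


@dataclass
class OccReport:
    """One signalling message from an OCC to the PCS (names are illustrative)."""
    occ_id: str        # uniquely identifies the OCC within the system
    timestamp: float   # time at which the message was sent
    offset_deg: float  # ONi: reference orientation relative to the common axis
    # OOI identifier -> orientation relative to the OCC reference orientation
    ooi_deg: dict = field(default_factory=dict)

    def ooi_relative_to_common_axis(self):
        """Re-express each OOI orientation relative to the common reference axis."""
        return {ooi: (self.offset_deg + angle) % 360.0
                for ooi, angle in self.ooi_deg.items()}
```

The method corresponds to the variant in which the OCC transmits OOI orientations with respect to the common reference axis rather than relative to its own reference.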
The OCC furthermore may be configured to generate media content such as the captured spatial audio signals from a microphone array. This media content may furthermore be transmitted to the PCS 301.
In some embodiments of the implementation, the OCC apparatus comprises a gyroscope and/or altimeter in addition to the compass. In these embodiments, in addition to the signalling information described above, the position of the OCC apparatus in 3D space can be determined and signalled to the PCS.
Consequently, the reference offset in 3D can be obtained between the OCC apparatus.
The operation of generating/determining the content and positioning information and transmitting it to the PCS with respect to OCC1 141₁ is shown in Figure 3 by step 331.
Furthermore these operations are shown in Figure 3 for OCC2 by step 333 and for OCCN by step 335.
This system is therefore configured to enable switching of viewpoints across different OCC apparatus or capture devices without causing abrupt or unexpected view point changes.
In some embodiments the playback control server (PCS) 301 is configured to receive the OCC ID, which uniquely identifies an OCC device in the full system, the time stamp when the signal was sent and the offset of the reference axis with respect to magnetic North ONi. This information may be used by the PCS 301 to create an offset guidance signal for the end user consuming entity (playback apparatus) 303. The guidance information may for example comprise an identifier identifying the consuming entity or user thereof, the available OCC identifiers, orientation information and object of interest orientation information.
The generation and transmitting of the guidance signal is shown in Figure 3 by step 341.
The consuming entity 303 can be the end user who is watching/listening to the content for example with a head mounted display. The consuming entity may receive the guidance information and display such information to the user via a suitable user interface. Furthermore the consuming entity may be configured to enable a user input to be made to select the 'viewpoint'. In other words the user may select an OCC from which the content is to be captured. The consuming entity may furthermore be configured to select an object of interest the user is interested in. In other words the user may select an OOI identifier.
The consuming entity may furthermore determine other consumption parameters, for example a head tracking value from the head mounted display/headphones from which the content is being output.
This information may be transmitted back to the PCS 301.
The operation of generating/determining OCC ID and OOI ID values is shown in Figure 3 by step 343.
The PCS 301 may in some embodiments operate as a streaming server with respect to the media content.
The PCS 301 may thus receive the output values from the consuming entity 303 (or end user device). Thus for example the PCS may receive information for a switch of viewpoint with respect to a possible pair of OCC devices. For example, if the user is currently on the viewpoint corresponding to OCC1, all the other OCC devices can be candidate switch devices.
The PCS may be configured such that when the user operating the consuming entity switches from OCC1 to OCC5 the viewing angle is chosen based on the switching policy adopted.
For example where the switching policy is a minimal change in viewing angle policy, the PCS may enable a start playback direction in OCC5 to be calculated as follows:
Current viewing angle: ON1 + Offset of current view from Front (for example as provided by the headtracker).
For sake of simplicity if we assume Offset of current view as 0 (in other words the headtracker function is switched off or straight ahead) then
Current viewing angle = ON1
New viewing angle (after switching to OCC5) = ON1 + ON5.
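The sign with which the two offsets combine depends on the direction in which each offset is measured, which the text does not spell out. The following Python sketch is therefore only one possible reading. It assumes (this convention is an assumption, not taken from the source) that every ON value is the clockwise bearing, in degrees, of an OCC's Front axis from magnetic North and that the head-tracker offset is measured clockwise from Front. Under that assumption, keeping the same compass direction across a switch means subtracting the target OCC's offset when expressing the direction in the target's frame; if the target offset is instead measured in the opposite sense, the two terms add as in the expression above. The function names are hypothetical.

```python
def current_viewing_angle(on_current_deg, view_offset_deg=0.0):
    """Compass bearing the user is currently looking along."""
    return (on_current_deg + view_offset_deg) % 360.0


def start_direction_in_target(on_current_deg, on_target_deg, view_offset_deg=0.0):
    """Start playback direction, relative to the target OCC's Front axis,
    that keeps the user's compass bearing unchanged across the switch."""
    bearing = current_viewing_angle(on_current_deg, view_offset_deg)
    return (bearing - on_target_deg) % 360.0
```

The modulo operation keeps every angle in the range 0 to 360 degrees, so offsets that straddle magnetic North do not produce negative or oversized values.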
In some embodiments the external sources (objects of interest) are also tracked. The PCS may thus be configured to compensate for the switching in order to enable seamless following of an object of interest. For example, where an OOI is tracked continuously with a suitable mechanism, the angular position of the OOI with respect to each of the OCC devices is known. In this situation, the start playback orientation is chosen such that the tracked OOI is always visible while switching the view.
In such an example the offset of the OOI with respect to the reference axis of the OCC is signalled by the OCC devices to the PCS. The PCS signals the offset angles between the different OCC pairs to maintain seamless following of the OOI.
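One way to realise such an OOI-following switch, sketched here under the assumption that each OCC reports the OOI orientation in degrees relative to its own reference axis, is to preserve the angle between the user's view direction and the OOI across the switch. The function name and the angle convention are illustrative and not taken from the text.

```python
def start_view_keeping_ooi(view_old_deg, ooi_old_deg, ooi_new_deg):
    """Start view direction in the new OCC's frame.

    view_old_deg: current view direction relative to the old OCC reference.
    ooi_old_deg:  OOI orientation relative to the old OCC reference.
    ooi_new_deg:  OOI orientation relative to the new OCC reference.
    """
    # Where the OOI currently sits relative to the user's view direction.
    ooi_relative_to_view = (ooi_old_deg - view_old_deg) % 360.0
    # Pick the new view direction so the OOI keeps that relative position.
    return (ooi_new_deg - ooi_relative_to_view) % 360.0
```

For instance, if the OOI is 20 degrees to the right of the view from the old OCC and at 140 degrees relative to the new OCC's reference, starting playback at 120 degrees keeps it 20 degrees to the right rather than letting it jump to the rear right quadrant.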
The content from the processed media may then be transmitted to the consuming entity as shown in Figure 3 by step 345.
Figure 4 shows a further system wherein the content streaming and requesting is performed between the consuming entity (end user devices) 303 and a content (streaming) hub 405. In such embodiments the PCS 301 only provides user specific playback control signalling.
In other words the OCC apparatus transmit the offset positions and OOI signalling information to the PCS 301 (as shown in steps 330, 332 and 334) and transmit the content to the content (streaming) hub 405 (as shown in steps 431, 433, and 435).
The content request signalling may then be transmitted from the consuming entity 303 to the content streaming hub 405 as shown in step 443.
The content may then be filtered/mixed/rendered/processed and transmitted from the content streaming hub 405 to the consuming entity 303 as shown in step 445.
Figure 5 shows a system similar to Figure 4 but where the PCS is configured to generate a playback control broadcast service, which any consumer entity 303 or end user device can tune into and receive the offset information about all the OCC devices in the system.
The generation and broadcast of playback information signalling is shown in Figure 5 by step 541.
In some embodiments the systems such as shown in Figures 4 and 5 have the benefit of generating and working only with metadata information. Consequently such systems may be converted into a peer-to-peer configuration between OCC devices.
With respect to Figures 6 and 7 are shown example OCC distributions for OCC apparatus 601 each of which has an effective capture range 603.
Assuming a circular coverage space for each OCC apparatus, coupled with omnidirectional positioning having a range of radius R m, the area covered by a single OCC is Pi*R^2. Figure 6 for example shows a perimeter configuration where the OCC apparatus 601 may only be placed on the perimeter of the venue 600. Figure 7 shows an in-venue configuration where the OCC apparatus 701 can be placed within the venue space. The ratio of the number of OCC apparatus needed between the distributions in Figures 6 and 7 is approximately 2.
With respect to Figure 8 a summary of operations with respect to some embodiments is shown.
The initial operation with respect to the OCC is to determine or record the reference offset with respect to magnetic north (or other common datum) orientation.
The operation of determining or recording the reference offset of the OCC with respect to magnetic north (or other common datum) orientation is shown in Figure 8 by step 801.
The reference offset may then be transmitted to a PCS or other suitable server.
The operation of transmitting the reference offset is shown in Figure 8 by step 803.
The server or PCS may be configured to determine reference offset differences between pairs of OCC apparatus.
The operation of determining the reference offset differences is shown in Figure 8 by step 805.
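The pairwise differences can be tabulated once the server holds every OCC's reference offset. The following is a minimal Python sketch, assuming offsets are expressed in degrees from the common datum and that a signed result in the range (-180, 180] is wanted; the function name and the dictionary layout are hypothetical.

```python
from itertools import permutations


def pairwise_offset_differences(offsets_deg):
    """Map (from_occ, to_occ) -> signed rotation in degrees, in (-180, 180].

    offsets_deg maps an OCC ID to its reference offset from the common datum.
    """
    table = {}
    for a, b in permutations(offsets_deg, 2):
        diff = (offsets_deg[b] - offsets_deg[a]) % 360.0
        # Report the shorter way round as a signed angle.
        table[(a, b)] = diff - 360.0 if diff > 180.0 else diff
    return table
```

Each ordered pair gets its own entry, since the rotation needed when switching from one OCC to another is the negative of the rotation for the reverse switch.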
In some embodiments the PCS may furthermore determine a switching policy. For example in some embodiments the switching policy may be configured to maintain the same orientation after a switch, or may be configured to keep the OOI within the field of view or within a range of hearing orientation, or any other switching policy.
The operation of determining a switching policy is shown in Figure 8 by step 808.
In some embodiments the switching policy may determine the user specific start playback orientation (especially when a switch between OCC apparatus is made).
The operation of determining a user specific start playback orientation is shown in Figure 8 by step 807.
The system in some embodiments furthermore may determine or generate playback offset information which can be provided to the playback devices.
The determination or generation of the playback offset information is shown in Figure 8 by step 809.
The user device, or playback device may receive the information and add the current position offset with respect to the local reference to a received playback offset and this may be used to control the media playback, for example to control the mixing and rendering of the audio signals to be output to the user.
The operation of adding the current position offset with respect to the local reference to a received playback offset is shown in Figure 8 by step 811.
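On the playback device this last step amounts to a wrapped addition of two angles, after which source directions can be expressed relative to the listener for mixing and rendering. The following Python sketch assumes angles in degrees and that source orientations are given in the same frame as the playback offset; both function names are hypothetical.

```python
def playback_orientation(received_playback_offset_deg, local_offset_deg):
    """Orientation used to control playback: the received playback offset plus
    the user's current offset with respect to the local reference."""
    return (received_playback_offset_deg + local_offset_deg) % 360.0


def source_azimuth_for_renderer(source_deg, orientation_deg):
    """Azimuth of a source relative to the listener, in (-180, 180]."""
    rel = (source_deg - orientation_deg) % 360.0
    return rel - 360.0 if rel > 180.0 else rel
```

The second function is one plausible way for the orientation to feed the renderer; the text only states that the summed offset is used to control the media playback.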
With respect to Figure 10 an example electronic device which may be used as at least part of the external capture apparatus 101, 103 or 105 or OCC capture apparatus 141, or mixer/renderer 151 or the playback apparatus 161 is shown. The device may be any suitable electronics device or apparatus. For example in some embodiments the device 1200 is a mobile device, user equipment, tablet computer, computer, audio playback apparatus, etc.
The device 1200 may comprise a microphone array 1201. The microphone array 1201 may comprise a plurality (for example a number N) of microphones. However it is understood that there may be any suitable configuration of microphones and any suitable number of microphones. In some embodiments the microphone array 1201 is separate from the apparatus and the audio signals are transmitted to the apparatus by a wired or wireless coupling. The microphone array 1201 may in some embodiments be the microphone 113, 123, 133, or microphone array 145 as shown in Figure 9.
The microphones may be transducers configured to convert acoustic waves into suitable electrical audio signals. In some embodiments the microphones can be solid state microphones. In other words the microphones may be capable of capturing audio signals and outputting a suitable digital format signal. In some other embodiments the microphones or microphone array 1201 can comprise any suitable microphone or audio capture means, for example a condenser microphone, capacitor microphone, electrostatic microphone, Electret condenser microphone, dynamic microphone, ribbon microphone, carbon microphone, piezoelectric microphone, or microelectrical-mechanical system (MEMS) microphone. The microphones can in some embodiments output the captured audio signal to an analogue-to-digital converter (ADC) 1203.
The device 1200 may further comprise an analogue-to-digital converter 1203. The analogue-to-digital converter 1203 may be configured to receive the audio signals from each of the microphones in the microphone array 1201 and convert them into a format suitable for processing. In some embodiments where the microphones are integrated microphones the analogue-to-digital converter is not required. The analogue-to-digital converter 1203 can be any suitable analogue-to-digital conversion or processing means. The analogue-to-digital converter 1203 may be configured to output the digital representations of the audio signals to a processor 1207 or to a memory 1211.
In some embodiments the device 1200 comprises at least one processor or central processing unit 1207. The processor 1207 can be configured to execute various program codes. The implemented program codes can comprise, for example, SPAC control, position determination and tracking and other code routines such as described herein.
In some embodiments the device 1200 comprises a memory 1211. In some embodiments the at least one processor 1207 is coupled to the memory 1211. The memory 1211 can be any suitable storage means. In some embodiments the memory 1211 comprises a program code section for storing program codes implementable upon the processor 1207. Furthermore in some embodiments the memory 1211 can further comprise a stored data section for storing data, for example data that has been processed or is to be processed in accordance with the embodiments as described herein. The implemented program code stored within the program code section and the data stored within the stored data section can be retrieved by the processor 1207 whenever needed via the memory-processor coupling.
In some embodiments the device 1200 comprises a user interface 1205. The user interface 1205 can be coupled in some embodiments to the processor 1207. In some embodiments the processor 1207 can control the operation of the user interface 1205 and receive inputs from the user interface 1205. In some embodiments the user interface 1205 can enable a user to input commands to the device 1200, for example via a keypad. In some embodiments the user interface 1205 can enable the user to obtain information from the device 1200. For example the user interface 1205 may comprise a display configured to display information from the device 1200 to the user. The user interface 1205 can in some embodiments comprise a touch screen or touch interface capable of both enabling information to be entered to the device 1200 and further displaying information to the user of the device 1200.
In some embodiments the device 1200 comprises a transceiver 1209. The transceiver 1209 in such embodiments can be coupled to the processor 1207 and configured to enable communication with other apparatus or electronic devices, for example via a wireless communications network. The transceiver 1209 or any suitable transceiver or transmitter and/or receiver means can in some embodiments be configured to communicate with other electronic devices or apparatus via a wire or wired coupling.
For example as shown in Figure 10 the transceiver 1209 may be configured to communicate with a playback apparatus 103.
The transceiver 1209 can communicate with further apparatus by any suitable known communications protocol. For example in some embodiments the transceiver 1209 or transceiver means can use a suitable universal mobile telecommunications system (UMTS) protocol, a wireless local area network (WLAN) protocol such as for example IEEE 802.X, a suitable short-range radio frequency communication protocol such as Bluetooth, or infrared data communication pathway (IRDA).
In some embodiments the device 1200 may be employed as a render apparatus. As such the transceiver 1209 may be configured to receive the audio signals and positional information from the capture apparatus 101, and generate a suitable audio signal rendering by using the processor 1207 executing suitable code.
The device 1200 may comprise a digital-to-analogue converter 1213. The digital-to-analogue converter 1213 may be coupled to the processor 1207 and/or memory 1211 and be configured to convert digital representations of audio signals (such as from the processor 1207 following an audio rendering of the audio signals as described herein) to an analogue format suitable for presentation via an audio subsystem output. The digital-to-analogue converter (DAC) 1213 or signal processing means can in some embodiments be any suitable DAC technology.
Furthermore the device 1200 can comprise in some embodiments an audio subsystem output 1215. An example, such as shown in Figure 10, may be where the audio subsystem output 1215 is an output socket configured to enable a coupling with the headphones 181. However the audio subsystem output 1215 may be any suitable audio output or a connection to an audio output. For example the audio subsystem output 1215 may be a connection to a multichannel speaker system.
In some embodiments the digital to analogue converter 1213 and audio subsystem 1215 may be implemented within a physically separate output device. For example the DAC 1213 and audio subsystem 1215 may be implemented as cordless earphones communicating with the device 1200 via the transceiver 1209.
Although the device 1200 is shown having both audio capture and audio rendering components, it would be understood that in some embodiments the device 1200 can comprise just the audio capture or audio render apparatus elements.
In general, the various embodiments of the invention may be implemented in hardware or special purpose circuits, software, logic or any combination thereof. For example, some aspects may be implemented in hardware, while other aspects may be implemented in firmware or software which may be executed by a controller, microprocessor or other computing device, although the invention is not limited thereto. While various aspects of the invention may be illustrated and described as block diagrams, flow charts, or using some other pictorial representation, it is well understood that these blocks, apparatus, systems, techniques or methods described herein may be implemented in, as non-limiting examples, hardware, software, firmware, special purpose circuits or logic, general purpose hardware or controller or other computing devices, or some combination thereof.
The embodiments of this invention may be implemented by computer software executable by a data processor of the mobile device, such as in the processor entity, or by hardware, or by a combination of software and hardware. Further in this regard it should be noted that any blocks of the logic flow as in the Figures may represent program steps, or interconnected logic circuits, blocks and functions, or a combination of program steps and logic circuits, blocks and functions. The software may be stored on such physical media as memory chips, or memory blocks implemented within the processor, magnetic media such as hard disk or floppy disks, and optical media such as for example DVD and the data variants thereof, CD.
The memory may be of any type suitable to the local technical environment and may be implemented using any suitable data storage technology, such as semiconductor-based memory devices, magnetic memory devices and systems, optical memory devices and systems, fixed memory and removable memory. The data processors may be of any type suitable to the local technical environment, and may include one or more of general purpose computers, special purpose computers, microprocessors, digital signal processors (DSPs), application specific integrated circuits (ASIC), gate level circuits and processors based on multi-core processor architecture, as non-limiting examples.
Embodiments of the inventions may be practiced in various components such as integrated circuit modules. The design of integrated circuits is by and large a highly automated process. Complex and powerful software tools are available for converting a logic level design into a semiconductor circuit design ready to be etched and formed on a semiconductor substrate.
Programs, such as those provided by Synopsys, Inc. of Mountain View, California and Cadence Design, of San Jose, California automatically route conductors and locate components on a semiconductor chip using well established rules of design as well as libraries of pre-stored design modules. Once the design for a semiconductor circuit has been completed, the resultant design, in a standardized electronic format (e.g., Opus, GDSII, or the like) may be transmitted to a semiconductor fabrication facility or "fab" for fabrication.
The foregoing description has provided by way of exemplary and non-limiting examples a full and informative description of the exemplary embodiment of this invention. However, various modifications and adaptations may become apparent to those skilled in the relevant arts in view of the foregoing description, when read in conjunction with the accompanying drawings and the appended claims. However, all such and similar modifications of the teachings of this invention will still fall within the scope of this invention as defined in the appended claims.

Claims (27)

  1. Apparatus for capturing media comprising: a first media capture device configured to capture media; a locator configured to receive at least one remote location signal such that the apparatus is configured to locate an audio source associated with a tag generating the remote location signals, the locator comprising an array of antenna elements arranged with a reference orientation from which the tag is located; and a common orientation determiner configured to determine a common datum orientation between the reference orientation and the common datum, the common datum being common with respect to the apparatus and at least one further apparatus for capturing media, such that switching between the apparatus and the further apparatus for capturing media can be controlled based on the determined common datum orientation and a further apparatus common datum orientation.
  2. The apparatus as claimed in claim 1, wherein the media capture device comprises at least one of: a microphone array configured to capture at least one spatial audio signal comprising an audio source, the microphone array comprising at least two microphones arranged around a first axis and configured to capture an audio source along the reference orientation; and at least one camera configured to capture an image with a field of view including the reference orientation.
  3. The apparatus as claimed in any of claims 1 to 2, wherein the locator is a radio based positioning locator and wherein the at least one remote location signal is a radio based positioning tag signal.
  4. The apparatus as claimed in any of claims 1 to 3, wherein the locator is configured to transmit the common datum orientation associated with the apparatus to a server, wherein the server is configured to determine an offset orientation between pairs of apparatus for capturing media based on the common datum orientation of the apparatus and the further apparatus common datum orientation.
  5. The apparatus as claimed in any of claims 1 to 4, wherein the locator is configured to locate an audio source associated with a tag based on the reference orientation from which the tag is located and the common datum orientation so as to generate an audio source location orientation relative to the common datum.
  6. The apparatus as claimed in any of claims 1 to 5, wherein the media capture device has a capture reference orientation which is offset with respect to the reference orientation associated with the locator antenna elements.
  7. The apparatus as claimed in any of claims 1 to 6, wherein the common orientation determiner comprises: an electronic compass configured to determine the common datum orientation between the reference orientation and magnetic north; a beacon orientation determiner configured to determine the common datum orientation between the reference orientation and a radio or light beacon; and a GPS orientation determiner configured to determine the common datum orientation between the reference orientation and a determined GPS derived position.
  8. Apparatus for playback control of the captured media, the apparatus configured to: receive, from each of the more than one apparatus for capturing media, a common datum orientation between a reference orientation of the respective apparatus for capturing media and a common datum, the common datum being common with respect to the more than one apparatus for capturing media; and determine an offset orientation between pairs of apparatus for capturing media based on the common datum orientations.
  9. The apparatus as claimed in claim 8, wherein the apparatus is furthermore configured to provide the offset orientation to a playback apparatus to enable the playback apparatus to control a switch between the more than one apparatus.
  10. The apparatus as claimed in any of claims 8 to 9, further configured to receive captured media from more than one apparatus wherein the apparatus is further configured to process the captured media from the more than one apparatus based on the offset orientation when implementing a switch from the first of the pair of apparatus for capturing media to the other.
  11. The apparatus as claimed in any of claims 8 to 10, further configured to: receive location estimates for audio sources from the more than one apparatus for capturing media; determine a switching policy associated with a switch between a pair of apparatus for capturing media; and apply the switching policy to the location estimates for audio sources.
  12. The apparatus as claimed in claim 11, wherein a switching policy comprises one or more of the following: maintain a location orientation for an object of interest after a switch; and keep an object of interest within a field of experience after a switch.
  13. A system comprising: a first apparatus as claimed in any of claims 1 to 7; a further apparatus for capturing media comprising: a further media capture device configured to capture media; a further locator configured to receive at least one remote location signal such that the further apparatus is configured to locate an audio source associated with a tag generating the remote location signals, the further locator comprising an array of antenna elements arranged with a reference orientation from which the tag is located; and a further common orientation determiner configured to determine a further common datum orientation between the further apparatus reference orientation and the common datum, the common datum being common with respect to the further apparatus and the apparatus for capturing media, such that switching between the apparatus and the further apparatus for capturing media can be controlled based on the determined common datum orientation and a further apparatus common datum orientation.
  14. The system as claimed in claim 13, further comprising at least one remote media capture apparatus, the at least one remote media capture apparatus comprising: at least one remote media capture apparatus configured to capture media associated with the audio source; and a locator tag configured to transmit remote location signal.
  15. The system as claimed in claim 14, further comprising a playback control server, the playback control server comprising: an offset determiner configured to determine an offset orientation between the apparatus for capturing media common datum orientation and the further apparatus for capturing media common datum orientation.
  16. A method for capturing media, the method comprising: capturing media using a first media capture device; receiving at least one remote location signal; locating an audio source associated with a tag generating the remote location signal, the location associated with a reference orientation from which the tag is located; determining a common datum orientation between the reference orientation and a common datum, the common datum being common with respect to the first capture device and at least one apparatus for capturing media; and controlling switching between the device media and the apparatus for capturing media based on the determined common datum orientation and a further apparatus common datum orientation.
  17. The method as claimed in claim 16, wherein capturing media comprises at least one of: capturing at least one spatial audio signal comprising an audio source using a microphone array comprising at least two microphones arranged around a first axis and configured to capture an audio source along the reference orientation; and capturing an image using at least one camera with a field of view including the reference orientation.
  18. The method as claimed in any of claims 16 to 17, wherein locating an audio source comprises radio based positioning locating and wherein the at least one remote location signal is a radio based positioning tag signal.
  19. The method as claimed in any of claims 16 to 18, wherein locating an audio source comprises transmitting the common datum orientation associated with the apparatus to a server, wherein the method further comprises determining at the server an offset orientation between pairs of apparatus for capturing media based on the common datum orientation and apparatus common datum orientation.
  20. The method as claimed in any of claims 16 to 19, wherein locating an audio source comprises locating an audio source associated with a tag based on the reference orientation from which the tag is located and the common datum orientation so as to generate an audio source location orientation relative to the common datum.
  21. The method as claimed in any of claims 16 to 20, wherein capturing media using a first media capture device comprises capturing media using a first media device with a capture reference orientation which is offset with respect to the reference orientation.
  22. The method as claimed in any of claims 16 to 21, wherein determining a common datum orientation comprises: determining the common datum orientation between the reference orientation and magnetic north; determining the common datum orientation between the reference orientation and a radio or light beacon; and determining the common datum orientation between the reference orientation and a determined GPS-derived position.
  23. A method for playback control of the captured media, the method comprising: receiving, from each of the more than one apparatus for capturing media, a common datum orientation between a reference orientation of the respective apparatus for capturing media and a common datum, the common datum being common with respect to the more than one apparatus for capturing media; and determining an offset orientation between pairs of apparatus for capturing media based on the common datum orientations.
  24. The method as claimed in claim 23, wherein the method comprises providing the offset orientation to a playback apparatus to enable the playback apparatus to control a switch between the more than one apparatus.
  25. The method as claimed in any of claims 22 to 24, further comprising: receiving captured media from more than one apparatus; and processing the captured media from the more than one apparatus based on the offset orientation when implementing a switch from the first of the pair of apparatus for capturing media to the other.
  26. The method as claimed in any of claims 22 to 25, further comprising: receiving location estimates for audio sources from the more than one apparatus for capturing media; determining a switching policy associated with a switch between a pair of apparatus for capturing media; and applying the switching policy to the location estimates for audio sources.
  27. The method as claimed in claim 26, wherein determining a switching policy comprises one or more of the following: maintaining a location orientation for an object of interest after a switch; and keeping an object of interest within a field of experience after a switch.
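
The orientation arithmetic recited in the playback-control claims (offset orientation between pairs of capture apparatus, source orientation relative to the common datum, and the "maintain a location orientation" switching policy) can be illustrated with a minimal sketch. It is not taken from the specification. It assumes that each common datum orientation is a single angle in degrees, measured from the common datum (for example magnetic north) to the apparatus reference orientation, and that angles are wrapped into [-180, 180). All function names and device labels are hypothetical.

```python
from itertools import combinations


def wrap_degrees(angle):
    """Wrap an angle in degrees into the range [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def offset_orientations(datum_orientations):
    """Offset orientation for every pair of capture apparatus.

    datum_orientations maps a (hypothetical) apparatus label to the angle
    between the common datum and that apparatus's reference orientation.
    The offset for pair (a, b) is the rotation from a's reference
    orientation to b's reference orientation.
    """
    offsets = {}
    for a, b in combinations(sorted(datum_orientations), 2):
        offsets[(a, b)] = wrap_degrees(datum_orientations[b] - datum_orientations[a])
    return offsets


def source_relative_to_datum(tag_bearing, datum_orientation):
    """Express a tag bearing, measured from the apparatus reference
    orientation, as an orientation relative to the common datum."""
    return wrap_degrees(tag_bearing + datum_orientation)


def rotation_to_maintain_orientation(bearing_in_first, bearing_in_second):
    """Assumed form of a 'maintain location orientation' policy.

    Returns the rotation to apply to the second apparatus's scene so that
    the object of interest is rendered at the same direction it had in the
    first apparatus's scene before the switch.
    """
    return wrap_degrees(bearing_in_first - bearing_in_second)


if __name__ == "__main__":
    headings = {"cam_a": 30.0, "cam_b": 100.0, "cam_c": 350.0}
    print(offset_orientations(headings))
    print(source_relative_to_datum(20.0, headings["cam_c"]))
    print(rotation_to_maintain_orientation(20.0, -50.0))
```

The rotation in the last function uses the location estimate reported by each apparatus for the same tag, so it does not rely on the two apparatus being co-located.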
GB1521096.6A 2015-07-08 2015-11-30 Multi-apparatus distributed media capture for playback control Withdrawn GB2540224A (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
CN201680052193.2A CN108432272A (en) 2015-07-08 2016-07-05 How device distributed media capture for playback controls
EP16820900.5A EP3320682A4 (en) 2015-07-08 2016-07-05 Multi-apparatus distributed media capture for playback control
US15/742,687 US20180213345A1 (en) 2015-07-08 2016-07-05 Multi-Apparatus Distributed Media Capture for Playback Control
PCT/FI2016/050496 WO2017005980A1 (en) 2015-07-08 2016-07-05 Multi-apparatus distributed media capture for playback control

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
GB1511949.8A GB2540175A (en) 2015-07-08 2015-07-08 Spatial audio processing apparatus
GB1513198.0A GB2542112A (en) 2015-07-08 2015-07-27 Capturing sound
GB1518025.0A GB2543276A (en) 2015-10-12 2015-10-12 Distributed audio capture and mixing
GB1518023.5A GB2543275A (en) 2015-10-12 2015-10-12 Distributed audio capture and mixing

Publications (2)

Publication Number Publication Date
GB201521096D0 GB201521096D0 (en) 2016-01-13
GB2540224A true GB2540224A (en) 2017-01-11

Family

ID=55177449

Family Applications (3)

Application Number Title Priority Date Filing Date
GB1521096.6A Withdrawn GB2540224A (en) 2015-07-08 2015-11-30 Multi-apparatus distributed media capture for playback control
GB1521098.2A Withdrawn GB2540225A (en) 2015-07-08 2015-11-30 Distributed audio capture and mixing control
GB1521102.2A Withdrawn GB2540226A (en) 2015-07-08 2015-11-30 Distributed audio microphone array and locator configuration

Country Status (5)

Country Link
US (3) US20180213345A1 (en)
EP (3) EP3320693A4 (en)
CN (3) CN107949879A (en)
GB (3) GB2540224A (en)
WO (3) WO2017005979A1 (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2556058A (en) * 2016-11-16 2018-05-23 Nokia Technologies Oy Distributed audio capture and mixing controlling
GB2556922A (en) * 2016-11-25 2018-06-13 Nokia Technologies Oy Methods and apparatuses relating to location data indicative of a location of a source of an audio component
GB2557218A (en) * 2016-11-30 2018-06-20 Nokia Technologies Oy Distributed audio capture and mixing
WO2019106228A1 (en) * 2017-12-01 2019-06-06 Nokia Technologies Oy Processing audio signals
WO2019162690A1 (en) * 2018-02-22 2019-08-29 Sintef Tto As Positioning sound sources
US10524076B2 (en) 2016-04-13 2019-12-31 Nokia Technologies Oy Control of audio rendering
US11010051B2 (en) 2016-06-22 2021-05-18 Nokia Technologies Oy Virtual sound mixing environment

Families Citing this family (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2540175A (en) * 2015-07-08 2017-01-11 Nokia Technologies Oy Spatial audio processing apparatus
US10579879B2 (en) * 2016-08-10 2020-03-03 Vivint, Inc. Sonic sensing
EP3343957B1 (en) * 2016-12-30 2022-07-06 Nokia Technologies Oy Multimedia content
US10187724B2 (en) * 2017-02-16 2019-01-22 Nanning Fugui Precision Industrial Co., Ltd. Directional sound playing system and method
GB2561596A (en) * 2017-04-20 2018-10-24 Nokia Technologies Oy Audio signal generation for spatial audio mixing
GB2563670A (en) 2017-06-23 2018-12-26 Nokia Technologies Oy Sound source distance estimation
US11209306B2 (en) 2017-11-02 2021-12-28 Fluke Corporation Portable acoustic imaging tool with scanning and analysis capability
GB2570298A (en) 2018-01-17 2019-07-24 Nokia Technologies Oy Providing virtual content based on user context
US10735882B2 (en) * 2018-05-31 2020-08-04 At&T Intellectual Property I, L.P. Method of audio-assisted field of view prediction for spherical video streaming
US11457308B2 (en) 2018-06-07 2022-09-27 Sonova Ag Microphone device to provide audio with spatial context
CN112739997A (en) * 2018-07-24 2021-04-30 弗兰克公司 Systems and methods for detachable and attachable acoustic imaging sensors
CN108989947A (en) * 2018-08-02 2018-12-11 广东工业大学 A kind of acquisition methods and system of moving sound
US11451931B1 (en) 2018-09-28 2022-09-20 Apple Inc. Multi device clock synchronization for sensor data fusion
EP3870991A4 (en) 2018-10-24 2022-08-17 Otto Engineering Inc. Directional awareness audio communications system
US10863468B1 (en) * 2018-11-07 2020-12-08 Dialog Semiconductor B.V. BLE system with slave to slave communication
US10728662B2 (en) 2018-11-29 2020-07-28 Nokia Technologies Oy Audio mixing for distributed audio sensors
AU2020253755A1 (en) 2019-04-05 2021-11-04 Tls Corp. Distributed audio mixing
US20200379716A1 (en) * 2019-05-31 2020-12-03 Apple Inc. Audio media user interface
CN112492506A (en) * 2019-09-11 2021-03-12 深圳市优必选科技股份有限公司 Audio playing method and device, computer readable storage medium and robot
US11925456B2 (en) 2020-04-29 2024-03-12 Hyperspectral Corp. Systems and methods for screening asymptomatic virus emitters
CN113905302B (en) * 2021-10-11 2023-05-16 Oppo广东移动通信有限公司 Method and device for triggering prompt message and earphone
GB2613628A (en) 2021-12-10 2023-06-14 Nokia Technologies Oy Spatial audio object positional distribution within spatial audio communication systems
TWI814651B (en) * 2022-11-25 2023-09-01 國立成功大學 Assistive listening device and method with warning function integrating image, audio positioning and omnidirectional sound receiving array
CN116132882B (en) * 2022-12-22 2024-03-19 苏州上声电子股份有限公司 Method for determining installation position of loudspeaker

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1521165A2 (en) * 2003-09-30 2005-04-06 Canon Kabushiki Kaisha Data conversion method and apparatus, and orientation measurement apparatus
US20120195574A1 (en) * 2011-01-31 2012-08-02 Home Box Office, Inc. Real-time visible-talent tracking system
US8743219B1 (en) * 2010-07-13 2014-06-03 Marvell International Ltd. Image rotation correction and restoration using gyroscope and accelerometer

Family Cites Families (60)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE69425499T2 (en) * 1994-05-30 2001-01-04 Makoto Hyuga IMAGE GENERATION PROCESS AND RELATED DEVICE
JP4722347B2 (en) * 2000-10-02 2011-07-13 中部電力株式会社 Sound source exploration system
US6606057B2 (en) * 2001-04-30 2003-08-12 Tantivy Communications, Inc. High gain planar scanned antenna array
AUPR647501A0 (en) * 2001-07-19 2001-08-09 Vast Audio Pty Ltd Recording a three dimensional auditory scene and reproducing it for the individual listener
US7496329B2 (en) * 2002-03-18 2009-02-24 Paratek Microwave, Inc. RF ID tag reader utilizing a scanning antenna system and method
US7187288B2 (en) * 2002-03-18 2007-03-06 Paratek Microwave, Inc. RFID tag reading system and method
US6922206B2 (en) * 2002-04-15 2005-07-26 Polycom, Inc. Videoconferencing system with horizontal and vertical microphone arrays
KR100499063B1 (en) * 2003-06-12 2005-07-01 주식회사 비에스이 Lead-in structure of exterior stereo microphone
US7428000B2 (en) * 2003-06-26 2008-09-23 Microsoft Corp. System and method for distributed meetings
US7327383B2 (en) * 2003-11-04 2008-02-05 Eastman Kodak Company Correlating captured images and timed 3D event data
EP2408192A3 (en) * 2004-04-16 2014-01-01 James A. Aman Multiple view compositing and object tracking system
US7634533B2 (en) * 2004-04-30 2009-12-15 Microsoft Corporation Systems and methods for real-time audio-visual communication and data collaboration in a network conference environment
GB0426448D0 (en) * 2004-12-02 2005-01-05 Koninkl Philips Electronics Nv Position sensing using loudspeakers as microphones
WO2006125849A1 (en) * 2005-05-23 2006-11-30 Noretron Stage Acoustics Oy A real time localization and parameter control method, a device, and a system
JP4257612B2 (en) * 2005-06-06 2009-04-22 ソニー株式会社 Recording device and method for adjusting recording device
US7873326B2 (en) * 2006-07-11 2011-01-18 Mojix, Inc. RFID beam forming system
JP4345784B2 (en) * 2006-08-21 2009-10-14 ソニー株式会社 Sound pickup apparatus and sound pickup method
AU2007221976B2 (en) * 2006-10-19 2009-12-24 Polycom, Inc. Ultrasonic camera tracking system and associated methods
US7995731B2 (en) * 2006-11-01 2011-08-09 Avaya Inc. Tag interrogator and microphone array for identifying a person speaking in a room
JP4254879B2 (en) * 2007-04-03 2009-04-15 ソニー株式会社 Digital data transmission device, reception device, and transmission / reception system
US20110046915A1 (en) * 2007-05-15 2011-02-24 Xsens Holding B.V. Use of positioning aiding system for inertial motion capture
US7830312B2 (en) * 2008-03-11 2010-11-09 Intel Corporation Wireless antenna array system architecture and methods to achieve 3D beam coverage
US20090238378A1 (en) * 2008-03-18 2009-09-24 Invism, Inc. Enhanced Immersive Soundscapes Production
JP5071290B2 (en) * 2008-07-23 2012-11-14 ヤマハ株式会社 Electronic acoustic system
US9185361B2 (en) * 2008-07-29 2015-11-10 Gerald Curry Camera-based tracking and position determination for sporting events using event information and intelligence data extracted in real-time from position information
US7884721B2 (en) * 2008-08-25 2011-02-08 James Edward Gibson Devices for identifying and tracking wireless microphones
WO2010034063A1 (en) * 2008-09-25 2010-04-01 Igruuv Pty Ltd Video and audio content system
CA2765116C (en) * 2009-06-23 2020-06-16 Nokia Corporation Method and apparatus for processing audio signals
EP2517486A1 (en) * 2009-12-23 2012-10-31 Nokia Corp. An apparatus
US20110219307A1 (en) * 2010-03-02 2011-09-08 Nokia Corporation Method and apparatus for providing media mixing based on user interactions
US20120114134A1 (en) * 2010-08-25 2012-05-10 Qualcomm Incorporated Methods and apparatus for control and traffic signaling in wireless microphone transmission systems
US9736462B2 (en) * 2010-10-08 2017-08-15 SoliDDD Corp. Three-dimensional video production system
US9015612B2 (en) * 2010-11-09 2015-04-21 Sony Corporation Virtual room form maker
CN102223515B (en) * 2011-06-21 2017-12-05 中兴通讯股份有限公司 Remote presentation conference system, the recording of remote presentation conference and back method
HUE054452T2 (en) * 2011-07-01 2021-09-28 Dolby Laboratories Licensing Corp System and method for adaptive audio signal generation, coding and rendering
US9274595B2 (en) * 2011-08-26 2016-03-01 Reincloud Corporation Coherent presentation of multiple reality and interaction models
US9084057B2 (en) * 2011-10-19 2015-07-14 Marcos de Azambuja Turqueti Compact acoustic mirror array system and method
US9099069B2 (en) * 2011-12-09 2015-08-04 Yamaha Corporation Signal processing device
WO2013093565A1 (en) * 2011-12-22 2013-06-27 Nokia Corporation Spatial audio processing apparatus
TWI517140B (en) * 2012-03-05 2016-01-11 廣播科技機構公司 Method and apparatus for down-mixing of a multi-channel audio signal
CN104335601B (en) * 2012-03-20 2017-09-08 艾德森***工程公司 Audio system with integrated power, audio signal and control distribution
WO2013142668A1 (en) * 2012-03-23 2013-09-26 Dolby Laboratories Licensing Corporation Placement of talkers in 2d or 3d conference scene
US10107887B2 (en) * 2012-04-13 2018-10-23 Qualcomm Incorporated Systems and methods for displaying a user interface
US9800731B2 (en) * 2012-06-01 2017-10-24 Avaya Inc. Method and apparatus for identifying a speaker
JP6038312B2 (en) * 2012-07-27 2016-12-07 フラウンホーファー−ゲゼルシャフト・ツール・フェルデルング・デル・アンゲヴァンテン・フォルシュング・アインゲトラーゲネル・フェライン Apparatus and method for providing loudspeaker-enclosure-microphone system description
US9031262B2 (en) * 2012-09-04 2015-05-12 Avid Technology, Inc. Distributed, self-scaling, network-based architecture for sound reinforcement, mixing, and monitoring
US9286898B2 (en) * 2012-11-14 2016-03-15 Qualcomm Incorporated Methods and apparatuses for providing tangible control of sound
US10228443B2 (en) * 2012-12-02 2019-03-12 Khalifa University of Science and Technology Method and system for measuring direction of arrival of wireless signal using circular array displacement
WO2014096900A1 (en) * 2012-12-18 2014-06-26 Nokia Corporation Spatial audio apparatus
US9160064B2 (en) * 2012-12-28 2015-10-13 Kopin Corporation Spatially diverse antennas for a headset computer
US9420434B2 (en) * 2013-05-07 2016-08-16 Revo Labs, Inc. Generating a warning message if a portable part associated with a wireless audio conferencing system is not charging
US10204614B2 (en) 2013-05-31 2019-02-12 Nokia Technologies Oy Audio scene apparatus
CN104244164A (en) * 2013-06-18 2014-12-24 杜比实验室特许公司 Method, device and computer program product for generating surround sound field
GB2516056B (en) * 2013-07-09 2021-06-30 Nokia Technologies Oy Audio processing apparatus
US9451162B2 (en) * 2013-08-21 2016-09-20 Jaunt Inc. Camera array including camera modules
US20150078595A1 (en) * 2013-09-13 2015-03-19 Sony Corporation Audio accessibility
US20150139601A1 (en) * 2013-11-15 2015-05-21 Nokia Corporation Method, apparatus, and computer program product for automatic remix and summary creation using crowd-sourced intelligence
KR102221676B1 (en) * 2014-07-02 2021-03-02 삼성전자주식회사 Method, User terminal and Audio System for the speaker location and level control using the magnetic field
US10182301B2 (en) * 2016-02-24 2019-01-15 Harman International Industries, Incorporated System and method for wireless microphone transmitter tracking using a plurality of antennas
EP3252491A1 (en) * 2016-06-02 2017-12-06 Nokia Technologies Oy An apparatus and associated methods

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10524076B2 (en) 2016-04-13 2019-12-31 Nokia Technologies Oy Control of audio rendering
US11010051B2 (en) 2016-06-22 2021-05-18 Nokia Technologies Oy Virtual sound mixing environment
GB2556058A (en) * 2016-11-16 2018-05-23 Nokia Technologies Oy Distributed audio capture and mixing controlling
GB2556922A (en) * 2016-11-25 2018-06-13 Nokia Technologies Oy Methods and apparatuses relating to location data indicative of a location of a source of an audio component
GB2557218A (en) * 2016-11-30 2018-06-20 Nokia Technologies Oy Distributed audio capture and mixing
WO2019106228A1 (en) * 2017-12-01 2019-06-06 Nokia Technologies Oy Processing audio signals
US11172290B2 (en) 2017-12-01 2021-11-09 Nokia Technologies Oy Processing audio signals
WO2019162690A1 (en) * 2018-02-22 2019-08-29 Sintef Tto As Positioning sound sources
CN112005556A (en) * 2018-02-22 2020-11-27 诺莫诺股份有限公司 Positioning sound source
CN112005556B (en) * 2018-02-22 2022-05-03 诺莫诺股份有限公司 Method of determining position of sound source, sound source localization system, and storage medium
US11388512B2 (en) 2018-02-22 2022-07-12 Nomono As Positioning sound sources

Also Published As

Publication number Publication date
GB201521102D0 (en) 2016-01-13
EP3320682A1 (en) 2018-05-16
WO2017005980A1 (en) 2017-01-12
WO2017005979A1 (en) 2017-01-12
WO2017005981A1 (en) 2017-01-12
US20180199137A1 (en) 2018-07-12
CN107949879A (en) 2018-04-20
GB2540225A (en) 2017-01-11
CN108028976A (en) 2018-05-11
GB201521098D0 (en) 2016-01-13
EP3320693A1 (en) 2018-05-16
EP3320537A4 (en) 2019-01-16
EP3320693A4 (en) 2019-04-10
US20180203663A1 (en) 2018-07-19
EP3320682A4 (en) 2019-01-23
CN108432272A (en) 2018-08-21
GB2540226A (en) 2017-01-11
GB201521096D0 (en) 2016-01-13
US20180213345A1 (en) 2018-07-26
EP3320537A1 (en) 2018-05-16

Similar Documents

Publication Publication Date Title
US20180213345A1 (en) Multi-Apparatus Distributed Media Capture for Playback Control
US10397722B2 (en) Distributed audio capture and mixing
CN109804559B (en) Gain control in spatial audio systems
US10645518B2 (en) Distributed audio capture and mixing
US11812235B2 (en) Distributed audio capture and mixing controlling
US10448192B2 (en) Apparatus and method of audio stabilizing
US20180220253A1 (en) Differential headtracking apparatus
US20150319530A1 (en) Spatial Audio Apparatus
US10708679B2 (en) Distributed audio capture and mixing
KR101747800B1 (en) Apparatus for Generating of 3D Sound, and System for Generating of 3D Contents Using the Same

Legal Events

Date Code Title Description
WAP Application withdrawn, taken to be withdrawn or refused ** after publication under section 16(1)