EP1695335A1 - Method for synthesizing acoustic spatialization - Google Patents

Method for synthesizing acoustic spatialization

Info

Publication number
EP1695335A1
Authority
EP
European Patent Office
Prior art keywords
sound
spatialization
synthesis
source
virtual
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP03819273A
Other languages
English (en)
French (fr)
Inventor
Rozenn Nicol
David Virette
Marc Emerit
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Orange SA
Original Assignee
France Telecom SA
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by France Telecom SA filed Critical France Telecom SA
Publication of EP1695335A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H1/00 Details of electrophonic musical instruments
    • G10H1/0091 Means for obtaining special acoustic effects
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2210/00 Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
    • G10H2210/155 Musical effects
    • G10H2210/265 Acoustic effect simulation, i.e. volume, spatial, resonance or reverberation effects added to a musical sound, usually by appropriate filtering or delays
    • G10H2210/295 Spatial effects, musical uses of multiple audio channels, e.g. stereo
    • G10H2210/301 Soundscape or sound field simulation, reproduction or control for musical purposes, e.g. surround or 3D sound; Granular synthesis
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2250/00 Aspects of algorithms or signal processing methods without intrinsic musical character, yet specifically adapted for or used in electrophonic musical processing
    • G10H2250/055 Filters for musical processing or musical effects; Filter responses, filter architecture, filter coefficients or control parameters therefor
    • G10H2250/111 Impulse response, i.e. filters defined or specified by their temporal impulse response features, e.g. for echo or reverberation applications

Definitions

  • the present invention relates to the synthesis of audio signals, in particular in music editing applications, video games, or even ringtones for mobile phones.
  • the invention relates to both sound synthesis techniques and three-dimensional (or "3D") sound techniques.
  • a first family of criteria concerns the use of the following parameters: intuitiveness, perceptibility, physical sense and behavior.
  • the quality and diversity of the sounds produced determine the second family of criteria, with the following parameters: robustness of the sound's identity, extent of the sound palette, and, if necessary, a preliminary analysis phase.
  • the third family of criteria deals with implementation, with parameters such as: computation cost, required memory, control, latency and multitasking.
  • the principle of wave table synthesis consists of taking one or more signal periods (corresponding to a recording or a synthetic signal), applying treatments to them (looping, modification of the fundamental frequency, etc.) and finally applying the above-mentioned ADSR envelope.
  • This very simple synthesis method makes it possible to obtain satisfactory results.
  • a technique similar to wave table synthesis is the one called "sampling", which differs from it, however, in that it uses recordings of natural signals instead of synthetic signals.
  • FM synthesis: another example of simple synthesis is synthesis by frequency modulation, better known under the name of "FM synthesis".
  • a frequency modulation is carried out for which the frequencies of the modulator and of the carrier (fm and fc) are in the audible range (20 to 20,000 Hz). It is also indicated that the respective amplitudes of the harmonics with respect to the fundamental mode can be chosen to define a timbre of the sound.
  • the present invention relates to the combination of sound synthesis with the spatialization of the sounds resulting from this synthesis. We recall below some known sound spatialization techniques.
  • the methods based on a physical approach generally consist in reproducing a sound field identical to the original sound field within an area of finite dimensions. These methods do not a priori take into account the perceptual properties of the auditory system, in particular in terms of auditory localization. With such systems, the listener is immersed in a field identical in every way to the one he would have perceived in the presence of real sources, and he is therefore able to locate the sound sources as in a real listening situation.
  • the methods based on a psycho-acoustic approach rather seek to take advantage of the 3D sound perception mechanisms in order to simplify the process of sound reproduction. For example, instead of reproducing the sound field over an entire area, one can be content to reproduce it only at the level of the two ears of the listener. Similarly, one can impose a faithful reproduction of the sound field on only a fraction of the spectrum, in order to relax the constraint on the rest of the spectrum.
  • the objective is to take into account the perception mechanisms of the auditory system in order to identify the minimum amount of information to reproduce, so as to obtain a field psycho-acoustically identical to the original field, i.e. such that the ear, owing to its limited performance, is unable to distinguish the two from each other.
  • holophony is typically a technique of physical reconstruction of a sound field, since it constitutes the acoustic equivalent of holography. It consists of reproducing a sound field from a recording made on a surface (a hollow sphere, or other). Further details are given in: "Spatialized sound reproduction over a large area: application to telepresence", R. Nicol, PhD thesis, Université du Maine, 1999. The surround ("ambisonic") technique is another example of physical reconstruction of the acoustic field, using a decomposition of the sound field on a basis of eigenfunctions called "spherical harmonics".
  • stereophony, which exploits differences in time or intensity to position sound sources between two speakers, based on the interaural time and intensity differences that define the auditory criteria for localization in the horizontal plane.
  • - binaural techniques which aim to reconstruct the sound field only at the level of the listener's ears, so that their eardrums perceive a sound field identical to that which the real sources would have induced.
  • Each technique is characterized by a specific method of encoding and decoding spatialization information in an adequate format of audio signals.
  • the different sound spatialization techniques are also distinguished by the extent of spatialization that they provide.
  • 3D spatialization, such as surround encoding, holophony, or binaural or transaural synthesis (the latter being a transposition of the binaural technique onto two distant speakers), covers all directions of space.
  • two-dimensional (“2D”) spatialization such as stereophony, or a 2D restriction of holophony or ambisonic technique, is limited to the horizontal plane.
  • the different techniques are distinguished by their possible broadcasting systems, for example: - broadcasting on headphones for binaural techniques, or stereophony, - broadcasting on two speakers, in particular for stereophony or for a transaural system, - or a broadcast on a network with more than two speakers, for an extended listening area (in particular for multi-listener applications), in holophony, or in surround sound reproduction.
  • a wide range of current devices offers possibilities for sound synthesis. These devices range from musical instruments (such as a keyboard, a rhythm machine, or the like) to mobile terminals, for example of the PDA type ("Personal Digital Assistant"), and even computers on which music editing software is installed, or effects pedals with a MIDI interface.
  • Sound reproduction systems: headphones, stereo speakers or multi-speaker systems
  • the quality of sound synthesis systems is very varied, depending in particular on the more or less limited computing capacities and on the environments in which such systems are used.
  • Systems are currently known that are capable of spatializing previously synthesized sounds, in particular by cascading a sound synthesis engine and a spatialization engine. Spatialization is then applied to the synthesizer output signal (on one mono channel or two stereo channels) after mixing the different sources. Implementations of this solution for spatializing the sounds coming from a synthesizer are thus known.
  • 3D rendering engines which can be applied to any type of digital audio signals, whether synthetic or not.
  • the different musical instruments of a MIDI score (MIDI being a classic sound synthesis format)
  • One of the aims of the present invention is a sound synthesis method offering the possibility of directly spatializing synthetic sounds.
  • an object of the present invention is to associate sound synthesis with spatialization tools of satisfactory quality.
  • this association, however, combines the complexity due to sound synthesis with that of spatialization, which makes it difficult to implement spatialized sound synthesis on highly constrained terminals (that is, terminals with relatively limited computing power and memory size).
  • Another object of the present invention is to optimize the complexity of the spatialization of synthetic sounds according to the capabilities of the terminal.
  • the present invention firstly proposes a method of sound synthesis and spatialization, in which a synthetic sound to be generated is characterized by the nature of a virtual sound source and by its position relative to a chosen origin.
  • the method within the meaning of the invention comprises a joint step consisting in determining parameters including at least one gain, in order to define at the same time: the intensity of the synthesized sound and the spatialization of the virtual source.
  • the present invention makes it possible to integrate a sound spatialization technique with a sound synthesis technique, so as to obtain a global processing using common parameters for the implementation of the two techniques.
  • the spatialization of the virtual source takes place in a surround context.
  • the method then includes a step of calculating gains associated with surround components in a base of spherical harmonics.
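As an illustration of such gains in a base of spherical harmonics, a minimal first-order (B-format) sketch is given below. The component names (W, X, Y, Z) and the 1/sqrt(2) weighting on W are conventional assumptions on our part, since the patent does not fix an encoding convention:

```python
import math

def ambisonic_gains_first_order(azimuth, elevation):
    """First-order B-format encoding gains for a source direction.

    azimuth, elevation: source direction in radians.
    Returns the gains (W, X, Y, Z) to apply to the source signal; W uses
    the conventional 1/sqrt(2) weighting (other normalizations exist).
    """
    w = 1.0 / math.sqrt(2.0)                      # omnidirectional component
    x = math.cos(azimuth) * math.cos(elevation)   # front/back figure-of-eight
    y = math.sin(azimuth) * math.cos(elevation)   # left/right figure-of-eight
    z = math.sin(elevation)                       # up/down figure-of-eight
    return (w, x, y, z)

# A source straight ahead in the horizontal plane:
w, x, y, z = ambisonic_gains_first_order(0.0, 0.0)
```

Note that, as stated later in the text, only gains (no delays) are involved in this surround encoding.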
  • the synthetic sound is intended to be reproduced in a holophonic, or binaural, or transaural context, on a plurality of reproduction channels.
  • this "plurality of restitution channels" may equally relate to two restitution channels, in a binaural or transaural context, or to more than two restitution channels, for example in a holophonic context.
  • a delay between restitution channels is also determined, to define at the same time: the instant of triggering of the sound and the spatialization delay of the virtual source.
  • the nature of the virtual source is configured at least by a temporal variation of sound intensity, over a chosen duration and including an instant of triggering of the sound.
  • this time variation can advantageously be represented by an ADSR envelope as described above.
  • this variation comprises at least: an instrumental attack phase, a decay phase, a sustain phase and a release phase of the sound.
  • the spatialization of the virtual source is preferably carried out by a binaural synthesis based on a linear decomposition of transfer functions, these transfer functions being expressed as a linear combination of terms depending on the sound frequency, weighted by terms depending on the sound direction. This measure proves advantageous in particular when the position of the virtual source is liable to change over time and/or when several virtual sources are to be spatialized.
  • the direction is defined by at least one azimuthal angle (for spatialization in a single plane) and, preferably, by an azimuthal angle and an elevation angle (for three-dimensional spatialization).
  • the position of the virtual source is advantageously parameterized at least by: several filterings, as functions of the sound frequency, several weighting gains, each associated with a filtering, and a delay per "left" and "right" channel.
  • the nature of the virtual source is parameterized at least by a sound timbre, by associating selected relative sound intensities with harmonics of a frequency corresponding to a pitch of the sound.
  • this modeling is advantageously carried out by an FM synthesis, described above.
  • a sound synthesis engine capable of generating spatialized sounds with respect to a predetermined origin.
  • the synthesis engine is implemented in the context of musical editing, and a man / machine interface is also provided for placing the virtual source at a chosen position relative to the predetermined origin.
  • each source is assigned to a respective position, preferably using a linear decomposition of the transfer functions in binaural context, as indicated above.
  • the present invention also relates to a module for generating synthetic sounds, comprising a processor and a working memory capable of storing instructions for implementing the above method, so as to process the synthesis and the spatialization of the sound simultaneously, in accordance with one of the advantages provided by the present invention.
  • the present invention also relates to a computer program product, stored in a memory of a central unit or of a terminal (in particular a mobile terminal), or on a removable medium suitable for cooperating with a reader of said central unit, and comprising instructions for implementing the above method.
  • FIG. 1 schematically illustrates positions of sound sources i and positions of microphones j in three-dimensional space
  • FIG. 2 schematically represents a simultaneous spatialization and sound synthesis processing, within the meaning of the invention
  • FIG. 3 schematically represents the application of HRTF transfer functions to signals Si for spatialization in binaural or transaural synthesis
  • Figure 4 schematically represents the application of a pair of delays (one delay per left or right channel) and several gains (one gain per directional filter) in binaural or transaural synthesis, using the linear decomposition of HRTFs
  • - Figure 5 schematically represents the integration of spatialization processing, within a plurality of synthetic sound generators, for spatialization and sound synthesis in a single step
  • - Figure 6 represents an ADSR envelope model in sound synthesis
  • FIG. 7 shows schematically a sound generator in FM synthesis.
  • the present invention proposes to integrate a technique of spatialization of sound with a technique of sound synthesis so as to obtain a global, optimized processing, of spatialized sound synthesis.
  • the pooling of certain sound synthesis operations, on the one hand, and sound spatialization, on the other hand, is particularly interesting.
  • a sound synthesis engine (typically a “synthesizer”) has the role of generating one or more synthetic signals, on the basis of a sound synthesis model, a model which is controlled from a set of parameters, hereinafter called "synthesis parameters".
  • the synthetic signals generated by the synthesis engine can correspond to distinct sound sources (which are, for example, the different instruments of a score) or can be associated with the same source, for example in the case of different notes of the same instrument.
  • the term "tone generator" designates a module for producing a musical note.
  • a synthesizer is composed of a set of tone generators.
  • a sound spatialization tool is a tool which admits a given number of audio signals as input, these signals being representative of sound sources and, in principle, free of spatialization processing. It is in fact indicated that, if these signals have already undergone spatial processing, this prior processing is not taken into account here.
  • the role of the spatialization tool is to process the input signals, according to a diagram which is specific to the spatialization technique chosen, to generate a given number of output signals which define the spatialized signals representative of the sound scene in format of spatialization chosen.
  • the nature and the complexity of the spatialization processing obviously depend on the technique chosen, depending on whether one considers a rendering in stereophonic, binaural, holophonic or ambiophonic format.
  • the encoding corresponds to the sound recording of the sound field generated by the different sources at a given time.
  • This "virtual" sound recording system can be more or less complex depending on the sound spatialization technique used. Thus, a sound recording is simulated by a more or less large number of microphones with different positions and directivities.
  • the encoding reduces, for the calculation of the contribution of a sound source, to at least the application of gains and, more often than not, delays (typically in holophony or in binaural or transaural synthesis) to different copies of the source signal. There is one gain (and, if necessary, one delay) per source for each virtual microphone. This gain (and delay) depends on the position of the source relative to the microphone. If a virtual sound pickup system with K microphones is provided, there are K signals at the output of the encoding system.
  • the signal Ej represents the sum of the contributions of all the sound sources on the microphone j: Ej(t) = Σi Gji(t) · Si(t − τji(t)), with:
  • Si the sound emitted by the source i, and Ej the signal encoded at the output of the microphone j,
  • Gji the attenuation of the sound Si due to the distance between the source i and the microphone j, to the directivity of the source, to the obstacles between the source i and the microphone j, and finally to the directivity of the microphone j itself,
  • τji the delay of the sound Si due to the propagation from the source i to the microphone j,
  • x, y, z the Cartesian coordinates of the position of the source, assumed to be variable over time.
  • the gains and the delays depend on the position of the source i relative to the microphone j at the instant t.
  • the encoding is therefore a representation of the sound field generated by the sound sources at this instant t. It is simply recalled here that in a surround context (consisting of a decomposition of the field in a base of spherical harmonics), the delay does not really intervene in the spatialization processing.
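With these notations, the encoding stage can be sketched in discrete time as below. This is a minimal illustration under simplifying assumptions (delays rounded to whole samples, time-invariant gains and delays); the function and variable names are ours, not the patent's:

```python
def encode(sources, gains, delays, n_samples):
    """Sum the delayed, attenuated contributions of each source on each mic.

    sources: list of L sample lists (the signals S_i)
    gains:   K x L matrix of attenuations G_ji
    delays:  K x L matrix of integer sample delays tau_ji
    Returns K encoded signals E_j with E_j[t] = sum_i G_ji * S_i[t - tau_ji].
    """
    K = len(gains)
    out = [[0.0] * n_samples for _ in range(K)]
    for j in range(K):
        for i, s in enumerate(sources):
            g, d = gains[j][i], delays[j][i]
            for t in range(n_samples):
                if 0 <= t - d < len(s):
                    out[j][t] += g * s[t - d]
    return out

# One impulse source, two virtual microphones with different gains and delays:
E = encode([[1.0, 0.0, 0.0, 0.0]], [[0.5], [0.25]], [[0], [2]], 4)
```

The impulse appears on each encoded channel Ej with the gain and delay of its virtual microphone, which is exactly the per-source, per-microphone structure described above.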
  • Image sources: in the case where the sound sources are in a room, image sources must be added. These are the images of the sound sources reflected by the walls of the room. Image sources, reflecting in turn on the walls, generate image sources of higher order.
  • L therefore no longer represents the number of sources, but the number of sources to which the number of image sources is added.
  • the number of image sources is infinite; in practice, we therefore keep only the image sources that are audible and whose direction is perceived. Image sources that are audible but whose direction is no longer perceived are grouped together, and their contribution is synthesized using an artificial reverberator.
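As an illustration, the first-order image sources of a point source in a rectangular ("shoebox") room can be obtained by mirroring its position across each wall. The rectangular-room assumption and the function name are ours; the patent does not restrict the room geometry:

```python
def first_order_image_sources(pos, room):
    """First-order image sources of a point source in a rectangular room.

    pos  = (x, y, z): source position
    room = (Lx, Ly, Lz): room dimensions, with walls at 0 and L on each axis
    Mirrors the source across each of the six walls (order-1 reflections only);
    higher orders would be obtained by mirroring the images in turn.
    """
    x, y, z = pos
    Lx, Ly, Lz = room
    return [
        (-x, y, z), (2 * Lx - x, y, z),   # mirrors across the two x walls
        (x, -y, z), (x, 2 * Ly - y, z),   # mirrors across the two y walls
        (x, y, -z), (x, y, 2 * Lz - z),   # floor and ceiling
    ]

images = first_order_image_sources((1.0, 2.0, 1.5), (4.0, 5.0, 3.0))
```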
  • the decoding step aims to restore the signals Ej encoded on a given device comprising a predetermined number T of sound transducers (headphones, loudspeakers, or the like).
  • This step consists in applying a TxK matrix of filters to the encoded signals.
  • This matrix depends only on the rendering device, not on the sound sources. Depending on the encoding and decoding technique chosen, it can be very simple (for example the identity matrix) or very complex.
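As an illustration, in the frequency-independent case where each entry of the T x K matrix reduces to a simple gain (a full decoder would use a matrix of filters), the decoding can be sketched as follows; the function name is ours:

```python
def decode(encoded, matrix):
    """Apply a T x K gain matrix to K encoded signals, giving T speaker feeds.

    encoded: K equal-length sample lists (the signals E_j)
    matrix:  T x K list of gains; each entry stands in for a filter in the
             general case, which is valid when the matrix is
             frequency-independent.
    """
    K = len(encoded)
    n = len(encoded[0])
    T = len(matrix)
    return [[sum(matrix[t][j] * encoded[j][s] for j in range(K))
             for s in range(n)] for t in range(T)]

# The "very simple" case mentioned in the text: an identity decoding matrix.
out = decode([[1.0, 2.0], [3.0, 4.0]], [[1.0, 0.0], [0.0, 1.0]])
```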
  • a first step ST constitutes a start-up step during which a user defines sound commands C1, C2, ..., CN to be synthesized and spatialized (for example through a man/machine interface used to define a musical note, an instrument to play this note, and a position in space of the instrument playing this note).
  • the spatialization information can be transmitted in a stream parallel to the synthetic audio stream, or even directly in the synthetic audio stream.
  • a sound can be defined at least by: the frequency of its fundamental mode, characterizing the pitch, its duration, and its intensity. Thus, in the example of a touch-sensitive keyboard synthesizer, if the user plays a note forte, the intensity associated with the command Ci will be greater than the intensity associated with a note played piano (softly). More particularly, it is indicated that the intensity parameter can, in general, take into account the spatialization gain gi in a context of spatialization processing, as will be seen below, according to one of the major advantages provided by the present invention.
  • a sound is, of course, also defined by its triggering instant.
  • if the spatialization technique chosen is not a surround processing, but rather binaural or transaural synthesis, holophony, or the like, the spatialization delay τi (which will be described in detail below) can additionally be used to control the instant of triggering of the sound.
  • a sound synthesis and spatialization device D1 comprises: a synthesis module proper M1, capable of defining, as a function of a command Ci, at least the frequency fi and the duration Di of the sound i associated with this command Ci, and a spatialization module M2, capable of defining at least the gain gi (in a surround context in particular) and, moreover, the spatialization delay τi, in holophony or in binaural or transaural synthesis.
  • these last two parameters gi and τi can be used jointly for the spatialization, but also for the synthesis of the sound itself, when a sound intensity (or a pan, in stereophony) and a triggering instant of the sound are defined.
  • the two modules M1 and M2 are grouped together in the same module, making it possible to define in a single step all the parameters of the signal si to be synthesized and spatialized: in particular its frequency, its duration, its spatialization gain and its spatialization delay.
  • an encoding module M3 performs a linear combination of the signals si which in particular involves the spatialization gains, as will be seen below.
  • This encoding module M3 can also apply compression encoding to the signals Si to prepare a transmission of the encoded data to a restitution device D2.
  • this encoding module M3 is, in a preferred embodiment, directly integrated into the modules M1 and M2 above, so as to create directly, within a single module D1 (which would then simply consist of a sound synthesis and spatialization engine), the signals Ej as if they were delivered by microphones j, as explained above.
  • the sound synthesis and spatialization engine D1 produces, at the output, K sound signals Ej representing the encoding of the virtual sound field that the different synthetic sources would have created if they had been real.
  • this rendering device may also add (or "mix") to this sound scene other scenes coming from an actual sound recording or from the output of other sound processing modules, provided that they are in the same spatialization format.
  • the mixing of these different scenes then passes through a single decoding system M'3, provided at the input of a reproduction device D2.
  • In the example shown in Figure 2, this rendering device D2 includes two channels, here for binaural reproduction (on stereo headphones) or transaural reproduction (on two speakers), on two channels L and R.
  • a preferred embodiment of the invention is described below, here applied to a mobile terminal and in the context of sound spatialization by binaural synthesis.
  • the preferred sound source positioning technique is then binaural synthesis. It consists, for each sound source, in filtering the monophonic signal by acoustic transfer functions called HRTFs ("Head Related Transfer Functions"), which model the transformations applied by the torso, the head and the pinna of the listener to the signal coming from a sound source. For each position in space, a pair of these functions can be measured (one function for the right ear, one function for the left ear). The HRTFs are therefore functions of the position [θ, φ] (where θ represents the azimuth and φ the elevation) and of the sound frequency f.
  • HRTFs for "Head Related Transfer Functions”
  • Another implementation of binaural synthesis proves more effective, in particular when several sound sources are spatialized, or when the sound sources change position over time. In this case, we speak of "dynamic binaural synthesis".
  • in static binaural synthesis, the positions of the sound sources are not expected to change over time.
  • these filters being either finite impulse response (FIR) or infinite impulse response (IIR) filters, problems of discontinuity in the left and right output signals appear when the source position changes, causing audible "clicks".
  • FIR finite impulse response
  • IIR infinite impulse response
  • the technical solution used to overcome this problem is to run two sets of binaural filters in parallel. The first set simulates the position [θ1, φ1] at an instant t1, the second the position [θ2, φ2] at an instant t2.
  • the signal giving the illusion of a displacement between the first and second positions is then obtained by a crossfade of the left and right signals resulting from the first and second filtering processes.
  • the complexity of the sound source positioning system is then multiplied by two compared to the static case.
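The crossfade between the two filtered outputs can be sketched as below, for one ear's signal over one transition block. The linear fade law is our assumption; the patent does not specify the fade curve:

```python
def crossfade(old, new):
    """Linearly crossfade from the 'old' filtered block to the 'new' one.

    old, new: equal-length sample lists produced by the two binaural filter
    sets (positions [theta1, phi1] and [theta2, phi2]). The output starts
    as 'old' and ends as 'new', masking the filter switch.
    """
    n = len(old)
    return [((n - 1 - k) * old[k] + k * new[k]) / (n - 1) for k in range(n)]

# Fading from a constant 1.0 block (old position) to 0.0 (new position):
faded = crossfade([1.0] * 5, [0.0] * 5)
```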
  • the number of filters to be implemented is proportional to the numbers of sources to be spatialized.
  • if N sound sources are considered, the number of filters required is then 2N for a static binaural synthesis and 4N for a dynamic binaural synthesis.
  • the linear decomposition of HRTFs aims to separate the spatial and frequency dependencies of the transfer functions. Beforehand, the excess-phase component of the HRTFs is extracted and modeled in the form of a pure delay τ. The linear decomposition then applies to the minimum-phase component of the HRTFs.
  • the implementation scheme of binaural synthesis based on a linear decomposition of HRTFs is illustrated in Figure 4.
  • the signal from each source is then decomposed into P channels corresponding to the P basis vectors of the linear decomposition.
  • To each of these channels are then applied the directional coefficients Cj(θi, φi) (denoted C) resulting from the linear decomposition of HRTFs
  • the signals from the N sources are then added (step 43), then filtered (step 44) by the filter Lj(f) corresponding to the j-th basis vector.
  • steps 41, 42 and 43 may correspond to the spatial encoding proper, for binaural synthesis, while steps 44 and 45 may correspond to a spatial decoding before restitution, corresponding to the module M'3 of Figure 2, as described above.
  • the signals coming from the summers after step 43 of FIG. 4 can be conveyed via a communication network, for spatial decoding and restitution with a mobile terminal, in steps 44 and 45 described above.
  • the delays τ and the gains C and D, which constitute the spatialization parameters and are specific to each sound source as a function of its position, can therefore be dissociated from the directional filters L(f) in the implementation of binaural synthesis based on a linear decomposition of HRTFs. Consequently, the directional filters are common to the N sources, regardless of their position, their number or their possible displacement.
  • the application of the spatialization parameters then represents the spatial encoding proper of the signals relating to the sources themselves, while the directional filters carry out the effective spatial decoding processing, with a view to restitution, which no longer depends on the position of the sources but on the sound frequency.
  • this dissociation between the spatialization parameters and the directional filters is advantageously exploited by integrating the application of the spatialization delay and gain in the sound synthesizer.
  • Sound synthesis and spatial encoding (delays and gains) controlled by the azimuth and the elevation are thus carried out simultaneously within the same module such as a sound generator, for each sound signal (or note, in musical edition) to be generated (step 51).
  • the spatial decoding is then taken care of by the directional filters Li (f), as indicated above (step 52).
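The per-source spatial encoding described above (delays and directional gains applied before the shared filters) can be sketched as follows. This is a simplified illustration: integer sample delays, a single delay per source rather than one per ear, invented function names, and the P shared directional filters Lj(f) are left out:

```python
def binaural_encode(sources, delays, coeffs):
    """Spatial encoding for binaural synthesis with decomposed HRTFs.

    sources: N sample lists (one per source)
    delays:  N integer delays (in samples), one per source
    coeffs:  N x P directional gains C_j(theta_i, phi_i)
    Returns P summed channels, ready to be filtered by the P shared
    directional filters L_j(f), whatever the number of sources N.
    """
    P = len(coeffs[0])
    n = max(len(s) for s in sources) + max(delays)
    channels = [[0.0] * n for _ in range(P)]
    for s, d, c in zip(sources, delays, coeffs):
        for j in range(P):
            for t, v in enumerate(s):
                channels[j][t + d] += c[j] * v  # delay, weight, and sum
    return channels

# Two impulse sources, P = 2 basis channels:
ch = binaural_encode([[1.0], [1.0]], [0, 1], [[0.5, 0.0], [0.0, 0.25]])
```

Because the summation happens before the filters, the filtering cost stays fixed at P filters regardless of N, which is the point of the dissociation described above.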
  • FIG. 6 represents the main parameters of an ADSR envelope of the aforementioned type, commonly used in different sound synthesis techniques.
  • FIG. 6 represents the temporal variation of the envelope of a synthesized sound signal, for example a note played on a piano, with: an attack parameter, modeled by an ascending ramp 61, corresponding for example to the duration of a hammer strike against a piano string; a decay parameter, modeled by a steeply descending ramp 62, corresponding for example to the duration of the hammer's release from the string; a sustain parameter (free vibration), modeled by a slightly descending ramp 63, due to natural acoustic damping, corresponding for example to the duration of the sound while a piano key remains pressed; and a release parameter, modeled by a descending ramp 64, corresponding for example to the rapid acoustic damping produced when the key is released.
  • the parameters of the ADSR envelope are defined before applying the filters provided for the spatialization processing, owing to the time variables involved.
  • the maximum of the sound amplitude (in arbitrary units in FIG. 6) can be defined by the spatialization processing, in correspondence with the gains C and D mentioned above, for each left and right channel.
  • the instant of triggering of the sound (start of the ramp 61) can be defined through the delays τLi and τRi.
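As an illustration, a piecewise-linear ADSR envelope of the kind shown in FIG. 6 can be sketched as follows. The segment lengths are arbitrary, and the sustain plateau is modeled flat here for simplicity, whereas the text describes it as slightly descending; the `peak` argument is where a spatialization gain could be injected, as the text suggests:

```python
def adsr_envelope(attack, decay, sustain_level, sustain, release, peak=1.0):
    """Piecewise-linear ADSR envelope, one value per sample.

    attack, decay, sustain, release: segment lengths in samples
    sustain_level: plateau amplitude relative to 'peak'
    peak: maximum amplitude (can be set from a spatialization gain)
    """
    env = []
    for k in range(attack):                      # ascending ramp 61
        env.append(peak * k / max(attack - 1, 1))
    for k in range(decay):                       # steep descending ramp 62
        env.append(peak + (sustain_level * peak - peak) * (k + 1) / decay)
    env += [sustain_level * peak] * sustain      # plateau 63 (flat here)
    for k in range(release):                     # final descending ramp 64
        env.append(sustain_level * peak * (1 - (k + 1) / release))
    return env

env = adsr_envelope(attack=4, decay=2, sustain_level=0.5, sustain=3, release=2)
```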
  • a simple operator of sound synthesis by frequency modulation (FM synthesis) uses a carrier frequency fc, typically the frequency of the fundamental mode, and one or more oscillators OSC1 to define one or more harmonics fm (corresponding in principle to frequencies that are multiples of the carrier frequency fc), with which relative intensities Im are associated.
  • the intensities Im, relative to the intensity of the fundamental mode, are higher for a metallic sound (such as that of a new guitar string).
  • FM synthesis makes it possible to define the timbre of a synthesized sound.
  • the signals (sinusoids) coming from the oscillator(s) OSC1 are added, by the module AD, to the signal derived from the carrier frequency fc; the module AD delivers a signal to an output oscillator OSC2, which receives the amplitude setpoint Ac of the sound with reference to the carrier frequency fc.
  • this setpoint Ac can be defined directly by the spatialization processing, through the gains C and D (in binaural synthesis), as seen above.
  • the oscillator OSC2 delivers a signal S'i, to which an ADSR envelope of the type shown in FIG. 6 is then applied.
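A minimal sketch of such an FM operator follows: modulating oscillators OSC1 at multiples of the carrier are summed (the adder AD) and modulate the phase of the output oscillator OSC2, whose amplitude Ac could be supplied by the spatialization gains. The phase-modulation form and all parameter values are assumptions for illustration.

```python
import numpy as np

def fm_operator(fc, Ac, harmonics, sr=44100, dur=1.0):
    """Simple FM operator: `harmonics` is a list of (multiple, Im) pairs
    describing the OSC1 oscillators; their sum (module AD) modulates the
    output oscillator OSC2 of amplitude Ac at carrier frequency fc."""
    t = np.arange(int(sr * dur)) / sr
    mod = sum(Im * np.sin(2 * np.pi * m * fc * t) for m, Im in harmonics)
    return Ac * np.sin(2 * np.pi * fc * t + mod)
```

Larger Im values relative to the fundamental push the timbre toward the "metallic" character mentioned above; an ADSR envelope can then be multiplied onto the result.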
  • the present invention thus makes it possible to implement the spatialization and sound synthesis steps directly together. It will be understood in particular that any sound synthesis processing requiring the definition of an intensity (and, where appropriate, an instant of triggering of the sound) can be carried out in conjunction with a spatialization processing, which supplies a gain (and, where applicable, a delay) in return.
  • a sound synthesizer works by reading a score which gathers information on the instruments to be synthesized, the instants at which the sounds should be played, the pitch of these sounds, their loudness, etc.
  • a sound generator is associated with each sound, as indicated above with reference to FIG. 5.
  • the same source may play several notes simultaneously. These notes, which come from the same source, are spatialized at the same position and therefore with the same parameters. It is therefore preferable to group the spatialization processing of the sound generators associated with the same source. Under these conditions, the signals associated with the notes from the same source are preferably summed beforehand, so that the spatialization processing is applied globally to the resulting signal, which, on the one hand, advantageously reduces the cost of implementation and, on the other hand, guarantees the coherence of the sound scene.
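The grouping described above — sum the simultaneous notes of one source first, then spatialize the mix once — can be sketched as follows (an illustrative helper, gains only; the function name and signature are assumptions):

```python
import numpy as np

def spatialize_source(notes, gain_l, gain_r):
    """Sum all simultaneous notes of a single source (same position,
    hence same spatialization parameters), then apply the spatialization
    gains once to the mix instead of once per note."""
    n = max(len(x) for x in notes)
    mix = np.zeros(n)
    for x in notes:
        mix[:len(x)] += x          # notes of one source share a position
    return gain_l * mix, gain_r * mix
```

The cost then grows with the number of sources rather than the number of notes, and every note of a source is guaranteed to be rendered from the same position.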
  • gains and delays can be applied by taking advantage of the synthesizer structure.
  • the delays (left channel and right channel) of spatialization are implemented in the form of delay lines.
  • the delays are managed by the instants of triggering of the sound generators, in accordance with the score.
  • the two previous approaches (delay line and control of the triggering instant) can be combined in order to optimize the processing.
  • the balance (or "pan") parameter typically associated with the stereophonic system is no longer needed. It is therefore possible to eliminate the gains associated with the balance.
  • the sound generator's volume parameter can be applied through the different gains corresponding to the spatial encoding, as described above.
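Combining the two delay mechanisms — trigger-instant control for the whole-sample part and a short delay line for the fractional remainder — can be sketched with a hypothetical helper (the split itself is one plausible reading of "optimize the processing"; the function name and units are assumptions):

```python
def split_delay(tau, sr=44100):
    """Split a spatialization delay `tau` (seconds) into an integer sample
    offset, absorbed by shifting the generator's trigger instant in the
    score, and a fractional residue left for a short delay line."""
    total = tau * sr
    trigger_offset = int(total)            # handled by the note-on scheduler
    fractional = total - trigger_offset    # handled by interpolation / delay line
    return trigger_offset, fractional
```

The bulk of the delay then costs nothing at runtime (it is just a later note-on), and only a sub-sample delay line remains per channel.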
  • the present invention makes it possible to apply sound spatialization source by source, owing to the fact that the spatialization tool is integrated at the heart of the sound synthesis engine. This is not the case if, on the contrary, the synthesis engine and the spatialization tool are simply cascaded: in that case, it is recalled, spatialization can only be applied globally to the entire sound scene.
  • the sound synthesis and spatialization tools can thus be judiciously combined so as to achieve an optimized implementation of a spatialized sound synthesis engine, in particular by optimizing the combination of synthesis and spatialization operations, taking into account at least one spatialization gain and/or delay, or even a spatialization filter.
  • the spatialization parameters are advantageously taken into account by simple modification of the synthesis parameters, without modification of the synthesis model itself.
  • a spatialized sound synthesis based on various possible spatialization techniques can thus be obtained.
  • these spatialization techniques may vary in complexity and performance, but overall they offer a much richer and more complete spatialization than stereophony, with in particular a natural and particularly immersive rendering of the sound scene.
  • the sound spatialization within the meaning of the invention retains the full potential of three-dimensional sound rendering, in particular in terms of immersion, with true 3D spatialization.

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Stereophonic System (AREA)
EP03819273A 2003-12-15 2003-12-15 Verfahren zum synthetisieren akustischer spazialisierung Withdrawn EP1695335A1 (de)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/FR2003/003730 WO2005069272A1 (fr) 2003-12-15 2003-12-15 Procede de synthese et de spatialisation sonores

Publications (1)

Publication Number Publication Date
EP1695335A1 true EP1695335A1 (de) 2006-08-30

Family

ID=34778508

Family Applications (1)

Application Number Title Priority Date Filing Date
EP03819273A Withdrawn EP1695335A1 (de) 2003-12-15 2003-12-15 Verfahren zum synthetisieren akustischer spazialisierung

Country Status (5)

Country Link
US (1) US20070160216A1 (de)
EP (1) EP1695335A1 (de)
CN (1) CN1886780A (de)
AU (1) AU2003301502A1 (de)
WO (1) WO2005069272A1 (de)

Families Citing this family (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2007104877A1 (fr) * 2006-03-13 2007-09-20 France Telecom Synthese et spatialisation sonores conjointes
FR2899423A1 (fr) * 2006-03-28 2007-10-05 France Telecom Procede et dispositif de spatialisation sonore binaurale efficace dans le domaine transforme.
US20090017910A1 (en) * 2007-06-22 2009-01-15 Broadcom Corporation Position and motion tracking of an object
US20080187143A1 (en) * 2007-02-01 2008-08-07 Research In Motion Limited System and method for providing simulated spatial sound in group voice communication sessions on a wireless communication device
US20090238371A1 (en) * 2008-03-20 2009-09-24 Francis Rumsey System, devices and methods for predicting the perceived spatial quality of sound processing and reproducing equipment
US8430750B2 (en) * 2008-05-22 2013-04-30 Broadcom Corporation Video gaming device with image identification
EP2297556B1 (de) * 2008-07-08 2011-11-30 Brüel & Kjaer Sound & Vibration Measurement A/S Verfahren zur rekonstruktion eines akustischen feldes
US7847177B2 (en) * 2008-07-24 2010-12-07 Freescale Semiconductor, Inc. Digital complex tone generator and corresponding methods
EP2460366A4 (de) * 2009-08-02 2014-08-06 Blamey & Saunders Hearing Pty Ltd Montage von klangprozessoren mit verbesserten klängen
US8786852B2 (en) 2009-12-02 2014-07-22 Lawrence Livermore National Security, Llc Nanoscale array structures suitable for surface enhanced raman scattering and methods related thereto
US8805697B2 (en) * 2010-10-25 2014-08-12 Qualcomm Incorporated Decomposition of music signals using basis functions with time-evolution information
US20130204532A1 (en) * 2012-02-06 2013-08-08 Sony Ericsson Mobile Communications Ab Identifying wind direction and wind speed using wind noise
US9395304B2 (en) 2012-03-01 2016-07-19 Lawrence Livermore National Security, Llc Nanoscale structures on optical fiber for surface enhanced Raman scattering and methods related thereto
US9099066B2 (en) * 2013-03-14 2015-08-04 Stephen Welch Musical instrument pickup signal processor
EP4379714A2 (de) * 2013-09-12 2024-06-05 Dolby Laboratories Licensing Corporation Lautstärkeeinstellung für abwärtsgemischten audioinhalt
CN105163239B (zh) * 2015-07-30 2017-11-14 郝立 4d裸耳全息立体声实现方法
FR3046489B1 (fr) * 2016-01-05 2018-01-12 Mimi Hearing Technologies GmbH Encodeur ambisonique ameliore d'une source sonore a pluralite de reflexions
CN107204132A (zh) * 2016-03-16 2017-09-26 中航华东光电(上海)有限公司 3d虚拟立体声空中预警***
WO2017192972A1 (en) 2016-05-06 2017-11-09 Dts, Inc. Immersive audio reproduction systems
US10979844B2 (en) 2017-03-08 2021-04-13 Dts, Inc. Distributed audio virtualization systems
RU2763391C2 (ru) * 2017-04-13 2021-12-28 Сони Корпорейшн Устройство, способ и постоянный считываемый компьютером носитель для обработки сигналов
CN107103801B (zh) * 2017-04-26 2020-09-18 北京大生在线科技有限公司 远程三维场景互动教学***及控制方法
CN109121069B (zh) * 2018-09-25 2021-02-02 Oppo广东移动通信有限公司 3d音效处理方法及相关产品
CN114949856A (zh) * 2022-04-14 2022-08-30 北京字跳网络技术有限公司 游戏音效的处理方法、装置、存储介质及终端设备

Family Cites Families (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0563929B1 (de) * 1992-04-03 1998-12-30 Yamaha Corporation Verfahren zur Steuerung von Tonquellenposition
US5596644A (en) * 1994-10-27 1997-01-21 Aureal Semiconductor Inc. Method and apparatus for efficient presentation of high-quality three-dimensional audio
DE69619587T2 (de) * 1995-05-19 2002-10-31 Yamaha Corp Verfahren und Vorrichtung zur Tonerzeugung
EP0762804B1 (de) * 1995-09-08 2008-11-05 Fujitsu Limited Dreidimensionaler akustischer Prozessor mit Anwendung von linearen prädiktiven Koeffizienten
US5977471A (en) * 1997-03-27 1999-11-02 Intel Corporation Midi localization alone and in conjunction with three dimensional audio rendering
US6459797B1 (en) * 1998-04-01 2002-10-01 International Business Machines Corporation Audio mixer
US6990205B1 (en) * 1998-05-20 2006-01-24 Agere Systems, Inc. Apparatus and method for producing virtual acoustic sound
JP2000341800A (ja) * 1999-05-27 2000-12-08 Fujitsu Ten Ltd 車室内音響システム
JP3624805B2 (ja) * 2000-07-21 2005-03-02 ヤマハ株式会社 音像定位装置
US7162314B2 (en) * 2001-03-05 2007-01-09 Microsoft Corporation Scripting solution for interactive audio generation
FR2836571B1 (fr) * 2002-02-28 2004-07-09 Remy Henri Denis Bruno Procede et dispositif de pilotage d'un ensemble de restitution d'un champ acoustique
CA2430403C (en) * 2002-06-07 2011-06-21 Hiroyuki Hashimoto Sound image control system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See references of WO2005069272A1 *

Also Published As

Publication number Publication date
WO2005069272A1 (fr) 2005-07-28
AU2003301502A1 (en) 2005-08-03
CN1886780A (zh) 2006-12-27
US20070160216A1 (en) 2007-07-12

Similar Documents

Publication Publication Date Title
EP1695335A1 (de) Verfahren zum synthetisieren akustischer spazialisierung
CN105900457B (zh) 用于设计和应用数值优化的双耳房间脉冲响应的方法和***
Begault et al. 3-D sound for virtual reality and multimedia
Savioja et al. Creating interactive virtual acoustic environments
Valimaki et al. Fifty years of artificial reverberation
EP1600042B1 (de) Verfahren zum bearbeiten komprimierter audiodaten zur räumlichen wiedergabe
KR102659722B1 (ko) 공간 확장 음원을 재생하는 장치 및 방법 또는 공간 확장 음원으로부터 비트 스트림을 생성하는 장치 및 방법
CN103269474B (zh) 生成具有增强的感知质量的立体声信号的方法和装置
WO2021186107A1 (en) Encoding reverberator parameters from virtual or physical scene geometry and desired reverberation characteristics and rendering using these
US20200374645A1 (en) Augmented reality platform for navigable, immersive audio experience
JP4573433B2 (ja) 仮想音響環境において指向性音響を処理する方法およびシステム
WO2010089357A2 (en) Sound system
WO2022014326A1 (ja) 信号処理装置および方法、並びにプログラム
Rocchesso Spatial effects
Huopaniemi et al. DIVA virtual audio reality system
EP1994526B1 (de) Gemeinsame schallsynthese und -spatialisierung
CA3044260A1 (en) Augmented reality platform for navigable, immersive audio experience
EP1905008A2 (de) Parametrische multikanal-dekodierung
Peters et al. Sound spatialization across disciplines using virtual microphone control (ViMiC)
CN117043851A (zh) 电子设备、方法和计算机程序
Kelly Subjective Evaluations of Spatial Room Impulse Response Convolution Techniques in Channel-and Scene-Based Paradigms
McDonnell Development of Open Source tools for creative and commercial exploitation of spatial audio
KR20060131806A (ko) 음향 합성 및 공간화 방법
Gozzi et al. Listen to the Theatre! Exploring Florentine Performative Spaces
Saini et al. An end-to-end approach for blindly rendering a virtual sound source in an audio augmented reality environment

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20060612

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LI LU MC NL PT RO SE SI SK TR

17Q First examination report despatched

Effective date: 20060928

DAX Request for extension of the european patent (deleted)
STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20110701