WO2001011602A1 - Multi-channel processing method - Google Patents

Multi-channel processing method

Info

Publication number
WO2001011602A1
Authority
WO
WIPO (PCT)
Prior art keywords
components
signals
directional
signal
directions
Prior art date
Application number
PCT/DK2000/000443
Other languages
French (fr)
Inventor
Knud Bank Christensen
Kim Rishøj PEDERSEN
Morten Lave
Original Assignee
Tc Electronic A/S
Priority date
Filing date
Publication date
Priority claimed from EP99202585A external-priority patent/EP1076328A1/en
Priority claimed from EP00201759A external-priority patent/EP1158486A1/en
Application filed by Tc Electronic A/S filed Critical Tc Electronic A/S
Priority to AU64271/00A priority Critical patent/AU6427100A/en
Priority to EP00951275A priority patent/EP1203364A1/en
Publication of WO2001011602A1 publication Critical patent/WO2001011602A1/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S7/00 Indicating arrangements; Control arrangements, e.g. balance control
    • H04S7/30 Control circuits for electronic adaptation of the sound field
    • H04S7/305 Electronic adaptation of stereophonic audio signals to reverberation of the listening space
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H1/00 Details of electrophonic musical instruments
    • G10H1/0091 Means for obtaining special acoustic effects
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S5/00 Pseudo-stereo systems, e.g. in which additional channel signals are derived from monophonic signals by means of phase shifting, time delay or reverberation
    • H04S5/005 Pseudo-stereo systems, e.g. in which additional channel signals are derived from monophonic signals by means of phase shifting, time delay or reverberation, of the pseudo five- or more-channel type, e.g. virtual surround

Definitions

  • the invention relates to a format of audio signals according to claim 1 and a method of processing audio signals according to claim 9.
  • the invention relates to a method of processing audio signals according to claim 12, a method of processing audio signals according to claim 13, a method of representing an audio signal (AS), said method according to claim 14, a method of decoding a number (M) of directional components into a, preferably lower, number (N) of directional components according to claim 17, a rendering system comprising at least one input for receiving a number (M) of directional components (DC) according to claim 18 and a multi-channel data carrier according to claim 23.
  • Audio processing and audio rendering are well-known within the art.
  • An audio rendering system may typically imply a sound generating unit, e.g. a CD or DVD player and an associated amplifier and loudspeaker system.
  • a problem of the rendering known within the art is that the systems are inflexible with respect to a possible desired change of the rendering method. Thus, a change from e.g. a two channel stereo rendering into a five channel cinema rendering will typically entail serious technical problems, and the quality of an obtainable rendering may be questioned.
  • Another aspect is that, with several sound reproduction formats in use, including stereophonic formats, surround sound formats and future multi-channel formats, it is necessary to record and process the sound differently according to each reproduction format.
  • the invention relates to an audio signal format comprising
  • each of the said components (d1, d2, d3, ... dN) representing a direction, said components preferably being uncorrelated.
  • a component may comprise an accumulation of zero or more signals in a direction defined by said component.
  • the components should preferably be uncorrelated, at least to the extent that no panning or dependency is established between the signals contained in the format.
  • N components (d1, d2, d3, ... dN), where N is at least 3, a further advantageous embodiment of the invention has been obtained.
  • N is at least 10, preferably at least 20, a further advantageous embodiment of the invention has been obtained.
  • an increase of the number of directional components has proven to be advantageous when dealing with multi-channel rendering, such as five channel rendering.
  • the said directions are three-dimensional directions
  • a further advantageous embodiment of the invention has been obtained.
  • the possibility of establishing a three dimensional audio image has been facilitated.
  • the three dimensional signal format may include both true signal components or even "trick"-components.
  • an experience-based allocation of directions of components may be applied.
  • the said directions are distributed with a larger proportion of directions in areas in which the human perception of sound signals is relatively sharp, e.g. in front of the head, a further advantageous embodiment of the invention has been obtained.
  • said signal is decomposed to a signal comprising N directional components and according to an audio signal format as characterized in one or more of claims 1 - 10, a further advantageous embodiment of the invention has been obtained.
  • the invention relates to a method of processing audio signals according to claim 12,
  • each of the said sub-signals (s1, s2, s3, ... sM) comprising
  • each of the said components (d1, d2, d3, ... dN) representing a direction;
  • each of the said components (Σd1, Σd2, Σd3, ... ΣdN) representing a direction, each of the said components (Σd1, Σd2, Σd3, ... ΣdN) being the sum of the corresponding components (d1, d2, d3, ... dN) of the said M sub-signals (s1, s2, s3, ... sM).
  • the format may advantageously be applied as an intermediate signal processing format and e.g. different sound sources represented according to the same format may be superposed by simply adding the signal components.
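  • The component-wise summing described above can be illustrated with a minimal sketch. It assumes that each sub-signal is held as N lists of samples, one list per direction; the name `superpose` and this data layout are illustrative assumptions and are not taken from the application.

```python
from typing import List

# Assumed layout: one signal = N directional components, each a list of samples.
Signal = List[List[float]]


def superpose(sub_signals: List[Signal]) -> Signal:
    """Sum M sub-signals direction by direction (hypothetical helper)."""
    if not sub_signals:
        return []
    n_dirs = len(sub_signals[0])
    n_samples = len(sub_signals[0][0]) if n_dirs else 0
    total = [[0.0] * n_samples for _ in range(n_dirs)]
    for sig in sub_signals:
        if len(sig) != n_dirs:
            raise ValueError("all sub-signals must use the same N directions")
        for d, component in enumerate(sig):
            if len(component) != n_samples:
                raise ValueError("all components must have the same length")
            for t, sample in enumerate(component):
                # Only samples of the same direction are added together.
                total[d][t] += sample
    return total


s1 = [[1.0, 0.0], [0.0, 0.0], [0.5, 0.5]]
s2 = [[0.0, 1.0], [0.0, 0.0], [0.5, -0.5]]
mixed = superpose([s1, s2])
```

  • No panning or cross-direction mixing takes place in this step; direction d of the result only depends on direction d of each sub-signal.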
  • the invention relates to a method of processing audio signals according to claim 13,
  • each of the said sub-signals (s1, s2, s3, ... sM) comprising
  • each of the said components (d1, d2, d3, ... dN) representing a direction;
  • sub-signals are results of a room-simulation using room-simulators, preferably multi-directional room-simulators
  • each of the said components (Σd1, Σd2, Σd3, ... ΣdN) representing a direction, each of the said components (Σd1, Σd2, Σd3, ... ΣdN) being the sum of the corresponding components (d1, d2, d3, ... dN) of the said M sub-signals (s1, s2, s3, ... sM).
  • the invention relates to a method of representing an audio signal (AS), said method according to claim 14, comprising the step of
  • the audio signal may subsequently be decoded into a desired rendering format, such as five channel.
  • the audio format represents a flexible audio coding in the sense, that the audio-signals may be encoded without any knowledge about the rendering system.
  • the method according to the invention implies a very advantageous representation of a signal due to the fact that the final rendering of the signal may be performed strictly according to a simple mapping if so desired, e.g. a gain matrix, still maintaining the intended total signal image.
  • the final rendering may thus, if desired, be performed solely with focus on the rendering of a signal component having a certain direction parameter and without differentiating between the different signals and the type of the signals contained in the directional signal component.
  • direct sound and subsequent room simulated signal may be rendered in the same way.
  • a music mix may be established once and for all in the format according to the invention, and subsequently, when new formats appear on the commercial market, a rendering may be established on the basis of the said format.
  • the term uncorrelated signal components implies that different directional signal components are independent to a degree that subsequent rendering in a rendering system is possible without compensating for mutual dependencies between different directional signal components.
  • the signal components may subsequently be rendered in almost every possible audio rendering system, e.g. stereo or five channel surround sound, without considering the originally intended rendering method.
  • a component includes an accumulation of zero or more signals in a direction defined by said component
  • the method according to the invention may be performed without any (or very little) knowledge of the downstream rendering system.
  • the said audio signal is a room processed signal, a further advantageous embodiment of the invention has been obtained.
  • uncorrelated signal components implies that the signal components are independent.
  • Independence between the signals of different directional signal components of a room simulated signal may specifically imply that signals representing a direct sound signal of a room signal should be independent of other direct sound representing signals of other directional signal components.
  • a correlation between signals of different signal components will occur within the scope of the invention, due to the fact that a room simulation of a direct sound signal will likely produce a subsequent reverberation sound signal bundle distributed over different directional signal components (generally).
  • the method according to the invention implies a very advantageous representation of a signal when dealing with room processed signals due to the fact that the final rendering of the room simulated signal may be performed strictly according to a simple mapping if so desired, e.g. a gain matrix, still maintaining the intended total signal image.
  • the final rendering may thus, if desired, be performed solely with focus on the rendering of a signal component having a certain direction parameter and without differentiating between the different signals and the type of the signals contained in the directional signal component. In other words, direct sound and subsequent room simulated signal may be rendered in the same way.
  • This unique feature implies e.g. that a music recording may be mixed, room processed, etc. by a sound engineer and encoded/stored according to the invention.
  • the recording may then be transformed into a signal standard fitting to a certain desired rendering method, e.g. stereo.
  • the stored music recording may be redistributed in another format without re-mixing the music recording.
  • adding signal components would preferably mean that the signals (in time) of the same directional signal component of two different audio signals are summed.
  • the format and the method facilitate a modular approach according to which e.g. the room processing may be processed independently.
  • the invention relates to a method of decoding a number (M) of directional components into a, preferably lower, number (N) of directional components according to claim 17, said method comprising the step of transforming the input directional components to a number N of output directional components, said input directional components representing a room simulated audio signal, said directional components being preferably uncorrelated.
  • the method of decoding the above signal which may also be regarded as a rendering, has proved to be very efficient and effective, due to the fact that the rendering itself may be performed by means of simple signal processing if desired.
  • a rendering of the directional components into an X-channel signal (X: more or less arbitrary) may be performed by means of a simple gain matrix mapping the M directional components into the X number of channels.
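  • The gain matrix mapping mentioned above can be sketched as a plain matrix product. This is an illustration under assumptions: the function name `render`, the sample-list layout and the gain values are hypothetical, and the application does not prescribe any particular gains.

```python
from typing import List


def render(components: List[List[float]],
           gain_matrix: List[List[float]]) -> List[List[float]]:
    """Map M directional components to X channels with an X-by-M gain matrix."""
    m = len(components)
    n_samples = len(components[0]) if m else 0
    channels = []
    for row in gain_matrix:
        if len(row) != m:
            raise ValueError("each gain row needs one gain per directional component")
        # Each output sample is a weighted sum of the M component samples.
        channels.append([
            sum(gain * components[i][t] for i, gain in enumerate(row))
            for t in range(n_samples)
        ])
    return channels


# Three directional components (e.g. left, front, right) rendered to two channels.
components = [[1.0, 0.0], [0.0, 2.0], [4.0, 4.0]]
stereo_gains = [
    [1.0, 0.5, 0.0],  # left channel (illustrative values)
    [0.0, 0.5, 1.0],  # right channel (illustrative values)
]
stereo = render(components, stereo_gains)
```

  • Changing the target format then only means swapping the matrix; the directional components themselves stay untouched.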
  • the invention relates to a rendering system comprising at least one input for receiving a number (M) of directional components (DC) according to claim 18, said rendering system comprising means for transforming the input directional components into a number (N) of output channels according to at least one rendering method stored in associated storing means (MSM).
  • the rendering system should be able to be re-configured, e.g. by means of software.
  • M should be significantly higher than N.
  • the rendering system according to the invention represents a very flexible downstream rendering in the sense that the signal may be completely pre-established previous to the rendering, thereby reducing the task of the rendering system to be a mapping.
  • the rendering system may be specifically designed to fit to the room (and e.g. amplifier and loudspeaker) in which the sound has to be reproduced.
  • a unique spatial relation exists between each input signal component and the intended location of an output channel of a rendering system.
  • the means for transforming the input directional components into a number (N) of output channels may perform a signal processing based on a simple gain matrix mapping of the signal components into the said output signals.
  • the means for transforming the input directional components into a number (N) of output channels may also additionally apply other signal processing methods, such as delay compensation for compensating the positioning of the loudspeakers of the rendering system, a compression applied for adapting the rendered output signal to noisy environments, e.g. a cabin of a car.
  • the required signal processing to be performed in the rendering system may be minimized.
  • the said method stored in the said storing means may be exchanged by means of a suitable software transmitting and/or receiving interface, a further advantageous embodiment of the invention has been obtained.
  • the rendering methods may be downloaded into the rendering system, e.g. by means of an IrDA port, a traditional RJ45 signal interface, etc.
  • Another way may also be that of incorporating the method defining software on a DVD or a CD.
  • the bandwidth required for the transmittal of method defining software into the rendering may typically be relatively small.
  • the rendering system comprises a user interface adapted for selecting at least two different predefined rendering methods stored in said storing means, a further advantageous embodiment of the invention has been obtained.
  • a user may adapt his rendering system to the room in which music (or speech) is to be reproduced by simply switching between different predefined methods.
  • the rendering system should comprise the amplifier and associated loudspeakers of the rendering system.
  • the rendering system may e.g. comprise a set of predefined rendering methods associated with stereo reproduction, a set of predefined rendering methods associated with five channel rendering, etc.
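  • A re-configurable rendering system with several stored rendering methods might be modelled as follows. This is a minimal sketch assuming that each rendering method is just a gain matrix; the class `RenderingSystem`, its method names and the example matrices are hypothetical and only stand in for the storing means (MSM) and user interface described above.

```python
from typing import Dict, List, Optional


class RenderingSystem:
    """Holds named rendering methods and applies the currently selected one."""

    def __init__(self) -> None:
        # Stands in for the storing means (MSM); keys are method names.
        self._methods: Dict[str, List[List[float]]] = {}
        self._active: Optional[str] = None

    def install(self, name: str, gain_matrix: List[List[float]]) -> None:
        """Add or exchange a rendering method, e.g. after a software download."""
        self._methods[name] = [list(row) for row in gain_matrix]

    def select(self, name: str) -> None:
        """User interface action: switch to another stored method."""
        if name not in self._methods:
            raise KeyError(name)
        self._active = name

    def render_frame(self, frame: List[float]) -> List[float]:
        """Map one sample per directional component to one sample per output channel."""
        if self._active is None:
            raise RuntimeError("no rendering method selected")
        matrix = self._methods[self._active]
        if any(len(row) != len(frame) for row in matrix):
            raise ValueError("frame size does not match the selected method")
        return [sum(g * x for g, x in zip(row, frame)) for row in matrix]


system = RenderingSystem()
system.install("stereo", [[1.0, 0.5, 0.0], [0.0, 0.5, 1.0]])
system.install("mono", [[1.0, 1.0, 1.0]])
system.select("stereo")
stereo_out = system.render_frame([1.0, 2.0, 3.0])
system.select("mono")
mono_out = system.render_frame([1.0, 2.0, 3.0])
```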
  • said system comprising a set of output channel (OC) connectors of which the rendering method may define a subset of output channel connectors to be activated when applying the transforming defined by the said, a further advantageous embodiment of the invention has been obtained.
  • a rendering system may feed a traditional amplifier/loudspeaker setup (e.g. a stereo setup) by means of two rendering method defined output channel connectors if so desired. Moreover, the same system may feed another type of rendering system via other method defined output channel connectors.
  • the rendering system may comprise e.g. a DVD reading unit and a unit adapted for allocating the directional signal read by the DVD into a kind of crossfield output connectors.
  • the invention relates to a multi-channel data carrier according to claim 23, said data carrier comprising a number (M) of audio channels (M), at least two of said channels representing a directional signal with respect to a virtual listener/reference position (VLP).
  • an advantageous rendering of a multi-channel directional audio signal stored at the carrier may be obtained.
  • subsequent rendering of the signal into e.g. a multi-channel rendering system not necessarily predefined at the time of storing the audio-signal at the data carrier, may facilitate a relatively simple and, even more importantly, high quality rendering.
  • the signal format e.g. facilitates a convincing rendering of the preprocessed room simulation.
  • a further advantageous feature of the invention is that the bandwidth may be kept constant irrespective of the character of the stored audio signal, the number of sources, etc.
  • the audio-signal stored at the data carrier should be almost completely preprocessed with respect to the mixing and sound-engineering.
  • a music mix may be released once and for all e.g. on a DVD, and if another rendering format is desired later, the rendering system may be re-configured without any need for manipulating the origin data source.
  • the directional components are established independently of the subsequent rendering of the audio channels.
  • the number of directional channels is at least eight, preferably at least twenty, a further advantageous embodiment of the invention has been obtained.
  • the effective necessary bandwidth of the complete signal represented in the format according to the invention may be reduced, taking into consideration that many audio direction components are typically unoccupied in periods. The number of such unoccupied channel components will typically increase when increasing the number of channels.
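  • The bandwidth argument above can be illustrated by a sketch that stores only the occupied components of a block of samples. This is an assumption-laden illustration, not the compression scheme referred to in the application; the functions `pack` and `unpack` and the dictionary layout are hypothetical.

```python
from typing import Dict, List


def occupied_components(block: List[List[float]], threshold: float = 0.0) -> List[int]:
    """Indices of directional components with at least one sample above threshold."""
    return [i for i, comp in enumerate(block)
            if any(abs(x) > threshold for x in comp)]


def pack(block: List[List[float]], threshold: float = 0.0) -> Dict:
    """Keep only the occupied components plus the sizes needed to restore the block."""
    return {
        "n_components": len(block),
        "n_samples": len(block[0]) if block else 0,
        "occupied": {i: list(block[i]) for i in occupied_components(block, threshold)},
    }


def unpack(packed: Dict) -> List[List[float]]:
    """Restore the full block; unoccupied components come back as silence."""
    block = [[0.0] * packed["n_samples"] for _ in range(packed["n_components"])]
    for i, comp in packed["occupied"].items():
        block[i] = list(comp)
    return block


block = [[0.0, 0.0], [0.1, -0.2], [0.0, 0.0], [0.0, 0.3]]
packed = pack(block)
restored = unpack(packed)
```

  • With a threshold of zero the round trip is lossless, and only two of the four components in this example need to be stored.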
  • a possible compression may e.g. be applied according to PCT application WO 97/38493 by Philips.
  • fig. 1 shows the basic understanding of a reverberated sound
  • fig. 2 shows the basic principles of a sound processing device according to the invention
  • fig. 3a-3c shows different sub-portions of the system according to the invention
  • fig. 4a-4b illustrates early pattern generators according to the invention
  • fig. 5 shows a two-dimensional system of co-ordinates for illustrative purposes
  • fig. 6a-6b shows two embodiments of an audio signal format according to the invention
  • fig. 7 shows a three dimensional system of co-ordinates with two additional planes
  • fig. 8 shows the angle of the elevation of the additional planes shown in fig.
  • fig. 9 shows another embodiment of an audio signal format according to the invention
  • fig. 10a-10c shows fig. 9 divided into sub-planes for the reader's convenience
  • fig. 11 shows an audio signal processing method according to the invention
  • fig. 12 shows the same as fig. 11, but illustrated in an alternative manner
  • fig. 13 shows a processing system according to a further embodiment of the invention
  • fig. 14 shows a recording, distribution and/or reproducing system according to a still further aspect of the invention
  • fig. 15 shows a recording, distribution and/or reproducing system as illustrated in fig. 14, but showing a storing medium for the pre-processed signal and the generation of the input signals
  • fig. 16 shows an additional embodiment of the system illustrated in fig. 14, and
  • fig. 17 shows different modules of a rendering system according to the invention.
  • artificial generation of room simulated sound should comprise an early reflection pattern and a late sound sequence, i.e. a tail sound signal.
  • the invention is basically directed at the early reflection patterns, and consequently sound processing based on early reflection patterns within the scope of the invention.
  • Fig. 1 illustrates the basic principles of a conventional signal processing unit.
  • the circuit comprises an input 1 communicating with an initial pattern generator 2 and a subsequent reverberation generator 3.
  • the initial pattern generator 2 and the subsequent reverberation generator 3 are connected to two mixers 4, 5 having output channels 6 and 7, respectively.
  • the initial pattern generator 2 generates an initial sound sequence with relatively few signal reflections characterising the first part of the desired emulated sound. It is a basic assumption that the initial pattern is very important as a listener establishes a subjective understanding of the simulated room on the basis of even a short initial pattern.
  • reflections in a certain room will initially comprise relatively few reflections, as the first sound reflections, also called first order reflections, have to propagate from a sound source at a given position in the room to the listener's position via the nearest reflecting walls or surfaces.
  • this sound field will be relatively simple and may therefore be emulated in dependency of the room and the position of the source and the listener.
  • reflections also called second order reflections, will be the sound waves transmitted to the position of the receiver via two reflecting surfaces.
  • the propagation will decrease quite fast after a short period of time, while the sound propagation will continue over a relatively long period of time if the absorption coefficients are low.
  • Fig. 2 illustrates the basic principles of a preferred embodiment of the invention.
  • the shown embodiment of the invention has been divided into three modules 20A, 20B and 20C.
  • the first module 20A of the room simulator comprises M source inputs 21, 22, 23.
  • the source inputs 21, 22 and 23 are each connected to an early pattern generator 26, 27 and 28.
  • Each early pattern generator 26, 27 and 28 outputs N directional signals to a summing unit 29.
  • the summing unit adds the signal components of each of the N predetermined directions from each of the early pattern generators 26, 27 and 28.
  • the summing unit outputs N directional signals to the module 20B comprising a direction rendering unit 201.
  • the direction rendering unit converts the N directional signals to a P channel signal representation.
  • the system comprises a third module 20C.
  • the module 20C comprises a reverb feed matrix 202 fed by the M source inputs 21, 22, 23.
  • the reverb feed matrix 202 outputs P channel signals to a reverberator 203 which, in turn, outputs a P channel signal to a summing unit 204.
  • the summing unit 204 adds the P channel output of the reverberator 203 to the output of the direction rendering unit 201 and feeds the P channel signal to an output.
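  • The signal flow through modules 20A, 20B and 20C can be sketched for a single sample frame. This is a structural illustration only: the early pattern generators are reduced to fixed per-direction gains and the reverberator is passed in as an arbitrary function, both of which are simplifying assumptions. All function and parameter names are hypothetical.

```python
from typing import Callable, List


def mat_vec(matrix: List[List[float]], vec: List[float]) -> List[float]:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(g * x for g, x in zip(row, vec)) for row in matrix]


def process_frame(sources: List[float],
                  epg_gains: List[List[float]],
                  render_matrix: List[List[float]],
                  reverb_feed: List[List[float]],
                  reverberator: Callable[[List[float]], List[float]]) -> List[float]:
    """One frame through the structure of fig. 2.

    sources       -- M source samples (inputs 21, 22, 23)
    epg_gains     -- M rows of N gains, a placeholder for the early pattern generators
    render_matrix -- P rows of N gains (direction rendering unit 201)
    reverb_feed   -- P rows of M gains (reverb feed matrix 202)
    reverberator  -- P channels in, P channels out (reverberator 203)
    """
    n_dirs = len(epg_gains[0])
    # Module 20A: summing unit 29 adds the generators' outputs per direction.
    summed = [sum(epg_gains[m][d] * sources[m] for m in range(len(sources)))
              for d in range(n_dirs)]
    # Module 20B: N directional signals to P channels.
    early = mat_vec(render_matrix, summed)
    # Module 20C: sources feed the reverberator through the reverb feed matrix.
    tail = reverberator(mat_vec(reverb_feed, sources))
    # Summing unit 204: early pattern plus reverberation tail.
    return [e + t for e, t in zip(early, tail)]


out = process_frame(
    sources=[1.0, 2.0],
    epg_gains=[[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]],
    render_matrix=[[1.0, 0.5, 0.0], [0.0, 0.5, 1.0]],
    reverb_feed=[[0.5, 0.5], [0.5, 0.5]],
    reverberator=lambda channels: [0.1 * x for x in channels],
)
```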
  • the module 20A comprises a number of inputs S1, S2, S3 and S4. It should be noted that four inputs have been chosen for the purpose of obtaining a relatively simple explanation of the basic principles of the invention. Many other input numbers may be applicable.
  • Each of the inputs is directed to an early pattern generator 26, 27 and 28.
  • Each early pattern generator generates a processed signal specifically established and chosen for the source input S1, S2, S3 and S4.
  • the processed signals are established as a signal composed of seven signal components d1, d2, d3, d4, d5, d6 and d7.
  • the seven signal components represent a directional signal representation of the established sound and the established signal contains both the direct sound and the initial reverberation sound.
  • a possible embodiment of the invention implies a five channel rendering of a 10-directional signal where the directions of the input signal format are 0, +/-15, +/-30, +/-70, +/-110 and 180 degrees, and the intended locations of the five corresponding loudspeakers are 0, +/-30 and +/-110 degrees according to ITU 775.
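  • For the ten directions and five loudspeaker positions listed above, a gain matrix could for instance be derived by panning each direction between its two neighbouring loudspeakers. The constant-power panning law used here is an assumption chosen for illustration; the application does not state how the gains are obtained, and the name `pan_gains` is hypothetical.

```python
import math
from typing import List

DIRECTIONS = [0, 15, -15, 30, -30, 70, -70, 110, -110, 180]  # degrees, from the text
SPEAKERS = [0, 30, -30, 110, -110]                           # degrees, from the text


def pan_gains(direction_deg: float, speakers_deg: List[float]) -> List[float]:
    """Constant-power panning between the two speakers bracketing the direction."""
    ring = sorted((s % 360, i) for i, s in enumerate(speakers_deg))
    angle = direction_deg % 360
    gains = [0.0] * len(speakers_deg)
    for k in range(len(ring)):
        lo_angle, lo_index = ring[k]
        hi_angle, hi_index = ring[(k + 1) % len(ring)]
        span = (hi_angle - lo_angle) % 360 or 360
        offset = (angle - lo_angle) % 360
        if offset <= span:
            frac = offset / span
            gains[lo_index] = math.cos(frac * math.pi / 2)
            gains[hi_index] = math.sin(frac * math.pi / 2)
            return gains
    raise ValueError("direction could not be placed between two speakers")


# GAIN[p][n] is the gain from directional component n to loudspeaker p.
per_direction = [pan_gains(d, SPEAKERS) for d in DIRECTIONS]
GAIN = [[per_direction[n][p] for n in range(len(DIRECTIONS))]
        for p in range(len(SPEAKERS))]
```

  • In this sketch a direction that coincides with a loudspeaker goes to that loudspeaker alone, and the 180 degree component is shared equally by the two rear loudspeakers.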
  • a preferred embodiment comprises more than 20 directions.
  • each of the inputs S1, S2, S3 and S4 may refer to mutually different locations of the input source to which the early pattern is generated.
  • the signals from each source are summed in summing unit 29.
  • the signals d1, ..., d7 may comprise tail sound components or even whole tail-sound. It should nevertheless be emphasised that according to the preferred embodiment of the invention such tail sound may advantageously be generated according to a relatively simple panning algorithm and subsequently added to the established summed initial sound signal as the established summed initial sound comprises the dominating room determining effects.
  • fig. 3b illustrates the basic functioning of the direction rendering unit 201.
  • the type of multi-channel representation is a selectable parameter, both with respect to number of applied channels and to the type of speaker setup and the individual speaker characteristics.
  • the conversion into a given desired P channel representation may be effected in several different ways, such as HRTF-based (head related transfer function) processing, the technique known as Ambisonics, VBAP (vector based amplitude panning) or a purely experience-based subjective mapping.
  • in fig. 3c, module 20C is illustrated as having an input from each of the source inputs S1, S2, S3 and S4.
  • the signals are fed to a reverb feed matrix 202 having five outputs, corresponding to the chosen channel number of the direction rendering unit 201.
  • the five channel outputs are fed to a reverberation unit 203 providing a five channel output of subsequent reverberation signals.
  • the reverb feed matrix 202 comprises relatively simple signal pre-processing means (not shown) setting the gain, delay and phase of each input's contribution to each reverb signal and may also comprise filtering pre-processing means.
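  • A reverb feed matrix that sets gain, delay and phase per input and output could look like the following sketch. It assumes that delay is an integer number of samples and that "phase" is reduced to a polarity inversion; both are simplifications for illustration, and the names `reverb_feed` and `settings` are hypothetical.

```python
from typing import Dict, List, Tuple

# (input index, output index) -> (gain, delay in samples, polarity inverted?)
Settings = Dict[Tuple[int, int], Tuple[float, int, bool]]


def reverb_feed(inputs: List[List[float]], settings: Settings,
                n_outputs: int) -> List[List[float]]:
    """Mix M input signals into n_outputs reverb feed signals."""
    n_samples = len(inputs[0]) if inputs else 0
    outputs = [[0.0] * n_samples for _ in range(n_outputs)]
    for (m, p), (gain, delay, inverted) in settings.items():
        if delay < 0:
            raise ValueError("delay must be a non-negative number of samples")
        sign = -1.0 if inverted else 1.0
        for t in range(delay, n_samples):
            # Delayed, scaled and possibly inverted contribution of input m to output p.
            outputs[p][t] += sign * gain * inputs[m][t - delay]
    return outputs


inputs = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
settings = {
    (0, 0): (0.5, 1, False),
    (1, 0): (1.0, 0, True),
    (0, 1): (1.0, 2, False),
}
feeds = reverb_feed(inputs, settings, n_outputs=2)
```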
  • the reverberation unit 203 establishes the desired diffuse tail sound signal by means of five tank circuits (not shown) and outputs the resulting sound signal to be added to the already established space processed initial sound signal.
  • the tail sound generating means are added using almost no space processing due to the fact that a space processing of the tail sound signal according to the diffuse nature of the signal has little or no effect at all. Consequently, the complexity of the overall algorithm may be reduced when adding the tail sound separately and making the tuning much easier.
  • tail-sound provides a more natural diffuse tail-sound due to the fact that the distinct comb-filter effect of the early pattern generator should preferably only be applied to the initial pattern in order to provide naturalness.
  • both the initial sound and the sound tail of each sound may of course be located within an artificial room and subsequently summed in a summing unit.
  • the early pattern generator is one of four according to the above described illustrative embodiment of fig. 2, and each generator comprises a dedicated source input S1, S2, S3 and S4.
  • the shown early pattern generator 26 comprises a source input S1.
  • the source input is connected to a matrix of signal processing means.
  • the shown matrix basically comprises three rows of signal processing lines, which are processed by shared diffusers 41, 42.
  • the upper row is fed directly from the input S1
  • the second row is fed through the diffuser 41
  • the third row is fed through both diffusers 41 and 42.
  • Each row of the signal processing circuit comprises colour filters 411, 412, 413; 421, 422, 423; 431, 432, 433.
  • each row comprises delay lines 4111, 4121 and 4131 which are serially connected to the colour filters 411, 412, 413.
  • each column may be tapped via level and phase controllers such as 4000, 4001 and 4002. It should be noted that the level-phase controllers 4000, 4001 and 4002 are each tap specific.
  • the initial pattern generator 26 comprises a matrix which may comprise several sets of predefined presets by which a certain desired room may be emulated.
  • signals of the current predefined room emulation are tapped to the directional signal representation of the present sound source S1.
  • four signal lines are tapped to seven directional signal components.
  • One signal, N13 of row 1, column 3, is fed to sound component 1
  • one signal, N21, is fed to signal component 3
  • two signals, N11 and N22, are added to the sound component 4.
  • each tapped signal has consequently been processed through one of three combinations of diffusers, one of three types of predefined colour filters EQ, a freely chosen length of delay line and a freely chosen level and phase output.
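  • The tapping scheme described above can be sketched as a list of taps, each defined by row, column, delay, level, phase and target component. The diffusers and colour filters are left as pluggable placeholder functions (identity by default), phase is reduced to polarity, and the delay and level values are invented for illustration. The `Tap` class and `early_pattern` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Filter = Callable[[List[float]], List[float]]


@dataclass
class Tap:
    row: int        # 1..3; the signal has passed row - 1 diffusers
    column: int     # 1..3; selects the colour filter
    delay: int      # delay line length in samples
    level: float    # output level
    inverted: bool  # phase, reduced here to polarity
    component: int  # directional component 1..7 receiving the tap


def identity(signal: List[float]) -> List[float]:
    return list(signal)


def early_pattern(source: List[float], taps: Sequence[Tap], n_components: int = 7,
                  diffuser: Filter = identity,
                  colour_filters: Sequence[Filter] = (identity, identity, identity)
                  ) -> List[List[float]]:
    """Accumulate all taps of one source into its directional components."""
    out = [[0.0] * len(source) for _ in range(n_components)]
    for tap in taps:
        signal = list(source)
        for _ in range(tap.row - 1):
            signal = diffuser(signal)
        signal = colour_filters[tap.column - 1](signal)
        gain = -tap.level if tap.inverted else tap.level
        for t in range(tap.delay, len(signal)):
            out[tap.component - 1][t] += gain * signal[t - tap.delay]
    return out


# Tap layout from the text: N13 -> component 1, N21 -> component 3,
# N11 and N22 -> component 4. Delays and levels are illustrative values.
taps = [
    Tap(row=1, column=3, delay=2, level=0.5, inverted=False, component=1),
    Tap(row=2, column=1, delay=1, level=1.0, inverted=True, component=3),
    Tap(row=1, column=1, delay=0, level=1.0, inverted=False, component=4),
    Tap(row=2, column=2, delay=3, level=0.25, inverted=False, component=4),
]
pattern = early_pattern([1.0, 0.0, 0.0, 0.0, 0.0], taps)
```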
  • a separate row with a level-phase controller 4002 should be tapped and determine the direct sound.
  • the location of both the direct sound as well as the corresponding EPG and reverberation sound signals may be mapped into the sound signal representation completely similar to the desired directionality irrespective of directional resolution and complexity.
  • the directional signal representation components usually comprise signals fed to each component 1-7 and not only the illustrated three.
  • the chosen topology of the early pattern generator within the scope of the invention may be chosen from a set of more or less equivalent topologies.
  • the signal modifying components may be varied, if e.g. a certain degree of tail-sound is added before or after tapping.
  • the illustrated early pattern generator comprises linear systems
  • the components e.g. the colour filters EQ may be interchanged with the diffusers DIF.
  • Fig. 4b illustrates a further possible embodiment of the early pattern generator, comprising colour filters EQ placed in the feed line to each row and diffusers DIF placed in each column in each row.
  • the number of columns and rows may vary depending on the system requirements. In a possible embodiment only one column of delay lines with corresponding colour filters or diffusers is utilised.
  • additional components, additional diffusers, additional different types of colour filters, etc. may be chosen.
  • the number of directions, i.e. signal components should be not less than twelve, and the established reflections of each early pattern generator should not be less than 25.
  • the basic presetting of each early pattern generator may initially be determined by a known, commercially available ray tracing or room mirroring tool, such as ODEON.
  • Fig. 5 illustrates a two-dimensional system of co-ordinates, with the axes labeled 'x' 910 and 'y' 911.
  • the axes 910, 911 are perpendicular to each other, and both are parallel to the ground, i.e. they are arranged in a horizontal plane.
  • a head of a person 912 is placed, with the nose pointing in the same direction as the x-axis 910.
  • a circle 913 with a radius of one unit and its center at the system's origin is drawn in the plane of the x- and the y-axes.
  • Fig. 6a illustrates an audio signal format. It shows again the two dimensional system of co-ordinates of figure 5. It comprises the axes x 910 and y 911, and the unit circle 913. The intersection 912 of the two axes 910, 911 represents the position of the head 912 of figure 5.
  • FIG. 6 comprises twelve vectors 920, all beginning at the system's origin and pointing towards the unit circle 913, all having the length of one unit.
  • the twelve vectors 920 (d1 ... d12) are evenly distributed around the circle 913, causing the angle 921 between two neighbor vectors to be the same, regardless of which vector is chosen.
  • Incoming sounds may be defined by these vectors, the direction of the vectors representing the direction of the incoming sounds (or rather the direction from which the incoming sound signals are coming), and a number representing the magnitude or amplitude of the incoming sound signals.
  • the number of vectors 920 (twelve) is only an example. It is possible to comprise any number of vectors, as long as the number of vectors is sufficient to define the incoming sounds satisfactorily.
  • A preferred embodiment would comprise more than 25 vectors or directions, for example 30, 40, 50 or even more.
  • The higher the number of vectors, the higher the resolution of direction that is achieved, and the higher the resolution of direction, the more accurate the source localization. For sounds coming from sources placed in front of the head, human beings are capable of distinguishing directional differences as small as 3 degrees. This is the so-called localization blur.
  • The illustrated angle 921 between two neighbor vectors is only an example. It is possible to comprise any principle of vector distribution around the circle, including uniform distribution and experience based, e.g. psycho-acoustic based distribution.
  • Since the ears are not quite as good at distinguishing the direction of sounds coming from behind the head or from the sides of the head as they are for sounds coming from in front of the head, it is advantageous to use a distribution with fewer vectors behind the head than in front of the head. This gives a less accurate localization behind the head, but a human being will normally not be able to tell the difference anyway.
  • A preferred embodiment using this distribution principle is shown in fig. 6b.
  • Another distribution of the vectors 920 could be based on measures of the density of different sounds in all possible directions. The vectors could then be distributed with small angles between them in direction sectors with high density of directional sound signals and with larger angles between them in direction sectors with low density of directional sound signals.
  • a further way of distributing the vectors around the circle could be based on human impressions.
  • FIG. 7 defines such a room. It is a three-dimensional system of co-ordinates, with the axes labeled 'x' 930, 'y' 931 and 'z' 932.
  • the axes 930-932 are perpendicular to each other, that is: each axis is perpendicular to the other two.
  • a head of a person 912 is placed, with the nose pointing in the same direction as the x-axis 930. Further two circles 946a and 946b have been added.
  • The circles 946a and 946b are placed in parallel with the unit circle 933, but are displaced along the z-axis in such a way that they still have their centers on the z-axis. Furthermore, the distance from the system's origin 912 to any point on the two circles 946a and 946b is exactly one unit, as the circles are placed on a sphere with its center in the system's origin.
  • The circle 933 lying in the x-y-plane is called the middle plane circle.
  • the circle 946a displaced along the z-axis in the positive direction is called the upper circle.
  • the circle 946b displaced along the z-axis in the negative direction is called the lower circle.
  • Fig. 9 shows an embodiment of an audio format comprising three dimensions. It comprises the elements of figure 8. Further it comprises a number of vectors 920, pointing from the system's origin 912 to the middle circle 933. These vectors 920 are comparable to the vectors of the two dimensional audio format of figure 8a.
  • FIG. 9 comprises a number of vectors 960a pointing from the system's origin 912 to the upper circle 946a and a number of vectors 960b pointing from the system's origin 912 to the lower circle 946b.
  • Fig. 10a shows the upper circle 946a and its corresponding vectors 960a of the three dimensional directional audio format.
  • the angle 951a indicates the displacement of the upper circle from the x-y-plane.
  • fig. 10a comprises an angle 971a. It indicates the rotation of a vector 960a from the direction of the x-axis 930, with axis of rotation at the z-axis 932.
  • Fig. 10b shows the middle circle 933 and its corresponding vectors 920 of the three dimensional directional audio format.
  • the angle 921 indicates the angular distance between two vectors 920.
  • Fig. 10c shows the lower circle 946b and its corresponding vectors 960b of the three dimensional directional audio format.
  • the angle 951b indicates the displacement of the lower circle from the x-y-plane.
  • the angle 971b indicates the rotation of a vector 960b from the direction of the x-axis 930.
  • Taken together, fig. 10a-10c will end up as fig. 9.
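The three-circle layout described above can be sketched as follows. The sketch treats the displacement angle (951a, 951b) as an elevation above or below the x-y-plane and the rotation angle (971a, 971b) as an azimuth about the z-axis. The function name `circle_directions`, the vector counts (6, 12, 6) and the 45 degree elevation are purely hypothetical values chosen for illustration.

```python
import math

def circle_directions(n, elevation_deg, offset_deg=0.0):
    """Unit vectors whose ends lie on one circle of the unit sphere.

    elevation_deg: displacement of the circle from the x-y-plane
                   (positive for an upper circle, negative for a lower one).
    offset_deg:    azimuth of the first vector, measured from the x-axis
                   with the z-axis as the axis of rotation.
    """
    el = math.radians(elevation_deg)
    vectors = []
    for k in range(n):
        az = math.radians(offset_deg) + 2.0 * math.pi * k / n
        vectors.append((math.cos(el) * math.cos(az),
                        math.cos(el) * math.sin(az),
                        math.sin(el)))
    return vectors

# Hypothetical layout: fewer vectors on the upper and lower circles.
upper = circle_directions(6, 45.0)
middle = circle_directions(12, 0.0)
lower = circle_directions(6, -45.0)
```

Because every vector is built from a single elevation and azimuth, all end points lie on the unit sphere, matching the statement that the circles are placed on a sphere centered at the origin.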
  • The number of vectors corresponding to each circle and the number of circles are only examples, and any number of vectors and circles are within the scope of the invention.
  • A preferred embodiment would also comprise fewer vectors pointing towards each of the upper or lower planes than to the middle plane. This is because the highest resolution of vectors wanted and also usable to the human being is in the middle plane.
  • an upper circle and/or a lower circle is situated near respectively the positive part (i.e. the angle 951a being close to 90°) and the negative part (i.e. the angle 951b being close to 90°) of the z-axis and contains only few vectors.
  • Such upper and/or lower circles could even be defined as points on the z-axis, whereby only one vector would correspond to these upper and/or lower "circles", that is a vector located along the positive direction of the z-axis and/or a vector located along the negative direction of the z-axis.
  • The audio format could be defined by a middle circle as described above in combination with a vector along the positive part of the z-axis and optionally a vector along the negative part of the z-axis.
  • the distribution of vectors around a circle is also just an example.
  • Many distribution principles are imaginable and applicable and hence within the scope of this invention. This could be a uniform distribution principle as shown in the drawings 9, 10a-10c, an experience based distribution principle, a distribution principle based on measures of the localization blur in different directions as shown for the two-dimensional system in fig. 6b or a distribution principle based on measures of the portion of sound gradients in different areas for a specific room and situation.
  • the vectors could be placed in other manners than with their ends placed at a circle, especially the vectors, which are placed with an angle in relation to the x-y-plane.
  • The angle 951a or 951b (fig. 8) in relation to the x-y-plane could vary as a function of the angle 921 (fig. 6a) if found appropriate, whereby the ends of the vectors could be situated on for example non-circular curves on the surface of the unit sphere.
  • A number of signals, preferably audio signals, s1 - sM are provided in the signal format according to the invention, e.g. comprising N directional components according to the same directional format. It shall be noted that not all of these components actually need to contain any signal, as the signal format must be expected to comprise a relatively large number of directional components in order to be able to represent the involved signal sources satisfactorily. Thus some (or even a large number) of the components of the actual signal sources may be zero.
  • The audio signals s1 - sM may be recordings of or (microphone) signals stemming from single musical instruments, groups of instruments, singers etc. or the signals s1 - sM may be other forms of signals, which will have to be combined to represent a resulting audio signal or other forms of signals.
  • The source signals s1 - sM are directed to a signal processing unit 972 or 982, said signal processing units serving to combine the source signals s1 - sM and to provide an output signal 973 or 983, respectively, which also is represented in the audio signal format according to the invention, e.g. comprising N directional components, said components having the same directions as the components used for representing the source signals s1 - sM.
  • The processing involved for combining the source signals is a summing of the corresponding components of each source signal s1 - sM.
  • The summing is carried out as a simple adding of each signal component, i.e.
  • $\Sigma d_1 = d_1(s_1) + d_1(s_2) + \dots + d_1(s_M)$
  • $\Sigma d_2 = d_2(s_1) + d_2(s_2) + \dots + d_2(s_M)$
  • $\Sigma d_3 = d_3(s_1) + d_3(s_2) + \dots + d_3(s_M)$
  • $\Sigma d_4 = d_4(s_1) + d_4(s_2) + \dots + d_4(s_M)$
  • $\Sigma d_N = d_N(s_1) + d_N(s_2) + \dots + d_N(s_M)$.
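The component-wise summing above amounts to adding, for every direction, the samples of that direction across all M sources. A minimal sketch, assuming each source is stored as a list of N components and each component as a list of samples (the data layout and the name `sum_sources` are assumptions made for this illustration):

```python
def sum_sources(sources):
    """Add M source signals component by component.

    sources[m][n] is the list of samples of directional component n of
    source m. All sources are assumed to share the same N directions
    and the same number of samples.
    """
    n_dirs = len(sources[0])
    n_samples = len(sources[0][0])
    total = [[0.0] * n_samples for _ in range(n_dirs)]
    for source in sources:
        for n, component in enumerate(source):
            for t, sample in enumerate(component):
                total[n][t] += sample
    return total

# Two sources, three directions, two samples each; unused directions are zero.
s1 = [[1.0, 2.0], [0.0, 0.0], [0.0, 0.0]]
s2 = [[0.5, 0.5], [0.0, 0.0], [1.0, -1.0]]
summed = sum_sources([s1, s2])
```

Note that the result has the same N directions as the inputs, so the output of the combining unit stays in the same format as the source signals.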
  • Further signal processing may be performed by the signal processing units 972 and 982, even additional processing not primarily serving to combine the components of the source signals, but serving to amend, equalize, add reverberation to, etc. the resulting signal. But preferably such additional processing will be performed in a later stage or stages of the signal processing chain, before the final direction rendering unit (DRU) performs the mapping of the signals to the available sound reproducing system, e.g. the loudspeaker system.
  • The basic functioning of the direction rendering unit will thus be to map the N directional signal outputs 973 and 983 from units 972 or 982 into a chosen multi-channel representation, according to the available speaker set-up.
  • The conversion into a given desired P channel representation may be effected in several different ways, such as HRTF-based (head related transfer function) methods, the technique known as Ambisonics, VBAP (vector based amplitude panning) or a purely experience-based subjective mapping.
  • Fig. 13 shows a system according to a further aspect of the multi-channel audio format and processing method of this invention. It shows a model of a reverberation unit.
  • a number of units 9101a-9101c called room simulator units, calculate how the sound emitted from a source 9100a-9100c will be heard at the listener's position, including reflections from the room.
  • These room simulator units may for example be early pattern generators, EPG's, which will be assumed to be the case in the following.
  • the EPGs should calculate the resulting sound from as many directions as possible, and this result 9102a-9102c could then be sent on in the audio format of this invention.
  • the result of each EPG should in any suitable way be added to form the final sound heard at the listener's position, and this addition could be made according to the audio signal processing method 9103 of this invention.
  • the result 9104 from passing the outputs from the EPGs through the processing method, would be in the multi-channel audio format of this invention.
  • the sound in each direction of this format would have to be mapped to the available loudspeakers 9106a-9106b or channels chosen by the user. This mapping is performed by a direction rendering unit (DRU) 9105.
  • each vector could represent the method of calculating the sound coming from that particular direction.
  • Each vector would comprise a listing of partial sound signals as a function of time and as a function of the actual input signals. The partial sound signals in each vector shall then be summed.
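One way to picture the listing of partial sound signals per vector is a table of reflections, each with a direction index, a delay and a gain, whose contributions are accumulated into the corresponding directional component. The sketch below is a strongly simplified assumption: it models each reflection as a pure delay and gain and leaves out the colour filters and diffusers mentioned elsewhere in the description. The name `early_pattern` and the reflection values are hypothetical.

```python
def early_pattern(input_signal, reflections, n_dirs, out_len):
    """Accumulate delayed, scaled copies of the input per direction.

    reflections: list of (direction_index, delay_in_samples, gain).
    Returns n_dirs components of out_len samples each.
    """
    out = [[0.0] * out_len for _ in range(n_dirs)]
    for direction, delay, gain in reflections:
        for t, x in enumerate(input_signal):
            if t + delay < out_len:
                # Each partial signal is summed into its direction.
                out[direction][t + delay] += gain * x
    return out

# Hypothetical pattern: direct sound in direction 0, two reflections in direction 2.
pattern = [(0, 0, 1.0), (2, 1, 0.5), (2, 3, 0.25)]
components = early_pattern([1.0, 0.5], pattern, n_dirs=3, out_len=5)
```

Directions that receive no reflection simply remain zero, which is consistent with the remark that many components of a signal in this format may be empty.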
  • Fig. 14 illustrates a recording, distribution and/or reproducing system according to a further aspect of the invention, said system utilizing the signal format according to the first aspect of the invention.
  • A number of source signals s1 - sM, each comprising a number of directional components d1 - dN, are led to a pre-processing unit (PPM) 9111.
  • The input source signals, which preferably may be audio signals, may each stem from a single musical instrument, a singer or another source of audio signals, or may stem from a group of instruments, a group of singers, a group of other audio sources or combinations of these. These signals may have been generated using commonly available methods and equipment, e.g. microphones, while simultaneously producing the directional components of the signals. Further the input source signals s1 - sM, or some of these, may have been generated using audio generators and/or simulators, for example room simulators, early pattern generators (EPG's) etc. as indicated above.
  • A processing of the input source signals s1 - sM takes place.
  • $d_N$: $d_N(s_1) + d_N(s_2) + d_N(s_3) + \dots + d_N(s_M)$, whereby the output signal IF from the pre-processing unit (PPM) 9111 will be generated.
  • amendments of the format of the signals may take place, e.g. reduction of the number of directional components, canceling of certain directional components, addition of certain components, which may contain information relating to the recorded or processed signals.
  • The resulting output signal IF from the pre-processing unit (PPM) 9111 thus comprises a number T of signal components, of which most or all are directional components.
  • The number T may equal the number N of directional components in each of the input source signals s1 - sM, or T may be larger than or less than N.
  • the output signal IF constitutes an intermediate signal format, which is suitable for storing, transmission or otherwise distribution of the recorded signals, preferably audio signals, while simultaneously retaining as much detailed information about the input signals as possible.
  • the output signal IF may thus be stored on any suitable form of storing media such as CD's, DVD's or static storing media of the electronic, magnetic, optical etc. variety as illustrated in fig. 15 by the storing means 9115. Further the output signal IF may be transmitted in any suitable manner, for example by distribution via the Internet or other suitable means.
  • The T-channel signal IF is received by user processing means (UPM) 9112, which may be incorporated in the reproducing system or apparatus at the end-user, for example when a storing medium such as a CD or a DVD is played on the apparatus of the end-user, or when a signal IF is received via electronic, electromagnetic or optical communication means, for example via the Internet, by the reproducing system at the end-user.
  • The reproducing equipment at the end-user will not be capable of reproducing the relatively large number of directional components, i.e. channels, comprised in the signal IF, but will be able to reproduce the signals, preferably the audio signals, in one or more of the commonly used bi- or multi-channel systems, e.g. stereo system, 3-, 4- or 5-channel systems, surround systems etc. or in a user-specified set-up.
  • the user processing means (UPM) 9112 therefore comprises means, for example in the form of a decoder, for transforming the T-channel signal IF into a signal containing a suitable number of channels, which may be reproduced by the end-user equipment.
  • the user processing means UPM may be predestined to transform the received signal from a certain number of channels T to a certain number of channels to be reproduced by the end-user equipment.
  • The user processing means UPM may be configured to determine the number of channels T incorporated in the received signal IF and to perform a transformation from this number of channels to a certain number of channels to be reproduced by the end-user equipment.
  • the user processing means UPM may be able to perform a transformation to one of two or more different reproducing systems UF1 - UFk, containing a different number of channels, in dependence upon an active choice by the user or in dependence upon other factors, such as parameters of the received signal IF.
  • The user processing means UPM of fig. 14 is configured to reproduce the audio input signals in a specific reproducing system labeled UF3, which may be any type of the relatively large number of reproducing systems (labeled UF1 - UFk on fig. 14), which are available at present, and which may reproduce the signals via loudspeaker systems using stereo systems, 3-, 4- or 5-channel systems and/or user-specified set-ups etc.
  • the user processing means UPM may further comprise other means for processing the received signal IF, for example means for equalizing, other means for coloring the signals, delaying means, adding reverberation, dynamic processing etc. These further processing steps may be designated in relation to the actual type and character of the user reproducing equipment, e.g. in order to achieve an optimal sound reproduction.
  • The transformation of the number of channels T contained in the received signal IF to the number of channels which may be reproduced by the actual end-user equipment may be performed using a linear transformation of the received signal components, for example by a matrix operation. More complicated operations may be performed to achieve more sophisticated and detailed results of the transformation.
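The linear transformation by matrix operation mentioned above can be sketched as a gain matrix with one row per output channel and one column per directional component. The function name `render`, the four example directions (front, left, back, right) and the gain values are hypothetical and only serve to show the mechanics.

```python
def render(components, gains):
    """Map T directional components to P output channels.

    components[t] is the sample list of directional component t.
    gains[p][t] is the gain from component t to output channel p, so
    that out[p][k] = sum over t of gains[p][t] * components[t][k].
    """
    n_samples = len(components[0])
    outputs = []
    for row in gains:
        channel = [0.0] * n_samples
        for gain, component in zip(row, components):
            for k, sample in enumerate(component):
                channel[k] += gain * sample
        outputs.append(channel)
    return outputs

# Hypothetical T = 4 (front, left, back, right) rendered to P = 2 (left, right).
components = [[1.0], [2.0], [0.0], [4.0]]
stereo_gains = [
    [0.5, 1.0, 0.5, 0.0],  # left output
    [0.5, 0.0, 0.5, 1.0],  # right output
]
stereo = render(components, stereo_gains)
```

Since the mapping is a fixed matrix, supporting another reproducing system only requires another gain matrix with a different number of rows; the stored T-channel signal is left untouched.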
  • The input signals s1 - sM may originate from microphones or similar transducers producing electric signals 9110a - 9110M. These signals are processed by signal processing means 9114a - 9114M, whereby the signals s1 - sM, each having N signal components according to the signal format of the invention, are produced.
  • the signal processing means 9114a - 9114M may for example be room simulators, early pattern generators etc. Some or all of the input signals for these signal processing means or for the pre-processing means PPM 9111 may also have been produced artificially, for example by electronic music instruments or signal generators.
  • FIG. 16 illustrates a system, which corresponds to the system described in connection with fig. 14, but where a further processing means, an intermediate processing means (IPM) 9113 has been added.
  • This further processing means IPM provides an intermediate processing of the multi-channel signal IF having T channels (or directional components), whereby the number of channels may be reduced to a number V.
  • the output signal IF' having V channels (or directional components) may be stored on any suitable storing media 9116 or may otherwise be transmitted or distributed to the end-users, represented by the end-user processing means UPM, where it may be reproduced as described in connection with fig. 14.
  • the output signal IF from the pre-processing unit (PPM) 9111 may also be stored on any suitable storing media (not shown in fig. 16) or may otherwise be transmitted or distributed.
  • The intermediate processing may be performed by or for a record company, a record distributor, a distribution network etc. which has a need to distribute or may achieve an advantage by distributing signals with fewer or a specific number of channels. This may thus be performed by the IPM without influencing the other parts of the system, e.g. the system as shown in fig. 14 will function without regard to the system shown in fig. 15.
  • Fig. 17 shows different modules of a rendering system according to the invention.
  • the system comprises an independent pre-rendering stage PS.
  • the pre-rendering stage PS comprises a data carrier reading unit DCRU.
  • the unit may e.g. be a DVD player adapted for reading data stored in a DVD or it may e.g. comprise a solid state memory interface.
  • The data carrier reading unit DCRU is connected to a direction rendering unit DRU by means of an M channel interface.
  • The direction rendering unit DRU comprises method storing means MSM and a signal processing unit SPU adapted for transforming the M-channel input into an N-channel output according to a rendering method stored in the said method storing means MSM.
  • the direction rendering unit may e.g. perform a relatively simple mapping of the input channels into the output channels by means of a gain matrix.
  • the rendering method may e.g. be established by means of vector based amplitude panning.
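As an illustration of how vector based amplitude panning could supply the entries of such a gain matrix, the sketch below computes the two gains for one source direction and one loudspeaker pair in the horizontal plane. It solves p = g1·l1 + g2·l2 for the gains and normalizes them to unit power. The function name `vbap_pair_gains`, the angle convention (degrees, counterclockwise from the x-axis) and the loudspeaker angles of ±30 degrees are assumptions; selecting the pair that encloses a given direction is not covered here.

```python
import math

def vbap_pair_gains(source_deg, spk1_deg, spk2_deg):
    """Two-dimensional amplitude panning gains for one loudspeaker pair.

    Solves p = g1 * l1 + g2 * l2, where p, l1 and l2 are unit vectors of
    the source direction and the two loudspeakers, then scales the gains
    so that g1**2 + g2**2 == 1.
    """
    px, py = math.cos(math.radians(source_deg)), math.sin(math.radians(source_deg))
    l1x, l1y = math.cos(math.radians(spk1_deg)), math.sin(math.radians(spk1_deg))
    l2x, l2y = math.cos(math.radians(spk2_deg)), math.sin(math.radians(spk2_deg))
    det = l1x * l2y - l2x * l1y  # zero only if the loudspeakers are collinear
    g1 = (px * l2y - l2x * py) / det
    g2 = (l1x * py - px * l1y) / det
    norm = math.hypot(g1, g2)
    return g1 / norm, g2 / norm

center = vbap_pair_gains(0.0, 30.0, -30.0)      # source straight ahead
hard_left = vbap_pair_gains(30.0, 30.0, -30.0)  # source at the first loudspeaker
```

Evaluating this function once per directional component yields one column of the gain matrix for that loudspeaker pair. A negative gain indicates that the direction lies outside the arc spanned by the pair, in which case a different pair would normally be chosen.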
  • the rendering unit may be adapted for exchanging the rendering method by means of software modifications.
  • The N-channel output of the pre-rendering unit DCRU is then fed to output channel connectors (OC).
  • the illustrated pre-rendering unit DCRU comprises only two signal output connectors.
  • the pre-rendering unit DCRU is specifically adapted for transforming the input channels into a two-channel signal representation, such as stereo.
  • the outputs of the pre-rendering unit DCRU are then fed to a traditional stereo amplifier RA connected with two loudspeakers LS.
  • The pre-rendering stage may thus be adapted for fitting to an arbitrary combination of amplifiers and loudspeakers, e.g. a surround sound system.
  • a pre-rendering unit DCRU comprising a number of output connectors, e.g. ten.
  • The pre-rendering unit DCRU may then, under control of the stored rendering method, apply 1 to 10 physical outputs.
  • The amplifier means may be incorporated in the pre-rendering unit DCRU within the scope of the invention.
  • The direction rendering unit DRU may also comprise a set of rendering methods stored in the method storing means.
  • The rendering methods may be different both with respect to basic properties determined by the number of output channels and with respect to the intended positioning of the loudspeakers of the rendering system.
  • the illustrated stereo rendering system may comprise e.g. twenty different predefined "stereo" variants with respect to e.g. the positioning of the loudspeakers in the room or e.g. room characteristics.
  • Such rendering may e.g. imply phase modification, equalizing, sound coloring, sound compression or the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Stereophonic System (AREA)

Abstract

The invention relates to an audio signal format comprising N components (d1, d2, d3, ... dN), each of the said components (d1, d2, d3, ... dN) representing a direction. According to a preferred embodiment of the invention, the components should preferably be uncorrelated to the extent that no panning is established between the signals contained in the format.

Description

MULTI-CHANNEL PROCESSING METHOD
Field of the invention
The invention relates to a format of audio signals according to claim 1 and a method of processing audio signals according to claim 9.
Moreover, the invention relates to a method of processing audio signals according to claim 12, a method of processing audio signals according to claim 13, a method of representing an audio signal (AS), said method according to claim 14, a method of decoding a number (M) of directional components () into a, preferably lower, number (N) of directional components according to claim 17, a rendering system comprising at least one input for receiving a number (M) of directional components (DC) according to claim 18 and a multi-channel data carrier according to claim 23
Background of the invention
Audio processing and audio rendering are well-known within the art.
An audio rendering system may typically imply a sound generating unit, e.g. a CD or DVD player and an associated amplifier and loudspeaker system.
A problem of the rendering known within the art is that the systems are inflexible with respect to a possible desired change of the rendering method. Thus, a change from e.g. a two channel stereo rendering into a five channel cinema rendering will typically involve serious technical problems, and the quality of an obtainable rendering may be questioned.
However the number of outputs and the preferred final speaker arrangement most often determines how the input signals are being processed, as it is imperative that the sound is modified according to that certain speaker arrangement for the sound engineer to give the listener the desired experience. Consequently sound reproduction is very constrained by the fact that it is necessary almost from the beginning to the end of the process to consider which sound format and speaker arrangement will be used eventually. And thereby lots of opportunities and properties of the sound are lost almost from the start.
One of the consequences is that true reverberation or room simulation is almost impossible to achieve by prior art audio formats.
Another aspect is that, with several sound reproduction formats, including stereophonic formats, surround sound formats and future multi-channel formats, it is necessary to record and process the sound differently according to each reproduction format.
Preferably there would be some method to record, modify and mix the input signals without having to consider which reproduction format to use. Then all the heavy and significant work on the input signals could be done once, no matter what equipment the eventual listener intends to use.
It is one object of the invention to provide a method for multi-channel sound processing where it is unnecessary to possess knowledge of the final sound reproduction format.
Summary of the invention
The invention relates to an audio signal format comprising
N components (d1, d2, d3, ... dN) according to claim 1,
each of the said components (d1, d2, d3, ... dN) representing a direction, said components preferably being uncorrelated. A component may comprise an accumulation of zero or more signals in a direction defined by said component.
According to a preferred embodiment of the invention, the components should preferably be uncorrelated, at least to the extent that no panning or dependency is established between the signals contained in the format.
It should be noted, that the signal format according to the invention may be supplemented by further signal components or signal representations within the scope of the invention.
When, as stated in claim 2, the format comprises N components (d1, d2, d3, ... dN), where N is at least 3, a further advantageous embodiment of the invention has been obtained.
When, as stated in claim 3, N is at least 10, preferably at least 20, a further advantageous embodiment of the invention has been obtained.
According to the invention, experiments have shown that impressive rendering methods may relatively easily be applied to audio signals represented in the format according to the invention when increasing the number of signal components.
Accordingly, experiments have shown that a ten-directional format (N = 10) may be rendered into an impressive stereo image, e.g. by means of a simple gain matrix mapping the direction components into the channels of a stereo rendering system. Moreover, an increase of the number of directional components has proven to be advantageous when dealing with multi-channel rendering, such as five channel rendering.
When, as stated in claim 4, the said directions are three-dimensional directions, a further advantageous embodiment of the invention has been obtained. When establishing a three dimensional format, the possibility of establishing a three dimensional audio image has been facilitated.
Evidently, the three dimensional signal format may include both true signal components or even "trick"-components.
When, as stated in claim 5, some or all of said directions are angled in relation to a common reference plane and where preferably all of said directions to one and the same side of the plane have been placed with approximately the same angle in relation to the common reference plane, a further advantageous embodiment of the invention has been obtained.
When, as stated in claim 6, the directions are placed on both sides of the common reference plane, where some or all of said directions are angled in relation to the common reference plane and where preferably all of said directions to one and the same side of the plane have been placed with approximately the same angle in relation to the common reference plane, a further advantageous embodiment of the invention has been obtained.
When, as stated in claim 7, the angle of the directions on one side of the common reference plane and the angle of the directions on the other side of said plane are substantially equal, a further advantageous embodiment of the invention has been obtained.
When, as stated in claim 8, the said directions are distributed among all possible directions, a further advantageous embodiment of the invention has been obtained.
According to the invention, an experience-based allocation of directions of components may be applied.
When, as stated in claim 9, the said directions are distributed with a larger proportion of directions in areas with a relatively high density of sound signals than in areas with a relatively low proportion of sound signals, a further advantageous embodiment of the invention has been obtained.
When, as stated in claim 10, the said directions are distributed with a larger proportion of directions in areas in which the human perception of sound signals is relatively sharp, e.g. in front of the head, a further advantageous embodiment of the invention has been obtained.
When, as stated in claim 11, said signal is decomposed to a signal comprising N directional components and according to an audio signal format as characterized in one or more of claims 1 - 10, a further advantageous embodiment of the invention has been obtained.
Moreover, the invention relates to a method of processing audio signals according to claim 12,
said signals comprising
M sub-signals (s1, s2, s3, ... sM),
each of the said sub-signals (s1, s2, s3, ... sM) comprising
N components (d1, d2, d3, ... dN),
each of the said components (d1, d2, d3, ... dN) representing a direction;
where the said sub-signals (s1, s2, s3, ... sM) are added to form a sum-signal (Σs) comprising
N components (Σd1, Σd2, Σd3, ... ΣdN),
each of the said components (Σd1, Σd2, Σd3, ... ΣdN) representing a direction, each of the said components (Σd1, Σd2, Σd3, ... ΣdN) being the sum of the corresponding components (d1, d2, d3, ... dN) of the said M sub-signals (s1, s2, s3, ... sM).
According to the invention, the format may advantageously be applied as an intermediate signal processing format and e.g. different sound sources represented according to the same format may be superpositioned by a simple adding of the signal components.
Moreover, the invention relates to a method of processing audio signals according to claim 13,
said signals comprising
M sub-signals (s1, s2, s3, ... sM),
each of the said sub-signals (s1, s2, s3, ... sM) comprising
N components (d1, d2, d3, ... dN),
each of the said components (d1, d2, d3, ... dN) representing a direction;
where said sub-signals (s1, s2, s3, ... sM) are results of a room-simulation using room-simulators, preferably multi-directional room-simulators,
where the said sub-signals (s1, s2, s3, ... sM) are added to form a sum-signal (Σs) comprising
N components (Σd1, Σd2, Σd3, ... ΣdN),
each of the said components (Σd1, Σd2, Σd3, ... ΣdN) representing a direction, each of the said components (Σd1, Σd2, Σd3, ... ΣdN) being the sum of the corresponding components (d1, d2, d3, ... dN) of the said M sub-signals (s1, s2, s3, ... sM).
Moreover, the invention relates to a method of representing an audio signal (AS), said method according to claim 14, comprising the step of
establishing at least two directional signal components (M), said directional signal components (M) preferably being uncorrelated.
The audio signal may subsequently be decoded into a desired rendering format, such as five channel.
Basically, the audio format represents a flexible audio coding in the sense, that the audio-signals may be encoded without any knowledge about the rendering system.
It should be noted that the method according to the invention implies a very advantageous representation of a signal due to the fact that the final rendering of the signal may be performed strictly according to a simple mapping if so desired, e.g. a gain matrix, still maintaining the intended total signal image. The final rendering may thus, if desired, be performed solely with focus on the rendering of a signal component having a certain direction parameter and without differentiating between the different signals and the type of the signals contained in the directional signal component. In other words, direct sound and subsequent room simulated signal may be rendered in the same way.
Accordingly, a music mix may be established once and for all in the format according to the invention, and subsequently, when new formats appear on the commercial market, a rendering may be established on the basis of the said format. According to the invention, the term uncorrelated signal components implies that different directional signal components are independent to a degree that a subsequent rendering in a rendering system is possible without compensating for mutual dependencies between different directional signal components.
According to the invention, the signal components may subsequently be rendered in almost every possible audio rendering system, e.g. stereo, five channel surround sound, without considering the originally intended rendering method.
A component includes an accumulation of zero or more signals in a direction defined by said component.
Typically, the method according to the invention may be performed without any (or very little) knowledge of the downstream rendering system.
When, as stated in claim 15, the said audio signal is a room processed signal, a further advantageous embodiment of the invention has been obtained.
Again, the term uncorrelated signal components implies that the signal components are independent. An independency between the signals of different directional signal components of a room simulated signal may specifically imply that signals representing a direct sound signal of a room signal should be independent of other direct sound representing signals of other directional signal components. Evidently, when dealing with a room simulated signal, a correlation between signals of different signal components will occur within the scope of the invention, due to the fact that a room simulation of a direct sound signal will likely produce a subsequent reverberation sound signal bundle distributed in different directional signal components (generally).
It should be noted that the method according to the invention implies a very advantageous representation of a signal when dealing with room processed signals due to the fact that the final rendering of the room simulated signal may be performed strictly according to a simple mapping if so desired, e.g. a gain matrix, still maintaining the intended total signal image. The final rendering may thus, if desired, be performed solely with focus on the rendering of a signal component having a certain direction parameter and without differentiating between the different signals and the type of the signals contained in the directional signal component. In other words, direct sound and subsequent room simulated signal may be rendered in the same way.
This unique feature implies e.g. that a music recording may be mixed, room processed, etc. by a sound engineer and encoded/stored according to the invention.
The recording may then be transformed into a signal standard fitting to a certain desired rendering method, e.g. stereo.
Subsequently, the stored music recording may be redistributed in another format without re-mixing the music recording.
When, as stated in claim 16, at least two audio signals (AS) are combined into one signal by means of an adding, a further advantageous embodiment of the invention has been obtained.
According to the invention, an adding of signal components would preferably imply that the signals (in time) of the same directional signal component of two different audio signals are summed.
Clearly, according to the invention, the format and the method facilitate a modular approach according to which e.g. the room processing may be processed independently.
Moreover, the invention relates to a method of decoding a number (M) of directional components into a, preferably lower, number (N) of directional components according to claim 17, said method comprising the step of transforming the input directional components into a number N of output directional components, said input directional components representing a room simulated audio signal, said directional components being preferably uncorrelated.
The method of decoding the above signal, which may also be regarded as a rendering, has proved to be very efficient and effective, due to the fact that the rendering itself may be performed by means of simple signal processing if desired. Hence, a rendering of the directional components into an X-channel signal (X: more or less arbitrary) may be performed by means of a simple gain matrix mapping the M directional components into the X number of channels.
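The gain matrix mapping described above may be sketched as follows. The gain values, the component count and the function name are hypothetical and serve only to show the principle; an actual rendering would use gains chosen for the loudspeaker setup in question.

```python
def apply_gain_matrix(gains, components):
    """Map M directional components to X output channels.

    gains[x][m] is the gain from directional component m to channel x.
    components[m] holds the samples of directional component m.
    """
    n_samples = len(components[0])
    return [
        [sum(g * comp[t] for g, comp in zip(row, components))
         for t in range(n_samples)]
        for row in gains
    ]


# Hypothetical example: M = 3 components (left, front, right) rendered
# to X = 2 channels, the front component being shared equally.
gains = [
    [1.0, 0.5, 0.0],  # left output channel
    [0.0, 0.5, 1.0],  # right output channel
]
components = [[1.0, 0.0], [2.0, 2.0], [0.0, 4.0]]
channels = apply_gain_matrix(gains, components)
```

The mapping treats every signal contained in a directional component alike, whether it is direct sound or room simulated sound, which is why the rendering reduces to one multiply-and-add per matrix entry and sample.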
Moreover, the invention relates to a rendering system comprising at least one input for receiving a number (M) of directional components (DC) according to claim 18, said rendering system comprising means for transforming the input directional components into a number (N) of output channels according to at least one rendering method stored in associated storing means (MSM).
According to a preferred embodiment of the invention, the rendering system should be able to be re-configured, e.g. by means of software.
According to a preferred embodiment of the invention M should be significantly higher than N.
The rendering system according to the invention represents a very flexible downstream rendering in the sense that the signal may be completely pre-established prior to the rendering, thereby reducing the task of the rendering system to be a mapping.
Another advantageous feature is that the rendering system may be specifically designed to fit to the room (and e.g. amplifier and loudspeaker) in which the sound has to be reproduced. According to a preferred embodiment of the invention, a unique spatial relation exists between each input signal component and the intended location of an output channel of a rendering system.
According to the invention, the means for transforming the input directional components into a number (N) of output channels may perform a signal processing based on a simple gain matrix mapping of the signal components into the said output signals. Evidently, the means for transforming the input directional components into a number (N) of output channels may also additionally apply other signal processing methods, such as delay compensation for compensating the positioning of the loudspeakers of the rendering system, a compression applied for adapting the rendered output signal to noisy environments, e.g. a cabin of a car.
When, as stated in claim 19, at least one of the said methods implies a transforming according to a gain matrix, a further advantageous embodiment of the invention has been obtained.
According to the present embodiment of the invention, the required signal processing to be performed in the rendering system may be minimized.
When, as stated in claim 20, the said method stored in the said storing means may be exchanged by means of a suitable software transmitting and/or receiving interface, a further advantageous embodiment of the invention has been obtained.
According to the present embodiment, the rendering methods may be downloaded into the rendering system, e.g. by means of an IRDA port, a traditional RJ 45 signal interface, etc. Another way may also be that of incorporating the method defining software on a DVD or a CD.
It should be noted that the bandwidth required for the transmittal of method defining software into the rendering may typically be relatively small. When, as stated in claim 21, the rendering system comprises a user interface adapted for selecting at least two different predefined rendering methods stored in said storing means, a further advantageous embodiment of the invention has been obtained.
According to the invention, a user may adapt his rendering system to the room in which music (or speech) is to be reproduced by simply switching between different predefined methods. Evidently, the rendering system should comprise the amplifier and associated loudspeakers of the rendering system.
The rendering system may e.g. comprise a set of predefined rendering methods associated with stereo reproduction, a set of predefined rendering methods associated with five channel rendering, etc.
When, as stated in claim 22, said system comprises a set of output channel (OC) connectors of which the rendering method may define a subset of output channel connectors to be activated when applying the transforming defined by the said method, a further advantageous embodiment of the invention has been obtained.
According to the invention, a rendering system may feed a traditional amplifier/loudspeaker setup (e.g. a stereo setup) by means of two rendering method defined output channel connectors if so desired. Moreover, the same system may feed another type of rendering system via other method defined output channel connectors.
Accordingly, the rendering system according to the invention may comprise e.g. a DVD reading unit and a unit adapted for allocating the directional signal read by the DVD reading unit into a kind of crossfield output connectors.
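A rendering system holding several stored rendering methods, each activating its own subset of output channel connectors, may be sketched as follows. The connector names, method names and gain values are hypothetical; the sketch only illustrates that switching method changes both the mapping and the set of active connectors.

```python
ALL_CONNECTORS = ["L", "C", "R", "LS", "RS"]  # hypothetical connector set

# Hypothetical stored rendering methods for M = 3 directional components
# (left, front, right). Each method names the connectors it activates and
# gives one row of gains per activated connector.
STORED_METHODS = {
    "stereo": {
        "connectors": ["L", "R"],
        "gains": [[1.0, 0.5, 0.0], [0.0, 0.5, 1.0]],
    },
    "three_front": {
        "connectors": ["L", "C", "R"],
        "gains": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
    },
}


def render_frame(method_name, frame):
    """Map one sample per directional component to the active connectors."""
    method = STORED_METHODS[method_name]
    outputs = {name: None for name in ALL_CONNECTORS}  # None = not activated
    for name, row in zip(method["connectors"], method["gains"]):
        outputs[name] = sum(g * x for g, x in zip(row, frame))
    return outputs


stereo_out = render_frame("stereo", [1.0, 2.0, 4.0])
front_out = render_frame("three_front", [1.0, 2.0, 4.0])
```

Exchanging or adding a method amounts to replacing an entry in the stored table, which is consistent with the small bandwidth needed for transmitting method defining software.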
Moreover, the invention relates to a multi-channel data carrier according to claim 23, said data carrier comprising a number (M) of audio channels (M), at least two of said channels representing a directional signal with respect to a virtual listener/reference position (VLP). According to the above stated embodiment of the invention, an advantageous rendering of a multi-channel directional audio signal stored at the carrier may be obtained. Experiments have shown that subsequent rendering of the signal into e.g. a multi-channel rendering system, not necessarily predefined at the time of storing the audio signal at the data carrier, may facilitate a relatively simple and, even more importantly, high quality rendering. Experiments have shown that the signal format e.g. facilitates a convincing rendering of the preprocessed room simulation.
A further advantageous feature of the invention is that the bandwidth may be kept constant irrespective of the character of the stored audio signal, the number of sources, etc.
According to a preferred embodiment of the invention the audio signal stored at the data carrier should be almost completely preprocessed with respect to the mixing and sound-engineering.
It should be noted that it is known to present music in different formats at the same data carrier, e.g. stereo, five channel audio, Dolby Surround, etc. Each format is typically mixed separately and, in any case, a given format can hardly be transformed to suit another rendering system.
But, according to the invention, a music mix may be released once and for all e.g. on a DVD, and if another rendering format is desired later, the rendering system may be re-configured without any need for manipulating the original data source.
When, as stated in claim 24, the directional channels are established independently of a subsequent rendering, a further advantageous embodiment of the invention has been obtained.
The directional components are established independently of the subsequent rendering of the audio channels. When, as stated in claim 25, the number of directional channels is at least eight, preferably at least twenty, a further advantageous embodiment of the invention has been obtained.
When, as stated in claim 26, at least two of the directional channels are uncorrelated, a further advantageous embodiment of the invention has been obtained.
When, as stated in claim 27, at least two of the directional channels are stored at the data carrier in a compressed state, a further advantageous embodiment of the invention has been obtained.
If desired, the effective necessary bandwidth of the complete signal represented in the format according to the invention may be reduced, taking into consideration that many audio direction components are typically unoccupied in periods. The number of such unoccupied channel components will typically increase when the number of channels is increased.
A possible compression may e.g. be applied according to PCT application WO 97/38493 by Philips.
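The observation that unoccupied direction components need not consume bandwidth may be illustrated by the following minimal sketch. It is not the compression of the cited PCT application; it is a hypothetical block-wise scheme, assumed here for illustration, that simply omits components whose samples are all silent within a block.

```python
def pack_block(components, threshold=0.0):
    """Keep only the components whose peak magnitude exceeds threshold.

    Returns a dict mapping component index to its samples.
    """
    return {
        index: samples
        for index, samples in enumerate(components)
        if max(abs(s) for s in samples) > threshold
    }


def unpack_block(packed, n_components, n_samples):
    """Restore all components, filling omitted ones with silence."""
    return [packed.get(i, [0.0] * n_samples) for i in range(n_components)]


# Hypothetical block: four direction components, two of them unoccupied.
block = [[0.0, 0.0], [0.3, -0.1], [0.0, 0.0], [0.0, 0.2]]
packed = pack_block(block)
restored = unpack_block(packed, n_components=4, n_samples=2)
```

With a threshold of zero the scheme is lossless; the saving grows with the share of unoccupied components in each block.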
The figures
The invention will be described below with reference to the drawings of which
fig. 1 shows the basic understanding of a reverberated sound,
fig. 2 shows the basic principles of a sound processing device according to the invention,
fig. 3a-3c show different sub-portions of the system according to the invention,
fig. 4a-4b illustrate early pattern generators according to the invention,
fig. 5 shows a two-dimensional system of co-ordinates for illustrative purposes,
fig. 6a-6b show two embodiments of an audio signal format according to the invention,
fig. 7 shows a three dimensional system of co-ordinates with two additional planes,
fig. 8 shows the angle of the elevation of the additional planes shown in fig. 7,
fig. 9 shows another embodiment of an audio signal format according to the invention,
fig. 10a-10c show fig. 9 divided into sub-planes for the reader's convenience,
fig. 11 shows an audio signal processing method according to the invention,
fig. 12 shows the same as fig. 11, but illustrated in an alternative manner,
fig. 13 shows a processing system according to a further embodiment of the invention,
fig. 14 shows a recording, distribution and/or reproducing system according to a still further aspect of the invention,
fig. 15 shows a recording, distribution and/or reproducing system as illustrated in fig. 14, but showing a storing medium for the pre-processed signal and the generation of the input signals,
fig. 16 shows an additional embodiment of the system illustrated in fig. 14, and
fig. 17 shows different modules of a rendering system according to the invention.
Detailed description
According to most embodiments of the invention, it is the general approach that artificial generation of room simulated sound should comprise an early reflection pattern and a late sound sequence, i.e. a tail sound signal.
It should be noted that the invention is basically directed at the early reflection patterns, and consequently sound processing based on early reflection patterns falls within the scope of the invention.
Fig. 1 illustrates the basic principles of a conventional signal processing unit.
The circuit comprises an input 1 communicating with an initial pattern generator 2 and a subsequent reverberation generator 3. In addition, the initial pattern generator 2 and the subsequent reverberation generator 3 are connected to two mixers 4, 5 having output channels 6 and 7, respectively.
The initial pattern generator 2 generates an initial sound sequence with relatively few signal reflections characterising the first part of the desired emulated sound. It is a basic assumption that the initial pattern is very important as a listener establishes a subjective understanding of the simulated room on the basis of even a short initial pattern.
An explanation of this performance is that this signal reception corresponds to the actual sound propagation and reflection in a real life room.
Hence, reflections in a certain room will initially comprise relatively few reflections, as the first sound reflections, also called first order reflections, have to propagate from a sound source at a given position in the room to the listener's position via the nearest reflecting walls or surfaces. Compared with the heavy overall complexity of the technique, this sound field will be relatively simple and may therefore be emulated in dependency of the room and the positions of the source and the listener.
Subsequently, and of course with some degree of overlapping, the next reflections will appear at the listener's position. These reflections, also called second order reflections, will be the sound waves transmitted to the position of the receiver via two reflecting surfaces.
Gradually, this sound propagation will increase in dependency of the room type, and finally the last reflected sound will be of a more diffuse nature as it comprises several reflections of several different orders at different times.
Apparently, the sound propagation will gradually result in a diffuse sound field and the sound field will more or less become a "sound soup". This diffuse sound field will be referred to as the tail sound.
If the walls have high absorption coefficients, the propagation will decrease quite fast within a short period of time, whereas the sound propagation will continue over a relatively long period of time if the absorption coefficients are low.
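The influence of the absorption coefficient on the decay may be illustrated by a simple calculation. The sketch assumes, as a common simplification not stated in the present text, that every reflection removes the fraction given by the absorption coefficient from the incident sound energy, so that the fraction (1 - absorption) remains after each reflection. The 60 dB decay target and the function name are likewise assumptions.

```python
import math


def reflections_until_decay(absorption, decay_db=60.0):
    """Number of reflections until the reflected energy has dropped by decay_db.

    Assumes each reflection keeps the energy fraction (1 - absorption).
    """
    loss_per_reflection_db = -10.0 * math.log10(1.0 - absorption)
    return math.ceil(decay_db / loss_per_reflection_db)


# Hypothetical coefficients for a strongly and a weakly absorbing room.
n_high_absorption = reflections_until_decay(0.5)
n_low_absorption = reflections_until_decay(0.1)
```

Under this assumption a weakly absorbing room needs several times as many reflections, and hence a correspondingly longer time, to reach the same decay.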
Fig. 2 illustrates the basic principles of a preferred embodiment of the invention.
For reasons of explaining, the shown embodiment of the invention has been divided into three modules 20A, 20B and 20C.
The first module 20A of the room simulator, according to the embodiment shown, comprises M source inputs 21, 22, 23.
The source inputs 21, 22 and 23 are each connected to an early pattern generator 26, 27 and 28. Each early pattern generator 26, 27 and 28 outputs N directional signals to a summing unit 29. The summing unit adds the signal components of each of the N predetermined directions from each of the early pattern generators 26, 27 and 28.
The summing unit outputs N directional signals to the module 20B comprising a direction rendering unit 201.
The basic establishing of the N directional signals has been illustrated in fig. 3a.
Now returning to fig. 2, the direction rendering unit converts the N directional signals to a P channel signal representation.
The basic establishing of the P channels of module 20B has been illustrated in fig. 3b.
Moreover, the system comprises a third module 20C. The module 20C comprises a reverb feed matrix 202 fed by the M source inputs 21, 22, 23. The reverb feed matrix 202 outputs P channel signals to a reverberator 203 which, in turn, outputs a P channel signal to a summing unit 204.
Thus, the summing unit 204 adds the P channel output of the reverberator 203 to the output of the direction rendering unit 201 and feeds the P channel signal to an output.
The basic establishing of the P channels of module 20C has been illustrated in fig. 3c.
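The signal flow through the modules 20A, 20B and 20C may be sketched for a single sample as follows. The early pattern generators and the reverberator are replaced by plain gains in this sketch, which is a deliberate simplification; all numerical values and names are hypothetical.

```python
def room_simulator_frame(sources, epg_gains, render_gains, reverb_feed):
    """One-sample sketch of the signal flow of fig. 2.

    sources[m]         : sample of source input m (M inputs)
    epg_gains[m][n]    : stand-in for early pattern generator m, direction n
    render_gains[p][n] : direction rendering unit 201 (N directions to P channels)
    reverb_feed[p][m]  : reverb feed matrix 202; the reverberator 203 is
                         replaced by a pass-through in this sketch
    """
    n_dirs = len(epg_gains[0])
    # Module 20A: summing unit 29 adds the generators' outputs per direction.
    directional = [
        sum(epg_gains[m][n] * sources[m] for m in range(len(sources)))
        for n in range(n_dirs)
    ]
    # Module 20B: N directional signals to P channels.
    early = [sum(g * d for g, d in zip(row, directional)) for row in render_gains]
    # Module 20C: P channel tail signal derived directly from the sources.
    tail = [sum(g * s for g, s in zip(row, sources)) for row in reverb_feed]
    # Summing unit 204.
    return [e + t for e, t in zip(early, tail)]


# Hypothetical values: M = 2 sources, N = 3 directions, P = 2 channels.
output = room_simulator_frame(
    sources=[1.0, 2.0],
    epg_gains=[[1.0, 0.0, 0.0], [0.0, 0.5, 0.5]],
    render_gains=[[1.0, 0.5, 0.0], [0.0, 0.5, 1.0]],
    reverb_feed=[[0.25, 0.0], [0.0, 0.25]],
)
```

The tail path bypasses both the directional representation and the direction rendering unit, which reflects the separate generation of early pattern and tail sound.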
Before explaining the overall functioning of the algorithm, the basic functioning of the early pattern generators 26, 27, 28 and the summing unit 29 will be explained with reference to fig. 3a.
According to fig. 3a, the module 20A comprises a number of inputs S1, S2, S3 and S4. It should be noted that a number of four inputs has been chosen for the purpose of obtaining a relatively simple explanation of the basic principles of the invention. Many other input numbers may be applicable.
Each of the inputs is directed to an early pattern generator 26, 27 and 28. Each early pattern generator generates a processed signal specifically established and chosen for the source input S1, S2, S3 and S4. The processed signals, according to the shown embodiment, are established as a signal composed of seven signal components d1, d2, d3, d4, d5, d6 and d7. The seven signal components represent a directional signal representation of the established sound and the established signal contains both the direct sound and the initial reverberation sound.
A possible embodiment of the invention implies a five channel rendering of a 10-directional signal where the directions of the input signal format are 0, +/-15, +/-30, +/-70, +/-110 and 180 degrees, and the intended locations of the five corresponding loudspeakers are 0, +/-30 and +/-110 degrees according to ITU 775.
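For the ten directions and five loudspeaker locations stated above, a gain matrix may for example be derived by panning each direction between the two loudspeakers enclosing it. The constant-power sine/cosine law used below is an assumption chosen for this sketch; the text does not prescribe how the gains are obtained.

```python
import math

DIRECTIONS = [0, 15, -15, 30, -30, 70, -70, 110, -110, 180]  # degrees
SPEAKERS = [0, 30, -30, 110, -110]                            # degrees


def pan_gains(direction, speakers):
    """Return one gain per loudspeaker for a single input direction."""
    ring = sorted(a % 360 for a in speakers)  # loudspeakers around the circle
    d = direction % 360
    gains = {a: 0.0 for a in ring}
    for i, low in enumerate(ring):
        high = ring[(i + 1) % len(ring)]
        span = (high - low) % 360    # angular width of this loudspeaker pair
        offset = (d - low) % 360     # position of the direction within it
        if offset < span:
            frac = offset / span
            gains[low] = math.cos(frac * math.pi / 2)
            gains[high] = math.sin(frac * math.pi / 2)
            break
    return [gains[a % 360] for a in speakers]


# columns[n] holds the five loudspeaker gains of input direction n;
# GAIN_MATRIX[p][n] is the gain from direction n to loudspeaker p.
columns = [pan_gains(d, SPEAKERS) for d in DIRECTIONS]
GAIN_MATRIX = [list(row) for row in zip(*columns)]
```

Directions coinciding with a loudspeaker are fed to that loudspeaker alone, while the remaining directions are shared between the two neighbouring loudspeakers with the sum of squared gains kept at one.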
Obviously, several other directions and locations may be applicable. A preferred embodiment comprises more than 20 directions.
Accordingly, each of the inputs S1, S2, S3 and S4 may refer to mutually different locations of the input source to which the early pattern is generated.
Successively, the signals from each source are summed in summing unit 29. The summing is carried out as a simple adding of each signal component, i.e.
d1 := d1(S1) + d1(S2) + d1(S3) + d1(S4),
d2 := d2(S1) + d2(S2) + d2(S3) + d2(S4),
d3 := d3(S1) + d3(S2) + d3(S3) + d3(S4),
d4 := d4(S1) + d4(S2) + d4(S3) + d4(S4),
d5 := d5(S1) + d5(S2) + d5(S3) + d5(S4),
d6 := d6(S1) + d6(S2) + d6(S3) + d6(S4) and
d7 := d7(S1) + d7(S2) + d7(S3) + d7(S4).
It should be noted that, even though undesired according to the preferred embodiment of the invention, the signals d1, ..., d7 may comprise tail sound components or even the whole tail sound. It should nevertheless be emphasised that according to the preferred embodiment of the invention such tail sound may advantageously be generated according to a relatively simple panning algorithm and subsequently added to the established summed initial sound signal as the established summed initial sound comprises the dominating room determining effects.
Moreover, it should be emphasised that a separate tuning of the resulting tail-sound signal is much easier when made separately from the individual tuning of the different source generators.
Turning now to module 20B, fig. 3b illustrates the basic functioning of the direction rendering unit 201.
According to the shown embodiment of the invention, the seven directional signal outputs from module 20A are mapped into a chosen multi-channel representation. According to the illustrated embodiment, the seven directional signals are mapped to a P=5 channel output.
According to a preferred embodiment of the invention, the type of multi-channel representation is a selectable parameter, both with respect to the number of applied channels and to the type of speaker setup and the individual speaker characteristics.
The conversion into a given desired P channel representation may be effected in several different ways, such as an HRTF (head related transfer function) based mapping, the technique known as Ambisonics, VBAP (vector based amplitude panning) or a purely experience based subjective mapping.
Turning now to fig. 3c, module 20C is illustrated as having an input from each of the source inputs S1, S2, S3 and S4. The signals are fed to a reverb feed matrix 202 having five outputs, corresponding to the chosen channel number of the direction rendering unit 201. The five channel outputs are fed to a reverberation unit 203 providing a five channel output of subsequent reverberation signals.
The reverb feed matrix 202 comprises relatively simple signal pre-processing means (not shown) setting the gain, delay and phase of each input's contribution to each reverb signal and may also comprise filtering pre-processing means.
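The setting of gain, delay and phase for each input's contribution to each reverb signal may be sketched as follows. Phase is modelled as a simple polarity inversion and delay as a whole number of samples; both are simplifying assumptions, and the entry values are hypothetical.

```python
def reverb_feed(inputs, entries, n_outputs):
    """Feed source inputs to reverb signals through a sparse feed matrix.

    entries: list of (source_index, output_index, gain, delay_samples, invert).
    inputs[m] holds the samples of source input m.
    """
    n_samples = len(inputs[0])
    outputs = [[0.0] * n_samples for _ in range(n_outputs)]
    for src, out, gain, delay, invert in entries:
        sign = -1.0 if invert else 1.0
        for t in range(delay, n_samples):
            outputs[out][t] += sign * gain * inputs[src][t - delay]
    return outputs


# Hypothetical example: two source inputs, two reverb signals.
inputs = [[1.0, 0.0, 0.0, 0.0], [0.0, 2.0, 0.0, 0.0]]
entries = [
    (0, 0, 0.5, 0, False),   # source 0 to reverb signal 0, undelayed
    (0, 1, 0.5, 1, True),    # source 0 to reverb signal 1, delayed, inverted
    (1, 1, 0.25, 2, False),  # source 1 to reverb signal 1, delayed
]
feeds = reverb_feed(inputs, entries, n_outputs=2)
```

The optional filtering pre-processing mentioned above is omitted from the sketch.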
Subsequently, the reverberation unit 203 establishes the desired diffuse tail sound signal by means of five tank circuits (not shown) and outputs the resulting sound signal to be added to the already established space processed initial sound signal. According to the illustrated preferred embodiment of the invention, the tail sound generating means are added using almost no space processing due to the fact that a space processing of the tail sound signal according to the diffuse nature of the signal has little or no effect at all. Consequently, the complexity of the overall algorithm may be reduced when adding the tail sound separately and making the tuning much easier.
Moreover, it should be noted that the above mentioned separate generation of the tail-sound provides a more natural diffuse tail-sound due to the fact that the distinct comb-filter effect of the early pattern generator should preferably only be applied to the initial pattern in order to provide naturalness.
It should be noted that the above generation of subsequent reverberation signals, according to the present preferred embodiment, is generated independently of the initial sound generation. Nevertheless, it should be emphasised that the invention is in no way restricted to a narrow interpretation of the basic generation of a reverberation sound. Thus, within the scope of the invention, both the initial sound and the sound tail of each sound may of course be located within an artificial room and subsequently summed in a summing unit.
Turning now to fig. 4a, an early pattern generator, such as 26 of fig. 2, is illustrated in detail. The early pattern generator is one of four according to the above described illustrative embodiment of fig. 2, and each generator comprises a dedicated source input S1, S2, S3 and S4.
The shown early pattern generator 26 comprises a source input S1.
According to the shown embodiment, the source input is connected to a matrix of signal processing means. The shown matrix basically comprises three rows of signal processing lines, which are processed by shared diffusers 41, 42.
Accordingly, the upper row is fed directly from the input SI, the second row is fed through the diffuser 41, and the third row is fed through both diffusers 41 and 42.
Each row of the signal processing circuit comprises colour filters 411, 412, 413; 421, 422, 423; 431, 432, 433. According to the shown embodiment, colour filters of the same columns are identical, i.e. colour filter 411 = 421 = 431.
It should nevertheless be emphasised that the colour filters may of course differ within the scope of the invention.
Moreover each row comprises delay lines 4111, 4121 and 4131 which are serially connected to the colour filters 411, 412, 413. Finally, each column may be tapped via level and phase controllers such as 4000, 4001 and 4002. It should be noted that each level-phase controller 4000, 4001 and 4002 is tap specific.
Hence, the initial pattern generator 26 comprises a matrix which may comprise several sets of predefined presets by which a certain desired room may be emulated.
As already mentioned and according to the simplified embodiment of the invention, signals of the current predefined room emulation are tapped to the directional signal representation of the present sound source S1. According to the illustrated programming, four signal lines are tapped to seven directional signal components. One signal, N13 of row 1, column 3, is fed to sound component 1, one signal, N21, is fed to signal component 3, and two signals, N11 and N22, are added to the sound component 4.
It should be noted that each tapped signal has consequently been processed through one of three combinations of diffusers, one of three types of predefined colour filters EQ, a freely chosen length of delay line and a freely chosen level and phase output.
Obviously, several other combinations and number processing elements are applicable within the scope of the invention.
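The tapping of the matrix into directional components may be sketched as follows. The diffusers and colour filters are replaced by pass-through stand-ins, and the delay and level values of the four taps are hypothetical; only the row, column and target component of each tap follow the programming described above. A negative level stands for an inverted phase.

```python
def early_pattern(source, taps, n_directions, diffusers, colour_filters):
    """Sketch of an early pattern generator with rows fed through diffusers.

    taps: list of (row, column, delay_samples, level, direction_index).
    """
    n = len(source)
    # Row 0 is fed directly; row r has passed through the first r diffusers.
    rows = [list(source)]
    for diffuse in diffusers:
        rows.append(diffuse(rows[-1]))
    out = [[0.0] * n for _ in range(n_directions)]
    for row, column, delay, level, direction in taps:
        filtered = colour_filters[column](rows[row])
        for t in range(delay, n):
            out[direction][t] += level * filtered[t - delay]
    return out


def identity(samples):
    """Pass-through stand-in for a diffuser or colour filter."""
    return list(samples)


# Taps N13, N21, N11 and N22 (zero-based row/column), feeding components
# 1, 3, 4 and 4 (zero-based 0, 2, 3, 3). Delays and levels are hypothetical.
taps = [
    (0, 2, 1, 0.8, 0),
    (1, 0, 2, -0.6, 2),
    (0, 0, 3, 0.5, 3),
    (1, 1, 4, 0.4, 3),
]
impulse = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
pattern = early_pattern(impulse, taps, 7, [identity, identity], [identity] * 3)
```

Fed with an impulse, the sketch returns the reflection pattern itself: one delayed and scaled pulse per tap, placed in the directional component to which the tap is assigned.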
According to one of the preferred embodiments of the invention, a separate row with a level-phase controller 4002 should be tapped to determine the direct sound. When integrating the direct sound into the early pattern generation, the location of the direct sound as well as the corresponding EPG and reverberation sound signals may be mapped into the sound signal representation in full accordance with the desired directionality, irrespective of directional resolution and complexity.
Evidently, the directional signal representation components usually comprise signals fed to each component 1-7 and not only the illustrated three.
It should be noted, that the chosen topology of the early pattern generator within the scope of the invention may be chosen from a set of more or less equivalent topologies. Moreover, the signal modifying components may be varied, if e.g. a certain degree of tail-sound is added before or after tapping.
As the illustrated early pattern generator comprises linear systems, it will be possible to interchange the components, e.g. the colour filters EQ may be interchanged with the diffusers DIF.
Fig. 4b illustrates a further possible embodiment of the early pattern generator, comprising colour filters EQ placed in the feed line to each row and diffusers DIF placed in each column in each row. Likewise, the number of columns and rows may vary depending on the system requirements. In a possible embodiment only one column of delay lines with corresponding colour filters or diffusers is utilised. Moreover, additional components, additional diffusers, additional different types of colour filters, etc. may be chosen.
Finally it should be mentioned that, according to a preferred embodiment of the invention, the number of directions, i.e. signal components, should be not less than twelve, and the established reflections of each early pattern generator should not be less than 25.
The basic presetting of each early pattern generator may initially be determined by a known commercially available ray tracing or room mirroring tool, such as ODEON.
To describe the invention, it will be necessary for illustrative and explanatory purposes first to define a plane (and later a room) wherein the elements of the audio format according to the invention may be placed.
Fig. 5 illustrates a two-dimensional system of co-ordinates, with the axes labeled 'x' 910 and 'y' 911. The axes 910, 911 are perpendicular to each other, and both are parallel to the ground, i.e. they are arranged in a horizontal plane. At the system's origin a head of a person 912 is placed, with the nose pointing in the same direction as the x-axis 910. A circle 913 with a radius of one unit and its center at the system's origin is drawn in the plane of the x- and the y-axes.
Fig. 6a illustrates an audio signal format. It shows again the two dimensional system of co-ordinates of figure 5. It comprises the axes x 910 and y 911, and the unit circle 913. The intersection 912 of the two axes 910, 911 represents the position of the head 912 of figure 5.
Further figure 6a comprises twelve vectors 920, all beginning at the system's origin and pointing towards the unit circle 913, all having the length of one unit. The twelve vectors 920 (d1 ... d12) are evenly distributed around the circle 913, causing the angle 921 between two neighbouring vectors to be the same, irrespective of which vector is chosen.
Incoming sounds may be defined by these vectors, the direction of the vectors representing the direction of the incoming sounds (or rather the direction from which the incoming sound signals are coming), and a number representing the magnitude or amplitude of the incoming sound signals.
The number of vectors 920 (twelve) is only an example. It is possible to comprise any number of vectors, as long as the number of vectors is sufficient to define the incoming sounds satisfactorily. A preferred embodiment would comprise more than 25 vectors or directions, for example 30, 40, 50 or even more. The higher the number of vectors, the higher the resolution of direction achieved. And the higher the resolution of direction, the more accurate the source localization achieved. For sounds coming from sources placed in front of the head, human beings are capable of distinguishing directional differences as small as 3 degrees. This is the so-called localization blur.
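The relation between the number of evenly distributed vectors and the directional resolution may be illustrated as follows. The function names are assumptions made for this sketch, and the assignment of an incoming sound to the nearest vector is one simple possibility among several.

```python
import math


def uniform_directions(n_vectors):
    """Azimuths in degrees of n_vectors evenly distributed around the circle."""
    return [i * 360.0 / n_vectors for i in range(n_vectors)]


def nearest_direction(azimuth, directions):
    """Index of the vector closest to an incoming sound's azimuth."""
    def gap(a):
        # Smallest angular distance, taking wrap-around at 360 into account.
        return abs((azimuth - a + 180.0) % 360.0 - 180.0)
    return min(range(len(directions)), key=lambda i: gap(directions[i]))


def vectors_for_spacing(max_spacing_deg):
    """Smallest number of evenly spaced vectors with the given spacing or finer."""
    return math.ceil(360.0 / max_spacing_deg)


twelve = uniform_directions(12)              # neighbouring vectors 30 degrees apart
index_44 = nearest_direction(44.0, twelve)   # falls to the vector at 30 degrees
index_350 = nearest_direction(350.0, twelve) # wraps around to the vector at 0
needed = vectors_for_spacing(3.0)            # spacing equal to 3 degrees
```

With twelve vectors the spacing is 30 degrees, whereas a spacing matching a localization blur of 3 degrees all around the circle would require 120 evenly distributed vectors.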
The illustrated angle 921 between two neighbouring vectors is only an example. It is possible to comprise any principle of vector distribution around the circle, including uniform distribution and experience based, e.g. psycho-acoustic based, distribution.
Having in mind that the ears are not quite as good at distinguishing the direction of sounds coming from behind the head or from the sides of the head as they are for sounds coming from in front of the head, it is advantageous to comprise a distribution with fewer vectors behind the head than in front of the head. This gives a less accurate localization behind the head, but a human being will normally not be able to tell the difference anyway. A preferred embodiment using this distribution principle is shown in fig. 6b.
Another distribution of the vectors 920 could be based on measures of the density of different sounds in all possible directions. The vectors could then be distributed with small angles between them in direction sectors with high density of directional sound signals and with larger angles between them in direction sectors with low density of directional sound signals.
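A density based distribution of the vectors may be sketched as follows. The division into sectors, the density figures and the largest-remainder allocation are assumptions made for illustration; the text leaves the exact principle open.

```python
def distribute_by_density(sectors, n_vectors):
    """Place n_vectors so that dense sectors get smaller angles between vectors.

    sectors: list of (start_deg, end_deg, density) covering the circle.
    Returns the azimuths of the vectors in degrees.
    """
    weights = [(end - start) * density for start, end, density in sectors]
    total = sum(weights)
    shares = [n_vectors * w / total for w in weights]
    counts = [int(s) for s in shares]
    # Hand the remaining vectors to the sectors with the largest remainders.
    by_remainder = sorted(
        range(len(sectors)), key=lambda i: shares[i] - counts[i], reverse=True
    )
    for i in by_remainder[: n_vectors - sum(counts)]:
        counts[i] += 1
    azimuths = []
    for (start, end, _), count in zip(sectors, counts):
        step = (end - start) / count if count else 0.0
        azimuths.extend(start + (k + 0.5) * step for k in range(count))
    return azimuths


# Hypothetical densities: the front half is three times as dense as the rear.
sectors = [(-90.0, 90.0, 3.0), (90.0, 270.0, 1.0)]
azimuths = distribute_by_density(sectors, 12)
front = [a for a in azimuths if -90.0 <= a < 90.0]
rear = [a for a in azimuths if 90.0 <= a < 270.0]
```

In this example nine of the twelve vectors fall in the front half at 20 degree intervals and three in the rear half at 60 degree intervals.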
A further way of distributing the vectors around the circle could be based on human impressions.
Another perspective of the invention is added when the above format is defined according to a room instead of just a plane. Fig. 7 defines such a room. It is a three-dimensional system of co-ordinates, with the axes labeled 'x' 930, 'y' 931 and 'z' 932. The axes 930-932 are perpendicular to each other, that is: each axis is perpendicular to the other two. At the system's origin a head of a person 912 is placed, with the nose pointing in the same direction as the x-axis 930. Further two circles 946a and 946b have been added. These circles 946a and 946b are placed in parallel with the unit circle 933, but are displaced along the z-axis in such a way that they still have their centers at the z-axis. Furthermore the distance from the system's origin 912 to any point at the two circles 946a and 946b is exactly one unit, as the circles are placed on a sphere with its center in the system's origin. The circle 933 lying in the x-y-plane is called the middle plane circle. The circle 946a displaced along the z-axis in the positive direction is called the upper circle. The circle 946b displaced along the z-axis in the negative direction is called the lower circle.
Fig. 8 corresponds to fig. 7. It comprises the three axes 930-932 of the three-dimensional co-ordinate system. Further it comprises the unit circle 933 of the x-y-plane and the two additional circles 946a, the upper circle, and 946b, the lower circle, placed in parallel with the x-y-plane. The circles 946a and 946b are centered on the z-axis, and the distance from any point on these circles to the system's origin 912 is exactly one unit as described above. Further fig. 8 comprises two angles 951a and 951b. The angle 951a is the angle between the x-axis and the direction from the system's origin 912 to a point on the circle 946a lying exactly above the x-axis. The angle 951b is defined in the same way, but with respect to the circle 946b. In this way the angles 951a, 951b indicate the displacement of the circles 946a, 946b.
Fig. 9 shows an embodiment of an audio format comprising three dimensions. It comprises the elements of figure 8. Further it comprises a number of vectors 920, pointing from the system's origin 912 to the middle circle 933. These vectors 920 are comparable to the vectors of the two-dimensional audio format of figure 6a.
Further fig. 9 comprises a number of vectors 960a pointing from the system's origin 912 to the upper circle 946a and a number of vectors 960b pointing from the system's origin 912 to the lower circle 946b.
To be able to distinguish the vectors of these three vector systems, the three circles are drawn separately in figs. 10a-10c.
Fig. 10a shows the upper circle 946a and its corresponding vectors 960a of the three-dimensional directional audio format. The angle 951a indicates the displacement of the upper circle from the x-y-plane. In addition to these already mentioned elements, fig. 10a comprises an angle 971a. It indicates the rotation of a vector 960a from the direction of the x-axis 930, with the axis of rotation being the z-axis 932.
Fig. 10b shows the middle circle 933 and its corresponding vectors 920 of the three-dimensional directional audio format. The angle 921 indicates the angular distance between two vectors 920.
Fig. 10c shows the lower circle 946b and its corresponding vectors 960b of the three-dimensional directional audio format. The angle 951b indicates the displacement of the lower circle from the x-y-plane. The angle 971b indicates the rotation of a vector 960b from the direction of the x-axis 930.
Drawn in the same co-ordinate system, figs. 10a-10c will end up as fig. 9. For the same reasons as for the two-dimensional audio format, the number of vectors corresponding to each circle and the number of circles are only examples, and any number of vectors and circles is within the scope of the invention.
As human beings are better at distinguishing the direction of sources in a horizontal plane than in a vertical plane, it is not necessary to have the same resolution of vectors upwards and downwards as sideways. This makes the preferred number of horizontal circles five. These comprise, from the top, a second upper circle, a first upper circle, a middle circle, a first lower circle and a second lower circle. Common to all imaginable circles is that the distance from any point on a circle to the system origin, i.e. the head of a human being, is one unit; in other words, the circles are all placed with their perimeters on the surface of a sphere with a radius of one unit.
A preferred embodiment would also comprise fewer vectors pointing towards each of the upper or lower planes than to the middle plane. This is because the highest resolution of vectors wanted, and also usable to the human being, is in the middle plane.
In a further preferred embodiment an upper circle and/or a lower circle is situated near, respectively, the positive part (i.e. the angle 951a being close to 90°) and the negative part (i.e. the angle 951b being close to 90°) of the z-axis and contains only a few vectors. Such upper and/or lower circles could even be defined as points on the z-axis, whereby only one vector would correspond to each of these upper and/or lower "circles", that is a vector located along the positive direction of the z-axis and/or a vector located along the negative direction of the z-axis.
In an advantageously simple embodiment of the invention the audio format could be defined by a middle circle as described above in combination with a vector along the positive part of the z-axis and optionally a vector along the negative part of the z-axis.
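The geometry described above can be sketched in Python as follows. Each circle is given by its elevation angle (the angles 951a, 951b) and a number of vectors, and each vector on a circle by its rotation about the z-axis from the x-axis (the angles 971a, 971b). The function names, the uniform spacing on each circle and the particular elevations and vector counts are illustrative assumptions, not values taken from the text.

```python
import math

def circle_directions(elevation_deg, count):
    """Unit vectors evenly spaced on one horizontal circle of the unit sphere."""
    el = math.radians(elevation_deg)
    vectors = []
    for k in range(count):
        az = 2.0 * math.pi * k / count  # rotation about the z-axis from the x-axis
        vectors.append((math.cos(el) * math.cos(az),
                        math.cos(el) * math.sin(az),
                        math.sin(el)))
    return vectors

def build_format(layout):
    """layout: list of (elevation_deg, count) pairs, one pair per circle."""
    directions = []
    for elevation_deg, count in layout:
        directions.extend(circle_directions(elevation_deg, count))
    return directions

# Assumed example: five circles with the highest resolution in the middle plane.
five_circles = build_format([(60, 4), (30, 8), (0, 16), (-30, 8), (-60, 4)])

# Assumed example of the simple embodiment: a middle circle plus one vector
# along the positive z-axis (a "circle" with elevation 90 degrees and one vector).
simple = build_format([(0, 8), (90, 1)])
```

Because every vector is computed from an elevation and a rotation angle only, all vector ends lie on the unit sphere, which matches the requirement that each circle has its perimeter on a sphere with a radius of one unit.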
The distribution of vectors around a circle is also just an example. As explained for the two-dimensional audio format, many distribution principles are imaginable and applicable and hence within the scope of this invention. This could be a uniform distribution principle as shown in figs. 9 and 10a-10c, an experience-based distribution principle, a distribution principle based on measures of the localization blur in different directions as shown for the two-dimensional system in fig. 6b, or a distribution principle based on measures of the portion of sound gradients in different areas for a specific room and situation.
Further it shall be pointed out that the vectors could be placed in other manners than with their ends placed on a circle, especially the vectors which are placed at an angle in relation to the x-y-plane. The angle 951a or 951b (fig. 8) in relation to the x-y-plane could vary as a function of the angle 921 (fig. 6a) if found appropriate, whereby the ends of the vectors could be situated on, for example, non-circular curves on the surface of the unit sphere.
Figs. 11 and 12 illustrate signal processing methods and systems which utilize the above-described audio signal formatting system.
A number of signals, preferably audio signals, s1 - sM are provided in the signal format according to the invention, e.g. comprising N directional components according to the same directional format. It shall be noted that not all of these components actually need to contain any signal, as the signal format must be expected to comprise a relatively large number of directional components in order to be able to represent the involved signal sources satisfactorily. Thus some (or even a large number) of the components of the actual signal sources may be zero.
The audio signals s1 - sM may be recordings of, or (microphone) signals stemming from, single musical instruments, groups of instruments, singers etc., or the signals s1 - sM may be other forms of signals, which will have to be combined to represent a resulting audio signal or other forms of signals.
The source signals s1 - sM are directed to a signal processing unit 972 or 982, said signal processing units serving to combine the source signals s1 - sM and to provide an output signal 973 or 983, respectively, which also is represented in the audio signal format according to the invention, e.g. comprising N directional components, said components having the same directions as the components used for representing the source signals s1 - sM.
In a preferred embodiment the processing involved for combining the source signals is a summing of the corresponding components of each source signal s1 - sM. The summing is carried out as a simple adding of each signal component, i.e.
Σd1 = d1(s1) + d1(s2) + ... + d1(sM),
Σd2 = d2(s1) + d2(s2) + ... + d2(sM),
Σd3 = d3(s1) + d3(s2) + ... + d3(sM),
Σd4 = d4(s1) + d4(s2) + ... + d4(sM),
...
ΣdN = dN(s1) + dN(s2) + ... + dN(sM).
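The component-wise summing can be illustrated with a short Python sketch. It assumes, purely for illustration, that each source signal is stored as a list of N components and each component as a list of samples of equal length; the function name `sum_sources` and this data layout are not taken from the text.

```python
def sum_sources(sources):
    """Add M source signals component by component.

    sources: list of M signals; each signal is a list of N directional
    components; each component is a list of samples (assumed layout).
    Returns the N summed components.
    """
    n_components = len(sources[0])
    n_samples = len(sources[0][0])
    total = [[0.0] * n_samples for _ in range(n_components)]
    for source in sources:
        if len(source) != n_components:
            raise ValueError("all sources must use the same N directions")
        for i, component in enumerate(source):
            if len(component) != n_samples:
                raise ValueError("all components must have the same length")
            for t, sample in enumerate(component):
                total[i][t] += sample
    return total

# Two sources (M = 2), three directions (N = 3), two samples each.
s1 = [[1.0, 2.0], [0.0, 0.0], [0.5, 0.5]]
s2 = [[0.0, 0.0], [3.0, 4.0], [0.5, -0.5]]
summed = sum_sources([s1, s2])
```

Components that carry no signal simply contribute zeros, so a source that occupies only a few of the N directions needs no special handling.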
Other forms of signal processing may be performed by the signal processing units 972 and 982, even additional processing not primarily serving to combine the components of the source signals, but serving to amend, equalize, add reverberation to, etc. the resulting signal. But preferably such additional processing will be performed in a later stage or stages of the signal processing chain but before the final direction rendering unit (DRU) will perform the mapping of the signals to the available sound reproducing system, e.g. the loudspeaker system.
The basic function of the direction rendering unit (DRU) will thus be to map the N directional signal outputs 973 and 983 from units 972 or 982 into a chosen multi-channel representation, according to the available speaker set-up. The conversion into a given desired P-channel representation may be effected in several different ways, for example using HRTF-based (head related transfer function) processing, the technique known as Ambisonics, VBAP (vector based amplitude panning) or a purely experience-based subjective mapping.
Fig. 13 shows a system according to a further aspect of the multi-channel audio format and processing method of this invention. It shows a model of a reverberation unit. Here a number of units 9101a-9101c, called room simulator units, calculate how the sound emitted from a source 9100a-9100c will be heard at the listener's position, including reflections from the room. These room simulator units may for example be early pattern generators, EPGs, which will be assumed to be the case in the following. As output each EPG 9101a-9101c has the same listener placed at the same location, but as input the EPGs could have different instruments 9100a-9100c placed at different locations. To make the best room-simulation, the EPGs should calculate the resulting sound from as many directions as possible, and this result 9102a-9102c could then be sent on in the audio format of this invention. The result of each EPG should in any suitable way be added to form the final sound heard at the listener's position, and this addition could be made according to the audio signal processing method 9103 of this invention. The result 9104 from passing the outputs from the EPGs through the processing method would be in the multi-channel audio format of this invention. Finally the sound in each direction of this format would have to be mapped to the available loudspeakers 9106a-9106b or channels chosen by the user. This mapping is performed by a direction rendering unit (DRU) 9105.
Instead of just representing the sound coming from a particular direction, each vector could represent the method of calculating the sound coming from that particular direction. In this embodiment each vector would comprise a listing of partial sound signals as a function of time and as a function of the actual input signals. The partial sound signals in each vector are then summed.
According to a further embodiment of the invention the vectors may represent further sound describing information, for example source direction, coloring etc.
Fig. 14 illustrates a recording, distribution and/or reproducing system according to a further aspect of the invention, said system utilizing the signal format according to the first aspect of the invention.
A number of source signals s1 - sM, each comprising a number of directional components d1 - dN, are led to a pre-processing unit (PPM) 9111. The input source signals, which preferably may be audio signals, may each stem from a single musical instrument, a singer or another source of audio signals, or may stem from a group of instruments, a group of singers, a group of other audio sources or combinations of these. These signals may have been generated using commonly available methods and equipment, e.g. microphones, while simultaneously producing the directional components of the signals. Further the input source signals s1 - sM, or some of these, may have been generated using audio generators and/or simulators, for example room simulators, early pattern generators (EPGs) etc. as indicated above.
In the pre-processing unit (PPM) 9111 a processing of the input source signals s1 - sM takes place. This pre-processing may in a simple form be a summing of the corresponding directional components of each input source signal, e.g.
d1 := d1(s1) + d1(s2) + d1(s3) + ... + d1(sM),
d2 := d2(s1) + d2(s2) + d2(s3) + ... + d2(sM),
...
dN := dN(s1) + dN(s2) + dN(s3) + ... + dN(sM),
whereby the output signal IF from the pre-processing unit (PPM) 9111 will be generated.
Other forms of signal processing may be incorporated in the pre-processing unit (PPM) 9111, such as equalizing, coloring, addition of tail sound signals, reverberation, delaying etc. Also amendments of the format of the signals may take place, e.g. reduction of the number of directional components, canceling of certain directional components, addition of certain components, which may contain information relating to the recorded or processed signals.
The resulting output signal IF from the pre-processing unit (PPM) 9111 thus comprises a number T of signal components, of which most or all are directional components. The number T may equal the number N of directional components in each of the input source signals s1 - sM, or T may be larger than or less than N.
The output signal IF constitutes an intermediate signal format, which is suitable for storing, transmission or other distribution of the recorded signals, preferably audio signals, while simultaneously retaining as much detailed information about the input signals as possible. The output signal IF may thus be stored on any suitable form of storing media such as CDs, DVDs or static storing media of the electronic, magnetic, optical etc. variety as illustrated in fig. 15 by the storing means 9115. Further the output signal IF may be transmitted in any suitable manner, for example by distribution via the Internet or other suitable means.
The T-channel signal IF is received by user processing means (UPM) 9112, which may be incorporated in the reproducing system or apparatus at the end-user, for example when a storing medium such as a CD or a DVD is played on the apparatus of the end-user, or when a signal IF is received via electronic, electromagnetic or optical communication means, for example via the Internet, by the reproducing system at the end-user.
Ordinarily the reproducing equipment at the end-user will not be capable of reproducing the relatively large number of directional components, i.e. channels, comprised in the signal IF, but will be able to reproduce the signals, preferably the audio signals, in one or more of the commonly used bi- or multi-channel systems, e.g. stereo system, 3-, 4- or 5-channel systems, surround systems etc. or in a user-specified set-up. The user processing means (UPM) 9112 therefore comprises means, for example in the form of a decoder, for transforming the T-channel signal IF into a signal containing a suitable number of channels, which may be reproduced by the end-user equipment.
The user processing means UPM may be preconfigured to transform the received signal from a certain number of channels T to a certain number of channels to be reproduced by the end-user equipment. Alternatively, the user processing means UPM may be configured to determine the number of channels T incorporated in the received signal IF and to perform a transformation from this number of channels to a certain number of channels to be reproduced by the end-user equipment. Further, the user processing means UPM may be able to perform a transformation to one of two or more different reproducing systems UF1 - UFk, containing a different number of channels, in dependence upon an active choice by the user or in dependence upon other factors, such as parameters of the received signal IF.
As illustrated, the user processing means UPM of fig. 14 is configured to reproduce the audio input signals in a specific reproducing system labeled UF3, which may be any type of the relatively large number of reproducing systems (labeled UF1 - UFk in fig. 14), which are available at present, and which may reproduce the signals via loudspeaker systems using stereo systems, 3-, 4- or 5-channel systems and/or user-specified set-ups etc.
In addition to the means for transforming the number of channels, the user processing means UPM may further comprise other means for processing the received signal IF, for example means for equalizing, other means for coloring the signals, delaying means, adding reverberation, dynamic processing etc. These further processing steps may be designated in relation to the actual type and character of the user reproducing equipment, e.g. in order to achieve an optimal sound reproduction.
The transformation of the number of channels T contained in the received signal IF to the number of channels which may be reproduced by the actual end-user equipment may be performed using a linear transformation of the received signal components, for example by a matrix operation. More complicated operations may be performed to achieve more sophisticated and detailed results of the transformation.
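A linear transformation by matrix operation can be sketched as follows in Python: each output channel is a weighted sum of the T received components, with one row of gains per output channel. The function name `apply_gain_matrix`, the four example directions and the gain values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def apply_gain_matrix(gain_matrix, components):
    """Map T component signals to P output channels.

    gain_matrix: P rows, each holding T gains.
    components:  T lists of samples, all of the same length.
    Output channel p is the sum over i of gain_matrix[p][i] * components[i].
    """
    n_inputs = len(components)
    n_samples = len(components[0])
    outputs = []
    for row in gain_matrix:
        if len(row) != n_inputs:
            raise ValueError("each row needs one gain per input component")
        channel = [0.0] * n_samples
        for gain, component in zip(row, components):
            for t in range(n_samples):
                channel[t] += gain * component[t]
        outputs.append(channel)
    return outputs

# Assumed example: T = 4 directions (front, left, back, right) to P = 2 channels.
stereo_gains = [
    [0.5, 1.0, 0.5, 0.0],  # left output channel
    [0.5, 0.0, 0.5, 1.0],  # right output channel
]
components = [[1.0, 0.0], [0.0, 2.0], [0.0, 0.0], [4.0, 0.0]]
left, right = apply_gain_matrix(stereo_gains, components)
```

Changing the target reproducing system then amounts to supplying a gain matrix with a different number of rows, while the received T components stay untouched.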
As indicated in fig. 15, the input signals s1 - sM may originate from microphones or similar transducers producing electric signals 9110a - 9110M. These signals are processed by signal processing means 9114a - 9114M, whereby the signals s1 - sM, each having N signal components according to the signal format of the invention, are produced. The signal processing means 9114a - 9114M may for example be room simulators, early pattern generators etc. Some or all of the input signals for these signal processing means or for the pre-processing means PPM 9111 may also have been produced artificially, for example by electronic music instruments or signal generators.
A further advantageous embodiment of the invention will be described with reference to fig. 16. This figure illustrates a system which corresponds to the system described in connection with fig. 14, but where a further processing means, an intermediate processing means (IPM) 9113, has been added. This further processing means IPM provides an intermediate processing of the multi-channel signal IF having T channels (or directional components), whereby the number of channels may be reduced to a number V. The output signal IF' having V channels (or directional components) may be stored on any suitable storing media 9116 or may otherwise be transmitted or distributed to the end-users, represented by the end-user processing means UPM, where it may be reproduced as described in connection with fig. 14.
The output signal IF from the pre-processing unit (PPM) 9111 may also be stored on any suitable storing media (not shown in fig. 16) or may otherwise be transmitted or distributed.
It will be understood that the intermediate processing may be performed by or for a record company, a record distributor, a distribution network etc. which has a need to distribute, or may achieve an advantage by distributing, signals with fewer or a specific number of channels. This may thus be performed by the IPM without influencing the other parts of the system, e.g. the system as shown in fig. 14 will function without regard to the system shown in fig. 15.
Fig. 17 shows different modules of a rendering system according to the invention.
The system comprises an independent pre-rendering stage PS.
The pre-rendering stage PS comprises a data carrier reading unit DCRU. The unit may e.g. be a DVD player adapted for reading data stored in a DVD or it may e.g. comprise a solid state memory interface.
The data carrier reading unit DCRU is connected to a direction rendering unit DRU by means of an M-channel interface.
The direction rendering unit DRU comprises method storing means MSM and a signal processing unit SPU adapted for transforming the M-channel input into an N-channel output according to a rendering method stored in the said method storing means MSM.
The direction rendering unit may e.g. perform a relatively simple mapping of the input channels into the output channels by means of a gain matrix. The rendering method may e.g. be established by means of vector based amplitude panning.
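As one possible illustration of how such a gain matrix might be established, the sketch below applies pairwise two-dimensional amplitude panning: each input direction is expressed as a non-negative combination of the two adjacent loudspeaker direction vectors that enclose it, and the two gains are normalised to constant power. The function names, the horizontal-only geometry and the normalisation are assumptions for illustration and are not stated in the text.

```python
import math

def _unit(deg):
    """Unit vector in the horizontal plane for an azimuth given in degrees."""
    r = math.radians(deg)
    return math.cos(r), math.sin(r)

def vbap_gain_matrix(direction_degs, speaker_degs):
    """Return a P x N gain matrix (P loudspeakers, N input directions)."""
    order = sorted(range(len(speaker_degs)), key=lambda i: speaker_degs[i] % 360.0)
    pairs = [(order[k], order[(k + 1) % len(order)]) for k in range(len(order))]
    matrix = [[0.0] * len(direction_degs) for _ in speaker_degs]
    for n, deg in enumerate(direction_degs):
        p = _unit(deg)
        for a, b in pairs:
            la, lb = _unit(speaker_degs[a]), _unit(speaker_degs[b])
            det = la[0] * lb[1] - la[1] * lb[0]
            if abs(det) < 1e-9:
                continue  # speakers collinear with the origin: no unique solution
            # Solve p = ga * la + gb * lb by Cramer's rule.
            ga = (p[0] * lb[1] - p[1] * lb[0]) / det
            gb = (la[0] * p[1] - la[1] * p[0]) / det
            if ga >= -1e-9 and gb >= -1e-9:
                ga, gb = max(ga, 0.0), max(gb, 0.0)
                norm = math.hypot(ga, gb)  # constant-power normalisation
                matrix[a][n] = ga / norm
                matrix[b][n] = gb / norm
                break
        else:
            raise ValueError("direction %s is not enclosed by any speaker pair" % deg)
    return matrix

# Assumed example: three input directions rendered to four loudspeakers.
gains = vbap_gain_matrix([0.0, 45.0, 90.0], [45.0, 135.0, 225.0, 315.0])
```

In this sketch a direction that coincides with a loudspeaker is routed to that loudspeaker alone, and a direction that no loudspeaker pair encloses (for example a rear direction with only two front loudspeakers) raises an error instead of being silently remapped; a practical rendering method would need some policy for such directions.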
Moreover, the rendering unit may be adapted for exchanging the rendering method by means of software modifications.
The N-channel output of the pre-rendering unit DCRU is then fed to output channel connectors (OC).
The illustrated pre-rendering unit DCRU comprises only two signal output connectors. Hence, the pre-rendering unit DCRU is specifically adapted for transforming the input channels into a two-channel signal representation, such as stereo. The outputs of the pre-rendering unit DCRU are then fed to a traditional stereo amplifier RA connected with two loudspeakers LS.
According to the invention, the pre-rendering stage may thus be adapted for fitting to an arbitrary combination of amplifiers and loudspeakers, e.g. a surround sound system.
Evidently, another possible implementation of the invention would be a pre-rendering unit DCRU comprising a number of output connectors, e.g. ten. The pre-rendering unit DCRU may then, under control of the stored rendering method, use between 1 and 10 of the physical outputs.
Evidently, the amplifier means may be incorporated in the pre-rendering unit DCRU within the scope of the invention.
The direction rendering unit DRU may also comprise a set of rendering methods stored in the method storing means. The rendering methods may differ both with respect to basic properties determined by the number of output channels and with respect to the intended positioning of the loudspeakers of the rendering system. Accordingly, the illustrated stereo rendering system may comprise e.g. twenty different predefined "stereo" variants with respect to e.g. the positioning of the loudspeakers in the room or e.g. room characteristics.
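A set of stored rendering methods could, as a rough sketch, be modelled as named gain matrices held in a small store from which one is selected. The class `MethodStore`, its method names and the example variants below are hypothetical stand-ins for the method storing means MSM; the text does not specify how the methods are stored or addressed.

```python
class MethodStore:
    """Hypothetical stand-in for the method storing means (MSM)."""

    def __init__(self):
        self._methods = {}

    def store(self, name, gain_matrix):
        """Add or exchange a rendering method, e.g. after a software update."""
        self._methods[name] = gain_matrix

    def select(self, name):
        """Return the gain matrix of the chosen rendering method."""
        return self._methods[name]

    def names_for_outputs(self, n_outputs):
        """List the stored methods that produce the given number of channels."""
        return sorted(name for name, matrix in self._methods.items()
                      if len(matrix) == n_outputs)

store = MethodStore()
# Assumed example variants for two input directions.
store.store("stereo_wide", [[1.0, 0.0], [0.0, 1.0]])
store.store("stereo_narrow", [[0.7, 0.3], [0.3, 0.7]])
store.store("mono", [[0.5, 0.5]])
stereo_variants = store.names_for_outputs(2)
```

Here the number of rows of a stored matrix stands for the number of output channels, so several "stereo" variants can coexist and differ only in their gains.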
Such rendering may e.g. imply phase modification, equalizing, sound coloring, sound compression or the like.

Claims

1. Audio signal format comprising
N components (d1, d2, d3, ... dN),
each of the said components (d1, d2, d3, ... dN) representing a direction, said components preferably being uncorrelated.
2. Audio signal format according to claim 1 comprising
N components (d1, d2, d3, ... dN), where N is at least 3.
3. Audio signal format according to claim 1 or 2 comprising
N components (d1, d2, d3, ... dN), where N is at least 10, preferably at least 20.
4. Audio signal format according to claim 1-3, wherein the said directions are three-dimensional directions.
5. Audio signal format according to claim 1-4, wherein some or all of said directions are angled in relation to a common reference plane and where preferably all of said directions to one and the same side of the plane have been placed with approximately the same angle in relation to the common reference plane.
6. Audio signal format according to claim 1-5, wherein directions are placed on both sides of the common reference plane, where some or all of said directions are angled in relation to the common reference plane and where preferably all of said directions to one and the same side of the plane have been placed with approximately the same angle in relation to the common reference plane.
7. Audio signal format according to claim 1-6, wherein the angle of the directions on one side of the common reference plane and the angle of the directions on the other side of said plane are substantially equal.
8. Audio signal format according to one or more of claims 1 - 7, wherein the said directions are distributed among all possible directions.
9. Audio signal format according to claims 1-8, wherein the said directions are distributed with a larger proportion of directions in areas with a relatively high density of sound signals than in areas with a relatively low proportion of sound signals.
10. Audio signal format according to claim 1-8, wherein the said directions are distributed with a larger proportion of directions in areas in which the human perception of sound signals is relatively sharp, e.g. in front of the head.
11. Method of representing an audio signal, wherein said signal is decomposed to a signal comprising N directional components and according to an audio signal format as characterized in one or more of claims 1 - 10.
12. Method of processing audio signals,
said signals comprising
M sub-signals (s1, s2, s3, ... sM),
each of the said sub-signals (s1, s2, s3, ... sM) comprising
N components (d1, d2, d3, ... dN),
each of the said components (d1, d2, d3, ... dN) representing a direction;
where the said sub-signals (s1, s2, s3, ... sM) are added to form a sum-signal (Σs) comprising
N components (Σd1, Σd2, Σd3, ... ΣdN),
each of the said components (Σd1, Σd2, Σd3, ... ΣdN) representing a direction,
each of the said components (Σd1, Σd2, Σd3, ... ΣdN) being the sum of the said M sub-signals' (s1, s2, s3, ... sM) corresponding components (d1, d2, d3, ... dN).
13. Method of processing audio signals,
said signals comprising
M sub-signals (s1, s2, s3, ... sM),
each of the said sub-signals (s1, s2, s3, ... sM) comprising
N components (d1, d2, d3, ... dN),
each of the said components (d1, d2, d3, ... dN) representing a direction;
where said sub-signals (s1, s2, s3, ... sM) are results of a room-simulation using room-simulators, preferably multi-directional room-simulators,
where the said sub-signals (s1, s2, s3, ... sM) are added to form a sum-signal (Σs) comprising
N components (Σd1, Σd2, Σd3, ... ΣdN),
each of the said components (Σd1, Σd2, Σd3, ... ΣdN) representing a direction,
each of the said components (Σd1, Σd2, Σd3, ... ΣdN) being the sum of the said M sub-signals' (s1, s2, s3, ... sM) corresponding components (d1, d2, d3, ... dN).
14. Method of representing an audio signal (AS), said method comprising the step of
establishing at least two directional signal components (M), said directional signal components (M) preferably being uncorrelated.
15. Method of representing an audio signal according to claim 14, whereby the said audio signal is a room processed signal.
16. Method of combining signals established according to the method according to claim 14 and 15, whereby at least two audio signals (AS) are combined into one signal by means of an adding.
17. Method of decoding a number (M) of directional components into a, preferably lower, number (N) of directional components, said method comprising the step of transforming the input directional components to a number N of output directional components, said input directional components representing a room simulated audio signal, said input directional components being preferably uncorrelated.
18. Rendering system comprising at least one input for receiving a number (M) of directional components (DC), said rendering system comprising means for transforming the input directional components into a number (N) of output channels according to at least one rendering method stored in associated storing means (MSM).
19. Rendering system according to claim 18, wherein one at least of the said methods implies a transforming according to a gain matrix.
20. Rendering system according to claim 18 or 19, wherein the said method stored in the said storing means may be exchanged by means of a suitable software transmitting and/or receiving interface.
21. Rendering system according to claims 18-20, wherein the rendering system comprises a user interface adapted for selecting at least two different predefined rendering methods stored in said storing means.
22. Rendering system according to claims 18-21, wherein said system comprises a set of output channel (OC) connectors of which the rendering method may define a subset of output channel connectors to be activated when applying the transforming defined by the said.
23. Multi-channel data carrier, said data carrier comprising a number (M) of audio channels (M), at least two of said channels representing a directional signal with respect to a virtual listener/reference position (VLP).
24. Multi-channel data carrier according to claim 23, wherein the directional channels are established independently of a subsequent rendering.
25. Multi-channel data carrier according to claim 23 or 24, wherein the number of directional channels are at least eight, preferably at least twenty.
26. Multi-channel data carrier according to claim 23-25, wherein at least two of the directional channels are uncorrelated.
27. Multi-channel data carrier according to claim 23-26, wherein at least two of the directional channel are stored at the data carrier in a compressed state.
PCT/DK2000/000443 1999-08-09 2000-08-09 Multi-channel processing method WO2001011602A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
AU64271/00A AU6427100A (en) 1999-08-09 2000-08-09 Multi-channel processing method
EP00951275A EP1203364A1 (en) 1999-08-09 2000-08-09 Multi-channel processing method

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
EP99202585A EP1076328A1 (en) 1999-08-09 1999-08-09 Signal processing unit
EP99202585.8 1999-08-09
EP00201759A EP1158486A1 (en) 2000-05-18 2000-05-18 Method of processing a signal
EP00201759.8 2000-05-18

Publications (1)

Publication Number Publication Date
WO2001011602A1 true WO2001011602A1 (en) 2001-02-15

Family

ID=26072251

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/DK2000/000443 WO2001011602A1 (en) 1999-08-09 2000-08-09 Multi-channel processing method

Country Status (3)

Country Link
EP (1) EP1203364A1 (en)
AU (1) AU6427100A (en)
WO (1) WO2001011602A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4731848A (en) * 1984-10-22 1988-03-15 Northwestern University Spatial reverberator
EP0593228A1 (en) * 1992-10-13 1994-04-20 Matsushita Electric Industrial Co., Ltd. Sound environment simulator and a method of analyzing a sound space
US5452360A (en) * 1990-03-02 1995-09-19 Yamaha Corporation Sound field control device and method for controlling a sound field
US5585587A (en) * 1993-09-24 1996-12-17 Yamaha Corporation Acoustic image localization apparatus for distributing tone color groups throughout sound field

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB394325A (en) * 1931-12-14 1933-06-14 Alan Dower Blumlein Improvements in and relating to sound-transmission, sound-recording and sound-reproducing systems

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ROCCHESSO D ET AL: "CIRCULANT AND ELLIPTIC FEEDBACK DELAY NETWORKS FOR ARTIFICIAL REVERBERATION", IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING,US,IEEE INC. NEW YORK, vol. 5, no. 1, 1997, pages 51 - 63, XP000785329, ISSN: 1063-6676 *
See also references of EP1203364A1 *

Also Published As

Publication number Publication date
EP1203364A1 (en) 2002-05-08
AU6427100A (en) 2001-03-05

Similar Documents

Publication Publication Date Title
Zotter et al. Ambisonics: A practical 3D audio theory for recording, studio production, sound reinforcement, and virtual reality
Hacihabiboglu et al. Perceptual spatial audio recording, simulation, and rendering: An overview of spatial-audio techniques based on psychoacoustics
Jot Real-time spatial processing of sounds for music, multimedia and interactive human-computer interfaces
EP1025743B1 (en) Utilisation of filtering effects in stereo headphone devices to enhance spatialization of source around a listener
US6904152B1 (en) Multi-channel surround sound mastering and reproduction techniques that preserve spatial harmonics in three dimensions
RU2533437C2 (en) Method and apparatus for encoding and optimal reconstruction of three-dimensional acoustic field
CA2270664C (en) Multi-channel audio enhancement system for use in recording and playback and methods for providing same
Lipshitz Stereo microphone techniques: Are the purists wrong?
EP1275272B1 (en) Multi-channel surround sound mastering and reproduction techniques that preserve spatial harmonics in three dimensions
KR20080060640A (en) Method and apparatus for reproducing a virtual sound of two channels based on individual auditory characteristic
US8363847B2 (en) Device and method for simulation of WFS systems and compensation of sound-influencing properties
WO1992015180A1 (en) Sound reproduction system
Jot et al. Binaural simulation of complex acoustic scenes for interactive audio
MX2012002886A (en) Phase layering apparatus and method for a complete audio signal.
Braasch et al. A loudspeaker-based projection technique for spatial music applications using virtual microphone control
EP1203364A1 (en) Multi-channel processing method
US7403625B1 (en) Signal processing unit
WO2001019138A2 (en) Method and apparatus for generating a second audio signal from a first audio signal
KR970005610B1 (en) An apparatus for regenerating voice and sound
KR20050060552A (en) Virtual sound system and virtual sound implementation method
Sousa The development of a 'Virtual Studio' for monitoring Ambisonic based multichannel loudspeaker arrays through headphones
Christensen et al. Presented at the 107th Convention 1999 September 24-27 New York
Montoya et al. High Spatial Resolution Multichannel Recording

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ CZ DE DE DK DK DM DZ EE EE ES FI FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 EP: the EPO has been informed by WIPO that EP was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (PCT application filed before 20040101)
WWE WIPO information: entry into national phase

Ref document number: 2000951275

Country of ref document: EP

WWP WIPO information: published in national office

Ref document number: 2000951275

Country of ref document: EP

WWE WIPO information: entry into national phase

Ref document number: 10049417

Country of ref document: US

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

NENP Non-entry into the national phase

Ref country code: JP