WO2001011602A1 - Multi-channel processing method - Google Patents
- Publication number
- WO2001011602A1 (PCT/DK2000/000443)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- components
- signals
- directional
- signal
- directions
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S7/00—Indicating arrangements; Control arrangements, e.g. balance control
- H04S7/30—Control circuits for electronic adaptation of the sound field
- H04S7/305—Electronic adaptation of stereophonic audio signals to reverberation of the listening space
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
- G10H1/00—Details of electrophonic musical instruments
- G10H1/0091—Means for obtaining special acoustic effects
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S5/00—Pseudo-stereo systems, e.g. in which additional channel signals are derived from monophonic signals by means of phase shifting, time delay or reverberation
- H04S5/005—Pseudo-stereo systems, e.g. in which additional channel signals are derived from monophonic signals by means of phase shifting, time delay or reverberation of the pseudo five- or more-channel type, e.g. virtual surround
Definitions
- the invention relates to a format of audio signals according to claim 1 and a method of processing audio signals according to claim 9.
- the invention relates to a method of processing audio signals according to claim 12, a method of processing audio signals according to claim 13, a method of representing an audio signal (AS), said method according to claim 14, a method of decoding a number (M) of directional components into a, preferably lower, number (N) of directional components according to claim 17, a rendering system comprising at least one input for receiving a number (M) of directional components (DC) according to claim 18 and a multi-channel data carrier according to claim 23.
- Audio processing and audio rendering are well-known within the art.
- An audio rendering system may typically imply a sound generating unit, e.g. a CD or DVD player and an associated amplifier and loudspeaker system.
- a problem of the rendering known within the art is that the systems are inflexible with respect to a desired change of the rendering method. Thus, a change from e.g. a two channel stereo rendering into a five channel cinema rendering will typically involve serious technical problems, and the quality of an obtainable rendering may be questioned.
- Another aspect is that, with several sound reproduction formats in use, including stereophonic formats, surround sound formats and future multi-channel formats, it is necessary to record and process the sound differently for each reproduction format.
- the invention relates to an audio signal format comprising
- each of the said components (d1, d2, d3, ... dN) representing a direction, said components preferably being uncorrelated.
- a component may comprise an accumulation of zero or more signals in a direction defined by said component.
- the components should preferably be uncorrelated, at least to the extent that no panning or dependency is established between the signals contained in the format.
- N components (d1, d2, d3, ... dN), where N is at least 3, a further advantageous embodiment of the invention has been obtained.
- N is at least 10, preferably at least 20, a further advantageous embodiment of the invention has been obtained.
- an increase of the number of directional components has proven to be advantageous when dealing with multi-channel rendering, such as five channel rendering.
- the said directions are three-dimensional directions
- a further advantageous embodiment of the invention has been obtained.
- the possibility of establishing a three dimensional audio image has been facilitated.
- the three dimensional signal format may include both true signal components or even "trick"-components.
- an experience-based allocation of directions of components may be applied.
- the said directions are distributed with a larger proportion of directions in areas in which the human perception of sound signals is relatively sharp, e.g. in front of the head, a further advantageous embodiment of the invention has been obtained.
- said signal is decomposed to a signal comprising N directional components and according to an audio signal format as characterized in one or more of claims 1 - 10, a further advantageous embodiment of the invention has been obtained.
- the invention relates to a method of processing audio signals according to claim 12,
- each of the said sub-signals (s1, s2, s3, ... sM) comprising
- each of the said components (d1, d2, d3, ... dN) representing a direction;
- each of the said components (Σd1, Σd2, Σd3, ... ΣdN) representing a direction, each of the said components (Σd1, Σd2, Σd3, ... ΣdN) being the sum of the corresponding components (d1, d2, d3, ... dN) of the said M sub-signals (s1, s2, s3, ... sM).
- the format may advantageously be applied as an intermediate signal processing format and e.g. different sound sources represented according to the same format may be superpositioned by a simple adding of the signal components.
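The superposition described here can be sketched as a component-wise sum. This is a minimal illustration, not the implementation of the invention; the function name `superpose` and the nested-list layout (sub-signal, then direction, then sample) are assumptions made for the example.

```python
def superpose(sub_signals):
    """Sum M sub-signals direction by direction.

    sub_signals[m][d][t] is sample t of directional component d of
    sub-signal m (an assumed layout). All sub-signals must share the
    same number of directions N and the same length.
    """
    if not sub_signals:
        return []
    n_dirs = len(sub_signals[0])
    n_samples = len(sub_signals[0][0]) if n_dirs else 0
    summed = [[0.0] * n_samples for _ in range(n_dirs)]
    for sub in sub_signals:
        if len(sub) != n_dirs:
            raise ValueError("all sub-signals need the same number of directions")
        for d, component in enumerate(sub):
            if len(component) != n_samples:
                raise ValueError("all components need the same length")
            for t, sample in enumerate(component):
                summed[d][t] += sample
    return summed


# Two sources in a hypothetical three-direction format, two samples each.
source_a = [[1.0, 2.0], [0.0, 0.0], [3.0, 0.0]]
source_b = [[0.5, 0.0], [1.0, 1.0], [0.0, 0.0]]
mixed = superpose([source_a, source_b])
```

Because the sum never mixes different directions, the result stays in the same N-component format as its inputs.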
- the invention relates to a method of processing audio signals according to claim 13,
- each of the said sub-signals (s1, s2, s3, ... sM) comprising
- each of the said components (d1, d2, d3, ... dN) representing a direction;
- sub-signals are results of a room-simulation using room-simulators, preferably multi-directional room-simulators
- each of the said components (Σd1, Σd2, Σd3, ... ΣdN) representing a direction, each of the said components (Σd1, Σd2, Σd3, ... ΣdN) being the sum of the corresponding components (d1, d2, d3, ... dN) of the said M sub-signals (s1, s2, s3, ... sM).
- the invention relates to a method of representing an audio signal (AS), said method according to claim 14, comprising the step of
- the audio signal may subsequently be decoded into a desired rendering format, such as five channel.
- the audio format represents a flexible audio coding in the sense, that the audio-signals may be encoded without any knowledge about the rendering system.
- the method according to the invention implies a very advantageous representation of a signal due to the fact that the final rendering of the signal may be performed strictly according to a simple mapping if so desired, e.g. a gain matrix, still maintaining the intended total signal image.
- the final rendering may thus, if desired, be performed solely with focus on the rendering of a signal component having a certain direction parameter and without differentiating between the different signals and the type of the signals contained in the directional signal component.
- direct sound and subsequent room simulated signal may be rendered in the same way.
- a music mix may be established once and for all in the format according to the invention, and subsequently, when new formats appear on the commercial market, a rendering may be established on the basis of the said format.
- the term uncorrelated signal components implies that different directional signal components are independent to a degree that a subsequent rendering in a rendering system is possible without compensating for mutual dependencies between different directional signal components.
- the signal components may subsequently be rendered in almost every possible audio rendering system, e.g. stereo, five channel surround sound, without considering the originally intended rendering method.
- a component includes an accumulation of zero or more signals in a direction defined by said component
- the method according to the invention may be performed without any (or very little) knowledge of the downstream rendering system.
- the said audio signal is a room processed signal, a further advantageous embodiment of the invention has been obtained.
- uncorrelated signal components implies that the signal components are independent.
- An independency between the signals of different directional signal components of a room simulated signal may specifically imply that signals representing a direct sound signal of a room signal should be independent of other direct sound representing signals of other directional signal components.
- a correlation between signals of different signal components will occur within the scope of the invention, due to the fact that a room simulation of a direct sound signal will likely produce a subsequent reverberation sound signal bundle distributed (generally) over different directional signal components.
- the method according to the invention implies a very advantageous representation of a signal when dealing with room processed signals due to the fact that the final rendering of the room simulated signal may be performed strictly according to a simple mapping if so desired, e.g. a gain matrix, still maintaining the intended total signal image.
- the final rendering may thus, if desired, be performed solely with focus on the rendering of a signal component having a certain direction parameter and without differentiating between the different signals and the type of the signals contained in the directional signal component. In other words, direct sound and subsequent room simulated signal may be rendered in the same way.
- This unique feature implies e.g. that a music recording may be mixed, room processed, etc. by a sound engineer and encoded/stored according to the invention.
- the recording may then be transformed into a signal standard fitting to a certain desired rendering method, e.g. stereo.
- the stored music recording may be redistributed in another format without re-mixing the music recording.
- adding signal components would preferably mean that the signals (in time) of the same directional signal component of two different audio signals are summed.
- the format and the method facilitate a modular approach according to which e.g. the room processing may be processed independently.
- the invention relates to a method of decoding a number (M) of directional components into a, preferably lower, number (N) of directional components according to claim 17, said method comprising the step of transforming the input directional components to a number N of output directional components, said input directional components representing a room simulated audio signal, said directional components being preferably uncorrelated.
- the method of decoding the above signal which may also be regarded as a rendering, has proved to be very efficient and effective, due to the fact that the rendering itself may be performed by means of simple signal processing if desired.
- a rendering of the directional components into an X-channel signal (X: more or less arbitrary) may be performed by means of a simple gain matrix mapping the M directional components into the X number of channels.
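A gain matrix mapping of this kind can be sketched as a plain matrix product applied sample by sample. The sketch below is illustrative only; the function name `render`, the list layout and the example gains are assumptions, and the text does not prescribe how the gains are chosen.

```python
def render(components, gain_matrix):
    """Map M directional components to X channels.

    components[m][t] is sample t of directional component m.
    gain_matrix[x][m] is the gain from component m to channel x.
    Channel x is the gain-weighted sum of all M components.
    """
    n_samples = len(components[0])
    channels = []
    for row in gain_matrix:
        if len(row) != len(components):
            raise ValueError("each gain row needs one gain per component")
        channel = [0.0] * n_samples
        for gain, component in zip(row, components):
            for t, sample in enumerate(component):
                channel[t] += gain * sample
        channels.append(channel)
    return channels


# Hypothetical example: M = 3 directional components, X = 2 channels.
components = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
stereo_gains = [[1.0, 0.5, 0.0],   # left channel
                [0.0, 0.5, 1.0]]   # right channel
left, right = render(components, stereo_gains)
```

Changing the rendering target then only means swapping `stereo_gains` for a matrix with a different number of rows; the directional components themselves are untouched.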
- the invention relates to a rendering system comprising at least one input for receiving a number (M) of directional components (DC) according to claim 18, said rendering system comprising means for transforming the input directional components into a number (N) of output channels according to at least one rendering method stored in associated storing means (MSM).
- the rendering system should be able to be re-configured, e.g. by means of software.
- M should be significantly higher than N.
- the rendering system according to the invention represents a very flexible downstream rendering in the sense that the signal may be completely pre-established prior to the rendering, thereby reducing the task of the rendering system to a mapping.
- the rendering system may be specifically designed to fit to the room (and e.g. amplifier and loudspeaker) in which the sound has to be reproduced.
- a unique spatial relation exists between each input signal component and the intended location of an output channel of a rendering system.
- the means for transforming the input directional components into a number (N) of output channels may perform a signal processing based on a simple gain matrix mapping of the signal components into the said output signals.
- the means for transforming the input directional components into a number (N) of output channels may also additionally apply other signal processing methods, such as delay compensation for compensating the positioning of the loudspeakers of the rendering system, a compression applied for adapting the rendered output signal to noisy environments, e.g. a cabin of a car.
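The delay compensation mentioned here is commonly done by delaying the nearer loudspeakers so that all wavefronts arrive at the listener together. The following is a minimal sketch of that general technique under stated assumptions: the helper names, the speed of sound of 343 m/s and the example distances are not taken from the text.

```python
def alignment_delays(distances_m, sample_rate_hz, speed_of_sound=343.0):
    """Per-loudspeaker delay in samples that time-aligns all loudspeakers.

    The farthest loudspeaker gets no delay; each nearer one is delayed by
    the extra travel time of the farthest, rounded to whole samples.
    """
    farthest = max(distances_m)
    return [round((farthest - d) / speed_of_sound * sample_rate_hz)
            for d in distances_m]


def delay(signal, n_samples):
    """Delay a channel by prepending zeros, keeping the original length."""
    return ([0.0] * n_samples + list(signal))[:len(signal)]


# Hypothetical setup: one loudspeaker at 2 m, one at 3 m, 48 kHz sampling.
delays = alignment_delays([2.0, 3.0], 48000)
aligned = delay([1.0, 2.0, 3.0], 1)
```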
- the required signal processing to be performed in the rendering system may be minimized.
- the said method stored in the said storing means may be exchanged by means of a suitable software transmitting and/or receiving interface, a further advantageous embodiment of the invention has been obtained.
- the rendering methods may be downloaded into the rendering system, e.g. by means of an IRDA port, a traditional RJ 45 signal interface, etc.
- Another way may also be that of incorporating the method defining software on a DVD or a CD.
- the bandwidth required for the transmittal of method defining software into the rendering may typically be relatively small.
- the rendering system comprises a user interface adapted for selecting at least two different predefined rendering methods stored in said storing means, a further advantageous embodiment of the invention has been obtained.
- a user may adapt his rendering system, to the room in which music (or speech) is to be reproduced by simply switching between different predefined methods.
- the rendering system should comprise the amplifier and associated loudspeakers of the rendering system.
- the rendering system may e.g. comprise a set of predefined rendering methods associated with stereo reproduction, a set of predefined rendering methods associated with five channel rendering, etc.
- said system comprising a set of output channel (OC) connectors of which the rendering method may define a subset of output channel connectors to be activated when applying the transforming defined by the said, a further advantageous embodiment of the invention has been obtained.
- OC output channel
- a rendering system may feed a traditional amplifier/loudspeaker setup (e.g. a stereo setup) by means of two rendering method defined output channel connectors if so desired. Moreover, the same system may feed another type of rendering system via other method defined output channel connectors.
- the rendering system may comprise e.g. a DVD reading unit and a unit adapted for allocating the directional signal read by the DVD into a kind of crossfield output connectors.
- the invention relates to a multi-channel data carrier according to claim 23, said data carrier comprising a number (M) of audio channels (M), at least two of said channels representing a directional signal with respect to a virtual listener/reference position (VLP).
- M number of audio channels
- VLP virtual listener/reference position
- an advantageous rendering of a multi-channel directional audio signal stored at the carrier may be obtained.
- subsequent rendering of the signal into e.g. a multi-channel rendering system not necessarily predefined at the time of storing the audio-signal at the data carrier, may facilitate a relatively simple, and even more important high quality rendering.
- the signal format e.g. facilitates a convincing rendering of the preprocessed room simulation.
- a further advantageous feature of the invention is that the bandwidth may be kept constant irrespective of the character of the stored audio signal, the number of sources, etc.
- the audio-signal stored at the data carrier should be almost completely preprocessed with respect to the mixing and sound-engineering.
- a music mix may be released once and for all e.g. on a DVD, and if another rendering format is desired later, the rendering system may be re-configured without any need for manipulating the origin data source.
- the directional components are established independently of the subsequent rendering of the audio channels.
- the number of directional channels is at least eight, preferably at least twenty, a further advantageous embodiment of the invention has been obtained.
- the effective necessary bandwidth of the complete signal represented in the format according to the invention may be reduced, taking into consideration that many audio direction components are typically unoccupied in periods. The number of such unoccupied channel components will typically increase when increasing the number of channels.
- a possible compression may e.g. be applied according to PCT application WO 97/38493 by Philips.
- fig. 1 shows the basic understanding of a reverberated sound
- fig. 2 shows the basic principles of a sound processing device according to the invention
- fig. 3a-3c shows different sub-portions of the system according to the invention
- fig. 4a-4b illustrates early pattern generators according to the invention
- fig. 5 shows a two-dimensional system of co-ordinates for illustrative purposes
- fig. 6a-6b shows two embodiments of an audio signal format according to the invention
- fig. 7 shows a three dimensional system of co-ordinates with two additional planes
- fig. 8 shows the angle of the elevation of the additional planes shown in fig.
- fig. 9 shows another embodiment of an audio signal format according to the invention
- fig. 10a-10c shows fig. 9 divided into sub-planes for the reader's convenience
- fig. 11 shows an audio signal processing method according to the invention
- fig. 12 shows the same as fig 11, but illustrated in an alternative manner
- fig. 13 shows a processing system according to a further embodiment of the invention
- fig. 14 shows a recording, distribution and/or reproducing system according to a still further aspect of the invention
- fig. 15 shows a recording, distribution and/or reproducing system as illustrated in fig. 14, but showing a storing medium for the pre-processed signal and the generation of the input signals
- fig. 16 shows an additional embodiment of the system illustrated in fig. 14, and
- fig. 17 shows different modules of a rendering system according to the invention.
- artificial generation of room simulated sound should comprise an early reflection pattern and a late sound sequence, i.e. a tail sound signal.
- the invention is basically directed at the early reflection patterns, and consequently at sound processing based on early reflection patterns within the scope of the invention.
- Fig. 1 illustrates the basic principles of a conventional signal processing unit.
- the circuit comprises an input 1 communicating with an initial pattern generator 2 and a subsequent reverberation generator 3.
- the initial pattern generator 2 and the subsequent reverberation generator 3 are connected to two mixers 4, 5 having output channels 6 and 7, respectively.
- the initial pattern generator 2 generates an initial sound sequence with relatively few signal reflections characterising the first part of the desired emulated sound. It is a basic assumption that the initial pattern is very important as a listener establishes a subjective understanding of the simulated room on the basis of even a short initial pattern.
- reflections in a certain room will initially comprise relatively few reflections, as the first sound reflections, also called first order reflections, have to propagate from a sound source at a given position in the room to the listener's position via the nearest reflecting walls or surfaces.
- this sound field will be relatively simple and may therefore be emulated in dependency of the room and the position of the source and the listener.
- reflections also called second order reflections, will be the sound waves transmitted to the position of the receiver via two reflecting surfaces.
- the propagation will decrease quite fast after a short period of time, while the sound propagation will continue over a relatively long period of time if the absorption coefficients are low.
- Fig. 2 illustrates the basic principles of a preferred embodiment of the invention.
- the shown embodiment of the invention has been divided into three modules 20A, 20B and 20C.
- the first module 20A of the room simulator comprises M source inputs 21, 22, 23.
- the source inputs 21, 22 and 23 are each connected to an early pattern generator 26, 27 and 28.
- Each early pattern generator 26, 27 and 28 outputs N directional signals to a summing unit 29.
- the summing unit adds the signal components of each of the N predetermined directions from each of the early pattern generators 26, 27 and 28.
- the summing unit outputs N directional signals to the module 20B comprising a direction rendering unit 201.
- the direction rendering unit converts the N directional signals to a P channel signal representation.
- the system comprises a third module 20C.
- the module 20C comprises a reverb feed matrix 202 fed by the M source inputs 21, 22, 23.
- the reverb feed matrix 202 outputs P channel signals to a reverberator 203 which, in turn, outputs a P channel signal to a summing unit 204.
- the summing unit 204 adds the P channel output of the reverberator 203 to the output of the direction rendering unit 201 and feeds the P channel signal to an output.
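The signal flow through modules 20A, 20B and 20C can be sketched as follows. This is a structural illustration only: the function names, the matrix-based rendering and the stand-in generators in the usage example are assumptions, and the early pattern generators and the reverberator are passed in as opaque callables rather than implemented.

```python
def mix(a, b):
    """Sample-wise sum of two equally shaped multi-channel blocks."""
    return [[x + y for x, y in zip(ca, cb)] for ca, cb in zip(a, b)]


def apply_matrix(matrix, signals):
    """out[r][t] = sum over c of matrix[r][c] * signals[c][t]."""
    n_samples = len(signals[0])
    return [[sum(g * s[t] for g, s in zip(row, signals))
             for t in range(n_samples)] for row in matrix]


def process(sources, early_pattern_generators, render_matrix,
            reverb_feed_matrix, reverberator):
    # Module 20A: one generator per source, summed per direction (unit 29).
    directional = None
    for source, generator in zip(sources, early_pattern_generators):
        components = generator(source)  # N directional components
        directional = (components if directional is None
                       else mix(directional, components))
    # Module 20B: N directional signals -> P channels (unit 201).
    early = apply_matrix(render_matrix, directional)
    # Module 20C: reverb feed matrix 202, reverberator 203, summing unit 204.
    tail = reverberator(apply_matrix(reverb_feed_matrix, sources))
    return mix(early, tail)


# Stand-ins for a dry run with M = 2 sources, N = 2 directions, P = 2 channels.
to_first = lambda s: [list(s), [0.0] * len(s)]
to_second = lambda s: [[0.0] * len(s), list(s)]
silent_tail = lambda channels: [[0.0] * len(c) for c in channels]
out = process([[1.0, 2.0], [3.0, 4.0]], [to_first, to_second],
              [[1.0, 0.0], [0.0, 1.0]], [[0.5, 0.5], [0.5, 0.5]], silent_tail)
```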
- the module 20A comprises a number of inputs S1, S2, S3 and S4. It should be noted that four inputs have been chosen for the purpose of obtaining a relatively simple explanation of the basic principles of the invention. Many other numbers of inputs may be applicable.
- Each of the inputs is directed to an early pattern generator 26, 27 and 28.
- Each early pattern generator generates a processed signal specifically established and chosen for the source input S1, S2, S3 and S4.
- the processed signals are established as a signal composed of seven signal components d1, d2, d3, d4, d5, d6 and d7.
- the seven signal components represent a directional signal representation of the established sound and the established signal contains both the direct sound and the initial reverberation sound.
- a possible embodiment of the invention implies a five channel rendering of a 10-directional signal where the directions of the input signal format are 0, +/-15, +/-30, +/-70, +/-110 and 180 degrees, and the intended locations of the five corresponding loudspeakers are 0, +/-30 and +/-110 degrees according to ITU 775.
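For the ten directions and five loudspeaker positions listed here, one way to fill a gain matrix is pairwise constant-power panning between the two loudspeakers adjacent to each direction. This is an assumption chosen for illustration, not the mapping used by the invention; the function name `pairwise_pan_matrix` is hypothetical.

```python
import math


def pairwise_pan_matrix(directions_deg, speakers_deg):
    """Return gains[x][m]: gain from direction m to loudspeaker x.

    Each direction is shared between the two loudspeakers that bracket
    it on the circle, using a sine/cosine law so the squared gains of
    every direction sum to one. Needs at least two loudspeakers.
    """
    # Loudspeaker indices ordered by angle around the circle.
    order = sorted(range(len(speakers_deg)),
                   key=lambda i: speakers_deg[i] % 360.0)
    gains = [[0.0] * len(directions_deg) for _ in speakers_deg]
    for m, direction in enumerate(directions_deg):
        for k, lo in enumerate(order):
            hi = order[(k + 1) % len(order)]
            arc = (speakers_deg[hi] - speakers_deg[lo]) % 360.0 or 360.0
            offset = (direction - speakers_deg[lo]) % 360.0
            if offset < arc:  # direction lies between lo and hi
                frac = offset / arc
                gains[lo][m] = math.cos(frac * math.pi / 2)
                gains[hi][m] = math.sin(frac * math.pi / 2)
                break
    return gains


directions = [0, 15, -15, 30, -30, 70, -70, 110, -110, 180]
speakers = [0, 30, -30, 110, -110]
G = pairwise_pan_matrix(directions, speakers)
```

Directions that coincide with a loudspeaker (0, +/-30, +/-110 degrees) map to that loudspeaker alone, while the 180 degree direction is split equally between the two rear loudspeakers.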
- a preferred embodiment comprises more than 20 directions.
- each of the inputs S1, S2, S3 and S4 may refer to mutually different locations of the input source to which the early pattern is generated.
- the signals from each source are summed in summing unit 29.
- the signals d1, ..., d7 may comprise tail sound components or even the whole tail sound. It should nevertheless be emphasised that according to the preferred embodiment of the invention such tail sound may advantageously be generated according to a relatively simple panning algorithm and subsequently added to the established summed initial sound signal as the established summed initial sound comprises the dominating room determining effects.
- fig. 3b illustrates the basic functioning of the direction rendering unit 201.
- the type of multi-channel representation is a selectable parameter, both with respect to number of applied channels and to the type of speaker setup and the individual speaker characteristics.
- the conversion into a given desired P channel representation may be effected in several different ways, such as HRTF (head related transfer function) based techniques, the technique known as Ambisonics, VBAP (vector based amplitude panning) or a purely experience based subjective mapping.
- in fig. 3c, module 20C is illustrated as having an input from each of the source inputs S1, S2, S3 and S4.
- the signals are fed to a reverb feed matrix 202 having five outputs, corresponding to the chosen channel number of the direction rendering unit 201.
- the five channel outputs are fed to a reverberation unit 203 providing a five channel output of subsequent reverberation signals.
- the reverb feed matrix 202 comprises relatively simple signal pre-processing means (not shown) setting the gain, delay and phase of each input's contribution to each reverb signal and may also comprise filtering pre-processing means.
- the reverberation unit 203 establishes the desired diffuse tail sound signal by means of five tank circuits (not shown) and outputs the resulting sound signal to be added to the already established space processed initial sound signal.
- the tail sound generating means are added using almost no space processing due to the fact that a space processing of the tail sound signal according to the diffuse nature of the signal has little or no effect at all. Consequently, the complexity of the overall algorithm may be reduced when adding the tail sound separately and making the tuning much easier.
- tail-sound provides a more natural diffuse tail-sound due to the fact that the distinct comb-filter effect of the early pattern generator should preferably only be applied to the initial pattern in order to provide naturalness.
- both the initial sound and the sound tail of each sound may of course be located within an artificial room and subsequently summed in a summing unit.
- the early pattern generator is one of four according to the above described illustrative embodiment of fig. 2, and each generator comprises a dedicated source input S1, S2, S3 and S4.
- the shown early pattern generator 26 comprises a source input S1.
- the source input is connected to a matrix of signal processing means.
- the shown matrix basically comprises three rows of signal processing lines, which are processed by shared diffusers 41, 42.
- the upper row is fed directly from the input S1
- the second row is fed through the diffuser 41
- the third row is fed through both diffusers 41 and 42.
- Each row of the signal processing circuit comprises colour filters 411, 412, 413; 421, 422, 423; 431, 432, 433.
- each row comprises delay lines 4111, 4121 and 4131 which are serially connected to the colour filters 411, 412, 413.
- each column may be tapped via level and phase controllers such as 4000, 4001 and 4002. It should be noted that each level-phase controller 4000, 4001 and 4002 is tap specific.
- the initial pattern generator 26 comprises a matrix which may comprise several sets of predefined presets by which a certain desired room may be emulated.
- signals of the current predefined room emulation are tapped to the directional signal representation of the present sound source S1.
- four signal lines are tapped to seven directional signal components.
- One signal, N13 of row 1, column 3, is fed to sound component 1
- one signal, N21, is fed to signal component 3
- two signals, N11 and N22, are added to the sound component 4.
- each tapped signal has consequently been processed through one of three combinations of diffusers, one of three types of predefined colour filters EQ, a freely chosen length of delay line and a freely chosen level and phase output.
- a separate row with a level-phase controller 4002 should be tapped and determine the direct sound.
- the location of both the direct sound as well as the corresponding EPG and reverberation sound signals may be mapped into the sound signal representation completely similar to the desired directionality irrespective of directional resolution and complexity.
- the directional signal representation components usually comprise signals fed to each component 1-7 and not only the illustrated three.
- the chosen topology of the early pattern generator within the scope of the invention may be chosen from a set of more or less equivalent topologies.
- the signal modifying components may be varied, if e.g. a certain degree of tail-sound is added before or after tapping.
- the illustrated early pattern generator comprises linear systems
- the components e.g. the colour filters EQ may be interchanged with the diffusers DIF.
- Fig. 4b illustrates a further possible embodiment of the early pattern generator, comprising colour filters EQ placed in the feed line to each row and diffusers DIF placed in each column in each row.
- the number of columns and rows may vary depending on the system requirements. In a possible embodiment only one column of delay lines with corresponding colour filters or diffusers is utilised.
- additional components, additional diffusers, additional different types of colour filters, etc. may be chosen.
- the number of directions, i.e. signal components should be not less than twelve, and the established reflections of each early pattern generator should not be less than 25.
- the basic presetting of each early pattern generator may initially be determined by known commercially available ray tracing or room mirroring tools, such as ODEON.
- Fig. 5 illustrates a two-dimensional system of co-ordinates, with the axes labeled 'x' 910 and 'y' 911.
- the axes 910, 911 are perpendicular to each other, and both are parallel to the ground, i.e. they are arranged in a horizontal plane.
- a head of a person 912 is placed, with the nose pointing in the same direction as the x-axis 910.
- a circle 913 with a radius of one unit and its center at the system's origin is drawn in the plane of the x- and the y-axes.
- Fig. 6a illustrates an audio signal format. It shows again the two dimensional system of co-ordinates of figure 5. It comprises the axes x 910 and y 911, and the unit circle 913. The intersection 912 of the two axes 910, 911 represents the position of the head 912 of figure 5.
- FIG. 6 comprises twelve vectors 920, all beginning at the system's origin and pointing towards the unit circle 913, all having the length of one unit.
- the twelve vectors 920 (d1 ... d12) are evenly distributed around the circle 913, so that the angle 921 between two neighbor vectors is the same regardless of which vector is chosen.
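With twelve evenly distributed vectors, the angle between neighbors is 360/12 = 30 degrees. The short sketch below generates such unit vectors; the function name is hypothetical, and placing the first vector on the x-axis (the direction of the nose in figure 5) and counting counter-clockwise are assumptions, since the text does not state which vector is d1.

```python
import math


def uniform_directions(n):
    """Return n unit vectors (x, y) evenly spaced around the unit circle."""
    step = 2.0 * math.pi / n
    return [(math.cos(k * step), math.sin(k * step)) for k in range(n)]


vectors = uniform_directions(12)
spacing_deg = 360.0 / len(vectors)
```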
- Incoming sounds may be defined by these vectors, the direction of the vectors representing the direction of the incoming sounds (or rather the direction from which the incoming sound signals are coming), and a number representing the magnitude or amplitude of the incoming sound signals.
- the number of vectors 920 (twelve) is only an example. It is possible to comprise any number of vectors, as long as the number of vectors is sufficient to define the incoming sounds satisfactorily.
- a preferred embodiment would comprise more than 25 vectors or directions, for example 30, 40, 50 or even more.
- the higher the number of vectors the higher resolution of direction is achieved. And the higher resolution of direction, the more accurate source localization is achieved. For sounds coming from sources placed in front of the head, the human beings are capable of distinguishing directional differences, as small as 3 degrees. This is the so-called localization blur.
- the illustrated angle 921 between two neighbor vectors is only an example. It is possible to apply any principle of vector distribution around the circle, including uniform distribution and experience based, e.g. psycho-acoustic based, distribution.
- as the ears are not quite as good at distinguishing the direction of sounds coming from behind the head or from the sides of the head as they are for sounds coming from in front of the head, it is advantageous to use a distribution with fewer vectors behind the head than in front of the head. This gives a less accurate localization behind the head, but a human listener will normally not be able to tell the difference anyway.
- a preferred embodiment using this distribution principle is shown in fig. 6b.
- Another distribution of the vectors 920 could be based on measures of the density of different sounds in all possible directions. The vectors could then be distributed with small angles between them in direction sectors with high density of directional sound signals and with larger angles between them in direction sectors with low density of directional sound signals.
- a further way of distributing the vectors around the circle could be based on human impressions.
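One way to picture a non-uniform distribution is to place the vectors so that each covers an equal share of some density over azimuth, whether that density reflects hearing acuity or the measured density of sound. The sketch below is only an illustration: the density function, the grid resolution and the convention that 0 degrees means straight ahead are assumptions, not taken from the specification.

```python
import math

def directions_from_density(density, n, grid=3600):
    """Place n azimuths (degrees in [-180, 180)) at equal-mass quantiles of a
    positive density, so dense sectors get closely spaced vectors."""
    width = 360.0 / grid
    centres = [-180.0 + (i + 0.5) * width for i in range(grid)]
    weights = [density(a) for a in centres]
    total = sum(weights)
    targets = [(k + 0.5) * total / n for k in range(n)]
    azimuths, cumulative, t = [], 0.0, 0
    for a, w in zip(centres, weights):
        cumulative += w
        while t < n and cumulative >= targets[t]:
            azimuths.append(a)
            t += 1
    return azimuths

def front_weighted(azimuth_deg):
    """Hypothetical density: highest straight ahead (0 degrees), lowest behind."""
    return 1.5 + math.cos(math.radians(azimuth_deg))

azimuths = directions_from_density(front_weighted, 12)
front = sum(1 for a in azimuths if abs(a) < 90.0)
print(front, len(azimuths) - front)  # more vectors in front than behind
```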
- FIG. 7 defines such a room. It is a three-dimensional system of co-ordinates, with the axes labeled 'x' 930, 'y' 931 and 'z' 932.
- the axes 930-932 are perpendicular to each other, that is: each axis is perpendicular to the other two.
- A head of a person 912 is placed with the nose pointing in the same direction as the x-axis 930. Further, two circles 946a and 946b have been added.
- Circles 946a and 946b are parallel to the unit circle 933 but displaced along the z-axis in such a way that their centers still lie on the z-axis. Furthermore, the distance from the system's origin 912 to any point on the two circles 946a and 946b is exactly one unit, as the circles lie on a sphere centered at the system's origin.
- The circle 933 lying in the x-y-plane is called the middle plane circle.
- the circle 946a displaced along the z-axis in the positive direction is called the upper circle.
- the circle 946b displaced along the z-axis in the negative direction is called the lower circle.
- Fig. 9 shows an embodiment of an audio format comprising three dimensions. It comprises the elements of figure 8. Further, it comprises a number of vectors 920, pointing from the system's origin 912 to the middle circle 933. These vectors 920 are comparable to the vectors of the two-dimensional audio format of figure 8a.
- FIG. 9 comprises a number of vectors 960a pointing from the system's origin 912 to the upper circle 946a and a number of vectors 960b pointing from the system's origin 912 to the lower circle 946b.
- Fig. 10a shows the upper circle 946a and its corresponding vectors 960a of the three-dimensional directional audio format.
- the angle 951a indicates the displacement of the upper circle from the x-y-plane.
- fig. 10a comprises an angle 971a. It indicates the rotation of a vector 960a from the direction of the x-axis 930, with axis of rotation at the z-axis 932.
- Fig. 10b shows the middle circle 933 and its corresponding vectors 920 of the three-dimensional directional audio format.
- The angle 921 indicates the angular distance between two vectors 920.
- Fig. 10c shows the lower circle 946b and its corresponding vectors 960b of the three-dimensional directional audio format.
- the angle 951b indicates the displacement of the lower circle from the x-y-plane.
- the angle 971b indicates the rotation of a vector 960b from the direction of the x-axis 930.
- Taken together, figs. 10a-10c make up fig. 9.
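The geometry of the three circles can be sketched with ordinary spherical co-ordinates: the displacement angle (951a/951b) acts as an elevation and the rotation angle (971a/971b) as an azimuth measured from the x-axis. The elevation of 45 degrees and the vector counts below are assumptions chosen only for illustration.

```python
import math

def ring(elevation_deg, count):
    """Unit vectors on a circle parallel to the x-y-plane.

    elevation_deg stands in for angles 951a/951b (positive = upper circle);
    the azimuth of each vector stands in for angles 971a/971b.
    """
    el = math.radians(elevation_deg)
    vectors = []
    for k in range(count):
        az = 2.0 * math.pi * k / count
        vectors.append((math.cos(el) * math.cos(az),
                        math.cos(el) * math.sin(az),
                        math.sin(el)))
    return vectors

# Upper circle, middle circle and lower circle combined into one layout.
layout = ring(45.0, 6) + ring(0.0, 12) + ring(-45.0, 6)
print(len(layout))  # 24
```

Every vector ends on the unit sphere, so its end point lies on a circle centered on the z-axis.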
- The number of vectors corresponding to each circle and the number of circles are only examples, and any number of vectors and circles is within the scope of the invention.
- A preferred embodiment would also comprise fewer vectors pointing towards each of the upper or lower planes than to the middle plane. This is because the highest resolution of vectors wanted, and also usable to the human being, is in the middle plane.
- An upper circle and/or a lower circle may be situated near the positive part (i.e. the angle 951a being close to 90°) and the negative part (i.e. the angle 951b being close to 90°) of the z-axis, respectively, and contain only a few vectors.
- Such upper and/or lower circles could even be defined as points on the z-axis, whereby only one vector would correspond to each of these upper and/or lower "circles", that is, a vector located along the positive direction of the z-axis and/or a vector located along the negative direction of the z-axis.
- The audio format could be defined by a middle circle as described above in combination with a vector along the positive part of the z-axis and optionally a vector along the negative part of the z-axis.
- the distribution of vectors around a circle is also just an example.
- Many distribution principles are imaginable and applicable and hence within the scope of this invention. This could be a uniform distribution principle as shown in drawings 9 and 10a-10c, an experience-based distribution principle, a distribution principle based on measures of the localization blur in different directions as shown for the two-dimensional system in fig. 6b, or a distribution principle based on measures of the portion of sound gradients in different areas for a specific room and situation.
- The vectors could be placed in other manners than with their ends on a circle, especially the vectors that are placed at an angle in relation to the x-y-plane.
- The angle 951a or 951 (fig. 8) in relation to the x-y-plane could vary as a function of the angle 921 (fig. 6a) if found appropriate, whereby the ends of the vectors could be situated on, for example, non-circular curves on the surface of the unit sphere.
- A number of signals, preferably audio signals, s1 - sM are provided in the signal format according to the invention, e.g. comprising N directional components according to the same directional format. It should be noted that not all of these components actually need to contain any signal, as the signal format must be expected to comprise a relatively large number of directional components in order to be able to represent the involved signal sources satisfactorily. Thus some (or even a large number) of the components of the actual signal sources may be zero.
- The audio signals s1 - sM may be recordings of, or (microphone) signals stemming from, single musical instruments, groups of instruments, singers etc., or the signals s1 - sM may be other forms of signals, which will have to be combined to represent a resulting audio signal or other forms of signals.
- The source signals s1 - sM are directed to a signal processing unit 972 or 982, said signal processing units serving to combine the source signals s1 - sM and to provide an output signal 973 or 983, respectively, which is also represented in the audio signal format according to the invention, e.g. comprising N directional components, said components having the same directions as the components used for representing the source signals s1 - sM.
- The processing involved for combining the source signals is a summing of the corresponding components of each source signal s1 - sM.
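- The summing is carried out as a simple adding of each signal component, i.e.

$$
\begin{aligned}
\Sigma d_1 &= d_1(s_1) + d_1(s_2) + \dots + d_1(s_M) \\
\Sigma d_2 &= d_2(s_1) + d_2(s_2) + \dots + d_2(s_M) \\
\Sigma d_3 &= d_3(s_1) + d_3(s_2) + \dots + d_3(s_M) \\
\Sigma d_4 &= d_4(s_1) + d_4(s_2) + \dots + d_4(s_M) \\
&\;\;\vdots \\
\Sigma d_N &= d_N(s_1) + d_N(s_2) + \dots + d_N(s_M)
\end{aligned}
$$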
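The component-wise summing can be sketched as follows. The nested-list layout (source, then directional component, then sample) and the function name are assumptions made for the example. Components that carry no signal are simply all zeros.

```python
def combine_sources(sources):
    """Sum M sources component by component.

    sources[m][n] is the sample sequence of directional component d(n+1)
    of source s(m+1). All sources share the same N directions and length.
    """
    n_dirs = len(sources[0])
    n_samples = len(sources[0][0])
    combined = [[0.0] * n_samples for _ in range(n_dirs)]
    for source in sources:
        for n, component in enumerate(source):
            for t, sample in enumerate(component):
                combined[n][t] += sample
    return combined

s1 = [[1.0, 0.5], [0.0, 0.0], [0.0, 0.0]]    # only d1 carries signal
s2 = [[0.0, 0.0], [0.25, 0.25], [0.0, 0.0]]  # only d2 carries signal
s3 = [[0.5, 0.5], [0.0, 0.0], [0.0, 0.0]]    # only d1 carries signal
print(combine_sources([s1, s2, s3]))  # [[1.5, 1.0], [0.25, 0.25], [0.0, 0.0]]
```

Because the output has the same N directions as the inputs, the result can itself be fed into a further combining stage.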
- Further signal processing may be performed by the signal processing units 972 and 982, including additional processing that does not primarily serve to combine the components of the source signals but to amend, equalize, add reverberation to, etc. the resulting signal. Preferably, however, such additional processing is performed in a later stage or stages of the signal processing chain, but before the final direction rendering unit (DRU) maps the signals to the available sound reproducing system, e.g. the loudspeaker system.
- The basic function of the direction rendering unit will thus be to map the N directional signal outputs 973 and 983 from units 972 or 982 into a chosen multi-channel representation, according to the available speaker set-up.
- The conversion into a given desired P-channel representation may be effected in several different ways, such as HRTF-based (head-related transfer function) techniques, the technique known as Ambisonics, VBAP (vector based amplitude panning) or a purely experience-based subjective mapping.
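To make the mapping from N directions to P loudspeakers concrete, the following is a minimal sketch of pairwise amplitude panning in the horizontal plane, in the spirit of VBAP. It is not the rendering method of the invention: the function names, the square loudspeaker layout and the requirement that neighboring loudspeakers be less than 180 degrees apart are assumptions of the sketch.

```python
import math

def vbap_2d_gains(direction_deg, speaker_degs):
    """Gains for one direction over a ring of loudspeakers (2-D pairwise panning).

    Assumes speaker_degs is sorted ascending in [0, 360) and neighbouring
    loudspeakers are less than 180 degrees apart.
    """
    theta = math.radians(direction_deg)
    p = (math.cos(theta), math.sin(theta))
    gains = [0.0] * len(speaker_degs)
    for i in range(len(speaker_degs)):
        j = (i + 1) % len(speaker_degs)
        a, b = math.radians(speaker_degs[i]), math.radians(speaker_degs[j])
        l1 = (math.cos(a), math.sin(a))
        l2 = (math.cos(b), math.sin(b))
        # Solve p = g1 * l1 + g2 * l2 for the pair (2x2 linear system).
        det = l1[0] * l2[1] - l1[1] * l2[0]
        g1 = (p[0] * l2[1] - p[1] * l2[0]) / det
        g2 = (l1[0] * p[1] - l1[1] * p[0]) / det
        if g1 >= -1e-9 and g2 >= -1e-9:  # direction lies between this pair
            norm = math.hypot(g1, g2)    # normalise to constant power
            gains[i], gains[j] = max(g1, 0.0) / norm, max(g2, 0.0) / norm
            return gains
    raise ValueError("no loudspeaker pair encloses the direction")

def rendering_matrix(direction_degs, speaker_degs):
    """P x N matrix: column n holds the loudspeaker gains of direction n."""
    columns = [vbap_2d_gains(d, speaker_degs) for d in direction_degs]
    return [[col[p] for col in columns] for p in range(len(speaker_degs))]

def render(matrix, components):
    """Map one sample frame of N directional components to P channel samples."""
    return [sum(g * c for g, c in zip(row, components)) for row in matrix]

speakers = [45.0, 135.0, 225.0, 315.0]        # hypothetical square set-up
directions = [30.0 * k for k in range(12)]    # N = 12 uniform directions
matrix = rendering_matrix(directions, speakers)
```

A direction that coincides with a loudspeaker is fed to that loudspeaker alone, while a direction midway between two loudspeakers is shared equally between them.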
- Fig. 13 shows a system according to a further aspect of the multi-channel audio format and processing method of this invention. It shows a model of a reverberation unit.
- a number of units 9101a-9101c called room simulator units, calculate how the sound emitted from a source 9100a-9100c will be heard at the listener's position, including reflections from the room.
- These room simulator units may for example be early pattern generators (EPGs), which will be assumed to be the case in the following.
- the EPGs should calculate the resulting sound from as many directions as possible, and this result 9102a-9102c could then be sent on in the audio format of this invention.
- the result of each EPG should in any suitable way be added to form the final sound heard at the listener's position, and this addition could be made according to the audio signal processing method 9103 of this invention.
- the result 9104 from passing the outputs from the EPGs through the processing method, would be in the multi-channel audio format of this invention.
- the sound in each direction of this format would have to be mapped to the available loudspeakers 9106a-9106b or channels chosen by the user. This mapping is performed by a direction rendering unit (DRU) 9105.
- each vector could represent the method of calculating the sound coming from that particular direction.
- Each vector would comprise a listing of partial sound signals as a function of time and as a function of the actual input signals. The partial sound signals in each vector are then summed.
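One possible reading of this is that each direction holds a list of delayed and attenuated copies of the input, which are summed per direction. The sketch below assumes that reading. The reflection list, delays and gains are invented for illustration and do not come from the specification.

```python
def early_pattern(input_signal, reflections, n_dirs, out_len):
    """Sum partial sound signals per direction.

    reflections is a list of (direction_index, delay_samples, gain); each
    entry contributes a delayed, scaled copy of the input to one direction.
    """
    out = [[0.0] * out_len for _ in range(n_dirs)]
    for direction, delay, gain in reflections:
        for t, x in enumerate(input_signal):
            if t + delay < out_len:
                out[direction][t + delay] += gain * x
    return out

impulse = [1.0, 0.0, 0.0]
pattern = [(0, 0, 1.0),   # direct sound in direction d1
           (3, 2, 0.5),   # two reflections arriving from direction d4
           (3, 4, 0.25),
           (9, 3, 0.3)]   # one reflection from direction d10
components = early_pattern(impulse, pattern, n_dirs=12, out_len=6)
print(components[3])  # [0.0, 0.0, 0.5, 0.0, 0.25, 0.0]
```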
- Fig. 14 illustrates a recording, distribution and/or reproducing system according to a further aspect of the invention, said system utilizing the signal format according to the first aspect of the invention.
- A number of source signals s1 - sM, each comprising a number of directional components d1 - dN, are led to a pre-processing unit (PPM) 9111.
- The input source signals, which preferably may be audio signals, may each stem from a single musical instrument, a singer or another source of audio signals, or may stem from a group of instruments, a group of singers, a group of other audio sources or combinations of these. These signals may have been generated using commonly available methods and equipment, e.g. microphones, while simultaneously producing the directional components of the signals. Further, the input source signals s1 - sM, or some of these, may have been generated using audio generators and/or simulators, for example room simulators, early pattern generators (EPGs) etc. as indicated above.
- A processing of the input source signals s1 - sM takes place.
- $\Sigma d_N = d_N(s_1) + d_N(s_2) + d_N(s_3) + \dots + d_N(s_M)$, whereby the output signal IF from the pre-processing unit (PPM) 9111 will be generated.
- amendments of the format of the signals may take place, e.g. reduction of the number of directional components, canceling of certain directional components, addition of certain components, which may contain information relating to the recorded or processed signals.
- The resulting output signal IF from the pre-processing unit (PPM) 9111 thus comprises a number T of signal components, of which most or all are directional components.
- The number T may equal the number N of directional components in each of the input source signals s1 - sM, or T may be larger than or less than N.
- the output signal IF constitutes an intermediate signal format, which is suitable for storing, transmission or otherwise distribution of the recorded signals, preferably audio signals, while simultaneously retaining as much detailed information about the input signals as possible.
- The output signal IF may thus be stored on any suitable form of storing media such as CDs, DVDs or static storing media of the electronic, magnetic, optical etc. variety, as illustrated in fig. 15 by the storing means 9115. Further, the output signal IF may be transmitted in any suitable manner, for example by distribution via the Internet or other suitable means.
- The T-channel signal IF is received by user processing means (UPM) 9112, which may be incorporated in the reproducing system or apparatus at the end-user, for example when a storing medium such as a CD or a DVD is played on the apparatus of the end-user, or when a signal IF is received via electronic, electromagnetic or optical communication means, for example via the Internet, by the reproducing system at the end-user.
- The reproducing equipment at the end-user will not be capable of reproducing the relatively large number of directional components, i.e. channels, comprised in the signal IF, but will be able to reproduce the signals, preferably the audio signals, in one or more of the commonly used bi- or multi-channel systems, e.g. stereo system, 3-, 4- or 5-channel systems, surround systems etc. or in a user-specified set-up.
- the user processing means (UPM) 9112 therefore comprises means, for example in the form of a decoder, for transforming the T-channel signal IF into a signal containing a suitable number of channels, which may be reproduced by the end-user equipment.
- The user processing means UPM may be preconfigured to transform the received signal from a certain number of channels T to a certain number of channels to be reproduced by the end-user equipment.
- The user processing means UPM may be configured to determine the number of channels T incorporated in the received signal IF and to perform a transformation from this number of channels to a certain number of channels to be reproduced by the end-user equipment.
- the user processing means UPM may be able to perform a transformation to one of two or more different reproducing systems UF1 - UFk, containing a different number of channels, in dependence upon an active choice by the user or in dependence upon other factors, such as parameters of the received signal IF.
- The user processing means UPM of fig. 14 is configured to reproduce the audio input signals in a specific reproducing system labeled UF3, which may be any of the relatively large number of reproducing systems (labeled UF1 - UFk in fig. 14) that are available at present, and which may reproduce the signals via loudspeaker systems using stereo systems, 3-, 4- or 5-channel systems and/or user-specified set-ups etc.
- the user processing means UPM may further comprise other means for processing the received signal IF, for example means for equalizing, other means for coloring the signals, delaying means, adding reverberation, dynamic processing etc. These further processing steps may be designated in relation to the actual type and character of the user reproducing equipment, e.g. in order to achieve an optimal sound reproduction.
- The transformation of the number of channels T contained in the received signal IF to the number of channels that may be reproduced by the actual end-user equipment may be performed using a linear transformation of the received signal components, for example by a matrix operation. More complicated operations may be performed to achieve more sophisticated and detailed results of the transformation.
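The matrix operation can be sketched as a gain matrix applied to each frame of T component samples, with one matrix per supported reproducing system. The decoder names, the T = 4 layout (front, left, back, right) and all gain values below are hypothetical and chosen only to show the shape of the computation.

```python
def apply_matrix(matrix, frame):
    """Multiply a (channels x T) gain matrix by one frame of T component samples."""
    if any(len(row) != len(frame) for row in matrix):
        raise ValueError("matrix width must equal the number of received channels T")
    return [sum(g * x for g, x in zip(row, frame)) for row in matrix]

# Hypothetical decoding matrices for a T = 4 signal whose components point
# front, left, back and right. The gain values are illustrative only.
DECODERS = {
    "mono":   [[0.5, 0.5, 0.5, 0.5]],
    "stereo": [[0.7, 1.0, 0.7, 0.0],   # left loudspeaker
               [0.7, 0.0, 0.7, 1.0]],  # right loudspeaker
}

def user_processing(frame, output_format):
    """Transform one T-channel frame to the chosen reproducing format."""
    return apply_matrix(DECODERS[output_format], frame)

print(user_processing([0.0, 1.0, 0.0, 0.0], "stereo"))  # [1.0, 0.0]
```

Checking the matrix width against the frame length corresponds to the case where the UPM determines T from the received signal before transforming it.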
- The input signals s1 - sM may originate from microphones or similar transducers producing electric signals 9110a - 9110M. These signals are processed by signal processing means 9114a - 9114M, whereby the signals s1 - sM, each having N signal components according to the signal format of the invention, are produced.
- the signal processing means 9114a - 9114M may for example be room simulators, early pattern generators etc. Some or all of the input signals for these signal processing means or for the pre-processing means PPM 9111 may also have been produced artificially, for example by electronic music instruments or signal generators.
- FIG. 16 illustrates a system which corresponds to the system described in connection with fig. 14, but where a further processing means, an intermediate processing means (IPM) 9113, has been added.
- This further processing means IPM provides an intermediate processing of the multi-channel signal IF having T channels (or directional components), whereby the number of channels may be reduced to a number V.
- the output signal IF' having V channels (or directional components) may be stored on any suitable storing media 9116 or may otherwise be transmitted or distributed to the end-users, represented by the end-user processing means UPM, where it may be reproduced as described in connection with fig. 14.
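The text does not say how the reduction from T to V channels is carried out. One simple possibility, shown here purely as an assumption, is to sum groups of neighboring directional components, which requires T to be a multiple of V.

```python
def reduce_components(frame, v):
    """Reduce T directional samples to V by summing groups of T // V neighbours.

    This grouping rule is an assumption of the sketch, not a method stated
    in the specification.
    """
    t = len(frame)
    if v <= 0 or t % v != 0:
        raise ValueError("this sketch requires T to be a multiple of V")
    group = t // v
    return [sum(frame[k * group:(k + 1) * group]) for k in range(v)]

print(reduce_components([1.0, 0.0, 0.5, 0.5, 0.0, 0.0, 0.0, 2.0], 4))  # [1.0, 1.0, 0.0, 2.0]
```

Merging neighbors lowers the directional resolution of the distributed signal IF' while keeping the same kind of directional format.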
- the output signal IF from the pre-processing unit (PPM) 9111 may also be stored on any suitable storing media (not shown in fig. 16) or may otherwise be transmitted or distributed.
- The intermediate processing may be performed by or for a record company, a record distributor, a distribution network etc. which has a need to distribute, or may achieve an advantage by distributing, signals with fewer channels or a specific number of channels. This may thus be performed by the IPM without influencing the other parts of the system, e.g. the system as shown in fig. 14 will function without regard to the system shown in fig. 15.
- Fig. 17 shows different modules of a rendering system according to the invention.
- the system comprises an independent pre-rendering stage PS.
- the pre-rendering stage PS comprises a data carrier reading unit DCRU.
- the unit may e.g. be a DVD player adapted for reading data stored in a DVD or it may e.g. comprise a solid state memory interface.
- The data carrier reading unit DCRU is connected to a direction rendering unit DRU by means of an M-channel interface.
- The direction rendering unit DRU comprises method storing means MSM and a signal processing unit SPU adapted for transforming the M-channel input into an N-channel output according to a rendering method stored in the said method storing means MSM.
- the direction rendering unit may e.g. perform a relatively simple mapping of the input channels into the output channels by means of a gain matrix.
- the rendering method may e.g. be established by means of vector based amplitude panning.
- the rendering unit may be adapted for exchanging the rendering method by means of software modifications.
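The relationship between the method storing means and the signal processing unit can be pictured as a small store of named gain matrices, one of which is active at a time and can be exchanged in software. The class, its method names and the matrix values are hypothetical. Note that in this part of the description M denotes the number of input channels and N the number of output channels.

```python
class DirectionRenderingUnit:
    """Holds named gain matrices (the method store) and applies the active one."""

    def __init__(self):
        self._methods = {}
        self._active = None

    def store_method(self, name, matrix):
        """Add or replace a rendering method (an N x M gain matrix)."""
        self._methods[name] = matrix

    def select(self, name):
        """Make a stored rendering method the active one."""
        if name not in self._methods:
            raise KeyError(name)
        self._active = name

    def process(self, frame):
        """Map one frame of M input samples to N output samples."""
        matrix = self._methods[self._active]
        return [sum(g * x for g, x in zip(row, frame)) for row in matrix]

dru = DirectionRenderingUnit()
dru.store_method("stereo_wide", [[1.0, 0.5, 0.0], [0.0, 0.5, 1.0]])
dru.store_method("stereo_narrow", [[0.8, 0.5, 0.2], [0.2, 0.5, 0.8]])
dru.select("stereo_wide")
print(dru.process([1.0, 0.0, 0.0]))  # [1.0, 0.0]
```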
- The N-channel output of the pre-rendering unit DCRU is then fed to output channel connectors (OC).
- the illustrated pre-rendering unit DCRU comprises only two signal output connectors.
- the pre-rendering unit DCRU is specifically adapted for transforming the input channels into a two-channel signal representation, such as stereo.
- the outputs of the pre-rendering unit DCRU are then fed to a traditional stereo amplifier RA connected with two loudspeakers LS.
- The pre-rendering stage may thus be adapted to fit an arbitrary combination of amplifiers and loudspeakers, e.g. a surround sound system.
- A pre-rendering unit DCRU may also comprise a larger number of output connectors, e.g. ten.
- The pre-rendering unit DCRU may then, under control of the stored rendering method, apply 1 to 10 physical outputs.
- The amplifier means may be incorporated in the pre-rendering unit DCRU within the scope of the invention.
- The direction rendering unit DRU may also comprise a set of rendering methods stored in the method storing means.
- The rendering methods may differ both with respect to basic properties determined by the number of output channels and with respect to the intended positioning of the loudspeakers of the rendering system.
- the illustrated stereo rendering system may comprise e.g. twenty different predefined "stereo" variants with respect to e.g. the positioning of the loudspeakers in the room or e.g. room characteristics.
- Such rendering may e.g. imply phase modification, equalizing, sound coloring, sound compression or the like.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Stereophonic System (AREA)
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU64271/00A AU6427100A (en) | 1999-08-09 | 2000-08-09 | Multi-channel processing method |
EP00951275A EP1203364A1 (en) | 1999-08-09 | 2000-08-09 | Multi-channel processing method |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP99202585A EP1076328A1 (en) | 1999-08-09 | 1999-08-09 | Signal processing unit |
EP99202585.8 | 1999-08-09 | ||
EP00201759A EP1158486A1 (en) | 2000-05-18 | 2000-05-18 | Method of processing a signal |
EP00201759.8 | 2000-05-18 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2001011602A1 true WO2001011602A1 (en) | 2001-02-15 |
Family
ID=26072251
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/DK2000/000443 WO2001011602A1 (en) | 1999-08-09 | 2000-08-09 | Multi-channel processing method |
Country Status (3)
Country | Link |
---|---|
EP (1) | EP1203364A1 (en) |
AU (1) | AU6427100A (en) |
WO (1) | WO2001011602A1 (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4731848A (en) * | 1984-10-22 | 1988-03-15 | Northwestern University | Spatial reverberator |
EP0593228A1 (en) * | 1992-10-13 | 1994-04-20 | Matsushita Electric Industrial Co., Ltd. | Sound environment simulator and a method of analyzing a sound space |
US5452360A (en) * | 1990-03-02 | 1995-09-19 | Yamaha Corporation | Sound field control device and method for controlling a sound field |
US5585587A (en) * | 1993-09-24 | 1996-12-17 | Yamaha Corporation | Acoustic image localization apparatus for distributing tone color groups throughout sound field |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB394325A (en) * | 1931-12-14 | 1933-06-14 | Alan Dower Blumlein | Improvements in and relating to sound-transmission, sound-recording and sound-reproducing systems |
-
2000
- 2000-08-09 WO PCT/DK2000/000443 patent/WO2001011602A1/en active Application Filing
- 2000-08-09 EP EP00951275A patent/EP1203364A1/en not_active Withdrawn
- 2000-08-09 AU AU64271/00A patent/AU6427100A/en not_active Abandoned
Non-Patent Citations (2)
Title |
---|
ROCCHESSO D ET AL: "CIRCULANT AND ELLIPTIC FEEDBACK DELAY NETWORKS FOR ARTIFICIAL REVERBERATION", IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING,US,IEEE INC. NEW YORK, vol. 5, no. 1, 1997, pages 51 - 63, XP000785329, ISSN: 1063-6676 * |
See also references of EP1203364A1 * |
Also Published As
Publication number | Publication date |
---|---|
EP1203364A1 (en) | 2002-05-08 |
AU6427100A (en) | 2001-03-05 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AK | Designated states |
Kind code of ref document: A1 Designated state(s): AE AG AL AM AT AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ CZ DE DE DK DK DM DZ EE EE ES FI FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW |
|
AL | Designated countries for regional patents |
Kind code of ref document: A1 Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | ||
DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101) | ||
WWE | Wipo information: entry into national phase |
Ref document number: 2000951275 Country of ref document: EP |
|
WWP | Wipo information: published in national office |
Ref document number: 2000951275 Country of ref document: EP |
|
WWE | Wipo information: entry into national phase |
Ref document number: 10049417 Country of ref document: US |
|
REG | Reference to national code |
Ref country code: DE Ref legal event code: 8642 |
|
NENP | Non-entry into the national phase |
Ref country code: JP |