WO1994016537A1 - Stereophonic manipulation apparatus and method for sound image enhancement - Google Patents

Stereophonic manipulation apparatus and method for sound image enhancement

Info

Publication number
WO1994016537A1
Authority
WO
WIPO (PCT)
Prior art keywords
signal
output
input
audio
lines
Prior art date
Application number
PCT/US1992/011335
Other languages
French (fr)
Inventor
Stephen W. Desper
Original Assignee
Desper Products, Inc.
Priority date
Filing date
Publication date
Priority claimed from US07/463,891 external-priority patent/US5412731A/en
Application filed by Desper Products, Inc. filed Critical Desper Products, Inc.
Priority to AU34273/93A priority Critical patent/AU3427393A/en
Priority to PCT/US1992/011335 priority patent/WO1994016537A1/en
Priority to IL10466593A priority patent/IL104665A/en
Priority to CN93101979A priority patent/CN1091889A/en
Priority to EP94907123A priority patent/EP0677235B1/en
Priority to JP6516471A priority patent/JPH08509104A/en
Priority to AU60811/94A priority patent/AU6081194A/en
Priority to KR1019950702676A priority patent/KR960700620A/en
Priority to SG1996005287A priority patent/SG70557A1/en
Priority to DE69325922T priority patent/DE69325922D1/en
Priority to CA002153062A priority patent/CA2153062A1/en
Priority to AT94907123T priority patent/ATE183050T1/en
Priority to PCT/US1993/012688 priority patent/WO1994016538A1/en
Publication of WO1994016537A1 publication Critical patent/WO1994016537A1/en
Priority to IL12216497A priority patent/IL122164A0/en
Priority to AU77310/98A priority patent/AU7731098A/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S 3/00 Systems employing more than two channels, e.g. quadraphonic

Definitions

  • This invention is directed to an automatic stereophonic image enhancement method and apparatus wherein the electronic signal which corresponds to the audio signal is electronically treated by amplitude and phase control to produce a perception of enhancements to stereophonically reproduced music.
  • Sound is vibration in an elastic medium, and acoustic energy is the additional energy in the medium produced by the sound. Sound in the medium is propagated by compression and rarefaction of the energy in the medium. The medium oscillates, but the sound travels. A single cycle is a complete single excursion of the medium, and the frequency is the number of cycles per unit time. Wavelength is the distance between wave peaks, and the amplitude of motion (related to energy) is the oscillatory displacement. In fluids, the unobstructed wave front spherically expands.
  • Hearing is the principal response of a human subject to sound.
  • the ear, its mechanism and nerves receive and transmit the hearing impulse to the brain which receives it, compares it to memory, analyzes it, and translates the impulse into a concept which evokes a mental response.
  • the final step in the process is called listening and takes place in the brain; the ear is only a receiver.
  • hearing is objective and listening is subjective.
  • Since the method and apparatus of this invention are for automatic stereophonic image enhancement for human listening, the listening process is described in terms of perceptions of hearing.
  • This patent describes the perceptions of human subjects. Because a subject has two ears, laterally spaced from each other, the sound at each eardrum is nearly always different.
  • Each ear sends a different signal to the brain, and the brain analyzes and compares both of the signals and extracts information from them, including information in determining the apparent position and size of the source, and acoustic space surrounding the listener.
  • the first sound heard from a source is the direct sound which comes by line-of-sight from the source.
  • the direct sound arrives unchanged and uncluttered, and lasts only as long as the source emits it.
  • the direct sound is received at the ear with a frequency response (tonal quality) which is relatively true to the sound produced by the source because it is subject only to losses in the fluid medium (air).
  • the important transient characteristics such as timbre, especially in the higher registers, are conveyed by direct sound.
  • the integral differences at each eardrum are found in time, amplitude and spectral differences.
  • the physical spacing of the ears causes one ear to hear after the other, except for sound originating from a source on the median plane between the ears.
  • the time delayed difference is a function of the direction from which the sound arrives, and the delay is up to about 0.8 millisecond.
  • The 0.8 millisecond time delay is about equal to the period of 1 cycle at 1,110 Hz.
  • At higher frequencies, the acoustic wavelength of arriving sounds becomes smaller than the ear-to-ear spacing, and the interaural time difference decreases in significance, so that it is useful only below about 1,400 Hz to locate the direction of the sound.
  • The difference in amplitude between the sound arriving at the two ears results principally from the diffracting and shadowing effect of the head and external ear pinna. These effects are greater above 400 Hz and become the source of information the brain interprets to determine the direction of the source for higher frequencies.
  • the antiphasic image does not manifest itself as a point source, but is diffused and forms the rear boundary of the listener's conceptual image space.
  • virtual images can be generated along an arc or semicircle from the back of the observer's head toward the left or right speakers.
  • the "precedence effect” Another factor which influences the perception of sound is the "precedence effect" wherein the first sound to be heard takes command of the ear-brain mechanism, and sound arriving up to 50 milliseconds later seems to arrive as part of and from the same direction as the original sound.
  • By delaying the signal sent to one speaker, as compared to the other, the apparent direction of the source can be changed. As part of the precedence effect, the apparent source direction is operative through signal delay for up to 30 milliseconds. The effect is dependent upon the transient characteristics of the signal.
  • An intrinsic part of the precedence effect, yet an identifiably separate phenomenon, is known as "temporal fusion," which fuses together the direct and delayed sounds.
  • the ear-brain mechanism blends together two or more very similar sounds arriving at nearly the same time. After the first sound is heard, the brain suppresses similar sounds arriving within about the next 30 milliseconds. It is this phenomenon which keeps the direct sound and room reverberation all together as one pleasing and natural perception of live listening. Since the directional hearing mechanism works on the direct sound, the source of that sound can be localized even though it is closely followed by multiple waves coming from different directions.
  • the walls of the room are reflection surfaces from which the direct sound reflects to form complex reflections.
  • the first reflection to reach the listener is known as a first order reflection; the second, as second order, etc.
  • An acoustic image is formed which can be considered as coming from a virtual source situated on the continuation of a line linking the listener with the point of reflection. This is true of all reflection orders. If we generate signals which produce virtual images, boundaries are perceived by the listener. This is a phenomenon of conditioned memory. The position of the boundary image can be expanded by amplitude and phase changes within the signal generating the virtual images. The apparent boundary images broaden the perceived space.
  • Audio information affecting the capability of the ear-brain mechanism to judge location, size, range, scale, reverberation, spatial identity, spatial impression and ambience can be extracted from the difference between the left and right source. Modification of this information through frequency shaping and linear delay is necessary to produce the perception of phantom image boundaries when this information is mixed back with the original stereo signal at the antiphasic image position.
  • the common practice of the recording industry, for producing a stereo signal, is to use two or more microphones near the sound source. These microphones, no matter how many are used, are always electrically polarized in-phase.
  • When the program source is produced under these conditions (which are industry standard), the apparatus described herein generates a "synthetic" conditioning signal for establishment of a third point with its own time domain. This derivation is called synthetic because there is a separation, alternation and regrouping to form the new whole.
  • a third microphone may be used to define the location of the third point in relation to the stereo pair. Contrary to the normal procedure of adding the output of a third microphone to the left and right side of the stereo microphone pair, the third microphone is added to the left stereo pair and subtracted from the right stereo pair.
  • This arrangement provides a two-channel stereo signal which is composed of a left signal, a right signal, and a recoverable signal which has its source at a related but separate position in the acoustic space being recorded.
  • This is called organic derivation, and it compares to the synthetic situation discussed above: the ratios are proportional to the left minus the right (from which the synthetic signal was derived), but the organic signal is based on its own time reference, which is, as will be seen, related to the spacing between the three microphones.
  • The timing of the organic conditioning signal is contingent upon the position of the original sound source with respect to the three microphones. The information derived more closely approximates the natural model than that of the synthetically derived conditioning signal.
  • Insertion of the conditioning signal at the antiphasic image position produces enhancement to and generation of increased spatial density in the stereo mode but is completely lost in the mono mode where the directional information will be unused.
  • Information which can be lost in the mono mode without upsetting the inner-instrument musical balance includes clues relating to size, location, range and ambience, but not original source information.
  • directional information is obtained exclusively from the very source which is lost in the monophonic mode, namely, left signal minus right signal.
  • Whether in the synthetic or organic model derivation of a conditioning signal, subtracting the left signal from the right signal and reinserting it at the antiphasic position will not challenge mono/stereo compatibility, providing that the level of conditioning signal does not cause the total RMS difference energy to exceed the total RMS summation energy at the output.
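  • By way of illustration only (this Python/NumPy sketch is not part of the patent; the signals, the 0.3 gain and the correlation used here are assumptions), adding a difference-derived conditioning signal to the left channel and subtracting it from the right leaves the mono sum L+R untouched, which is the compatibility argument made above:

        import numpy as np

        rng = np.random.default_rng(0)
        noise = rng.standard_normal(48_000)
        left = rng.standard_normal(48_000)
        right = 0.8 * left + 0.2 * noise        # a reasonably correlated stereo pair

        gain = 0.3                              # level of the conditioning signal
        cond = gain * (left - right)            # synthetic conditioning signal from L-R

        left_out = left + cond                  # insert at the antiphasic position:
        right_out = right - cond                # add to the left, subtract from the right

        mono = left_out + right_out             # what a mono receiver reproduces
        assert np.allclose(mono, left + right)  # the conditioning signal is lost in mono

        rms = lambda v: np.sqrt(np.mean(v ** 2))
        # the RMS difference energy should not exceed the RMS summation energy
        print(rms(left_out - right_out) <= rms(left_out + right_out))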
  • a conditioning signal is provided and introduced into electronic signals which are to be reproduced through two spaced loudspeakers so that the perceived sound frame between the two loudspeakers is an open field which at least extends toward the listener from the plane between the loudspeakers and may include the perception of boundaries which originate to the side of the listener.
  • The conditioning signal may be organic, if the original sound source is appropriately miked, or it may be derived from the left and right channel stereo signals.
  • the present invention provides an automatic stereophonic image enhancement system and apparatus wherein two channel stereophonic sound is reproduced with signals therein which generate a third image point with which boundary image planes can be perceived within the listening experience resulting in an extended conceptual image space for the listener.
  • the present invention provides a stereophonic image enhancement system which includes automatic apparatus for introducing the desired density of conditioning signal regardless of program content into the electronic signal which will be reproduced through the two spaced speakers.
  • Figure 1 is a perspective view of a listener facing two spaced loudspeakers, and showing the outline of an enclosure.
  • Figure 2 is a schematic plan view of the perception of a sound frame which includes a synthetic conditioning signal which is included in the signals to the speakers.
  • Figure 3 is a schematic plan view of the perceived open field sound frame where an organic conditioning signal is introduced into the signal supplied to the speakers.
  • Figure 4 is a schematic plan view of the open field sound frame, as perceived from the listener's point of view, as affected by various changes within the conditioning signal.
  • Figure 5 is a schematic plan view of a sound source and microphone placements which will organically produce a conditioning signal.
  • Figure 6 is a simplified schematic diagram of a circuit which combines the organically derived conditioning signal with the left and right channel signals.
  • Figures 7(a) and 7(b) form a schematic electrical diagram of the automatic stereophonic image enhancement system and apparatus in accordance with this invention.
  • Figure 8 is a schematic electrical diagram of an alternate circuit therefor.
  • Figure 9 is a front view of the control panel for the apparatus of Figure 8.
  • Figures 10(a) and 10(b) form a digital logic diagram of a digital embodiment of the invention.
  • Figure 11 is a front view of a joystick box, a control box, and an interconnecting data cable.
  • Figures 12(a) through 12(d) form a schematic diagram of an embodiment of the invention wherein joysticks may be used to move a sound around in a perceived sound field.
  • Figures 13(a)- 13(f) are graphical representations of the control outputs which are generated by the joysticks and associated circuitry and applied to voltage controlled amplifiers of Figures 12(a) - 12(d).
  • Figures 14(a) and 14(b) form a digital sound processor logic diagram similar to that of Figures 10(a) and 10(b), but adapted for use as the digital sound processor 450 in Figures 12(a) - 12(d).
  • Figure 15 is a schematic diagram of an embodiment of the invention which is adapted for use in consumer- quality audio electronics apparatus, of the type which may be used in the home, etc.
  • Tables A through F set forth the data which is graphically presented in Figures 13(a)-13(f), respectively.
  • Figure 1 illustrates the usual physical arrangement of loudspeakers for monitoring of sound. It should be understood that in the recording industry sound is “monitored” during all stages of production. It is “reproduced” when production is completed and the product is in the market place. At that point and on, what is being reproduced is the production. Several embodiments of the invention are disclosed. Some embodiments are intended for use during sound production, while one embodiment is intended for use during sound reproduction, in the house, for example.
  • Embodiments of the invention include the system and apparatus illustrated in a first embodiment in Figures 5 and 6, a second embodiment 10 in Figure 7, a third embodiment 202 in Figure 8, a fourth embodiment of Figures 10(a) and 10(b), a fifth and presently preferred embodiment (for professional studio use) in Figures 11, 12(a), 12(b), 13(a)-13(f), 14(a) and 14(b).
  • These embodiments may be employed in record, compact disc, mini- disc, cassette, motion picture, video and broadcast production, to enhance the perception of sound by human subjects, i.e. listeners.
  • Another and sixth embodiment, which is disclosed with reference to Figure 15 may be used in a consumer quality stereo sound apparatus found in a home environment, for example.
  • the two loudspeakers 12 and 14 are of suitable quality with enclosures to produce the desired fidelity. They are laterally spaced, and the listener 16 faces them and is positioned substantially upon a normal plane which bisects the line between the speakers 12 and 14. Usually, the listener is enclosed in a room, shown in phantom lines, with the loudspeakers. During reproduction, the two loudspeakers may be of any quality. The loudspeaker and listener location is relatively unimportant. During monitoring, the effect is one of many separate parts being blended together. Hence, monitoring requires a standard listening position for evaluating consistency, whereas during reproduction, the effect has become one with the whole sound and can be perceived from any general location.
  • the loudspeakers 12 and 14 should be considered monitors being fed from an electronic system which includes the sound production enhancement apparatus of this invention.
  • the electronic system may be a professional recording console, multi- track or two- track analogue, or digital recording device, with a stereophonic two-channel output designated for recording or broadcasting.
  • the sound source material may be a live performance or it may be recorded material in a combination of the foregoing.
  • Figure 2 illustrates the speakers 12 and 14 as being enclosed in what is perceived as a closed field sound frame 24 (without the lower curved lines 17 and 26) which is conventional for ordinary stereophonic production.
  • the apparent source can be located anywhere within the sound frame 24, that is, between the speakers.
  • Amplitude and time ratios 17 are manifested between the three points 12, 14 and 34. Because the antiphasic point 34 is the interdependent product of the left point 12 and the right point 14, the natural model is approached by a synthetic construction, but never fully realized. The result is the open field sound frame 26, which listener 16 perceives.
  • Figure 3 illustrates open field sound frame 28 which is perceived by listener 16 when a conditioning signal derived, as in Figure 2, is supplied and introduced as part of the signal to speakers 12 and 14, but has as its source an organic situation.
  • the density of spatial information is represented by the curved lines 17 in Figure 2 and is represented by the curved lines 19 in Figure 3. It is apparent that the density of spatial information is greater in Figure 3 because the three points which produced the original conditioning signal are not electrically interdependent but are acoustically interactive; information more closely reflecting the natural model is supplied to the ear-brain mechanism of listener 16.
  • Figure 4 illustrates the various factors which are sensed by the listener 16 in accordance with the stereophonic image enhancement systems of this invention.
  • the two speakers 12 and 14 produce the closed field sound frame 24 when the speakers are fed with homophasic signals.
  • Homophasic image position 30 is illustrated, and the position can be shifted left and right in the frame 24 by control of the relative amplitude of the speakers 12 and 14.
  • the speakers 12 and 14 produce left and right real images, and a typical hard point image 32 is located on the line between the speakers because it is on a direct line between the real images produced by the two real speakers. As described above, the hard point source image can be shifted between the left and right speakers.
  • The antiphasic image position 34 is produced by speakers 12 and 14 and may be perceived as a source location behind the listener's head 16 at 34 under test or laboratory demonstrations. Under normal apparatus operating conditions, source 34 is not perceived separately but, through temporal fusion, is the means by which an open field sound frame is perceived. Position 34 is a perceived source, but is not a real source. There is no need for a speaker at position 34. Rather, by controlling the relationship between the antiphasic image position and one or both of the real images, all produced by speakers 12 and 14, the image source can be located on a line between one of the real images and the antiphasic image position 34.
  • the point between it and speakers 12 and 14 is considered a soft point source image.
  • Such a soft point source image is shown at point 36.
  • Open field sound frame is thus produced and provides the perception of virtual space boundaries 40, 42, 44 or 46 (not on line), depending on the conditioning signal's phase relationship to the original source.
  • the perceived distance for the virtual space boundaries 40, 42, 44 and 46 from the closest hard point is from 2 to 30 feet (1-10 M), depending on the dimension control setting of Figure 5 and the distance between speakers 12 and 14.
  • Figure 5 is a schematic diagram of a sound source which is to be recorded or amplified.
  • Three microphones L, R and C are shown located in front of the source.
  • the L (left) and R (right) microphones are approximately equally spaced from the source on its left and right sides.
  • the C (conditioning) microphone is located further spaced from the source and approximately equally spaced from the L and R microphones.
  • The signal from the C microphone is adjusted in gain and then is added to the left stereo signal (at adder A, for example) and subtracted from the right stereo signal (at subtractor S, for example), as shown in Figure 6.
  • By adjusting the gain of conditioning signal, C, the amount of expansion which occurs can be controlled easily.
  • the conditioning signal, C is produced organically, that is, by a microphone array pickup as shown in Figure 5 and connected as shown in Figure 6.
  • the conditioning signal can be created synthetically, and introduced into the left and right channel signals, when (1) the sound source is mixed-down from a prerecorded tape in a recording studio, for example, (2) the sound is broadcast, or (3) when prerecorded sound is received or reproduced in a home environment.
  • the conditioning signal is delayed time-wise and filtered compared to the signals from microphones L and R due to the placement of microphone C.
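  • A minimal Python sketch of the organic combination of Figures 5 and 6 follows (the function name and the default c_gain value are illustrative, not taken from the patent):

        import numpy as np

        def organic_mix(mic_l, mic_r, mic_c, c_gain=0.25):
            # Gain-adjust the C microphone, add it to the left stereo signal and
            # subtract it from the right, as at adder A and subtractor S in
            # Figure 6.  No electronic filtering or delay is applied here: the
            # text notes those arise acoustically from the placement of C.
            cond = c_gain * np.asarray(mic_c)
            return np.asarray(mic_l) + cond, np.asarray(mic_r) - cond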
  • the left input lines 48 and 49 and right input lines 50 and 51 are received from musical signal sources.
  • The system and apparatus 10 is described in this embodiment as being a system which introduces the conditioning signal before the two-channel recording and, thus, is a professional audio laboratory apparatus.
  • the left and right inputs 48, 49, 50 and 51 may be the product of a live source or a mixdown from a multiple channel tape produced by the live recording, or it may be a computer generated source, or a mixture of same.
  • the inputs of the apparatus 48, 49, 50 and 51 address the output of the recording console's "quad buss" or "4-track buss".
  • Each position on the recording console can supply each and every buss of the quad buss with a variable or "panned" signal representing a particular position.
  • Two channels 49, 51 of the quad buss are meant for use as stereo or the front half of quadraphonic sound; the other two channels, 48, 50, are for the rear half of quadraphonic sound.
  • each position or input of a modern recording console has a panning control to place the sound of the input between left, right, front or back via the quad buss.
  • a recording console may have any number of inputs or positions which are combined into the quad buss as four separate outputs.
  • The left front quad buss channel addresses apparatus input 49; the right front quad buss channel addresses apparatus input 51; the left rear quad buss channel addresses apparatus input 48; and the right rear quad buss channel addresses apparatus input 50. Alternate insertion of the apparatus of Figure 7 is possible in the absence of a quad buss by using the stereo buss plus two effect busses.
  • Left front input 49 (unprocessed) is connected to amplifier 52.
  • Left rear input 48 (to be processed) is connected to amplifier 54.
  • Right rear input 50 (to be processed) is connected to amplifier 56.
  • Right front input 51 (unprocessed) is connected to amplifier 58.
  • The outputs of amplifiers 52 and 58 are connected to adders 60 and 62, respectively, so that amplifiers 52 and 58 effectively bypass the enhancement system 100.
  • The quad buss allows the apparatus to address its function to each input of a live session or each track of recorded multi-track information, separately. This means that, in production, the operator/engineer can determine the space density of each track rather than settling for an overall space density. This additional degree of creative latitude is unique to this apparatus and sets it apart as a production tool.
  • The amplified left and right signals in lines 68 and 70 are both connected to summing amplifier 72 and to differencing amplifier 74.
  • The output of summing amplifier 72 in line 76 is, thus, L+R, while differencing amplifier 74 inverts its output so that the difference signal in line 78 appears as -(L-R).
  • These sum and difference signals in lines 76 and 78 are added together in adder 60 and generate the left program with a conditioning signal CL which adds additional spatial effects to the left channel.
  • the signal in line 78 also goes through invertor 80 to produce in line 82 the (L-R) signal.
  • Lines 76 and 82 are introduced into adder 62 to generate in its output line 84 the right program with conditioning signal C R .
  • the output lines 79 and 84 from adders 60 and 62 go to the balanced-output amplifiers 86 and 88 for the left output and 90 and 92 for the right output.
  • the output amplifiers are preferably differential amplifiers operating as a left pair and a right pair, with one of each pair operating in inverse polarity with the other half of each pair for balanced line output.
  • Conditioning signals CL and CR are similar to conditioning signal C of Figure 6, but are synthetically produced. Also, they have somewhat different frequency filtering which tends to broaden the rear sound images, particularly the antiphasic position 34 (Figure 4). Conditioning signals CL and CR are derived from the difference signal -(L-R) in line 78 at the output of differencing amplifier 74. The difference signal in line 78 passes through high pass filter 94, which has a slope of about 18 decibels per octave and a cutoff frequency of about 300 Hz, to prevent comb filtering effects at lower frequencies.
  • the filtered signal preferably passes through delay 96 with an adjustable and selectable delay as manually input from manual control 98, which is called "the Dimension Control.”
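  • The difference-filter-delay chain described above can be sketched in Python/SciPy as follows; this is not the patent's circuit, merely an illustration in which the 3rd-order Butterworth response, the 48 kHz rate and the 3.75 ms default delay (a figure quoted later for the digital embodiment) are assumptions:

        import numpy as np
        from scipy import signal

        FS = 48_000   # assumed sample rate; the patent describes analog circuitry

        def synthetic_conditioning(left, right, delay_ms=3.75, fs=FS):
            # Difference signal -(L-R) as on line 78, a 300 Hz high-pass at roughly
            # 18 dB/octave (3rd-order Butterworth here; the patent gives only the
            # slope and cutoff), then the adjustable "Dimension Control" delay.
            diff = -(np.asarray(left) - np.asarray(right))
            b, a = signal.butter(3, 300, btype="highpass", fs=fs)
            filtered = signal.lfilter(b, a, diff)
            d = int(round(delay_ms * 1e-3 * fs))
            return np.concatenate([np.zeros(d), filtered])[: len(filtered)]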
  • the output of the delay 96 goes to voltage controlled amplifier (VCA) 102 which provides level control.
  • The DC control voltage in line 104, which controls voltage controlled amplifier 102, is supplied by potentiometer 106 in the Manual Mode, or by the hereinafter described control circuit in the Automatic Mode.
  • Potentiometer 106 provides a DC voltage divided down from a DC source 107. It functions as a "Space Control" and it effectively controls the amount of expansion of the sound perceived by a listener, i.e., it controls the amount of the conditioning signal which is added and subtracted from the left and right channel signals.
  • the output from voltage controlled amplifier 102 in line 108 is preferably connected via left equalizer 110 and right equalizer 112 for proper equalization and phasing for the individual left and right channels, which tends to broaden the rear image.
  • the illustrated equalizers 110 and 112 are of the resonant type (although they could be any type) with a mid-band boost of 2 db at a left channel center frequency in equalizer 110 of about 1.5 kilohertz and a right channel frequency in equalizer 112 of about 3 kilohertz.
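  • Such a resonant mid-band boost can be sketched as a standard peaking-EQ biquad in Python (the Q value, the biquad topology and the 48 kHz rate are assumptions; the patent specifies only about 2 dB of boost near 1.5 kHz for the left channel and 3 kHz for the right):

        import numpy as np
        from scipy import signal

        def peaking_eq(x, f0, gain_db, q=1.0, fs=48_000):
            # RBJ-style peaking equalizer standing in for equalizers 110 and 112.
            A = 10 ** (gain_db / 40)
            w0 = 2 * np.pi * f0 / fs
            alpha = np.sin(w0) / (2 * q)
            b = [1 + alpha * A, -2 * np.cos(w0), 1 - alpha * A]
            a = [1 + alpha / A, -2 * np.cos(w0), 1 - alpha / A]
            return signal.lfilter(b, a, x)

        # cond_l = peaking_eq(cond, 1_500, 2.0)   # left-channel conditioning signal
        # cond_r = peaking_eq(cond, 3_000, 2.0)   # right-channel conditioning signal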
  • The left conditioning signal -CL occurs in line 114 and the right conditioning signal -CR occurs in line 116.
  • The left conditioning signal -CL is added in adder 60.
  • The right conditioning signal in line 116 is connected to invertor 80, where the conditioning signal -CR is added to the difference signal -(L-R); the result is combined with the sum signal so that the right signal minus the right conditioning signal appears on line 84, and the left signal plus the left conditioning signal appears on line 79.
  • The automatic control circuit generally indicated at 118 monitors the output signals in lines 79 and 84 and regulates the amount of conditioning signal to keep a Lissajous figure, generated on an X-Y oscilloscope connected to the outputs, relatively constant.
  • A Lissajous figure is the figure displayed on the CRT of an oscilloscope when the two outputs are connected to the sweep and amplitude drives of the oscilloscope. When the Lissajous figure is fairly round, the energies of the sum and the difference of the two outputs are substantially equal (a desirable characteristic).
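  • A simple numerical proxy for that "roundness" check is the ratio of the RMS levels of (L-R) and (L+R); a value near 1.0 corresponds to a roughly round figure. This Python sketch is illustrative only; the patent measures the levels with analog rectifiers and filters, not this arithmetic:

        import numpy as np

        def sum_difference_ratio(left_out, right_out):
            s = np.asarray(left_out) + np.asarray(right_out)
            d = np.asarray(left_out) - np.asarray(right_out)
            rms = lambda v: np.sqrt(np.mean(np.square(v)))
            return rms(d) / (rms(s) + 1e-12)   # ~1.0 means a "round" Lissajous figure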
  • Lines 84 and 79 are respectively connected to the inputs of differencing amplifier 120 and adding amplifier 122.
  • The outputs are respectively rectified, and rectifiers 124 and 126 provide signals in lines 128 and 130.
  • The signals in lines 128 and 130 are, thus, the full wave rectified difference and sum signals of the apparatus output, respectively out of subtractor 120 and adder 122.
  • Lines 128 and 130 are connected to filters 132 and 134, which have adjustable rise and fall ballistics.
  • Selector switch 136 selects between the manual and automatic control of the control voltage in line 104 to voltage controlled amplifier 102.
  • the manual position of selector switch 136 is shown in Figure 7(a), and the use of the space expansion control potentiometer 106 has been previously described.
  • When the space control switch is switched to the other, automatic, position, the outputs of filters 132 and 134 in lines 138 and 140, respectively, are processed and employed to control voltage controlled amplifier 102.
  • When space control selector switch 136 is in the automatic position, the output of error amplifier 142 is connected through gate 144 to control the voltage in line 104.
  • the error amplifier 142 has inputs directly from line 138 and from 140 through switch segment 146 and back through line 148.
  • The filtered sum signal in line 140 is connected through the space expansion potentiometer 106 so that it can be used to reduce the apparent level of the output sum information to error amplifier 142, to force the error amplifier 142 to reduce the sum/difference ratio.
  • Comparator 150 is connected to receive the filtered sum and difference information in lines 138 and 140. Comparator 150 provides an output into gate line 152 when space control selector switch 136 is in the automatic mode and when a monophonic signal is present at inputs 48 and 50. This occurs, for example, when an announcer speaks between music material. When comparator 150 senses monophonic material, gate line 152 turns off gate 144 to shut down voltage controlled amplifier 102 to stop the conditioning signal. This is done to avoid excessive increase in stereo noise, from random phase and amplitude changes, while the input program material is fully balanced. The automatic control circuit 118 cannot distinguish between unwanted noise and desired program material containing difference information.
  • a threshold ratio is established between the sum and difference information in lines 138 and 140 by control of the input potentiometer into comparator 150.
  • The comparator 150 and gate 144 thus avoid the addition of false space information in a conditioning signal which, in reality, would be in response to difference-noise in the two channels.
  • the comparator 150 thus requires a specific threshold ratio between the sum and difference information, under which the gate 144 is turned off and over which the gate 144 is turned on.
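  • The gating behaviour described above can be sketched in a few lines of Python; the specific threshold value is an assumption, since the patent says only that a threshold ratio is established:

        def conditioning_gate(sum_level, diff_level, threshold_ratio=0.05):
            # Sketch of the comparator 150 / gate 144 behaviour: when the filtered
            # difference level falls below a set fraction of the sum level (i.e. the
            # programme is effectively monophonic), the gate removes the conditioning
            # signal so that channel noise is not expanded.
            return diff_level / (sum_level + 1e-12) >= threshold_ratio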
  • Clipping circuit 153 is provided to present a signal when the system is almost in a clipping situation and another signal when clipping is present. "Clipping" is a rapid increase in distortion caused by dynamic peaks in the program material being limited by the static limit imposed by the power supply voltage in the circuit.
  • Lines 154 and 156, which are the inputs of amplifiers 52 and 58, are connected, along with lines 68, 70, 79 and 84, each through their own diode to bus 158.
  • Bus 158 is connected through a resistance to input 160 of comparator 162.
  • a negative constant voltage source is connected through another resistor to the input 160, and the comparator 162 is also connected to ground.
  • The comparator 162 has an output when bus 158 reaches a particular level. When that level is reached, output signal 164, such as a signal light, is actuated.
  • Bus 158 is similarly connected through a resistor to the input 166 of comparator 168.
  • The negative voltage source is connected through another resistor to input 166, and the resistance values are adjusted so that comparator 168 has an output when clipping is taking place.
  • Latching circuit 170 is actuated when clipping has taken place to illuminate the two signal lights 172 and 174. Those lights stay illuminated until reset 176 is actuated.
  • Comparator 184 gives an output pulse whenever the difference peak envelope becomes greater than the sum peak envelope, within plus or minus 3 dB.
  • the level controls at the outputs of the peak followers 180 and 182 allow an adjustment in the plus or minus 6 dB difference for different applications.
  • Comparator 186 has an output when the sum/difference peak ratio approaches the trigger point of comparator amplifier 184 within about 2 dB, and lights signal light 188 on the front panel, illustrated in Figure 7(b), as a visual warning of approaching L-R overload. This is accomplished by reducing the apparent level of the sum envelope by about 2 dB with the potentiometer connecting comparator 186 to ground.
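  • A peak-envelope follower of the kind feeding these comparators might be sketched in Python as below; the attack and release times and the sample rate are assumptions, since the patent only states that the followers 180 and 182 drive comparators 184 and 186 with a few dB of offset:

        import numpy as np

        def peak_follower(x, fs=48_000, attack_ms=1.0, release_ms=200.0):
            # One-pole peak envelope follower standing in for followers 180 and 182.
            x = np.abs(np.asarray(x, dtype=float))
            atk = np.exp(-1.0 / (attack_ms * 1e-3 * fs))
            rel = np.exp(-1.0 / (release_ms * 1e-3 * fs))
            env = np.zeros_like(x)
            level = 0.0
            for i, v in enumerate(x):
                coeff = atk if v > level else rel   # fast rise, slow fall
                level = coeff * level + (1.0 - coeff) * v
                env[i] = level
            return env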
  • comparator amplifier 184 feeds a latching circuit 190 which activates light 195 and which holds until reset by switch 192.
  • When the latching circuit is active, it activates driving circuit 194 which lights panel lights 196 and 197 and, after a time delay, rings audible alarm 198.
  • driving circuit 194 energizes line 199 which cuts off gate 144 to withhold the signal to amplifier 102 which controls the conditioning signal. Actuation of gate 144 removes the conditioning signal from line 108, but permits the normal stereo signal to continue through the circuit.
  • A third embodiment of the system and apparatus of this invention is shown in Figure 8 and is generally indicated at 200.
  • The left front quad buss channel addresses unprocessed input 49, which is connected to amplifier 204; the left rear quad buss channel addresses processed input 48, which is connected to amplifier 206; the right rear quad buss channel addresses processed input 50, which is connected to amplifier 212; and the right front quad buss channel addresses unprocessed input 51, which is connected to amplifier 214.
  • Amplifiers 204, 206, 212 and 214 are inverting and provide signals in lines 208, 210, 216 and 218, respectively. Both lines 208 and 210 are connected to summing amplifier 220, while both lines 216 and 218 are connected to summing amplifier 222. Lines 210 and 216 carry -L and -R signals.
  • The conditioning signals CR and -CL are derived by connecting differencing amplifier 224 to both lines 210 and 216.
  • the resulting difference signal, -(R-L) is filtered in high pass filter 226, similar to filter 94 in Figure 7(a), and the result is subject to selected delay in delay circuit 228.
  • the delay time is controlled from the front panel, as will be described with respect to Figure 9.
  • the output from delay 228 goes through voltage controlled amplifier 230 which has an output signal, -C, in line 232, which is supplied to both non- inverting equalizer 234 and inverting equalizer 236.
  • Those equalizers respectively have conditioning signal outputs -CL and +CR, which are connected to the inverting summing amplifiers 220 and 222.
  • The left conditioning signal -CL is added (and inverted) with the original left signal at amplifier 220 to form L+CL.
  • The right conditioning signal +CR is effectively subtracted from the original right signal at invertor amplifier 222 to form R-CR.
  • the outputs from amplifiers 220 and 222, in lines 238 and 240, respectively, are preferably and respectively connected to balanced left amplifiers 242 and 244 and balanced right amplifiers 246 and 248, in the manner described with respect to amplifiers 86 through 92 of Figure 7(b). It may be useful to connect the various points in the circuit of Figure 8 to the clipping and L-R overload warning circuits 153 and 178 in the same manner as previously described with reference to Figure 7(b).
  • VCA 230 may be manually controlled by a potentiometer and DC supply combination, such as potentiometer 106 and supply 107.
  • the left and right signals are added and subtracted. This sum and difference information is then re-added and re-subtracted to reconstruct the original left and right signals.
  • the original left and right signals are not mixed together. They remain independent of each other from input to output.
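  • The sum-and-difference round trip described above is easy to verify numerically; in this Python sketch the factor of 0.5 is a normalization convenience for the arithmetic, and is not a value taken from the patent's gain structure:

        import numpy as np

        left = np.array([0.2, -0.5, 0.9])
        right = np.array([0.1, 0.4, -0.3])

        s, d = left + right, left - right        # add and subtract
        left_back = 0.5 * (s + d)                # re-add
        right_back = 0.5 * (s - d)               # re-subtract
        assert np.allclose(left_back, left) and np.allclose(right_back, right)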
  • the enhancement system may be automatic with self-controlling features in the apparatus so that the stereophonic image enhancement can be achieved without continual adjustment of the system and apparatus. Alternatively, manual control may be used, if desired.
  • Figures 10(a) and 10(b) form a digital logic diagram of a digital embodiment of the invention which is conceptually somewhat similar to the analog, or mostly analog, embodiment of Figures 7(a) and (b).
  • data transmission lines are shown in solid lines, while control lines are shown in dashed lines.
  • Left and right audio channel information is supplied in multiplexed digital format at input 302.
  • Clock information is also supplied at an input 304 to a formatter 306 which separates the left channel information from the right channel information.
  • Formatter 306 de-multiplexes the digital data, which can be supplied in different multiplexed synch schemes. For example, a first scheme might assume that the data is being transmitted via a Crystal Semiconductor model CS8402 chip for AES-EBU or S-PDIF inputs, while a second scheme might assume that the digital data comes from an analog-to-digital converter such as a Crystal Semiconductor model CS5328 chip.
  • the I/O mode input 305 preferably advises the formatter 306 at the front end and the formatter 370 at the rear end of the type of de-multiplexing and multiplexing schemes required for the chips upstream and downstream of the circuitry shown in Figures 10(a) and (b).
  • Those skilled in the art will appreciate that other multiplexing and de-multiplexing schemes can be used or that the left and right channel data could be transmitted in parallel, i.e., non-multiplexed data paths.
  • The left channel digital audio data appears on line 308 while the right channel digital audio data appears on line 309. The left channel data is subtracted from the right channel data at subtractor 324 to form R-L data.
  • The R-L data is supplied to a switch 329 and is filtered through a high-pass filter 326 and a low pass filter 327, and is subjected to digital time delay at device 328.
  • Switch 329 is controlled by a C-mode control 303. The switch is shown in Figures 10(a) and (b) in its C-mode position, that is, the position in which the filters 326 and 327 and the time delay 328 are bypassed.
  • the C-mode is preferably used when the apparatus is used with live sources, such as might be encountered during a concert or a theatrical performance, and a C microphone input source ( Figures 5 and 6) is available, so that the C signal then need not be synthetically produced.
  • the R-L data is preferably subjected to the filtering and time delay to generate the conditioning signal C when the invention is used to mixdown a recorded performance from a multi-track tape deck, for example.
  • the output from switch 329 is supplied to a variable gain digital circuit 330 which is functionally similar to the voltage controlled amplifier 102 shown in Figure 7(b).
  • a mute control input can be used to reduce the gain at gain control 330 very quickly, if desired.
  • The output of the gain control 330 is applied to left and right channel equalizers 310 and 312, which are functionally similar to equalizers 110 and 112 shown in Figure 7(b).
  • The output of left channel equalizer 310 is applied to an adder 320 while the output of the right channel equalizer 312 is supplied to a subtractor 332, so that the conditioning signals CL and CR are respectively added to and subtracted from the left audio data and right audio data on lines 379 and 384, respectively. That data is then multiplexed at formatter 370 and output in digital form at serial output 390.
  • variable gain circuitry 330 which can be implemented rather easily in the digital domain by shifting bits, for example, is controlled either from a manual source or an automatic source, much like the voltage controlled amplifier 102 of Figure 7(b).
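  • As a hedged illustration of gain control by bit shifting (the step size, word length and function name are illustrative, not from the patent):

        def shift_gain(sample: int, shift: int) -> int:
            # Coarse gain control by arithmetic shifts, in the spirit of the
            # variable-gain circuit 330: each right shift of a fixed-point sample
            # attenuates it by about 6 dB.
            return sample >> shift if shift >= 0 else sample << -shift

        # shift_gain(16384, 1) -> 8192  (roughly -6 dB)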
  • The gain through circuitry 330 is controlled by a "space control" input 362, which is conceptually similar to the space control potentiometer 106 shown in Figure 7(a) and the potentiometer shown in Figure 6.
  • the gain in circuitry 330 is automatically controlled in a manner similar to that of Figures 7(a) and (b).
  • the data on lines 379 and 384 are summed at a summer 342 and, at the same time, subtracted at subtractor 340.
  • the outputs are respectively applied to high-pass filters 346 and 344, whose outputs are in turn applied to root mean square (RMS) detectors 350 and 348, respectively.
  • Detector 348 outputs a log difference signal, while detector 350 outputs a log sum signal.
  • the value of the log difference signal from detector 348 can be controlled from the "Space In" input 362 at adder 352, in the automatic mode, so that the "Space In” value offsets the output of the log difference detector, either:
  • the output of adder 352 and the log sum output from detector 350 are applied to a comparator 354, which is conceptually similar to the comparator 150 of Figure 7(a).
  • The output of comparator 354 is applied to a rate limiter 356, which preferably limits the rate of gain change of circuit 330 to approximately 8 dB per second.
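  • A slew-rate limiter of that kind can be sketched as follows; the control update rate fs_control is an assumption, the 8 dB per second figure is the one quoted above:

        def rate_limit(target_gains_db, fs_control=100.0, max_db_per_s=8.0):
            # Limit how fast the applied gain may move toward the requested gain,
            # in the spirit of rate limiter 356.
            step = max_db_per_s / fs_control
            out, current = [], target_gains_db[0]
            for target in target_gains_db:
                delta = max(-step, min(step, target - current))
                current += delta
                out.append(current)
            return out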
  • The circuitry shown in Figures 10(a) and (b), instead of being implemented in discrete digital circuitry, preferably can be implemented by programming a digital signal processor chip, such as the model DSP 56001 chip manufactured by Motorola, by known means.
  • the control circuitry 378 also helps to keep the envelope of the L-R signal less than the envelope of the L+R signal. That can be important for FM and television broadcasting where governmental agencies, such as the FCC in the United States, often prefer that the broadcast L-R signal be no greater than the L+R signal.
  • the embodiment of the invention disclosed with respect to Figures 10(a) and (b), is particularly useful in connection with the broadcast industry where the spatial effects added by the circuitry can be automatically controlled without the need for constant manual input.
  • The present invention is completely mono-compatible; that is to say, when the present invention is used to enhance the spatial effects in either an FM radio broadcast or a television sound FM broadcast, those receivers which are not equipped with stereo decoding circuitry do not produce any undesirable effects in their reproduction of the L+R signal due to the spatial effects which are added by the present invention to the L-R signal being broadcast.
  • The R/L equalization on line 308 controls the amount of boost provided by filters 310 and 312.
  • That boost is currently set in the range of 0 to 4 dB and more preferably 0 to 1 dB.
  • the boost frequencies of filters 310 and 312 are preferably different, but both in the ranges of 500 Hz to 3 kHz. As previously mentioned, these filters add breadth to the rear image in the sound field.
  • the WARP In input to time delay 328 adjusts the time delay.
  • the time delay is currently set at 3.75 mSec.
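  • A simple streaming delay line of the kind that could stand in for delay element 328 is sketched below in Python; the 48 kHz sample rate is an assumption, and at that rate the 3.75 ms setting corresponds to 180 samples:

        import numpy as np

        class DelayLine:
            # Integer-sample delay; the "WARP In" control would change self.delay.
            def __init__(self, delay_ms=3.75, fs=48_000):
                self.delay = round(delay_ms * 1e-3 * fs)     # 3.75 ms -> 180 samples
                self.buf = np.zeros(self.delay, dtype=float)

            def process(self, block):
                block = np.asarray(block, dtype=float)
                padded = np.concatenate([self.buf, block])
                self.buf = padded[len(padded) - self.delay:]  # keep the tail for next call
                return padded[: len(block)]                   # output is the delayed stream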
  • the manual mode of operation of the present invention will be very important for the recording industry and for the production of theater, concerts and the like, that is, in those applications in which large multichannel sound mixing panels are currently used.
  • Such audio equipment usually has a reasonable number of audio inputs, or audio channels, each of which is essentially mono.
  • the sound recording engineer has control of not only the levels of each one of the channels but also, in the prior art, uses a pan control to control how much of the mono signal coming into the sound board goes into the left channel and how much goes into the right channel. Additionally, the engineer can control the amount of the signal going to the rear left and rear right channels on a quad buss audio board.
  • the pan control of the prior art permits a sound source point image 32 to be located anywhere on the line between the left and right speakers 12 and 14 depending on the position of the pan control.
  • stereo recording was a large improvement over the mono recordings of forty years ago.
  • the present invention provides audio engineers with such capabilities.
  • the audio engineer will be provided with a joystick by which he or she will be able to move the sound image both left and right and front and back at the same time.
  • the joystick can be kept in a given position during the course of an audio recording session, a theatrical or concert production, or alternatively, the position of the joystick can be changed during such recording sessions or performances. That is to say, the image position of the sound can be moved with respect to a listener 16 to the left and right and forward and back, as desired.
  • the effective position of the joystick can be controlled by a MIDI interface.
  • the present invention will likely be packaged as an add-on device which can be used with conventional audio mixing boards. In the future, however, the present invention will likely find its way into the audio mixing board itself, the joystick controls (discussed above) being substituted for the linear pan control of present technology audio mixing boards.
  • Figure 11 shows the outward configuration of audio components using the present invention which can be used with conventional audio mixing boards known today.
  • the device has twenty-four mono inputs and twenty-four joysticks, one for each input.
  • the equipment comprises a control box 400 and a number of joystick boxes 410 which are coupled to the control box 400 by a data line 420.
  • the joystick box 410 (shown in Figure 11) has eight joysticks associated with it and is arranged so that it can be daisy-chained with other joystick boxes 410, coupling with the control box 400 by data cable 420 in a serial fashion.
  • the joystick box 410 could have all twenty-four joysticks, one for each channel, and, moreover, the number of joysticks and channels can be varied as a matter of design choice. At present it is preferred to package the invention as shown, with eight joysticks in one joystick box 410. In due time, however, it is believed that this invention will work its way into the audio console itself, wherein the joysticks will replace the panning controls presently found on audio consoles.
  • This embodiment of the invention has enhanced processed, left and right outputs 430 and 432 wherein all the inputs have been processed left and right, front and back, according to the position of the respective joysticks 415. These outputs can be seen on control box 400. Unprocessed outputs are also preferably provided in the form of a direct left 434, a direct right 436, a direct front 438 and a direct back 440 output, which are useful in some applications where the mixing panel is used downstream of the control box, and the audio engineer then has the ability to mix processed left and/or right outputs, with unprocessed outputs, when desired.
  • Figures 12(a) - 12(d) form a schematic diagram of the invention, particularly as it may be used with respect to the joystick embodiment.
  • In Figures 12(a)-(d), twenty-four inputs are shown at numerals 405-1 through 405-24.
  • Each input 405 is coupled to an input control circuit 404 associated with that input. Since, in this embodiment, there are twenty-four inputs 405, there are twenty-four input control circuits 404-1 through 404-24. Only one of them, namely 404-1, is shown in detail, it being understood that the others, namely 404-2 through 404-24, are preferably of the same design as 404-1.
  • the input control circuitry 404 is responsive to the position of its associated joystick for the purpose of distributing the incoming signal at input 405, on to bus 408.
  • Each joystick provides conventional X and Y DC voltage signals indicative of the position of the joystick. These signals are converted to digital data, the data being used to address six look-up tables, a look-up table being associated with each of the voltage controlled amplifiers (VCAs) 407 which comprise an input circuit 404.
  • The value in the table for particular X and Y coordinates of the joystick indicates the gain of its associated VCA 407.
  • The digital output of the look-up table is converted to an analog signal for its associated VCA 407.
  • Each VCA 407 has a gain between unity and zero, depending on the value of the analog control voltage signal.
  • 0% to 100% of the signal being input at input 405-1 finds its way on to the various lines forming bus 408 depending upon the position of joystick 415-1.
  • input 405-2 has its input distributed amongst the various lines making up bus 408, depending upon the position of its joystick 415-2. The same thing is true for the remaining inputs and remaining joysticks.
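  • The joystick-to-gain mapping can be sketched in Python as below. The axis conventions, the table size and the placeholder table contents are assumptions; the real values are those of Tables A through F / Figures 13(a)-13(f):

        import numpy as np

        def joystick_gains(x, y, tables, n=16):
            # x runs -1 (full left) to +1 (full right); y runs -1 (back) to +1 (front).
            # The digitised position is quantised to indices into each look-up table,
            # and each table returns the gain (0.0-1.0) for its associated VCA 407.
            ix = int(np.clip(round((x + 1) / 2 * (n - 1)), 0, n - 1))
            iy = int(np.clip(round((y + 1) / 2 * (n - 1)), 0, n - 1))
            return {name: float(table[iy, ix]) for name, table in tables.items()}

        # Placeholder table for VCA 407L only: full output at front-left, fading
        # toward back-right, loosely echoing the description above.
        n = 16
        ix_grid, iy_grid = np.meshgrid(np.arange(n), np.arange(n))
        table_l = 0.15 + 0.85 * ((n - 1 - ix_grid) + iy_grid) / (2 * (n - 1))
        print(joystick_gains(-1.0, 1.0, {"L": table_l}))   # hard left, full front -> {'L': 1.0}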
  • the distribution of the signals is controlled somewhat by the position of a switch 409, whose function will be discussed in due course.
  • Table A and Figure 13(a), which are associated with VCA 407L, indicate that VCA 407L outputs 100% of the inputted signal when its associated joystick is moved to the position maximum left and maximum front.
  • the outputted signal from VCA 407L drops to under 20% of the inputted signal when the joystick is moved to its maximum right, maximum back (or rear) position.
  • Other positions of the joystick cause VCA 407L to output the indicated percentage of the inputted signal at 405.
  • VCA 407L receives a control voltage input VCX-L for controlling the amount of the input signal at 405 which finds its way onto bus 408L.
  • VCA 407R controls the amount of input signal at 405 which finds its way onto line 408R.
  • The voltage controlled amplifiers 407 in the remaining input circuits 404-2 through 404-24 are also coupled to bus 408 in a similar fashion and, thus, the currents supplied by the voltage controlled amplifiers 407 are summed onto that bus structure.
  • the various input signals 405-1 through 405-24 are steered, or mapped, onto the appropriate line of bus 408 depending upon the position of the respective joysticks 415-1 through 415-24.
  • the signals on lines comprising bus 408 are then converted back into voltages by summing amplifiers 409, each of which is identified by subscript letter or letters corresponding to the line of bus with which they are coupled.
  • the outputs of summing amplifiers 409L, 409R, and 409F are applied directly to three of the four direct outputs, 434, 436 and 438, respectively.
  • the direct back output 440 is summed from the output of the summing amplifiers 409CDL, 409CDR, 409EL and 409ER.
  • Each input 405 is controllable from a controller 410. See, for example, Figure 11, where for each joystick 415 there is a mode switch 411 which can be repeatedly pushed to change from mode C, to mode D, to mode EL, to mode ER, and then back to mode C.
  • switches 409L and 409R are in the position shown in Figure 12(c).
  • Switch 409L changes position when in mode EL.
  • Switch 409R changes position when in mode ER.
  • Light emitting diodes (LED's) 412L and 412R of Figure 11 indicate, for each channel, the mode in which the controller is operating.
  • LED's 412L and 412R may both be amber while in mode C and may both be green while in mode D; while in mode EL, the left LED (412L) would preferably be red while the right LED (412R) would be off, with the opposite convention when in mode ER.
  • Mode C is preferably used for live microphone array recording of instruments, groups, ensembles, choirs, sound effects and natural sounds, where a microphone array can be placed at the locations shown in Figure 5.
  • Mode D is a directional mode which places a mono source at any desired location within the listener's conceptual image space, shown in Figure 4, for example. Applications of mode D are in multi-track mix-down, commercial program production, dialogue and sound effects, and concert sound reinforcement.
  • Mode E expands a stereo source and, therefore, each input is associated with either a left channel (mode EL), or a right channel (mode ER) of the stereo source.
  • This mode can be used to simulate stereo from mono-sources and allows placement within the listener's conceptual image space, as previously discussed. Its applications are the same as for mode D.
  • The outputs from summing amplifiers 409CDL and 409CDR correspond to the back left and back right signals for the C and D modes.
  • the signals are applied to a stereo analog-to-digital converter 412CD which multiplexes its output onto line 414CD.
  • Stereo analog-to-digital converter 412E takes the E-mode back left and E-mode back right analog data, and converts it to multiplexed digital data on line 414E.
  • the digital data on lines 414CD and 414E are applied to digital sound processors (DSP's) 450 which will be subsequently described with reference to Figures 14(a) and (b).
  • the audio processors may be identical, and may receive external data for the purpose of determining whether they operate in the C-mode, D-mode or E-mode, as will be described.
  • The programming of the digital sound processor (DSP) 450 can be done at a mask level, or it can be programmed in a manner known in the art by a microprocessor attached to a port on DSP 450, which microprocessor then downloads data stored in EPROMs or ROMs into the DSP 450 during initial power-up of the apparatus.
  • The current preference is to use model 56001 DSPs manufactured by Motorola.
  • the programming emulates the digital logic shown in Figures 14(a) and (b).
  • the outputs from the DSP 450 chips are again converted back to analog signals by stereo digital to analog converters 418CD and 418E.
  • the outputs of stereo digital to analog converters 418CD and 418E are summed along with outputs from the mono compatibility channel, the front channel 409F, the right channel 409R and the left channel 409L, through summing resistors 419, before being applied to summing amplifiers 425L and 425R and thence to processed stereo outputs 430 and 432.
  • the summing resistors 419 all preferably have the same value.
  • The mono compatibility signal from summing amplifier 409M is applied to a low and high-pass equalization circuit which preferably has a low Q, typically a Q on the order of 0.2 or 0.3, centered around 1,700 Hz.
  • Equalization circuit 422 typically has a 6 dB loss at 1,700 Hz.
  • The processed directional enhancement information, i.e., the conditioning signal C, is band pass filtered by filters 456 and 457, for example, so that it peaks in the mid-range. If the enhanced left and right signals are summed together to form an L+R mono signal, this can show up as a notch in the spectrum in that mid-range area.
  • the mono compatibility signal, which has a notch that is the antithesis of the mid-range notch, is therefore preferably used; in effect, it balances the output spectrum of an L+R mono signal.
  • Figures 12(a), 12(b), 14(a) and 14(b) also include another mode (mode D) which, as will be seen, causes certain changes to be made to the audio processor logic of Figures 14(a) and (b) compared to the audio processor logic of Figures 10(a) and (b).
  • the incoming serial data which was multiplexed onto line 414 is de-multiplexed by formatter 451.
  • the stereo A to D converters 412 (see Figure 12(d)) are Crystal Semiconductor model CS5328 chips, while the stereo D to A converters 418 (see Figure 12(d)) are Crystal Semiconductor model CS4328 chips and, therefore, formatters 451 and 470 are configured for the corresponding serial data formats.
  • the left and right digital data is separated onto buses 452 and 453, and is communicated to, for example, a subtractor 454, to produce an R-L signal.
  • the R-L signal passes through the low pass and high pass filters 456 and 457 and the time delay circuit 458, when the circuit is connected in the E-mode as depicted by switch 455 (which is controlled by an E-mode control signal).
  • otherwise, switches 455 take the other position shown in the schematic diagram and, therefore, the left channel digital data on line 452 is passed through the top set of high pass and low pass filters 456 and 457 and the time delay circuit 458, while the right channel digital data on line 453 is directed through the lower set of high pass and low pass filters 456 and 457 and time delay circuit 458.
  • the outputs of time delay circuits 458 are applied to respective left and right channel equalization circuits 460.
  • the output of the left equalization circuit 460L is applied via a switch 462 to an input of formatter 470.
  • the output of the right equalization circuit 460R is applied via a switch 462 and an invertor 465 to an input of formatter 470.
  • formatter 470 multiplexes the signals received at its inputs onto serial output line 416.
  • Switches 462 are shown in the C-mode position, which has been previously described. When in the D-mode or the E-mode, the switches 462 change position so as to communicate the outputs of the equalizers 460 to the formatter and invertor 465, as opposed to communicating the unfiltered signals on lines 452 and 453 which is done when in the C-mode.
  • the inversion which occurs in the right channel information by invertor 465 is effectively done by subtractor 332 in the embodiment of Figures 10(a) and (b). It is to be recalled that subtractor 332 subtracts the right channel conditioning information CR (from equalizer 312) from the right channel audio data. In the embodiment of Figures 14(a) and (b), the right channel conditioning signal is inverted by invertor 465.
  • the left channel conditioning signal CL is communicated, without inversion, via formatter 470 and the stereo digital to analog converter 418 (see Figure 12(d)) onto a summing bus where it is summed through resistors 419, along with the left channel information from summing amp 409L, into an input of summing amp 425L.
  • the invention has been described with respect to both analog and digital implementations, and with respect to several modes of operation.
  • the broadcast mode, mode B, uses a feedback loop to control the amount of processing being added to stereo signals.
  • the amount of processing being added is controlled manually.
  • the amount of processing is controlled by joystick input.
  • the conditioning signal, which is added to and subtracted from the left and right channel data, undergoes little or no processing. Indeed, no processing is required if the conditioning signal is organically produced by the location of microphone "C" in Figure 5.
  • the conditioning signal bypasses the high pass/low pass filters and the time delay circuitry.
  • the conditioning signals are synthesized by the high pass/low pass filter and, preferably, the time delay.
  • in the E-mode, it is an R-L signal which is subjected to filtering.
  • in the D-mode, the left and right signals are independently subjected to filtering, for the purpose of generating the conditioning signal C.
  • time delay is controllable. Indeed, some practicing the instant invention may do away with time delay altogether. However, time delay is preferably inserted to de-correlate the signal exiting the filters from the left and right channel information to help ensure mono compatibility. Unfortunately, comb filtering effects can be encountered, but these seem to be subjectively minimized by filters 456, 457, and 460. In order to minimize such effects in the B, D and E-modes of operation, it is preferred to use a time delay circuit such as 328 (Figures 10(a) and (b)) or 458 (Figures 14(a) and (b)). In the organic mode (Figure 6), the time delay is organically present due to the placement of microphone "C" farther from the sound source than microphones "L" and "R".
  • the present invention can be used to add spatial effects to sound for the purposes of recording, broadcasting, or public performance. If the spatial effects of the invention are used, for example, in audio processing at the time of mixing down a multi-track recording to stereo for the purposes of release of tapes, records or digital discs, when the tape, record or digital disc is played back on conventional stereo equipment, the enhanced spatial effects will be perceived by the listener. Thus, there is no need for additional spatial treatment of the sound after it has been broadcast or after it has been mixed down and recorded for public distribution on tapes, records, digital discs, etc. That is to say, there is no need for the addition of spatial effects at the receiving apparatus or on a home stereo set. The spatial effects will be perceived by the listener whether they are broadcast or whether they are heard from a prerecorded tape, record or digital disc, so long as the present invention was used in the mixdown or in the broadcast process.
  • the present invention, while adding spatial expansion to the stereo signals, does not induce artifacts of the process in the L+R signal.
  • Digital delay devices can delay any frequency for any time length.
  • Linear digital delays delay all frequencies by the same duration.
  • Group digital delays can delay different groups of frequencies by different durations.
  • the present invention preferably uses linear digital delay devices because the effect works using those devices and because they are less expensive than are group devices.
  • group devices may be used, if desired.
  • Figure 15 is a schematic diagram of a sixth embodiment of the invention, which embodiment can be relatively easily implemented using a single semiconductor chip and which may be used in consumer quality electronics equipment, including stereo reproduction devices, television receivers, stereo radios and personal computers, for example.
  • the circuit 500 has two inputs, 501 and 502 for the left and right audio channels found within a typical consumer quality electronic apparatus.
  • the signals at input 501 are communicated to two amplifiers, namely amplifiers 504 and 505.
  • the signals at input 502 are communicated to two amplifiers, namely 504 and 506.
  • the left and right channels are subtracted from each other at amplifier 504 which produces an output L-R. That output is communicated to a potentiometer 503 which communicates a portion (depending upon the position of potentiometer 503) of the L-R signal back through a band pass filter 507 formed by a conventional capacitor and resistor network.
  • Filter 507, in addition to band-passing the previously mentioned mid-range frequencies, also adds some time-delay to that signal, which is subsequently applied to an input of amplifier 508.
  • the output of amplifier 508 is the conditioning signal, C, which appears on line 509.
  • the conditioning signal, C is added to the left channel information at amplifier 505 and is subtracted from the right channel information at amplifier 506 and thence output as spatially enhanced left and right audio channels 511 and 512, respectively. These outputs may then be conveyed to the inputs of the power amplifier of the consumer quality audio apparatus and thence to loudspeakers, in the usual fashion.
  • the listener controls the amount of enhancement added by adjusting potentiometer 503.
  • with the wiper of potentiometer 503 turned fully down, the stereo audio program will be heard with its usual un-enhanced sound.
  • as the wiper of potentiometer 503 is adjusted to communicate more and more of the L-R signal to the band pass filter 507, more and more additional spatially processed stereo is perceived by the listener. For example, if the listener happens to be watching a sporting contest on television which is broadcast with stereo sound, by adjusting potentiometer 503, the listener will start to perceive that he or she is actually sitting in the stadium where the sporting contest is occurring due to the additional spatial effects which are perceived and interpreted by the listener.
  • circuitry of Figure 15 is shown with essentially discrete components with the exception of the amplifiers, which are preferably National Semiconductor model LM837 devices. However, those skilled in the art will appreciate that all (or most) of the circuit 500 can be reduced to a single silicon chip, if desired. Those skilled in the art will also appreciate that capacitors C1 and C2 in band pass filter 507 will tend to be rather large if implemented on the chip and, therefore, it may be desirable to provide appropriate pin-outs from the chip for those devices and to use discrete devices for capacitors C1 and C2. That is basically a matter of design choice. (A minimal code sketch of the left/right signal flow of circuit 500 appears immediately after this list.)
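By way of illustration only, the listener-controlled signal flow of circuit 500 can be sketched numerically as follows. The band edges, filter order, sample rate and gain value below are assumptions chosen for the example; they are not component values taken from this disclosure.

```python
import numpy as np
from scipy.signal import butter, lfilter

def enhance(left, right, k, fs=44100, band=(300.0, 7000.0)):
    """Sketch of the Figure 15 signal flow: C = k * bandpass(L - R),
    then L_out = L + C and R_out = R - C.  The band edges, filter order
    and sample rate are illustrative assumptions."""
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    b, a = butter(2, [band[0] / (fs / 2), band[1] / (fs / 2)], btype="band")
    c = k * lfilter(b, a, left - right)   # amplifier 504 (L - R), pot 503, filter 507
    return left + c, right - c            # amplifiers 505 and 506

# usage: a mono source panned slightly left, enhanced with the control half way up
fs = 44100
t = np.arange(fs) / fs
src = np.sin(2 * np.pi * 440.0 * t)
l_out, r_out = enhance(0.7 * src, 0.5 * src, k=0.5, fs=fs)
```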

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Acoustics & Sound (AREA)
  • Signal Processing (AREA)
  • Stereophonic System (AREA)

Abstract

The manipulation system and apparatus receive electronic signals which are to be processed and recorded for playback as enhanced stereophonic audio signals from two laterally spaced loudspeakers in front of the listener, either directly, after recording, or after being broadcast. The system and apparatus process those signals to produce a conditioning signal, such as would be produced by virtual room boundaries, which is heard together with the original signals so that an enlarged listening area is perceived by the listener. By amplitude and phase control of the signal to the two real speakers, the system and apparatus provide a means for control over the enhanced sound field. This enhanced sound field is perceived by the listener as being contained within boundaries larger than those normally reproduced by stereophonic speakers. The system and apparatus generate a conditioning signal for the enhancement of natural, and generation of artificial, spatial qualities present in stereo signals usually masked in the acoustic environment in which reproduction takes place, through generation of phantom boundaries. The apparatus can monitor its own output and shut down or reduce the effects if the output contains qualities that cannot be broadcast. The apparatus provides self-adjustment in the electronic system to maintain spatial masking reversal at a constant value regardless of program material.

Description

STEREOPHONIC MANIPULATION APPARATUS AND METHOD FOR SOUND IMAGE ENHANCEMENT
BACKGROUND OF THE INVENTION
This invention is directed to an automatic stereophonic image enhancement method and apparatus wherein the electronic signal which corresponds to the audio signal is electronically treated by amplitude and phase control to produce a perception of enhancements to stereophonically reproduced music.
Sound is vibration in an elastic medium, and acoustic energy is the additional energy in the medium produced by the sound. Sound in the medium is propagated by compression and refraction of the energy in the medium. The medium oscillates, but the sound travels. A single cycle is a complete single excursion of the medium, and the frequency is the number of cycles per unit time. Wavelength is the distance between wave peaks, and the amplitude of motion (related to energy) is the oscillatory displacement. In fluids, the unobstructed wave front spherically expands.
Hearing is the principal response of a human subject to sound. The ear, its mechanism and nerves receive and transmit the hearing impulse to the brain which receives it, compares it to memory, analyzes it, and translates the impulse into a concept which evokes a mental response. The final step in the process is called listening and takes place in the brain; the ear is only a receiver. Thus, hearing is objective and listening is subjective. Since the method and apparatus of this invention is for the automatic stereophonic image enhancement for human listening, the listening process is in perceptions of hearing. This patent describes the perceptions of human subjects. Because a subject has two ears, laterally spaced from each other, the sound at each eardrum is nearly always different. Some of the differences are due to the level, amplitude or energy, while others are due to timing or phase differences. Each ear sends a different signal to the brain, and the brain analyzes and compares both of the signals and extracts information from them, including information in determining the apparent position and size of the source, and acoustic space surrounding the listener.
The first sound heard from a source is the direct sound which comes by line-of-sight from the source. The direct sound arrives unchanged and uncluttered, and lasts only as long as the source emits it. The direct sound is received at the ear with a frequency response (tonal quality) which is relatively true to the sound produced by the source because it is subject only to losses in the fluid medium (air). The important transient characteristics such as timbre, especially in the higher registers, are conveyed by direct sound. The integral differences at each eardrum are found in time, amplitude and spectral differences. The physical spacing of the ears causes one ear to hear after the other, except for sound originating from a source on the median plane between the ears. The time delayed difference is a function of the direction from which the sound arrives, and the delay is up to about 0.8 millisecond. The 0.8 millisecond time delay is about equal to the period of 1 cycle at 1,110 Hz. Above this frequency, the acoustic wavelength of arriving sounds becomes smaller than the ear-to-ear spacing, and the interaural time difference decreases in significance so that it is useful only below about 1,400 Hz to locate the direction of the sound. The difference in amplitude between the sound arriving at the two ears results principally from the diffracting and shadowing effect of the head and external ear pinna. These effects are greater above 400 Hz and become the source of information the brain interprets to determine the direction of the source for higher frequencies. Other clues to elevation and direction of the sound derive from our practice of turning our head during the sound direction evaluation process. This changes the relative amplitude and time difference to provide further data for mental processing to evaluate direction. Both processes are frequency dependent, but it has been shown that the time difference is more useful with transient portions of sound while both are used for evaluation of the source direction of continuous signals.
In human listening, memory plays an important role in the evaluation of sound. The brain compares the interaural temporal difference, interaural amplitude difference, interaural spectral difference, as well as the precedence effect, and temporal fusion, to be described later, with memories of the same factors. The brain is constantly comparing present perceptions with stored impressions so that those signals which are currently being received are compared with memory to provide a conception of the surrounding activity. In listening, the combination of the sound as perceived and the memory of similar events together produce a mental image of an aural conceptual geometrical framework around us associated with the sources of sound, thus becoming a conceptual image space. In the conceptual image space, what is real and what seems to be real are the same. The present system and apparatus is directed toward generating a conceptual image space which seems to be real but, from an objective evaluation, is an illusion.
In an apparatus where there are two, spaced loudspeaker sound sources in front of the observer, with the observer centered between them, the production of substantially the same sound from each speaker, in phase and of the same amplitude, will present to the observer a virtual sound image midway between the two speakers. Since the sound source is in phase, this virtual sound image will be called a "homophasic image". By changing the relative amplitude, the homophasic image can be moved to any point between the two speakers. In conventional professional processing of sound signals, this moving action is called "panning" and is controlled by a pan pot (panoramic potentiometer).
An equally convincing virtual sound image can be heard if the polarity is reversed on one of the signals sent to one of the same two loudspeakers. This results in a 180 degree phase shift for the sound from that speaker reaching the ears. For simplification, the first 0 degree retarded phase-shifted signal from the left speaker first reaches the left ear and later the right ear; simultaneously, the second 180 degree retarded phase-shifted signal from the right speaker first reaches the right ear and later the left ear, providing information to the ear-brain mechanism which manifests a virtual sound image to the rear of the center point of the listener's head. This virtual image is the "antiphasic" image. Since it is a virtual image created by mental process, the position is different for different listeners. Most listeners hear the antiphasic image as external and to the rear of the skull; the antiphasic image does not manifest itself as a point source, but is diffused and forms the rear boundary of the listener's conceptual image space. By changing the phase relationship and/or amplitude of various frequencies of the left and right signals, virtual images can be generated along an arc or semicircle from the back of the observer's head toward the left or right speakers.
Another factor which influences the perception of sound is the "precedence effect" wherein the first sound to be heard takes command of the ear-brain mechanism, and sound arriving up to 50 milliseconds later seems to arrive as part of and from the same direction as the original sound. By delaying the signal sent to one speaker, as compared to the other, the apparent direction of the source can be changed. As part of the precedence effect, the apparent source direction is operative through signal delay for up to 30 milliseconds. The effect is dependent upon the transient characteristics of the signal.
An intrinsic part of the precedence effect, yet an identifiably separate phenomenon, is known as "temporal fusion" which fuses together the direct and delayed sounds. The ear-brain mechanism blends together two or more very similar sounds arriving at nearly the same time. After the first sound is heard, the brain suppresses similar sounds arriving within about the next 30 milliseconds. It is this phenomenon which keeps the direct sound and room reverberation all together as one pleasing and natural perception of live listening. Since the directional hearing mechanism works on the direct sound, the source of that sound can be localized even though it is closely followed by multiple waves coming from different directions.
The walls of the room are reflection surfaces from which the direct sound reflects to form complex reflections. The first reflection to reach the listener is known as a first order reflection; the second, as second order, etc. An acoustic image is formed which can be considered as coming from a virtual source situated on the continuation of a line linking the listener with the point of reflection. This is true of all reflection orders. If we generate signals which produce virtual images, boundaries are perceived by the listener. This is a phenomenon of conditioned memory. The position of the boundary image can be expanded by amplitude and phase changes within the signal generating the virtual images. The apparent boundary images broaden the perceived space.
Audio information affecting the capability of the ear-brain mechanism to judge location, size, range, scale, reverberation, spatial identity, spatial impression and ambiance can be extracted from the difference between the left and right source. Modification of this information through frequency shaping and linear delay is necessary to produce the perception of phantom image boundaries when this information is mixed back with the original stereo signal at the antiphasic image position.
SUMMARY OF THE INVENTION
The common practice of the recording industry, for producing a stereo signal, is to use two or more microphones near the sound source. These microphones, no matter how many are used, are always electrically polarized in-phase. When the program source is produced under these conditions (which are industry standard), the apparatus described herein generates a "synthetic" conditioning signal for establishment of a third point with its own time domain. This derivation is called synthetic because there is a separation, alternation and regrouping to form the new whole.
To further help establish a point with a separate time domain, a third microphone may be used to define the location of the third point in relation to the stereo pair. Contrary to the normal procedure of adding the output of a third microphone to the left and right side of the stereo microphone pair, the third microphone is added to the left stereo pair and subtracted from the right stereo pair. This arrangement provides a two-channel stereo signal which is composed of a left signal, a right signal, and a recoverable signal which has its source at a related but separate position in the acoustic space being recorded. This is called organic derivation and it compares to the synthetic situation discussed above, where the ratios are proportional to the left minus the right (from which it was derived) but it is based on its own time reference, which is, as will be seen, related to the spacing between the three microphones. The timing of the organic conditioning signal is contingent upon the position of the original sound source with respect to the three microphones. The information derived more closely approximates the natural model than that of the synthetically derived conditioning signal.
Control over either the organic or synthetic situations, the processing thereof, and the generation of a conditioning signal therefrom will produce an expanded listening experience. All sources of sound recorded with two or more microphones in synthetic or organic situations contain the original directional cues. When acted upon by the apparatus of this invention, a portion of the original directional cues are isolated, modified, reconstituted and added, in the form of a conditioning signal, to the original forming a new whole. The new whole is in part original and in part synthetic. The control of the original-to-synthetic ratio is under the direction of the operator via two operating modes:
(1) Space, in which the ratio is constant. Synthetic is directly proportional to the original and, therefore, enhancement depends upon the amount of original information present in the stereo program material.
(2) Auto Space, in which the ratio is electrically varied. Synthetic is inversely proportional to the original and, therefore, the enhancement is held at a constant average regardless of program material.
When a stereo recording is reproduced monophonically, it is said to be compatible if the overall musical balance does not change. The dimensionality of the stereo recording will disappear when reproduced monophonically but the inner-instrumental balance should remain stable when L+R (i.e., the left plus right sources have been combined to monophonic sound, also called L=R).
The compatibility problem arises because monophonic or the L+R signal broadcast in a conventional stereo broadcast does not contain the total information present in the left and right sources. When combined as such, it contains only the information of similarity in vectorial proportion. The differential information is lost. It is possible for the differential signal to contain as much identity about the musical content of a source as does the summation signal.
Since differential information will be lost in left plus right combining, directional elements should comprise most of the differential signal. Directional information will be of little use in monophonic reproduction and its loss will be of no consequence with respect to musical balance. Therefore, additional dimensional or spatial producing elements must be introduced in such a way that their removal in L+R combining will not destroy the musical balance established in the original stereophonic production.
Insertion of the conditioning signal at the antiphasic image position produces enhancement to and generation of increased spatial density in the stereo mode but is completely lost in the mono mode where the directional information will be unused. Information which can be lost in the mono mode without upsetting the inner-instrument musical balance includes clues relating to size, location, range and ambience but not original source information.
To accomplish this, directional information is obtained exclusively from the very source which is lost in the monophonic mode, namely, left signal minus right signal. Whether in the synthetic or organic model derivation of a conditioning signal, subtracting the left signal from the right signal and reinserting it at the antiphasic position will not challenge mono/stereo compatibility, providing that the level of conditioning signal does not cause the total RMS difference energy to exceed the total RMS summation energy at the output.
In order to aid in the understanding of this invention, it can be stated in essentially summary form that it is directed to a stereophonic image enhancement system and apparatus wherein a conditioning signal is provided and introduced into electronic signals which are to be reproduced through two spaced loudspeakers so that the perceived sound frame between the two loudspeakers is an open field which at least extends toward the listener from the plane between the loudspeakers and may include the perception of boundaries which originate to the side of the listener. The conditioning signal may be organic, if the original sound source is appropriately miked, or it may be derived from the left and right channel stereo signals.
In one aspect, the present invention provides an automatic stereophonic image enhancement system and apparatus wherein two channel stereophonic sound is reproduced with signals therein which generate a third image point with which boundary image planes can be perceived within the listening experience resulting in an extended conceptual image space for the listener.
In another aspect, the present invention provides a stereophonic image enhancement system which includes automatic apparatus for introducing the desired density of conditioning signal regardless of program content into the electronic signal which will be reproduced through the two spaced speakers.
It is another objective to provide an automatic stereophonic image enhancement system and apparatus wherein the inner-instrumental musical balance remains stable when heard in monophonic or stereophonic modes of reproduction.
It is another objective to provide a monophonically compatible automatic stereophonic image enhancement system and apparatus wherein the operator can be readily trained to employ the system and apparatus to achieve desirable recordings with enhanced conceptual image space.
The features of the present invention which are believed to be novel are set forth with particularity in the appended claims. The present invention, both as to its organization and manner of operation, together with further objects and advantages thereof, may be best understood by reference to the following description, taken in conjunction with the accompanying drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 is a perspective view of a listener facing two spaced loudspeakers, and showing the outline of an enclosure.
Figure 2 is a schematic plan view of the perception of a sound frame which includes a synthetic conditioning signal which is included in the signals to the speakers.
Figure 3 is a schematic plan view of the perceived open field sound frame where an organic conditioning signal is introduced into the signal supplied to the speakers.
Figure 4 is a schematic plan view of the open field sound frame, as perceived from the listener's point of view, as affected by various changes within the conditioning signal.
Figure 5 is a schematic plan view of a sound source and microphone placements which will organically produce a conditioning signal.
Figure 6 is a simplified schematic diagram of a circuit which combines the organically derived conditioning signal with the left and right channel signals.
Figures 7(a) and 7(b) form a schematic electrical diagram of the automatic stereophonic image enhancement system and apparatus in accordance with this invention.
Figure 8 is a schematic electrical diagram of an alternate circuit therefore.
Figure 9 is a front view of the control panel for the apparatus of Figure 8.
Figures 10(a) and 10(b) form a digital logic diagram of a digital embodiment of the invention.
Figure 11 is a front view of a joystick box, a control box, and an interconnecting data cable 420 which can be used to house the embodiment of the invention described with reference to Figures 12(a), 12(b), 13(a)-13(f), 14(a) and 14(b).
Figures 12(a) through 12(d) form a schematic diagram of an embodiment of the invention wherein joysticks may be used to move a sound around in a perceived sound field.
Figures 13(a)-13(f) are graphical representations of the control outputs which are generated by the joysticks and associated circuitry and applied to voltage controlled amplifiers of Figures 12(a)-12(d).
Figures 14(a) and 14(b) form a digital sound processor logic diagram similar to that of Figures 10(a) and 10(b), but adapted for use as the digital sound processor 450 in Figures 12(a) - 12(d).
Figure 15 is a schematic diagram of an embodiment of the invention which is adapted for use in consumer-quality audio electronics apparatus, of the type which may be used in the home, etc.
BRIEF DESCRIPTION OF THE TABLES
Tables A through F set forth the data which is graphically presented in Figures 13(a)-(f), respectively.
DETAILED DESCRIPTION OF EMBODIMENTS OF THE INVENTION
Figure 1 illustrates the usual physical arrangement of loudspeakers for monitoring of sound. It should be understood that in the recording industry sound is "monitored" during all stages of production. It is "reproduced" when production is completed and the product is in the market place. At that point and on, what is being reproduced is the production. Several embodiments of the invention are disclosed. Some embodiments are intended for use during sound production, while one embodiment is intended for use during sound reproduction, in the home, for example. Embodiments of the invention include the system and apparatus illustrated in a first embodiment in Figures 5 and 6, a second embodiment 10 in Figure 7, a third embodiment 202 in Figure 8, a fourth embodiment of Figures 10(a) and 10(b), and a fifth and presently preferred embodiment (for professional studio use) in Figures 11, 12(a), 12(b), 13(a)-13(f), 14(a) and 14(b). These embodiments may be employed in record, compact disc, mini-disc, cassette, motion picture, video and broadcast production, to enhance the perception of sound by human subjects, i.e. listeners. Another and sixth embodiment, which is disclosed with reference to Figure 15, may be used in a consumer quality stereo sound apparatus found in a home environment, for example.
During monitoring of sound for sound production, the two loudspeakers 12 and 14 are of suitable quality with enclosures to produce the desired fidelity. They are laterally spaced, and the listener 16 faces them and is positioned substantially upon a normal plane which bisects the line between the speakers 12 and 14. Usually, the listener is enclosed in a room, shown in phantom lines, with the loudspeakers. During reproduction, the two loudspeakers may be of any quality. The loudspeaker and listener location is relatively unimportant. During monitoring, the effect is one of many separate parts being blended together. Hence, monitoring requires a standard listening position for evaluating consistency, whereas during reproduction, the effect has become one with the whole sound and can be perceived from any general location.
Since several embodiments of the apparatus are designed as a production tool, the loudspeakers 12 and 14 should be considered monitors being fed from an electronic system which includes the sound production enhancement apparatus of this invention. The electronic system may be a professional recording console, multi-track or two-track analogue, or digital recording device, with a stereophonic two-channel output designated for recording or broadcasting. The sound source material may be a live performance or it may be recorded material in a combination of the foregoing.
Figure 2 illustrates the speakers 12 and 14 as being enclosed in what is perceived as a closed field sound frame 24 (without the lower curved lines 17 and 26) which is conventional for ordinary stereophonic production. By varying the amplitude between the speakers 12 and 14, the apparent source can be located anywhere within the sound frame 24, that is, between the speakers. When a synthetic conditioning signal is delayed before reinsertion at the antiphasic image position 34, amplitude and time ratios 17 are manifested between the three points 12, 14 and 34. Because the antiphasic point 34 is the interdependent product of the left point 12 and the right point 14, the natural model is approached by a synthetic construction, but never fully realized. The result is open field sound frame 26. Listener 16 perceives the open field 26.
Figure 3 illustrates open field sound frame 28 which is perceived by listener 16 when a conditioning signal derived, as in Figure 2, is supplied and introduced as part of the signal to speakers 12 and 14, but has as its source an organic situation. The density of spatial information is represented by the curved lines 17 in Figure 2 and is represented by the curved lines 19 in Figure 3. It is apparent that the density of spatial information is greater in Figure 3 because the three points which produced the original conditioning signal are not electrically interdependent but are acoustically interactive; information more closely reflecting the natural model is supplied to the ear-brain mechanism of listener 16.
Figure 4 illustrates the various factors which are sensed by the listener 16 in accordance with the stereophonic image enhancement systems of this invention. The two speakers 12 and 14 produce the closed field sound frame 24 when the speakers are fed with homophasic signals. Homophasic image position 30 is illustrated, and the position can be shifted left and right in the frame 24 by control of the relative amplitude of the speakers 12 and 14. The speakers 12 and 14 produce left and right real images, and a typical hard point image 32 is located on the line between the speakers because it is on a direct line between the real images produced by the two real speakers. As described above, the hard point source image can be shifted between the left and right speakers.
The antiphasic image position 34 is produced by speakers 12 and 14 and may be perceived as a source location behind the listener's head 16 at 34 under test or laboratory demonstrations. Under normal apparatus operating conditions, source 34 is not perceived separately but, through temporal fusion, is the means by which an open field sound frame is perceived. Position 34 is a perceived source, but is not a real source. There is no need for a speaker at position 34. Rather, by controlling the relationship between the antiphasic image position and one or both of the real images all produced by speakers 12 and 14, the image source can be located on a line between one of the real images and the antiphasic image position 34. Since the antiphasic image position 34 is a perceived source, but is not a real source, the point between it and speakers 12 and 14 is considered a soft point source image. Such a soft point source image is shown at point 36. The open field sound frame is thus produced and provides the perception of virtual space boundaries 40, 42, 44 or 46 (not on line), depending on the conditioning signal's phase relationship to the original source. The perceived distance for the virtual space boundaries 40, 42, 44 and 46 from the closest hard point is from 2 to 30 feet (1-10 M), depending on the dimension control setting of Figure 5 and the distance between speakers 12 and 14.
First Embodiment
Figure 5 is a schematic diagram of a sound source which is to be recorded or amplified. Three microphones L, R and C are shown located in front of the source. The L (left) and R (right) microphones are approximately equally spaced from the source on its left and right sides. The C (conditioning) microphone is located further spaced from the source and approximately equally spaced from the L and R microphones.
The signal from the C microphone is adjusted in gain and then is added (at adder A, for example) and subtracted (at subtractor S, for example) from the stereo signals L, R as shown in Figure 6. The resulting signal processed outputs PL and PR, when amplified and applied to speakers 12 and 14 (Figure 1), will produce an expanded sound image as described with reference to Figures 3 and 4. By adjusting the gain of conditioning signal, C, the amount of expansion which occurs can be controlled easily. In this embodiment, the conditioning signal, C, is produced organically, that is, by a microphone array pickup as shown in Figure 5 and connected as shown in Figure 6. There exist many previous stereo recordings for which there was no microphone at location C connected as shown in Figure 6, and thus there would seem to be no simple way of recreating the effect described above. However, as will be seen, the conditioning signal can be created synthetically, and introduced into the left and right channel signals, when (1) the sound source is mixed-down from a prerecorded tape in a recording studio, for example, (2) the sound is broadcast, or (3) when prerecorded sound is received or reproduced in a home environment. The conditioning signal is delayed time-wise and filtered compared to the signals from microphones L and R due to the placement of microphone C.
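A minimal numerical sketch of the Figure 6 arrangement follows; the gain value applied to the conditioning microphone signal is an arbitrary example, not a value taken from the disclosure.

```python
import numpy as np

def organic_mix(mic_l, mic_r, mic_c, c_gain=0.4):
    """Figure 6 signal flow: the gain-adjusted conditioning signal C is added
    to the left channel and subtracted from the right channel; the gain value
    is an arbitrary example."""
    c = c_gain * np.asarray(mic_c, dtype=float)
    pl = np.asarray(mic_l, dtype=float) + c   # adder A -> processed left output PL
    pr = np.asarray(mic_r, dtype=float) - c   # subtractor S -> processed right output PR
    return pl, pr
```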
Second Embodiment
Now considering an embodiment of the apparatus 100 which produces the conditioning signal synthetically, the left input lines 48 and 49 and right input lines 50 and 51 are received from musical signal sources. The system and apparatus 10 is described in this embodiment as being a system which introduces the conditioning signal before the two-channel recording and, thus, is a professional audio laboratory and apparatus. Thus, the left and right inputs 48, 49, 50 and 51 may be the product of a live source or a mixdown from a multiple channel tape produced by the live recording, or it may be a computer generated source, or a mixture of same. The inputs of the apparatus 48, 49, 50 and 51 address the output of the recording console's "quad buss" or "4-track buss". Each position on the recording console can supply each and every buss of the quad buss with a variable or "panned" signal representing a particular position. Two channels 49, 51 of the quad buss are meant for use as stereo or the front half of quadraphonic sound; the other two channels, 48, 50, are for the rear half of quadraphonic sound. Normally, each position or input of a modern recording console has a panning control to place the sound of the input between left, right, front or back via the quad buss. A recording console may have any number of inputs or positions which are combined into the quad buss as four separate outputs. The left front quad buss channel addresses apparatus input 49; the right front quad buss channel addresses apparatus input 51; the left rear quad buss channel addresses apparatus input 48; and, the right rear quad buss channel addresses apparatus input 50. Alternate insertion of the apparatus of Figure 7 is possible in the absence of a quad buss by using the stereo buss plus two effect busses. Left front input 49 (unprocessed) is connected to amplifier 52. Left rear input 48 (to be processed) is connected to amplifier 54. Right rear input 50 (to be processed) is connected to amplifier 56. Right front input 51 (unprocessed) is connected to amplifier 58. The outputs of amplifiers 52 and 58 are connected to adders 60 and 62, respectively, so that amplifiers 52 and 58 effectively bypass the enhancement system 100. The use of the quad buss allows the apparatus to address its function to each input of a live session or each track of recorded multi-track information, separately. This means that, in production, the operator/engineer can determine the space density of each track rather than settling for an overall space density. This additional degree of creative latitude is unique to this apparatus and sets it apart as a production tool.
The amplified left and right signals in lines 68 and 70 are both connected to summing amplifier 72 and differencing amplifier 74. The output in line 76 is, thus, L+R, while differencing amplifier 74 also serves to invert its output so that it appears in line 78 as -(L-R). These sum and difference signals in lines 76 and 78 are added together in adder 60 and generate the left program with a conditioning signal CL which adds additional spatial effects to the left channel. The signal in line 78 also goes through invertor 80 to produce in line 82 the (L-R) signal. Lines 76 and 82 are introduced into adder 62 to generate in its output line 84 the right program with conditioning signal CR. The output lines 79 and 84 from adders 60 and 62 go to the balanced-output amplifiers 86 and 88 for the left output and 90 and 92 for the right output. The output amplifiers are preferably differential amplifiers operating as a left pair and a right pair, with one of each pair operating in inverse polarity with the other half of each pair for balanced line output.
The conditioning signals CL and CR are similar to conditioning signal C of Figure 6, but are synthetically produced. Also, they have somewhat different frequency filtering which tends to broaden the rear sound images, particularly the antiphasic position 34 (Figure 4). Conditioning signals CL and CR are derived from the difference signal -(L-R) in line 78 at the output of differencing amplifier 74. The difference signal in line 78 passes through high pass filter 94 which has a slope of about 18 decibels per octave and a cutoff frequency of about 300 Hz to prevent comb filtering effects at lower frequencies. The filtered signal preferably passes through delay 96 with an adjustable and selectable delay as manually input from manual control 98, which is called "the Dimension Control." The output of the delay 96 goes to voltage controlled amplifier (VCA) 102 which provides level control. The DC control voltage in line 104, which controls voltage controlled amplifier 102, is supplied by potentiometer 106, in the Manual Mode, or by the hereinafter described control circuit in the Automatic Mode. Potentiometer 106 provides a DC voltage divided down from a DC source 107. It functions as a "Space Control" and it effectively controls the amount of expansion of the sound perceived by a listener, i.e., it controls the amount of the conditioning signal which is added and subtracted from the left and right channel signals.
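For illustration, the difference-signal path just described (high pass filtering at about 300 Hz with roughly an 18 dB per octave slope, an adjustable delay, and a level control) might be modeled digitally as follows. The third-order Butterworth high pass is used here only as one way of obtaining an 18 dB per octave slope, and the delay and level values are arbitrary examples, not values taken from the disclosure.

```python
import numpy as np
from scipy.signal import butter, lfilter

def conditioning_path(left, right, fs=44100, delay_ms=8.0, space=0.5):
    """Sketch of the path through filter 94, delay 96 and VCA 102 acting on
    the inverted difference signal; delay_ms and space are arbitrary examples."""
    diff = -(np.asarray(left, dtype=float) - np.asarray(right, dtype=float))
    b, a = butter(3, 300.0 / (fs / 2), btype="high")   # ~18 dB/octave high pass
    filtered = lfilter(b, a, diff)
    d = int(round(delay_ms * 1e-3 * fs))               # "Dimension Control" setting
    delayed = np.concatenate([np.zeros(d), filtered[:len(filtered) - d]])
    return space * delayed                             # "Space Control" level
```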
The output from voltage controlled amplifier 102 in line 108 is preferably connected via left equalizer 110 and right equalizer 112 for proper equalization and phasing for the individual left and right channels, which tends to broaden the rear image. The illustrated equalizers 110 and 112 are of the resonant type (although they could be any type) with a mid-band boost of 2 dB at a left channel center frequency in equalizer 110 of about 1.5 kilohertz and a right channel frequency in equalizer 112 of about 3 kilohertz. After passing through the equalization circuits, the left conditioning signal -CL occurs in line 114 and the right conditioning signal -CR occurs in line 116. The left conditioning signal -CL is added in adder 60. The right conditioning signal in line 116 is connected to invertor 80 where the conditioning signal -CR is added to the difference signal -(L-R) and the sum is added to the sum signal to result in the right signal minus right conditioning signal on line 84 and left signal plus left conditioning signal on line 79.
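The resonant equalizers may be illustrated with a standard peaking (biquad) equalizer section; the Q value and sample rate below are assumptions, with only the approximate center frequencies and the 2 dB boost taken from the description above.

```python
import numpy as np
from scipy.signal import lfilter

def peaking_eq(x, f0, gain_db, fs=44100, q=1.0):
    """Biquad peaking equalizer (RBJ cookbook form) standing in for the
    resonant equalizers 110 and 112; the Q value is an assumption."""
    a_lin = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * np.pi * f0 / fs
    alpha = np.sin(w0) / (2.0 * q)
    b = np.array([1.0 + alpha * a_lin, -2.0 * np.cos(w0), 1.0 - alpha * a_lin])
    a = np.array([1.0 + alpha / a_lin, -2.0 * np.cos(w0), 1.0 - alpha / a_lin])
    return lfilter(b / a[0], a / a[0], np.asarray(x, dtype=float))

# left and right conditioning branches with the stated centers and boost:
# c_l = peaking_eq(c, 1500.0, 2.0)   # equalizer 110, about +2 dB near 1.5 kHz
# c_r = peaking_eq(c, 3000.0, 2.0)   # equalizer 112, about +2 dB near 3 kHz
```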
The automatic control circuit generally indicated at 118 monitors the output signals in lines 79 and 84 and regulates the amount of conditioning signal to keep a Lissajous figure generated on an X-Y oscilloscope, connected to the outputs, relatively constant. The Lissajous figure is a figure displayed on the CRT of an oscilloscope when the two outputs are connected to the sweep and amplitude drives of the oscilloscope. When the Lissajous figure is fairly round, the energy ratio between the sum and difference of the two outputs is substantially equal (a desirable characteristic). Lines 84 and 79 are respectively connected to the inputs of differencing amplifier 120 and adding amplifier 122. The outputs are respectively rectified, and rectifiers 124 and 126 provide signals in lines 128 and 130. The signals in lines 128 and 130 are, thus, the full wave rectified difference and sum signals of the apparatus output, respectively out of subtractor 120 and adder 122.
Lines 128 and 130 are connected to filters 132 and 134 which have adjustable rise and fall ballistics. Selector switch 136 selects between the manual and automatic control of the control voltage in line 104 to voltage controlled amplifier 102. The manual position of selector switch 136 is shown in Figure 7(a), and the use of the space expansion control potentiometer 106 has been previously described. There are several individual switches controlled by selector switch 136, as indicated in Figure 7(a). When the space control switch is switched to the other, automatic position, the outputs of filters 132 and 134 in lines 138 and 140, respectively, are processed and are employed to control voltage controlled amplifier 102.
When space control selector switch 136 is in the automatic position, the output of error amplifier 142 is connected through gate 144 to control the voltage in line 104. The error amplifier 142 has inputs directly from line 138 and from 140 through switch segment 146 and back through line 148. The filtered sum signal in line 140 is connected through the space expansion potentiometer 106 so that it can be used to reduce the apparent level of the output sum information to error amplifier 142 to force the error amplifier 142 to reduce the sum/difference ratio.
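For illustration, the rectify-and-smooth measurement and the error signal on which this automatic control operates might be sketched as follows; the smoothing time constant stands in for the adjustable ballistics of filters 132 and 134 and is an arbitrary example.

```python
import numpy as np

def smoothed_level(x, fs=44100, tau_ms=50.0):
    """Full-wave rectify and smooth with a one-pole low pass, standing in
    for rectifiers 124/126 and the ballistics filters 132/134."""
    alpha = np.exp(-1.0 / (fs * tau_ms * 1e-3))
    out = np.empty(len(x))
    y = 0.0
    for i, v in enumerate(np.abs(np.asarray(x, dtype=float))):
        y = alpha * y + (1.0 - alpha) * v
        out[i] = y
    return out

def space_error(left_out, right_out, fs=44100):
    """Positive where the smoothed difference level exceeds the smoothed sum
    level (the Lissajous figure stretching toward the difference axis);
    error amplifier 142 would drive the VCA so as to pull this toward zero."""
    l = np.asarray(left_out, dtype=float)
    r = np.asarray(right_out, dtype=float)
    return smoothed_level(l - r, fs) - smoothed_level(l + r, fs)
```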
Comparator 150 is connected to receive the filtered sum and difference information in lines 138 and 140. Comparator 150 provides an output into gate line 152 when space control selector switch 136 is in the automatic mode and when a monophonic signal is present at inputs 48 and 50. This occurs, for example, when an announcer speaks between music material. When comparator 150 senses monophonic material, gate line 152 turns off gate 144 to shut down voltage controlled amplifier 102 to stop the conditioning signal. This is done to avoid excessive increase in stereo noise, from random phase and amplitude changes, while the input program material is fully balanced. The automatic control circuit 118 cannot distinguish between unwanted noise and desired program material containing difference information. Therefore, a threshold ratio is established between the sum and difference information in lines 138 and 140 by control of the input potentiometer into comparator 150. The comparator 150 and gate 144 thus avoid the addition of false space information in a conditioning signal which, in reality, would be a response to difference-noise in the two channels. The comparator 150 thus requires a specific threshold ratio between the sum and difference information, under which the gate 144 is turned off and over which the gate 144 is turned on.
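The threshold behaviour of comparator 150 and gate 144 can be illustrated with a simple test on the smoothed sum and difference levels; the threshold value used below is an arbitrary assumption, not the setting of the apparatus.

```python
def conditioning_gate(sum_level, diff_level, threshold=0.05):
    """Stand-in for comparator 150 and gate 144: when the smoothed difference
    level falls below a set fraction of the smoothed sum level, the program is
    treated as essentially monophonic and the conditioning signal is cut off.
    The 0.05 (about -26 dB) threshold is an illustrative assumption."""
    return diff_level > threshold * sum_level   # True means gate 144 stays on
```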
Clipping circuit 153, see the center left of Figure 7(a), is provided to present a signal when the system is almost in a clipping situation and another signal when clipping is present. "Clipping" is a rapid increase in distortion caused by dynamic peaks in the program material being limited by the static limit imposed by the power supply voltage in the circuit. Lines 154 and 156, which are the inputs of amplifiers 52 and 58, are connected, along with lines 68, 70, 79 and 84, each through their own diode to bus 158. Bus 158 is connected through a resistance to input 160 of comparator 162. A negative constant voltage source is connected through another resistor to the input 160, and the comparator 162 is also connected to ground. By management of the two resistors, the comparator 162 has an input when bus 158 reaches a particular level. When that level is reached, output signal 164, such as a signal light, is actuated. Bus 158 is similarly connected through a resistor to the input 166 of comparator 168. The negative voltage source is connected through another resistor to input 166, and the resistance values are adjusted so that comparator 168 has an input when clipping is taking place. Latching circuit 170 is actuated when clipping has taken place to illuminate the two signal lights 172 and 174. Those lights stay illuminated until reset 176 is actuated.
In the cutting of V-groove stereo records, a difference signal results in vertical motion. Vertical motion is the most difficult to track in playback. Therefore, large signals which produce too much vertical motion when referenced to lateral motion are usually avoided. It can be considered saturation of the cutting function. Not exceeding the saturation point is extremely important in proper disk cutting. In FM broadcasting, similar restraints still apply, since governmental regulatory bodies tend to require that the difference signal be kept less than the L+R signal. Therefore, a detection circuit 178 is shown in the lower right corner of Figure 7(b). The rectified sum and difference signals in lines 130 and 128 are connected to peak followers 180 and 182. The peaks generated by peak followers 180 and 182 are connected to comparators 184 and 186. Comparator 184 gives an output pulse whenever the difference peak envelope becomes greater than the sum peak envelope, within plus or minus 3 dB. The level controls at the outputs of the peak followers 180 and 182 allow an adjustment in the plus or minus 6 dB difference for different applications. Comparator 186 has an output when the sum/difference peak ratio approaches the trigger point of comparator amplifier 184 within about 2 dB, and lights signal light 188 on the front panel, illustrated in Figure 7(b), as a visual warning of approaching L-R overload. This is accomplished by reducing the apparent level of the sum envelope by about 2 dB with the potentiometer connecting comparator 186 to ground. The output of comparator amplifier 184 feeds a latching circuit 190 which activates light 195 and which holds until reset by switch 192. When the latching circuit is active, it activates driving circuit 194 which lights panel lights 196 and 197 and, after a time delay, rings audible alarm 198. At the same time, driving circuit 194 energizes line 199 which cuts off gate 144 to withhold the signal to amplifier 102 which controls the conditioning signal. Actuation of gate 144 removes the conditioning signal from line 108, but permits the normal stereo signal to continue through the circuit.
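For illustration, the peak-envelope comparison performed by detection circuit 178 might be sketched as follows; the release time of the peak followers is an assumption, while the roughly 2 dB warning margin follows the description above.

```python
import numpy as np

def peak_follower(x, fs=44100, release_ms=200.0):
    """Instant-attack, exponential-release envelope follower standing in for
    peak followers 180 and 182; the release time is an assumption."""
    rel = np.exp(-1.0 / (fs * release_ms * 1e-3))
    env = np.empty(len(x))
    y = 0.0
    for i, v in enumerate(np.abs(np.asarray(x, dtype=float))):
        y = v if v > y else rel * y
        env[i] = y
    return env

def overload_flags(left_out, right_out, fs=44100, warn_db=2.0):
    """Per-sample booleans: 'warning' when the difference peak envelope comes
    within warn_db of the sum envelope (comparator 186 / light 188), and
    'overload' when it exceeds the sum envelope (comparator 184 / latch 190)."""
    l = np.asarray(left_out, dtype=float)
    r = np.asarray(right_out, dtype=float)
    sum_env = peak_follower(l + r, fs)
    diff_env = peak_follower(l - r, fs)
    margin = 10.0 ** (-warn_db / 20.0)
    return diff_env > margin * sum_env, diff_env > sum_env
```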
Third Embodiment
A third embodiment of the system and apparatus of this invention is shown in Figure 8 and is generally indicated at 200. For reasons already stated with respect to the system and apparatus 100 of Figures 7(a) and (b), the left front quad buss channel addresses unprocessed input 49, which is connected to amplifier 204; the left rear quad buss channel addresses processed input 48, which is connected to amplifier 206; the right rear quad buss channel addresses processed input 50, which is connected to amplifier 212; and the right front quad buss channel addresses unprocessed input 51, which is connected to amplifier 214. Amplifiers 204, 206, 212 and 214 are inverting and provide signals in lines 208, 210, 216 and 218, respectively. Both lines 208 and 210 are connected to summing amplifier 220, while both lines 216 and 218 are connected to summing amplifier 222. Lines 210 and 216 carry -L and -R signals.
The conditioning signals +CR and -CL are derived by connecting differencing amplifier 224 to both lines 210 and 216. The resulting difference signal, -(R-L), is filtered in high pass filter 226, similar to filter 94 in Figure 7(a), and the result is subject to selected delay in delay circuit 228. The delay time is controlled from the front panel, as will be described with respect to Figure 9. The output from delay 228 goes through voltage controlled amplifier 230 which has an output signal, -C, in line 232, which is supplied to both non-inverting equalizer 234 and inverting equalizer 236. Those equalizers respectively have conditioning signal outputs -CL and +CR which are connected to the inverting summing amplifiers 220 and 222. The left conditioning signal -CL is added (and inverted) with the original left signal at amplifier 220 to form L+CL, and the right conditioning signal +CR is effectively subtracted from the original right signal at invertor amplifier 222 to form R-CR. The outputs from amplifiers 220 and 222, in lines 238 and 240, respectively, are preferably and respectively connected to balanced left amplifiers 242 and 244 and balanced right amplifiers 246 and 248, in the manner described with respect to amplifiers 86 through 92 of Figure 7(b). It may be useful to connect the various points in the circuit of Figure 8 to the clipping and L-R overload warning circuits 153 and 178 in the same manner as previously described with reference to Figure 7(b). Alternatively, VCA 230 may be manually controlled by a potentiometer and DC supply combination, such as potentiometer 106 and supply 107. The difference between the two embodiments of the system in Figures 7(a) and (b) and 8 lies in the way the original left and right signals are routed.
In Figures 7(a) and (b), the left and right signals are added and subtracted. This sum and difference information is then re-added and re-subtracted to reconstruct the original left and right signals. In the circuit of Figure 8, the original left and right signals are not mixed together. They remain independent of each other from input to output. The enhancement system may be automatic with self-controlling features in the apparatus so that the stereophonic image enhancement can be achieved without continual adjustment of the system and apparatus. Alternatively, manual control may be used, if desired.
The foregoing description of the invention, as it has been described with reference to the detailed circuitry shown in Figures 7(a), 7(b) and 8, has been basically in analog terms with the various elements of the circuitry being either analog devices or devices which could be either analog or digital. For example, the delay line devices 96 (Figure 7(a)) and 228 (Figure 8) are more likely to be implemented using digital components than by using analog components. Thus, an analog to digital converter might be used immediately prior to a linear digital delay line 96, 228 whose output can then be converted to analog using a digital to analog converter.
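A linear digital delay of the kind referred to here simply delays every sample, and therefore every frequency, by the same number of samples. A minimal block-based sketch follows; the 8 millisecond figure in the usage line is an arbitrary example.

```python
import numpy as np

class LinearDelay:
    """Delays every frequency by the same duration: an integer number of
    samples, processed block by block."""
    def __init__(self, delay_samples):
        self.buf = np.zeros(delay_samples)

    def process(self, block):
        combined = np.concatenate([self.buf, np.asarray(block, dtype=float)])
        out, self.buf = combined[:len(block)], combined[len(block):]
        return out

# usage: roughly an 8 ms delay at a 44.1 kHz sample rate (arbitrary example values)
delay = LinearDelay(int(0.008 * 44100))
```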
Alternatively, and preferably for the professional equipment, predominately digital implementations of the invention are quite practicable, as will be seen in the following embodiments.
Fourth Embodiment
Turning now to Figures 10(a) and 10(b), these figures form a digital logic diagram of a digital embodiment of the invention which is conceptually somewhat similar to the analog, or mostly analog, embodiment of Figures 7(a) and (b). In Figures 10(a) and (b), data transmission lines are shown in solid lines, while control lines are shown in dashed lines.
Left and right audio channel information is supplied in multiplexed digital format at an input 302. Clock information is also supplied at an input 304 to a formatter 306 which separates the left channel information from the right channel information. Preferably, formatter 306 de-multiplexes the digital data, which can be supplied in different multiplexed synch schemes. For example, a first scheme might assume that the data is being transmitted via a Crystal Semiconductor model CS8402 chip for AES-EBU or S-PDIF inputs, while a second scheme might assume that the digital data comes from an analog to digital converter such as a Crystal Semiconductor model CS5328 chip. The I/O mode input 305 preferably advises the formatter 306 at the front end and the formatter 370 at the rear end of the type of de-multiplexing and multiplexing schemes required for the chips upstream and downstream of the circuitry shown in Figures 10(a) and (b). Those skilled in the art will appreciate that other multiplexing and de-multiplexing schemes can be used, or that the left and right channel data could be transmitted in parallel, i.e., on non-multiplexed data paths.
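Purely as an illustration of the de-multiplexing step (the actual separation is performed by the formatter chips just named), a minimal Python sketch of splitting an interleaved two-channel sample stream might look as follows; the frame layout and the function name are assumptions made for this sketch, not details taken from the specification.

    def deinterleave(interleaved):
        # Split an interleaved [L0, R0, L1, R1, ...] sample stream into separate
        # left and right channel lists (assumed two-channel frame layout).
        left = interleaved[0::2]    # even-indexed samples -> left channel
        right = interleaved[1::2]   # odd-indexed samples -> right channel
        return left, right

    # Example: four stereo frames
    left, right = deinterleave([0.1, -0.1, 0.2, -0.2, 0.3, -0.3, 0.4, -0.4])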
The left channel digital audio data appears on line 308 while the right channel digital audio data appears on line 309. These two signals are subtracted from each other at a subtractor 324 to form R-L data. The R-L data is supplied to a switch 329 and is also filtered through a high-pass filter 326 and a low pass filter 327 and subjected to digital time delay at device 328. Switch 329 is controlled by a C-mode control 303 which effectively controls the position of switch 329, shown in Figures 10(a) and (b) in its C-mode position, that is, the position in which the filters 326 and 327 and the time delay 328 are bypassed. The C-mode is preferably used when the apparatus is used with live sources, such as might be encountered during a concert or a theatrical performance, and a C microphone input source (Figures 5 and 6) is available, so that the C signal then need not be synthetically produced. The R-L data is preferably subjected to the filtering and time delay to generate the conditioning signal C when the invention is used to mix down a recorded performance from a multi-track tape deck, for example.
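The conditioning-signal path just described (form R-L, high-pass and low-pass filter it, delay it, with the whole chain bypassed in C-mode) can be sketched as follows. This is only a rough illustration: the one-pole filters, the coefficient values and the delay length are stand-ins chosen for brevity, not the filter designs of the specification.

    from collections import deque

    def one_pole_lowpass(x, alpha):
        # Simple one-pole low-pass (illustrative only).
        y, state = [], 0.0
        for s in x:
            state += alpha * (s - state)
            y.append(state)
        return y

    def one_pole_highpass(x, alpha):
        # High-pass formed as the input minus its low-passed version (illustrative only).
        lp = one_pole_lowpass(x, alpha)
        return [s - l for s, l in zip(x, lp)]

    def conditioning_signal(left, right, c_mode=False, delay_samples=165,
                            hp_alpha=0.05, lp_alpha=0.3):
        # Derive the conditioning signal C from R-L, or pass R-L straight
        # through when c_mode is True (filters and delay bypassed).
        diff = [r - l for l, r in zip(left, right)]      # R-L data
        if c_mode:
            return diff
        band = one_pole_lowpass(one_pole_highpass(diff, hp_alpha), lp_alpha)
        line = deque([0.0] * delay_samples)              # digital time delay
        out = []
        for s in band:
            line.append(s)
            out.append(line.popleft())
        return out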
The output from switch 329 is supplied to a variable gain digital circuit 330 which is functionally similar to the voltage controlled amplifier 102 shown in Figure 7(b). A mute control input can be used to reduce the gain at gain control 330 very quickly, if desired. The output of the gain control 330 is applied to left and right channel equalizers 310 and 312, which are functionally similar to equalizers 110 and 112 shown in Figure 7(b). The output of left channel equalizer 310 is applied to an adder 320 while the output of the right channel equalizer 312 is supplied to a subtractor 332, so that the control signals CL and CR are respectively added to and subtracted from the left audio data and right audio data on lines 379 and 384, respectively. That data is then multiplexed at formatter 370 and output in digital form at serial output 390.
The variable gain circuitry 330, which can be implemented rather easily in the digital domain by shifting bits, for example, is controlled either from a manual source or an automatic source, much like the voltage controlled amplifier 102 of Figure 7(b). In the manual position of switch 367 shown in Figure 10(b), the gain through circuitry 330 is controlled by a "space control" input 362 which is conceptually similar to the space control potentiometer 106 shown in Figure 7(a) and the potentiometer shown in Figure 6. In the automatic position of switch 367, the gain in circuitry 330 is automatically controlled in a manner similar to that of Figures 7(a) and (b). In Figures 10(a) and (b), the data on lines 379 and 384 are summed at a summer 342 and, at the same time, subtracted at subtractor 340. The outputs are respectively applied to high-pass filters 346 and 344, whose outputs are in turn applied to root mean square (RMS) detectors 350 and 348, respectively. Detector 348 outputs a log difference signal, while detector 350 outputs a log sum signal. The value of the log difference signal from detector 348 can be controlled from the "Space In" input 362 at adder 352, in the automatic mode, so that the "Space In" value offsets the output of the log difference detector as follows:
(1) 00 for a difference level 12 dB below the sum level;
(2) 80 for a difference level equal to the sum level; and
(3) FF for a difference level 12 dB over the sum level.
The output of adder 352 and the log sum output from detector 350 are applied to a comparator 354, which is conceptually similar to the comparator 150 of Figure 7(a). The output of comparator 354 is applied to a rate limiter 356 which preferably limits the rate at which the output of comparator 354 can change the gain of circuit 330 to approximately 8 dB per second.
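A block-based approximation of this automatic control loop might be sketched as follows. The block length, sample rate and per-block step size are assumptions (chosen so that the gain slews at roughly 8 dB per second), the high-pass filtering ahead of the detectors is omitted, and the proportional update is a simplification of the hardware comparator 354 and rate limiter 356.

    import math

    def rms_db(block):
        # Block RMS level in dB: an illustrative stand-in for the log RMS detectors.
        rms = math.sqrt(sum(s * s for s in block) / len(block)) or 1e-12
        return 20.0 * math.log10(rms)

    def auto_space_gain(left, right, gain_db, space_offset_db=0.0,
                        block_len=512, max_step_db=8.0 / 86.0):
        # One control block: compare the (L-R) level, offset by the "Space In"
        # value, against the (L+R) level, then nudge the expansion gain with the
        # step size limited so the gain changes by roughly 8 dB per second
        # (about 86 blocks per second for 512-sample blocks at 44.1 kHz -- assumed figures).
        blk_sum = [l + r for l, r in zip(left[:block_len], right[:block_len])]
        blk_diff = [l - r for l, r in zip(left[:block_len], right[:block_len])]
        error_db = (rms_db(blk_diff) + space_offset_db) - rms_db(blk_sum)
        step = max(-max_step_db, min(max_step_db, -error_db))  # rate limiter
        return gain_db + step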
Those skilled in the art will appreciate that the circuitry shown in Figures 10(a) and (b), instead of being implemented in discrete digital circuitry, can preferably be implemented by programming a digital signal processor chip, such as the model DSP 56001 chip manufactured by Motorola, by known means.
The automatic control circuitry 378 shown in Figures 10(a) and (b), when switch 360 is in the automatic position, effectively controls the amount of spatial effect added by the invention depending upon the amount of spatial material initially in the left and right audio. That is to say, if the left and right audio data being input into the circuitry already carries strong spatial impressions, the amount of spatial effect added by the present invention is less than if the incoming material originally has less spatial impression information in it. The control circuitry 378 also helps to keep the envelope of the L-R signal less than the envelope of the L+R signal. That can be important for FM and television broadcasting, where governmental agencies, such as the FCC in the United States, often prefer that the broadcast L-R signal be no greater than the L+R signal. Thus, the embodiment of the invention disclosed with respect to Figures 10(a) and (b) is particularly useful in connection with the broadcast industry, where the spatial effects added by the circuitry can be automatically controlled without the need for constant manual input. It also should be emphasized that the present invention is completely mono-compatible; that is to say, when the present invention is used to enhance the spatial effects in either an FM radio broadcast or a television sound FM broadcast, those receivers which are not equipped with stereo decoding circuitry do not produce any undesirable effects in their reproduction of the L+R signal due to the spatial effects which are added by the present invention to the L-R signal being broadcast.
The R/L equalization on line 308 controls the amount of boost provided by filters 310 and 312. That boost is currently set in the range of 0 to 4 dB and more preferably 0 to 1 dB. The boost frequencies of filters 310 and 312 are preferably different, but both lie in the range of 500 Hz to 3 kHz. As previously mentioned, these filters add breadth to the rear image in the sound field.
The WARP In input to time delay 328 adjusts the time delay, which is currently set at 3.75 msec.
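For a linear digital delay, the 3.75 msec figure translates directly into a whole number of samples once a sample rate is chosen; the 48 kHz rate below is an assumption for illustration, not a rate stated in the specification.

    sample_rate_hz = 48_000   # assumed sample rate; not specified here
    delay_ms = 3.75
    delay_samples = round(delay_ms * sample_rate_hz / 1000)   # 180 samples at 48 kHz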
Fifth Embodiment
While the automatic mode version of the present invention can be very useful in broadcasting, the manual mode of operation of the present invention will be very important for the recording industry and for the production of theater, concerts and the like, that is, in those applications in which large multichannel sound mixing panels are currently used. Such audio equipment usually has a reasonable number of audio inputs, or audio channels, each of which is essentially mono. The sound recording engineer has control of not only the levels of each one of the channels but also, in the prior art, uses a pan control to control how much of the mono signal coming into the sound board goes into the left channel and how much goes into the right channel. Additionally, the engineer can control the amount of the signal going to the rear left and rear right channels on a quad buss audio board.
Turning again to Figure 4, the pan control of the prior art permits a sound source point image 32 to be located anywhere on the line between the left and right speakers 12 and 14, depending on the position of the pan control. For that simple reason, stereo recording was a large improvement over the mono recordings of forty years ago. Just imagine, however, the even greater effect which can be imparted to a listener 16 if the image point can be moved anywhere: not only between the two speakers, but to the left of the left speaker or to the right of the right speaker, to the foreground position (such as point 36) shown in Figure 4, or even to a point behind the listener such as the antiphasic image position 34 shown in Figure 4. The present invention provides audio engineers with such capabilities. Instead of using two pan controls such as can be found on a quad deck, the audio engineer will be provided with a joystick by which he or she will be able to move the sound image both left and right and front and back at the same time. The joystick can be kept in a given position during the course of an audio recording session or a theatrical or concert production, or alternatively, the position of the joystick can be changed during such recording sessions or performances. That is to say, the image position of the sound can be moved with respect to a listener 16 to the left and right and forward and back, as desired. If desired, the effective position of the joystick can be controlled by a MIDI interface.
Initially, in connection with the audio recording and mix down industries, the present invention will likely be packaged as an add-on device which can be used with conventional audio mixing boards. In the future, however, the present invention will likely find its way into the audio mixing board itself, the joystick controls (discussed above) being substituted for the linear pan control of present technology audio mixing boards.
Figure 11 shows the outward configuration of audio components using the present invention which can be used with conventional audio mixing boards known today. As shown in Figure 11, the device has twenty-four mono inputs and twenty-four joysticks, one for each input. Preferably, the equipment comprises a control box 400 and a number of joystick boxes 410 which are coupled to the control box 400 by a data line 420. The joystick box 410 (shown in Figure 11) has eight joysticks associated with it and is arranged so that it can be daisy-chained with other joystick boxes 410, coupling with the control box 400 by data cable 420 in a serial fashion. Instead of having only eight joysticks in joystick box 410, the joystick box 410 could have all twenty-four joysticks, one for each channel, and, moreover, the number of joysticks and channels can be varied as a matter of design choice. At present it is preferred to package the invention as shown, with eight joysticks in one joystick box 410. In due time, however, it is believed that this invention will work its way into the audio console itself, wherein the joysticks will replace the panning controls presently found on audio consoles.
This embodiment of the invention has enhanced, processed left and right outputs 430 and 432, wherein all the inputs have been processed left and right, front and back, according to the position of the respective joysticks 415. These outputs can be seen on control box 400. Unprocessed outputs are also preferably provided in the form of a direct left 434, a direct right 436, a direct front 438 and a direct back 440 output, which are useful in some applications where the mixing panel is used downstream of the control box; the audio engineer then has the ability to mix processed left and/or right outputs with unprocessed outputs, when desired.
Figures 12(a) - 12(d) form a schematic diagram of the invention, particularly as it may be used with respect to the joystick embodiment. Turning now to Figures 12(a) - (d), twenty-four inputs are shown at numerals 405-1 through 405-24. Each input 405 is coupled to an input control circuit 404, one such circuit being associated with each input 405. Since, in this embodiment, there are twenty-four inputs 405, there are twenty-four input control circuits 404-1 through 404-24. However, only one of them, namely 404-1, is shown in detail, it being understood that the others, namely 404-2 through 404-24, are preferably of the same design as 404-1. The input control circuitry 404 is responsive to the position of its associated joystick for the purpose of distributing the incoming signal at input 405 onto bus 408. Each joystick provides conventional X and Y dc voltage signals indicative of the position of the joystick, which signals are converted to digital data, the data being used to address six look-up tables, a look-up table being associated with each of the voltage controlled amplifiers (VCA's) 407 which comprise an input circuit 404. The value in the table for particular X and Y coordinates of the joystick indicates the gain of its associated VCA 407. The digital output of the look-up table is converted to an analog signal for its associated VCA 407. Each VCA 407 has a gain between unity and zero, depending on the value of the analog control voltage signal. Thus, from 0% to 100% of the signal being input at input 405-1, for example, finds its way onto the various lines forming bus 408, depending upon the position of joystick 415-1. Similarly, input 405-2 has its input distributed amongst the various lines making up bus 408, depending upon the position of its joystick 415-2. The same thing is true for the remaining inputs and remaining joysticks. Also, as will be seen, the distribution of the signals is controlled somewhat by the position of a switch 409, whose function will be discussed in due course. The currently preferred values in the look-up tables are tabulated in Tables A-F and graphically depicted in Figures 13(a)-(f), which are associated with VCA's 407L, 407F, 407R, 407BL, 407M and 407BR, respectively. These figures show the percentage of the signal input at an input 405 which finds its way onto the various lines comprising bus 408, where the various signals on each line of the bus from different input circuits 404 are summed together. Figures 13(a)-(f) show the percentages for various positions of a joystick 415 as it is moved left and right and front and rear. Table A and Figure 13(a), which are associated with VCA 407L, indicate that VCA 407L outputs 100% of the inputted signal when its associated joystick is moved to the maximum left, maximum front position. The outputted signal from VCA 407L drops to under 20% of the inputted signal when the joystick is moved to its maximum right, maximum back (or rear) position. Other positions of the joystick cause VCA 407L to output the indicated percentage of the inputted signal at 405.
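The table-lookup mechanism can be sketched as follows. The coarse 3-by-3 gain tables below are placeholders quantized purely to show the mechanism; the actual preferred values are those of Tables A-F, of which only two data points (100% for VCA 407L at maximum left/front, and under 20% at maximum right/back) are reflected here, and a real joystick position would address much finer tables.

    # Hypothetical gain tables (0.0-1.0) indexed by quantized joystick position:
    # rows are front/center/back (y), columns are left/center/right (x).
    TABLES = {
        "L":  [[1.0, 0.7, 0.4], [0.8, 0.5, 0.3], [0.5, 0.3, 0.15]],
        "F":  [[0.6, 1.0, 0.6], [0.4, 0.7, 0.4], [0.2, 0.3, 0.2]],
        "R":  [[0.4, 0.7, 1.0], [0.3, 0.5, 0.8], [0.15, 0.3, 0.5]],
        "BL": [[0.2, 0.1, 0.0], [0.4, 0.3, 0.1], [0.9, 0.6, 0.2]],
        "M":  [[0.0, 0.1, 0.0], [0.1, 0.2, 0.1], [0.2, 0.6, 0.2]],
        "BR": [[0.0, 0.1, 0.2], [0.1, 0.3, 0.4], [0.2, 0.6, 0.9]],
    }

    def vca_gains(x, y):
        # Map a joystick position (x: 0=left..2=right, y: 0=front..2=back)
        # to a gain for each of the six VCAs feeding bus 408.
        return {line: TABLES[line][y][x] for line in TABLES}

    def steer_onto_bus(sample, x, y, bus):
        # Distribute one mono input sample onto the bus lines (summing bus).
        for line, gain in vca_gains(x, y).items():
            bus[line] = bus.get(line, 0.0) + gain * sample
        return bus

    bus = steer_onto_bus(0.5, x=0, y=0, bus={})   # joystick full left, full front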
VCA 407L receives a control voltage input VCX-L for controlling the amount of the input signal at 405 which finds its way onto line 408L. Similarly, VCA 407R controls the amount of the input signal at 405 which finds its way onto line 408R. The same is true for VCA's 407F, 407BL, 407M and 407BR. The voltage controlled amplifiers 407 in the remaining input circuits 404-2 through 404-24 are also coupled to bus 408 in a similar fashion and, thus, the currents supplied by the voltage controlled amplifiers 407 are summed onto that bus structure. Thus, the various input signals 405-1 through 405-24 are steered, or mapped, onto the appropriate lines of bus 408 depending upon the position of the respective joysticks 415-1 through 415-24. The signals on the lines comprising bus 408 are then converted back into voltages by summing amplifiers 409, each of which is identified by a subscript letter or letters corresponding to the line of the bus with which it is coupled. The outputs of summing amplifiers 409L, 409R, and 409F are applied directly to three of the four direct outputs, 434, 436 and 438, respectively. The direct back output 440 is summed from the outputs of the summing amplifiers 409CDL, 409CDR, 409EL and 409ER.
Before going deeper into the description, it might be helpful to the reader to explain some of the terminology, particularly the subscripts which are being used in this description. The reader has probably noted that the letter "L" is associated with the left channel, the letter "R" with the right channel, the letter "F" with front and the letter "M" with mono-compatibility. The letters "BL" mean back left and the letters "BR" mean back right. The perceived sound locations for L, R, F, BL and BR are shown in Figure 2, for example. The letter "C" is associated with the C-mode of operation, which was briefly discussed with reference to Figures 10(a) and (b). There are also a D-mode of operation and an E-mode of operation in the embodiment of the invention now being described. The mode of each input 405 is controllable from a controller 410. See, for example, Figure 11, where for each joystick 415 there is a mode switch 411 which can be repeatedly pushed to change from mode C, to mode D, to mode EL, to mode ER, and then back to mode C. In modes C and D, switches 409L and 409R are in the position shown in Figure 12(c). Switch 409L changes position when in mode EL, while switch 409R changes position when in mode ER. Light emitting diodes (LED's) 412L and 412R of Figure 11 report the mode in which the controller is for each channel. For example, LED's 412L and 412R may both be amber while in mode C and may both be green while in mode D, while in mode EL the left LED (412L) would preferably be red and the right LED (412R) would be off, with the opposite convention in mode ER.
Mode C is preferably used for live microphone array recording of instruments, groups, ensembles, choirs, sound effects and natural sounds, where a microphone array can be placed at the locations shown in Figure 5. Mode D is a directional mode which places a mono-source at any desired location within the listener's conceptual image space, shown in Figure 4, for example. Applications of mode D are in multi-track mix-down, commercial program production, dialogue and sound effects, and concert sound reinforcement.
Mode E expands a stereo source and, therefore, each input is associated with either a left channel (mode EL), or a right channel (mode ER) of the stereo source. This mode can be used to simulate stereo from mono-sources and allows placement within the listener's conceptual image space, as previously discussed. Its applications are the same as for mode D.
Returning to Figures 12(a) and (b), the outputs from summing amplifiers 409CDL and 409CDR correspond to the back left and back right signals for the C and D modes. The signals are applied to a stereo analog-to-digital converter 412CD which multiplexes its output onto line 414CD. Similarly, stereo analog-to-digital converter 412E takes the E-mode back left and E-mode back right analog data and converts it to multiplexed digital data on line 414E. The digital data on lines 414CD and 414E are applied to digital sound processors (DSP's) 450 which will be subsequently described with reference to Figures 14(a) and (b). The audio processors may be identical, and may receive external data for the purpose of determining whether they operate in the C-mode, D-mode or E-mode, as will be described. The programming of the digital sound processor (DSP) 450 can be done at a mask level or it can be programmed in a manner known in the art by a microprocessor attached to a port on DSP 450, which microprocessor then downloads data stored in EPROM's or ROM's into the DSP 450 during initial power-up of the apparatus. The current preference is to use model 56001 DSP's manufactured by Motorola. In practicing the present invention, it is preferred to download the programming into the DSP 450 chips using a microprocessor, since that makes it easier to implement design changes should that become necessary. In due course, it will be preferred to use mask level programming since that should make the device more economical to produce. In any event, the programming emulates the digital logic shown in Figures 14(a) and (b). The outputs from the DSP 450 chips are again converted back to analog signals by stereo digital to analog converters 418CD and 418E. The outputs of stereo digital to analog converters 418CD and 418E are summed, along with the outputs from the mono compatibility channel 409M, the front channel 409F, the right channel 409R and the left channel 409L, through summing resistors 419 before being applied to summing amplifiers 425L and 425R and thence to processed stereo outputs 430 and 432. The summing resistors 419 all preferably have the same value. The mono compatibility signal from summing amplifier 409M is applied to a low and high-pass equalization circuit 422 which preferably has a low Q, typically a Q on the order of .2 or .3, centered around 1,700 Hz. Equalization circuit 422 typically has a 6 dB loss at 1,700 Hz.
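As a digital stand-in for the low-Q dip attributed to equalization circuit 422 (about 6 dB down near 1,700 Hz with a Q of roughly .2 to .3), one could compute peaking-EQ biquad coefficients in the widely used audio-EQ-cookbook form; this is an illustrative sketch, not the analog circuit of the specification, and the sample rate is an assumption.

    import math

    def peaking_eq_coeffs(fs, f0=1700.0, q=0.25, gain_db=-6.0):
        # Biquad peaking-EQ coefficients (cookbook form), normalized so a0 = 1.
        a_lin = 10.0 ** (gain_db / 40.0)
        w0 = 2.0 * math.pi * f0 / fs
        alpha = math.sin(w0) / (2.0 * q)
        b = [1.0 + alpha * a_lin, -2.0 * math.cos(w0), 1.0 - alpha * a_lin]
        a = [1.0 + alpha / a_lin, -2.0 * math.cos(w0), 1.0 - alpha / a_lin]
        return [bi / a[0] for bi in b], [ai / a[0] for ai in a]

    b, a = peaking_eq_coeffs(fs=44100.0)   # assumed sample rate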
In the D mode, processed directional enhancement information, i.e., the conditioning signal C, is added to (and subtracted from) the output channels. This information is band pass filtered by filters 456 and 457, for example, so that it peaks in the mid-range. If the enhanced left and right signals are summed together to form an L+R mono signal, this can show up as a notch in the spectrum in that mid-range area. To counteract that effect, the mono compatibility signal is preferably used, which has a notch that is the antithesis of the mid-range notch and which, in effect, balances the output spectrum of an L+R mono signal. When the joystick is in the center, equal amounts of the conditioning signal go to the left and right channels, and when those channels are summed to form the L+R signal, the conditioning signal is effectively canceled out, since it was originally added to one channel and subtracted from the other channel. So, with a back-centered joystick, some mono compatibility signal is needed, as can be seen in Table E and in Figure 13(e), where the VCx-M input to VCA 407M goes to approximately -5 dB (60%) when the joystick is centered between left and right but pulled all the way towards the back. It should be understood by the reader that spatial enhancement and mono compatibility of the perceived conceptual image space and collapsed sound field are achieved within a surprisingly narrow difference range of a few dB. This is the nature of human hearing with respect to the psychoacoustic phenomena toward which this invention is directed.
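The cancellation argument for a back-centered joystick can be written out directly: with equal conditioning signal C added to one channel and subtracted from the other, the mono sum is unchanged.

\[
  L' = L + C, \qquad R' = R - C
  \qquad\Longrightarrow\qquad
  L' + R' = (L + C) + (R - C) = L + R
\]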
Turning now to Figures 14(a) and 14(b), these figures form a sound processor logic diagram similar to that of Figures 10(a) and (b), but with a number of changes, the most important of which follow:
(1) There is no need for an automatic control circuit 378, as shown in Figures 10(a) and (b), since, in this embodiment, the amount of expansion (the amount of spatial effects which are added) is controlled manually by the position of the joystick 415. There is also no variable gain circuit such as 330 (Figure 10(b)), since the amount of gain is controlled by the position of the joystick 415 as it, in turn, controls the gain of the various VCA's 407 (Figures 12(a) and (c)).
(2) The embodiment of Figures 10(a) and (b) operated in either a C-mode or a non-C expansion mode (which is identified as mode E in Figures 12(a), 12(b), 14(a) and 14(b)). The embodiment of Figures 12(a), 12(b), 14(a) and 14(b) also includes another mode (mode D) which, as will be seen, causes certain changes to be made to the audio processor logic of Figures 14(a) and (b) compared to the audio processor logic of Figures 10(a) and (b).
Referring again to Figures 14(a) and (b), the incoming serial data which was multiplexed onto line 414 is de-multiplexed by formatter 451. Preferably, the stereo A to D converters 412 (see Figure 12(d)) are Crystal Semiconductor model CS5328 chips, while the stereo D to A converters 418 (see Figure 12(d)) are Crystal Semiconductor model CS4328 chips and, therefore, formatters 451 and 470 would be set up to de-multiplex and multiplex the left and right digital channel information in a manner appropriate for those chips. The left and right digital data is separated onto buses 452 and 453, and is communicated to, for example, a subtractor 454 to produce an R-L signal. The R-L signal passes through the low pass and high pass filters 456 and 457 and the time delay circuit 458 when the circuit is connected in the E-mode, as depicted by switch 455 (which is controlled by an E-mode control signal). When in the D-mode, switches 455 take the other position shown in the schematic diagram and, therefore, the left channel digital data on line 452 is passed through the top set of high pass and low pass filters 456 and 457 and the time delay circuit 458, while the right channel digital data on line 453 is directed through the lower set of high pass and low pass filters 456 and 457 and time delay circuit 458. There is no need to control the amplitude of the signal from the time delay circuits 458, as was done in the embodiment of Figures 10(a) and (b), because the amplitude of the signals is already controlled at the input control circuits 404 of Figures 12(a) and (c), the amount of processing being controlled at the input by the position of the joysticks 415 (see Figure 11). The outputs of time delay circuits 458 are applied to respective left and right channel equalization circuits 460. The output of the left equalization circuit 460L is applied via a switch 462 to an input of formatter 470. The output of the right equalization circuit 460R is applied via a switch 462 and an invertor 465 to an input of formatter 470. As previously indicated, formatter 470 multiplexes the signals received at its inputs onto serial output line 416.
Switches 462 are shown in the C-mode position, which has been previously described. When in the D-mode or the E-mode, the switches 462 change position so as to communicate the outputs of the equalizers 460 to the formatter and invertor 465, as opposed to communicating the unfiltered signals on lines 452 and 453, which is done when in the C-mode. The inversion which occurs in the right channel information by invertor 465 is effectively done by subtractor 332 in the embodiment of Figures 10(a) and (b). It is to be recalled that subtractor 332 subtracts the right channel conditioning information CR (from equalizer 312) from the right channel audio data. In the embodiment of Figures 14(a) and (b), the right channel conditioning signal is inverted by invertor 465. It is then communicated via formatter 470 and the stereo digital to analog converter 418 (see Figure 12(d)) onto a summing bus, where it is summed through resistors 419, along with the right channel information from summing amp 409R, into an input of summing amp 425R. The left channel conditioning signal CL is communicated, without inversion, via formatter 470 and the stereo digital to analog converter 418 (see Figure 12(d)) onto a summing bus, where it is summed through resistors 419, along with the left channel information from summing amp 409L, into an input of summing amp 425L.
The invention has been described with respect to both analog and digital implementations, and with respect to several modes of operation. The broadcast mode, mode B, uses a feedback loop to control the amount of processing being added to stereo signals. In the C, D and E- modes, the amount of processing being added is controlled manually. In the final embodiment disclosed, the amount of processing is input controlled by joystick.
In the C-mode, the conditioning signal which is added to and subtracted from the left and right channel data undergoes little or no processing. Indeed, no processing is required if the conditioning signal is organically produced by the location of microphone "C" in Figure 5. In the mode C operation described with reference to Figures 10(a), 10(b), 12(a), 12(b), 14(a) and 14(b), the conditioning signal bypasses the high pass/low pass filters and the time delay circuitry. On the other hand, in the D and E-modes, the conditioning signals are synthesized by the high pass/low pass filter and, preferably, the time delay. In the E-mode it is an R-L signal which is subjected to filtering, whereas in the D-mode the left and right signals are independently subjected to filtering for the purpose of generating the conditioning signal C.
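The three modes can be summarized schematically as follows, where the function synthesize stands in for the high-pass/low-pass filter and time-delay chain; this sketch is an abstraction of the routing just described, not the processor's actual program.

    def conditioning_outputs(left, right, mode, synthesize):
        # 'synthesize' stands in for the high-pass/low-pass filter and time-delay
        # chain of Figures 14(a) and (b); an illustrative abstraction only.
        if mode == "C":
            # Little or no processing: the signals pass through unfiltered.
            return left, right
        if mode == "E":
            # Expansion mode: a single R-L signal is filtered and delayed.
            c = synthesize([r - l for l, r in zip(left, right)])
            return c, c
        if mode == "D":
            # Directional mode: left and right are filtered and delayed independently.
            return synthesize(left), synthesize(right)
        raise ValueError("unknown mode: %s" % mode)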
As can be seen by reference to Figures 10(a), 10(b), 14(a) and 14(b), the amount of time delay is controllable. Indeed, some practicing the instant invention may do away with time delay altogether. However, time delay is preferably inserted to de-correlate the signal exiting the filters from the left and right channel information to help ensure mono-compatibility. Unfortunately, comb filtering effects can be encountered, but these seem to be subjectively minimized by filters 456, 457 and 460. In order to minimize such effects in the B, D and E-modes of operation, it is preferred to use a time delay circuit such as 328 (Figures 10(a) and (b)) or 458 (Figures 14(a) and (b)). In the organic mode (Figure 6), the time delay is organically present due to the placement of microphone "C" further from the sound source than microphones "L" and "R".
The present invention can be used to add spatial effects to sound for the purposes of recording, broadcasting, or public performance. If the spatial effects of the invention are used, for example, in audio processing at the time of mixing down a multi-track recording to stereo for release on tapes, records or digital discs, then when the tape, record or digital disc is played back on conventional stereo equipment, the enhanced spatial effects will be perceived by the listener. Thus, there is no need for additional spatial treatment of the sound after it has been broadcast or after it has been mixed down and recorded for public distribution on tapes, records, digital discs, etc. That is to say, there is no need for the addition of spatial effects at the receiving apparatus or on a home stereo set. The spatial effects will be perceived by the listener whether they are broadcast or whether they are heard from a prerecorded tape, record or digital disc, so long as the present invention was used in the mixdown or in the broadcast process.
The present invention is also mono compatible. That is to say, if a person listens to an L+R signal, for example the sum of the outputs at 430 and 432, no artifacts of the process will be perceived by the listener. This is important for television, FM and AM stereo broadcasting, as much of the populace will continue to listen to mono signals for some time to come. The present invention, while adding spatial expansion to the stereo signals, does not induce artifacts of the process in the L+R signal.
Digital delay devices can delay any frequency for any time length. Linear digital delays delay all frequencies by the same duration. Group digital delays can delay different groups of frequencies by different durations.
The present invention preferably uses linear digital delay devices because the effect works using those devices and because they are less expensive than group devices. However, group devices may be used, if desired.
The previously described embodiments, and particularly the embodiments of Figures 7 through 14(a) and (b), will be quite useful in the professional audio industry in the various applications previously mentioned. However, those embodiments tend to be too complex for convenient use in consumer electronics equipment such as might be used in the home. Thus, there is a need for an embodiment which may be conveniently used in consumer quality electronics equipment, and which preferably can be embodied in an easily manufactured chip. Such an embodiment is disclosed with reference to Figure 15.
Sixth Embodiment
Figure 15 is a schematic diagram of a sixth embodiment of the invention, which embodiment can be relatively easily implemented using a single semiconductor chip and which may be used in consumer quality electronics equipment, including stereo reproduction devices, television receivers, stereo radios and personal computers, for example.
In Figure 15, the circuit 500 has two inputs, 501 and 502, for the left and right audio channels found within a typical consumer quality electronic apparatus. The signals at input 501 are communicated to two amplifiers, namely amplifiers 504 and 505. The signals at input 502 are communicated to two amplifiers, namely 504 and 506. The left and right channels are subtracted from each other at amplifier 504, which produces an output L-R. That output is communicated to a potentiometer 503 which communicates a portion (depending upon the position of potentiometer 503) of the L-R signal back through a band pass filter 507 formed by a conventional capacitor and resistor network. Filter 507, in addition to band-passing the previously mentioned mid-range frequencies, also adds some time delay to that signal, which is subsequently applied to an input of amplifier 508. The output of amplifier 508 is the conditioning signal, C, which appears on line 509. The conditioning signal, C, is added to the left channel information at amplifier 505 and is subtracted from the right channel information at amplifier 506, and thence output as spatially enhanced left and right audio channels 511 and 512, respectively. These outputs may then be conveyed to the inputs of the power amplifier of the consumer quality audio apparatus and thence to loudspeakers, in the usual fashion. The listener controls the amount of enhancement added by adjusting potentiometer 503. If the wiper of potentiometer 503 is put to the ground side, then the stereo audio program will be heard with its usual un-enhanced sound. However, as the wiper of potentiometer 503 is adjusted to communicate more and more of the L-R signal to the band pass filter 507, more and more spatially processed stereo is perceived by the listener. For example, if the listener happens to be watching a sporting contest on television which is broadcast with stereo sound, then by adjusting potentiometer 503 the listener will start to perceive that he or she is actually sitting in the stadium where the sporting contest is occurring, due to the additional spatial effects which are perceived and interpreted by the listener.
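In signal terms, the circuit of Figure 15 can be summarized as follows, where k (between 0 and 1) denotes the fraction of the L-R signal passed by potentiometer 503 and BP denotes the band-pass filtering (with its incidental delay) of filter 507; the notation is introduced here only for illustration.

\[
  C = \mathrm{BP}\bigl(k\,(L - R)\bigr), \qquad
  L_{\mathrm{out}} = L + C, \qquad
  R_{\mathrm{out}} = R - C
\]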
The circuitry of Figure 15 is shown with essentially discrete components, with the exception of the amplifiers, which are preferably National Semiconductor model LM837 devices. However, those skilled in the art will appreciate that all (or most) of the circuit 500 can be reduced to a single silicon chip, if desired. Those skilled in the art will also appreciate that capacitors C1 and C2 in band pass filter 507 will tend to be rather large if implemented on the chip and, therefore, it may be desirable to provide appropriate pin-outs from the chip for those devices and to use discrete devices for capacitors C1 and C2. That is basically a matter of design choice.
This invention has been described with reference to a number of embodiments, and it is clear that it is susceptible to numerous modifications, modes and embodiments within the ability of those skilled in the art and without the exercise of the inventive faculty. By way of example, the conditioning signal in all embodiments is shown as being added to the left channel and subtracted from the right channel. This convention may be reversed, if desired, although it is believed that the electronics industry will follow the convention described herein for consistency. Other modifications are well within the skill of those skilled in this art. Accordingly, this invention is not limited to the disclosed embodiments, except as required by the appended claims.
[Tables A through F, reproduced as images in the original publication, tabulate the currently preferred look-up table values for VCA's 407L, 407F, 407R, 407BL, 407M and 407BR, respectively, as discussed above with reference to Figures 13(a)-(f).]

Claims

WHAT IS CLAIMED IS
1. An automatic stereophonic image enhancement production apparatus comprising: first and second lines each having an input and an output;
a first circuit in said first line and a second circuit in said second line respectively between said input and said output;
connection means connected to at least one of said first and second lines between its input and its device for receiving a signal;
frequency-dependent delay means connected to said connection means for delaying the signal at said connection means to produce a time delayed signal; and
control means for receiving said time delayed signal, said control means having an output coupled to both of said devices in said first and second lines for delivery of the time delayed signal thereto, said control means controlling the amplitude of the time delayed signal so as to produce a delayed and amplitude controlled compensation signal to said circuits.
2. The apparatus of claim 1 wherein said connection means comprises a differencing amplifier for subtracting signals on said first and second lines.
3. The apparatus of claim 2 wherein one of said circuits is arranged as an adder and the other of said circuits is arranged as a subtractor.
4. The apparatus of claim 3 further including connection means connected to at least one of said first and second lines between said circuits and said output, said connection means sensing the signal in said first and second lines at said outputs and being connected to said control means for automatically adjusting said control means to maintain the compensation signal in said output lines substantially at a desired level.
5. The apparatus of claim 4 wherein said connection means includes a differencing device and a summing device both connected to said first and second lines adjacent the outputs thereof to produce difference and sum signals, a signal envelope detector connected to each said differencing and summing devices, a comparator connected to both of said detectors, said comparator having an output connected to said control means so that the compensation signal to said control means is controlled by said comparator for automatic adjustment of said compensation signal as a function of the sum and difference of the signal in said first and second lines adjacent the output thereof.
6. The apparatus of claim 5 further including a manually controllable device and a switch, said switch being operable to selectively connect said comparator and said manually controllable device to said control means so that the amplitude of said compensation signal can be selectively achieved automatically or manually.
7. The apparatus of claim 6 wherein one of said first and second lines has a differencer therein located between said device and said input and the other of said first and second lines having an input adder therein between said input and said device, both said differencer and said adder in said first and second lines adjacent the input thereof being connected to both said first and second lines, said connection means for connecting said delay means to one of said first and second lines being connected between one of said input adder and said differencer and said adders which receive said compensation signal.
8. The apparatus of claim 7 wherein said adders and said differencer in said first and second lines are connected so that a first signal at the input of said first line produces a first signal plus a compensation signal at the output of said first line and a second signal at the input of said second line produces a second signal plus compensation signal at the output of said second line, said compensation signals being time delayed so that, when said first and second signals are respectively connected to first and second spaced loudspeakers, an observer positioned in front of said loudspeakers senses a virtual sound image as if said sound image were created by boundaries located to the side of said loudspeakers.
9. The apparatus of claim 5 wherein the output of said control means is connected through equalizers to said devices.
10. The apparatus of claim 2 wherein a gate is connected in input-gating relationship to said control means, and comparator means is connected between said second-named connection means and said gate, said comparator means including input devices establishing a threshold ratio between first and second signal inputs to said comparator means, said comparator means sensing the presence of a monophonic signal for closing said gate for deactivating said control means during the presence of such monophonic signal.
11. The apparatus of claim 1 wherein there are four input connections for connection to a quad buss, said four inputs being connected respectively to four input amplifiers, each of said input amplifiers having an output line, the output line of the first of said amplifiers being said first line and the output line of the second of said amplifiers being said second line and the output lines of the third and fourth of said amplifiers being respectively connected to said first and second circuits so that the quad buss signals connected to said first and second amplifiers and in said first and second lines are subject to delay, phase shift and addition, while the quad busses connected to the third and fourth amplifiers contribute unprocessed signals.
12. The apparatus of claim 1 wherein there are three input amplifiers for connection to three separate acoustically-related signal sources, the first of said amplifiers having said first line as its output, the second of said amplifiers having said second line as its output, and the third of said amplifiers having an output line connected additively to the input of one of said circuits and subtractively to the input of the other one of said circuits.
13. The apparatus of claim 1 wherein a stereophonic musical recording is provided to said inputs of said first and second lines, and a single speaker is connected to said outputs of said first and second lines for combining left and right channels of said stereophonic recording and reproducing monophonic compatible musical sound.
14. The apparatus of claim 1 wherein said control means is a voltage controlled amplifier.
15. The apparatus of claim 1 wherein said delay means for producing a time delayed signal is a digital delay means.
16. An audio image enhancement apparatus comprising:
(a) first and second audio inputs;
(b) first and second enhanced audio outputs;
(c) a source of time-delayed audio signals, which arrive later than corresponding signals on said first and second audio inputs;
(d) a variable gain circuit in circuit communication with an output of said subtractor circuit;
(e) an adder having a first input coupled with said variable gain circuit, a second input coupled to receive said first signal and having an output coupled with said first enhanced audio output; and
(f) a subtractor having a first input coupled with said variable gain circuit, a second input coupled to receive said second signal and having an output coupled with said second enhanced audio output.
17. The audio image enhancement apparatus of claim 16, wherein said first and second audio inputs are coupled to first and second audio microphones located relative to a sound source and wherein said source of time-delayed audio signals is a third microphone located further from said source than either said first or second microphone.
18. The audio image enhancement apparatus of claim 16 wherein said first, second and third microphones are positioned at the apexes of a triangle which has a base confronting said sound source and wherein said third microphone is located at the apex opposite said base and furthest from said sound source.
19. The audio image enhancement apparatus of claim 16 wherein said source of time-delayed audio signals comprises a subtractor circuit for subtracting a first signal communicated via said first input from a second signal communicated via said second input, and a filter coupled to said subtractor circuit.
20. The audio image enhancement apparatus of claim 19 wherein said filter is a band pass filter, and further including a time delay circuit coupled between said subtractor circuit and said variable gain circuit.
21. The audio image enhancement apparatus of claim 20 further comprising a switch for selectively coupling the output of the subtractor circuit directly to said variable gain circuit or via said filter and time delay circuit.
22. The audio image enhancement apparatus of claim 16 further comprising first and second equalization filters, said first equalization filter being coupled between said variable gain circuit and said adder, and said second equalization filter being coupled between said variable gain circuit and said subtractor.
23. The audio image enhancement apparatus of claim 16 further comprising a feedback circuit coupled to the outputs of said adder and said subtractor and to said variable gain circuit for adjusting the gain of said variable gain circuit in response to detected levels of spatial information at the enhanced audio outputs.
24. An audio image enhancement apparatus comprising:
(a) a plurality of audio inputs;
(b) first and second enhanced audio outputs;
(c) a bus having a plurality of audio lines;
(d) a plurality of joysticks, each joystick being associated with an audio input, each audio input having an input circuit for steering different amounts of the signal at said audio input onto the lines of said bus based upon the position of said joystick;
(e) a sound processing circuit coupled to at least a pair of lines of said bus; and
(f) output amplifiers having outputs coupled to said enhanced audio outputs, each output amplifier having inputs for summing signals on selected ones of said audio lines in said bus and also for summing signals output from said sound processing circuit.
25. The audio image enhancement apparatus of claim 24 wherein said sound processing circuit comprises:
(i) a subtractor circuit for subtracting a first signal communicated via a first one of said pair of lines from a second signal communicated via a second one of said pair of lines;
(ii) a first filter for frequency filtering output of said subtractor;
(iii) a second filter for frequency filtering output of said subtractor and an invertor for inverting the output of said second filter; and wherein the outputs of said first filter and said invertor are coupled to said output amplifiers.
26. A method of enhancing audio information comprising the steps of:
(a) subtracting a pair of audio signals from each other;
(b) time-delaying the results of said subtracting step;
(c) adding the results of said time-delaying step to one of said pair of audio signals; and
(d) subtracting the results of said time-delaying step from the other of said pair of audio signals.
27. The method of claim 26, further including the step of controlling the amplitude of the results of the time-delaying step before those results are added to or subtracted from said audio signals.
PCT/US1992/011335 1990-01-09 1992-12-31 Stereophonic manipulation apparatus and method for sound image enhancement WO1994016537A1 (en)

Priority Applications (15)

Application Number Priority Date Filing Date Title
AU34273/93A AU3427393A (en) 1992-12-31 1992-12-31 Stereophonic manipulation apparatus and method for sound image enhancement
PCT/US1992/011335 WO1994016537A1 (en) 1990-01-09 1992-12-31 Stereophonic manipulation apparatus and method for sound image enhancement
IL10466593A IL104665A (en) 1992-12-31 1993-02-09 Stereophonic manipulation apparatus and method for sound image enhancement
CN93101979A CN1091889A (en) 1992-12-31 1993-02-25 Be used for acoustic image enhanced stereo sound control device and method
PCT/US1993/012688 WO1994016538A1 (en) 1992-12-31 1993-12-30 Sound image manipulation apparatus and method for sound image enhancement
AU60811/94A AU6081194A (en) 1992-12-31 1993-12-30 Sound image manipulation apparatus and method for sound image enhancement
JP6516471A JPH08509104A (en) 1992-12-31 1993-12-30 Sound image operating device and sound image enhancement method
EP94907123A EP0677235B1 (en) 1992-12-31 1993-12-30 Sound image manipulation apparatus for sound image enhancement
KR1019950702676A KR960700620A (en) 1990-01-09 1993-12-30 SOUND IMAGE MANIPULATION APPARATUS AND METHOD FOR SOUND IMAGE ENHANCEMENT
SG1996005287A SG70557A1 (en) 1992-12-31 1993-12-30 Sound image manipulation apparatus and method for sound image enhancement
DE69325922T DE69325922D1 (en) 1992-12-31 1993-12-30 SOUND IMAGE PROCESSING DEVICE FOR IMPROVING SOUND IMAGE
CA002153062A CA2153062A1 (en) 1992-12-31 1993-12-30 Sound image manipulation apparatus and method for sound image enhancement
AT94907123T ATE183050T1 (en) 1992-12-31 1993-12-30 SOUND IMAGE PROCESSING DEVICE FOR SOUND IMAGE IMPROVEMENT
IL12216497A IL122164A0 (en) 1992-12-31 1997-11-11 Stereophonic manipulation apparatus and method for sound image
AU77310/98A AU7731098A (en) 1992-12-31 1998-07-14 Sound image manipulation apparatus and method for sound image enhancement

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US07/463,891 US5412731A (en) 1982-11-08 1990-01-09 Automatic stereophonic manipulation system and apparatus for image enhancement
PCT/US1992/011335 WO1994016537A1 (en) 1990-01-09 1992-12-31 Stereophonic manipulation apparatus and method for sound image enhancement

Publications (1)

Publication Number Publication Date
WO1994016537A1 true WO1994016537A1 (en) 1994-07-21

Family

ID=23841704

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1992/011335 WO1994016537A1 (en) 1990-01-09 1992-12-31 Stereophonic manipulation apparatus and method for sound image enhancement

Country Status (2)

Country Link
KR (1) KR960700620A (en)
WO (1) WO1994016537A1 (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4630298A (en) * 1985-05-30 1986-12-16 Polk Matthew S Method and apparatus for reproducing sound having a realistic ambient field and acoustic image
EP0479395A2 (en) * 1986-03-27 1992-04-08 SRS LABS, Inc. Stereo enhancement system
US5040219A (en) * 1988-11-05 1991-08-13 Mitsubishi Denki Kabushiki Kaisha Sound reproducing apparatus
US5153362A (en) * 1989-10-04 1992-10-06 Yamaha Corporation Electronic musical instrument having pan control function
WO1992012607A1 (en) * 1991-01-08 1992-07-23 Dolby Laboratories Licensing Corporation Encoder/decoder for multidimensional sound fields

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102340723A (en) * 2011-04-25 2012-02-01 深圳市纳芯威科技有限公司 Stereo audio signal separation circuit and audio equipment

Also Published As

Publication number Publication date
KR960700620A (en) 1996-01-20

Similar Documents

Publication Publication Date Title
US5896456A (en) Automatic stereophonic manipulation system and apparatus for image enhancement
EP0677235B1 (en) Sound image manipulation apparatus for sound image enhancement
US5555306A (en) Audio signal processor providing simulated source distance control
EP0965247B1 (en) Multi-channel audio enhancement system for use in recording and playback and methods for providing same
US5459790A (en) Personal sound system with virtually positioned lateral speakers
US5841879A (en) Virtually positioned head mounted surround sound system
US5661812A (en) Head mounted surround sound system
US6144747A (en) Head mounted surround sound system
US5173944A (en) Head related transfer function pseudo-stereophony
US4841572A (en) Stereo synthesizer
EP0404117B1 (en) Surround-sound system
US4706287A (en) Stereo generator
JP2009141972A (en) Apparatus and method for synthesizing pseudo-stereophonic outputs from monophonic input
CA2184160A1 (en) Binaural synthesis, head-related transfer functions, and uses thereof
WO1995030322A1 (en) Apparatus and method for adjusting levels between channels of a sound system
WO2002015637A1 (en) Method and system for recording and reproduction of binaural sound
WO2017165968A1 (en) A system and method for creating three-dimensional binaural audio from stereo, mono and multichannel sound sources
GB2337676A (en) Modifying filter implementing HRTF for virtual sound
WO1994016537A1 (en) Stereophonic manipulation apparatus and method for sound image enhancement
WO1997031505A1 (en) An analog vector processor and method for producing a binaural signal
WO1998054927A1 (en) Method and system for enhancing the audio image created by an audio signal
EP0323830B1 (en) Surround-sound system
AU751831B2 (en) Method and system for recording and reproduction of binaural sound
WO2001060118A1 (en) Audio center channel phantomizer
JPS6297500A (en) Stereo sound field reproducing device

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AU CA JP KR US

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE CH DE DK ES FR GB GR IE IT LU MC NL PT SE

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase

Ref country code: CA