----- Experience in your own room the magical nature of stereo sound -----


Recording & Rendering


Room optimized stereophonic sound reproduction

Below are the slides from my 20-minute presentation at the AES 123rd Convention, 2007 October 5, in New York, Paper Session P1 - PERCEPTION. You can read the actual 17-page Convention Paper 7162 here.

In this context a paper by Brian C. J. Moore, "Controversies and Mysteries in Spatial Hearing", is of great interest to me. It confirms my conclusions about our ability to mask the listening room perceptually, if the reflections are echoes of the loudspeaker's direct sound. The paper came to my attention again only recently, in 2/2008, though I had quoted from it earlier. Unfortunately it was not included in the references of my 2007 paper.

Brian C.J. Moore, "Controversies and mysteries in spatial hearing", Proc. AES 16th International Conf., Rovaniemi, 1999, "Spatial sound reproduction", pp. 249-258

With the New York paper I feel that I have completed a journey that started in the sixties with several popular loudspeakers from the local High-End Audio store in Palo Alto, California. I tried to improve the sound that I heard in my living room. It seemed obvious that one only needed to equalize the sound at the listening position. Since I worked in R&D at HP, albeit the Microwave Division, and had been inspired by similarly interested colleagues, Russ Riley and Lyman Miller, it was not too difficult for us to develop a real time audio analyzer with 1/3rd octave filter banks and true rms detectors in the form of small incandescent light bulbs that were adjusted for constant light output. A microwave point contact diode served as pink noise source. With a 50 MHz 1/f corner there was no shortage of audio output, though popcorn noise could be a problem. The loudspeaker/room equalization attempts showed that a flat frequency response sounded too bright. The attempts were also put in question when a new pair of ESS7 speakers sounded much better, but measured much worse, than my carefully adjusted Advents.

When KEF bought the Model 5450A Fourier Analyzer from HP, a man-high rack of equipment for $50,000 in 1973, and personnel came for training on the test instrument to Santa Clara, I met Laurie Fincham. Now, I thought, I could get all my loudspeaker questions answered. A constant air-mail exchange of letters between the UK and US followed. As it turned out we had many of the same questions. I began to design my own loudspeakers. Drivers were aligned on the baffle for symmetrical diffraction. Crossovers were designed (Linkwitz-Riley) to remove the frequency dependent shift of the main radiation axis. Crossover filtering and driver equalization were performed at line level. Small power amplifiers were designed for active systems. Boxes became sealed, ever smaller and inert. Drivers were magnet mounted for resonance suppression. In hindsight much of this led to improvement of the polar frequency response, but its significance was not fully appreciated until I began working with dipoles. I had been intrigued by the Quad ESL 63 and felt that its shortcomings could be overcome by using conventional dynamic drivers. Many generations of open baffle designs followed. At first they were accompanied by sealed box woofers; then, impressed by Brian Elliott's dipole woofer towers and Don Barringer's experiment with smaller dipole woofers, I went dipole all the way. During my time with Audio Artistry we introduced a complete line of open baffle loudspeakers that very quickly became highly regarded by audiophiles.

My interest had always been in a loudspeaker that was very room independent for its sound. The ORION was as good as I knew how to make it. When I designed the PLUTO, a small low cost speaker and essentially an acoustic point source, I wanted to investigate amongst other things how it phantom imaged relative to the ORION. I did not expect that it would sound so similar to ORION. This then led to a refreshed appreciation of the importance of the polar response and the need to add a rear tweeter to the ORION. It also became clear to me how our perceptual apparatus deals with the loudspeaker and room interaction and manages to create an appealing illusion. Finally now everything seems to be in place to fully enjoy what is in the recording and to forget about loudspeakers and room.

Of course the nature and quality of the recording is either a strong contributor or becomes a limitation to the illusion creation. Following my presentation below are comments from Don Barringer about recording for stereophonic playback. Don is a musician, a former recording engineer for the US Marine Band and "my other pair of ears". We try to keep each other honest.

(Also see and listen to "Accurate sound reproduction from two loudspeakers in a living room" under Publications #23 and investigate the Accurate Stereo performance tests.)

Presentation

	NOTES . . . Good morning! My talk is about Room Reflections Misunderstood -or- How the room is taken out of the 2-channel stereophonic listening experience. Earlier this year it became very clear to me how strongly the loudspeaker’s polar frequency response and the speaker placement in the room determine what we hear when we listen to a 2-channel sound presentation.
	First of all I must emphasize that playback of a recording over two loudspeakers can only create an auditory illusion of the original event. Two channel playback produces loudspeaker cross-talk signals at the ears. The left speaker signal reaches left and right ears. Similarly for the right speaker. In addition there is a multitude of reflected sounds coming from the room surfaces. Stereophonic sound recording and reproduction is a different process from binaural, as with dummy heads and headphones, nor is stereo an attempt at sound field reconstruction which would take many more loudspeakers. With two loudspeakers we can only hope to emanate a sufficient range of auditory cues that allow us to recreate in our mind the recorded acoustic event. In the process we must minimize those auditory cues that mislead us into a different experience, like listening to two loudspeakers in a room.
	It has been my experience that the best strategy for designing loudspeakers and their optimum placement in a room, is to minimize misleading cues. When properly done it is actually possible to create a fairly convincing illusion of listening into a different acoustic space.
	There are some obvious and well known misleading cues that the loudspeakers themselves can contribute:
	- Non-flat on-axis frequency response
	- Resonances in loudspeaker drivers and cabinets
	- Non-linear distortion and the generation of spectral components that were not in the original
	And then there are the potentially misleading cues that the loudspeaker can generate in combination with the room due to:
	- The off-axis sound radiation and the resulting room reflections. Typically the off-axis and on-axis frequency response curves are different.
	I believe this area has not been sufficiently studied and is not fully understood.
	What needs to be considered about a listening room’s contribution to the perceived sound of a 2-channel reproduction is:
	- The temporal symmetry with which sounds are reflected back to the listener
	- The delay with which the reflections arrive at the listener’s ears relative to the direct sound
	- The spectral content of the reflections,
	- and the rate at which they decay
	In addition we have potentially the room modes or resonances at lower frequencies. A lot of attention has been focused on modes and bass reproduction, but not so much on reflections and their effect upon imaging and spatial perception.
	It has been my experience and that of many others, that for optimum stereo reproduction:
	- The loudspeaker-listener triangle should be set up symmetrical to the room boundaries
	- The loudspeakers should be out in the room and at least 1 m or 3 feet away from large reflecting surfaces
	What is a new insight to me is that the loudspeakers should have a uniform polar response such as
	- an acoustically small dipole or open baffle loudspeaker,
	- or an acoustic point source, a monopole or an acoustically small omni-directional loudspeaker.
	Here are the two loudspeaker types that I had designed for different applications. The monopole is a 3-way system in non-resonant enclosures. It is omni-directional radiating up to around 3 kHz and then becomes forward pointing due to the size of the tweeter. Crossovers are at 1 kHz and 100 Hz. The dipole is a 3-way open baffle speaker with conventional dynamic drivers. Crossovers are at 1.4 kHz and 120 Hz. Speakers were measured outdoors on a tower and on the ground. Both loudspeakers have a flat on-axis frequency response when measured outdoors under free-field conditions. A 4pi to 2pi transition between 200 Hz and 100 Hz extends the flat response to half-space conditions for the low frequencies.
	All listening tests were done in my living room. For my personal enjoyment the critical stereo listening position is A, but I also listen a lot from B, further away. Both loudspeakers, dipole D and monopole M were also measured in this room.
	The layout shows the listening triangles, symmetrical to the room boundaries. The speakers are 2 m and more out from the wall behind them. The room extends behind the listener and is acoustically open or dead in that direction. Reverberation time is around 450 ms. The room is fairly live. Both loudspeakers, M and D, sound confusingly similar under these conditions. The surprising observation led me to investigate how this similarity in sound perception might be possible.
	A frequency response measurement from position A shows the effect of the room upon the direct loudspeaker signal. A 200 ms time record has been analyzed. It includes various reflections and room resonances, but is not long enough in duration to fully resolve all room modes. Clearly, dipole and monopole responses look different. But also the corresponding left and right speakers measure differently. The 1/3rd octave smoothed responses show this more clearly. There is little indication of the flat free-field outdoor measurement response in these data. Now let’s look at the response in the time domain to see the contributions from reflections.
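A rough way to see why a 200 ms record cannot fully resolve all room modes: the frequency resolution of a time record of length T is about 1/T. The sketch below is only that arithmetic. The 3 Hz mode spacing is a made-up illustration, not a measurement from this room.

```python
def frequency_resolution_hz(record_length_s):
    """Approximate spacing of resolvable frequencies for a record of this length."""
    return 1.0 / record_length_s

# 200 ms time record, as used for the measurement at position A
resolution = frequency_resolution_hz(0.200)
print("resolution in Hz:", resolution)

# Hypothetical example: two modes only 3 Hz apart (assumed value) fall
# inside one resolution interval and would blur into a single feature.
hypothetical_mode_spacing_hz = 3.0
print("modes blur together:", hypothetical_mode_spacing_hz < resolution)
```

With a 200 ms record the resolution is about 5 Hz, so closely spaced low-frequency modes cannot be separated.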
	How many reflections are generated by a loudspeaker and which direction do they come from? For example, here is a dipole like D in a room corner. Every image of D contributes sound via reflection. In some cases direct, as from the side wall S in other cases by bouncing around between rear wall, side wall and floor, R+S+F. To measure reflections I use a 4-cycle, Blackman window shaped toneburst. The burst covers about 1 octave in frequency. The envelope of the burst signal on a logarithmic scale will be used to visualize reflections.
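A test signal of this kind can be sketched in a few lines. This is a minimal illustration, not the measurement software used for the talk: the 48 kHz sampling rate, the textbook Blackman coefficients (0.42, 0.5, 0.08), the -80 dB display floor and the function names are all assumptions. The level function shows the rectified burst in dB, which is a cruder display than a true envelope.

```python
import math

def blackman_toneburst(freq_hz, cycles=4, fs=48000):
    """Sine burst of `cycles` periods shaped by a Blackman window (fs is assumed)."""
    n = round(fs * cycles / freq_hz)  # burst length in samples
    burst = []
    for i in range(n):
        phase = 2 * math.pi * i / (n - 1)
        window = 0.42 - 0.5 * math.cos(phase) + 0.08 * math.cos(2 * phase)
        burst.append(window * math.sin(2 * math.pi * freq_hz * i / fs))
    return burst

def rectified_level_db(samples, floor_db=-80.0):
    """Rectified magnitude in dB relative to the largest sample, clipped at floor_db."""
    peak = max(abs(s) for s in samples)
    levels = []
    for s in samples:
        mag = abs(s) / peak
        levels.append(max(20 * math.log10(mag), floor_db) if mag > 0 else floor_db)
    return levels

burst = blackman_toneburst(3000)      # 3 kHz burst, as in the later slides
levels = rectified_level_db(burst)
print(len(burst), "samples,", 1000 * len(burst) / 48000, "ms long")
```

In a measurement, the recorded microphone signal would be passed through the same level display, so that the direct burst and each later reflection show up as separate peaks on a logarithmic scale.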
	The dipole loudspeaker has a separate rear tweeter, because the front tweeter is closed in the back. Initially no rear tweeter was used. Under outdoor, free-field measurement conditions it had no effect upon the on-axis frequency response. Also it was invisible for off-axis angles up to +/-60 degrees. In the room, though, it contributes to the sound at the listening position via reflections. The smoothed frequency response at position A shows the rear tweeter on and off. The reflection pattern changes whether the rear tweeter is ON, OFF or Reversed in polarity. The different conditions are audible in terms of timbre and imaging with rear tweeter ON or OFF. They are audible in terms of strange imaging with the rear tweeter ON and then Reversed as for monopole like off-axis behavior. There is some correlation to the predicted reflections from the room corner, but there are many more reflections.
	Now let’s look at dipole and monopole reflection patterns during the first 50 ms. The left most peak is always the direct signal. It is the envelope of the 3 kHz burst. It is followed by the room reflections. You see immediately that the reflection magnitude is lower for the monopole M. That is a result of being closer to the microphone than the dipole D and normalization to the direct signal. The overall pattern though looks different as well. You also see differences between left and right speakers because the room furnishings are not symmetrical. It is interesting to look at the power spectrum of left dipole and monopole that corresponds to the time domain presentation shown here.
	Here we see that the total power spectrum of direct and reflected signals follows the spectral envelope of the direct signal except for some fluctuations. The spectrum is very similar for dipole and monopole, only noise and distortion are higher for the monopole due to a less capable tweeter. Thus, even though the specific reflection patterns for dipole and monopole look different, their spectral content is dominated by the direct signal and its reflections. So far we only looked at the first 50 ms. Now let’s extend the time to 400 ms and place the microphone further out into the room to position B.
	You immediately see the decay of the reflections. It is shown here for two frequency regions, an octave around 3 kHz and an octave around 800 Hz. At 3 kHz the monopole has lower reflections because the tweeter becomes forward directional. At 800 Hz, though, you can clearly see the larger amount of room reflections generated by the monopole compared to the dipole. One might estimate the decay rate, but the envelope is rather ragged. I am of the opinion that reverberation time is not a very meaningful parameter for acoustically small spaces like we have here.
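One simple way to estimate a decay rate is a least-squares straight line through the envelope level in dB. The sketch below uses synthetic data with an exact 450 ms decay (the reverberation time quoted for this room), not measured data, and the function name is hypothetical. A ragged real envelope would make such a fit far less certain, which is the concern raised above.

```python
def decay_slope_db_per_s(times_s, levels_db):
    """Least-squares slope of level (dB) versus time (s)."""
    n = len(times_s)
    mean_t = sum(times_s) / n
    mean_l = sum(levels_db) / n
    num = sum((t - mean_t) * (l - mean_l) for t, l in zip(times_s, levels_db))
    den = sum((t - mean_t) ** 2 for t in times_s)
    return num / den

# Synthetic, perfectly smooth decay over 400 ms (assumed data, for illustration only)
times = [i * 0.01 for i in range(41)]
levels = [-60.0 / 0.45 * t for t in times]

slope = decay_slope_db_per_s(times, levels)
rt60_s = -60.0 / slope   # time for the level to fall by 60 dB
print("slope dB/s:", round(slope, 1), "RT60 s:", round(rt60_s, 3))
```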
	It is illustrative to look at the reflection pattern for different frequency bands. Here I display the full-wave rectified toneburst and its reflections, because the envelope as derived by the energy-time-curve shows artifacts when the signal to noise ratio is not high enough. The display is on a linear amplitude scale. As we go from 3.2 kHz, to 1.6 kHz, to 800 Hz, to 400 Hz, to 200 Hz and finally to 100 Hz, you can see how the direct signal gradually changes from a single spike to the discernible cycles of the toneburst. Note that the visual multiplicity of individual reflections at 3.2 kHz gets gradually integrated into fewer and fewer variations. Time resolution is lost as the reflections overlap more and more. The room begins to respond as a whole as we go down to low frequencies. Reflections lose their meaning as a descriptor and room modes or resonances become important. We have seen some of the reflection patterns for a dipole and a monopole. They are different when viewed in the time domain though their spectral content is similar. We have also seen that the measured frequency response is different for M and D. Neither measurement gives an indication of the great similarity between M and D that is heard in the room.
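The loss of time resolution follows from the burst length itself. The sketch below assumes the same 4-cycle burst is used in every band and a speed of sound of 343 m/s. Both are assumptions, as are the function names.

```python
SPEED_OF_SOUND_M_S = 343.0   # assumed, air at room temperature
CYCLES = 4                   # 4-cycle burst, assumed the same in every band

def burst_duration_ms(freq_hz, cycles=CYCLES):
    """Duration of the toneburst in milliseconds."""
    return 1000.0 * cycles / freq_hz

def burst_length_m(freq_hz, cycles=CYCLES):
    """Distance sound travels in air while the burst lasts."""
    return SPEED_OF_SOUND_M_S * cycles / freq_hz

for f in (3200, 1600, 800, 400, 200, 100):
    print(f, "Hz:", burst_duration_ms(f), "ms,", round(burst_length_m(f), 2), "m")
```

At 3.2 kHz the burst lasts about 1.25 ms, so reflections a metre or so apart in path length remain separate. At 100 Hz it lasts 40 ms, longer than the delay of most early reflections in a living room, so they merge with the direct signal.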
	To summarize the similarity: The dipolar and monopolar loudspeakers sound almost identical in their spectral balance and clarity, despite the differences in measured room response and burst response. Phantom imaging is very similar, but with greater depth for the dipole. Loudspeakers and room "disappear" so to speak. This to me is a most surprising result. I have demonstrated it many times to visitors seated in A or B. They can switch instantly between monopole and dipole. People got up from their chair to walk over to the speaker to listen which one is playing. I would not have expected such similarity because the two loudspeakers follow different concepts and even use different quality drive elements. But, and this is the key point, both speakers have the same on-axis frequency response under free-space conditions. While one is a dipole and the other is essentially an omni they each have an off-axis response that is an attenuated version of the on-axis response. Thus the room is illuminated with spectral uniformity by both speakers. The monopole is like a bare light bulb, the dipole like two flash lights back to back.
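The light bulb and flash light picture can be put into numbers with idealized patterns. The sketch assumes a perfect omni (constant in all directions) and a perfect dipole (cosine of the off-axis angle) at all frequencies. These are textbook models, not the measured polar responses of the two speakers, and the function names are hypothetical.

```python
import math

def dipole_off_axis_db(angle_deg):
    """Level of an ideal dipole relative to on-axis: 20*log10(|cos(angle)|)."""
    return 20.0 * math.log10(abs(math.cos(math.radians(angle_deg))))

def mean_power_ratio(pattern, steps=20000):
    """Average of pattern(theta)**2 over the sphere (midpoint rule), on-axis = 1."""
    d_theta = math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * d_theta
        total += pattern(theta) ** 2 * math.sin(theta) * d_theta
    return total / 2.0

omni_power = mean_power_ratio(lambda theta: 1.0)
dipole_power = mean_power_ratio(math.cos)

print("dipole at 60 degrees, dB:", round(dipole_off_axis_db(60), 2))
print("dipole/omni total power:", round(dipole_power / omni_power, 3))
```

In both models the off-axis level is a fixed number of dB below the on-axis level at every frequency, which is what "an attenuated version of the on-axis response" means. For the same on-axis level the ideal dipole radiates one third of the total power of the omni, about 4.8 dB less, which fits the smaller amount of reflected sound seen for the dipole at 800 Hz.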
	So here is the hypothesis. Confusing cues from the room are minimized if the reflections are:
	1 - left-right symmetrical,
	2 - delayed by >6 ms, and
	3 - attenuated copies of the direct sound in spectral content.
	I think a strong case can be made for our ability to sort out different auditory cues and to focus our attention to hear what is of interest at the moment. It goes back millions of years and is evolutionary programming of the processor between two ears.
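The 6 ms delay can be translated into path length. The sketch below assumes a speed of sound of 343 m/s and a deliberately simple, hypothetical geometry (listener 3 m in front of the speaker, one wall 1 m directly behind it). It is not a model of the actual room.

```python
SPEED_OF_SOUND_M_S = 343.0   # assumed, air at room temperature

def reflection_delay_ms(direct_path_m, reflected_path_m):
    """Arrival delay of a reflection relative to the direct sound."""
    return 1000.0 * (reflected_path_m - direct_path_m) / SPEED_OF_SOUND_M_S

def min_extra_path_m(delay_ms):
    """Extra path length a reflection must travel to arrive delay_ms late."""
    return SPEED_OF_SOUND_M_S * delay_ms / 1000.0

print("extra path for 6 ms:", round(min_extra_path_m(6.0), 2), "m")

# Hypothetical geometry: direct path 3 m; the reflection travels 1 m back
# to the wall, 1 m forward again, then the same 3 m to the listener.
print("delay, wall 1 m behind:", round(reflection_delay_ms(3.0, 5.0), 2), "ms")
```

A 6 ms delay corresponds to roughly 2 m of extra path. A wall 1 m behind the speaker gives a little under 6 ms in this simple case, which is in line with treating 1 m as the minimum distance from large reflecting surfaces.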
	Spatial hearing must have evolved out of the need for survival. For that it is essential to know the direction and the distance from which a threat is coming. So attention is paid to cues that tell direction and distance. Those cues must be sorted out in different surroundings, like in an open savanna or a thick forest where reflections and reverberation have different acoustic characteristics. It is also likely that we learned to integrate reflections of the threat with its direct sound and to mask stationary, non-threatening sounds, thus adapting to different situations. Listening in rooms is just a blink on the evolutionary time scale and so we still use the same adaptive and hard wired processes to tell direction, and distance, and we mask what is not relevant. Of course, much of the hearing process has been studied and the Precedence effect clearly is at play here.
	The precedence effect shows up in 3 phenomena in a room with multiple reflections:
	1 - as a localization effect, where the direct and reflected sound are heard as a single entity from the location of the direct sound.
	2 - as the Haas effect, where a direct sound is integrated with a delayed sound for increased loudness.
	3 - as de-reverberation. We are normally not much aware of reverberated sound even when its energy is larger than that of the direct sound.
	Reading the literature on sound perception I conclude that the specific case of 2-channel sound reproduction in an acoustically small space, like a living room, deserves further investigation. Stereophonic listening, though, is a complicated case to study because of the variability of conditions and the many parameters that influence it.
	Given the hypothesis of symmetry of reflections, of delays >6 ms and in particular of the spectral content of the reflections in order that the room is not heard, there are then a number of requirements upon loudspeakers and rooms. They are often not met and so become impediments to creating a realistic impression of a 2-channel recorded acoustic event.
	1 - The polar response of typical box loudspeakers is omni-directional at low frequencies and becomes increasingly forward directional with higher frequencies
	2 - Many loudspeakers have insufficient dynamic range and distort at high levels
	3 - Speakers are placed too close to walls and not symmetrical with respect to the room
	4 - Rooms are acoustically treated, but the treatment primarily attenuates high frequencies
	5 - Electronic room equalization is based on in-room measurements that correlate poorly with perception
	and finally, but most importantly:
	6 - Recordings that were done with too many microphones and in synthesized acoustic spaces, so there is no coherent acoustic space in the recording to begin with.
	On the other hand, if loudspeakers, room and recording are appropriately optimized, then two-channel playback in a normal living space can provide an experience that is fully satisfying. Loudspeakers and room disappear and the illusion of listening into a different space takes over. Thank you for your attention!


The illusion of listening into a different space can only be created if the recording contains the necessary auditory cues.
The following are comments from a recording engineer's perspective.

A new look at Recording for Stereo

The ORION is qualified to serve as a standard in monitor loudspeakers. It does not depend upon room standardization or special acoustic treatment to enhance uniformity of result. A normal room with normal acoustics will optimize the listener’s "room subtraction filters", reducing both the room’s influence and the effect of differences between rooms. Once the value and utility of this system was recognized and accepted, some conclusions involving acoustical recording have formed as a result.

Stereo, it turns out, is a less restricted and far more capable medium than previously thought. When well executed, there is a reassuring timelessness to it, and surprise at how complete the stereo experience can be. Who knew there was this much gold left to be mined? Even sub-optimal recordings benefit and are given new life.

One can also appreciate more easily that the audience perspective is the reality, the true reference. It represents a widely shared experience that ought to be respected rather than ignored. For illustration, I have watched a few young conscientious conductors begin a passage in rehearsal, then quickly run back into the auditorium for a few moments to judge the real balance. They know where reality resides, and it isn’t the podium. In the end, of course, there is no practical aural escape from the podium, but that dash for a brief glimpse of reality remains telling. Yet despite this, it is the podium perspective that has dominated stereo recording from the beginning. There are reasons for this, but most are no longer valid.

More technically, the newly demonstrated importance of a uniform polar pattern in loudspeakers logically confirms a similar importance in microphones.

We simultaneously require coherence and incoherence (spaciousness, broadly defined) in stereo recordings and two microphones, whatever their patterns or however arrayed, cannot satisfy this requirement. They must inevitably produce a compromise that cannot fulfill the potential of stereo.

New recording techniques for increased realism are required to respond to this new level of accuracy in monitoring. As an example, a recording system that addresses these observations would require four microphones that separate the contradictory requirements: two for coherent information and two for incoherent information. Or more simply, one pair for the cause and one pair for the effect. Such a system is under evaluation. And this time – for the very first time, I think – a plausible standard can be employed to evaluate the process and the result.

Don Barringer, 11/2007

Also see and listen to "Accurate sound reproduction from two loudspeakers in a living room" under Publications #23.


	What you hear is not the air pressure variation in itself but what has drawn your attention in the streams of superimposed air pressure variations at your eardrums.
	An acoustic event has dimensions of Time, Tone, Loudness and Space.
	Have they been recorded and rendered sensibly?
Last revised: 02/15/2023 - © 1999-2019 LINKWITZ LAB, All Rights Reserved
