Low Frequency Loudness Compression

Eric SVL

Member
Joined
May 1, 2017
Posts
173
Preamp, Processor or Receiver
Denon AVR-X4500H
Main Amp
Hypex NCore NC252MP
DAC
Micca OriGen G2
Computer Audio
iLoud MTM
Universal / Blu-ray / CD Player
Sony PS3, PS4
Streaming Equipment
Google Chromecast
Streaming Subscriptions
GIK Tri-Traps
Front Speakers
Buchardt S400
Surround Speakers
Polk LSiM 702
Front Height Speakers
Focal Chorus OD 706 V
Rear Height Speakers
Focal Chorus OD 706 V
Subwoofers
Rythmik
Other Speakers
ELAC Debut Reference DFR52
Screen
Samsung PN64H5000
It was 1933, Noah; I wonder if anything other than sine waves was available. I am suggesting that it takes a while for the curve of sensitivity to establish itself when the overall volume is changed. But I am reconsidering even that now, due to finding the following. To be honest, I hadn't taken any notice of this until just now.
It appears that many of our assumptions that we hear best and flattest at high levels are probably mistaken.
Fletcher and Munson's work has since been improved upon, with very different results. This graph suggests to me that the spectra we hear at different volumes, particularly at commonly occurring levels, don't vary much at all.
[Attachment 32817: equal-loudness contour graph]
Based on my experience with Dirac Live, which does not have an equal loudness function like Audyssey, bass levels overpower the system if I go past my normal listening levels. I believe there definitely is a flattening out of the needed bass levels as volume increases. Low levels always require the most bass boost to achieve perceived spectral balance. The reality would seem to be somewhere in between the red and blue lines.
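To make the idea concrete, here is a minimal Python sketch of the general principle behind loudness compensation (this is not Audyssey's or Dirac's actual algorithm): the bass boost needed at a quiet listening level falls out as the difference between the equal-loudness contours at the two levels, referenced to 1 kHz. The contour numbers below are illustrative placeholders, not ISO 226 data.

```python
import numpy as np

# Illustrative (NOT ISO 226) equal-loudness contour values in dB SPL,
# at a few low/mid frequencies, for two loudness levels.
freqs_hz = np.array([31.5, 63.0, 125.0, 250.0, 500.0, 1000.0])
contour_40_phon = np.array([94.0, 80.0, 67.0, 57.0, 48.0, 40.0])   # placeholder
contour_80_phon = np.array([114.0, 104.0, 96.0, 90.0, 84.0, 80.0])  # placeholder

def loudness_compensation_db(quiet, loud):
    """Bass boost needed at the quiet level so the perceived spectral
    balance matches the loud level, referenced to 1 kHz (last entry)."""
    quiet_rel = quiet - quiet[-1]   # contour shape re: its own 1 kHz value
    loud_rel = loud - loud[-1]
    # Where the quiet contour rises faster toward LF, boost by the difference.
    return quiet_rel - loud_rel

boost = loudness_compensation_db(contour_40_phon, contour_80_phon)
for f, b in zip(freqs_hz, boost):
    print(f"{f:7.1f} Hz: boost {b:+5.1f} dB at the quieter level")
```

With real contour data, the same arithmetic yields the familiar result: the quieter the playback, the more low-frequency boost is needed, and the required boost shrinks as the volume rises.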
 

Matthew J Poes

AV Addict
Joined
Oct 18, 2017
Posts
1,904
I agree with you that minimum-phase PEQ may correct amplitude and decay, under the assumption that the impulse response is minimum phase. I was thinking that was not the case, but you said "a room is generally minimum phase at low frequencies". I checked various measurements I made, even an extreme one with a standing wave in a 1-meter-long steel tube of 4 mm diameter, and yes, the impulse response is nearly perfectly minimum phase, with a negligible amount of excess phase... so your assumption seems to hold in all of my measurements. I was convinced it was not the case, but you proved me wrong: good! ;-)

It gets argued a lot; even I held that view, but since opening my acoustics consulting company I've had the chance to measure, or get measurements of, hundreds of small domestic rooms. It's not that it is NEVER the case that a room has some non-minimum-phase behavior below, say, 300 Hz; it's just that it is not the norm, especially once you do something like spatial averaging. In theory a room should always be minimum phase, so an argument against any evidence to the contrary is that it may very well be a measurement artifact, or something introduced artificially such as an FIR bass correction.
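For anyone who wants to test this on their own measurements, here is a rough numpy/scipy sketch of one common way to check it (an approximation for illustration, not REW's or anyone's production code): derive the minimum-phase counterpart of the measured response from its magnitude, and whatever phase is left over is excess phase.

```python
import numpy as np
from scipy.fft import fft, fftfreq
from scipy.signal import hilbert

def excess_phase(ir, fs):
    """Rough excess-phase estimate for a measured impulse response.

    For a minimum-phase system, phase = -(Hilbert transform of log|H|).
    Whatever phase remains after subtracting that (and the bulk
    time-of-flight delay) is excess phase; near zero means the
    measurement behaves as minimum phase at those frequencies."""
    ir = np.roll(ir, -int(np.argmax(np.abs(ir))))  # crude bulk-delay removal
    H = fft(ir)
    log_mag = np.log(np.abs(H) + 1e-12)
    phase_min = -np.imag(hilbert(log_mag))   # discrete Hilbert relation
    phase_meas = np.unwrap(np.angle(H))
    return fftfreq(len(ir), 1.0 / fs), phase_meas - phase_min
```

If the excess phase stays near zero below roughly 300 Hz, the measurement is effectively minimum phase there, which is what the spatially averaged room data described above shows.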
 

Matthew J Poes

AV Addict
Joined
Oct 18, 2017
Posts
1,904
That raises the question of what duration tones are used to develop the loudness curves.

I would have guessed steady state, but your statements imply otherwise.

I would guess any of the old research, e.g. the development of third-octave bands, used steady-state sine waves. I am suggesting that it may take quite a while for a Fletcher-Munson sensitivity curve to kick in after a change from a different one.
Much longer than musical events. As I said earlier, it depends on what one means by compression. I really don't think the hearing system reacts quickly enough to compress music.

I will be honest, I am not following everything that has transpired here, but I will add that there is a psychoacoustic phenomenon in which our perception of loudness (as in instantaneous loudness) is affected by the amount of time it takes for the sound to decay. This is mostly true at low frequencies. It is a form of time-intensity trading, in which you trade an "increase" in intensity for an increase in time (it decays over a longer period) but perceive it as just as loud or louder.

Loudness curves, however, were mostly studied on headphones, and I believe those have far shorter decay times than rooms. I would imagine they are not affected by it.

Now, whether our perception of loudness varies with a source being impulsive or not, I'm not sure. It's nothing I've ever considered.
 

Matthew J Poes

AV Addict
Joined
Oct 18, 2017
Posts
1,904
This seems a correct definition to me.

A nice visualization feature has been added in recent REW beta versions: it allows you to estimate and plot the decay rate interactively as a function of the frequency you select.

It's actually using a different means of calculating RT. What I did in my analysis is related, but I had to select those frequencies manually. The new feature is really cool. I've had my concerns that it might be providing some misinformation, but so far things seem good. One unexpected benefit is that it has provided clearer proof that room wall construction can improve overall LF reverberation time. There has been debate about whether this is true. Simulations show it to be true, but it was generally impossible to measure directly. This seems to provide a reliable way to measure it directly.
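REW's exact method isn't documented in this thread, so the following is only a sketch of the general approach such a feature would use (band-pass around the chosen frequency, Schroeder backward integration, then a straight-line fit to the decay curve), not REW's actual code:

```python
import numpy as np
from scipy.signal import butter, sosfilt

def band_decay_rate(ir, fs, f0, octave_frac=3, fit_db=(-5.0, -25.0)):
    """Estimate the decay rate (dB/s) of an impulse response in a
    narrow band around f0, via Schroeder backward integration."""
    lo = f0 * 2 ** (-1 / (2 * octave_frac))
    hi = f0 * 2 ** (1 / (2 * octave_frac))
    sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
    h = sosfilt(sos, ir)
    # Schroeder energy decay curve: reverse cumulative energy, in dB.
    edc = np.cumsum(h[::-1] ** 2)[::-1]
    edc_db = 10 * np.log10(edc / edc[0] + 1e-12)
    t = np.arange(len(ir)) / fs
    # Fit a line over the chosen decay window (e.g. -5 to -25 dB).
    mask = (edc_db <= fit_db[0]) & (edc_db >= fit_db[1])
    slope, _ = np.polyfit(t[mask], edc_db[mask], 1)
    return slope  # negative, in dB per second
```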
 

Matthew J Poes

AV Addict
Joined
Oct 18, 2017
Posts
1,904
Very interesting discussion.

Perhaps some of the debate about decay is due to a bit of imprecision in its definition.

If I remember correctly from my mechanical vibrations text, it's defined as a % amplitude reduction per cycle, or perhaps more usefully for audio, dB/sec.

That would remove initial amplitude as a variable and its potential to muddy the waters.

Hi Noah,

So I think what I was trying to say (I don't think I used an imprecise term, as I used the terms taught to me in acoustical physics courses and used by folks like Toole, Welti, and Geddes) is that the common assumption is that EQ doesn't increase the rate of decay in a minimum-phase system, it simply knocks it down in level. What people think is happening is that if it decays (at the ringing frequency) at 10 dB per second, when you apply EQ the level has been reduced by some amount, say 6 dB, but the decay is still 10 dB per second. That is not true. The rate of decay actually increases, to something like 12 dB/sec or 16 dB/sec, etc. The Schroeder integral is a calculation of that. The true decay rate is not constant; it increases as time goes on and eventually levels off (though it will look like the opposite due to noise in the measurements). The point I wanted to make is that because of how ringing works, what it is (ringing is essentially a property of phase), lowering its level and increasing its rate of decay are synonymous. That is always true for a minimum-phase system. Rooms are generally minimum phase, and certainly modes and SBIR are minimum phase, and that brings us to the second property of minimum phase: the inverse of the amplitude response of a mode comes with the inverse of the phase response as well. So it must decay faster.
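As a toy demonstration of that level/decay equivalence (a sketch under idealized assumptions, not a measurement of any real room): model the mode as an all-pole second-order resonator, then apply a minimum-phase EQ whose zeros exactly cancel the mode's poles and substitute better-damped ones. The cascade necessarily decays faster.

```python
import numpy as np
from scipy.signal import lfilter

fs = 48_000
f0 = 60.0  # toy "room mode" frequency

def resonator_den(f0, q, fs):
    """Denominator of an all-pole 2nd-order resonator: a pole pair at f0
    whose radius makes the envelope decay like exp(-pi * f0 * t / q)."""
    w0 = 2 * np.pi * f0 / fs
    r = np.exp(-w0 / (2 * q))  # pole radius sets the decay rate
    return np.array([1.0, -2 * r * np.cos(w0), r * r])

a_mode = resonator_den(f0, q=30.0, fs=fs)    # underdamped mode: rings
a_target = resonator_den(f0, q=5.0, fs=fs)   # better-damped target

x = np.zeros(fs)   # 1-second unit impulse
x[0] = 1.0
h_mode = lfilter([1.0], a_mode, x)
# Minimum-phase EQ: its zeros cancel the mode's poles exactly and
# substitute better-damped ones, so the cascade *must* decay faster.
h_eq = lfilter(a_mode, a_target, h_mode)

# Analytic decay rates (dB/s) implied by the chosen pole radii:
for name, q in [("mode alone", 30.0), ("mode + EQ ", 5.0)]:
    print(f"{name}: ~{8.686 * np.pi * f0 / q:.0f} dB/s decay")
```

Feeding h_mode and h_eq to a Schroeder-integration fit like the band_decay_rate() sketch above should reproduce these figures approximately.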
 

Matthew J Poes

AV Addict
Joined
Oct 18, 2017
Posts
1,904
Well, you seem sure of yourself, so OK, I'm fine.

However, I have to react to this statement:



That would certainly be true if we had one ear, but as we have two ears there is room for doubt: what we hear is probably related to the two sound pressure signals at each eardrum, combined with visual cues, temporal information, knowledge, and anything else.

I made binaural measurements in my room some time ago, and we can clearly see the differences in amplitude between the two ears at some frequencies around 100-200 Hz, caused by "semi"-stationary waves establishing themselves in directions not perpendicular to the axis of the two ears.

Here are the evaluations for each ear:
[Attachment 32780: per-ear amplitude estimates]

I obtained the measurement by putting on some very open headphones (you hear nearly the same thing whether the headphones are worn or not). Then I played sine waves on a line array placed in front of me at about 2 meters, in my treated room (very absorbent: RT60 of ~50-100 ms at mid-high frequencies, and very good absorption at low frequencies compared to most rooms), and I adjusted, with a gamepad and my own custom software, the amplitude and delay to each headphone driver in order to get perfect "perceived" sound cancellation. I estimated the amplitude error at about 0.1 dB, to which we should add the asymmetry between the L/R headphone drivers, which were not calibrated. In the end, what the left ear and the right ear perceive is noticeably different, and it is not captured by a single microphone.

In a more reverberant room, I bet you would get large sound pressure amplitude differences between the two ears.

I think you are making a lot of inaccurate assumptions here. The paper that AJ cited talks about how there are large differences at higher frequencies, but that isn't true at low frequencies. Due to the nature of low frequencies, you won't see big differences between the ears below 100 Hz. Second, it takes at least one or more cycles to perceive a tone. At 100 Hz a wavelength is about 3.4 meters. In a given room, how many walls will a 100 Hz tone have bounced off before one cycle is complete? In most rooms you are looking at 3rd- or 4th-order reflections at least. Basically, in most small domestic rooms, we have a reverberant field at low frequencies dominated more by the room than by the speakers themselves. Every expert I've ever discussed this with has agreed that as we get lower in frequency, what we hear is dominated by the room, and generally it can be assumed that what we measure is what we hear. There is good agreement between the perception of bass quality and a spatially averaged measurement at low frequencies.
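The arithmetic behind that paragraph, as a quick sketch (the room dimension and the number of cycles needed to perceive pitch are illustrative assumptions):

```python
c = 343.0     # speed of sound, m/s
f = 100.0     # tone frequency, Hz
room = 5.0    # assumed typical room dimension, m

wavelength = c / f              # ~3.4 m, comparable to the room itself
period_ms = 1000.0 / f          # 10 ms per cycle
cycles_to_hear = 3              # assume a few cycles to perceive pitch
travel = cycles_to_hear * wavelength   # ~10 m of propagation
print(f"wavelength: {wavelength:.1f} m, period: {period_ms:.0f} ms")
print(f"bounces before the tone registers: ~{travel / room:.0f}")
```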

I don't know what you are showing above or how it was obtained, so I can't really speak to the claim you are making; just that it is well agreed that, at low frequencies in small domestic rooms, humans mostly hear reflections.

I believe that what AJ is talking about, which I believe is the argument made by David Griesinger, is that the methods used to obtain the flattest possible bass response inevitably lead to a loss of spaciousness at low frequencies.

However, there are a number of things to keep in mind. David's work is really interesting; I was fascinated by it and loved exploring it, but at the end of the day there is very little subjective research on this topic. Threshold research has established that it's barely noticeable with contrived test tones and not noticeable with music. In fact, when David tried to sell the idea to others at Harman, they tested it and couldn't reliably replicate the results. As far as I know it has not been published, and I was told it wasn't published because it didn't seem worthwhile. Instead, the research that Toole and Olive did seemed to favor flatness over spaciousness. Some have argued this is because the flatness issue matters most below 100 Hz, while the spaciousness issue tapers off quickly below 100 Hz, so the loss of lateral separation of LF sources is of minor to no concern when summing subs to mono at 80 Hz. We can disagree on how true that is all we want, but we can't pretend the research has proven one idea right over another. I would call it inconclusive at best.

David actually sent me all his test tones, and I did ABX testing with myself and a few friends/colleagues. We spent hours getting it to work in the first place, and once it finally did work (using laterally separated stereo subs), we found that nobody got a reliable score on most of the music tracks. Most did get it on headphones, however, and we all got it with contrived test signals; that was actually how we confirmed the setup was correct. I shared this with Welti and Olive to figure out what we were doing wrong, and Sean said it was exactly what he had found. I was actually writing an article in support of the approach, only to have my tests confirm it wasn't what I had hoped. I changed the article at the last minute to ultimately conclude that, for me, I prefer flatness over spaciousness in those last few octaves. My rationale was that you have to pick one or the other, and my tests matched those of others, supporting the notion that flatness in fact mattered more than spaciousness at low frequencies, an effect that wasn't really audible.
 

Jean Ibarz

New Member
Thread Starter
Joined
Jun 23, 2017
Posts
40
Preamp, Processor or Receiver
Computer
Main Amp
Gemini XP3000
Additional Amp
Samson SERVO 600
Other Amp
Samson SERVO 200 and Yamaha STR-DB840
Hi,

Sorry for the long reply. Just to clarify what I plotted here: https://www.avnirvana.com/attachments/image-png.32780/

I will try to explain the experiment better, and explain what the plotted amplitudes represent.

The experiment is as follows:

I have a sound source in a room, specifically a line array, part of a 3-way actively filtered system composed of multiple speakers. The line array I used is the one on the left in this picture: https://drive.google.com/file/d/1xXHpzuGyJQS6AJz5u2fq9SqjgM2A8l38/view?usp=sharing. The speakers in the center, reproducing the bass, were also active. A drawing of the listening setup can be found here:
[Image: experiment-setup.png]

The system measurement in .mdat format is available here: https://drive.google.com/file/d/1l7YUyRizLNra0XT1aSaz4kJN3FTIUCTn/view?usp=sharing
And the averaged amplitude response looked like this at different positions around the center of my sofa:
[Attachment 33765: averaged amplitude response]


I can't say for sure what the system response was at the listening position, but it was quite flat, because I was using the setup and the sound was really good at that time (however, I applied a target curve boosting the bass below 120 Hz by about 8-12 dB :p).

So, because the sound was good, I wanted to see how well it would be replicated with my headphones. I used these modded headphones: https://drive.google.com/drive/folders/1IxrfG6SA8HqXYztnUiUc7RoqRb9KBIe2?usp=sharing
The in-ear response of the modded headphones looks approximately like this:
[Attachment 33766: in-ear headphone response]


When wearing the headphones, my audio system still sounded really good, as if I had no headphones on my head. So I assumed that the headphones were quite "acoustically transparent" and had, at least, a negligible effect on the perceived quality of my system's response, as perceived by me.

I started from the postulate that, since the perceived sound was good, by using the headphones to achieve a perfect cancellation of sine waves emitted by the line array, I would be able to know exactly the signal to send to my headphones to perfectly reproduce the sound reaching my eardrums. That is, I would be able to estimate the combined impulse response (System Response * Binaural Room Impulse Response * Headphone Compensation Response).

So I wrote a piece of software that lets me vary independently, using a gamepad, the amplitude or the delay of the signal sent to the left or right speaker of my headphones, because achieving sine-wave cancellation only requires adjusting the amplitude and phase of the sound emitted by the headphone speakers.

We can reasonably assume that the response of my system was quite linear; the headphone amplitude response was not linear at all, but the curve was really smooth.
So I was expecting to get, as a result, a smooth amplitude response, and nearly the same amplitude response for the left- and right-ear signals.

To illustrate with a simpler example, consider the case where the line array and the headphones both have a perfectly flat amplitude response. If a signal of -10 dB at, say, 100 Hz is required on the left headphone speaker, that means my left headphone speaker can reproduce the sine wave emitted by my line array at 100 Hz, as perceived at my left ear, with an input of -10 dB. If, at 120 Hz, the left headphone speaker requires an input of -6 dB, that means the signal as perceived by my left ear is 4 dB stronger than at 100 Hz (because I assumed the headphone response is perfectly flat). Hence, the amplitudes sent to my headphones are roughly the amplitude response of the BRIR for my left ear for the current setup (position in the room, head direction, distance from the speaker, etc.), up to:
- some arbitrary gain,
- the error introduced by my line-array,
- the error introduced by my headphones,
- and an error that I estimate negligible (about 0.1 dB), corresponding to the error in setting the amplitude of the headphone speakers to achieve perfect sine-wave cancellation.
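A sketch of how those two text files (linked below) could be turned into comparable per-ear curves; the file names and the two-column "frequency amplitude" format are assumptions on my part:

```python
import numpy as np
import matplotlib.pyplot as plt

def load_cancellation(path):
    """Each line: '<frequency_hz> <amplitude_db>' (assumed format).
    The drive level that nulls the room tone at the eardrum is, up to
    a constant and the headphone's own response, the level the ear
    actually receives from the room at that frequency."""
    data = np.loadtxt(path)
    return data[:, 0], data[:, 1]

f_l, a_l = load_cancellation("left_ear.txt")    # hypothetical filenames
f_r, a_r = load_cancellation("right_ear.txt")

# Normalize each curve to its own mean so only the shape remains.
plt.semilogx(f_l, a_l - a_l.mean(), label="left ear")
plt.semilogx(f_r, a_r - a_r.mean(), label="right ear")
plt.xlabel("Frequency (Hz)")
plt.ylabel("Relative level (dB)")
plt.legend()
plt.show()
```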

The screenshots I took for each sine wave cancellation are available here: https://drive.google.com/drive/folders/1w46cdWmKvKU0OdqrOkRHxiUhOpJ-P4yu?usp=sharing
Each screenshot shows the sine-wave frequency used in the REW generator and the settings obtained by trial and error, using my program and the gamepad to adjust the amplitude and delay sent to the L and R headphone speakers to achieve perfect perceived cancellation.

Repeating the process with a lot of sine waves (really, really long and cumbersome), I generated two text files, one for each ear, containing the (frequency, amplitude) values:
Left ear: https://drive.google.com/file/d/1NbR3_x2312VmAbzjty4KdSOHJI0tTaAu/view?usp=sharing
Right ear: https://drive.google.com/file/d/1USnkObFmgf7kcK2_ajx5OFBy7E1S3lDB/view?usp=sharing

And plotted in REW, the raw curves are what I posted previously. Here is a better plot:
[Image: left/right headphone cancellation amplitude curves]

(red = left headphone speaker, green = right headphone speaker)
I cannot explain the differences between the L and R ears at frequencies around 130-180 Hz, where the wavelength is around 2.3 meters, unless there were stationary waves in my room (there were very few, but there were still some...).
 


Jean Ibarz

New Member
Thread Starter
Joined
Jun 23, 2017
Posts
40
Preamp, Processor or Receiver
Computer
Main Amp
Gemini XP3000
Additional Amp
Samson SERVO 600
Other Amp
Samson SERVO 200 and Yamaha STR-DB840
Since then, I have not had the motivation to repeat the process more rigorously and in free-field conditions (i.e., on open land, with no walls or ceiling around). Initially I suspected that the transmission of sound through bone conduction was not negligible, and that it might explain why my binaural reproduction was really bad (especially in the treble). Doing this experiment, I was able to measure "subjectively" the sound perceived by my brain, no matter how the auditory nerves were stimulated (through air conduction only, or both air and bone conduction). Now I think that bone conduction may be negligible (though I'm still not sure), but I think the bigger problems are standing waves establishing themselves in the ear canal, and the way we measure the sound, i.e., measuring the SPL at the entrance of the ear canal instead of at the eardrum, which cannot be done correctly; so we should rely instead on measuring the total sound energy density, a function of SPL and particle velocity magnitude, as explained here: https://www.researchgate.net/public..._subjects_from_pressure-velocity_measurements

So my guess is that maybe we need to rely on the total sound energy density in each ear canal, instead of the usual SPL at some point in space, when we want to calibrate an audio system; or maybe, at least, the total sound energy density at the sweet spot. But I may be plainly wrong too... It's just something I would like to verify. I cannot find, however, any pressure-velocity measurements of audio systems in different rooms from which to get some insight, and those pressure-velocity sensors are so expensive that I cannot do these investigations myself :p.
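For reference, the total sound energy density the linked paper works with combines a potential (pressure) term and a kinetic (particle velocity) term. A minimal sketch, assuming calibrated pressure and velocity signals from a p-u probe:

```python
import numpy as np

RHO0 = 1.204   # air density, kg/m^3 (20 C)
C = 343.0      # speed of sound, m/s

def total_energy_density(p, ux, uy, uz):
    """Time-averaged total acoustic energy density (J/m^3) from a
    pressure signal p (Pa) and particle velocity components (m/s):
    potential term p^2 / (2*rho0*c^2) plus kinetic term rho0*|u|^2 / 2."""
    potential = np.mean(p ** 2) / (2 * RHO0 * C ** 2)
    kinetic = RHO0 * np.mean(ux ** 2 + uy ** 2 + uz ** 2) / 2
    return potential + kinetic
```

In a pure plane wave the two terms are equal, so SPL alone tells the whole story; near boundaries and in standing waves they diverge, which is exactly where the two quantities would disagree.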
 

Matthew J Poes

AV Addict
Joined
Oct 18, 2017
Posts
1,904
Since then, I have not had the motivation to repeat the process more rigorously and in free-field conditions (i.e., on open land, with no walls or ceiling around). Initially I suspected that the transmission of sound through bone conduction was not negligible, and that it might explain why my binaural reproduction was really bad (especially in the treble). Doing this experiment, I was able to measure "subjectively" the sound perceived by my brain, no matter how the auditory nerves were stimulated (through air conduction only, or both air and bone conduction). Now I think that bone conduction may be negligible (though I'm still not sure), but I think the bigger problems are standing waves establishing themselves in the ear canal, and the way we measure the sound, i.e., measuring the SPL at the entrance of the ear canal instead of at the eardrum, which cannot be done correctly; so we should rely instead on measuring the total sound energy density, a function of SPL and particle velocity magnitude, as explained here: https://www.researchgate.net/public..._subjects_from_pressure-velocity_measurements

So my guess is that maybe we need to rely on the total sound energy density in each ear canal, instead of the usual SPL at some point in space, when we want to calibrate an audio system; or maybe, at least, the total sound energy density at the sweet spot. But I may be plainly wrong too... It's just something I would like to verify. I cannot find, however, any pressure-velocity measurements of audio systems in different rooms from which to get some insight, and those pressure-velocity sensors are so expensive that I cannot do these investigations myself :p.

Are you suggesting that we EQ a system based on the measurements we obtain for our systems as we perceive them directly, meaning that the measurement includes our personal HRTF?

For binaural sound reproduction that would make total sense: you want to be sure the headphone reproduction system accurately accounts for your personal HRTF. I've actually been talking to Harman folks a lot about this for another project, and they tell me that to accurately reproduce, over headphones, a system with speakers in a room, you need to use room scans and BRIRs. You then need head tracking so the acoustics change as you move your head, just as in real life. They indicate that, in the absence of this, their research showed that listeners perceived things inaccurately. One of the main issues was the degree of externalization (and this is probably not universally true; some people are convinced of externalized sources much more easily than others).

If you mean to suggest that in-ear measurements of a system's response be used to EQ that system, I wouldn't agree with that. Our brain already compensates for our HRTF; we don't need the speakers to do that. Headphones are different because they cannot reproduce sounds as coming from outside our head and from different directions, so we need to trick the ear's method of detecting that, which is the HRTF. For speakers outside the ear, we have no reason to compensate for an HRTF.

I've done no research myself on how to accurately capture an HRTF, and I know little about it. I believe the methods used in the past have been fairly crude. From my understanding, there have been a few methods. One uses simple mics in the ear canal, and I agree with you, this seems likely to be inaccurate: sound strikes the opening of the ear canal from different angles, and this affects the response somewhat at the highest frequencies, yet a mic at the ear canal blocks this. I saw an article that involved inserting pressure tubes inside the ear (I didn't look; that might be the same article you cited), and this seemed more accurate to me. I've also seen a number of studies that compared models of HRTFs based on 3D scans of ears, and these seemed to have very good agreement with real-world measurements; that is the least invasive. What I noted in those studies is that the HRTF simulations were highly smoothed, but that may be fine. The last method I've noted, David Griesinger's method, is to use a simple equal-loudness test. He claims this to be highly effective. I actually think the experiment you did might be a more complicated way of doing the equal-loudness test: you did an equal-loudness-by-way-of-cancellation test, if I understand what you wrote.

I've actually been doing some experiments with binaural and ambisonic (3D) impulse response measurements of systems, to see if I can extract from those measurements the critical information for making EQ decisions. My theory is that we want to EQ directionally; that is, we want to EQ based on those sounds that directly contribute to timbre. Since our ears are mostly forward-biased at mid/high frequencies, and at these frequencies we hear more from in front and above than from below, it made sense to me that, using a binaural mic with behavior similar to a real human head, we could extract what is worth EQing and what is not. However, this method proved a bit complicated: you had to move the head around in precise ways, in terms of angle and tilt, to detect the direction of reflections. That led to what I am doing now, which is using an A-format 4-channel ambisonic mic. With just one measurement I can extract a sphere of sound striking the mic and detect its direction. From there I can remove the directions I consider unimportant for our perception of timbre and leave the rest; by taking a set of measurements from different X/Y/Z locations within the room, I can then create an average and EQ it against some target curve. None of this is working yet, but it draws on other people's work, so nothing here is that crazy. At the moment the extraction of the sound field uses spherical harmonics from Farina's software, and I have to use his inversion algorithm, which is too perfect for proper room EQ purposes (it literally inverts and convolves the raw impulse response, giving a perfectly flat but over-corrected EQ'd response). I think this has the most potential as an alternative way to measure a room and EQ it. Plus, I could see people having a lot of fun with the 3D plots it creates.
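For readers unfamiliar with A-format mics, the sketch below is not Farina's software, just the classic tetrahedral A-to-B-format conversion (up to the capsule calibration filters real arrays need), plus a crude intensity-based direction estimate, to illustrate how direction can be pulled from a single 4-capsule measurement:

```python
import numpy as np

def a_to_b_format(flu, frd, bld, bru):
    """Classic A-format -> B-format conversion for a tetrahedral mic
    (capsules: front-left-up, front-right-down, back-left-down,
    back-right-up), up to an overall scale factor."""
    w = flu + frd + bld + bru   # omni pressure component
    x = flu + frd - bld - bru   # front-back figure-8
    y = flu - frd + bld - bru   # left-right figure-8
    z = flu - frd - bld + bru   # up-down figure-8
    return w, x, y, z

def dominant_direction(w, x, y, z):
    """Crude direction-of-arrival estimate from the active intensity:
    the time-averaged product of pressure (W) with each velocity axis."""
    i = np.array([np.mean(w * x), np.mean(w * y), np.mean(w * z)])
    return i / (np.linalg.norm(i) + 1e-12)  # unit vector toward the source
```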

To the earlier comments being made: at frequencies below 100 Hz, I can already tell from the 3D impulse response that much of what we were theorizing is true. When I can get plots that I can export I will share them, but basically bass is like a giant pressure fluctuation radiating out through the room. By the time the mic detects it, it shows up almost equally (maybe a better word is randomly) across its full 3D face. Move the mic and you simply see another random pressure fluctuation around its face. The method of detecting the subwoofer's position as a source using a 3D impulse response totally fails because the room reflections are so strong. In fact, one test we did was to place the mic and subwoofer near each other in the room, to mimic a nearfield subwoofer, because I thought it would increase the ratio of direct to reflected sound. I was surprised to find that it did so only very slightly. Using a pressure-gradient graphic, the hotter color shifted only slightly toward the back of the mic, where the sub was placed, and not really in the right vertical direction (you could clearly see the source was behind the mic, but you couldn't really see that it was well below the mic on the floor, and there was actually about as much energy coming from the sides). If you calculated the ratio of the first 80 ms to the rest (80-500 ms), it was still less than 1.
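That last ratio can be computed directly from the impulse response; the exact windowing used above isn't specified, so treat this as a sketch of the idea:

```python
import numpy as np

def early_late_energy_ratio(ir, fs, split_ms=80.0, end_ms=500.0):
    """Energy in the first `split_ms` of the impulse response divided by
    the energy from `split_ms` to `end_ms` (the 80 ms / 80-500 ms ratio
    mentioned above). Values below ~1 mean the room dominates the
    direct sound at the measurement point."""
    n_split = int(fs * split_ms / 1000)
    n_end = int(fs * end_ms / 1000)
    early = np.sum(ir[:n_split] ** 2)
    late = np.sum(ir[n_split:n_end] ** 2)
    return early / (late + 1e-20)
```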
 

Jean Ibarz

New Member
Thread Starter
Joined
Jun 23, 2017
Posts
40
Preamp, Processor or Receiver
Computer
Main Amp
Gemini XP3000
Additional Amp
Samson SERVO 600
Other Amp
Samson SERVO 200 and Yamaha STR-DB840
Are you suggesting that we EQ a system based on the measurements we obtain for our systems as we perceive them directly, meaning that the measurement includes our personal HRTF?

I think we should investigate whether the SPL signal at one point is enough to calibrate a system. I think it's possible that our head and ears affect the sound energy in some way that makes us perceive things differently from what is measured. I think that our two ears act like a pressure-pressure sensor, and what is the aim of a pressure-pressure sensor? To measure the acoustic intensity by measuring the pressure gradient. What if our hearing is more sensitive at low frequencies to sound intensity than to SPL, for example?
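The pressure-gradient idea maps onto the standard two-microphone (p-p) intensity method. A minimal sketch via the linearized Euler equation (mic spacing limits, phase matching, and calibration issues are glossed over):

```python
import numpy as np

RHO0 = 1.204  # air density, kg/m^3

def pp_probe_velocity(p1, p2, spacing_m, fs):
    """Finite-difference particle velocity from two closely spaced
    pressure mics (a p-p probe), via the linearized Euler equation
    du/dt = -(1/rho0) * dp/dx, integrated over time."""
    grad = (p2 - p1) / spacing_m
    return -np.cumsum(grad) / (RHO0 * fs)  # rectangular-rule integral

def active_intensity(p1, p2, spacing_m, fs):
    """Time-averaged sound intensity (W/m^2) along the probe axis."""
    p_mid = 0.5 * (p1 + p2)
    u = pp_probe_velocity(p1, p2, spacing_m, fs)
    return np.mean(p_mid * u)
```

The analogy is loose (two ears are far apart relative to a real p-p probe), but it shows the quantity such a pair of pressure sensors can in principle recover.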

For binaural sound reproduction that would make total sense: you want to be sure the headphone reproduction system accurately accounts for your personal HRTF. I've actually been talking to Harman folks a lot about this for another project, and they tell me that to accurately reproduce, over headphones, a system with speakers in a room, you need to use room scans and BRIRs. You then need head tracking so the acoustics change as you move your head, just as in real life. They indicate that, in the absence of this, their research showed that listeners perceived things inaccurately. One of the main issues was the degree of externalization (and this is probably not universally true; some people are convinced of externalized sources much more easily than others).

I believe that something is wrong in the binaural reproduction measurement method, and that we should get better externalization and spatialization without head tracking. I think head tracking is a way to compensate for the poor spatialization/externalization we get with binaural reproduction because of the errors we make in the process. Without head tracking we should be able to get good spatialization, good externalization, and no spectral problems in the treble.

If you mean to suggest that in-ear measurements of a system's response be used to EQ that system, I wouldn't agree with that.

and I agree with you ;)

Our brain already compensates for our HRTF; we don't need the speakers to do that.

agree

I believe the methods used in the past have been fairly crude. From my understanding, there have been a few methods. One uses simple mics in the ear canal, and I agree with you, this seems likely to be inaccurate: sound strikes the opening of the ear canal from different angles, and this affects the response somewhat at the highest frequencies, yet a mic at the ear canal blocks this.

Yes, some people argue that we should use a microphone at the open (non-blocked) ear canal entrance; others argue that we should use a microphone at the blocked ear canal (more people agree with this method).

I saw an article that involved inserting pressure tubes inside the ear (I didn't look; that might be the same article you cited), and this seemed more accurate to me.

Yes, this seems the most accurate; however, the tube cannot be inserted all the way to the eardrum, so there is a problem in measuring amplitude at frequencies around 10-18 kHz, depending on the microphone insertion depth.

But the worst problem, to me, is that when we hear in normal conditions, the external ear is open. Hence we have a more-or-less closed boundary at one end (the tympanic membrane) and an open boundary at the other (the ear canal entrance). When we put headphones on the ear (or earphones in it), we modify the ear canal: the previously open end becomes closed. This is equivalent to a change in the effective ear canal length (because at a closed end the sound wave is reflected without polarity inversion, as opposed to before), which means that all the modal resonances are shifted such that peaks become dips, and dips become peaks. I think this is the most important issue, and it causes huge errors at high frequencies, even with careful HRTF or BRIR measurements and headphone calibration (errors of at least ~5-15 dB).
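The peak/dip flip falls out of simple duct acoustics: an open-closed tube resonates at odd quarter-wave multiples, while a tube sealed at both ends behaves roughly like a half-wave resonator. A sketch, where the 2.5 cm canal length and the rigid-wall idealization are assumptions:

```python
C = 343.0   # speed of sound, m/s
L = 0.025   # assumed ear canal length, ~2.5 cm

# Open entrance, closed eardrum: quarter-wave resonator (odd harmonics).
open_entrance = [(2 * n - 1) * C / (4 * L) for n in (1, 2, 3)]
# Entrance sealed by a headphone: roughly half-wave behavior instead.
sealed_entrance = [n * C / (2 * L) for n in (1, 2)]

print("open:  ", [f"{f / 1000:.1f} kHz" for f in open_entrance])
print("sealed:", [f"{f / 1000:.1f} kHz" for f in sealed_entrance])
# Open: ~3.4, 10.3, 17.2 kHz; sealed: ~6.9, 13.7 kHz. The resonances of
# the open canal land near the nulls of the sealed canal and vice versa,
# which is the peaks-become-dips shift described above.
```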

I've also seen a number of studies that compared models of HRTFs based on 3D scans of ears, and these seemed to have very good agreement with real-world measurements; that is the least invasive. What I noted in those studies is that the HRTF simulations were highly smoothed, but that may be fine.

I don't know if it's fine; maybe it isn't at all. If a "normal" 10 dB dip is smoothed away and disappears, then you need head tracking to restore a little of the spatialization/externalization lost to the bad results ;)

The last method I've noted, David Griesinger's method, is to use a simple equal loudness test. He claims this to be highly effective.

I tried this experiment, and I found it really difficult to estimate loudness from comparisons.

I actually think the experiment you did might be a more complicated way of doing the equal-loudness test: you did an equal-loudness-by-way-of-cancellation test, if I understand what you wrote.

In my experiment, the accuracy is nearly perfect. However, it cannot be reliable at high frequencies (>1.5-2 kHz), because small head movements cause huge variations in the measurements. Below 2 kHz, though, I think it may be a way to prove by experiment whether what we measure with microphones (at the ear canal entrance) is really what we perceive, or whether there are differences. If those differences are non-negligible, we could investigate why: maybe bone conduction, or cartilage conduction, or both, are not really negligible?

I've actually been doing some experiments with binaural and ambisonic (3D) impulse response measurements of systems, to see if I can extract from those measurements the critical information for making EQ decisions. My theory is that we want to EQ directionally; that is, we want to EQ based on those sounds that directly contribute to timbre. Since our ears are mostly forward-biased at mid/high frequencies, and at these frequencies we hear more from in front and above than from below, it made sense to me that, using a binaural mic with behavior similar to a real human head, we could extract what is worth EQing and what is not. However, this method proved a bit complicated: you had to move the head around in precise ways, in terms of angle and tilt, to detect the direction of reflections. That led to what I am doing now, which is using an A-format 4-channel ambisonic mic. With just one measurement I can extract a sphere of sound striking the mic and detect its direction. From there I can remove the directions I consider unimportant for our perception of timbre and leave the rest; by taking a set of measurements from different X/Y/Z locations within the room, I can then create an average and EQ it against some target curve. None of this is working yet, but it draws on other people's work, so nothing here is that crazy. At the moment the extraction of the sound field uses spherical harmonics from Farina's software, and I have to use his inversion algorithm, which is too perfect for proper room EQ purposes (it literally inverts and convolves the raw impulse response, giving a perfectly flat but over-corrected EQ'd response). I think this has the most potential as an alternative way to measure a room and EQ it. Plus, I could see people having a lot of fun with the 3D plots it creates.

When measuring with an ambisonic sphere, maybe what is measured through the different microphones allows you to evaluate the acoustic intensity in some way (because of the hard boundary of the sphere, or because of the different placements of the microphones), since you get the missing information: directionality. In pressure-velocity measurements you get the total sound energy because you know the pressure AND the velocity of the particles, but you also know in which direction the particles are moving (because the velocity is a 3D vector). Hence, ambisonics may contain information close to what a pressure-velocity measurement contains. So in the end, I think we should use a pressure-velocity measurement at the sweet spot, OR an artificial head with in-ear microphones (but in that case everybody should use the same head for calibration). Yes, it is more complicated and more expensive. But is it worth it? I don't know... but it is something I would like to know ;-)

To the earlier comments being made: at frequencies below 100 Hz, I can already tell from the 3D impulse response that much of what we were theorizing is true. When I can get plots that I can export I will share them, but basically bass is like a giant pressure fluctuation radiating out through the room. By the time the mic detects it, it shows up almost equally (maybe a better word is randomly) across its full 3D face. Move the mic and you simply see another random pressure fluctuation around its face. The method of detecting the subwoofer's position as a source using a 3D impulse response totally fails because the room reflections are so strong. In fact, one test we did was to place the mic and subwoofer near each other in the room, to mimic a nearfield subwoofer, because I thought it would increase the ratio of direct to reflected sound. I was surprised to find that it did so only very slightly. Using a pressure-gradient graphic, the hotter color shifted only slightly toward the back of the mic, where the sub was placed, and not really in the right vertical direction (you could clearly see the source was behind the mic, but you couldn't really see that it was well below the mic on the floor, and there was actually about as much energy coming from the sides). If you calculated the ratio of the first 80 ms to the rest (80-500 ms), it was still less than 1.

I experienced this kind of phenomenon in my room. With my wall of 16 bass speakers, being near the wall of speakers or away from it doesn't really change the SPL. However, with a dipole source, an "infraflex" (a custom-made bass speaker with a standard driver mounted on a flexible polystyrene plate), the SPL was a lot higher near the diaphragm than it was a few meters away. Hence I think we cannot conclude, in general, how pressure varies with distance for a loudspeaker playing bass frequencies in a room; from my experience, it seems to depend on the source's surface area and its directivity (dipole or monopole). The infraflex and my wall of 16 loudspeakers were not discriminable in an ABX test, either for me or for a listener who came to my home for listening tests.

 