I can understand isolation for source components and maybe speakers, but not isolation for electronics. All those data centers that make this forum conversation even possible might want to rethink how they cool their web servers. (The internal fans they currently use to cool the compute boards add significant vibration to the electronics.) Not to mention all those active speaker makers that have their amps strapped to a thumping electro-mechanical device.
I would like to add a couple things that I think are missing from the explanation of experimental practice.
The onus is on the designer/manufacturer making the device to substantiate their claims of how their product has better performance or value, not the other way around. But they never do. They rely on anecdotal evidence. (Hey, I saw David Copperfield make the Statue of Liberty disappear. Are you going to tell me that wasn’t real??) They hide behind the null hypothesis and say the 3rd-party experimenter didn’t prove that their device was better because of the following reasons... blah blah blah. So, they say, nothing is decided by the test; please refer to our marketing material. Actually, what was decided was that the difference was so small as to be insignificant under those test conditions.
There is a distinction between statistical difference and technical difference. Given an unlimited budget, a large enough sample size, and capable metrology, we could detect very small differences with statistical significance. As a practical matter, no one does this. (Matt mentions this.)
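To make that concrete, here is a quick sketch (my own illustration, not anything from the thread): with enough trials, a listener who is only one percentage point better than chance in an ABX-style test becomes "statistically significant," even though the effect has no practical consequence.

```python
# Rough sketch: sample size alone turns a tiny, practically meaningless
# difference into a "statistically significant" one. Assume listeners are
# correct 51% of the time instead of the 50% chance rate.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
true_rate = 0.51          # barely better than guessing
for n_trials in (100, 1_000, 100_000):
    correct = rng.binomial(n_trials, true_rate)
    # one-sided binomial test against pure guessing (p = 0.5)
    p = stats.binomtest(correct, n_trials, 0.5, alternative="greater").pvalue
    print(f"n={n_trials:>7}: {correct / n_trials:.3f} correct, p={p:.4f}")

# With enough trials the p-value collapses toward zero even though the effect
# (one point above chance) is of no practical consequence.
```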
The Benchmark DAC3 has benchmark measurements. Certainly, there is no reason to buy anything more expensive (although many options are available), since it is unlikely to perform any better: no practical technical difference. Any differences are way below the audible threshold. Actually, you can go the other direction cost-wise and buy a less capable product, because you can’t hear the difference anyway.
http://archimago.blogspot.com/
Why bring this up? Because real products with real value get buried in the sea of marketing hype. So, how do you sort through the hype to find value? The audio media doesn’t help much. Maybe I’m just lazy. I could get the IsoAcoustics GAIA with the 30-day trial and judge for myself, but I don’t see a reasonable explanation as to why it should make a difference. Inertia wins out. I’ll keep my focus on room acoustics.
While I agree with some of the things you are saying, I want to be very careful with our language. We can hold any opinion we want, but we must be very careful that we don’t try to shield our opinion from scrutiny through a misuse of the scientific method.
Both statistically and conceptually, we can never accept the null hypothesis. Science, by nature, does not allow us to do that. To go back to a fundamental philosophical issue: we can never prove that god does not exist; we could only ever prove that god does exist. The same holds true in all of science. We cannot prove that something isn’t true, only that it is true.
We can’t prove that isolation devices make no audible difference. We can simply fail to support the notion that they do. We can hold the opinion that they make no difference. We can even say that science has so far failed to prove they make a difference.
It is not OK to use that science to suggest something is true. You can’t say that isolation makes a difference just because science hasn’t proven that it doesn’t. That makes no sense. But you are welcome to hold the opinion that it does or does not make a difference. The converse is equally true: you can’t say that because science has failed to show that it makes a difference, it must not.
In addition, it is not fair to say that tests which fail to find audible differences prove that whatever difference does exist is so small as to be impossible to readily detect. That makes a major, inaccurate assumption about the study: that you have accounted for all alternative explanations. Andrew Gelman talks a lot about this in his explanation of the problems with p-values. The null hypothesis is actually one of many alternative hypotheses. We work to control for the others, but we can never control for all of them. Studies always have some noise as a result.
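As a rough illustration of that point (my own toy simulation, not Gelman's), here is how analyst degrees of freedom plus noise produce "significant" results even when there is no effect at all:

```python
# Toy simulation of the "forking paths" problem: even when the null is exactly
# true, trying a few reasonable-looking analysis choices on noisy data yields
# at least one p < 0.05 more often than the nominal 5%.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_studies, n_listeners = 2_000, 20
false_positives = 0
for _ in range(n_studies):
    scores = rng.normal(0.0, 1.0, n_listeners)   # no real effect at all
    # Three "forks" an analyst might plausibly take:
    p_all   = stats.ttest_1samp(scores, 0.0).pvalue
    p_trim  = stats.ttest_1samp(np.sort(scores)[2:-2], 0.0).pvalue  # drop "outliers"
    p_split = stats.ttest_1samp(scores[:10], 0.0).pvalue            # "first session only"
    if min(p_all, p_trim, p_split) < 0.05:
        false_positives += 1

print(f"'significant' somewhere in {false_positives / n_studies:.0%} of null studies")
# Noticeably above the nominal 5%, purely from noise and analysis choices.
```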
Let’s take online double-blind studies like the ones Archimago did. Are they scientifically rigorous? They sure look it: they use statistics, they control for some sources of bias, and so on. But they are not high-quality scientific studies, and I’m sure he knows it. They’re good blog fodder, and that is all. He can’t and doesn’t control for the listener or the listening conditions. He assumes people use high-resolution listening gear and take the test seriously, a huge assumption to make and one that a century of research suggests is unlikely to hold.
If someone believes that all playback systems sound the same, there are three (or more) potential paths they might take: 1) they take the test seriously and listen for differences but don’t expect to find them, 2) they strongly believe no such differences exist and do not seriously listen for them, or 3) they intentionally sabotage the test, essentially trolling it.
I’ve also witnessed people take a listening test like this in a manner that would clearly obfuscate any differences: listening through laptop speakers, on cheap headphones, or in a noisy environment.
Having done human-subject research, including preference testing, I have seen all of this happen. It is standard practice in rigorous research to build in protections and to clean the data of responses that are clearly not in line with the testing assumptions. For example, when assessing the validity of a new achievement test, we use a pattern-recognition algorithm to detect people who answered in a pattern. They clearly weren’t even trying, and their results would screw up the norm-reference calculation.
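As a hedged sketch of the kind of cleaning I mean (my own illustration, not the actual algorithm we used), flagging obviously patterned answer sequences can be as simple as:

```python
# Flag respondents whose answer sequence is an obvious repeating pattern,
# e.g. the same choice every time or a strict A-B-A-B cycle.

def looks_patterned(answers, max_period=4):
    """Return True if the answer sequence repeats with a short period."""
    n = len(answers)
    if n < 8:
        return False
    # All-identical answers, or any short cycle that explains the whole sequence
    for period in range(1, max_period + 1):
        if all(answers[i] == answers[i % period] for i in range(n)):
            return True
    return False

responses = {
    "subj01": "ABABABABABAB",   # strict alternation: flag
    "subj02": "AAAAAAAAAAAA",   # straight-lining: flag
    "subj03": "ABBABAABBBAB",   # plausible effortful responding: keep
}
kept = {s: r for s, r in responses.items() if not looks_patterned(r)}
print(sorted(kept))   # ['subj03']
```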
You might say it is unlikely that someone would troll an online blind listening test, but I’ve had people do it to me in person, and I am sure it happens online. With ABX testing, there is no easy way to detect that and remove those answers. That is one of many reasons why ABX is less favored today, MUSHRA being a better alternative: you can include questions that help you detect people who aren’t trying or are intentionally trying to mess up the test.
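For example, a common MUSHRA post-screening step flags listeners who repeatedly score the hidden reference low. A minimal sketch follows; the thresholds reflect my reading of ITU-R BS.1534, so check the recommendation itself before using them in earnest.

```python
# MUSHRA-style post-screening sketch: the hidden reference is one of the graded
# stimuli, so a listener who is not really trying tends to give it low scores,
# which makes them easy to flag.

def exclude_listener(hidden_ref_scores, score_floor=90, max_bad_fraction=0.15):
    """Flag a listener whose hidden-reference ratings fall below the floor too often."""
    bad = sum(1 for s in hidden_ref_scores if s < score_floor)
    return bad / len(hidden_ref_scores) > max_bad_fraction

# Hypothetical ratings of the hidden reference across 10 trials, on the 0-100 scale
listeners = {
    "L1": [100, 98, 95, 100, 97, 99, 100, 96, 100, 98],   # attentive
    "L2": [100, 60, 55, 100, 40, 95, 100, 70, 30, 90],    # inconsistent / not trying
}
for name, scores in listeners.items():
    print(name, "exclude" if exclude_listener(scores) else "keep")
# L1 keep, L2 exclude
```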
A lot in sound quality research is far more nuanced and less certain than most realize. I’m doing a bunch of work now related to the sound quality of low frequencies. I presented my work on developing an alignment technique using wavelets and pointed out that it also allows someone to zero out group delay at low frequencies. That led me to look into the research on the audibility of group delay at low frequencies. The commonly held belief is that it simply doesn’t matter; some go so far as to say it only matters if it exceeds one or two cycles. Yet the research suggests something else altogether. In fact, it’s shockingly audible if the research is to be believed (so shockingly detectable at low levels that many, myself included, feel more replication is needed before accepting such results). Distortion is another one. The research is mixed here, but there is some research (not nearly enough) suggesting that extremely low levels of IMD or harmonic distortion are audible at low frequencies outside the masking zone, in other words, higher-order distortions. Again, if we accept the handful of studies on this topic, it would suggest most of our assumptions and beliefs are wrong and we’ve been designing speakers all wrong.
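For anyone curious what kind of group-delay numbers are in play, here is a minimal sketch (my own, not the wavelet alignment technique from my talk) using a 4th-order high-pass at 30 Hz as a stand-in for a vented box or subwoofer crossover:

```python
# Compute the group delay of a 4th-order Butterworth high-pass at 30 Hz and
# express it both in milliseconds and in cycles at each frequency of interest.
import numpy as np
from scipy import signal

fs = 48_000
b, a = signal.butter(4, 30, btype="highpass", fs=fs)
w, gd_samples = signal.group_delay((b, a), w=np.array([20.0, 30.0, 50.0, 100.0]), fs=fs)

for f, gd in zip(w, gd_samples):
    gd_s = gd / fs   # group delay in seconds at this frequency
    print(f"{f:6.1f} Hz: group delay {gd_s * 1e3:6.2f} ms = {gd_s * f:.2f} cycles")

# Whether delays of a fraction of a cycle down here are audible is exactly the
# open question described above.
```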
Notice that I’m hedging my bets here. I’m not accepting the alternative hypothesis or fully rejecting the null. I’m saying we have evidence that suggests we need to do more research. We need to explain away bias, statistical forking paths, and any alternative explanations. But just because I find the results hard to believe doesn’t mean I’m rejecting them either. It points me in a direction where I want to do, or see, more research.
One problem with much of the audiophile sound quality research is that academic researchers couldn’t care less. Those without a dog in the race aren’t doing this work. We have research that suggests HD audio is distinguishable from SD audio, and other research that suggests MP3s can’t be distinguished from SD or HD audio. Which is it? My read is that it’s all very nuanced. Most recordings are so poor that whatever audible differences could exist are obscured by the production. As such, MP3s are indistinguishable from better formats in many cases. However, it isn’t true to say they are always their equal on all music. Music can and does contain information that is audibly reproduced in a superior manner by these better formats.

Even with high sampling rates, we have a bunch of new research suggesting that the differences are audible, and that it has nothing to do with hearing ultrasonics. I’m aware of a new study in review that may change our view of the Nyquist frequency and provides a scientific basis for the potential audibility of high-sampling-rate music. I found it because I found huge differences in an analysis of HD vs SD vs MP3 audio streams within the audible range, from 10 kHz on up. Further investigation found these issues to be robust across different analyses and different means of manipulating the tracks (i.e., just taking the HD sources and down-converting them myself). I assumed the mistake was mine and sent the results to a digital signals expert, who told me his work on the Nyquist theorem actually explains this, and he did his own investigation. A small team of digital experts eventually looked into it and walked away certain the results are real but explainable by existing research adopted in DSP for radar: basically, that you actually need four times (or more) the bandwidth digitally to fully reconstruct the audible bandwidth without any errors.
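If you want to try the down-conversion check yourself, here is roughly what I mean (a sketch with hypothetical filenames, not my actual analysis scripts):

```python
# Compare the 10 kHz+ spectrum of an HD (96 kHz) track against the same track
# downconverted to 44.1 kHz yourself, so the only variable is the conversion.
import numpy as np
import soundfile as sf
from scipy import signal

x_hd, fs_hd = sf.read("track_96k.flac")                # hypothetical HD source file
x_hd = x_hd.mean(axis=1) if x_hd.ndim > 1 else x_hd    # mono for simplicity
x_sd = signal.resample_poly(x_hd, up=147, down=320)    # 96 kHz -> 44.1 kHz
fs_sd = 44_100

def band_spectrum(x, fs, f_lo=10_000, f_hi=20_000):
    """Welch spectrum restricted to the audible band of interest, in dB."""
    f, pxx = signal.welch(x, fs, nperseg=8192)
    keep = (f >= f_lo) & (f <= f_hi)
    return f[keep], 10 * np.log10(pxx[keep])

f_hd, s_hd = band_spectrum(x_hd, fs_hd)
f_sd, s_sd = band_spectrum(x_sd, fs_sd)
# Interpolate onto a common frequency grid and look at the in-band differences
s_sd_on_hd = np.interp(f_hd, f_sd, s_sd)
print("max in-band level difference (dB):", np.max(np.abs(s_hd - s_sd_on_hd)))
```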
And that is all very interesting, but it’s a purely mathematical analysis. There are no listening tests, and those remain the gold standard for making sense of this. I could share these tracks with everyone and do blind testing, but it would prove nothing useful. You really need to do controlled listening tests under blind conditions using MUSHRA. Nobody with the technology and budget to do that cares. The most I got was an offer to help with the analysis if I’m willing to collect the data, which is the hardest and most expensive part.