Objectively, the best test tracks are the same ones that are hardest to compress: live, unedited, acoustic chamber recordings with voice, like a jazz piano trio accompanying a singer. Too many instruments - easier. Too much editing - easier. Without voice - easier. Without piano - easier. Also, it must be a very good performance of very good music (for you) - otherwise, you'll start hating it after listening to the same piece so many times.
I am all for visualizations, like
... which can pinpoint internal resonances and motor hysteresis distortions. Here, the source was band-limited to 200-1000 Hz, and it's obvious that the residual is spread all over the map, far and wide. I am not sure how to condense that into a single value. Also, such visualizations require a low-noise mic like the Rode NT1 5th gen and a room with low RT60. Otherwise, your ears are vastly superior to any visualization.
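One crude candidate for a single value, purely as a sketch and not something validated here: since the source only occupies 200-1000 Hz, sum the residual energy that lands outside that band and express it in dB relative to the source energy. The function name and arguments below are hypothetical, and the code assumes you already have a time-aligned residual and source of equal length at the same sample rate.

```python
import numpy as np

def out_of_band_residual_db(residual, source, fs, f_lo=200.0, f_hi=1000.0):
    """Residual energy outside [f_lo, f_hi], in dB relative to total source energy.

    Assumption: residual and source are time-aligned, equal-length, mono arrays.
    """
    residual = np.asarray(residual, dtype=float)
    source = np.asarray(source, dtype=float)
    if residual.shape != source.shape:
        raise ValueError("residual and source must have the same length")

    # One-sided power spectra; the same convention is used for both signals,
    # so the ratio is meaningful even without exact Parseval scaling.
    freqs = np.fft.rfftfreq(residual.size, d=1.0 / fs)
    res_power = np.abs(np.fft.rfft(residual)) ** 2
    src_power = np.abs(np.fft.rfft(source)) ** 2

    # Bins that fall outside the band the source was limited to.
    outside = (freqs < f_lo) | (freqs > f_hi)
    return 10.0 * np.log10(res_power[outside].sum() / src_power.sum())
```

This throws away exactly what the picture shows (where in frequency and time the residual lands), so at best it complements the visualization. It also inherits the same requirements: with a noisy mic or a reverberant room, the number mostly measures noise and room, not the driver.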
A regular, high level of distortion is easily audible as a residual, and visible on spectrograms. Lower, more marginal distortions make the recording sound like a cheap piano (while the real one was a true concert grand) and a nervous amateur singer with poor voice control (while the real one was flawlessly smooth). These are still possible to detect in residuals. If you find yourself changing your listening habits, or even giving up listening to music, that's a sign of an even lower level of distortion. I am not sure if FSAF can help with those (yet:-).