Help me make (statistical) sense of these DTS test results 
Mar 22 2012, 18:33
Post
#1


Group: Members Posts: 2676 Joined: 18 December 03 Member No.: 10538
Being often confronted with claims that DTS and especially Dolby 5.1 sound like 'crap' compared to lossless surround (to the extent that some simply refuse to buy a surround remix if it isn't offered in lossless), I often find myself arguing that DTS and DD are actually rather good at what they do (lossy perceptual encoding), and should be hard to distinguish from lossless in typical listening. Suffice to say I am sometimes met with vigorous, occasionally bordering on vicious, reactions to this stance.
That happened on the SurroundSound forum a few months ago, and in response, a poster (a nice guy, not vicious!) set up a listening test. He took a lossless 96/24 surround track of subjectively good audio quality, and converted it to DTS with Surcode's encoder, at three different settings. He then converted all of the lossy versions to 96/24 PCM to match the lossless, and offered them as a downloadable package that one could use to compare 'blind': the subject doesn't know what order the 4 different versions are in (though he does know what formats were used in the test), and is tasked with identifying them.
Nine people so far have returned answers (I haven't yet, due to pesky amounts of travel I've been doing lately) and the author has kindly shared these preliminary results. Red indicates incorrect identifications.
With the caveat that this is hardly a rigorous scientific test (which the test author has acknowledged all along), and assuming good faith on the part of the subjects, I find these results surprising. I'm also at a loss to evaluate the probabilities here, not just because it's a four-way choice, but also because some subjects only got as far as ID'ing the lossless. As a statistics maven friend of mine said, "We have to either look at a test of 9 people on identifying the original versus other, or a test of 7 people on picking all four versions." He also tells me the number of replies here is too small to do a chi-square test to determine a p-value the traditional way. And also "I think there is evidence here (presuming the experiment was carried out appropriately) that people can identify the original audio (6/9 better than 25%), and that they can identify all four versions (2/7 better than 4%)."
I agreed that it's possible (no one has ever denied that), but I was surprised at how *many* were able to do it here. So he and I explored the significance of the number of correct lossless vs not-lossless replies (6/9).
My stats maven's analysis of that:
QUOTE
The probability of 6 guessing correctly out of 9 is... 1/4 * 1/4 * 1/4 * 1/4 * 1/4 * 1/4 * 3/4 * 3/4 * 3/4 (i.e. 6 people do something with a 1 in 4 chance and 3 people do the opposite, i.e. something with a 3 in 4 chance) ... but we then have to multiply that by what's called 9C6, that is, how many ways there are of picking 6 people from 9 people, because we don't care which 6 people are right. 9C6 is the same as 9C3, i.e. the number of ways of picking 3 people from 9 people = 9!/(3!*6!) = (9*8*7)/(3*2) = 84.
So, the whole thing comes to 0.008652. Pretty unlikely. That's not a p-value, however. The p-value would be slightly larger because it would be the probability of 6 OR MORE people guessing correctly, not of PRECISELY 6 guessing correctly, which is the number above.
So, we calculate the probability of 7 guessing correctly... 0.001236
And of 8... 0.000103
And of 9... 3.8147E-06
So, summing those... The probability that 6 or more people would guess correctly with a 1/4 chance is about 0.01, so that's our p-value. Highly significant.
That is, highly significant assuming a significance threshold of p = 0.05 (which we agreed is traditional but not always appropriate; sometimes it needs to be smaller, e.g. if there's good independent reason to believe the phenomenon should be very unlikely or even 'impossible').
NB: This wasn't performed as an ABC/hr-type preference test, as would normally be done with codecs. It's just people listening and comparing at home, by various (unspecified) means. (I'm leaving out lots of back and forth from various forum members, one of whom, for example, pointed out a section of music that he felt especially reveals the differences.)
I'd be curious to hear what HA has to say. I can probably even provide the original sound files if people want to try it themselves (the order shown in the table is not necessarily the file order).
This post has been edited by krabapple: Mar 22 2012, 18:36
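For anyone who wants to check the arithmetic, here is a minimal Python sketch of the same one-sided binomial calculation. The function name binomial_tail is just an illustrative choice. The first case uses the figures given above (6 correct out of 9, with a 1 in 4 chance of guessing); adding the terms for 6, 7, 8 and 9 correct gives roughly 0.01. The second case is an assumption on my part: that the "4%" for identifying all four versions means the 1 in 24 chance (1/4!) of guessing the full order, applied to the 2 out of 7 who got all four right.

```python
from math import comb


def binomial_tail(n, k, p):
    """Probability of k or more successes in n independent trials,
    each succeeding with probability p (one-sided binomial test)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Lossless vs. not lossless: 6 of 9 correct, chance of a lucky guess = 1/4.
exactly_six = comb(9, 6) * 0.25**6 * 0.75**3
p_lossless = binomial_tail(9, 6, 0.25)
print(f"P(exactly 6 of 9)  = {exactly_six:.6f}")
print(f"P(6 or more of 9)  = {p_lossless:.6f}")

# All four versions identified: 2 of 7 correct.
# ASSUMPTION: a pure guess orders four files correctly with chance 1/4! = 1/24.
p_all_four = binomial_tail(7, 2, 1 / 24)
print(f"P(2 or more of 7)  = {p_all_four:.6f}")
```

This treats every listener as an independent guesser with the same chance of success, which is the same assumption the hand calculation makes; it says nothing about how the home listening was actually done.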

