Why is MPC perceived to be the best? (an off-topic audio encoding discussion)
Continuum
post Feb 12 2004, 12:40
Post #51





Group: Members
Posts: 473
Joined: 7-June 02
Member No.: 2244



QUOTE (2Bdecided @ Feb 12 2004, 12:18 PM)
Consider each negative ABX result as a "5.0" grade, and each positive ABX result as a "4.5" grade. Do a statistical analysis. Are the results significant?

Why do you suggest such a procedure? There have to be more apt analysis methods. Some binomial distribution comes to mind.

I agree with your (insinuated!) point though. Group tests of high bitrate modes are likely to fail.
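The binomial analysis Continuum is hinting at is easy to sketch: under the null hypothesis that a listener is guessing, each ABX trial is a fair coin flip, so the chance of any score can be computed directly. A minimal sketch (the numbers are invented for illustration):

```python
from math import comb

def abx_p_value(correct, trials):
    """One-sided binomial test: the probability of getting at least
    `correct` answers right out of `trials` by guessing (p = 0.5)."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

# 14/16 is very unlikely to be luck; 9/16 is entirely consistent with guessing
print(abx_p_value(14, 16))  # 137/65536, about 0.002
print(abx_p_value(9, 16))   # about 0.40
```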
2Bdecided
post Feb 12 2004, 12:51
Post #52


ReplayGain developer


Group: Developer
Posts: 5142
Joined: 5-November 01
From: Yorkshire, UK
Member No.: 409



QUOTE (Continuum @ Feb 12 2004, 11:40 AM)
QUOTE (2Bdecided @ Feb 12 2004, 12:18 PM)
Consider each negative ABX result as a "5.0" grade, and each positive ABX result as a "4.5" grade. Do a statistical analysis. Are the results significant?

Why do you suggest such a procedure? There have to be more apt analysis methods.

You're right - I'm sure one of our resident statistical geniuses (that's not the plural, is it?) will respond in full...


As for high bitrate tests, I've just re-posted the results of the only (very old, quite flawed) test I know of here:
http://www.hydrogenaudio.org/forums/index....25&#entry183841

IIRC some of the results were statistically significant, even though people weren't required to ABX (some listeners did anyway). You would expect enforced ABX to filter out some of the noise. The placing of the high anchor by most listeners suggests that there isn't actually that much noise here though.

EDIT: where "is it transparent or not" is the question, ABX is probably essential. Where only a ranking is required, blind tests, large numbers of listeners and useful statistical analysis could be enough to cancel out placebo and still get useful results. ABX is still useful because it raises the quality of the results.

Before anyone launches into a TOS-8 attack on my lack of respect for ABX, remember that it (or something very like it) is only essential to prove (to a certain probability) that an individual hears a difference. In BS.1116 listening tests, hidden anchors and statistical processing do the job to give meaningful results for the population. You may not know categorically whether a certain individual actually heard a difference for a certain sample+encoder, but you don't need to.

Cheers,
David.

This post has been edited by 2Bdecided: Feb 12 2004, 13:00
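David's point about BS.1116-style tests - significance for the population rather than per-listener proof - can be illustrated with a one-sample t statistic on the difference grades (coded version minus hidden reference, one value per listener). The grades below are invented for illustration; 2.262 is the standard two-sided 5% critical value for 9 degrees of freedom:

```python
from statistics import mean, stdev
from math import sqrt

def paired_t_stat(diff_grades):
    """t statistic for H0: mean difference grade = 0
    (coded minus hidden reference, one value per listener)."""
    n = len(diff_grades)
    return mean(diff_grades) / (stdev(diff_grades) / sqrt(n))

# Hypothetical difference grades from 10 listeners (negative = coded rated worse)
diffs = [-0.5, -0.3, 0.0, -0.7, -0.2, -0.4, 0.1, -0.6, -0.3, -0.5]
t = paired_t_stat(diffs)

# Two-sided 95% critical value for df = 9 is about 2.262: no single listener
# "proved" anything, yet the population result is clearly significant
print(abs(t) > 2.262)
```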
fanerman91
post Apr 18 2004, 04:18
Post #53





Group: Members
Posts: 14
Joined: 26-September 03
Member No.: 9033



This "Really Big Codec Test" sounds exciting... did momentum for it die out? It looks like a ton of work, but it would be incredibly helpful for a lot of people if it were carried out... What happened?
ScorLibran
post Apr 18 2004, 05:43
Post #54





Group: Banned
Posts: 769
Joined: 1-July 03
Member No.: 7495



QUOTE (fanerman91 @ Apr 17 2004, 10:18 PM)
This "Really Big Codec Test" sounds exciting... did momentum for it die out?  It looks like a ton of work, but it would be incredibly helpful for a lot of people if it were carried out... What happened?

As I said on the previous page, the timeframe will be May-June, but may be pushed a little farther out than that to take other scheduling issues into account (June-July?).

And it will be a great deal of work, especially attracting enough participants to make the results statistically significant. But contrary to popular belief, this won't necessarily be a "high-bitrate" test. People don't like testing high bitrates because they can't distinguish artifacts (or can do so only with great difficulty). That means a lower bitrate range should be used anyway. Maybe a range like 96kbps to 192kbps (VBR wherever possible/available), so that most people will distinguish variances more easily in part of this range. They'll actually stop testing once they can't tell the test sample from the reference, as finding their transparency threshold for that sample and format will be the entire goal of the test.

I'll tentatively plan on starting the official discussion for this test soon after Roberto finishes his dial-up bitrate test.
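The stopping rule ScorLibran describes - each listener climbs the bitrate ladder until they can no longer pass an ABX test - can be sketched as follows. `can_hear_difference` is a hypothetical stand-in for running one real ABX trial; the bitrates and trial count are illustrative:

```python
from math import comb

def abx_p_value(correct, trials):
    """Chance probability of at least `correct` right out of `trials` (p = 0.5)."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

def transparency_threshold(can_hear_difference, bitrates=(96, 128, 160, 192), trials=16):
    """Walk up the bitrate ladder and return the first bitrate at which the
    listener's ABX score is no longer significant (p > 0.05), i.e. their
    transparency threshold for this sample; None if every step is still ABXable."""
    for bitrate in bitrates:
        correct = sum(can_hear_difference(bitrate) for _ in range(trials))
        if abx_p_value(correct, trials) > 0.05:
            return bitrate
    return None

# Idealised listener who hears artifacts below 160 kbps and nothing at or above it
print(transparency_threshold(lambda bitrate: bitrate < 160))  # 160
```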
damiandimitri
post Apr 19 2004, 12:04
Post #55





Group: Members
Posts: 11
Joined: 6-November 02
Member No.: 3710



QUOTE
Consider each negative ABX result as a "5.0" grade, and each positive ABX result as a "4.5" grade. Do a statistical analysis. Are the results significant?


Why do you suggest such a procedure? There have to be more apt analysis methods. Some binomial distribution comes to mind.

I agree with your (insinuated!) point though. Group tests of high bitrate modes are likely to fail.



If you want good statistical results, you should make the good and bad grades more different than 4.5 and 5 - better to take 1 and 10, for example. Or:
good: 10
don't know: 5 (if it exists)
bad: 0

This way you will get better (clearer) results from your statistics.

This post has been edited by damiandimitri: Apr 19 2004, 12:06
tigre
post Apr 19 2004, 12:46
Post #56


Moderator


Group: Members
Posts: 1434
Joined: 26-November 02
Member No.: 3890



QUOTE (damiandimitri @ Apr 19 2004, 01:04 PM)
If you want good statistical results, you should make the good and bad grades more different than 4.5 and 5 - better to take 1 and 10, for example. Or:
good: 10
don't know: 5 (if it exists)
bad: 0

This way you will get better (clearer) results from your statistics.

Probably not. The main problem with listening tests at settings/bitrates aiming for transparency is certainly not the scale used for rating.

If 40% of listeners rate the original lower than the encoded version (in an ABC/HR situation) because the difference they hear is based on imagination, you still need a large number of participants to get results that show a significant difference between encoders, no matter what scale you use for rating.

As said before, another big problem is the way test samples are chosen. Using known problem samples will cause bias against the encoder most commonly used by people with good training in hearing artifacts. Choosing samples randomly won't help either, because you need a big number of them to find samples where people can hear differences at all.


--------------------
Let's suppose that rain washes out a picnic. Who is feeling negative? The rain? Or YOU? What's causing the negative feeling? The rain or your reaction? - Anthony De Mello
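tigre's point that the rating scale is a red herring can be checked directly: any linear rescaling of the grades (and of the reference point) stretches the mean difference and its standard error by the same factor, so the t statistic - and therefore the p-value - is unchanged. A quick check with invented grades:

```python
from statistics import mean, stdev
from math import sqrt

def t_stat(grades, reference):
    """One-sample t statistic of the grades against a reference grade."""
    n = len(grades)
    return (mean(grades) - reference) / (stdev(grades) / sqrt(n))

# Hypothetical grades on the 4.5/5.0 scale and the same data stretched to 0/10
narrow = [4.5, 5.0, 4.5, 4.5, 5.0, 4.5, 4.5, 5.0]
wide = [(g - 4.5) * 20 for g in narrow]  # maps 4.5 -> 0, 5.0 -> 10

# Same t statistic on both scales: rescaling adds no statistical power
print(t_stat(narrow, 5.0), t_stat(wide, 10.0))
```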
