128kbps Extension Test - FINISHED
Tripwire
post Aug 9 2003, 12:17
Post #151





Group: Members
Posts: 156
Joined: 28-December 02
Member No.: 4272



QUOTE (rjamorim @ Aug 9 2003, 12:16 AM)
QUOTE
and i think you meant normal wma9 as the pro codec cant be used below 128 afaik


Oh, yeah? I sincerely don't know.

Can someone with Windows Media Encoder 9 check out if you can get a 64kbps two pass VBR encode out of it in WMA pro?

Nope. CBR minimum bitrate is 128kbit. The only way to get it below that is using the Quality VBR mode, Q10 and Q25 give bitrates below 128kbit, maybe even Q50, but the bitrates these modes spit out depend on the input you feed the codec.
Bongoboy
post Aug 11 2003, 00:27
Post #152





Group: Members
Posts: 30
Joined: 10-August 03
From: Newcastle Upon Tyne
Member No.: 8279



this is an interesting test and I'm glad it was (and will continue to be) done, but as a scientist I have to ask : What makes this a double-blind test? shouldn't the samples be compiled on the fly for each person from a large bank of random music so as to eliminate subconscious prejudice from those selecting the samples?

After all the principle of the double-blind is that neither the experimentee nor the experimenter knows what they are being subjected to until the results are in... blind tests have demonstrated homeopathy works, whereas double blind ones then dismissed the claim... wink.gif

EDIT : not that I think that it would make much of a difference in this case, as the samples are supposed to be deliberately selected to not highlight any particular flaw...

This post has been edited by Bongoboy: Aug 11 2003, 00:29


--------------------
Hip-hop looks like it's having more fun than you are - Chuck D
ff123
post Aug 11 2003, 01:08
Post #153


ABC/HR developer, ff123.net admin


Group: Developer (Donating)
Posts: 1396
Joined: 24-September 01
Member No.: 12



QUOTE (Bongoboy @ Aug 10 2003, 03:27 PM)
this is an interesting test and I'm glad it was (and will continue to be) done, but as a scientist I have to ask : What makes this a double-blind test? shouldn't the samples be compiled on the fly for each person from a large bank of random music so as to eliminate subconscious prejudice from those selecting the samples?

It is double-blind in the sense that neither the administrator nor the listener knows which codec is being listened to at any given time (unless there is only one codec being compared).

If random samples were chosen each time, there would be no way to make a group comparison, since preferences vary by sample as well as by person.

However, you have a point about how the samples are selected in the first place. I selected the original group of samples for the 64 kbit/s test, after calling for people to send in short clips of music that they liked to listen to. This process was definitely not random. I culled according to my own judgment. The idea was to obtain a mix of genres, with vocals both male and female, and a variety of acoustic instruments. I chose what I personally thought was interesting-sounding music (e.g., I chose not to include several Japanese pop-music selections).

For the 128 kbit/s tests, Roberto substituted in a couple of "problem" samples, with lots of transients, known to give codecs trouble.

How this might have biased the test is unknown, but the caveat is clearly made that the results of this test are valid for this particular mix of music and the particular group of people who listened to it.

QUOTE
After all the principle of the double-blind is that neither the experimentee nor the experimenter knows what they are being subjected to until the results are in... blind tests have demonstrated homeopathy works, whereas double blind ones then dismissed the claim...  wink.gif


This is not so different from a drug test, in which the drug under test is known, but whether the drug or the placebo is being administered to any given person is not known.

ff123
Bongoboy
post Aug 11 2003, 03:05
Post #154





Group: Members
Posts: 30
Joined: 10-August 03
From: Newcastle Upon Tyne
Member No.: 8279



Well, as I said, I don't think it's too important in this case, as the testing is very specifically targeted, and the results are just as meaningful... what I'm suggesting would probably be a great deal more difficult to implement, for only marginal gain.

After all, using targeted samples has its place too.

I'm just suggesting a broader, possibly more meaningful test criterion, in that the purpose of the double-blind test is to erase any prejudice on the part of the tester as well as the testee. Randomising the selection process is, of course, an integral part of this. Otherwise it's more like the less thorough Pepsi Challenge style of blind test.

Re: the drug test example. I feel that providing truly random audio samples would be like the tester in a double-blind test not knowing who gets which drug, rather than matching each patient to the test drug or the placebo. True, the test is blind, but if the testing is unknowingly tailored for the "best" result, then the trial is inevitably tainted.


Meanwhile, any one aberrant result caused by the random sampling should be outweighed by the mean result, and the test is at least more truly random.


AstralStorm
post Aug 11 2003, 19:47
Post #155





Group: Members
Posts: 745
Joined: 22-April 03
From: /dev/null
Member No.: 6130



The test IS double-blind - you don't know which sample you're listening to, you can only discern between coded and not coded.
Double-blind testing was invented to keep nonverbal cues from the experimenter from revealing what the sample is.
There is no such problem with computers. tongue.gif

This post has been edited by AstralStorm: Aug 11 2003, 19:49


--------------------
ruxvilti'a
Bongoboy
post Aug 12 2003, 01:59
Post #156





Group: Members
Posts: 30
Joined: 10-August 03
From: Newcastle Upon Tyne
Member No.: 8279



First of all, I'd like to apologise for the way I've put things in my last posts, as I think I'm coming across as a jerk. but hey, that's the 'net. it's easy to offend, unless you always agree with everyone... tongue.gif

AstralStorm, my main point is that the interaction between the testers and the subjects is in the audio selection. if the music they encode is very randomly chosen on a user by user basis (say a random 30 secs out of an hour, chosen and encoded on-demand) the results might be different.

I don't think they would be. but they might.

imagine: "hey, our test says J-pop in 24kbps mp3 sounds the same as raw!" blink.gif
AstralStorm
post Aug 12 2003, 02:12
Post #157





Group: Members
Posts: 745
Joined: 22-April 03
From: /dev/null
Member No.: 6130



I wasn't annoyed by your post. Just tried to dispel some FUD. Oh well...

Lots of people would find hardly any artifacts in a normal (easy) track encoded by a recent codec at this bitrate.
Even the best listeners would have problems.
This would diminish the differences between the codecs.

This post has been edited by AstralStorm: Aug 12 2003, 02:18


Jore
post Aug 12 2003, 21:33
Post #158





Group: Members
Posts: 32
Joined: 7-February 03
From: Helsinki
Member No.: 4889



rjamorim, great work! Three thumbs up!

I think some ppl take these tests a bit too seriously. The main point came out clearly: mp3 is outdated at 128 kbps. Which of the winners to use is up to your software and hardware choices.

Hopefully all the mobile device developers also read your test and move on from mp3 headbang.gif Can't wait for the 64 kbps test results.

Best wishes,
Jore
dewey1973
post Aug 15 2003, 00:06
Post #159





Group: Members
Posts: 383
Joined: 31-March 03
From: Seattle, WA
Member No.: 5771



QUOTE (spoon @ Aug 7 2003, 02:18 AM)
Wma in all its confusing glory:

Basically everything wma that is not pro, lossless or voice (even WMA v9 standard) can be played by all WMA codecs including the first v2 codec.

To play pro, lossless and voice install the wma v9 codecs (which will play everything).

So if I have a Nomad Jukebox that supports wma it will NOT play files encoded (lossy or lossless) with the Pro encoder?

Also, what is the real world scenario that the 64kbps test is evaluating? I would think with all the "golden ears" here, no one would use that low of a bitrate. Is it to evaluate codecs for the purpose of streaming content?
rjamorim
post Aug 15 2003, 00:25
Post #160


Rarewares admin


Group: Members
Posts: 7515
Joined: 30-September 01
From: Brazil
Member No.: 81



QUOTE (dewey1973 @ Aug 14 2003, 08:06 PM)
So if I have a Nomad Jukebox that supports wma it will NOT play files encoded (lossy or lossless) with the Pro encoder?

Right, at least until Creative releases a firmware update (if ever)

QUOTE
Also, what is the real world scenario that the 64kbps test is evaluating?  I would think with all the "golden ears" here, no one would use that low of a bitrate.  Is it to evaluate codecs for the purpose of streaming content?


That, and flash players, and probably because the other bitrate ranges either have already been tested (128) or are untestable (160+)


--------------------
Get up-to-date binaries of Lame, AAC, Vorbis and much more at RareWares:
http://www.rarewares.org
spoon
post Aug 15 2003, 09:21
Post #161


dBpowerAMP developer


Group: Developer (Donating)
Posts: 2757
Joined: 24-March 02
Member No.: 1615



For the WMA codecs on portables, the source code comes from Microsoft themselves. I have heard they have yet to release a PRO or lossless version for portables to the various manufacturers.


--------------------
Spoon http://www.dbpoweramp.com
dewey1973
post Aug 15 2003, 15:48
Post #162





Group: Members
Posts: 383
Joined: 31-March 03
From: Seattle, WA
Member No.: 5771



QUOTE (rjamorim @ Aug 14 2003, 04:25 PM)
and probably because the other bitrate ranges either have already been tested (128) or are untestable (160+)

Is this because at 160 or above some codecs are mostly transparent? I would think that the presence of HD based players for mp3, wma (though not pro), ogg (Rio Karma), aac, and the whispers of the possibility of mpc, would make more people interested in seeing the results at 160 and 192. I, for example, have been using lossless while waiting to make a decision on my lossy codec choice. The relative performance of these codecs at higher bitrates would really help me with that decision as well as the decision of what bitrate to encode files at for portable use. It would help even if some of them tie due to transparency. For example if three codecs get 5s at 160 I can quit worrying and choose whichever portable I like and encode at 160. Am I missing something?

And thanks guys for the wma clarification.
DickD
post Aug 15 2003, 18:21
Post #163





Group: Members
Posts: 265
Joined: 12-January 03
Member No.: 4542



Great test Roberto.

I think from my point of view, I'd prefer an encoder that doesn't trip up very badly very often, even if its average score were a little lower.

Now, WMA Pro tripped up badly once. Perhaps it was bad luck and with other samples another codec would trip up, so statistical information isn't perfect.

However, I tabulated the mean scores (read from your graphs) and estimated the standard deviation.

Assuming all test samples are similarly distributed in terms of encoder variability, and assuming a "normal" or "gaussian" distribution, the average minus one sigma and average minus two sigma give a guide to the worst behaviour we're likely to see:

CODE
Track     AAC    Lame   MPC    Vorbis WMAPro Blade
41_30sec  4.36   3.3    4.33   4.2    3.97   1.4
ATrain    4.41   3.78   4.37   4.17   4.48   3.05
Bachpsic  4.5    3.41   4.66   4.51   4.8    2.9
Blackwat  4.62   3.92   4.71   4.38   4.56   2.18
death2    4.35   3.62   4.67   4.18   2.7    1.27
flooress  4.08   3.68   4.52   4.57   4.25   1.7
layla     4.15   3.59   4.4    4.24   4.45   1.83
macabre   4.59   4.06   4.55   4.54   4.86   3.16
midnight  4.56   3.42   4.43   4.26   4.38   2.39
thear1    4.69   4.16   4.48   4.11   4.44   2.41
thesourc  4.61   4.33   4.62   4.43   4.87   2.36
waiting   4.13   2.71   4.35   3.78   3.88   1.99

AvgScore  4.42   3.67   4.51   4.28   4.30   2.22
Std.Dev   0.21   0.44   0.13   0.22   0.59   0.62

-1 sigma  4.21   3.23   4.37   4.06   3.71   1.60
-2 sigma  4.00   2.79   4.24   3.84   3.11   0.99
-3 sigma  3.79   2.35   4.10   3.61   2.52   0.37

-1 sigma pt: p(new sample > this value) = 84.13%, p(new sample < this value) = 15.87%
-2 sigma pt: p(new sample > this value) = 97.72%, p(new sample < this value) = 2.28%
-3 sigma pt: p(new sample > this value) = 99.87%, p(new sample < this value) = 0.13%

sd/sqrt12 0.06   0.13   0.04   0.06   0.17   0.18
errorbar  0.12   0.25   0.08   0.13   0.34   0.36


The probabilities at the end come from the cumulative normal distribution: if you chose a new sample at random and had the same listeners test it, the chance of getting a value worse than the -1 sigma point is about 16%, worse than the -2 sigma point about 2.3%, and worse than the -3 sigma point about 0.13%.
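
For anyone who wants to reproduce the sigma-point rows, here is a minimal Python sketch. It assumes the per-track numbers are the graph-read means from the table (only MPC and WMAPro are copied in, for brevity), uses the sample standard deviation (n - 1), which appears to match the Std.Dev row, and takes the normality assumption stated above at face value. The names `sigma_points` and `tail_probability` are purely illustrative.

```python
from statistics import NormalDist, mean, stdev

# Per-track mean scores copied from the table above (two codecs shown for brevity).
scores = {
    "MPC":    [4.33, 4.37, 4.66, 4.71, 4.67, 4.52, 4.40, 4.55, 4.43, 4.48, 4.62, 4.35],
    "WMAPro": [3.97, 4.48, 4.80, 4.56, 2.70, 4.25, 4.45, 4.86, 4.38, 4.44, 4.87, 3.88],
}

def sigma_points(values, ks=(1, 2, 3)):
    """Return (mean, sample std dev, {k: mean - k*sd})."""
    m = mean(values)
    sd = stdev(values)  # sample standard deviation (n - 1 in the denominator)
    return m, sd, {k: m - k * sd for k in ks}

def tail_probability(k):
    """P(score < mean - k*sd), assuming normally distributed scores."""
    return NormalDist().cdf(-k)

for codec, values in scores.items():
    m, sd, points = sigma_points(values)
    rounded = {k: round(v, 2) for k, v in points.items()}
    print(codec, round(m, 2), round(sd, 2), rounded)

for k in (1, 2, 3):
    p = tail_probability(k)
    print(f"-{k} sigma: {p:.2%} chance of a worse score, {1 - p:.2%} of a better one")
```

With only 12 tracks per codec, the standard deviation itself is a rough estimate, so the sigma points are a guide rather than a guarantee.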

This is the result at the average minus 2-sigma point:



The errorbar line is based on the estimated error in the mean score, which I'd use to find the best rated codec overall on a mean score basis = 2*(Std Dev / Sqrt(12))
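
As a rough check of the errorbar row, the formula above can be sketched in Python. This assumes the already-rounded Std.Dev values from the table and n = 12 tracks; `standard_error` and `error_bar` are hypothetical helper names, not anything from the test tools.

```python
import math

N_SAMPLES = 12  # number of test tracks in the table

# Std.Dev row from the table above (already rounded to two decimals).
std_dev = {
    "AAC": 0.21, "Lame": 0.44, "MPC": 0.13,
    "Vorbis": 0.22, "WMAPro": 0.59, "Blade": 0.62,
}

def standard_error(sd, n=N_SAMPLES):
    """Standard error of the mean: sd / sqrt(n)."""
    return sd / math.sqrt(n)

def error_bar(sd, n=N_SAMPLES, width=2.0):
    """Error bar as used in the post: width * standard error (width = 2)."""
    return width * standard_error(sd, n)

for codec, sd in std_dev.items():
    print(f"{codec:7s} sem={standard_error(sd):.2f} errorbar={error_bar(sd):.2f}")
```

Setting `width=1.0` gives the plain standard error instead of twice that value.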


Just my thoughts. Many thanks to those who tested (I didn't have time, or probably the artifact training to join in)

By my criterion, of not failing badly, MPC wins over AAC, Vorbis, WMAPro, LAME, Blade.
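
The "not failing badly" ordering can be reproduced with a short Python sketch. It assumes the rounded AvgScore and Std.Dev rows from the table and ranks by mean minus two sigma; `worst_case` is an illustrative name.

```python
# AvgScore and Std.Dev rows from the table above (rounded values).
avg = {"AAC": 4.42, "Lame": 3.67, "MPC": 4.51, "Vorbis": 4.28, "WMAPro": 4.30, "Blade": 2.22}
sd = {"AAC": 0.21, "Lame": 0.44, "MPC": 0.13, "Vorbis": 0.22, "WMAPro": 0.59, "Blade": 0.62}

def worst_case(codec, k=2):
    """Mean minus k standard deviations for one codec."""
    return avg[codec] - k * sd[codec]

ranking_by_worst_case = sorted(avg, key=worst_case, reverse=True)
ranking_by_mean = sorted(avg, key=avg.get, reverse=True)

print("by mean - 2 sigma:", ranking_by_worst_case)
print("by mean:          ", ranking_by_mean)
```

Ranking by the plain mean puts WMAPro just ahead of Vorbis (4.30 vs 4.28); the two swap places once WMAPro's larger spread is taken into account.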

(Edit: Note, I posted the wrong image originally, so please refresh if the top graph doesn't match the scores or this order)

DickD

P.S. Hmm, I wonder if WMAPro did badly only because it was using 2 passes to aim at 128 kbps for the specific short sample tested. Perhaps it's fairer to use it in a one-pass mode that averages at 128 kbps over many albums.

This post has been edited by DickD: Aug 15 2003, 18:32
ff123
post Aug 15 2003, 20:29
Post #164


ABC/HR developer, ff123.net admin


Group: Developer (Donating)
Posts: 1396
Joined: 24-September 01
Member No.: 12



QUOTE (DickD @ Aug 15 2003, 09:21 AM)
The errorbar line is based on the estimated error in the mean score, which I'd use to find the best rated codec overall on a mean score basis = 2*(Std Dev / Sqrt(12))

The graph showing the standard error of the mean is an interesting one, as it shows the variability of quality across the codecs. I think that's a useful graph which should be included in future test analyses.

Edit: your graph shows twice the standard error of the mean; I think a more conventional graph would show just the standard error. Still, it's a good graph.

QUOTE
P.S. Hmm, I wonder if WMAPro did badly only because it was using 2 passes to aim at 128 kbps for the specific short sample tested. Perhaps it's fairer to use it in a one-pass mode that averages at 128 kbps over many albums.


1-pass VBR was way too variable in average bitrates across albums. See the bitrate thread for this test. So we chucked it in favor of 2-pass VBR.

ff123

This post has been edited by ff123: Aug 15 2003, 20:31
rjamorim
post Aug 15 2003, 20:31
Post #165


Rarewares admin


Group: Members
Posts: 7515
Joined: 30-September 01
From: Brazil
Member No.: 81



QUOTE (dewey1973 @ Aug 15 2003, 11:48 AM)
Is this because at 160 or above some codecs are mostly transparent?  I would think that the presence of HD based players for mp3, wma (though not pro), ogg (Rio Karma), aac, and the whispers of the possibility of mpc, would make more people interested in seeing the results at 160 and 192.  I, for example, have been using lossless while waiting to make a decision on my lossy codec choice.  The relative performance of these codecs at higher bitrates would really help me with that decision as well as the decision of what bitrate to encode files at for portable use.  It would help even if some of them tie due to transparency.  For example if three codecs get 5s at 160 I can quit worrying and choose whichever portable I like and encode at 160.  Am I missing something?

Well, you can try to extrapolate the results of the 128kbps test. At 128kbps, MPC, AAC, WMA and Ogg ended up averaging about 4.35 points. At 160, I'm pretty sure all of them would come very close to 5, maybe with some results from golden ears pulling the scores down a little.

Unless I use very problematic samples, but then the test wouldn't really be representative of several musical styles.

Finally, there's the point that it would be a VBR test. And I only have the courage to conduct one more VBR test (the 64kbps test)

