Topic: Pre-Test thread (Read 55421 times)

Pre-Test thread

Reply #50
Quote
I think that Guru's idea of 2 sets is interesting but unfortunately probably bad in our case. I am afraid that only experienced listeners would pick the second group. As we know that those listeners give lower rankings (as demonstrated in the 128kbps test), the rankings of both groups would probably not be comparable.


Good point.

A solution could be to "normalize" each listener's rankings: calculate his/her average ranking over the 1st (smaller) group of samples, then multiply all of that listener's rankings (including those for the 2nd (extended) group of samples) by a factor chosen so that the 1st-group average becomes e.g. 3. Hopefully this doesn't mess up the results - another thing that dologan could try to find out.  B)
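For what it's worth, the normalization described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `normalize_ratings`, the plain lists of ratings and the default target mean of 3 are made up for the example, and none of it comes from ff123's tools or from any procedure agreed in this thread.

```python
def normalize_ratings(core_ratings, extended_ratings, target_mean=3.0):
    """Scale one listener's ratings so that the core-group mean equals target_mean.

    core_ratings:     ratings given to the 1st (smaller) group of samples
    extended_ratings: ratings given to the 2nd (extended) group, may be empty
    Returns (scaled_core, scaled_extended, factor).
    """
    if not core_ratings:
        raise ValueError("need at least one core-group rating")

    # Per-listener average over the core group only.
    core_mean = sum(core_ratings) / len(core_ratings)

    # One multiplicative factor per listener, applied to both groups.
    factor = target_mean / core_mean

    scaled_core = [r * factor for r in core_ratings]
    scaled_extended = [r * factor for r in extended_ratings]
    return scaled_core, scaled_extended, factor


# Hypothetical listener who rates everything in the core group 4.0
core, extended, factor = normalize_ratings([4.0, 4.0, 4.0, 4.0], [2.0, 4.0])
print(factor, core, extended)
```

One thing the sketch makes visible: a listener with a low core-group average gets a factor above 1, so a scaled rating can end up above 5 and would need to be handled somehow.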
Let's suppose that rain washes out a picnic. Who is feeling negative? The rain? Or YOU? What's causing the negative feeling? The rain or your reaction? - Anthony De Mello

Reply #51
Another problem with using lame as an anchor is that it might not serve its function.  A lower anchor should rank last for each sample.  Disregarding the blade anchor, lame did not do that in the 128kbps test.  If a lower anchor is to be used (and I tend to think that would be a good idea, but I am not a statistics expert), I would be in favor of using blade again.  Unlike a simple lowpass, it produces a spectrum of different kinds of artifacts, which I think is a more realistic baseline to work from.

Also, since there haven't been any responses about my Linux ABC/HR clone, I assume there's no interest.  Anyone who's interested can let me know, otherwise I'll probably not devote as much time to it as I would otherwise.
I am *expanding!*  It is so much *squishy* to *smell* you!  *Campers* are the best!  I have *anticipation* and then what?  Better parties in *the middle* for sure.
http://www.phong.org/

Reply #52
In statistics, an anchor can be a middle value, not the highest and not the lowest.
I think that in a listening test an anchor is a reference point for weighting every listener's results to make them more comparable. It guards against some listeners using only the 4-5 range while others use the full range.
It is also useful to check that a participant submits serious results.

Therefore, Blade at 64 kbps would do it.
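The "serious results" check could look roughly like this in Python. It is a sketch under assumptions: the nested dict layout, the codec labels and the function name `anchor_violations` are invented for the example, and nothing in this thread says results will actually be screened this way.

```python
def anchor_violations(results, anchor="low_anchor"):
    """Return the samples where the listener rated some codec below the low anchor.

    results maps sample name -> {codec label: rating} for one listener.
    """
    flagged = []
    for sample, ratings in results.items():
        anchor_score = ratings[anchor]
        # Ratings of every contender except the anchor itself.
        others = [score for codec, score in ratings.items() if codec != anchor]
        if others and min(others) < anchor_score:
            flagged.append(sample)
    return flagged


# Hypothetical results from one listener on two samples
listener = {
    "sample1": {"low_anchor": 1.5, "vorbis": 3.0, "he_aac": 3.5},
    "sample2": {"low_anchor": 3.0, "vorbis": 2.0, "he_aac": 4.0},
}
print(anchor_violations(listener))
```

A flagged sample is only a hint, not proof of a careless submission, since a weak contender can honestly lose to the anchor on some samples.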

Reply #53
I think we're making a mess out of this test...


OK, people, please give me suggestions. If you want a lower anchor, and lame 128 shouldn't be an anchor, which would be the codecs featured, in your opinion, including anchors?

Remember we're limited to 8 codecs.

Reply #54
Quote
A solution could be to "normalize" each listener's rankings: calculate his/her average ranking over the 1st (smaller) group of samples, then multiply all of that listener's rankings (including those for the 2nd (extended) group of samples) by a factor chosen so that the 1st-group average becomes e.g. 3. Hopefully this doesn't mess up the results - another thing that dologan could try to find out.  B)

My fear is that such a messy way of calculation would open lots of possibilities for critics and the like to flame my test.

Not to mention that calculating the resulting scores will be nightmarish (I mean, it'll be very hard, and then human errors might creep in: since I won't be using only ff123's tools anymore to do the calculation, I would have to do several calculations myself).

Reply #55
Quote
EDIT: And what about PNS for Nero? In a quick test I heard that it is a useful option at low bitrates.

Ok. Just talked with Ivan, he said PNS is automagically disabled when you use HE AAC. Probably because it actually decreases quality, I guess.

Reply #56
Quote
Well, as already explained somewhere, the Anchor isn't there only to protect rankings, but also to put things into perspective across the entire sample suite.


Probably it does more good than bad to define fixed rankings for anchors but to put things into perspective it would be good IMO to suggest at least a range for the ranking of the anchors.

Taking this into account the codecs tested should be

higher anchor:
1. lame --preset 128; suggested ranking around "4" (around could mean e.g. +/-1)

lower anchor:
2. lame --preset 64; suggested ranking around "1" - "2" (I don't know how realistic this suggested ranking is as I haven't tested --preset 64 much so far.)
OR:
2. something transcoded, e.g. WMA9@64kbps -> MP3Pro@64kbps (could have some educational value)

3.Ahead HE-AAC

4.Ogg Vorbis

5.MP3pro

6.WMAV9

7.Real Audio Cook

8.AAC? ATRAC3Plus? WMA8? ...? I'd say AAC because of hardware support.
Let's suppose that rain washes out a picnic. Who is feeling negative? The rain? Or YOU? What's causing the negative feeling? The rain or your reaction? - Anthony De Mello

Reply #57
Ok, these are my personal feelings:

I think that having lame at 128k for the upper anchor is critical.  Lots of these codecs are claiming to be as good at 64k as mp3 is at 128k.  I'm going to wager that such a claim is a big fat lie and I think one of the goals of this test should be testing that claim.

I don't care if there is or is not a lower anchor.  I guess I don't fully understand the significance of multiple anchors, or their importance.  I'm also perfectly happy doing 8 codecs.  With the 128 test, that would have been exhausting.  No codec is that close to transparent at 64k so it will be much easier and less fatiguing to do more samples.  If there is going to be a lower anchor, I would prefer blade.  I think lame at 64k would be surprisingly competitive.

As far as WMA vs. WMA pro, you're screwed any way you do it.  If you use WMA only, people will complain that you didn't include the best version.  If you use only WMA pro, people will claim that it's not representative of the WMA that everyone uses and actually has support in hardware.  Including both seems like a waste.

Sony's claiming that Atrac3 sounds really good and is marketing the crap out of it, which is a claim that should be tested.  However, apparently the software is so horrible that nobody's gonna use it no matter how good it sounds.  I'd say the same is true of Real Audio, but it's actually got some popularity somehow.
I am *expanding!*  It is so much *squishy* to *smell* you!  *Campers* are the best!  I have *anticipation* and then what?  Better parties in *the middle* for sure.
http://www.phong.org/

Reply #58
Quote
I think that having lame at 128k for the upper anchor is critical.  Lots of these codecs are claiming to be as good at 64k as mp3 is at 128k.  I'm going to wager that such a claim is a big fat lie and I think one of the goals of this test should be testing that claim.

Right, so let's try to make things easier: This is definitely in:

-Lame --ap 128
-Ahead HE AAC Streaming :: Medium
-Vorbis -q 0 (or -q 0.2?)
-Adobe Audition MP3pro quality 40

This is up for discussion:

-Real Audio Cook/Gecko 64kbps
-WMA (std or pro? CBR or VBR?)
-Bottom anchor (lowpass? blade? lame?)

This is probably out, but can also be discussed:

-Atrac3plus
-QuickTime AAC LC

I think a good compromise between those that want to test as much as possible, and participants that don't want to waste too much time taking the test, is taking what's definitely in and what's up for discussion, and leaving out LC AAC and Atrac3+.

That would also make the statistical calculation of the results much easier and less prone to criticism. There would be no more odd packages, with a different sample in each of them, or doing special packages for those that want to test more.

Quote
As far as WMA vs. WMA pro, you're screwed any way you do it. If you use WMA only, people will complain that you didn't include the best version. If you use only WMA pro, people will claim that it's not representative of the WMA that everyone uses and actually has support in hardware. Including both seems like a waste.


True. I think another point would be that WMA std was already tested in ff123's test. Yes, it was v8, but I'm not very confident that v9 improved much. Anyone care to try? If it's nearly the same as v8, I might as well go with Pro.

Quote
Sony's claiming that Atrac3 sounds really good and is marketing the crap out of it, which is a claim that should be tested. However, apparently the software is so horrible that nobody's gonna use it no matter how good it sounds.


Haha. Right.

I created a special VirtualPC Win98 partition to install SonicStage, in case someone really wants it. But I'm inclined to leave it alone.

Regards;

Roberto.

Reply #59
Real: I think that it is not needed
wma: I think that v9 std should be included, as it is marketed for portable devices
atrac3plus: do not think that we need it
aac-lc: I think that it should be included, as it can be decoded by portable devices.

Even if some codecs (wma and aac) were already tested in previous tests, I think that they should be included.

If a higher anchor is included (mp3-128), I think that a lower anchor should be included too. Otherwise, there is a risk that most of the rankings would be in the bottom range. With a lower anchor, the interesting competitors will probably be more balanced.

As a lower anchor, I think that lame 64 would be good: mp3 is still used for streaming, and it is something that is really used (Blade64 and lowpass are probably not used that much...)
If the exact version number and parameters are mentioned, this lower anchor is still reproducible.

So my choice would be:
lame 128 (higher anchor)
lame 64 (lower anchor)
vorbis
mp3pro
he-aac
aac
wma

Reply #60
Quote
If a higher anchor is included (mp3-128), I think that a lower anchor should be included too. Otherwise, there is a risk that most of the rankings would be in the bottom range.

That thought sounds wise to my ears. IMO, either put one lower and one higher anchor in, or make explicitly clear that the results may be a little down the ladder.

I would also like to see wma included, as it is one of the two main advertised formats in portable music (...so ms would say, it's as good at 32kbps as mp3 at 64...) let's see if they're right :-)
Nothing but a Heartache - Since I found my Baby ;)

Reply #61
uups, double post...
Nothing but a Heartache - Since I found my Baby ;)

Reply #62
Roberto, given the MS claims about WMA9 (not WMA8), and its current semi-support (most existing hardware devices support the capabilities of the WMA8 bit-stream, which is more constrained), and increasing industry support, it has to be included. 

Also, the MS claims about quality are VBR/ABR based, not CBR, and most of the other codec configurations are VBR/ABR configurations.  I suggest ABR (VBR 2-pass) with the standard codec (the pro codec will only realize 64k VBR through quality based configuration which would be more difficult to constrain than ABR).

By the way, thanks for doing all this, Roberto.  Whilst many folk will criticise, few will bother doing anything worth criticising.

Doug

Reply #63
OK, so let's try this:

-HE AAC
-Vorbis
-MP3pro
-WMA Std
-AAC-LC
-Real Audio
-Lame 128 as high anchor
-Lame/Blade 64 as bottom anchor

Any criticism?

Reply #64
That combination looks really good. This way

- you can look at the differences between he-aac and lc-aac at the given bitrate
- compare mp3 and wma in detail
- see how vorbis holds up against all of them and whether it is beaten by aac...
- and in the end get an impression of whether mp3(pro!!) really still is the medium of choice in a low bitrate scenario

The anchors are well chosen and well set, as almost everybody knows how mp3 'should' sound.

I'll say, let's do it that way. Though the codec combination looks good to me, I really have no clue about all the special 'treatments' for each codec that may be chosen... (but that has been discussed before in this thread, if my memory doesn't let me down).
Nothing but a Heartache - Since I found my Baby ;)

Reply #65
I agree.

Of course, 8 codecs will take some time (time to listen, time to relax the ears during the test).
But if someone's time is short, it's better to reduce the number of tested samples in that person's case than to reduce the number of codecs for the whole test.

Reply #66
Quote
I really have no clue about all the special 'treatments' for each codec that may be chosen... (but that has been discussed before in this thread, if my memory doesn't let me down).

What treatments? Preprocessing?

Preprocessing IST VERBOTTEN!

Quote
But if someone's time is short, it's better to reduce the number of tested samples in that person's case than to reduce the number of codecs for the whole test.


True. Besides, the test will last for 11 days and two whole weekends. I reckon people will need less time to listen to this test's 8 samples than the 128kbps test's 6 samples.

Anyway, I'm still looking for criticism before I make that official as the sample suite.

(Also, I'll wait some time for a reply I'm expecting from a codec developer)

Thanks for all the suggestions and criticism.

Best regards;

Roberto.

Reply #67
BTW, some more questions that need to be answered:

-Which AAC LC codec will we use: Apple (ABR 64kbps) or Ahead (VBR, Radio or Tape, I don't remember which)?
-Blade or Lame at 64kbps for the bottom anchor? Or maybe even FhG?

IMO, it would be interesting to use MP3 at its best at 64kbps too, to see how well (bad?) it compares to other codecs (especially WMA). And at 64kbps the best is surely FhG, since neither Lame nor Blade have Intensity Stereo coding.

Regards;

Roberto.

Reply #68
Quote
IMO, it would be interesting to use MP3 at its best at 64kbps too, to see how well (bad?) it compares to other codecs (especially WMA). And at 64kbps the best is surely FhG, since neither Lame nor Blade have Intensity Stereo coding.

I'm interested too.  I say go with FhG for the bottom anchor.
gentoo ~amd64 + layman | ncmpcpp/mpd | wavpack + vorbis + lame

Reply #69
The list looks good from where I'm standing.  I'll also agree that featuring 64k mp3 at its best is going to provide the most interesting results.
I am *expanding!*  It is so much *squishy* to *smell* you!  *Campers* are the best!  I have *anticipation* and then what?  Better parties in *the middle* for sure.
http://www.phong.org/

Reply #70
Ciao...

read @ the portal.
sounds familiar.: *general phong*

Reply #71
Quote
read @ the portal.
sounds familiar.: *general phong*

Too late. You replied, and now it's gone. 

Reply #72
Huh?
I am *expanding!*  It is so much *squishy* to *smell* you!  *Campers* are the best!  I have *anticipation* and then what?  Better parties in *the middle* for sure.
http://www.phong.org/

Reply #73
rj -- as for which aac to include, i would hate to see you choose ahead just because it's the latest to throw its hat in the ring to the exclusion of qt -- which was i think a surprise in the last two tests, the first for winning outright and the second for placing 2nd with cbr. on that basis i would assume it merits inclusion without comment. however, if you personally abx qt vs. ahead and conclude that ahead shows more promise, i for one would not object to you kicking qt out. fwiw.

brett.

Reply #74
Quote
rj -- as for which aac to include, i would hate to see you choose ahead just because it's the latest to throw its hat in the ring to the exclusion of qt

Actually, I'm leaning more towards QT. First, because Ahead is already featuring a codec in this test, and second because QT fared so well in the AAC test.

Quote
-- which was i think a surprise in the last two tests, the first for winning outright and the second for placing 2nd with cbr. on that basis i would assume it merits inclusion without comment.


I agree

Quote
however, if you personally abx qt vs. ahead and conclude that ahead shows more promise, i for one would not object to you kicking qt out. fwiw.


I don't take listening tests. But if someone wants to test Ahead 64 vs. QT 64, the results would be very welcome.

Regards;

Roberto.