64 kbps listening test 2005, Pre-test thread
guruboolez
post Mar 24 2005, 13:43
Post #76





Group: Members (Donating)
Posts: 3474
Joined: 7-November 01
From: Strasbourg (France)
Member No.: 420



There's nothing wrong with that.
Now imagine a heterogeneous sample: the beginning (first few seconds) is quiet, whereas the following part is very different.
Suppose that most people will only rate the file on a small part (4-5 seconds), and that most of them will favour the first thing they hear (the beginning). Most, but not all... Now suppose HE-AAC is very good on the beginning but fails on the second part. Will the overall rating be representative? Isn't it better to give everyone a short sample only?

If people evaluate different parts of one sample, it could be considered as evaluating two (or more) different samples (at least if a long sample offers some variety). But correct me if I'm wrong: the purpose of a collective test is to obtain results from different subjectivities evaluating the same thing (same sample, same listening material). We can't fully achieve that, since people don't have the same hardware. But we can at least do one thing: make sure that everyone is listening to the same musical information.


Is it clear?


EDIT: Gabriel was faster, and explained it better laugh.gif

This post has been edited by guruboolez: Mar 24 2005, 13:44
Aoyumi
post Mar 24 2005, 13:52
Post #77





Group: Members
Posts: 236
Joined: 14-January 04
From: Kanto, Japan
Member No.: 11215



QUOTE (PoisonDan @ Mar 23 2005, 11:51 PM)
QUOTE (Aoyumi @ Mar 23 2005, 04:32 PM)
QUOTE (Sebastian Mares @ Mar 23 2005, 10:33 PM)
Regarding Vorbis, I would love if some Vorbis users could start a small listening test and compare AoTuV3 and Xiph 1.1 so that the better version will be used in this test.
*

For when the test is performed, I need to submit the newest experimental version. It is clearly better than aoTuV beta3 on some samples (at low bitrate settings).
*


Were you planning on releasing a new version soon anyway? I wouldn't want you to feel rushed to get a version out the door just to be in time for this listening test...

At this moment, I'm extremely busy with real-life and work-related stuff, but next week I'll probably have some time to do a few Vorbis listening tests...
*


I am able to provide a version tuned for the 64 kbps range, at least. I would like it to be tested.
Music Mixer
post Mar 24 2005, 13:56
Post #78





Group: Members
Posts: 4
Joined: 31-October 04
Member No.: 17931



One vote for ATRAC3+.

I suggest encoding via SS3, because it seems to have improved.
I would upload some samples, but unfortunately that is not possible because I only have a 56 kbit connection.

P.S.: IMHO it sounds better than MP3 but worse than Vorbis and HE-AAC at 64 kbit.

This post has been edited by Music Mixer: Mar 24 2005, 13:58
guruboolez
post Mar 24 2005, 14:02
Post #79





Group: Members (Donating)
Posts: 3474
Joined: 7-November 01
From: Strasbourg (France)
Member No.: 420



Another suggestion (related to the samples): instead of focusing too much on musical genre (metal - jazz - classical ...), I think it would be better to choose samples for the kind of signal they represent: loud - quiet - noisy - tonal - attacks...

When I sent Roberto the very quiet sample called Debussy.wav, which apparently had nothing hard to encode, most people were in the end surprised by the poor performance of the champion (Musepack). This sample revealed severe issues with Musepack (even WMA & ATRAC3 were better) at moderate bitrate. I know that some lossy encoders have serious problems with very tonal music (-> ringing); others suffer with low-volume content. There's also pre-echo...


If you're interested, I could propose several samples.
Gabriel
post Mar 24 2005, 14:12
Post #80


LAME developer


Group: Developer
Posts: 2950
Joined: 1-October 01
From: Nanterre, France
Member No.: 138



One thing I'd like is to let encoders adapt to the content before the test position.
Most encoders have adaptive thresholds, and so need a little time to adapt at the beginning. This means that a specific piece would not be encoded the same way at the beginning of a track as in the middle.
I think a 1 s delay should be reasonable.

So would it be possible to:

*cut the first second of the decompressed sample?

or

*instruct the ABC/HR software to only allow testing past the first second?
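As an illustration of the first option, here is a minimal sketch that drops the first second of a decoded sample. It assumes plain PCM WAV files and Python's standard wave module; it is only a sketch of the idea, not a script actually used in the test.

CODE
import wave

def trim_first_second(src_path, dst_path, seconds=1.0):
    """Write a copy of src_path with the first `seconds` of audio removed."""
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        skip = int(params.framerate * seconds)   # frames covering the adaptation region
        src.readframes(skip)                     # read and discard them
        rest = src.readframes(params.nframes - skip)
    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)                    # frame count is fixed up on close
        dst.writeframes(rest)

# Hypothetical usage:
# trim_first_second("sample01_decoded.wav", "sample01_trimmed.wav")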
ff123
post Mar 24 2005, 15:41
Post #81


ABC/HR developer, ff123.net admin


Group: Developer (Donating)
Posts: 1396
Joined: 24-September 01
Member No.: 12



I'm not sure if abchr-java can force the following options, but I'm sure schnofler can modify his code:

1) the rating scale description should be changed to the "excellent" to "poor" labels; I already know this option exists, but it should be forced from the configuration file

2) the start time should be forced to X sec into the clip without allowing the listener to hear anything before that time, also specified from the configuration file.

ff123
moi
post Mar 24 2005, 18:43
Post #82





Group: Members
Posts: 53
Joined: 23-June 04
Member No.: 14859



QUOTE (Latexxx @ Mar 23 2005, 06:22 AM)
QUOTE (moi @ Mar 23 2005, 03:58 PM)
I really don't see why LAME at 128kbps should be  included in a 64kbps listening test, as it was the other time. Probably has something to do with the claim that WMA at 64kbps sounds "as good as" MP3 at 128kbps. I don't think many here believe that claim. In any case, IMO, a 64kbps listening test should only include music encoded at 64kbps. It is misleading to encode 128kbps in one format, and 64kbps in all the others. Everything in a 64kbps listening test should be encoded at 64kbps.
*

A credible listening test should have a low and high anchor.
*



What does that mean, a high and low anchor? I guess I really don't know what it means -- it just seems strange that, for a 64 kbps listening test, one of the formats would be tested at 128 kbps rather than at 64 kbps.

If it is meant as a reference to compare against, then why not include the uncompressed sample for listeners to compare the compressed versions with? (Perhaps that's already done.) That makes sense, but I still don't understand the "high and low anchor", I guess. Please explain.

Does "high anchor" always mean one format is tested at a higher bitrate than the others, and the low anchor at a lower one? Will you test one format at 32 kbps for the "low anchor"?

In the 128 kbps listening test, was one of the formats tested at 192 kbps for the "high anchor"?
beto
post Mar 24 2005, 18:55
Post #83





Group: Members (Donating)
Posts: 713
Joined: 8-July 04
From: Sao Paulo
Member No.: 15173



high anchor -> performs noticeably better than the average of the codecs being tested
low anchor -> performs noticeably worse than the average of the codecs being tested

AFAIK this is done to get meaningful statistical results. The high/low anchors are not part of the test itself in the sense that they are not being evaluated; they are just a reference...

Someone correct me if I am wrong.

This post has been edited by beto: Mar 24 2005, 18:56


--------------------
http://volutabro.blogspot.com
Latexxx
post Mar 24 2005, 18:57
Post #84


A/V Moderator


Group: Members
Posts: 858
Joined: 12-May 03
From: Finland
Member No.: 6557



The purpose of anchors is to bind the results to the real world, i.e. with an anchor your results no longer "float" in the air. When you have anchors, you can also, to some extent, compare codecs featured in different listening tests to each other.
schnofler
post Mar 24 2005, 19:26
Post #85


Java ABC/HR developer


Group: Developer
Posts: 175
Joined: 17-September 03
Member No.: 8879



QUOTE (ff123 @ Mar 24 2005, 06:41 AM)
1) the rating scale description should be changed to the "excellent" to "poor" labels; I already know this option exists, but it should be forced from the configuration file
*

I'm not sure what exactly you mean by "forced from the configuration file". The custom rating labels can be specified in the test setup dialog and will be saved to the configuration file.

QUOTE (ff123 @ Mar 24 2005, 06:41 AM)
2) the start time should be forced to X sec into the clip without allowing the listener to hear anything before that time, also specified from the configuration file.
*

The offset setting could be used for this. Just adding 1000*X to each of the offsets will have exactly that effect.
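To illustrate the arithmetic only (the real Java ABC/HR configuration format is not reproduced here, and the values are hypothetical): offsets are stored in milliseconds, so forcing the test to start X seconds in means adding 1000*X to each stored offset.

CODE
# Hypothetical illustration of the 1000*X shift described above; this is not
# code from the Java ABC/HR project, just the arithmetic it implies.
FORCED_START_SECONDS = 1

def shift_offsets(offsets_ms, start_seconds=FORCED_START_SECONDS):
    # Offsets are in milliseconds, hence the 1000 * X term.
    return [off + 1000 * start_seconds for off in offsets_ms]

print(shift_offsets([0, 0, 0, 0]))  # -> [1000, 1000, 1000, 1000]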
ff123
post Mar 24 2005, 21:29
Post #86


ABC/HR developer, ff123.net admin


Group: Developer (Donating)
Posts: 1396
Joined: 24-September 01
Member No.: 12



QUOTE (schnofler @ Mar 24 2005, 10:26 AM)
QUOTE (ff123 @ Mar 24 2005, 06:41 AM)
1) the rating scale description should be changed to the "excellent" to "poor" labels; I already know this option exists, but it should be forced from the configuration file
*

I'm not sure what exactly you mean by "forced from the configuration file". The custom rating labels can be specified in the test setup dialog and will be saved to the configuration file.

QUOTE (ff123 @ Mar 24 2005, 06:41 AM)
2) the start time should be forced to X sec into the clip without allowing the listener to hear anything before that time, also specified from the configuration file.
*

The offset setting could be used for this. Just adding 1000*X to each of the offsets will have exactly that effect.
*



What I meant is that Sebastian should be able to create a configuration file that everyone uses and which controls the rating labels. D'oh, I forgot about those offsets in the config file! That's the easy solution, of course.

ff123
jaybeee
post Mar 24 2005, 21:34
Post #87





Group: Members
Posts: 410
Joined: 20-October 04
From: UK
Member No.: 17750



QUOTE (Gabriel @ Mar 24 2005, 12:30 PM)
I also think that 30s might be too long.
Perhaps 6s is too short, but I think that 15s should be enough.

Letting testers decide which portion to use perhaps reduces the "usefulness" of the results. It is as if they are testing different samples, which makes correlating results for the same sample harder.

If a sample has some quite different parts within a 30 s span, then it could be interesting to split it into 2 samples, making interpretation of the results easier.
*


I've just uploaded an 18-second sample here that I feel would be good for this test. I deliberated over which section to use and how long it should be - the song is 21 minutes long and has a lot of demanding parts. I think I chose the best part.

This post has been edited by jaybeee: Mar 24 2005, 22:26


--------------------
http://www.health4ni.com/
Sebastian Mares
post Mar 24 2005, 22:22
Post #88





Group: Members
Posts: 3633
Joined: 14-May 03
From: Bad Herrenalb
Member No.: 6613



Please post samples here: http://www.hydrogenaudio.org/forums/index....showtopic=32689


--------------------
http://listening-tests.hydrogenaudio.org/sebastian/
schnofler
post Mar 25 2005, 00:13
Post #89


Java ABC/HR developer


Group: Developer
Posts: 175
Joined: 17-September 03
Member No.: 8879



QUOTE (ff123 @ Mar 24 2005, 12:29 PM)
What I meant is that Sebastian should be able to create a configuration file that everyone uses, and which will control the rating labels.
*

Yes, that is possible.

QUOTE (ff123 @ Mar 24 2005, 12:29 PM)
Doh, forgot about those offsets in the config file!  That's the easy solution, of course.
*

Heh. Yes, I was just about to get to work on the "new" feature myself, when I noticed it's not such a new feature, really. tongue.gif
Sebastian Mares
post Mar 25 2005, 14:44
Post #90





Group: Members
Posts: 3633
Joined: 14-May 03
From: Bad Herrenalb
Member No.: 6613



So far, the settings used will be:

Nero HE-AAC: VBR profile "Streaming :: Medium", High Quality
Vorbis: -q 0
WMA Standard: -a_codec WMA9STD -a_mode 3 -a_setting 64_44_2
LAME 3.96.1 (high anchor): -V5 --athaa-sensitivity 1

ATRAC3+ samples will be encoded using whatever settings produce 64 kbps; the same applies to Apple HE-AAC.

Regarding the low anchor, I would use Adobe Audition 1.5 and the FhG encoder at 64 kbps CBR, but others might want to use LAME.
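For illustration, here is a minimal sketch of how the listed LAME high-anchor switches could be applied to a sample via Python. The file names are placeholders and it assumes a LAME binary on the PATH; it is not part of the actual test tooling.

CODE
# Sketch only: encode one sample with the LAME 3.96.1 high-anchor switches
# listed above. File names are hypothetical.
import subprocess

def encode_high_anchor(wav_in, mp3_out):
    subprocess.run(
        ["lame", "-V5", "--athaa-sensitivity", "1", wav_in, mp3_out],
        check=True,
    )

# encode_high_anchor("sample01.wav", "sample01_high_anchor.mp3")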


--------------------
http://listening-tests.hydrogenaudio.org/sebastian/
Gabriel
post Mar 25 2005, 16:28
Post #91


LAME developer


Group: Developer
Posts: 2950
Joined: 1-October 01
From: Nanterre, France
Member No.: 138



For the high anchor, I would prefer LAME 3.97 (probably with an ABR setting), which will probably be at least in beta stage by the time the test starts.
Sebastian Mares
post Mar 25 2005, 21:34
Post #92





Group: Members
Posts: 3633
Joined: 14-May 03
From: Bad Herrenalb
Member No.: 6613



QUOTE (Gabriel @ Mar 25 2005, 04:28 PM)
For the high anchor, I would prefer LAME 3.97 (probably with an ABR setting), which will probably be at least in beta stage by the time the test starts.
*


So, --preset 128 then?


--------------------
http://listening-tests.hydrogenaudio.org/sebastian/
Jojo
post Mar 26 2005, 02:46
Post #93





Group: Members
Posts: 1361
Joined: 25-November 02
Member No.: 3873



QUOTE (Gabriel @ Mar 25 2005, 07:28 AM)
For the high anchor, I would prefer LAME 3.97 (probably with an ABR setting), which will probably be at least in beta stage by the time the test starts.
*

Just out of curiosity: are you saying that some ABR preset in the new LAME 3.97 build might be better than -V5 --athaa-sensitivity 1?


--------------------
--alt-presets are there for a reason! These other switches DO NOT work better than it, trust me on this.
LAME + Joint Stereo doesn't destroy 'Stereo'
Sebastian Mares
post Mar 27 2005, 08:48
Post #94





Group: Members
Posts: 3633
Joined: 14-May 03
From: Bad Herrenalb
Member No.: 6613



So, the list of codecs is now pretty much done:

Apple HE-AAC
Nero HE-AAC
WMA Standard
ATRAC3+
LAME 3.97 MP3 (high anchor)
Adobe Audition FhG MP3 (low anchor)
Ogg Vorbis (AoTuV3 or 1.1)

At this point, I would like to ask people again to compare the two Vorbis encoders. If you have time, you can also give Archer a try, but the test should focus on AoTuV3 and 1.1.

This post has been edited by Sebastian Mares: Mar 27 2005, 08:49


--------------------
http://listening-tests.hydrogenaudio.org/sebastian/
westgroveg
post Mar 27 2005, 09:42
Post #95





Group: Members
Posts: 1236
Joined: 5-October 01
Member No.: 220



What about MP3+? It would be interesting to see if the HE-AAC encoders can perform better than MP3+ yet.
Sebastian Mares
post Mar 27 2005, 10:00
Post #96





Group: Members
Posts: 3633
Joined: 14-May 03
From: Bad Herrenalb
Member No.: 6613



QUOTE (westgroveg @ Mar 27 2005, 09:42 AM)
What about MP3+? would be interesting to see if the HE-AAC encoders can perform better MP3+ yet
*


I suppose you mean mp3PRO... Well, it was tested last time and performed quite well, but it only came third, after Nero HE-AAC and the high anchor (LAME at 128 kbps).

I will not include it in this test because there have been no improvements since the last test, and also because it is a pretty rare format with little software and hardware support.


--------------------
http://listening-tests.hydrogenaudio.org/sebastian/
vinnie97
post Mar 27 2005, 21:13
Post #97





Group: Members
Posts: 472
Joined: 6-March 03
Member No.: 5360



There's an AoTuV prebeta4 to check now, which supposedly resolves some issues @ -q 0. wink.gif http://www.geocities.jp/aoyoume/aotuv/test.html
guruboolez
post Mar 30 2005, 19:27
Post #98





Group: Members (Donating)
Posts: 3474
Joined: 7-November 01
From: Strasbourg (France)
Member No.: 420



I did a small listening test of the WMA9 encoders. As samples, I used all of those selected by Roberto for his last 128 kbps multiformat listening test.

Two important things:

I haven't browsed HA since last Thursday (if decisions were made in this topic during the past week, I wasn't aware of them).
This listening test was a very fast one. Too fast, I would say. I didn't ABX anything, and I've probably missed some details.

WMA9 Pro is better, but at -q10 its bitrate doesn't stay near 64 kbps. It is nevertheless not that much better.
Statistically, 2-pass CBR and 2-pass VBR are tied, but 2-pass CBR at 64 kbps appeared to be a bit more consistent in quality than VBR at low bitrate.

EDIT: bare log files (no comments, just the ratings) are available here.

This post has been edited by guruboolez: Mar 30 2005, 19:28
Sebastian Mares
post Mar 30 2005, 19:40
Post #99





Group: Members
Posts: 3633
Joined: 14-May 03
From: Bad Herrenalb
Member No.: 6613



Thanks for the test, guruboolez! Weird that VBR is a bit worse than CBR - I didn't expect that. I guess I will use CBR for WMA Standard then.


--------------------
http://listening-tests.hydrogenaudio.org/sebastian/
guruboolez
post Mar 30 2005, 20:24
Post #100





Group: Members (Donating)
Posts: 3474
Joined: 7-November 01
From: Strasbourg (France)
Member No.: 420



As I said, I was not fully satisfied with this test (too fast, too imprecise). If the collective test doesn't start in the next few days, I think I could test CBR and VBR again, without WMA Pro this time, and with an ABX phase in order to be sure that the differences were audible.
