Multiformat 128 kbps Listening Test

Reply #750 – 2005-12-01 05:16:02

Quote

I guess I'm the wrong person to answer this question. Check my "failure" here:

A little VBR ~88 kbps ABX-test

You should be able to say how to select "difficult" samples anyhow. I guess that by linking to this topic you want to say that it should be done through listening tests, and by someone other than you. Am I right?

Then, sure, I can understand your point: you want the test to be a bit easier so that not all codecs end up tied. But first a group of people would have to sit down and listen to a lot of material in order to select samples that produce artifacts with some encoders. I just don't think that's very practical. The group should, as you hint, consist of people with above-average hearing, and they would have to go through a sizeable number of test clips in order to pick out some interesting ones. Then these people could just as well give the scores directly, and there would be no need for the public listening test.

Or, you could try to reuse samples from previous tests which separated the codecs well. Then you run the risk of biasing this test towards the winner of the last one. That could be acceptable I guess, but again I would prefer not to do that, and it does indeed look like Sebastian is selecting a whole new sample set for this test.

I don't see any reason why the sample set shouldn't be selected so that its bitrate mean and distribution resemble "average music". What average music is, we could discuss, but I'm fine if it happens to be those libraries from which the current bitrate estimation for the encoders is taken.

Reply #751 – 2005-12-01 06:04:05

I suppose it would be good if a couple of experienced testers could try the samples and give their opinion in case the test conductor is not sure about the selection. They don't have to test every sample with every encoder.

BTW, did you try the sample I provided e.g. with Vorbis? No one has posted test results yet. Is that sample too easy for Vorbis -q 1.5?

Reply #752 – 2005-12-01 06:39:25

Quote

BTW, did you try the sample I provided e.g. with Vorbis? No one has posted test results yet. Is that sample too easy for Vorbis -q 1.5?

Not yet, no. If you could provide a FLAC version of that file, I could give it a try. (I'm using a Mac.)

Reply #753 – 2005-12-01 06:52:10

Quote

Not yet, no. If you could provide a FLAC version of that file, I could give it a try. (I'm using a Mac.)

FLAC may be a more universal format. I replaced it with a FLAC version.

Edit: I also added the lossy test files. The files are not big, and Vorbis b4.5 may not be available on all platforms.

Reply #754 – 2005-12-01 07:35:31

Quote

I would like to see a strong female voice included. A jazz or opera singer who can sing loud and high would be good. Someone who can break glasses with her voice... this kind of sample might produce low bitrates and be difficult for the encoders at the same time.

Here's a sample of a strong female voice
Cæcilie Norby - Life on Mars

Reply #755 – 2005-12-01 08:32:17

2-pass encoding: why it is necessary to extract samples from an encoding of the entire original track.

wmeditor.exe is a tool provided with WMEncoder.exe. The purpose of this small application is to cut WMA files without recoding them. The tool is not extremely precise (accuracy seems to be around ½ sec, at least with VBR) but it’s enough to extract a part from a VBR encoding in order to measure the bitrate of this small part.

I encoded four samples from the latest multiformat listening test at 128 kbps with WMA VBR 2-pass, using three different methods (or environments). As you know, the bitrate allocated to one sample depends a lot on the content of the whole file.

• Method#1 = the sample was extracted from a full track encoded in 2-pass mode. The full track is the original one. The bitrate of the full track should correspond to the targeted one, but our extracted sample could differ from it, and will depend on the complexity of the sample as well as on the complexity of the track.

• Method#2 = the sample is directly encoded in 2-pass mode. The final bitrate necessarily corresponds to the targeted one (here: 128 kbps), regardless of the complexity of the sample.

• Method#3 = the sample was extracted from a full track encoded in 2-pass mode. The full track is a virtual track composed of the 18 samples used in the latest 128 kbps multiformat test, merged together. The bitrate of the full track should correspond to the targeted one, but our extracted sample could differ from it, and will depend on the complexity of the sample as well as on the complexity of the track.

In summary:
method#1 = 1st step: encoding (full & original track) – 2nd step: extraction
method#2 = 1st step: extraction – 2nd step: encoding
method#3 = 1st step: encoding (full but fake track) – 2nd step: extraction
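
To illustrate why the three methods can give different bitrates for the same excerpt, here is a minimal toy sketch in Python. It assumes a 2-pass encoder simply shares a fixed bit budget between equal-length segments in proportion to a complexity score. That is only an assumption made for illustration: the real WMA rate control is not described in this thread, and the function name and the complexity values below are invented.

```python
def two_pass_bitrates(complexities, target_kbps):
    """Toy 2-pass model: all segments have equal duration, and bits are shared
    in proportion to complexity so that the whole file averages target_kbps."""
    mean_complexity = sum(complexities) / len(complexities)
    return [target_kbps * c / mean_complexity for c in complexities]


# Hypothetical complexity scores (not measured from any real track).
sample = 1.0                                # the excerpt we want to test
original_track = [0.6, 0.8, sample, 0.7]    # method #1: excerpt inside an easier track
fake_track = [1.4, 1.6, sample, 1.5]        # method #3: excerpt merged with harder samples

method1 = two_pass_bitrates(original_track, 128)[2]  # encode full track, then extract
method2 = two_pass_bitrates([sample], 128)[0]        # extract, then encode alone
method3 = two_pass_bitrates(fake_track, 128)[2]      # encode fake track, then extract

print(f"method #1: {method1:.1f} kbps")
print(f"method #2: {method2:.1f} kbps")
print(f"method #3: {method3:.1f} kbps")
```

In this toy model the excerpt encoded alone always lands exactly on the target, while the excerpt cut from a longer file ends up above or below it depending on what surrounds it.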

BARTOK.WAV

Code:

#1 = 0:22.895    356 kb    124,39 kbps
#2 = 0:23.024    373 kb    129,60 kbps
#3 = 0:23.231    338 kb    116,40 kbps

DEBUSSY.WAV

Code:

#1 = 0:29.210    385 kb    105,44 kbps
#2 = 0:29.999    478 kb    127,47 kbps
#3 = 0:29.674    291 kb     78,45 kbps

HONGROISE.WAV

Code:

#1 = 0:29.393    490 kb    130,93 kbps
#2 = 0:30.000    478 kb    127,47 kbps
#3 = 0:29.396    326 kb     88,72 kbps

MAHLER.WAV

Code:

#1 = 0:29.767    717 kb    192,70 kbps
#2 = 0:29.999    484 kb    129,07 kbps
#3 = 0:30.058    536 kb    142,66 kbps

Code:

              #1        #2        #3
bartok      124,39    129,60    116,40
debussy     105,44    127,47     78,45
hongroise   130,93    127,47     88,72
mahler      192,70    129,07    142,66
            __________________________
average     138,37    128,40    106,56

=> method#2 and method#3 are both inaccurate. Neither corresponds to what listeners would get by encoding their own CD in VBR 2-pass mode. The third method is the worst in this case, and would handicap WMA by 30 kbps per (classical) sample!
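
For anyone who wants to redo the arithmetic, a short Python sketch follows. It assumes the kbps figures are simply size × 8 / duration with 1 kb = 1000 bytes (this reproduces the BARTOK method #1 figure), and it averages the per-sample bitrates exactly as posted in the table above. The function and variable names are mine.

```python
def kbps(size_kb, seconds):
    """Average bitrate in kbps, assuming 1 kb = 1000 bytes = 8000 bits."""
    return size_kb * 8 / seconds


# BARTOK.WAV, method #1: 356 kb in 22.895 s
bartok_m1 = kbps(356, 22.895)

# Per-sample bitrates (kbps) as posted: bartok, debussy, hongroise, mahler.
posted = {
    "#1": [124.39, 105.44, 130.93, 192.70],
    "#2": [129.60, 127.47, 127.47, 129.07],
    "#3": [116.40, 78.45, 88.72, 142.66],
}
averages = {method: sum(v) / len(v) for method, v in posted.items()}
handicap = averages["#1"] - averages["#3"]

print(f"bartok #1: {bartok_m1:.2f} kbps")
for method, avg in averages.items():
    print(f"method {method}: {avg:.2f} kbps on average")
print(f"method #3 is {handicap:.1f} kbps below method #1 on average")
```

Averaged over the four samples, method #3 comes out roughly 32 kbps below method #1, which is where the "30 kbps" figure comes from.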

Reply #756 – 2005-12-01 09:33:12

The corresponding samples (each sample encoded three times with the same setting, but in three different environments) are uploaded here.

Reply #757 – 2005-12-01 11:10:03

Quote

Quote
Quote
I'm new to this (still), so excuse this question if it is silly or if I'm jumping ahead of things, but when I try saving the clips, they all come up as "index.php". What am I doing wrong?

Which browser are you using?

IE 6.0...never ran into troubles like this with any other sample?

Did you try just left-clicking, instead of right-click -> save as?

Reply #758 – 2005-12-01 12:49:11

Quote

=> method#2 and method#3 are both inaccurate. Neither corresponds to what listeners would get by encoding their own CD in VBR 2-pass mode. The third method is the worst in this case, and would handicap WMA by 30 kbps per (classical) sample!

That's very interesting. It really shows, in black and white, that nothing other than method #1 is acceptable if 2-pass WMA std is to be tested.

Could you do a similar test for Nero's AAC, to test the relevance of ChiGung's worries?

Reply #759 – 2005-12-01 12:55:16

I could, but I need a tool¹ to losslessly cut an AAC or MP4 file. Does that exist? mp4box maybe?

¹ not too hard to use

Reply #760 – 2005-12-01 13:24:14

Updated the bitrate table with Vorbis Q 4.25: http://maresweb.de/bitrates.htm

Reply #761 – 2005-12-01 13:30:01

Quote

I could, but I need a tool¹ to losslessly cut an AAC or MP4 file. Does that exist? mp4box maybe?

¹ not too hard to use

Of course, didn't think about that...

Reply #762 – 2005-12-01 13:33:38

Quote

As said before, the encoding options should be determined by checking the bitrates over a large number of varied, complete tracks, and then kept fixed.

The samples should represent many musical genres and be difficult enough to produce audible artifacts. This is a listening test, and no one should be interested in what bitrates the individual encoded samples have. I don't think the bitrates should even be checked. The encoders should be left on their own to do the best they can with the sample material.

Naturally, the results of the preceding large-scale bitrate testing should be published.

Completely agree. The only problem I see is that the final bit rate for a codec depends heavily on the nature of the sound material: if a large “bit rate estimation” corpus consists mostly of classical music there will be one estimate, and in the case of death metal, another. There is no single final figure independent of the sound material used. I see several approaches to solving this “problem of definitions”:

1. Fill the “bit rate estimation” corpus with various genres in equal proportions. Say, classical + pop + instrumental + speech.
2. Calculate final bit rates for each genre separately (the question of how to compare them will arise).
3. Select a representative CD for each genre and use them in all future listening tests for defining bit rates. A kind of “standard meter”.
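
To make the difference between a pooled estimate and approach 1 concrete, here is a small Python sketch. All the bitrates and genre names are hypothetical numbers chosen for illustration, not measurements from any encoder.

```python
from statistics import mean

# Hypothetical per-track average bitrates (kbps) for one VBR setting.
# The corpus is deliberately unbalanced: mostly classical tracks.
corpus = {
    "classical": [112, 118, 109, 121, 115, 110],
    "pop": [134, 138],
    "metal": [149],
}

# Pooled estimate: every track counts equally, so the dominant genre wins.
pooled = mean([b for tracks in corpus.values() for b in tracks])

# Genre-balanced estimate: average each genre first, then average the genres.
per_genre = {genre: mean(tracks) for genre, tracks in corpus.items()}
balanced = mean(list(per_genre.values()))

print(f"pooled estimate:   {pooled:.1f} kbps")
print(f"balanced estimate: {balanced:.1f} kbps")
for genre, avg in per_genre.items():
    print(f"  {genre}: {avg:.1f} kbps")
```

With a corpus dominated by one genre the two estimates differ noticeably, which is exactly the "problem of definitions": the encoder setting that hits the target depends on which of these figures you decide to call the bit rate.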

Reply #763 – 2005-12-01 14:03:44

Also a few thoughts concerning the choice of test samples.

1. It seems to me that they have to be as different as possible, no matter what bit rates are achieved. If we decide to measure final bit rates on a large corpus, then there is no need to do this a second time on the selected test samples. Actually, I don’t see any point in doing this twice.
2. In this particular listening test they have to be short (15-20 sec.) and representative in the genre sense. The first is to minimize the listening fatigue of mostly untrained listeners, and the latter is to bring the test samples closer to “real world” usage. Some “difficult to encode” samples are OK as well, I suppose.

Reply #764 – 2005-12-01 14:14:09

Quote

Also a few thoughts concerning the choice of test samples.

1. It seems to me that they have to be as different as possible, no matter what bit rates are achieved. If we decide to measure final bit rates on a large corpus, then there is no need to do this a second time on the selected test samples. Actually, I don’t see any point in doing this twice.
2. In this particular listening test they have to be short (15-20 sec.) and representative in the genre sense. The first is to minimize the listening fatigue of mostly untrained listeners, and the latter is to bring the test samples closer to “real world” usage. Some “difficult to encode” samples are OK as well, I suppose.

I'd like to say that 30 secs is a good length. Often when testing at this kind of bitrate it is very difficult to find parts of a track which are distinguishable from the original, even with full length songs.

Reply #765 – 2005-12-01 14:23:51

Quote

Also a few thoughts concerning the choice of test samples.

1. It seems to me that they have to be as different as possible, no matter what bit rates are achieved. If we decide to measure final bit rates on a large corpus, then there is no need to do this a second time on the selected test samples. Actually, I don’t see any point in doing this twice.

There were complaints in the past. CBR/ABR encoders were limited to 128 kbps whereas VBR ones reached 140...145 kbps on average with the selected samples (see the first 128 kbps listening test). Some people considered the settings flawed. They assumed that VBR would sound poorer with low-bitrate samples. To limit the complaints, Roberto tried to lower the bitrate of the selected samples, and introduced one or two low-bitrate samples which were very useful in that regard. They also revealed that some VBR encoders such as LAME or MPC have serious trouble with this kind of material, which is considered "easy" to encode.

In my opinion, a listening test, especially one based on VBR encoders, shouldn't discard any samples for the simple reason that the final bitrate varies too much from the target. High variations are a part of VBR encoding, and such samples, including extreme ones, should be tested as a part of VBR.
Moreover, if all contenders are VBR encodings, I don't think that we should pay too much attention to the final bitrate of the tested samples. As long as the average bitrates of ALL CONTENDERS are very close, it doesn't matter whether the average bitrate is 100 kbps or 160 kbps. Of course, it would be nicer to see something closer to the target bitrate. Here, with a target corresponding to 135 kbps rather than 128 kbps, it wouldn't be a problem if the average bitrate of the selected samples reached 142...145 kbps.

Reply #766 – 2005-12-01 14:30:40

Quote

2. In this particular listening test they have to be short (15-20 sec.) and representative in the genre sense. The first is to minimize the listening fatigue of mostly untrained listeners, and the latter is to bring the test samples closer to “real world” usage. Some “difficult to encode” samples are OK as well, I suppose.

I agree. I prefer 5...6 sec samples. That way, everybody would rank the same musical part. That's not necessarily the case with 30 sec samples, especially if several acoustic phenomena occur during these 30 seconds.
But I know that few people share my point of view. I'm myself unable to cut 5 or 6 sec from a sample: samples don't sound musical anymore.

Reply #767 – 2005-12-01 14:31:46

Quote

Quote
Also a few thoughts concerning the choice of test samples.

1. It seems to me that they have to be as different as possible, no matter what bit rates are achieved. If we decide to measure final bit rates on a large corpus, then there is no need to do this a second time on the selected test samples. Actually, I don’t see any point in doing this twice.

There were complaints in the past. CBR/ABR encoders were limited to 128 kbps whereas VBR ones reached 140...145 kbps on average with the selected samples (see the first 128 kbps listening test). Some people considered the settings flawed. They assumed that VBR would sound poorer with low-bitrate samples. To limit the complaints, Roberto tried to lower the bitrate of the selected samples, and introduced one or two low-bitrate samples which were very useful in that regard. They also revealed that some VBR encoders such as LAME or MPC have serious trouble with this kind of material, which is considered "easy" to encode.

In my opinion, a listening test, especially one based on VBR encoders, shouldn't discard any samples for the simple reason that the final bitrate varies too much from the target. High variations are a part of VBR encoding, and such samples, including extreme ones, should be tested as a part of VBR.
Moreover, if all contenders are VBR encodings, I don't think that we should pay too much attention to the final bitrate of the tested samples. As long as the average bitrates of ALL CONTENDERS are very close, it doesn't matter whether the average bitrate is 100 kbps or 160 kbps. Of course, it would be nicer to see something closer to the target bitrate. Here, with a target corresponding to 135 kbps rather than 128 kbps, it wouldn't be a problem if the average bitrate of the selected samples reached 142...145 kbps.

I would like to go one step further, and claim that it wouldn't matter if the encoders produced very different avg bitrates for the sample corpus either, as long as they have been set to produce a similar avg bitrate for some large, varied music collection.

If that were the case, and the test were to reveal that one of the encoders with a low average was far behind the others in terms of sound quality, it would simply show that the encoder in question hadn't got its priorities right.

Edit: Maybe my reasoning is flawed. Only if you have a sample set where each VBR encoder has "highs" and "lows" (avg bitrates) can you determine which one distributes the bits in the best way...

Reply #768 – 2005-12-01 14:37:52

Quote

Quote
2. In this particular listening test they have to be short (15-20 sec.) and representative in the genre sense. The first is to minimize the listening fatigue of mostly untrained listeners, and the latter is to bring the test samples closer to “real world” usage. Some “difficult to encode” samples are OK as well, I suppose.

I agree. I prefer 5...6 sec samples. That way, everybody would rank the same musical part. That's not necessarily the case with 30 sec samples, especially if several acoustic phenomena occur during these 30 seconds.
But I know that few people share my point of view. I'm myself unable to cut 5 or 6 sec from a sample: it's not musical anymore.

I didn't think of that aspect. Good point.

Reply #769 – 2005-12-01 14:49:18

Quote

... Some people considered the settings flawed. They assumed that VBR would sound poorer with low-bitrate samples. To limit the complaints, Roberto tried to lower the bitrate of the selected samples

In the case of VBR encoding this could be translated as "They assumed that VBR would sound poorer with some other sound samples", because the actual bit rate depends entirely on the nature of the samples. That's why the most varied sound samples possible are valued in listening tests.

Quote

In my opinion, a listening test, especially one based on VBR encoders, shouldn't discard any samples for the simple reason that the final bitrate varies too much from the target. High variations are a part of VBR encoding, and such samples, including extreme ones, should be tested as a part of VBR.
Moreover, if all contenders are VBR encodings, I don't think that we should pay too much attention to the final bitrate of the tested samples. As long as the average bitrates of ALL CONTENDERS are very close, it doesn't matter whether the average bitrate is 100 kbps or 160 kbps. Of course, it would be nicer to see something closer to the target bitrate. Here, with a target corresponding to 135 kbps rather than 128 kbps, it wouldn't be a problem if the average bitrate of the selected samples reached 142...145 kbps.

Agree. So maybe, instead of trying to hit the targeted bit rate with the test samples, it's better to spend the effort on choosing 3-5 really "hard to encode" samples. The other ones could just be excerpts from different genres of Sebastian's choice.

Reply #770 – 2005-12-01 15:07:06

Quote

I agree. I prefer 5...6 sec samples. That way, everybody would rank the same musical part. That's not necessarily the case with 30 sec samples, especially if several acoustic phenomena occur during these 30 seconds.
But I know that few people share my point of view. I'm myself unable to cut 5 or 6 sec from a sample: samples don't sound musical anymore.

Yes, this is a good point. In fact this is part of a big, big question: will the results be identical for tests with whole musical works and tests with short samples from them? Please don't try to discuss this here. Here we just give a simple answer: YES. Serious flaws in some parts of a work could ruin the whole impression.

EDIT. And I prefer samples of around 10 sec. containing one difficult passage surrounded by enough material for the sample to remain a piece of music.

Reply #771 – 2005-12-01 15:30:21

Added macabre, don't let me be misunderstood, kraftwerk and bigyellow, so there are 18 samples now.

http://maresweb.de/bitrates2.htm

What do you think? I would replace one sample or another with the recently posted orchestral samples - maybe even macabre since the other two are more dynamic IMO.

Reply #772 – 2005-12-01 15:34:35

Two more things... Seeing that the bitrate is around 140 kbps, I think I can tell Vorbis to encode at 4.2 or 4.25 (what do you guys think - which one to go with?).
Also, I will cut the voice from Elizabeth since it's a bit "confusing".

Reply #773 – 2005-12-01 15:41:56

Quote

I still fail to see why you call it "questioned". Were Roberto's or Guru's tests questioned because samples were encoded instead of full tracks?

maybe at the time no one was aware of these problems. it doesn't mean they were wrong, just that they can be better.

Quote

=> method#2 and method#3 are both inaccurate. Neither corresponds to what listeners would get by encoding their own CD in VBR 2-pass mode. The third method is the worst in this case, and would handicap WMA by 30 kbps per (classical) sample!

you have guru's analysis here to validate my thinking.

Quote

You could conduct a test yourself. I believe Argentinian law is much nicer than German law, in that aspect.

if somebody wants to send me the FLACs, i would do it myself. anyway, are you sure the 30 sec samples are legal?

Reply #774 – 2005-12-01 15:53:57

Quote

Quote
You could conduct a test yourself. I believe Argentinian law is much nicer than German law, in that aspect.

if somebody wants to send me the FLACs, i would do it myself. anyway, are you sure the 30 sec samples are legal?

No idea. They're better than full songs, though.

Anyways, encoding full tracks might be interesting for 2-pass or ABR encoders only since AFAIK, VBR encoders will not encode differently (maybe only minor deviations, but nothing to care about). And trying to obtain full songs is pretty much a pain only because WMA doesn't offer a "reliable" VBR option.
Nero might also be handicapped a bit, but that was the decision of the developers - I could've used VBR.
