Multiformat 48 kbps Listening Test, Pre-Test Discussion, Take 2 - And Action!
Sebastian Mares
post Nov 5 2006, 18:20
Post #126





Group: Members
Posts: 3629
Joined: 14-May 03
From: Bad Herrenalb
Member No.: 6613



Well, if the differences are only marginal and don't affect quality, I guess we should go with WMCmd.vbs because it can be used for batch encoding. WMP does not encode to Q10 WMA Standard. I could also use Winamp, but I just uninstalled it again. :P


--------------------
http://listening-tests.hydrogenaudio.org/sebastian/
Sebastian Mares
post Nov 5 2006, 23:10
Post #127





Group: Members
Posts: 3629
Joined: 14-May 03
From: Bad Herrenalb
Member No.: 6613



I was talking with Roberto the other day about the problems of testing WMA 2-pass VBR and wondered about one thing: is only 2-pass VBR affected by the issue I described here, or does it actually affect all VBR modes? I therefore asked both Ivan and Gabriel how their VBR implementations work, and whether it is true that "free" VBR will always allocate the same number of bits to a given sample, regardless of whether it is encoded as part of a full song or as-is, as an already extracted part of a track. Ivan confirmed my initial thoughts, with Nero producing two more or less identical encodes, but Gabriel said this is not the case with LAME. He explained that LAME uses a variable ATH level whose value is based on the previous loudness. Therefore, encoding a full track is not the same as encoding a sample: even with VBR, the sample encoded as-is will not be the same as the sample encoded from the whole track.
I am now wondering how big the effect is. Does this "news" render all previous sample-based listening tests useless with regard to LAME?

This post has been edited by Sebastian Mares: Nov 5 2006, 23:12


--------------------
http://listening-tests.hydrogenaudio.org/sebastian/
guruboolez
post Nov 5 2006, 23:34
Post #128





Group: Members (Donating)
Posts: 3474
Joined: 7-November 01
From: Strasbourg (France)
Member No.: 420



That is why Gabriel suggested two years ago (and has reminded us of it from time to time) that testers should discard the first one or two seconds of the tested files.
And if I remember correctly, this was done for the last listening tests (an option in ABC/HR allows it).

It needs to be confirmed by Gabriel anyway.

This post has been edited by guruboolez: Nov 5 2006, 23:37
Sebastian Mares
post Nov 5 2006, 23:38
Post #129





Group: Members
Posts: 3629
Joined: 14-May 03
From: Bad Herrenalb
Member No.: 6613



OK, so it's not something that affects the whole encode, but only the first few samples.


--------------------
http://listening-tests.hydrogenaudio.org/sebastian/
Ivan Dimkovic
post Nov 6 2006, 10:34
Post #130


Nero MPEG4 developer


Group: Developer
Posts: 1466
Joined: 22-September 01
Member No.: 8



@Sebastian,

I would not treat LAME's variable ATH as such a problem for the listening test. The fact is that many psychoacoustic models take previous samples into account, and it is not just variable ATH.

For example, there is the temporal post-masking phenomenon, which would create different bit distributions for a given sample based on the loudness of the samples in the past. However, this phenomenon is very local in time: its maximum duration is approx. 200 ms (unless the encoder is buggy).

Also, some encoders use time-domain methods to estimate the tonality of the signal. For example, if the masker has behaved unpredictably in the time domain in the past, the encoder might judge the masker as being "noisy", and this can mean a difference of up to 20+ dB in the masker power.

Additionally, with SBR you might get slightly different results, as there is usually a small "SBR Reset" flag sent every second or so (depending on the encoder). The difference between two encodings of the same sample located in different regions is not big either, but it is definitely there.

Etc..

These are just a few factors that might render samples encoded with different quantization resolution depending on the past samples. However, all of these differences IMHO are not so relevant for a listening test.

I think just adding 2-3 seconds of "run-in" is more than enough to make a fair test.
Alex B
post Nov 6 2006, 11:24
Post #131





Group: Members
Posts: 1303
Joined: 14-September 05
From: Helsinki, Finland
Member No.: 24472



QUOTE
I think just adding 2-3 seconds of "run-in" is more than enough to make a fair test.

Isn't cutting the first two seconds off in the ABC/HR options the usual practice?

However, this may be a problem with very short samples, or with samples that start with an audio signal meant to demonstrate a specific problem. Here is an example of such a sample: http://www.hydrogenaudio.org/forums/index....st&p=420360

The first two or three seconds seem to be problematic for all MP3 encoders at about 128 kbps. The sample is also from the very beginning of a real audio track so it is not artificial.

Perhaps a few seconds of some PCM material could be added before the sample, but should this be digital silence or some average audio material? Would a few seconds of silence make the encoder behave differently when the real sample starts? If the sudden signal change alters the encoding result, we would need to know what the encoder's "default" is before it starts adjusting its parameters and, if possible, use an audio signal that would not change this default.

Edit

Naturally it is possible to decode the sample and add an audio signal after that. The only downside would be the larger file size of the lossless test sample.

This post has been edited by Alex B: Nov 6 2006, 11:34


--------------------
http://listening-tests.freetzi.com
Gabriel
post Nov 6 2006, 11:40
Post #132


LAME developer


Group: Developer
Posts: 2950
Joined: 1-October 01
From: Nanterre, France
Member No.: 138



Most modern audio and video encoders will produce different results based on previous samples. It might be because of detection methods (predictability, ATH level,...) or because the encoder is "learning" (mostly video encoders).

In both cases, discarding a few seconds at the start (the discarded data being similar to the tested range, i.e. no "scene cut") is enough to compensate for this behaviour.

In the Java ABC/HR, up to now, we have had to trick it by adjusting the "sample delay" by 2 seconds. (It would be nice to be able to specify a testing range instead of this hack.)
Alex B
post Nov 6 2006, 12:06
Post #133





Group: Members
Posts: 1303
Joined: 14-September 05
From: Helsinki, Finland
Member No.: 24472



So the correct approach for my example sample would be to encode it as it is (since it is from the beginning of a real audio track), decode it and add at least two seconds of digital silence in the beginning.

If some other "too short" sample is from the middle of the audio track, a longer passage of the same track should be encoded. At least it should start more than two seconds before the intended sample starting point.*

Edit:

*If preferred, this type of encoded sample can be cut to the intended length after decoding. In this case, at least two seconds of silence must be added at the beginning if the sample is going to be used with the two-second Java ABC/HR delay setting.

This post has been edited by Alex B: Nov 6 2006, 12:27


--------------------
http://listening-tests.freetzi.com
Gabriel
post Nov 6 2006, 12:28
Post #134


LAME developer


Group: Developer
Posts: 2950
Joined: 1-October 01
From: Nanterre, France
Member No.: 138



QUOTE (Alex B @ Nov 6 2006, 13:06) *
So the correct approach for my example sample would be to encode it as it is (since it is from the beginning of a real audio track), decode it and add at least two seconds of digital silence in the beginning.

No.

You encode it as it is, and do not test the first 2 seconds

or

You add two seconds of something at the beginning, encode it, and do not test the first 2 seconds.

(first solution is highly preferable)
Alex B
post Nov 6 2006, 12:38
Post #135





Group: Members
Posts: 1303
Joined: 14-September 05
From: Helsinki, Finland
Member No.: 24472



QUOTE (Gabriel @ Nov 6 2006, 13:28) *
QUOTE (Alex B @ Nov 6 2006, 13:06) *

So the correct approach for my example sample would be to encode it as it is (since it is from the beginning of a real audio track), decode it and add at least two seconds of digital silence in the beginning.

No.

You encode it as it is, and do not test the first 2 seconds

or

You add two seconds of something at the beginning, encode it, and do not test the first 2 seconds.

(first solution is highly preferable)


The example sample demonstrates a problem in the first few seconds of the track. It represents a real life situation. Just try for example the L3enc version I uploaded. The guitar chords in the very beginning are very bad.

I am not removing the first two seconds when I listen to this track outside a listening test.


--------------------
http://listening-tests.freetzi.com
Alex B
post Nov 6 2006, 14:19
Post #136





Group: Members
Posts: 1303
Joined: 14-September 05
From: Helsinki, Finland
Member No.: 24472



Out of curiosity, I tried the first three seconds of this AC/DC sample with aoTuV b5 @ -q-1, Nero AAC @ ABR 48kbps and l3enc MP3 @ 128 kbps.

Foobar ABX result was 10/10 for all three when compared with the reference.

In my opinion Vorbis and l3enc produced unusable quality. Nero AAC was much better, I would say "slightly annoying".


Edit: I used "-br 48000" with Nero Digital cl encoder v. 1.0.0.2.

This post has been edited by Alex B: Nov 6 2006, 14:47


--------------------
http://listening-tests.freetzi.com
Sebastian Mares
post Nov 6 2006, 16:46
Post #137





Group: Members
Posts: 3629
Joined: 14-May 03
From: Bad Herrenalb
Member No.: 6613



Ivan, do you still recommend ABR, or is it OK if VBR is used?

Does anyone mind the following settings:

Ogg Vorbis: AoTuV AO; aoTuV b5 [20061024] (based on Xiph.Org's libVorbis): q-1.0

Nero HE-AAC: Nero AAC codec / May 1 2006: VBR, Q0.20

WMA Standard: Windows Media Audio 9.2: VBR Quality 10, 44 kHz, stereo 1-pass VBR

WMA Professional: Windows Media Audio 10 Professional: 48 kbps, 44 kHz, 2 channel 16 bit 1-pass CBR

The settings were chosen so that all encoders reach more or less the same bitrate with my material. Bitrate tables are welcome.

Edit: WMA Professional will reach 48 kbps with all material because it encodes with CBR. The other encoders produce ~50 kbps.

If the developers and the majority of the community agree with this, I suggest we start discussing samples. Should we use some samples from the HE-AAC test? I also have some files I would like to post (in case I haven't already), like a Vangelis and a Uriah Heep one.

This post has been edited by Sebastian Mares: Nov 19 2006, 19:12


--------------------
http://listening-tests.hydrogenaudio.org/sebastian/
Ivan Dimkovic
post Nov 6 2006, 17:08
Post #138


Nero MPEG4 developer


Group: Developer
Posts: 1466
Joined: 22-September 01
Member No.: 8



QUOTE
Ivan, do you still recommend ABR, or is it OK if VBR is used?


I'm fine with both - ABR should provide less quality deviation, but VBR should score a bit higher on average.

Up to you guys.
Sebastian Mares
post Nov 6 2006, 17:36
Post #139





Group: Members
Posts: 3629
Joined: 14-May 03
From: Bad Herrenalb
Member No.: 6613



Sorry, but I am afraid I did not understand. What do you mean by "ABR should provide less quality deviation"?


--------------------
http://listening-tests.hydrogenaudio.org/sebastian/
Ivan Dimkovic
post Nov 6 2006, 18:13
Post #140


Nero MPEG4 developer


Group: Developer
Posts: 1466
Joined: 22-September 01
Member No.: 8



I meant that ABR quality (subjective grade) is more consistent, with "shorter" confidence intervals than VBR at that bitrate.

This is because VBR mode could undercode some samples, and they would then sound slightly worse than when coded in ABR mode.

On average, however, VBR is indeed a bit better.
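
Ivan's point about consistency versus average can be expressed with two numbers per encoder: the mean grade and the width of its confidence interval. The sketch below is only an illustration in Python: the function names are hypothetical, and the normal-approximation interval (z = 1.96) is a simplification, not the statistical analysis actually used for ABC/HR results.

```python
import statistics


def mean_grade(grades):
    """Average subjective grade over all rated samples."""
    return statistics.mean(grades)


def ci_half_width(grades, z=1.96):
    """Half-width of an approximate 95% confidence interval for the mean.

    Assumption: grades are treated as independent and roughly normal,
    which is a crude model for listening-test scores.
    """
    return z * statistics.stdev(grades) / len(grades) ** 0.5
```

An encoder mode that occasionally undercodes a sample would show up here as a wider interval, even if its mean grade is slightly higher.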

This post has been edited by Ivan Dimkovic: Nov 6 2006, 18:17
Sebastian Mares
post Nov 6 2006, 18:35
Post #141





Group: Members
Posts: 3629
Joined: 14-May 03
From: Bad Herrenalb
Member No.: 6613



QUOTE (Gabriel @ Nov 6 2006, 12:28) *
You encode it as it is, and do not test the first 2 seconds

or

You add two seconds of something at the beginning, encode it, and do not test the first 2 seconds.

(first solution is highly preferable)


Gabriel, but what if a song doesn't start by "fading in", but starts abruptly, as Alex B pointed out with the AC/DC sample?


--------------------
http://listening-tests.hydrogenaudio.org/sebastian/
Sebastian Mares
post Nov 6 2006, 18:45
Post #142





Group: Members
Posts: 3629
Joined: 14-May 03
From: Bad Herrenalb
Member No.: 6613



How many samples should we use, 12?


--------------------
http://listening-tests.hydrogenaudio.org/sebastian/
benski
post Nov 7 2006, 00:13
Post #143


Winamp Developer


Group: Developer
Posts: 670
Joined: 17-July 05
From: Brooklyn, NY
Member No.: 23375



QUOTE (Sebastian Mares @ Nov 6 2006, 10:46) *
Ivan, do you still recommend ABR, or is it OK if VBR is used?


Whatever mode was used in the HE-AAC listening test should be used for this test, also.
IgorC
post Nov 7 2006, 00:18
Post #144





Group: Members
Posts: 1553
Joined: 3-January 05
From: ARG/RUS
Member No.: 18803



QUOTE (Sebastian Mares @ Nov 6 2006, 09:45) *
How many samples should we use, 12?

Last time there were 18 samples in the multi-AAC test. Now it's a multi-codec test, so more people should be interested in it. 18-20 samples?
Sebastian Mares
post Nov 7 2006, 09:37
Post #145





Group: Members
Posts: 3629
Joined: 14-May 03
From: Bad Herrenalb
Member No.: 6613



Well, I think 18 samples is the maximum.


--------------------
http://listening-tests.hydrogenaudio.org/sebastian/
Gabriel
post Nov 7 2006, 10:24
Post #146


LAME developer


Group: Developer
Posts: 2950
Joined: 1-October 01
From: Nanterre, France
Member No.: 138



QUOTE (Sebastian Mares @ Nov 6 2006, 18:36) *
Sorry, but I am afraid I did not understand. What do you mean by "ABR should provide less quality deviation"?

What Ivan is saying is that he's not totally confident in his VBR mode ;-)

Full VBR is a matter of trusting your psymodel, which most of the time is not perfect. If your codec is efficient enough compared to competitors, it's usually safer to rely on ABR (ie VBR is not worth the risk if you are good enough).
(now you know why iTunes is ABR and not fully VBR, and why it is recommended to use Lame in VBR)


QUOTE (Sebastian Mares @ Nov 6 2006, 19:35) *
Gabriel, but what if a song doesn't start "fading in" but like Alex B pointed out with the AC/DC sample?

If you really want to test the start of your sample, you would have two choices:

*re-rip the samples with 2 extra seconds at the beginning
*add 2 seconds of silence at the start of the sample
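
For the second option, here is a rough sketch of padding a WAV file with digital silence using Python's standard `wave` module. The function name and the file paths are hypothetical, and it assumes uncompressed PCM input; whichever padding is used, the padded part still has to be excluded from the tested range.

```python
import wave


def prepend_silence(src_path, dst_path, seconds=2.0):
    """Write a copy of a PCM WAV file with `seconds` of silence in front.

    Returns the number of silent frames that were added.
    """
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        frames = src.readframes(src.getnframes())

    pad_frames = int(round(seconds * params.framerate))
    # Signed PCM (16-bit and wider) is silent at zero; 8-bit WAV is
    # unsigned, so its silence level is 0x80.
    silence_byte = b"\x80" if params.sampwidth == 1 else b"\x00"
    silence = silence_byte * (pad_frames * params.nchannels * params.sampwidth)

    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)
        # The frame count in the header is corrected when the file is closed.
        dst.writeframes(silence + frames)
    return pad_frames
```

Note that this only covers the padding step. As discussed above, a sudden jump from silence to signal may itself influence the encoder, which is why re-ripping with two extra seconds of real audio is listed first.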
Ivan Dimkovic
post Nov 7 2006, 12:46
Post #147


Nero MPEG4 developer


Group: Developer
Posts: 1466
Joined: 22-September 01
Member No.: 8



QUOTE
Full VBR is a matter of trusting your psymodel, which most of the time is not perfect. If your codec is efficient enough compared to competitors, it's usually safer to rely on ABR (ie VBR is not worth the risk if you are good enough).


Actually,

Looking here:

http://www.hydrogenaudio.org/forums/index....showtopic=41191

It looked like Nero VBR @48 kbits/s was just a bit better than ABR.

However, at such a low bitrate I don't believe there are big benefits to using true VBR: there is not much room to scale the bitrate down before the sound starts to degrade a lot, which means there won't be room to scale it up either, should the need arise.

So, ABR should do just fine.
Sebastian Mares
post Nov 7 2006, 13:52
Post #148





Group: Members
Posts: 3629
Joined: 14-May 03
From: Bad Herrenalb
Member No.: 6613



OK, ABR for Nero then. If everything else is fine, we should focus on samples now.

This post has been edited by Sebastian Mares: Nov 9 2006, 07:23


--------------------
http://listening-tests.hydrogenaudio.org/sebastian/
Alex B
post Nov 7 2006, 14:18
Post #149





Group: Members
Posts: 1303
Joined: 14-September 05
From: Helsinki, Finland
Member No.: 24472



We should remember that in the test we should use a setting that should produce the best average quality with various complete audio tracks. So if Ivan recommends ABR to users who are going to encode a complete audio library at about 48 kbps then it should be used.

If the recommendation is VBR, then it should be tested even if a certain set of selected test samples would possibly result in slightly better quality in ABR mode... *

Edit

* ... or when the ABR mode would be a safer choice for winning this particular test, like Gabriel explained.

This post has been edited by Alex B: Nov 7 2006, 14:28


--------------------
http://listening-tests.freetzi.com
Alex B
post Nov 12 2006, 13:06
Post #150





Group: Members
Posts: 1303
Joined: 14-September 05
From: Helsinki, Finland
Member No.: 24472



Here's a bitrate table and graph in Excel format. I used my usual set of 25 various full length tracks:

bitrates_48kbps_test.xls

Average bitrates:

Nero Digital 1.0.0.2 -br 48000 => 48 kbps
Nero Digital 1.0.0.2 -q 0.21 => 50 kbps
Nero Digital 1.0.0.2 -q 0.20 => 48 kbps
WMA 10 Pro CBR 48 kbps => 48 kbps
WMA 9.2 standard VBR10 => 47 kbps
Vorbis aoTuV beta 5 -q -1 => 49 kbps
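
For anyone who wants to build a similar table, the arithmetic is simple: average bitrate is file size in bits divided by duration. A minimal Python sketch follows. The function names are made up, and it is an assumption that the figures above were derived this way (the spreadsheet may have used the bitrate reported by a player instead). If the whole file size is used, container overhead is included in the result.

```python
def average_bitrate_kbps(size_bytes, duration_seconds):
    """Average bitrate of one encoded file in kbps (1 kbps = 1000 bit/s)."""
    return size_bytes * 8 / duration_seconds / 1000


def collection_bitrate_kbps(tracks):
    """Duration-weighted average bitrate over (size_bytes, seconds) pairs.

    Weighting by duration is a design choice: total bits over total time,
    rather than the plain mean of the per-track bitrates.
    """
    total_bytes = sum(size for size, _ in tracks)
    total_seconds = sum(seconds for _, seconds in tracks)
    return average_bitrate_kbps(total_bytes, total_seconds)
```

For example, a 1,440,000-byte file lasting 240 seconds works out to 48 kbps. The two averaging methods can differ noticeably when track lengths vary a lot, so it is worth stating which one a table uses.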


Some of you may find the following screenshot interesting too. Some track peaks of my test file set, starting from the highest peak:



Any comments?


EDIT

I tested Nero -q 0.2 and changed Nero -q 0.205 to -q 0.2 since it is the selected test option (it was: Nero -q 0.205 => 49 kbps). Also the linked Excel file is updated.

This post has been edited by Alex B: Nov 22 2006, 18:42


--------------------
http://listening-tests.freetzi.com