Unusual Sample For Lossy Codecs
MrDrew
post Dec 17 2001, 07:27
Post #1





Group: Members
Posts: 32
Joined: 3-October 01
From: USA, Michigan
Member No.: 177



I was fooling around a bit in Cool Edit Pro, and I generated a completely synthetic test sample. Basically I was curious about the behavior of lossy codecs given certain sonic conditions. Here it is:

http://people.mw.mediaone.net/lklenk/music/headache.pac

Yeah... I was bored. :D This sample behaves in interesting ways with (some) lossy codecs. MP3 generally had a tough time with it, with the exception of Dibrom's "Insane" preset. Any thoughts or comments?

NOTE: I made this sample purely out of boredom and curiosity. I realize that no real-world music sounds REMOTELY like this, and I recognize that behavior with this sample is entirely uncharacteristic for most lossy codecs (with maybe the exception of the Xing MP3 codecs :D ).
NeoRenegade
post Dec 18 2001, 15:42
Post #2





Group: Members
Posts: 723
Joined: 29-November 01
Member No.: 563



Nice... sounds like "The THX Sound" on helium. It should make a lovely test sample :)

<Edit>I don't blame it, though. At 1.18MB compared to 1.69MB for a copy of it in PCM-WAV, it seems it gives LPAC a little bit of trouble too :D</Edit>
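
For reference, the two sizes quoted here work out as follows. This is only a quick sketch using the figures from the post (1.18 MB for the LPAC file, 1.69 MB for the PCM WAV); the variable names are illustrative.

```python
# Sizes as quoted in the post, in megabytes.
lpac_mb = 1.18
wav_mb = 1.69

ratio = lpac_mb / wav_mb   # fraction of the original size that remains
savings = 1.0 - ratio      # fraction removed by the lossless coder

print(f"LPAC is {ratio:.1%} of the WAV size ({savings:.1%} saved)")
```

So the lossless coder only removes roughly 30% of the data, which suggests the signal leaves it relatively little redundancy to exploit.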
MrDrew
post Dec 19 2001, 03:28
Post #3





Hey Dibrom (and others)... any thoughts on this sample? It's quite a workout for lossy encoders... lol! Seriously though, I posted this particular sample in the hope that it might (somehow) help develop LAME. Can anyone stand the sample (aptly named "headache" :P ) long enough to do some listening tests? The few tests I did yielded interesting results.
Dibrom
post Dec 19 2001, 04:00
Post #4


Founder


Group: Admin
Posts: 2958
Joined: 26-August 02
From: Nottingham, UK
Member No.: 1



I'll have to give it a listen; sorry I haven't been able to yet.

If it is a very artificial signal, it may not be too surprising that it would trip up LAME. It certainly would be nice to improve the encoding quality of a sample like this, but that could prove very difficult. I think that, where LAME is right now, on these more artificial samples we are unfortunately running more into the limitations of the MP3 spec itself than anything else. I believe there is still some room for tuning better short block switching criteria, which could maybe eliminate certain artifacts from a clip like drone, but there are probably still many areas which are "unfixable".

Having said that, I'll check out the sample and tell you what I think :)
Dibrom
post Dec 19 2001, 04:08
Post #5


Initial listen:

Fairly interesting sample... --alt-preset standard actually doesn't do that badly. From a listen before encoding I thought it might come out sounding horrible, but it doesn't. It's not completely transparent either, though. I think part of the reason is that there are still some areas where LAME's masking calculation fails, and a sample like this may expose some of that. In particular, I think LAME's spreading function has lots of room for improvement (using a non-linear method instead of the current linear one, and using more sensitive listeners on harder samples to tune it). An adaptive lowpass could maybe help some here too.
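
For readers unfamiliar with the term, a spreading function describes how far a masker's effect reaches into neighbouring critical bands. The following is a minimal sketch using the textbook Schroeder formula. It is not LAME's actual code, and the function name and the sampled distances are only illustrative.

```python
import math

def schroeder_spread_db(dz_bark):
    """Schroeder spreading function.

    dz_bark is the distance in Bark from the masker to the masked band
    (positive = masked band lies above the masker). The result is the
    attenuation of the masking effect in dB relative to the masker's band.
    """
    x = dz_bark + 0.474
    return 15.81 + 7.5 * x - 17.5 * math.sqrt(1.0 + x * x)

# Sample the curve a few Bark either side of the masker.
for dz in (-2, -1, 0, 1, 2, 3):
    print(f"{dz:+d} Bark: {schroeder_spread_db(dz):7.2f} dB")
```

Note that this textbook curve has the same shape regardless of how loud the masker is, whereas masking in real hearing is generally described as spreading further upward in frequency as the level rises. That may be the kind of refinement meant here, although the post does not spell it out.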
Dibrom
post Dec 19 2001, 04:22
Post #6


More results:

MP3 and Vorbis seem to have problems with this sample because the background noise right at the beginning is not encoded properly. MP3 doesn't encode it to a high enough frequency and has a few small audible dropouts (due to not encoding enough of the noise high enough in frequency) right before the noise becomes inaudible. Vorbis (RC3 -b192) encodes it high enough, but stops encoding the noise too abruptly and too early. MPC seems to handle this properly, though, encoding enough noise for long enough to prevent the artifacts. I haven't tested AAC yet. So this seems to be mostly a masking-related issue, which is what I suspected in my initial listen when I mentioned the spreading function.
MrDrew
post Dec 19 2001, 05:40
Post #7





Thanks for the reply, Dibrom! As for this sample and the "alt presets", I hear and confirm the masking problem with the "standard" and "extreme" presets (i.e. the background noise present in the original wav is mostly gone or significantly changed). The "insane" preset seems to encode fine (I can't tell the encoded mp3 apart from the original wav). What... besides the bitrate, could cause the "insane" preset to encode SIGNIFICANTLY better than "extreme" or "standard?"
Dibrom
post Dec 19 2001, 05:42
Post #8


QUOTE
Originally posted by MrDrew
What... besides the bitrate, could cause the "insane" preset to encode SIGNIFICANTLY better than "extreme" or "standard?"


Nothing really. More bits == brute force approach to encoding more of the original audio :). The problem is that LAME (and Vorbis to some extent) apparently underestimates the audibility of certain frequencies in this case (masking issues) and so doesn't spend enough bits on them, or purposely does not encode them, thinking they are inaudible and targets for bitrate savings. More bits will naturally "snow this over", so to speak.

I believe if you use a higher V value, or a lower ATH here, you might get more of that background noise encoded with LAME, but obviously that will throw everything else out of whack.
MrDrew
post Dec 19 2001, 06:18
Post #9





Dibrom... I tried everything you suggested to make this sample sound better, and other than using the "insane" preset (to force a high bit rate), the only thing that seems to help is forcing all short blocks (via --allshort). Adjusting "-V x" and "-q x" made little difference. I tried --notemp as well as --noath (even both in combination) and the situation was only marginally improved. If this were truly a masking and / or ATH issue, shouldn't the --notemp and / or --noath switches have made a noticeable difference?
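
A comparison like this can be organised as a small test matrix. Here is a minimal sketch that only builds the command lines and does not run anything. The switches are the ones named in this thread; they belong to LAME builds of that era and may not exist in current releases. The file names are placeholders, and pairing each extra switch with --alt-preset standard is an assumption rather than something the post states.

```python
import shlex

LAME = "lame"              # assumed to be on PATH
SOURCE = "headache.wav"    # hypothetical name for the decoded sample

# Switch combinations mentioned in the thread. Combining the extra switches
# with --alt-preset standard is an assumption made for illustration.
VARIANTS = {
    "standard": ["--alt-preset", "standard"],
    "extreme": ["--alt-preset", "extreme"],
    "insane": ["--alt-preset", "insane"],
    "standard_allshort": ["--alt-preset", "standard", "--allshort"],
    "standard_notemp": ["--alt-preset", "standard", "--notemp"],
    "standard_noath": ["--alt-preset", "standard", "--noath"],
    "standard_notemp_noath": ["--alt-preset", "standard", "--notemp", "--noath"],
}

def build_commands(source=SOURCE):
    """Return {label: argv list}, writing one output file per variant."""
    return {
        label: [LAME, *switches, source, f"headache_{label}.mp3"]
        for label, switches in VARIANTS.items()
    }

# Print the commands so they can be reviewed before anything is encoded.
for label, argv in build_commands().items():
    print(shlex.join(argv))
```

Giving each output file its own label makes it easier to keep the variants apart in a later listening test.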
Dibrom
post Dec 19 2001, 06:29
Post #10


QUOTE
Originally posted by MrDrew
Dibrom... I tried everything you suggested to make this sample sound better, and other than using the "insane" preset (to force a high bit rate), the only thing that seems to help is forcing all short blocks (via --allshort). Adjusting "-V x" and "-q x" made little difference. I tried --notemp as well as --noath (even both in combination) and the situation was only marginally improved. If this were truly a masking and / or ATH issue, shouldn't the --notemp and / or --noath switches have made a noticeable difference?


--notemp is not really going to have an impact on this. The temporal masking in LAME right now doesn't do much anyway; it's pretty basic and usually saves hardly more than 2 or 3 kbps in my experience. There are also no real attacks in this clip, which should make it have even less of an effect than usual.

The ATH is only one part of the masking picture; it is not the only thing used as a guideline, so disabling the ATH and not seeing much improvement is not particularly surprising. I believe the problem is still mostly related to the spreading function and other areas.
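
To illustrate why disabling the ATH alone may change little, here is a minimal sketch. It assumes a simplified textbook model in which the effective threshold in a band is the larger of the absolute threshold and the masked threshold. The ATH curve is Terhardt's well-known approximation, not LAME's exact implementation, and the masked-threshold value is a made-up number.

```python
import math

def ath_db_spl(freq_hz):
    """Terhardt's approximation of the absolute threshold of hearing (dB SPL)."""
    f = freq_hz / 1000.0  # formula is defined in kHz
    return (3.64 * f ** -0.8
            - 6.5 * math.exp(-0.6 * (f - 3.3) ** 2)
            + 1e-3 * f ** 4)

def effective_threshold_db(freq_hz, masked_db, use_ath=True):
    """Level below which this toy model treats noise as inaudible."""
    if not use_ath:
        return masked_db
    return max(ath_db_spl(freq_hz), masked_db)

masked = 35.0  # hypothetical masked threshold produced by the psymodel
for f in (1000, 4000, 12000, 16000):
    with_ath = effective_threshold_db(f, masked)
    without_ath = effective_threshold_db(f, masked, use_ath=False)
    print(f"{f:>6} Hz  ATH {ath_db_spl(f):6.1f}  "
          f"with ATH {with_ath:5.1f}  without ATH {without_ath:5.1f}")
```

In this toy model, switching the ATH off only matters in bands where the ATH exceeds the masked threshold. Everywhere else the masking estimate decides, so an inaccurate masking calculation is not corrected by --noath.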

The -V settings only serve to adjust the masking used in one direction or the other, but if the masking is not calculated correctly in the first place, it may not matter.

-q doesn't really change much here. -q1 and 0 don't really do much except enable certain features which are supposed to offer possibly better compression. -q1 enables a new experimental noise shaping type which is much slower than type 2, and which does not seem to offer improvements; in fact it is more aggressive, leaving more room for error. Encoding is, I believe, about 40% slower, and the bitrate savings are on the order of 2-3 kbps.

If you used --allshort with --alt-preset standard, the reason this encoded better is that on short blocks, --alt-preset standard uses a much more aggressive "best quantization" selection, which helps encode attacks more accurately and keeps pre-echo down. This probably translated into more aggressive encoding of the background noise too, but in actuality it should be unrelated to the block type used; it is more a side effect of other behavior which --alt-preset standard switches to on short blocks. Also, enabling this behavior everywhere else would lead to massive bitrates, so it isn't practical.

Maybe some of that can clear a few things up.
MrDrew
post Dec 19 2001, 06:38
Post #11





Hmm... so it sounds like there isn't any one thing that can be done to fix problems in this sample at this time (short of raising the bit rate and / or modifying the spreading function somehow). I wonder if anything can be realistically done in the future to help fix issues with this sample, or could the problems here be unsolvable because of limitations in mp3? In any case, thank you Dibrom for spending so much time answering all my questions. Your input is much appreciated!
Dibrom
post Dec 19 2001, 06:42
Post #12


Yeah... there's probably no "easy" fix for this at the moment. In fact, a lot of the stuff I have been "fixing" with the --alt-presets for some time now has been possible only with code level modifications. The limitation for improvement via switch combinations was hit a long time ago I think.

I'm quite certain this problem could be fixed with a better psymodel, but whether or not that will ever happen remains to be seen. There are loads of areas for improvement in LAME's current psymodel, but nobody to really implement them, either due to a lack of time, knowledge, or both, unfortunately.
JohnV
post Dec 19 2001, 13:11
Post #13





Group: Developer
Posts: 2797
Joined: 22-September 01
Member No.: 6



QUOTE
Originally posted by Dibrom
-q1 and 0 don't really do much except enable certain features which are supposed to offer possibly better compression.  -q1 enables a new experimental noise shaping type which is much slower than type 2, and which does not seem to offer improvements..

Hmm, at least some time ago the experimental noise shaping (by Takehiro) was enabled with -q0. Has this been changed?


--------------------
Juha Laaksonheimo