a response to a growing rumor...
post Feb 12 2002, 00:36
Normally I wouldn't attempt to address an issue in this manner, but since it is getting a bit out of hand, and usually on boards I'm not participating in (or have little desire to participate in), I'll try and address it officially, once, in the place where it should be the most relevant.

The matter I'm discussing is related to the --alt-presets and their handling of the "stereo image".

There have been some completely unsubstantiated reports and rampant speculation going on in a few threads which I will list below:

2. http://www.digital-inn.de/showthread.php?threadid=8212
3. http://www.hydrogenaudio.org/forums/showth...s=&threadid=759 (I simply hadn't gotten around to responding to this thread, though it's on this board).

At any rate, I'll try to make a few points as clearly as I can.

1. All of the --alt-preset VBR modes are tuned for "stereo image".

2. All of the VBR presets provide better sound quality via joint stereo than LAME on its own with joint stereo, and in some cases should even sound better than with --nssafejoint, while at the same time providing a lower bitrate.

3. The --alt-presets do not, by design, make any sacrifice in regards to stereo image to keep bitrate down. Anyone who tells you this has no idea what they are talking about. I should know since I actually wrote the code and designed the presets.

4. An extremely high proportion of stereo frames is not always needed to achieve good sound quality. I challenge anyone who believes that --alt-preset standard commonly has poor stereo separation (as a few unsubstantiated claims imply) to provide me with direct evidence of this.

5. Joint stereo is needed even at bitrates of 320kbps to achieve the best sound quality in some critical cases. Forcing stereo on everything up to 320kbps and then forcing joint stereo does not fix the problem (as a user implies in one of those threads). I've tried this before.

6. There seems to be a misconception that all the --alt-presets improve on is pre-echo. This is sorely mistaken. Indeed they do improve on pre-echo and impulse handling to a fairly large degree, but they also improve upon:

- joint stereo handling (serioustrouble is a prime example)
- dropout prevention (2nd_vent_clip is a prime example)
- fluttering (gekkou is a prime example)
- knocking (velvet is a prime example)
- ringing (bloodline is a prime example)
- noise pumping (piano, rach_original, etc, are examples)
- rasping (present with noise shaping 2 on some clips like fatboy, or on clean vocals sometimes. Mostly eliminated, even on the most critical samples, with --alt-presets)

And that's just the stuff I can think of off the top of my head.

Now, that's not to say the --alt-presets are perfect. I certainly know they aren't. But they also don't have some massive flaw in regards to stereo image which is present to the degree some people imply. In fact, the only cases I've seen which I put any credence in are the few isolated ones which Wombat has found (and provided samples for, I might add). I will eventually attempt to address these few samples, but note that these are exceptional cases, not common cases, and as far as I can tell, they are completely unrelated to the other complaints being made. This is especially so since Wombat doesn't describe the artifact as being a collapse of the stereo field (which isn't your typical joint stereo artifact in LAME anyway...).
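
For anyone unsure why joint stereo does not automatically mean a collapsed stereo field, here is a minimal sketch. It assumes the generic textbook mid/side transform; it is not LAME's code, and the function names and sample values are made up for illustration. It only shows that the transform on its own is reversible, so left and right are not thrown away by it. How an encoder then spends bits on each channel is a separate matter that this sketch does not model.

```python
def encode_mid_side(left, right):
    """Turn left/right samples into mid (average) and side (half difference)."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side


def decode_mid_side(mid, side):
    """Recover left/right samples from mid/side."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


# Hypothetical sample values: nearly identical channels, then a wide one.
left = [0.50, 0.25, -0.75, 0.10]
right = [0.50, 0.20, -0.70, -0.10]

mid, side = encode_mid_side(left, right)
restored_left, restored_right = decode_mid_side(mid, side)

# Largest round-trip error over both channels (floating point rounding only).
max_error = max(
    abs(a - b)
    for a, b in zip(left + right, restored_left + restored_right)
)
print(max_error)
```

Where the two channels are similar the side signal is close to zero, which is where such a representation can save bits; where they differ, the side signal carries that difference.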

At any rate, I'm always looking to improve things if I can, but claims must be substantiated, which includes providing ABX results (which are then verified by other parties) and providing test samples, preferably multiple ones if you are implying a problem with general behavior.
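
As a rough illustration of what an ABX result tells you, the sketch below computes the chance of scoring at least a given number of correct answers by pure guessing (a one-sided binomial calculation with a 50% guess rate). The function name and the 12-of-16 example are assumptions chosen for illustration, not figures from any particular test or ABX tool.

```python
from math import comb


def abx_p_value(correct, trials):
    """Probability of at least `correct` hits in `trials` by guessing alone."""
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    # Count the outcomes with `correct` or more hits, out of 2**trials total.
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


# Hypothetical run: 12 correct answers out of 16 trials.
print(abx_p_value(12, 16))
```

A score of 12 out of 16 comes out at roughly 0.038, so a listener who was only guessing would reach that score or better about 4% of the time. The lower the number, the harder the result is to dismiss as luck, which is also why repeat runs by other people matter.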

Not to come across as arrogant, but for the most part, I'm the only one who truly understands the workings behind the --alt-preset specific tunings. Not even the other developers have followed my work (though that's by their choice, not mine). The code is available for all to see, but so far I have not seen anyone attempt to reimplement my modifications or to discuss them with me on a technical level. So unless you see someone who is closely related to the work I've done (i.e., they have participated in testing, JohnV for example) stating something, or you see me stating something directly about the presets, then chances are whoever is discussing the presets doesn't have the full picture. This is especially true when people begin discussing how the --alt-presets work internally or technically, and especially in relation to joint stereo.

If you see a discussion on another board about these issues, please point people to this thread. If you have a question, please ask me here. You'll likely get a much more correct answer, and it will help keep questions about this issue centralized and concise (which will help when the FAQs are created). Speculation is not only wasteful, but it also helps to propagate misinformation such as the old "joint stereo is bad" line of thinking.
post Feb 13 2002, 08:22
Originally posted by JohnV
Well, just read Roel's (r3mix's) comment about alt-preset standard..

Interesting read... guess nothing's changed ;)

The bit about this:

"To sum it up: what is an improvement to one person, might be a quality lowering for another."

I don't buy that at all. I've seen this argument before, and it's always from someone who doesn't want to accept something which has been proven to be superior (i.e., people who make up their command lines in disregard of evidence continually backed up and verified by the community -- what this very thread is about).

I've never really seen this verified, that is, an improvement to one person being a degradation to another with almost all other things being equal. And 19kHz vs 19.5kHz is not significant, especially when one considers the logarithmic nature of hearing and the fact that most people probably can't hear beyond 18 or 18.5kHz in real music. But then, when you consider the source, someone who is willing to use frequency analysis to judge quality, perhaps that is to be expected.
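
To put a number on the "logarithmic" point, here is a small sketch that expresses the gap between the two lowpass frequencies as a musical interval in cents (1200 cents per octave, 100 per equal-tempered semitone). Cents are used here only as a convenient logarithmic yardstick; this is an assumption made for illustration, not a psychoacoustic model, and it says nothing on its own about what a given listener can hear.

```python
from math import log2


def interval_in_cents(f_low_hz, f_high_hz):
    """Size of the interval between two frequencies, in cents."""
    return 1200 * log2(f_high_hz / f_low_hz)


# The two lowpass settings being argued about.
lowpass_gap = interval_in_cents(19000, 19500)

# The same 500 Hz step lower in the spectrum, for comparison.
midrange_gap = interval_in_cents(1000, 1500)

print(round(lowpass_gap, 1), round(midrange_gap, 1))
```

The 19kHz to 19.5kHz step works out to about 45 cents, less than half a semitone, while the same 500Hz step from 1kHz to 1.5kHz is about 702 cents, roughly a fifth. The same difference in Hz covers far less of the logarithmic scale at the top of the spectrum.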

It doesn't really seem to hold water anyway when compared against community data. Not only has Roel's own AQ test implied (if not directly shown) that the old dm-preset standard was better than --r3mix, I don't believe I've seen a claim since the last few revisions of --alt-preset standard in which someone found --r3mix better. Even if I've missed one or two, the ratio of samples where --r3mix fails badly vs where --aps sounds fine is very high.

At this point, ignoring all of the improvements that have been made (which many people can hear; just look through the revX threads) would have to be simple denial, I think. Continuing to state, given that evidence, that --r3mix is still CD quality supports that as well.

"So, Roel seems to say that he can hear a difference because of lower lowpass than --r3mix, because he would never use --alt preset standard because its lowpass is lower."

And of course, we probably won't see any evidence to back up the claims that he can hear the difference between 19kHz and 19.5kHz. What's worse, he doesn't believe in ABX.. so good luck with that ;)

"Also Roel seems to have the wrong impression that --aps is just a pre-echo fix... oh well.."

Perhaps this isn't particularly surprising given the history of reaction towards the dm-presets on his board. If one didn't consider those developments significant then, it probably wouldn't be a stretch for them to think the same now.

At any rate, it would be interesting to see --r3mix develop further (though I feel that "third party presets" with someone's name on them are counterproductive to the goal of simplification and user friendliness at this point; LAME needs consolidation, not further fragmentation), but I can almost guarantee that quality improvements cannot be had without increasing size at all, as Roel seems to think is possible. I've worked on this issue very extensively, and it just isn't going to happen. The only way it could be possible is if there were some pretty major changes to the psymodel, and I don't see that happening anytime soon. Furthermore, I feel that LAME has been taken as far as it can go just by combining command line switches. That's why months ago I decided to delve into the code instead...

We'll wait and see what happens, but it seems like someone is expecting a miracle, and it just ain't there :) What's more, Roel has not shown concern in the past for fixing the many samples which have caused problems, and has played down the matter, saying that the person must be abnormal and perhaps should not use MP3, or should normalize their file first, or something else. I'd be simply astonished to see that approach change now. When you have one person who is apparently unable to hear faults, does not seem to show interest in scientific methodology (ABX), is willing to rely on flawed techniques for comparison (frequency analysis), and separates himself from people with sensitive hearing who could help improve things, how can you possibly expect much progress? Take MPC, PsyTEL, or Vorbis for example... if this were the approach used there, they wouldn't be anywhere near the level of quality they are currently at.

And personally I still don't see how a 192kbps average is "too high" a bitrate, especially since the mp3 groups have been trading in this format for years.... I think most users feel this way these days too, considering the explosive growth in the use of the --alt-presets.

So at the end of the day, using the bitrate excuse and the 0.5kHz difference in lowpass as reasons for ignoring --aps just seems like a last stand... an unwillingness to embrace improvements simply for the sake of pride and stubbornness.
