a response to a growing rumor...
Feb 12 2002, 00:36
Joined: 26-August 02
From: Nottingham, UK
Member No.: 1
Normally I wouldn't attempt to address an issue in this manner, but since it is getting a bit out of hand, and usually on boards I'm not participating in (or have little desire to participate in), I'll try and address it officially, once, in the place where it should be the most relevant.
The matter I'm discussing is related to the --alt-presets and their handling of the "stereo image".
There have been some completely unsubstantiated reports and rampant speculation going on in a few threads which I will list below:
3. http://www.hydrogenaudio.org/forums/showth...s=&threadid=759 (I simply hadn't gotten around to responding to this thread, though it's on this board).
At any rate, I'll try to make a few points as clearly as I can.
1. All of the --alt-preset VBR modes are tuned for "stereo image".
2. All of the VBR presets provide better sound quality via joint stereo than LAME on its own with joint stereo, and in some cases should even sound better than with --nssafejoint, while at the same time providing a lower bitrate.
3. The --alt-presets do not, by design, make any sacrifice in regards to stereo image to keep bitrate down. Anyone who tells you this has no idea what they are talking about. I should know since I actually wrote the code and designed the presets.
4. An extremely high degree of stereo frames is not always needed to achieve good sound quality. I challenge anyone who believes that --alt-preset standard commonly has poor stereo separation (as a few unsubstantiated claims imply) to provide me with direct evidence of this.
5. Joint stereo is needed even at bitrates of 320kbps to achieve the best sound quality in some critical cases. Forcing stereo on everything up to 320kbps and then forcing joint stereo does not fix the problem (as a user implies in one of those threads). I've tried this before.
6. There seems to be a misconception that all the --alt-presets improve on is pre-echo. This is sorely mistaken. Indeed they do improve on pre-echo and impulse handling to a fairly large degree, but they also improve upon:
- joint stereo handling (serioustrouble is a prime example)
- dropout prevention (2nd_vent_clip is a prime example)
- fluttering (gekkou is a prime example)
- knocking (velvet is a prime example)
- ringing (bloodline is a prime example)
- noise pumping (piano, rach_original, etc, are examples)
- rasping (present with noise shaping 2 on some clips like fatboy, or on clean vocals sometimes. Mostly eliminated, even on the most critical samples, with --alt-presets)
And that's just the stuff I can think of off the top of my head.
Now, that's not to say the --alt-presets are perfect. I certainly know they aren't. But they also don't have some massive flaw in regards to stereo image which is present to the degree some people imply. In fact, the only reports I put any credence in are the few isolated cases which Wombat has found (and provided samples for, I might add). I will eventually attempt to address these few samples, but note that these are exceptional cases, not common cases, and as far as I can tell, they are completely unrelated to the other complaints being made. This is especially so since Wombat doesn't describe the artifact as being a collapse of the stereo field (which isn't your typical joint stereo artifact in LAME anyway...).
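For anyone unsure why joint stereo does not by itself throw away the stereo image: the joint stereo discussed here is mid/side coding, which is just a different way of writing down the same two channels. The sketch below is only an illustration of that transform with the usual 1/sqrt(2) scaling. The function names are made up for this example, and it is not LAME code and says nothing about how the presets decide when to switch modes.

```python
import math

SQRT2 = math.sqrt(2.0)

def to_mid_side(left, right):
    """Convert left/right samples to mid/side (hypothetical helper)."""
    mid = [(l + r) / SQRT2 for l, r in zip(left, right)]
    side = [(l - r) / SQRT2 for l, r in zip(left, right)]
    return mid, side

def to_left_right(mid, side):
    """Invert the transform above and recover left/right."""
    left = [(m + s) / SQRT2 for m, s in zip(mid, side)]
    right = [(m - s) / SQRT2 for m, s in zip(mid, side)]
    return left, right

# Made-up sample values, purely for illustration.
left = [0.50, -0.25, 0.10]
right = [0.40, -0.20, 0.30]
mid, side = to_mid_side(left, right)
back_left, back_right = to_left_right(mid, side)
```

Before quantization the round trip is exact up to floating point rounding, so any loss of stereo image comes from how the encoder spends bits on the mid and side channels, not from the transform.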
At any rate, I'm always looking to improve things if I can, but claims must be substantiated, which includes providing ABX results (which are then verified by other parties) and providing test samples, preferably multiple ones if you are implying a problem with general behavior.
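As a rough guide to what an ABX result means, the usual check is the chance of scoring at least that well by pure guessing, with each trial treated as a coin flip. The helper below is a minimal sketch of that calculation. The function name is made up for this example, and no particular pass threshold is implied here.

```python
from math import comb

def abx_p_value(correct, trials):
    """Probability of at least `correct` right answers in `trials` by guessing."""
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials

# Example: 12 correct answers out of 16 trials.
p = abx_p_value(12, 16)
print(f"chance of guessing this well or better: {p:.4f}")
```

A small value means the score is unlikely to be luck. It still says nothing about how large or how common the artifact is, which is why multiple samples matter.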
Not to come across arrogant, but for the most part, I'm the only one who truly understands the workings behind the --alt-preset specific tunings. Not even the other developers have followed my work (though that's by their choice, not mine). The code is available for all to see, but so far I have not seen anyone attempt to reimplement my modifications or to discuss them with me on a technical level. So unless you see someone who is closely related to the work I've done (ie, they have participated in testing, JohnV for example) stating something, or you see me stating something directly about the presets, then chances are whoever is discussing the presets doesn't have the full picture. This is especially true when people begin discussing how the --alt-presets work internally or technically, and especially in relation to joint stereo.
If you see a discussion on another board about these issues, please point people to this thread. If you have a question, please ask me here. You'll likely get a much more correct answer, and it helps to keep questions about this issue centralized and concise (which will help when the FAQs are created). Speculation is not only wasteful, but it also helps to propagate misinformation such as the old "joint stereo is bad" line of thinking.
Feb 14 2002, 18:34
Joined: 29-September 01
Member No.: 63
Originally posted by Pio2001
To make it short, I don't like to discuss sound quality analyzing graphs, Listening tests are way better in my opinion.
It is true that listening tests are better to discuss quality, but graphs are not completely useless. You have to realise that Dibrom was not using the graphs to discuss quality, but to discuss BEHAVIOUR of the encoder. Now graphs are the perfect tools for this purpose. The right tools for the job, remember? Many people looked down on EAQUAL as a decider of quality, but I think it's silly. Although you can't use EAQUAL to replace listening tests, EAQUAL is useful for many other purposes too (e.g. those Ogg bitrate vs quality graphs).
In the case above, you have to realise that Dibrom did not use the graphs to say that one setting is better than the other. He used the graph to support his findings that one setting encodes more HF content than the other. This is plain fact, and the conclusion he draws is correct. Did he say that one sounds better than another because of the conclusion? No. But he proved a valid point, and that's the important part.
Anyway, the conclusion is that just because --r3mix lowpasses higher doesn't mean it will encode more HF content than --alt-preset standard. It is not hard to figure out why: in most cases, HF content falls below the ATH curve (which is already very high in the HF region) and doesn't get encoded at all. In this case, the ATH curve combined with the noise measuring algorithm of --alt-preset standard decides that more HF content needs to be encoded than --r3mix's algorithm does.
The second thing is that the graphs seem to indicate that ringing is occurring in the --r3mix samples. Although we cannot be sure it is audible without listening tests, we can recognize the symptoms of ringing from the graphs. You may want to visit http://ff123.net/ringing_graph.html to understand it better. In this case the graphs are useful in showing that the --r3mix encode shows signs of ringing. But we cannot conclude that this is audible in the sample itself without a listening test.
Therefore: don't underestimate the usefulness of graphs. They cannot replace listening tests, and are probably not as useful as listening tests. But they are a very useful tool when used properly. Unfortunately, they have been given a very bad reputation because graphs have often been abused in the past.
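To give a feel for how steep the ATH is in the HF region, here is a small sketch using a widely quoted textbook approximation of the threshold in quiet (in dB SPL). This is an assumption for illustration only: it is not the curve LAME or either preset actually uses, and the function names and the example level are made up.

```python
import math

def ath_db_spl(freq_hz):
    """Textbook approximation of the absolute threshold of hearing, in dB SPL."""
    f = freq_hz / 1000.0  # the formula works in kHz
    return (3.64 * f ** -0.8
            - 6.5 * math.exp(-0.6 * (f - 3.3) ** 2)
            + 1e-3 * f ** 4)

def below_ath(level_db_spl, freq_hz):
    """True if a component at this level would sit under the threshold."""
    return level_db_spl < ath_db_spl(freq_hz)

for freq in (1000, 4000, 16000, 19000):
    print(f"{freq:>6} Hz: ATH is about {ath_db_spl(freq):6.1f} dB SPL")

# A hypothetical 40 dB SPL component: above the threshold at 4 kHz, below it at 19 kHz.
print(below_ath(40.0, 4000), below_ath(40.0, 19000))
```

With this approximation the threshold climbs by tens of dB between 16 kHz and 19 kHz, so a higher lowpass setting alone does not guarantee that anything up there gets encoded.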