Flaw in ReplayGain spec
May 12 2002, 11:04
Joined: 24-September 01
Member No.: 13
It occurred to me today that there is a problem with the current ReplayGain spec, or rather, with my proposal for doing it in Vorbis.
The issue is combining replaygain and clipping prevention.
If applying the replaygain would cause the track to clip, clipping prevention kicks in, and reduces the level. This will make the output loudness different from the ideal, 'equal' level.
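To make the interaction concrete, here is a rough sketch of gain plus clipping prevention. It assumes the peak is stored as a fraction of full scale (1.0 = full scale) and that clipping prevention simply caps the scale factor; the function names are made up for illustration and are not taken from any actual player.

```python
import math

def db_to_linear(db):
    """Convert a gain in dB to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)

def apply_with_clip_prevention(gain_db, peak, preamp_db=0.0):
    """Return (scale, shortfall_db) for one track.

    Assumption: peak is the track's maximum sample value, 1.0 = full scale.
    shortfall_db is how far the output ends up below the intended level.
    """
    wanted = db_to_linear(gain_db + preamp_db)
    # Largest scale that still keeps peak * scale <= 1.0.
    limit = 1.0 / peak if peak > 0 else wanted
    scale = min(wanted, limit)
    shortfall_db = 20.0 * math.log10(wanted / scale)
    return scale, shortfall_db

# A track that wants +3 dB but already peaks at 0.9 gets less than it asked for.
scale, shortfall_db = apply_with_clip_prevention(gain_db=3.0, peak=0.9)
print(f"scale={scale:.3f}, ends up {shortfall_db:.2f} dB below the intended level")
```

A non-zero shortfall is exactly the deviation from the 'equal' level described above. A negative preamp_db lowers the wanted scale, which makes the cap less likely to kick in.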
When running in radio/track mode, there is no way around this, since you don't know in advance what you are going to encounter. The best you can do is set the default level low enough that you can hope it'll never happen. I believe this was the idea (among possibly other things) behind setting the default level to K-20 in the new MPC decoders? (Frank?)
If the implementation in the current Vorbis players is correct, a similar effect can be reached by setting the preamp in the plugin to -6dB or so.
In album gain mode, you could avoid this for the entire album you're listening to, since you already ReplayGain-processed the tracks as a group and thus know what is coming up. However, my current proposal poses problems for doing this: you would need to read in all files that belong to the album, read their peak values, remember the largest, and use that as the peak value for the individual tracks.
This is what I originally envisioned. Looking back, however, it is ugly and cumbersome, and it may not even be possible in some player/plugin architectures.
I think the correct solution would probably be to store an album-peak value.
It would be trivial to implement in the ReplayGain tools, and require only minimal changes in the players without all the ugliness the current method would require (which isn't done correctly by anyone anyway).
The disadvantage is that it requires another tag. However, since the Vorbis people seem to have gotten a bit more enthusiastic about ReplayGain lately, perhaps that isn't so much of a problem.
I believe it's valuable to do this, as it may pose a real problem in practice. Moreover, the proposal as it is now is broken by design in this regard, and I'd prefer to fix it while it's still fixable.
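A small sketch of why the shared peak matters in album mode. The track names, the album gain and the peak values are invented for illustration, and clipping prevention is again assumed to simply cap the scale factor at 1/peak.

```python
def clip_safe_scale(gain_db, peak):
    """Linear scale for gain_db, reduced if needed so that peak * scale <= 1.0."""
    wanted = 10.0 ** (gain_db / 20.0)
    return min(wanted, 1.0 / peak)

# Hypothetical album: one shared album gain, per-track peaks (1.0 = full scale).
album_gain_db = 2.0
track_peaks = {"quiet_intro": 0.5, "loud_finale": 1.0}

# Limiting against each track's own peak: tracks can end up with different
# scale factors, which changes their relative levels within the album.
per_track = {name: clip_safe_scale(album_gain_db, peak)
             for name, peak in track_peaks.items()}

# Limiting against the album peak: every track gets the same scale factor.
# With a stored album-peak value the player would read this number from the
# file being played instead of scanning all files of the album.
album_peak = max(track_peaks.values())
with_album_peak = {name: clip_safe_scale(album_gain_db, album_peak)
                   for name in track_peaks}

print(per_track)
print(with_album_peak)
```

The `max(...)` line is the part that currently has to happen in the player. Storing its result would move that work into the ReplayGain tool, which already sees the whole album.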
Also, the ReplayGain proposal on David's site doesn't seem to mention anything about this. Is there another way to address this problem?
There are two other issues with the current spec that I'd like to discuss while it's still possible.
1) Change RG_* into REPLAYGAIN_*
This was proposed by Segher, with the idea that someone looking at the tags who doesn't know what they are can at least google to find out, whereas you'd be left clueless with the current 'RG'. I think this idea is valuable and good.
2) Source/version tag
I didn't include one originally because I saw no way to keep it consistent if you allow the user to edit the tags (you can't require them to know the spec...), and because I didn't see the RG calculations being improved for quite a while. Unfortunately, Frank Klemm has already proven me wrong on the latter. I don't see a way to make such a tag actually _work_ though.
I'd like feedback from everyone about all of this. Is it worthwhile to change the current proposal and fix some of the above issues?
May 21 2002, 14:02
Joined: 5-November 01
From: Yorkshire, UK
Member No.: 409
OT: That's really bizarre - we both live in Essex, and your girlfriend and my wife both do cross stitch! Anyway...
Yes, I see it makes sense from a user's point of view to see a louder file having a bigger number. But hopefully the user will never have to look at the value - the whole process should just happen "in the background".
And yet, ignoring Lear's actual proposal (which is back to front, but in fairness I don't think he realised this when he suggested it), both methods are conceptually sound. You either:
(1) store "this file will sound x dB loud in a calibrated system", and the player says "hey - I want it y dB loud, so I'll scale it y-x dB".
(2) store "this file should be x dB louder", and the player says "hey - I want it y dB louder still, so I'll scale it y+x dB".
Of course, in the second, "y" is optional, though recommended to be +6dB.
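As a quick check that the two conventions give the same result, here is a sketch using the 83 dB reference discussed in this thread. The function names and the 95 dB example file are made up for illustration.

```python
REFERENCE_DB = 83.0  # suggested calibrated level, per the discussion in this thread

def scale_from_level(stored_level_db, target_level_db):
    """Method (1): the file stores how loud it is; the player picks a target level."""
    return target_level_db - stored_level_db

def scale_from_gain(stored_gain_db, preamp_db=0.0):
    """Method (2): the file stores the gain it needs; the player adds an optional preamp."""
    return preamp_db + stored_gain_db

# Hypothetical file measured at 95 dB in a calibrated system.
level_db = 95.0
gain_db = REFERENCE_DB - level_db   # what method (2) would store: -12 dB

# Playing it at 89 dB is the same as a +6 dB preamp over the 83 dB reference.
scale_1 = scale_from_level(level_db, target_level_db=89.0)
scale_2 = scale_from_gain(gain_db, preamp_db=6.0)
print(scale_1, scale_2)
```

Both come out as the same adjustment in dB. The only difference is whether the reference level is subtracted when the value is stored or when it is played back.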
Is anyone who is coding this proposing that it should be changed? I've heard Lear - what about Frank and Garf?
My worry about changing it is (a) confusion, and (b) in some file formats, the values are stored as binary data (not ASCII comment tags!), and it's more compact to store values between +/- 30, rather than values between 60 and 110 (approx). Unless, of course, you subtract a pre-set number from that second set of values before storing them - but then you're back to where we are now!
I do not think it's an option to (for example) store (1) "level" in Vorbis and (2) "gain" in mpc. That would just be asking for trouble and confusion. So unless BOTH the mpc and vorbis implementations agree to change, they should BOTH DEFINITELY stay as they are!
Another reason against (1) is that almost no one will have a calibrated system - to them 83 dB or 89 dB is just a (meaningless) number. While "6dB louder than suggested" is still just a number, at least it gives you some idea of what you're doing. To know what 89dB means, you have to know it's 6dB louder than what's suggested, but still OK. In contrast, 100dB (which sounds nice and loud) just won't work (user thinks: "why not - my system can output that power"), whereas "+18dB above what's recommended" does sound like you're going to overload it!
btw, the SMPTE RP 200 spec was changed from 85 to 83. I doubt they'll change it again (and actually the two specs are equivalent) - but if it were changed, (1) would be wrong, whereas (2) would be right.
work to do...