Improving ReplayGain, some ideas for Devs etc
Nov 18 2003, 17:04
Every now and again I wish I had the time to update the ReplayGain website and add some new ideas, and maybe even clarify some old ones. I don't, so this thread will have to do.
Firstly, the format used to store ReplayGain info in files is not documented correctly on the ReplayGain website, and it would be good to "publish" what has emerged as the standard for each format.
Secondly, what is stored is not documented correctly on the ReplayGain website, and I'd like to re-examine what is stored...
One change has already happened, and I think it's a good change:
Forget Radio and Audiophile - Track and Album are much better names.
(that's an open admission of me being wrong, for anyone who discussed this with me previously!)
So, we store:
ReplayGain Track adjustment
ReplayGain Album adjustment
(ReplayGain) Track peak
(ReplayGain) Album peak
(this last one wasn't in the original proposal, but it has been widely used - I've put it in bold to remind me to include it in the update)
That makes sense, and most software supports this. I'd like to formalise some extensions, some of which were there from the start, and others that have cropped up more recently:
1. (ReplayGain) undo adjustment
- this is written when the gain of the file is changed (e.g. by mp3gain, or by decoding with ReplayGain enabled), and is the gain change required to put the file back to where it started.
e.g. If I apply -8dB gain change using mp3gain, then
(ReplayGain) undo adjustment = +8dB
e.g. If I use --scale 0.5 when encoding (for whatever reason?!), then
(ReplayGain) undo adjustment = +6dB
If the gain of an already ReplayGained file is changed, the original four values (Track and Album adjustment and peak) should be updated so that they are correct for the new audio data. (see an example in this thread: http://www.hydrogenaudio.org/forums/index....topic=15412&hl= )
I can't see any argument against defining this field. It would be zero (or absent) if the audio file hasn't been altered. It's useful in all formats because you can always apply wavgain before encoding, and it would be nice to know that this has been done.
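The arithmetic behind the undo field can be sketched as follows. This is only an illustration in Python, not taken from mp3gain or any other tool: the function names and dictionary keys are made up for the example, and it assumes adjustments are stored in dB and peaks as linear amplitudes (1.0 = full scale).

```python
import math

def scale_to_db(scale):
    """Convert a linear amplitude scale factor to a gain change in dB."""
    return 20.0 * math.log10(scale)

def apply_gain_change(tags, change_db):
    """Return updated tags after the audio itself has been changed by change_db.

    tags is a hypothetical dict with the keys
    track_gain, album_gain, undo (all dB) and track_peak, album_peak (linear).
    """
    factor = 10.0 ** (change_db / 20.0)
    return {
        # The audio is now change_db louder, so it needs that much less gain.
        "track_gain": tags["track_gain"] - change_db,
        "album_gain": tags["album_gain"] - change_db,
        # Peaks scale linearly with the amplitude.
        "track_peak": tags["track_peak"] * factor,
        "album_peak": tags["album_peak"] * factor,
        # Undo accumulates the opposite of every change applied so far.
        "undo": tags.get("undo", 0.0) - change_db,
    }

original = {"track_gain": -8.0, "album_gain": -7.0,
            "track_peak": 1.0, "album_peak": 1.0, "undo": 0.0}
after = apply_gain_change(original, -8.0)   # e.g. a -8dB change with mp3gain
print(after["undo"])                         # 8.0
print(round(-scale_to_db(0.5), 2))           # 6.02, i.e. the "+6dB" for --scale 0.5
```

Note that the four stored values are corrected in the same step, so the file stays consistent with its new audio data while the undo value remembers the way back.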
2. ReplayGain calculation method
OK - I've had this argument before, but this really is important. ReplayGain can be improved, but you'll never know whether files are tagged using the old or new ReplayGain calculation unless the calculation method (actually a number which corresponds to the method) is stored. This doesn't increase the complexity of players, as they won't care - it just makes it very easy to pick out files that were tagged with the old version, and update them.
3. ReplayGain lossy approximation
This is just a single bit: 0 or 1.
0= this ReplayGain info has been calculated from the data in this file
1=this file has been lossily encoded/transcoded since this ReplayGain info was calculated.
What's the point of this? If you have a file with ReplayGain info, you can transcode it and copy the RG info across. It'll be close enough to give you excellent loudness equalisation, and you won't have to re-calculate it. Yet there'll be a label there to tell all you anal retentives that it's not quite right, and should be recalculated if you want to be 100% sure (especially important for peak amplitude).
You could (should?) have one “ReplayGain lossy approximation” bit for each of the four values, which gives you the chance (for example) of re-calculating the peak values (quick, and important - so let's do it), but leaving the ReplayGain values (slow, and unimportant - so let's not do it).
4. ReplayGain user adjustment
Instead of suggesting that users should change the calculated values if they wish, give them a field to enter their own value if they really have to. Players should give the option to read the user value in preference to any others (i.e. let it act as an over-ride), and taggers should give the option of removing the user values from all (downloaded) files.
5. ReplayGain RealLife adjustment
The gain required to give the actual SPL of the original event (in a calibrated system), or a human judged sensible replay level (see the explanation behind the original "Audiophile" level and the work of Bob Katz if you think this is an impossible idea). I've found a few DVD-A discs that have this information (it's in the MLP stream), so it would be nice to have somewhere to store it. It's unlikely to get used much, but it would be a useful thing to have. It would be the last link in some of the best recordings out there.
I'd like to come to a consensus of which ones of these (if any/all) should be included, and then get some specs as to how they are/should be stored in each file format (especially APE2.0 tags) finalised and published on-line.
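To make the discussion concrete, here is one way the four existing values and the five proposed extensions might sit together, with the player and tagger logic they imply. It is a rough Python sketch, not a spec: every field name, type and default is an assumption of mine, the lossy flag is a single bit rather than one per value, and falling back to the track value when no album value is stored is just one possible player policy.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ReplayGainTags:
    # The four established values (gains in dB, peaks as linear amplitude).
    track_gain: Optional[float] = None
    album_gain: Optional[float] = None
    track_peak: Optional[float] = None
    album_peak: Optional[float] = None
    # Proposed extensions 1-5 (hypothetical names and defaults).
    undo_gain: float = 0.0                  # 1. zero if the audio is unaltered
    calc_version: int = 1                   # 2. number identifying the method
    lossy_approx: bool = False              # 3. values copied across a transcode
    user_gain: Optional[float] = None       # 4. user over-ride
    real_life_gain: Optional[float] = None  # 5. gain for the original SPL

def choose_gain(tags, album_mode=False, prefer_user=True):
    """Pick the gain (dB) a player would apply, or None if nothing is stored."""
    if prefer_user and tags.user_gain is not None:
        return tags.user_gain
    if album_mode and tags.album_gain is not None:
        return tags.album_gain
    return tags.track_gain

def needs_rescan(tags, current_version):
    """True if a tagger might want to recalculate this file's values."""
    return tags.lossy_approx or tags.calc_version < current_version

t = ReplayGainTags(track_gain=-6.5, album_gain=-5.0, user_gain=-3.0)
print(choose_gain(t))                                      # -3.0
print(choose_gain(t, album_mode=True, prefer_user=False))  # -5.0
print(needs_rescan(t, current_version=2))                  # True
```

The player never looks at calc_version or lossy_approx; only the tagger does, which is the point made under proposal 2.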
Comments? Suggestions? Offers of help?
btw I've received a couple of suggestions for improving the ReplayGain calculation. One is trivial, and seems like a great idea. I'll post it for testing when the problem of version numbering is solved. If anyone else has slightly or totally re-worked the ReplayGain algorithm/concept, now would be a good time to step forward! We could do listening tests to find the best candidate for "calculation version 2".
Newbie warning: this thread is not for asking questions about ReplayGain that are already answered on www.replaygain.org or in previous threads on HA. (I'm always happy to answer "silly" questions via email – half of them aren't silly at all.)
However, if you do already have some understanding of ReplayGain then this thread is the perfect place for clarifying anything to do with the above proposals which is not clear.
This post has been edited by 2Bdecided: Nov 18 2003, 17:42
Jan 8 2004, 14:15
QUOTE (2Bdecided @ Jan 8 2004, 01:11 PM)
No, because perceived loudness depends on loudness!
ohhh, you're right...
thanks for the explanations, now I think I understood all there is to understand about replaygain
QUOTE
If you're listening to a bass heavy track at 60dB, you'll hear much less bass (relatively) than you will at 80dB. This means that increasing the gain on a bass heavy track by 20dB will cause its subjective loudness to be increased more than a 20dB boost to a bass light track.
this frequency-dependent sensitivity is handled in the digital domain, so it doesn't contradict keeping SPL out of replaygain altogether. but then the next point settles it:
QUOTE
What's more, the perceived loudness increase of that 20dB boost will be different if it's a boost from 40dB to 60dB than if it's a boost from 80dB to 100dB.
okay, if you want to allow for that kind of tuning of the model, you need to make an assumption about how the digital signal will be transformed to SPL, so it's natural to express the results in terms of SPL since the computation (may) depend on this assumption.
I'm convinced, it's good that the 83dB SPL number appears in replaygain.
QUOTE
If the equal loudness curves were parallel lines, then we wouldn't really have to worry about real world sound pressure. They're not, so it's an issue, and it can only be solved if we make some kind of guess (like the floating ATH in the lame encoder), or calibrate the system properly to a real world loudness - which is what I've chosen to do.
Hope this makes sense.
Yes it does, and now I knowingly agree with the choices made in replaygain
The fact that those curves are not parallel hadn't occurred to me as an issue, and calibrating to real world loudness seems the best solution to me.
It felt unnatural at first that algorithms in the digital domain would want to guess what amplification will end up on the signal and how loud I would be listening to my music, but now I see it's not. Furthermore, I realize that the loudness at which I usually listen to my music is not as arbitrary as I thought.
Well, now that I understand, I'm even more in favour of storing the absolute replaygain (à la "92dB").
oh, and if final loudness is used in the psychoacoustic model, the choice of the right SPL reference when computing the gain matters.
the difference in results between a reference SPL of 83 and one of 89 (err, in fact the real other reference is rather 83-6=77 dB SPL, i.e. the SPL obtained by playing the pink noise on my machine if I've set the amplification to make pop music sound right) is not always simply 6 dB, because of what you explained.
It makes the whole process of replaygain computation+playback a bit more complex if you approach it as a purist.
let's see :
1. you compute the replaygain, choosing as reference the one corresponding to the genre of music it is.
(so that replaygain can make the right assumption about the digital-to-SPL correspondence you'll typically be using at playback)
2. when playing, how do you choose the reference?
if you'll be playing files that are all of the same genre (i.e. replaygained with the same reference), it doesn't really matter, as it will just compensate for other amp settings.
if you'll be playing different genres of songs,
a. either you want all of them to sound as loud as the others
then you pick one reference (that means either classical music will not be allowed to play loud passages as loud as an audiophile would want, or the pop songs will be really annoyingly loud, depending on which reference you pick)
b. or you want each song to be played with the gain adapted to its kind.
And then, .. , hmm, what to do..
storing relative or absolute replaygain doesn't change the issue.
I guess the 'audiophile' gain was supposed to handle this, but it won't if it really is an album gain.
(or you have to assume all albums of a given genre will be at the same album gain, which may be true for classical music, but not at all for pop songs).
It seems to me the real purist replaygain fanatic will want the file to include the reference used at computation (or some other way to get the reference to use for the genre of that song).
Otherwise, the user will have to change the setting between songs if he wants to play pop and classical songs in one playlist.
So we should store both replaygain (track and album) and the replaygain reference in each file, or did I miss something?
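The bookkeeping behind that question can be sketched in a few lines. This is a rough Python illustration with made-up function names and an invented -9 dB example value. It treats a change of reference as a plain dB offset, which, as discussed above, is only an approximation once the loudness model itself depends on the assumed SPL.

```python
def absolute_loudness(gain_db, reference_db):
    """Loudness (dB SPL in the calibrated system) implied by a stored gain."""
    return reference_db - gain_db

def playback_gain(gain_db, file_reference_db, target_db):
    """Gain to apply so the file plays at target_db rather than its own reference."""
    return gain_db + (target_db - file_reference_db)

# A file tagged -9 dB against the 83 dB reference is "92dB" in absolute terms.
print(absolute_loudness(-9.0, 83.0))      # 92.0
# In a playlist levelled to 89 dB, the same file needs -3 dB instead of -9 dB.
print(playback_gain(-9.0, 83.0, 89.0))    # -3.0
```

With the reference stored per file, a player could do this conversion itself for a mixed pop and classical playlist, instead of the user changing a setting between songs.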