
Improving ReplayGain

Reply #75
2B > The peak sample value tells you _exactly_ how much headroom the material has - usually none!

...and I am at a loss on how RG actually does this. If I take a 20 kHz sine wave (or whatever that will yield 2 sample points per cycle) and encode at 44.1 kHz, there is no guarantee that the sample points will fall on the amplitude maximums (unless phase locked). They could fall on the x-axis crossing nodes...or anywhere on the waveform up to the peaks. Given my example above, is RG actually reconstructing the wave to determine the peak value...or is it using the data in the file?

xen-uno
No one can be told what Ogg Vorbis is...you have to hear it for yourself
- Morpheus

Reply #76
Quote
2B > The peak sample value tells you _exactly_ how much headroom the material has - usually none!

...and I am at a loss on how RG actually does this. If I take a 20 kHz sine wave (or whatever that will yield 2 sample points per cycle)


In fact, the Shannon theorem requires 2+epsilon sample points per cycle for real-valued samples, if you look at it closely enough.
(i.e., note that cos(Wt) = 0.5*(exp(iWt) + exp(-iWt)), so its spectrum is not contained in the half-open band [-W,+W[ )

Quote
and encode at 44.1 kHz, there is no guarantee that the sample points will fall on the amplitude maximums (unless phase locked). They could fall on the x-axis crossing nodes...or anywhere on the waveform up to the peaks.


On a long enough sequence of such a sine wave, some of the sample points will fall very close to the maxima of the continuous wave.
By a quick estimate, 4000 samples are enough to ensure that the discrete peak lies within 100/(4000^2) percent of the continuous wave's true peak, for high frequencies up to 22.05/(1+1/4000) = 22.044 kHz.
(I'm using 1-x^2/2 as an estimate of the sine wave near its maxima.)

Even with only 10 samples, you get 1% peak precision for high-frequency sines up to 20.04 kHz.
(i.e., from 2.2 kHz to 20.04 kHz. Low frequencies are of no interest here, since the discrete and continuous peaks barely differ for them.)

Conclusion: for a sine wave, you don't really have to worry about the difference between the discrete peak and the underlying continuous peak.
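
A quick way to check this kind of estimate numerically is to sample a unit-amplitude sine and take the largest absolute sample. This is only a sketch: the function name `sine_sample_peak`, the default 44.1 kHz rate and the test frequencies are assumptions chosen for illustration, not anything ReplayGain itself does.

```python
import math

def sine_sample_peak(freq_hz, n_samples, phase=0.0, fs=44100.0):
    """Largest absolute sample of a unit-amplitude sine (true peak is 1.0)."""
    return max(
        abs(math.sin(2.0 * math.pi * freq_hz * n / fs + phase))
        for n in range(n_samples)
    )

# A 1 kHz tone: some sample lands very close to the continuous peak.
print(sine_sample_peak(1000.0, 4000))

# A tone at exactly fs/4 with a 45 degree phase offset: every sample has
# the same magnitude, so a longer sequence does not help.
print(sine_sample_peak(11025.0, 4000, phase=math.pi / 4))
```

Under these assumptions the 1 kHz tone comes within 1% of its true peak, while the fs/4 tone never rises above about 0.707, so the estimate is not a guarantee for every frequency and phase.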

Quote
Given my example above, is RG actually reconstructing the wave to determine the peak value...or is it using the data in the file?


My opinion is that it doesn't matter, even slightly, though I have only made my point for sine waves and not for the general case of arbitrary sampled sound.

Reply #77
Quote
my opinion is it doesn't matter, even slightly, though  I only made my point with sine waves and not the general case of just any sampled sound.


On top of that, you can argue that it's not really clipping as long as the digital signal is preserved.
Then, if the DAC chops off the true peaks of the analog signal because of this, I'd say that's the DAC's fault.

Reply #78
The RG peak value is the largest absolute value within the digital data.

I'm aware that the true reconstructed peak can be between sample values (and oversampling DACs will often clip this), but isn't the ReplayGain calculation slow enough already - without taking this into account?

It's worth remembering that the people who master squashed CDs don't take this into account either.


The reason I didn't worry about this with ReplayGain is because the highest reconstructed inter-sample value you can contrive is around 1.5x digital full scale. As ReplayGain drops most over-compressed tracks by 6-12dB, you've got more than enough headroom.

I suppose you could store an "analogue" peak value, and use this for clipping prevention. That's a nice project, if anyone wants it!

However, ReplayGain will keep most music away from clipping. If you don't use ReplayGain, simply dropping the gain by 3-6dB will keep everything away from clipping. What's more, the existing peak value is more than good enough in most cases, and leaving an extra fraction of a dB headroom will make it fine in all but contrived cases.
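
As a rough illustration of clipping prevention with the stored sample peak, the sketch below converts a gain in dB to a linear scale factor and limits it so that scale times peak never exceeds full scale. The function name `clip_safe_scale` and the optional `headroom_db` parameter are made up for this example; real players may handle this differently.

```python
def clip_safe_scale(gain_db, peak, headroom_db=0.0):
    """Linear scale factor that applies gain_db, but never pushes the stored
    sample peak above full scale (1.0) reduced by headroom_db."""
    wanted = 10.0 ** (gain_db / 20.0)        # dB -> linear amplitude factor
    ceiling = 10.0 ** (-headroom_db / 20.0)  # highest allowed peak after scaling
    if peak <= 0.0:
        return wanted                        # silent track: nothing can clip
    return min(wanted, ceiling / peak)

# A negative gain on a full-scale track is applied unchanged; a positive
# gain on the same track is limited to 1.0.
print(clip_safe_scale(-6.0, 1.0))
print(clip_safe_scale(3.0, 1.0))
```

Setting `headroom_db` to a fraction of a dB is one way to leave the small extra margin mentioned above, since the stored peak only covers actual sample values.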

You've got to wonder: if someone puts a signal onto a CD where the analogue peak is at digital full scale plus 50%, maybe the intention is to make the DAC in your CD player clip? If so, what's the point in de-clipping it?

Cheers,
David.

Reply #79
Quote
The RG peak value is the largest absolute value within the digital data.

I'm aware that the true reconstructed peak can be between sample values (and oversampling DACs will often clip this)


Are there any DACs that can reconstruct the analog signal in full when its peak is above full scale?
I guess DACs can behave very differently on such digital signals.

Quote
The reason I didn't worry about this with ReplayGain is because the highest reconstructed inter-sample value you can contrive is around 1.5x digital full scale.


Wow, 1.5x is much more than is possible with sine waves.
Can you tell us how to make such a signal?
Or do you get this value by mathematically bounding the reconstruction formula?

Reply #80
Quote
isn't the ReplayGain calculation slow enough already - without taking this into account?


It might be possible to take it into account without much more computation.
From the DCT, you can bound the analog peak by adding up the moduli of the DCT coefficients.
I don't know how imprecise that can be on real music signals, but my guess is it shouldn't be too bad.
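
To make the idea of a transform-based bound concrete, here is a small sketch. Note that it swaps in a plain DFT for the DCT mentioned above, because the triangle inequality is easiest to state there: if a block is treated as periodic and band-limited, its interpolated waveform is a sum of sinusoids, so its magnitude can never exceed the sum of the coefficient moduli divided by the block length. The function name and the test block are assumptions for illustration.

```python
import cmath
import math

def spectral_peak_bound(block):
    """Upper bound on the peak of the periodic band-limited interpolation of
    `block`: sum of DFT coefficient moduli divided by the block length."""
    n = len(block)
    total = 0.0
    for k in range(n):
        # Naive O(n^2) DFT; fine for a short demonstration block.
        coeff = sum(
            x * cmath.exp(-2j * cmath.pi * k * i / n)
            for i, x in enumerate(block)
        )
        total += abs(coeff)
    return total / n

# Samples 1, 1, -1, -1, ... describe a sine at a quarter of the sample rate.
block = [1.0, 1.0, -1.0, -1.0] * 8
print(max(abs(x) for x in block))   # sample peak
print(spectral_peak_bound(block))   # bound on the reconstructed peak
```

For a single sinusoid like this one the bound is tight. On real music, with energy spread over many bins, it is only guaranteed to be an upper limit and may overestimate considerably.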

Reply #81
Quote
Wow, 1.5x is much more than possible with sine waves..
can you tell how to make such a signal ?
or do you get this value from mathematically bounding the reconstructed signal formula ?


OK, it was 1.41, but it's possible with an 11.025kHz sine wave (44.1kHz sampling)...

http://www.hydrogenaudio.org/forums/index.php?act=Attach&type=post&id=818

I'd imagine that's the maximum you can get from a sine wave, but if you drag samples around in Cool Edit you can get bigger peaks between samples. If you drag 2 more samples high in the above example, you can reach 1.78x digital full scale between samples (verified by resampling to 10x the sample rate and checking the middle sample value). The "true" peak will be slightly higher still. I'll leave you to figure out which two samples you have to drag up!
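
The check described above (resample to 10x and look between the samples) can be sketched as follows. This is not how Cool Edit or ReplayGain does it; it is a minimal Hann-windowed sinc interpolator, and the names `windowed_sinc`, `intersample_peak`, the 32-sample half width and the test signal are all assumptions for illustration.

```python
import math

def windowed_sinc(u, half_width):
    """Hann-windowed sinc kernel, zero for |u| >= half_width."""
    if abs(u) >= half_width:
        return 0.0
    sinc = 1.0 if u == 0 else math.sin(math.pi * u) / (math.pi * u)
    window = 0.5 * (1.0 + math.cos(math.pi * u / half_width))
    return sinc * window

def intersample_peak(samples, oversample=10, half_width=32):
    """Estimate the reconstructed peak on a grid `oversample` times finer
    than the sample grid, skipping the edges where the kernel would run out."""
    peak = 0.0
    start = half_width * oversample
    stop = (len(samples) - half_width) * oversample
    for i in range(start, stop):
        t = i / oversample
        first = int(math.floor(t)) - half_width + 1
        value = sum(
            samples[k] * windowed_sinc(t - k, half_width)
            for k in range(first, first + 2 * half_width)
        )
        peak = max(peak, abs(value))
    return peak

# 11.025 kHz sine at 44.1 kHz, phased so that every sample sits at full scale.
samples = [1.0, 1.0, -1.0, -1.0] * 32
print(max(abs(s) for s in samples))  # sample peak
print(intersample_peak(samples))     # estimated peak between samples
```

With this test signal the sample peak is exactly full scale while the estimate between samples comes out close to 1.41, in line with the figure quoted for the sine wave case.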

Cheers,
David.

Reply #82
Code:
replaygain_track_gain = -10.99 dB
replaygain_track_peak = 1.647949


Transcoded from a Musepack --standard encode to AoTuV b2 -q 0.
"To understand me, you'll have to swallow a world." Or maybe your words.

Reply #83
That's different to the latest issue discussed in this thread, because that Peak value is based on actual samples, not inter-sample reconstructed peaks.

However, it illustrates Pio's earlier point very well!

Cheers,
David.

Reply #84
Quote
The peak sample value tells you _exactly_ how much headroom the material has - usually none!
I'm referring more to peak-to-average ratio here.  Headroom is a rather vague term that could mean anything, so I probably shouldn't have used it.

Quote
If you want a pure RMS measurement, then measure the RMS. It's got little to do with judging or matching loudness, so it's not part of ReplayGain. Sorry!
Yes, but current methods of calculating RMS are rather cumbersome.  You have to open up each individual file in a wave editor, run the analysis feature, write the RMS down, and do that over and over again for every track on an album.  Plus the RMS scanners in wave editors don't have the "intelligent" calculation factors that ReplayGain uses; they simply average all the samples in a selection (unless you tell the scanner to ignore everything under a certain level, which already adds work that shouldn't be necessary for the user).  Adding a non-contour feature to ReplayGain would give people a quick and easy way to measure RMS values.
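
For reference, the plain "average all the samples" RMS measurement described here is short to write down. The sketch below assumes samples already normalised to the range -1.0 to 1.0 and reports the level in dB relative to full scale; the function name `rms_dbfs` is made up, and none of ReplayGain's loudness filtering or statistical processing is applied.

```python
import math

def rms_dbfs(samples):
    """Unweighted RMS level of normalised samples, in dB relative to 1.0."""
    if not samples:
        raise ValueError("need at least one sample")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")  # digital silence
    return 10.0 * math.log10(mean_square)  # same as 20*log10(rms)

# One full cycle of a full-scale sine has an RMS level of about -3.01 dB.
sine = [math.sin(2.0 * math.pi * n / 100) for n in range(100)]
print(round(rms_dbfs(sine), 2))
```

Running this over every track of an album in a loop would give the batch measurement asked for, without the equal-loudness contour.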

Reply #85
Quote
Almost everyone is using a reference level of 89dB, rather than the 83dB in the original ReplayGain proposal. Unless there are any objections, I'll change the official reference level to 89dB.

(It's a pity I didn't stick with the original idea of storing the ReplayGain level in the file e.g. 92dB instead of -3dB, because then the reference level wouldn't matter. Too confusing to change back now I think)


Could this be solved by storing the reference level in the file as well as the replaygain?

Edit: Didn't realise I was such a late comer to this thread!
Dan

Reply #86
not a bad idea, e.g.

replaygain_reference_level=90dB

absence of the tag implies 89dB
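
A player could use such a tag roughly as sketched below. This is hypothetical: `replaygain_reference_level` is only the proposal from this thread, and the tag dictionary, function names and value formats ("90dB", "-3.00 dB") are assumptions. The arithmetic is just that the track's measured loudness is the reference level minus the stored gain, so the gain needed for another target is the stored gain plus the difference between target and reference.

```python
DEFAULT_REFERENCE_DB = 89.0  # assumed when the reference tag is absent

def _parse_db(text):
    """Parse values such as '-3.00 dB' or '90dB' into a float."""
    return float(text.strip().lower().replace("db", "").strip())

def gain_for_target(tags, target_db=89.0, gain_key="replaygain_track_gain"):
    """Gain in dB needed to play the track at target_db, taking the
    (hypothetical) reference level tag into account."""
    gain = _parse_db(tags[gain_key])
    ref_text = tags.get("replaygain_reference_level")
    reference = _parse_db(ref_text) if ref_text is not None else DEFAULT_REFERENCE_DB
    return gain + (target_db - reference)

# A file scanned against the original 83 dB reference, played at 89 dB.
old_file = {
    "replaygain_track_gain": "-3.00 dB",
    "replaygain_reference_level": "83dB",
}
print(gain_for_target(old_file))
```

Files without the tag are treated as 89 dB, so existing tags keep their current meaning.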

Reply #87
5. ReplayGain RealLife adjustment

The gain required to give the actual SPL of the original event (in a calibrated system), or a human judged sensible replay level (see the explanation behind the original "Audiophile" level and the work of Bob Katz if you think this is an impossible idea). I've found a few DVD-A discs that have this information (it's in the MLP stream), so it would be nice to have somewhere to store it. It's unlikely to get used much, but it would be a useful thing to have. It would be the last link in some of the best recordings out there.


Can we have this, please?

Maybe use %REPLAYGAIN_RWTRIM% as the tag.  If it exists, it would only affect album-gain playback, where it would be added to the album gain.  FWIW, I posted a feature request to FLAC about this.

Having my Boston Philharmonic play at the same loudness as Pantera is a bit weird when my calibrated home theater system has a listening level of -23dB.  The missing link to SPL-referred playback would add that next level of "shiny".