Improving ReplayGain, some ideas for Devs etc
Xenno
post Jul 19 2004, 23:01
Post #76





Group: Members
Posts: 393
Joined: 23-July 02
From: Blue Grass, IA
Member No.: 2760



2B > The peak sample value tells you _exactly_ how much headroom the material has - usually none!

...and I am at a loss on how RG actually does this. If I take a 20 kHz sine wave (or whatever that will yield 2 sample points per cycle) and encode at 44.1 kHz, there is no guarantee that the sample points will fall on the amplitude maximums (unless phase locked). They could fall on the x-axis crossing nodes...or anywhere on the waveform up to the peaks. Given my example above, is RG actually reconstructing the wave to determine the peak value...or is it using the data in the file?

xen-uno


--------------------
No one can be told what Ogg Vorbis is...you have to hear it for yourself
- Morpheus
SamK
post Jul 20 2004, 00:52
Post #77





Group: Members
Posts: 57
Joined: 4-January 04
Member No.: 10938



QUOTE (Xenno @ Jul 19 2004, 11:01 PM)
2B > The peak sample value tells you _exactly_ how much headroom the material has - usually none!

...and I am at a loss on how RG actually does this. If I take a 20 kHz sine wave (or whatever that will yield 2 sample points per cycle)


In fact, the Shannon theorem requires 2+epsilon sample points per cycle for real-valued samples, if you look at it closely enough.
(i.e., notice that cos(Wt) = 0.5*(exp(iWt) + exp(-iWt)), so its spectrum is not contained in [-W,+W[ )

QUOTE
and encode at 44.1 kHz, there is no guarantee that the sample points will fall on the amplitude maximums (unless phase locked). They could fall on the x-axis crossing nodes...or anywhere on the waveform up to the peaks.


On a long enough sequence of such a sine wave, some of the sample points will fall very close to the maxima of the continuous wave.
By a quick estimate, 4000 samples are enough to ensure that the discrete peak lies within 100/(4000^2) percent of the continuous wave's real peak for high frequencies up to 22.05/(1+1/4000) = 22.044 kHz.
(I'm using 1-x^2/2 as an estimate of the sine wave near its maxima.)

Even with only 10 samples, you get 1% peak precision for high-frequency sines up to 20.04 kHz.
(i.e., from 2.2 kHz to 20.04 kHz; low frequencies are of no interest here, since their discrete and continuous maxima barely differ)

Conclusion: for a sine wave, you don't really have to worry about the difference between the discrete peak and the underlying continuous peak.
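
The estimate above can be checked numerically. The following is only a rough sketch: the function name discrete_peak and the chosen test frequencies are illustrative assumptions, not anything ReplayGain itself does. It samples a unit-amplitude sine at 44.1 kHz and reports how far the largest sample falls short of the true peak of 1.0.

```python
import math

def discrete_peak(freq_hz, n_samples, sample_rate=44100.0, phase=0.0):
    """Largest |sample| of a unit-amplitude sine sampled n_samples times."""
    w = 2.0 * math.pi * freq_hz / sample_rate  # radians advanced per sample
    return max(abs(math.sin(w * n + phase)) for n in range(n_samples))

# Compare a short and a long run of samples at a few frequencies.
for f in (2200.0, 10000.0, 20000.0):
    for n in (10, 4000):
        shortfall = 100.0 * (1.0 - discrete_peak(f, n))
        print(f"{f:7.0f} Hz, {n:4d} samples: peak is {shortfall:.4f}% below 1.0")
```

A frequency that is an exact submultiple of the sample rate is a special case, because the samples keep landing on the same few phases however long the sequence is. For example, discrete_peak(11025.0, 100, phase=math.pi / 4) stays near 0.707.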

QUOTE
Given my example above, is RG actually reconstructing the wave to determine the peak value...or is it using the data in the file?


my opinion is it doesn't matter, even slightly, though I only made my point with sine waves and not the general case of just any sampled sound.

This post has been edited by SamK: Jul 20 2004, 00:58
SamK
post Jul 20 2004, 00:55
Post #78





Group: Members
Posts: 57
Joined: 4-January 04
Member No.: 10938



QUOTE (SamK @ Jul 20 2004, 12:52 AM)
my opinion is it doesn't matter, even slightly, though  I only made my point with sine waves and not the general case of just any sampled sound.


On top of that, you can consider that it's not really clipping as long as the digital signal is preserved.
Then, if the DAC chops off the true peaks of the analog signal because of that kind of issue, I'd say it's the DAC's fault.
2Bdecided
post Jul 20 2004, 13:07
Post #79


ReplayGain developer


Group: Developer
Posts: 5176
Joined: 5-November 01
From: Yorkshire, UK
Member No.: 409



The RG peak value is the largest absolute value within the digital data.
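
As a minimal sketch of that definition (the function name sample_peak and the float convention, with 1.0 as digital full scale, are assumptions for illustration and not the actual ReplayGain source):

```python
def sample_peak(samples):
    """Largest absolute sample value; 1.0 represents digital full scale."""
    return max((abs(s) for s in samples), default=0.0)

# A tiny made-up block of decoded samples.
track = [0.10, -0.82, 0.47, -0.05]
print(sample_peak(track))
```

Only the stored sample values are inspected here; nothing between the samples is reconstructed.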

I'm aware that the true reconstructed peak can be between sample values (and oversampling DACs will often clip this), but isn't the ReplayGain calculation slow enough already - without taking this into account? ;)

It's worth remembering that the people who master squashed CDs don't take this into account either.


The reason I didn't worry about this with ReplayGain is because the highest reconstructed inter-sample value you can contrive is around 1.5x digital full scale. As ReplayGain drops most over-compressed tracks by 6-12dB, you've got more than enough headroom.
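
The headroom arithmetic behind that statement can be sketched as follows. The helper name ratio_to_db is hypothetical; the 1.5x and 6-12dB figures are the ones quoted above.

```python
import math

def ratio_to_db(ratio):
    """Convert an amplitude ratio to decibels."""
    return 20.0 * math.log10(ratio)

# An inter-sample peak of 1.5x full scale is about 3.5 dB over.
overshoot_db = ratio_to_db(1.5)

for rg_drop_db in (6.0, 12.0):
    margin_db = rg_drop_db - overshoot_db
    print(f"gain drop {rg_drop_db:4.1f} dB leaves {margin_db:.2f} dB of margin")
```

So even the smaller 6dB reduction leaves a couple of dB spare above a 1.5x inter-sample peak.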

I suppose you could store an "analogue" peak value, and use this for clipping prevention. That's a nice project, if anyone wants it!

However, ReplayGain will keep most music away from clipping. If you don't use ReplayGain, simply dropping the gain by 3-6dB will keep everything away from clipping. What's more, the existing peak value is more than good enough in most cases, and leaving an extra fraction of a dB headroom will make it fine in all but contrived cases.

You've got to wonder: if someone puts a signal onto a CD where the analogue peak is at digital full scale plus 50%, maybe the intention is to make the DAC in your CD player clip? If so, what's the point in de-clipping it?

Cheers,
David.

This post has been edited by 2Bdecided: Jul 20 2004, 13:08
SamK
post Jul 20 2004, 14:08
Post #80





Group: Members
Posts: 57
Joined: 4-January 04
Member No.: 10938



QUOTE (2Bdecided @ Jul 20 2004, 01:07 PM)
The RG peak value is the largest absolute value within the digital data.

I'm aware that the true reconstructed peak can be between sample values (and oversampling DACs will often clip this)


Are there any DACs that can reconstruct the analog signal with its full peak above full scale?
I guess DACs can behave very differently on such digital signals.

QUOTE
The reason I didn't worry about this with ReplayGain is because the highest reconstructed inter-sample value you can contrive is around 1.5x digital full scale.


Wow, 1.5x is much more than is possible with sine waves.
Can you tell me how to make such a signal?
Or do you get this value from mathematically bounding the reconstructed signal formula?
SamK
post Jul 20 2004, 14:19
Post #81





Group: Members
Posts: 57
Joined: 4-January 04
Member No.: 10938



QUOTE (2Bdecided @ Jul 20 2004, 01:07 PM)
isn't the ReplayGain calculation slow enough already - without taking this into account? ;)


It might be possible to take it into account without much more computation.
From the DCT transform, you can bound the analog peak by adding the moduli of the DCT coefficients.
I don't know how imprecise that can be on real music signals, but my guess is it shouldn't be too bad.
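
A rough sketch of the idea, with two loud assumptions: it uses a plain DFT instead of the DCT mentioned above, and the bound it gives applies to the periodic band-limited interpolation of one block, not to what any particular DAC outputs. The function name dft_peak_bound is hypothetical, and the naive O(N^2) transform is for clarity only.

```python
import cmath
import math

def dft_peak_bound(block):
    """Upper bound on the periodic band-limited interpolant of block:
    the sum of the DFT coefficient moduli divided by the block length."""
    n = len(block)
    total = 0.0
    for k in range(n):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * m / n)
                    for m, x in enumerate(block))
        total += abs(coeff)
    return total / n

# Quarter-sample-rate sine, 45 degrees off the sample grid, scaled so that
# every sample sits exactly at full scale.
block = [1.0, 1.0, -1.0, -1.0] * 4
print(max(abs(x) for x in block), dft_peak_bound(block))
```

For this single-sine block the bound comes out at about 1.414 while the sample peak is 1.0. On broadband material the triangle inequality is pessimistic, so the bound may sit well above the real inter-sample peak.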
2Bdecided
post Jul 20 2004, 14:50
Post #82


ReplayGain developer


Group: Developer
Posts: 5176
Joined: 5-November 01
From: Yorkshire, UK
Member No.: 409



QUOTE (SamK @ Jul 20 2004, 01:08 PM)
Wow, 1.5x is much more than possible with sine waves..
can you tell how to make such a signal ?
or do you get this value from mathematically bounding the reconstructed signal formula ?


OK, it was 1.41, but it's possible with an 11.025kHz sine wave (44.1kHz sampling)...

http://www.hydrogenaudio.org/forums/index....ype=post&id=818

I'd imagine that's the maximum you can get from a sine wave, but if you drag samples around in Cool Edit you can get bigger peaks between samples. If you drag 2 more samples high in the above example, you can reach 1.78x digital full scale between samples (verified by resampling to 10x the sample rate and checking the middle sample value). The "true" peak will be slightly higher still. I'll leave you to figure out which two samples you have to drag up!
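
The 1.41 figure for the sine case can be reproduced with a truncated sinc sum. This is only a sketch under assumptions: the helper names are made up, the block length of 4000 samples is arbitrary, and the truncation leaves a small error, so the result is approximate. It does not attempt the 1.78x dragged-sample case.

```python
import math

def sinc(x):
    """Normalised sinc: sin(pi*x) / (pi*x), with sinc(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)

def reconstruct(samples, t):
    """Band-limited value at fractional sample position t (truncated sinc sum)."""
    return sum(s * sinc(t - n) for n, s in enumerate(samples))

# 11.025 kHz sine at 44.1 kHz sampling, 45 degrees off the sample grid and
# scaled so that every sample sits exactly at digital full scale.
samples = [1.0, 1.0, -1.0, -1.0] * 1000

mid = len(samples) // 2                      # far from the block edges
grid = [mid + i / 10.0 for i in range(11)]   # 10x oversampling between two samples
true_peak = max(abs(reconstruct(samples, t)) for t in grid)

print("sample peak:", max(abs(s) for s in samples))
print("inter-sample peak: %.3f" % true_peak)
```

The largest value lands midway between two equal samples, close to the square root of 2.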

Cheers,
David.
dev0
post Jul 20 2004, 14:59
Post #83





Group: Developer
Posts: 1679
Joined: 23-December 01
From: Germany
Member No.: 731



CODE
replaygain_track_gain = -10.99 dB
replaygain_track_peak = 1.647949


Transcoded from a Musepack --standard encode to AoTuV b2 -q 0.
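
As an illustration of how a player might use these two values for clipping prevention (a sketch only; the helper names are hypothetical and this is not taken from any particular player):

```python
import math

def db_to_ratio(db):
    """Convert a gain in decibels to an amplitude ratio."""
    return 10.0 ** (db / 20.0)

def peak_after_gain(peak, gain_db):
    """Sample peak once the gain has been applied."""
    return peak * db_to_ratio(gain_db)

def max_safe_gain_db(peak):
    """Largest gain that keeps the sample peak at or below full scale."""
    return -20.0 * math.log10(peak)

track_gain_db = -10.99
track_peak = 1.647949

print("peak after gain: %.3f" % peak_after_gain(track_peak, track_gain_db))
print("max gain without clipping: %.2f dB" % max_safe_gain_db(track_peak))
```

With the track gain applied, the peak drops well below 1.0, so this file would only clip if played back without the gain adjustment.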


--------------------
"To understand me, you'll have to swallow a world." Or maybe your words.
2Bdecided
post Jul 20 2004, 15:13
Post #84


ReplayGain developer


Group: Developer
Posts: 5176
Joined: 5-November 01
From: Yorkshire, UK
Member No.: 409



That's different to the latest issue discussed in this thread, because that Peak value is based on actual samples, not inter-sample reconstructed peaks.

However, it illustrates Pio's earlier point very well!

Cheers,
David.
Kuuenbu
post Jul 26 2004, 02:18
Post #85





Group: Members
Posts: 65
Joined: 19-July 03
Member No.: 7864



QUOTE
The peak sample value tells you _exactly_ how much headroom the material has - usually none!
I'm referring more to peak-to-average ratio here. Headroom is a rather vague term that could mean anything, so I probably shouldn't have used it.

QUOTE
If you want a pure RMS measurement, then measure the RMS. It's got little to do with judging or matching loudness, so it's not part of ReplayGain. Sorry!
Yes, but current methods of calculating RMS are rather cumbersome. You have to open up each individual file in a wave editor, run the analysis feature, write the RMS down, and do that over and over again for every track on an album. Plus the RMS scanners in wave editors don't have the "intelligent" calculation factors that ReplayGain uses; they simply average all the samples in a selection (unless you tell the scanner to ignore everything under a certain level, which already adds work that shouldn't be necessary for the user). Adding a non-contour feature to ReplayGain would give people a quick and easy way to measure RMS values.
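
For reference, the plain unweighted RMS being asked for here amounts to something like the sketch below. The function names and the float convention (1.0 as full scale) are assumptions; there is no equal-loudness filtering and no level gating, so it is not the ReplayGain loudness calculation.

```python
import math

def rms(samples):
    """Root mean square of a block of samples (1.0 = digital full scale)."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def rms_dbfs(samples):
    """RMS level in dB relative to full scale; silence gives -infinity."""
    level = rms(samples)
    return 20.0 * math.log10(level) if level > 0.0 else float("-inf")

# One full cycle of a full-scale sine, 441 samples long.
sine = [math.sin(2.0 * math.pi * n / 441.0) for n in range(441)]
print("%.2f dBFS" % rms_dbfs(sine))
```

A full-scale sine measures about 3 dB below full scale this way, since its RMS is the peak divided by the square root of 2.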
danbee
post Oct 27 2004, 17:54
Post #86





Group: Members
Posts: 225
Joined: 19-February 02
From: plymouth, uk
Member No.: 1355



QUOTE (2Bdecided @ Nov 18 2003, 04:35 PM)
Almost everyone is using a reference level of 89dB, rather than the 83dB in the original ReplayGain proposal. Unless there are any objections, I'll change the official reference level to 89dB.

(It's a pity I didn't stick with the original idea of storing the ReplayGain level in the file e.g. 92dB instead of -3dB, because then the reference level wouldn't matter. Too confusing to change back now I think)


Could this be solved by storing the reference level in the file as well as the replaygain?

Edit: Didn't realise I was such a late comer to this thread!

This post has been edited by danbee: Oct 27 2004, 17:56


--------------------
:: danbee :: pixelhum.com ::
jcoalson
post Oct 27 2004, 18:23
Post #87


FLAC Developer


Group: Developer
Posts: 1526
Joined: 27-February 02
Member No.: 1408



not a bad idea, e.g.

replaygain_reference_level=90dB

absence of the tag implies 89dB
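
A sketch of how a player could read such a tag, assuming the tag name proposed above and a plain dictionary of tag strings (the function names and the parsing rules are illustrative assumptions, not an agreed format):

```python
DEFAULT_REFERENCE_DB = 89.0  # implied when the reference tag is absent

def parse_db(text):
    """Turn strings such as '90dB' or '-3.00 dB' into a float."""
    value = text.strip().lower()
    if value.endswith("db"):
        value = value[:-2]
    return float(value.strip())

def gain_for_target(tags, target_db=DEFAULT_REFERENCE_DB):
    """Track gain re-expressed for the playback reference level target_db."""
    if "replaygain_reference_level" in tags:
        reference_db = parse_db(tags["replaygain_reference_level"])
    else:
        reference_db = DEFAULT_REFERENCE_DB
    gain_db = parse_db(tags["replaygain_track_gain"])
    return gain_db + (target_db - reference_db)

# File scanned against a 90 dB reference, played back at the 89 dB default.
tags = {"replaygain_track_gain": "-2.00 dB", "replaygain_reference_level": "90dB"}
print(gain_for_target(tags))

# No reference tag: 89 dB is assumed, and an 83 dB target lowers the gain by 6 dB.
print(gain_for_target({"replaygain_track_gain": "-3.00 dB"}, target_db=83.0))
```

Storing the reference next to the gain lets the player recover the measured level (reference minus gain) whatever reference the scanner used.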