ReplayGain at high sample rates: currently ReplayGain only works up to 48 kHz
Ernst
post Oct 5 2010, 14:22
Post #1





Group: Members
Posts: 3
Joined: 5-October 10
Member No.: 84356



Hi,

I noticed that ReplayGain (as applied by metaflac) only works on tracks with a sample rate up to 48 kHz.
I'd like to apply it to all my audio files, and I now have some with 88.2, 96, or 192 kHz sample rates.
Looking into the code I see the following:
CODE
FLAC__bool grabbag__replaygain_is_valid_sample_frequency(unsigned sample_frequency)
{
    static const unsigned valid_sample_rates[] = {
        8000,
        11025,
        12000,
        16000,
        22050,
        24000,
        32000,
        44100,
        48000
    };
    static const unsigned n_valid_sample_rates = sizeof(valid_sample_rates) / sizeof(valid_sample_rates[0]);

    unsigned i;

    for(i = 0; i < n_valid_sample_rates; i++)
        if(sample_frequency == valid_sample_rates[i])
            return true;
    return false;
}


In other words, there is a limit, but it looks as if I could just add other values. However, 96 kHz would go over the unsigned 16-bit integer limit, so changing to 32-bit integers might be necessary. Would this cause trouble anywhere? I have yet to try, which I'll hopefully do later today.

Now the guesswork:
Were these rates chosen because they are commonly used?
Is this check at all necessary? Isn't it enough for the sampling rate to be a positive integer?
Is this only in the flac implementation or is this part of the proposed replaygain specification?
Is the reason for the lack of higher sampling rates that at the time calculation for them would have been very slow (in this case it would be better to put up a warning instead)?
Is this because the replaygain reference values are based on psychoacoustics and those are lacking for frequencies we don't hear but which would be present in the higher sampling rate signals?
[JAZ]
post Oct 5 2010, 18:51
Post #2





Group: Members
Posts: 1767
Joined: 24-June 02
From: Catalunya(Spain)
Member No.: 2383



unsigned means unsigned int, and unsigned int has been 32 bits since the Windows 95 era (even earlier in some compilers). Adding higher sampling rates won't be a problem here.
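
To make the range question concrete, here is a small sketch in Python (not part of metaflac) that checks which unsigned widths can hold the sample rates discussed in this thread. It also reports the width of the C unsigned int used by the compiler that built the running Python, via ctypes. The helper name fits_unsigned is made up for illustration.

```python
import ctypes

def fits_unsigned(value, bits):
    """Return True if value is representable as an unsigned integer of the given width."""
    return 0 <= value < (1 << bits)

# Width of C 'unsigned int' as seen by the C compiler that built this Python.
uint_bits = 8 * ctypes.sizeof(ctypes.c_uint)

for rate in (48000, 88200, 96000, 192000):
    print(rate,
          "| fits in 16 bits:", fits_unsigned(rate, 16),
          "| fits in unsigned int (%d bits):" % uint_bits,
          fits_unsigned(rate, uint_bits))
```

Only a platform where unsigned int really is 16 bits wide would be unable to hold 88200 and above; 48000 still fits in 16 unsigned bits.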

Next, the sample rates themselves: those are the typical audio sample rates.

44.1 kHz = CD, 22.05 kHz = half CD, 11.025 kHz = quarter CD (the CD value is related to analog video, but that's for another topic)
48 kHz = DAT, 24 kHz = half DAT, 12 kHz = quarter DAT.
32 kHz = FM?, 16 kHz = half FM, 8 kHz = analog telephone.

I haven't checked FLAC's ReplayGain code, but there's nothing in ReplayGain itself that is specific to sampling rates. What is clear is that ReplayGain is a psychoacoustic algorithm, so a given implementation may be written for specific sampling rates only. One such case is the design of a lowpass filter.
tuffy
post Oct 5 2010, 19:01
Post #3





Group: Members
Posts: 111
Joined: 20-August 07
Member No.: 46367



I think what's missing is a set of equal loudness filter coefficients for higher sampling rates. If someone could crank out a set of them, it shouldn't be hard to add.
DVDdoug
post Oct 5 2010, 19:09
Post #4





Group: Members
Posts: 2542
Joined: 24-August 07
From: Silicon Valley
Member No.: 46454



I'll make a couple of guesses & assumptions... :D I can't think of any reason why the code couldn't be revised to work with higher sample rates.

QUOTE
In other words, a limit, but it looks as if I could just add other values. However, 96khz would go over the unsigned 16-bit integer limit, so changing to 32-bit integers might be necessary.
I don't know what you're getting at here... There's no relationship between the bit-depth and the sample rate. I haven't studied the code, but most DSP software converts the raw audio data to 32-bit floating point and then converts back at the end.

QUOTE
Is this because the replaygain reference values are based on psychoacoustics and those are lacking for frequencies we don't hear but which would be present in the higher sampling rate signals?
I assume the psychoacoustic model will treat anything ultrasonic as inaudible. And the loudness-curve filters have to be adjusted for the sample rate in any case: since digital filters simply work on a sequence of numbers, the filter has to "know" the sample rate in order to work properly.
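
A quick way to see why the coefficients have to change with the sample rate: the sketch below takes one fixed coefficient for a simple one-pole lowpass (an illustrative filter, not one of the ReplayGain filters) and evaluates its gain at 1 kHz for several sample rates. The function name one_pole_gain and the value a = 0.9 are arbitrary choices for the example.

```python
import cmath
import math

def one_pole_gain(a, freq_hz, fs):
    """Magnitude response of y[n] = (1 - a) * x[n] + a * y[n-1] at freq_hz."""
    w = 2.0 * math.pi * freq_hz / fs   # normalized angular frequency (rad/sample)
    z_inv = cmath.exp(-1j * w)         # z^-1 evaluated on the unit circle
    return abs((1.0 - a) / (1.0 - a * z_inv))

a = 0.9  # same coefficient reused at every sample rate
for fs in (48000, 96000, 192000):
    print(fs, "Hz: gain at 1 kHz =", round(one_pole_gain(a, 1000.0, fs), 4))
```

The same numbers give a different gain at the same physical frequency for each rate, which is why a table designed for 48 kHz cannot simply be reused at 96 kHz.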
lvqcl
post Oct 5 2010, 19:19
Post #5





Group: Developer
Posts: 3341
Joined: 2-December 07
Member No.: 49183



QUOTE (tuffy @ Oct 5 2010, 22:01) *
I think what's missing is a set of equal loudness filter coefficients for higher sampling rates. If someone could crank out a set of them, it shouldn't be hard to add.


wvgain accepts input samplerates up to 192000. From wvgain.c:

CODE
// These are the filters used to calculate perceived loudness. The table data was copied
// from the Foobar2000 source code.
Yirkha
post Oct 5 2010, 21:48
Post #6





Group: FB2K Moderator
Posts: 2359
Joined: 30-November 07
Member No.: 49158



Some filter coefficients in ReplayGain need to be changed for different sample rates. The original implementations supported only the rates listed above, but this has since been amended in some programs. For example, Menno kindly provided additional values for fb2k's RG scanner in 2003; I don't know about the other implementations. Further, this topic might be interesting as it deals with a similar question.



--------------------
Full-quoting makes you scroll past the same junk over and over.
Ernst
post Oct 6 2010, 05:23
Post #7





Group: Members
Posts: 3
Joined: 5-October 10
Member No.: 84356



QUOTE (DVDdoug @ Oct 5 2010, 20:09) *
I'll make a couple of guesses & assumptions... :D I can't think of any reason why the code couldn't be revised to work with higher sample rates.

QUOTE
In other words, a limit, but it looks as if I could just add other values. However, 96khz would go over the unsigned 16-bit integer limit, so changing to 32-bit integers might be necessary.
I don't know what you're getting at here... There's no relationship between the bit-depth and the sample rate. I haven't studied the code, but most DSP software converts the raw audio data to 32-bit floating point and then converts back at the end.

In some compilers int still means 16-bit. So the number 96000 could not be represented.

QUOTE (lvqcl @ Oct 5 2010, 20:19) *
QUOTE (tuffy @ Oct 5 2010, 22:01) *
I think what's missing is a set of equal loudness filter coefficients for higher sampling rates. If someone could crank out a set of them, it shouldn't be hard to add.


wvgain accepts input samplerates up to 192000. From wvgain.c:

CODE
// These are the filters used to calculate perceived loudness. The table data was copied
// from the Foobar2000 source code.



Maybe I'll copy from there to metaflac to see what works.

QUOTE (Yirkha @ Oct 5 2010, 22:48) *
Some filter coefficients in ReplayGain need to be changed for different sample rates. The original implementations supported only the rates listed above, but this has been subsequently amended later in some programs. For example, Menno kindly provided additional values for fb2k's RG scanner in 2003, I don't know about the other implementations. Further, this topic might be interesting as it deals with a similar question.

Nice links.

Of course I have not yet got around to trying anything; probably soon.
saratoga
post Oct 6 2010, 19:01
Post #8





Group: Members
Posts: 4868
Joined: 2-September 02
Member No.: 3264



QUOTE (Ernst @ Oct 6 2010, 00:23) *
QUOTE (DVDdoug @ Oct 5 2010, 20:09) *
I'll make a couple of guesses & assumptions... :D I can't think of any reason why the code couldn't be revised to work with higher sample rates.

QUOTE
In other words, a limit, but it looks as if I could just add other values. However, 96khz would go over the unsigned 16-bit integer limit, so changing to 32-bit integers might be necessary.
I don't know what you're getting at here... There's no relationship between the bit-depth and the sample rate. I haven't studied the code, but most DSP software converts the raw audio data to 32-bit floating point and then converts back at the end.

In some compilers int still means 16-bit. So the number 96000 could not be represented.


In C, int size is generally thought of as CPU dependent rather than compiler dependent. So on a 16-bit machine (e.g. a 286) int might be 16 bits. However, metaflac most likely would not compile on such a device without being ported to a 16-bit architecture. It's rare for ordinary programs to function on machines with int < 32 bits.
Ernst
post Oct 24 2010, 10:13
Post #9





Group: Members
Posts: 3
Joined: 5-October 10
Member No.: 84356



I did get a few higher sampling rates set up in metaflac. I copied the values from mp3gain, adding 64, 88.2, and 96 kHz. Now I just need to convert the MATLAB program for calculating the filters to Octave, Scilab, or plain C/C++ to get other rates as well.
quietdragon
post Jan 8 2012, 08:02
Post #10





Group: Members
Posts: 1
Joined: 25-April 10
Member No.: 80153



QUOTE (Ernst @ Oct 5 2010, 05:22) *
I noticed that ReplayGain (as applied by metaflac) only works on tracks with a sample rate up to 48 kHz.

See http://lists.xiph.org/pipermail/flac-dev/2...ary/003064.html
Nessuno
post Jan 8 2012, 10:41
Post #11





Group: Members
Posts: 422
Joined: 16-December 10
From: Palermo
Member No.: 86562



QUOTE (saratoga @ Oct 6 2010, 19:01) *
QUOTE (Ernst @ Oct 6 2010, 00:23) *

In some compilers int still means 16-bit. So the number 96000 could not be represented.

In C, int size is generally thought of as CPU dependent rather than compiler dependent. So on a 16-bit machine (e.g. a 286) int might be 16 bits. However, metaflac most likely would not compile on such a device without being ported to a 16-bit architecture. It's rare for ordinary programs to function on machines with int < 32 bits.


Pedantically speaking, and if my memory does not fail me, the size of int is not fixed by the ANSI C standard and is thus implementation dependent: a compiler could have a 32-bit int even on a 16-bit architecture, and vice versa. So if you have to rely on a fixed-width type and still write portable code, you must test sizeof(int) and define an alias for long or short accordingly to use in your code.

I don't know if and how this could apply to Replaygain code.


--------------------
... I live by long distance.
saratoga
post Jan 8 2012, 11:04
Post #12





Group: Members
Posts: 4868
Joined: 2-September 02
Member No.: 3264



QUOTE (Nessuno @ Jan 8 2012, 04:41) *
Pedantically speaking, and if my memory does not fail me, the size of int is not fixed by the ANSI C standard and is thus implementation dependent: a compiler could have a 32-bit int even on a 16-bit architecture, and vice versa.


To be pedantic, I didn't say that it was defined by the CPU. I said it was thought of as defined by the CPU, because it is. The reason the standard doesn't define the absolute widths of variables is to allow efficient programming on systems with different word sizes. So compilers are given the freedom to pick the optimal size for a given system. But the reason is to accommodate different CPUs, and since all modern CPUs are 32-bit or wider, you need not worry about a 16-bit compiler being installed on a contemporary computer.

QUOTE (Nessuno @ Jan 8 2012, 04:41) *
if you have to rely on a fixed width type and still write portable code, you must test for sizeof(int) and accordingly define an alias for long or short to use in your code.


For modern C variants, you'd probably use the C99 int32_t typedefs (from <stdint.h>).

QUOTE (Nessuno @ Jan 8 2012, 04:41) *
I don't know if and how this could apply to Replaygain code.


As I said last year, it does not apply at all to the code in question.
Nessuno
post Jan 8 2012, 11:19
Post #13





Group: Members
Posts: 422
Joined: 16-December 10
From: Palermo
Member No.: 86562



QUOTE (saratoga @ Jan 8 2012, 11:04) *
To be pedantic, I didn't say that it was defined by the CPU.


I didn't say you said... :)


--------------------
... I live by long distance.
Wombat
post Jan 8 2012, 21:08
Post #14





Group: Members
Posts: 987
Joined: 7-October 01
Member No.: 235



If I remember right, support for higher sampling rates in metaflac was requested a long time ago. Unfortunately I didn't see much action regarding the development of flac/metaflac.
Most likely Mr. Coalson has better things to do lately. The bug tracker at SourceForge doesn't show much activity either. I would even like to have a metaflac with the option to use the R128 standard as an alternative.
romor
post Jan 9 2012, 21:17
Post #15





Group: Members
Posts: 668
Joined: 16-January 09
Member No.: 65630



QUOTE (quietdragon @ Jan 8 2012, 09:02) *
QUOTE (Ernst @ Oct 5 2010, 05:22) *
I noticed that ReplayGain (as applied by metaflac) only works on tracks with a sample rate up to 48 kHz.

See http://lists.xiph.org/pipermail/flac-dev/2...ary/003064.html

From that data it seems obvious that the Butterworth coefficients are linearly correlated, with a slight deviation at the 18900 set. So those can be obtained for an arbitrary rate.
Only the Yulewalk ones are not so obvious just by looking at the data.
Perhaps with some analytical skills? ;)
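
Rather than fitting a line through the table, the Butterworth stage can also be designed directly for any rate. The sketch below is a textbook second-order Butterworth high-pass via the bilinear transform (Q = 1/sqrt(2)), not code taken from any ReplayGain implementation. The 150 Hz cutoff is an assumption based on my recollection of the ReplayGain proposal, and the function names are made up, so the values may not match the published tables digit for digit.

```python
import cmath
import math

def butter_highpass_biquad(cutoff_hz, fs):
    """Second-order Butterworth high-pass (bilinear transform, prewarped).

    Returns (b, a) coefficient tuples with a[0] normalized to 1.
    """
    w0 = 2.0 * math.pi * cutoff_hz / fs
    cos_w0 = math.cos(w0)
    q = 1.0 / math.sqrt(2.0)              # Butterworth quality factor
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    k = (1.0 + cos_w0) / a0
    b = (k / 2.0, -k, k / 2.0)
    a = (1.0, -2.0 * cos_w0 / a0, (1.0 - alpha) / a0)
    return b, a

def biquad_gain(b, a, freq_hz, fs):
    """Magnitude response of the biquad at freq_hz."""
    z1 = cmath.exp(-2j * math.pi * freq_hz / fs)   # z^-1 on the unit circle
    z2 = z1 * z1
    return abs((b[0] + b[1] * z1 + b[2] * z2) / (a[0] + a[1] * z1 + a[2] * z2))

CUTOFF_HZ = 150.0  # assumed cutoff, see the note above
for fs in (44100, 48000, 88200, 96000, 176400, 192000):
    b, a = butter_highpass_biquad(CUTOFF_HZ, fs)
    print(fs, [round(x, 8) for x in b], [round(x, 8) for x in a])
```

The Yulewalk stage has no such closed form; it is a least-squares fit to a target curve, so it still needs a proper design tool.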


--------------------
scripts: http://goo.gl/M1qVLQ
2Bdecided
post Jan 13 2012, 16:44
Post #16


ReplayGain developer


Group: Developer
Posts: 5066
Joined: 5-November 01
From: Yorkshire, UK
Member No.: 409



See...
http://lists.xiph.org/pipermail/flac-dev/2...ary/003067.html

I just sent Ernst the filter coefficients for 28kHz.

Cheers,
David.
romor
post Jan 14 2012, 16:53
Post #17





Group: Members
Posts: 668
Joined: 16-January 09
Member No.: 65630



Quote from David's link:
QUOTE
The extended gain analysis tables in Foobar2000 (and the derivatives that copied it) are wrong.

Here is how to show that it's wrong. Use <http://www.daniweb.com/software-development/python/code/263775> to create
a 1 kHz signal using a 48 kSamples/sec rate (48.wav) and a 192
kSamples/sec rate (192.wav). I modified the script to generate 2s of
samples.

Because the underlying signal is the same (1kHz with a fixed amplitude) the perceived loudness should be identical,
independent of the sampling rate.


Is it?



I used that messy script, which shows up in multiple places on Google, just to confirm that we are talking about the same thing, but that's not the way to treat Python. NumPy exists for a reason, even if someone's testing procedure is limited to only two audio files. It's just a shame to use bare Python for audio testing.

Assuming:
CODE
import numpy as np
from scipy.io import wavfile
from scipy.signal import chirp

Here are some one-liner functions to generate the data; just follow your imagination, it's fast and easy:
CODE
def np_tone(fs, freq=440, t=2): return np.array(16384*np.cos((2*np.pi*freq/fs)*np.arange(fs*t)), dtype=np.int16)

def np_rand(fs, t=2): return np.array(16384*np.random.random_sample(fs*t), dtype=np.int16)

def np_chirp(fs, t=2): return np.array(16384*chirp(np.linspace(0, t, fs*t), fs/2, .002, t), dtype=np.int16)

Then create, for example, a 1 kHz tone in stereo (dual mono):
CODE
data = np_tone(44100, freq=1000)
wavfile.write('44100.wav', 44100, np.column_stack((data, data)))
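
As a sanity check on the premise in the quoted mail (same tone, same level, whatever the rate), here is a stdlib-only sketch, so it runs even without NumPy installed. It computes the plain RMS level of a 1 kHz cosine at two sample rates. Plain RMS is not the ReplayGain loudness measure (there is no equal-loudness filtering here), and the function name tone_rms is made up for the example.

```python
import math

def tone_rms(fs, freq=1000.0, seconds=2, amplitude=16384):
    """RMS of amplitude * cos(2*pi*freq*t) sampled at fs for the given duration."""
    n = int(fs * seconds)
    total = 0.0
    for i in range(n):
        s = amplitude * math.cos(2.0 * math.pi * freq * i / fs)
        total += s * s
    return math.sqrt(total / n)

for fs in (48000, 192000):
    level_db = 20.0 * math.log10(tone_rms(fs) / 32768.0)
    print(fs, "Hz:", round(level_db, 3), "dBFS")
```

Both rates should report the same level, so any difference a ReplayGain scanner shows between such files would have to come from its filter tables rather than from the signal.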


BTW, for some reason RG does not work at 176400 Hz, either with wvgain or with another tool using the same table: http://goo.gl/JwxaT. Perhaps the coefficients are bad for that rate.


--------------------
scripts: http://goo.gl/M1qVLQ