Sample rate conversion

Hello everyone.

I have a question regarding sample rate conversion algorithms.
I don't know how they work, but I guess you guys are more familiar with this.

Let's say I have a 48 kHz wave file that I want to use in something I have to mix
with some 96 kHz material at certain places. This requires the 48 kHz file to be resampled
to 96 kHz and, in the end, converted back to 48 kHz.

So in short: does resampling 48 --> 96 --> 48 change the original or can the original
be restored 100% ?

I'm using Wavelab for this.

Reply #1
Quote
So in short: does resampling 48 --> 96 --> 48 change the original or can the original
be restored 100% ?

There's going to be some rounding error. It's probably better to just do everything at 48k. That said, with a very good resampler, there will be no meaningful difference. I have no idea whether Wavelab is any good, although this page suggests that at least one of the Wavelab 6 resamplers is pretty good:

http://src.infinitewave.ca/

Reply #2
Quote
Hello everyone.

I have a question regarding sample rate conversion algorightms.
I don't know how they work, but I guess you guys are more familiar with this 

Let's say I have a 48 kHz wave file that I want to use in something I have to mix
with some 96 kHz material at certain places. This requires the 48 kHz file be resampled
and in the end back to 48.

So in short: does resampling 48 --> 96 --> 48 change the original or can the original
be restored 100% ?

I'm using Wavelab for this.


Can't you just stash a copy of the 48k original someplace out of the way?

Reply #3
If you are just mixing, then you ought to be able to get the same results (to within -144 dB or better) by resampling the 96k content down to 48k and mixing at 48k. The reason is that mixing is a purely linear operation.

48k->96k, being a 2x oversample, is among the most numerically conservative resampling possibilities. If the resampler is correctly implemented, 50% of all the samples should be numerically exact, with zero quantization error.
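
The linearity argument can be checked numerically. The following is a minimal Python sketch, not any particular resampler: `decimate_by_2` and its three-tap filter are made up for illustration, and the two tones are arbitrary test signals. It shows that downsampling a mix gives the same samples as mixing the downsampled signals, up to floating-point rounding.

```python
import math

def decimate_by_2(signal, taps):
    """FIR low-pass `signal` with `taps`, then keep every second sample."""
    out = []
    for n in range(0, len(signal), 2):
        acc = 0.0
        for k, c in enumerate(taps):
            idx = n - k
            if 0 <= idx < len(signal):
                acc += c * signal[idx]
        out.append(acc)
    return out

# Illustrative taps only (a crude low-pass); a real resampler uses a far longer filter.
taps = [0.25, 0.5, 0.25]

# Two made-up 96 kHz signals: a 1 kHz tone and a quieter 3 kHz tone.
fs = 96000
a = [math.sin(2 * math.pi * 1000 * n / fs) for n in range(960)]
b = [0.5 * math.sin(2 * math.pi * 3000 * n / fs) for n in range(960)]

mix_then_down = decimate_by_2([p + q for p, q in zip(a, b)], taps)
down_then_mix = [p + q for p, q in zip(decimate_by_2(a, taps), decimate_by_2(b, taps))]

max_diff = max(abs(p - q) for p, q in zip(mix_then_down, down_then_mix))
print(max_diff)  # expected to be on the order of floating-point rounding error
```

The same reasoning holds for any linear, time-invariant resampling filter, which is why the order of mixing and resampling should not matter beyond rounding.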

Reply #4
Quote
If the resampler is correctly implemented, 50% of all the samples should be numerically exact, with zero quantization error.
That's an interesting use of the word "correctly" - a 2x resampler can easily be (and often is) perfect in terms of frequency and phase response, while giving 100% "new" samples.

You can easily design it to do as you suggest, but that's not necessarily the way all are designed.

Cheers,
David.

Reply #5
Quote
You can easily design it to do as you suggest, but that's not necessarily the way all are designed.

AFAIK all resamplers apply another lowpass on upsampling, be it 2x or some odd ratio such as 44.1 to 96 kHz. Once this lowpass is applied, any relation to a pattern in the source is gone. Otherwise we'd have an aliased mirror image above the source's maximum frequency, wouldn't we? And if not, what software does that insertion of zeros correctly?
Is troll-adiposity coming from feederism?
With 24bit music you can listen to silence much louder!

Reply #6
Given 10 seconds of 440Hz tone and upsampling from 48k to 96k, sox preserves the input samples in the output for all but the first and last 40-or-so samples. Of course, the test becomes more 'difficult' if the bit-depth or the tone frequency is increased.

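Code:
# 10 s test tone, 16-bit (the 440 Hz, 48 kHz tone described above).
sox -b 16 -n 1.wav synth 10
# Upsample to 96 kHz; -D disables dithering.
sox -D 1.wav 2.wav rate 96k
# Read the 96 kHz mono data as 48 kHz stereo and keep channel 1, i.e. every other sample.
sox -c 2 -r 48k 2.wav 3.wav remix 1
# Count the bytes that differ between the original and the extracted samples.
cmp -l 1.wav 3.wav | wc -l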


Reply #7
Quote
Given 10 seconds of 440Hz tone and upsampling from 48k to 96k, sox preserves the input samples in the output for all but the first and last 40-or-so samples. Of course, the test becomes more 'difficult' if the bit-depth or the tone frequency is increased.

Code:
sox -b 16 -n 1.wav synth 10
sox -D 1.wav 2.wav rate 96k
sox -c 2 -r 48k 2.wav 3.wav remix 1
cmp -l 1.wav 3.wav | wc -l


This must already be lowpassed, even though I don't know what -D does. You can have a look at what sox does without a lowpass:
http://www.hydrogenaudio.org/forums/index....st&p=675545

The resulting 2.wav looks lowpassed at 22-25 kHz, doesn't it?



Reply #8
Quote
This must be lowpassed already even if i don´t know what -D does.
-D disables dithering (which would otherwise mess up the test). The only filter involved is that of the resampler (in the second line), applied after zero stuffing: a lowpass at just below the original Nyquist (i.e. the standard method).

Quote
You may have a look what sox does without lowpass.

In fact, that shows a different resampler; with sox the resampling filter is not optional.
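
A rough Python sketch of the standard method mentioned in this reply: zero stuffing followed by a lowpass just below the original Nyquist. This is not sox's actual filter. The Hann-windowed sinc, its length (`half_len=64`) and its cutoff (`cutoff=0.24` of the new rate, roughly 23 kHz) are assumptions chosen for illustration, as are all function names.

```python
import math

def zero_stuff(x):
    """Insert one zero after every input sample (2x rate, no filtering yet)."""
    out = []
    for v in x:
        out.extend([v, 0.0])
    return out

def lowpass_taps(half_len, cutoff):
    """Hann-windowed sinc low-pass, unity DC gain; `cutoff` is a fraction of the new rate."""
    taps = []
    for n in range(-half_len, half_len + 1):
        t = 2.0 * cutoff * n
        sinc = 1.0 if n == 0 else math.sin(math.pi * t) / (math.pi * t)
        window = 0.5 * (1.0 + math.cos(math.pi * n / half_len))
        taps.append(2.0 * cutoff * sinc * window)
    return taps

def fir(x, taps):
    """Zero-phase FIR convolution (output stays aligned with the input)."""
    half = len(taps) // 2
    out = []
    for n in range(len(x)):
        acc = 0.0
        for k, c in enumerate(taps):
            idx = n + half - k
            if 0 <= idx < len(x):
                acc += c * x[idx]
        out.append(acc)
    return out

def upsample_2x(x, half_len=64, cutoff=0.24):
    # A gain of 2 makes up for the level lost to the inserted zeros.
    taps = [2.0 * c for c in lowpass_taps(half_len, cutoff)]
    return fir(zero_stuff(x), taps)

# 1 kHz tone at 48 kHz, upsampled to 96 kHz and compared with the ideal 96 kHz tone.
x = [math.sin(2 * math.pi * 1000 * n / 48000) for n in range(480)]
y = upsample_2x(x)
expected = [math.sin(2 * math.pi * 1000 * n / 96000) for n in range(len(y))]
# Compare away from the edges, where the filter has run off the data.
err = max(abs(y[n] - expected[n]) for n in range(200, len(y) - 200))
print(len(x), len(y), err)
```

With the cutoff placed below a quarter of the new rate, this particular filter is not half-band, so it gives no guarantee that the original samples reappear unchanged in the output.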

Reply #9
Quote
That's an interesting use of the word "correctly" - a 2x resampler can easily be (and often is) perfect in terms of frequency and phase response, while giving 100% "new" samples.

You can easily design it to do as you suggest, but that's not necessarily the way all are designed.


Feh. Yes, you are correct. Thanks for the catch.

I think that if I restrict my statement to the domain of windowed sinc filters, it's accurate. But any filter possessing an asymmetric response around the -6 dB point at Fs/2 is categorically not in that domain. While asymmetric filters are 2x more complex to implement, obviously they exist, particularly in software implementations where the symmetric optimization may not get performed. (Offhand, I can't recall any specific instance of such a filter, but I am quite sure they exist.)

... Right?

Reply #10
Quote
I think that if I restrict my statement to the domain of windowed sinc filters, it's accurate. But any filter possessing an asymmetric response around the -6db point at Fs/2 is categorically not in that domain.
I have a question that's marginally off-topic, but this post triggered all my keyword detectors. Windowed sinc is symmetric. What would similar functions be in the asymmetric case? The symmetric, acausal nature of sinc has always bugged me, but I've never found a reference containing asymmetric, causal analogues that I could use in SRC contexts.

Reply #11
I admit I have no clue about some of the things you are talking about, but to summarize my findings: no one has yet convinced me that there is upsampling without lowpassing, and lowpassing changes EVERY single musical bit in the process.

At http://en.wikipedia.org/wiki/Upsampling they describe the two steps of the implementation:
1. Add zeros between each sample -> to my understanding all resamplers do this
2. Filter with a low-pass filter which, theoretically, should be the sinc filter -> which no resampler does because, also from Wikipedia:

"The second step calls for the use of a perfect low-pass filter, which is not implementable"

Reply #12
Quote
I admit i have no clue about some things you talk but to summarize my findings i still don't see anyone convincing me there is upsampling without lowpassing, so changing EVERY single musical bit in the process.
Sample-and-hold, i.e. "nearest neighbour", i.e. out[2*x]=in[x]; out[2*x+1]=in[x], is trivial upsampling without lowpassing, though it's going to generate a lot of high-frequency aliasing. Solution? Low-pass the aliasing out.

I suspect that's why step 2 exists in your algorithm. Step 1 will resample (even accurately), but the issue becomes high-frequency aliasing, i.e. putting high-frequency content in the signal that was not there originally.
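
To make the sample-and-hold case concrete, here is a small Python sketch. The helper names and the 10 kHz test tone are assumptions for illustration. It repeats every sample and then measures both the original tone and the high-frequency content this reply calls aliasing (often called an image), which shows up mirrored around the old Nyquist frequency of 24 kHz.

```python
import math

def sample_and_hold_2x(x):
    """Repeat every sample: out[2*i] = out[2*i+1] = x[i]."""
    out = []
    for v in x:
        out.extend([v, v])
    return out

def tone_level(signal, freq, fs):
    """Amplitude of the component at `freq`, measured with a single-bin DFT."""
    re = sum(s * math.cos(2 * math.pi * freq * n / fs) for n, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * freq * n / fs) for n, s in enumerate(signal))
    return 2.0 * math.hypot(re, im) / len(signal)

# 10 kHz tone sampled at 48 kHz; 480 samples hold exactly 100 cycles.
x = [math.sin(2 * math.pi * 10000 * n / 48000) for n in range(480)]
y = sample_and_hold_2x(x)

wanted = tone_level(y, 10000, 96000)  # the original tone
image = tone_level(y, 38000, 96000)   # its mirror around 24 kHz (48 kHz - 10 kHz)
print(wanted, image)
```

The 38 kHz component is far from negligible, which is the content a subsequent lowpass has to remove.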

Reply #13
Quote
Sample-and-hold, i.e. "nearest neighbour", i.e. out[2*x]=in[x]; out[2*x+1]=in[x], is trivial upsampling without lowpassing, though it's gonna generate a lot of high-frequency aliasing. Solution? Low-pass the aliasing out.

Exactly, that is the aliasing I mentioned in post 6 above. We are running in circles.

Reply #14
Quote
I admit i have no clue about some things you talk but to summarize my findings i still don´t see anyone convincing me there is upsampling without lowpassing, so changing EVERY single musical bit in the process.

On http://en.wikipedia.org/wiki/Upsampling they talk about the 2 ways of implementing
1. Add zeros between each sample -> to my understanding all resamplers do
2. Filter with a low-pass filter which, theoretically, should be the sinc filter -> that no resampler does because, also from Wikipedia:

"The second step calls for the use of a perfect low-pass filter, which is not implementable"


Although you can never make a perfect sinc filter, it's possible to build ones very, very close to perfect, such that any difference is below quantization noise except for a tiny region right around the Nyquist limit, which of course won't include audio anyway because of the anti-alias filter on the original recording ADC.

Reply #15
Quote
Exactly, that aliasing i mentioend in post 6 above. We are running in circles
Then what you're looking for is sinc interpolation.

Edit: saratoga beat me, and was more informative to boot.

Reply #16
Quote
Although you can never make a perfect sinc filter, its possible to build ones very very close to perfect, such that any difference is below quantization noise except for a tiny region right around the Nyquist limit, which of course won't include audio anyway because of the anti-alias filter on the original recording ADC.


I am pretty sure we can get close, but some earlier posts in this thread suggest that 2x upsampling will leave the source material intact, which it does not. One can also argue that such ultra-close filters have huge amounts of pre-ringing, by the way.
It is not a lossless operation.
I see this in the context of some audiophile claims that upsampling 2x, for example from 44.1 to 88.2 kHz, sounds better than upsampling to 96 kHz because it only inserts zeros at every second sample, but that isn't so.

Reply #17
Quote
Although you can never make a perfect sinc filter, its possible to build ones very very close to perfect, such that any difference is below quantization noise except for a tiny region right around the Nyquist limit, which of course won't include audio anyway because of the anti-alias filter on the original recording ADC.


Quote
I am pretty sure we can get close but from reading some earlier posts in this thread it is suggested that 2x upsampling will leave the source material intact which it is not.


Obviously if the filter does not have unity transmittance at every frequency, then at least some samples cannot be the same . . .

Reply #18
Quote
I think that if I restrict my statement to the domain of windowed sinc filters, it's accurate. But any filter possessing an asymmetric response around the -6db point at Fs/2 is categorically not in that domain.
I have a question that's marginally off-topic, but this post triggered all my keyword detectors. Windowed sinc is symmetric. What would similar functions be in the asymmetric case? The symmetric, acausal nature of sinc has always bugged me, but I've never found a reference containing asymmetric, causal analogues that I could use in SRC contexts.

Yeah, you better split this mofo right now before the OP's head explodes.

I was using the term "windowed sinc" previously with the implicit assumption that the sinc function is critically sampled, ie, with lowpass frequency Fs/2. If it's not that -- more specifically, if it's both lower than, and relatively prime to, Fs/2 -- then you'll get your asymmetric response, and you'll also get every sample modified. That's the simplest example I can think of.

Let me do a bit of armchair mathematical derivation to outline what I'm riffing on. I'm going to wave my hands *really* widely here, so I apologize in advance for abuses of notation, convention, or for that matter, logic. For starters, I'll write this assuming a normalized, ordinary-frequency Fourier transform, but using the variable omega (w) instead of xi. I am also assuming Fs=1, and I write Fs/2 largely as a more easily identifiable representation of the frequency "1/2".

The basic principle here is the interpolating filter, ie, one which can be used when interpolating between sampled data values: for interpolating kernel k(t) convolved with a discrete-time signal with sampling period 1, k(0)=1 and k(N)=0 for nonzero integer N.

The sinc function is the "simplest" analytic function satisfying this requirement. I base this statement on the following identity:

$$\mathrm{sinc}(x) = \frac{\sin(\pi x)}{\pi x} = \prod_{n=1}^{\infty} \left(1 - \frac{x^2}{n^2}\right)$$

The inner term there, (1 - x^2/n^2), can be rewritten as n^(-2) (n+x)(n-x). That n^(-2) is just a constant which can be ignored, so what we're left with is a function which could be notionally defined as a polynomial with a zero at every nonzero integer. That's precisely what is necessary for an interpolating kernel.

This implies, at least the way I see it currently, that any interpolating function can be written in the form k(t) = h(t) sinc(t), as long as h(0)=1, and h(t) is defined across the support of k(t). Furthermore, the windowed (time-limited) interpolation functions can be written h(t) = g(t/a) rect(t/a), where "a" controls the window width and g(t) is the basic window function evaluated over [-1/2, 1/2]. So for a Hann-windowed sinc kernel, g(t) is going to be something like g(t) = (1+cos(2 pi t))/2, which falls to zero at t = +-1/2.

The statement k(t) = h(t) sinc(t) in the time domain is equivalent to K(w) = H(w) * rect(w) in the frequency domain, where * denotes convolution. If we assert:
  • h(t) is even ---> H(w) is also even (this will be true for all linear-phase filters)
  • H(w) is bandlimited to <Fs (strictly impossible for windowed kernels, but usually a very safe approximation)

... then, around w=+1/2, we can approximate rect(w) ~~ 1 - u(w-1/2), with u the unit step (and similarly for w=-1/2). From this -- and I apologize again if I'm skipping way too many steps here -- by looking at the bare convolution integrals, it's fairly easy to see that K(1/2)=(1/2) K(0). For a normalized filter response, i.e. K(0)=0 dB, this gives K(1/2)=-6 dB.

In summary, when performing integral upsampling with a lowpass filter, the kernel k(t) is interpolating (preserves the value of the original samples) if and only if: *either*
  • The kernel can be written k(t) = h(t) sinc(t)
  • h(0)=1
  • h(t) is even
  • The approximation "H(w)=0 for w>Fs" is reasonably accurate
  • The domain of h(t) includes the support of k(t) (ie, no infinities)
*or* the Fourier transform of the kernel can be written K(w) = H(w) * rect(w), for H(w) even; more specifically,
  • K(Fs/2) = (1/2) K(0)
  • K(w) - (1/2) is approximately odd around w=+-Fs/2 (the symmetry obviously breaks down near w=0)
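
The conditions above can be spot-checked numerically. The sketch below is only an illustration under stated assumptions: it uses the standard Hann window g(u) = (1 + cos(2 pi u))/2 on [-1/2, 1/2], a window width a = 32 picked arbitrarily, and made-up function names. It checks that k(0)=1 and k(N)=0 at nonzero integers, and that the 2x upsampling filter built from k has K(Fs/2) close to half of K(0).

```python
import math

def sinc(t):
    return 1.0 if t == 0 else math.sin(math.pi * t) / (math.pi * t)

def hann(t, a):
    """h(t) = g(t/a) rect(t/a), with g(u) = (1 + cos(2*pi*u)) / 2 on [-1/2, 1/2]."""
    u = t / a
    return 0.5 * (1.0 + math.cos(2.0 * math.pi * u)) if abs(u) <= 0.5 else 0.0

def kernel(t, a=32.0):
    """k(t) = h(t) sinc(t): a Hann-windowed, critically sampled sinc."""
    return hann(t, a) * sinc(t)

# Interpolating property: k(0) = 1 and k(N) = 0 for nonzero integers N.
k_at_integers = [kernel(float(n)) for n in range(-16, 17)]

# For 2x upsampling the kernel is sampled every half input sample.
taps = [kernel(n / 2.0) for n in range(-32, 33)]

def response(freq):
    """Response of `taps` at `freq` (in units of the input rate, Fs = 1),
    scaled by 1/2 so that the passband gain is close to 1."""
    return 0.5 * sum(c * math.cos(math.pi * freq * (n - 32)) for n, c in enumerate(taps))

K0 = response(0.0)
K_half = response(0.5)
print(K0, K_half, K_half / K0)
```

Floating-point evaluation of sin(pi N) is not exactly zero, so the zeros at the integers come out as tiny residues rather than exact zeros.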

Reply #19
Quote
Obviously if the filter does not have unity transmittance at every frequency, then at least some samples cannot be the same . . .

Indeed, with 2× resampling, if the input signal frequency stays below the filter transition, half of the output samples (weird/asymmetric filters notwithstanding) will be the same as the input samples (as observed at the chosen bit depth).  With sox, 16-bit, 48kHz ->96kHz, the odd-numbered output samples are bit-exact up to 22kHz; at 23kHz, the roll-off kicks in and the samples are no longer the same.  At 24-bit, it's bit-exact to 16k; at 17k, attenuation of 0.0001dB is evident in the output.

Reply #20
Quote
Obviously if the filter does not have unity transmittance at every frequency, then at least some samples cannot be the same . . .

Indeed, with 2× resampling, if the input signal frequency stays below the filter transition, half of the output samples (weird/asymmetric filters notwithstanding) will be the same as the input samples (as observed at the chosen bit depth).  With sox, 16-bit, 48kHz ->96kHz, the odd-numbered output samples are bit-exact up to 22kHz; at 23kHz, the roll-off kicks in and the samples are no longer the same.  At 24-bit, it's bit-exact to 16k; at 17k, attenuation of 0.0001dB is evident in the output.

Strictly speaking, it is not very meaningful to describe the flatness of a frequency response as "bit-exact".

Reply #21
Quote
Strictly speaking, it is not very meaningful to describe the flatness of a frequency response as "bit-exact".

Just as well that's not what we're describing then! We're describing when, after 2× band-limited interpolation, alternate output samples are exactly the same as the input samples. Practical examples were given of when this is indeed the case and also when it's not.

Reply #22
Quote
after 2× band-limited interpolation, alternate output samples are exactly the same as the input samples.

I still don't get how any sample can be exactly the same when a lowpass is applied to it in the output!? I hope I only need a small hint.

Edit: Added pic of source 48k and upsampled. How can any sample be intact?




Reply #23
48 -> 96 kHz conversion may leave signal samples intact, but it may also change them.
The condition for leaving samples intact is the use of a half-band low-pass filter, i.e. a filter of the form x1 0 x2 0 x3 0 .... 0 xn (every second tap counted from the non-zero centre tap is zero). Half-band filters can be designed with a windowed sinc method to easily achieve the required frequency response. However, many SRC algorithms use other designs (not half-band).
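
A small Python sketch of this condition, with made-up helper names and assumed filter parameters (Hann-windowed sinc, 129 taps). With the cutoff at exactly a quarter of the output rate, the taps at even, non-zero offsets from the centre vanish and the even-numbered output samples reproduce the input. Moving the cutoff slightly (0.23 here, an arbitrary choice) breaks the half-band structure.

```python
import math

def windowed_sinc(half_len, cutoff):
    """Hann-windowed sinc low-pass with gain 2; `cutoff` is a fraction of the
    output rate. cutoff = 0.25 puts the -6 dB point at the input Nyquist."""
    taps = []
    for n in range(-half_len, half_len + 1):
        t = 2.0 * cutoff * n
        s = 1.0 if n == 0 else math.sin(math.pi * t) / (math.pi * t)
        w = 0.5 * (1.0 + math.cos(math.pi * n / half_len))
        taps.append(2.0 * (2.0 * cutoff) * s * w)
    return taps

def upsample_2x(x, taps):
    """Zero-stuff by 2, then apply `taps` as a zero-phase FIR filter."""
    z = []
    for v in x:
        z.extend([v, 0.0])
    half = len(taps) // 2
    out = []
    for n in range(len(z)):
        acc = 0.0
        for k, c in enumerate(taps):
            idx = n + half - k
            if 0 <= idx < len(z):
                acc += c * z[idx]
        out.append(acc)
    return out

def even_offset_taps(taps):
    """Taps at even, non-zero offsets from the centre; all ~0 for a half-band filter."""
    half = len(taps) // 2
    return [c for k, c in enumerate(taps) if k != half and (k - half) % 2 == 0]

x = [math.sin(2 * math.pi * 440 * n / 48000) for n in range(480)]
half_band = windowed_sinc(64, 0.25)  # taps of the form x1 0 x2 0 ... 0 xn
other = windowed_sinc(64, 0.23)      # cutoff moved down: no longer half-band

def max_change(taps, skip=40):
    """Largest difference between input samples and even-numbered outputs,
    ignoring `skip` samples at each end where the filter runs off the data."""
    y = upsample_2x(x, taps)
    return max(abs(y[2 * i] - x[i]) for i in range(skip, len(x) - skip))

print(max(abs(c) for c in even_offset_taps(half_band)), max_change(half_band))
print(max(abs(c) for c in even_offset_taps(other)), max_change(other))
```

For the half-band case the differences are at the level of floating-point rounding. After quantization to 16 or 24 bits, whether a non-half-band design still yields identical samples depends on how small its deviation is at the signal's frequency.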

Reply #24
Quote
48 -> 96 kHz conversion may leave signal samples intact, but also may change them.
The condition for leaving samples intact is the use of a half-band low-pass filter, i.e. the filter of the form x1 0 x2 0 x3 0 .... 0 xn. Half-band filters can be designed with a windowed sinc method to easily achieve the required frequency response. However many SRC algorithms are using other designs (not half-band).

Half-band filters, OK... Could you tell me what happens with the sample above and what sox does here? Is any sample intact? And how can I tell?