3999hz Sine at 8khz SR, Generated by Audacity to prove a point.
hornpipe2
post Aug 30 2013, 21:04
Post #1





Group: Members
Posts: 5
Joined: 30-August 13
Member No.: 109888



Here is a 5 second tone produced by:

* Start Audacity
* Set Project Rate to 8000hz
* Generate -> Tone
- 3999hz
- 1.0 peak amplitude
- 5 sec duration
* Export as 16-bit 8khz WAV.
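
For anyone who wants to reproduce the file without Audacity, here is a minimal Python sketch of the steps above. It is an approximation, not what Audacity does internally: it assumes a plain sine starting at phase zero, scales the 1.0 peak to 32767, and applies no dither. The helper names `tone_samples` and `write_wav` are made up for this example.

```python
import io
import math
import struct
import wave


def tone_samples(freq_hz, rate_hz, seconds, peak=1.0):
    """Return 16-bit integer samples of a sine tone (phase zero, no dither)."""
    count = int(rate_hz * seconds)
    scale = peak * 32767
    return [round(scale * math.sin(2 * math.pi * freq_hz * n / rate_hz))
            for n in range(count)]


def write_wav(target, samples, rate_hz):
    """Write mono 16-bit PCM to a file path or a file-like object."""
    with wave.open(target, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16 bit
        wav.setframerate(rate_hz)
        wav.writeframes(struct.pack("<%dh" % len(samples), *samples))


# 3999 Hz, 8000 Hz sample rate, 5 seconds, 1.0 peak amplitude
samples = tone_samples(3999, 8000, 5)

# Written to memory here; pass a file name instead to get a file on disk.
buffer = io.BytesIO()
write_wav(buffer, samples, 8000)
```

The result is 40000 samples plus a 44-byte header, which is about the size of the attached tone.wav.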

Shannon-Nyquist says my 8 kHz WAV file should perfectly capture my 3.999 kHz sine wave, because 3.999 kHz < (8 kHz / 2).

Now my question: Why, oh why, do Audacity, foobar2000, WMP et al. play this back with beats? There is a periodic amplitude oscillation with a 0.5 s period.

What setup do I need to hear this played back as a pure 3.999khz tone?
Attached File(s)
Attached File  tone.wav ( 78.17K ) Number of downloads: 94
 
saratoga
post Aug 30 2013, 21:08
Post #2





Group: Members
Posts: 5165
Joined: 2-September 02
Member No.: 3264



QUOTE (hornpipe2 @ Aug 30 2013, 16:04) *
Now my question: Why, oh why, do Audacity, foobar2000, WMP et al. play this back with beats? There is a periodic amplitude oscillation with a 0.5 s period.


Because it's being resampled to 48k, and there is generally massive distortion of anything within 1% of the Nyquist limit, even with very high quality resamplers. There's a reason 44.1 kHz audio is 4.1 kHz above the highest frequency humans are expected to hear.

This post has been edited by saratoga: Aug 30 2013, 21:12
hornpipe2
post Aug 30 2013, 21:53
Post #3





Group: Members
Posts: 5
Joined: 30-August 13
Member No.: 109888



Ok... REALLY not understanding.

If the second video on xiph.org has taught me anything, it's that this is not the waveform:
Attached Image


but that THIS is the waveform:
Attached Image


It's repeated over and over: a signal under the Nyquist frequency should be perfectly reproduced.

So I suppose this is being resampled from 8khz to 48khz for playback (or 44.1 or whatever my laptop is running). Where does this conversion happen?
Player -> Windows Mixer -> Sound Card Driver -> Sound Card Hardware -> Headphones

Is it really just linearly interpolating to come up with the 48 kHz samples, and if so, what happened to all the fancypants reconstruction that a DAC is supposed to do?
greynol
post Aug 30 2013, 22:11
Post #4





Group: Super Moderator
Posts: 10339
Joined: 1-April 04
From: San Francisco
Member No.: 13167



Why not try a signal at -3dB?

Also, I hope you realize that Audacity draws straight lines between sample points, which is not how a DAC behaves.

fancypants?

This post has been edited by greynol: Aug 30 2013, 22:15


--------------------
Your eyes cannot hear.
hornpipe2
post Aug 30 2013, 22:12
Post #5





Group: Members
Posts: 5
Joined: 30-August 13
Member No.: 109888



saratoga, I don't think that's it.

Here is a 3950 Hz wave at an 8 kHz sample rate, leaving a transition band of more than 1% so the resampler would hypothetically "have something to work with".

Attached Image

It sounds just as bad as it looks.

EDIT: Greynol, just tried it, with amplitude 0.5 (-6 dB) instead of 1.0. It sounds the same, but quieter. So that's not it either : (

This post has been edited by hornpipe2: Aug 30 2013, 22:16
Attached File(s)
Attached File  c.wav ( 78.17K ) Number of downloads: 52
 
Silversight
post Aug 30 2013, 22:15
Post #6





Group: Members
Posts: 310
Joined: 5-April 06
From: Aachen, Germany
Member No.: 29203



I don't know about the math here, but I can at least say these amplitude swings are not produced in the playback chain. They are already in the file. In fact, the frequency of these swings increases as the generated tone frequency decreases. As for the reason for this, I'm not into the science well enough to say.

edit: Ah, too slow... :)

This post has been edited by Silversight: Aug 30 2013, 22:17


--------------------
Nothing is impossible if you don't need to do it yourself.
hornpipe2
post Aug 30 2013, 22:23
Post #7





Group: Members
Posts: 5
Joined: 30-August 13
Member No.: 109888



Yes Greynol, re: Audacity lines, I watched this earlier, which I assume most people have seen already.
https://www.xiph.org/video/vid2.shtml

The chapter "stairsteps" showed me the error of my previous assumptions about how digital audio works, which is what drove me to try this myself. I set up a simple test by generating an audible wave near the Nyquist frequency so it would "look" terrible in the editor. Then I played it back so I could hear what's really going on and check that it really did reconstruct the wave properly.

I was quite surprised to hear the interference after every online article was telling me that's not how it really works. I assume something is wrong with my setup, not with Nyquist... : )
saratoga
post Aug 30 2013, 23:59
Post #8





Group: Members
Posts: 5165
Joined: 2-September 02
Member No.: 3264



I doubt that article claims that you can perfectly resample a signal without regard for your filter's transition band. The issue here is that practical systems are designed to work with some oversampling. The better your software, the less you will need.

If you insist on 1%, I suggest doing some research and picking the right software. That will be a very slow resampler.
xnor
post Aug 31 2013, 01:40
Post #9





Group: Developer
Posts: 1016
Joined: 29-April 11
From: Austria
Member No.: 90198



I have no idea what you guys are talking about.

When I generate a clean 3999 Hz tone sampled at 8000 Hz quantized to 16 bits with dither and play it back in foobar2000 using the SoX resampler plugin (95% passband, min phase, best quality, no aliasing) I can hear nothing.
When I use the same DSP to convert to 44.1 kHz, 32 bit and amplify the output all I hear is the dither of the initial 16 bit file.

No fast resampler will output the 3999 Hz sine tone. The low pass filter needs some room to work: 3999 / (8000 / 2) = 99.975% of the Nyquist frequency. The SoX plugin allows a passband of up to 99.0%, so the sine will be attenuated massively.

But it's not impossible to reconstruct the 3999 Hz sine. You just need a very steep filter.


--------------------
"we are having an educated and deep technical discussion"-amirm
hornpipe2
post Aug 31 2013, 04:33
Post #10





Group: Members
Posts: 5
Joined: 30-August 13
Member No.: 109888



I think this is all making sense to me now. I generated a sweep from 3000hz to 4000hz over 10 seconds, then converted (by changing the Project Rate) to 44.1khz. Audacity now plays back the sweep, but beginning from ~3700hz on up to 4000 the volume starts to drop off. If I leave it set at 8khz Project Rate (which I think passes the buck on resampling from Audacity to Windows): it begins to generate a "beat" secondary tone at the same point, becoming more detuned and pronounced until it hits 4000, at which point it sounds like my wave is ramping in and out full volume every .5 seconds.

I'm about to throw out a bunch of mumbo jumbo to explain my current understanding. Please correct me if I am wrong.

--

Nyquist theory states: you can get back your sinewave as long as it is at less than half the sample rate.
This is in the ideal sense, however.

Let's look again at my 3999 Hz waveform. Sampled at 8 kHz, there is one and only one function within the band that could produce the sample points we see: a 3999 Hz wave. But these samples could ALSO come from a 4 kHz wave ramping up and down in volume twice a second. This would be an "alias" wave, yes? There are an infinite number of waves that can hit these exact points, but they all lie outside the band from 0 to just under 4 kHz, while there is only one True Solution within the band.
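
The claim that several waves hit the same sample points can be checked numerically. The Python sketch below is only an illustration under simple assumptions (ideal, unquantised samples and sines starting at phase zero). It compares the 3999 Hz samples with a 4001 Hz tone, which turns out to be the same set of values with the sign flipped, and with a 4 kHz carrier whose amplitude follows a 1 Hz sine, so it passes through zero twice a second. `modulated_4k` is a name made up for this example.

```python
import math

RATE = 8000  # sample rate in Hz


def sampled(freq_hz, n):
    """Value of a phase-zero sine of freq_hz at sample index n."""
    return math.sin(2 * math.pi * freq_hz * n / RATE)


def modulated_4k(n):
    """4 kHz carrier with a 1 Hz sinusoidal amplitude envelope, at sample n."""
    t = n / RATE
    return -math.cos(2 * math.pi * 4000 * t) * math.sin(2 * math.pi * 1 * t)


# Largest mismatch over one second of samples; both should be rounding noise.
worst_alias = max(abs(sampled(3999, n) + sampled(4001, n)) for n in range(RATE))
worst_modulated = max(abs(sampled(3999, n) - modulated_4k(n)) for n in range(RATE))

print(worst_alias, worst_modulated)
```

Both differences come out at floating-point rounding level, so the samples alone cannot tell these candidates apart. Only the requirement that the signal stays below 4 kHz singles out the 3999 Hz sine.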

If you're going to recover the original waveform and not the aliases, you need a filter which stops dead at 4khz, so that when you pull the analog back out you then chop off everything but the band-limited frequency.

In practice, we don't have these luxuries. Infinite filters would require infinite processing time or buffering or precision or whatever.

I think this does explain the artifacts I'm hearing:

Windows 7's sample rate conversion is (EDIT) BUGGED: it is doing linear interpolation, so I'm really hearing a beating triangle. Sheesh. Read more here: http://social.msdn.microsoft.com/Forums/wi...ion-slightly-ot, or here for a patch: http://support.microsoft.com/kb/2653312
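
Why linear interpolation produces the beating can be seen from a single pair of samples. The Python sketch below assumes ideal samples of a phase-zero 3999 Hz sine and is not a model of any actual Windows code. It compares the straight-line value halfway between samples 0 and 1 with the value the band-limited sine really has at that instant.

```python
import math

RATE = 8000
FREQ = 3999


def sample(n):
    """Ideal sample of the tone at integer index n."""
    return math.sin(2 * math.pi * FREQ * n / RATE)


def linear_midpoint(n):
    """Straight-line estimate halfway between samples n and n + 1."""
    return 0.5 * (sample(n) + sample(n + 1))


def true_midpoint(n):
    """Actual value of the continuous sine halfway between samples n and n + 1."""
    return math.sin(2 * math.pi * FREQ * (n + 0.5) / RATE)


valley_linear = linear_midpoint(0)
valley_true = true_midpoint(0)
print(valley_linear, valley_true)
```

Near sample 0 both neighbouring samples are close to zero, so the straight line stays near zero while the real sine is close to full scale there. A linear interpolator therefore reproduces the envelope of the sample values rather than the constant-amplitude tone.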

Audacity's SR conversion has an AA filter. It isn't a perfect filter (meaning: pass everything up to 4khz, drop everything 4khz and above), but rather it kicks in gradually beginning from say 95% of Nyquist. It hits -inf dB at 4khz. The sweep demonstrates pretty clearly when I'm heading into the transition band. If I really wanted to get my data back, I could configure the SRC filter to be much steeper and set the transition band to begin at 99.9999% instead.

--

Is there a common transition band specified by Red Book CD Audio or other standards to which DAC devices must / should adhere?

This post has been edited by hornpipe2: Aug 31 2013, 05:14
ktf
post Aug 31 2013, 09:01
Post #11





Group: Members
Posts: 433
Joined: 22-March 09
From: The Netherlands
Member No.: 68263



There's a simpler explanation: Nyquist-Shannon states that your sample has to be infinitely long to get a perfect reconstruction too, and you need an infinitely long reconstruction filter.

As the stuff we want to resample generally isn't infinitely long and reconstruction filters aren't infinitely long (otherwise, you'll get an infinitely long filter delay, so you'd never hear anything), you'll need to keep a little distance from the limit case of frequencies very close to 0.5Fs. Even SoX, regarded as a high-quality tool for this kind of manipulation, states in its manual "note that band-width values greater than 99% are not recommended for normal use as they can cause excessive transient echo". SoX's default is 95%.
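
The effect of filter length can be sketched with a truncated Whittaker-Shannon (sinc) sum. This is a toy illustration in Python, not how SoX or any real resampler is built: it uses a plain rectangular truncation with no window, and it assumes the tone exists for all sample indices, past and future. The function name `reconstruct` and the half-widths 50 and 40000 are arbitrary choices for the example.

```python
import math

RATE = 8000
FREQ = 3999


def sample(n):
    """Ideal sample of the tone; defined for every integer n."""
    return math.sin(2 * math.pi * FREQ * n / RATE)


def sinc(u):
    """Normalised sinc, sin(pi*u) / (pi*u)."""
    return 1.0 if u == 0 else math.sin(math.pi * u) / (math.pi * u)


def reconstruct(position, half_width):
    """Truncated sinc interpolation at a fractional sample position."""
    centre = math.floor(position)
    return sum(sample(n) * sinc(position - n)
               for n in range(centre - half_width, centre + half_width + 1))


# Halfway between samples 0 and 1, in the middle of a near-zero "valley".
exact = math.sin(2 * math.pi * FREQ * 0.5 / RATE)
short_filter = reconstruct(0.5, 50)      # 101 taps
long_filter = reconstruct(0.5, 40000)    # 80001 taps, 10 seconds of signal
print(exact, short_filter, long_filter)
```

With 50 samples either side the sum stays close to zero, because every sample it can see is tiny. Only when the sum reaches several beat cycles into the past and future does it come within a few percent of the true value.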


--------------------
Music: sounds arranged such that they construct feelings.
bandpass
post Aug 31 2013, 10:05
Post #12





Group: Members
Posts: 361
Joined: 3-August 08
From: UK
Member No.: 56644



QUOTE (hornpipe2 @ Aug 31 2013, 04:33) *
Is there a common transition band specified by Red Book CD Audio or other standards to which DAC devices must / should adhere?



QUOTE (ktf @ Aug 31 2013, 09:01) *
Even SoX, regarded as a high-quality tool for this kind of manipulation, states in its manual "note that band-width values greater than 99% are not recommended for normal use as they can cause excessive transient echo". SoX's default is 95%.

For better or for worse, SoX works to the 3dB point; so by default, you're 3dB down at 95%. With SoX, the corresponding 0dB point is at 91.x% (can't remember the value of x off the top of my head). The upper limit of the audible band is 20kHz, which divided by the Redbook nyquist is 90.7%. So the SoX default is a good fit for Redbook (of course, it was designed with this in mind).
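
The percentages in this thread are all "frequency as a fraction of Nyquist". A tiny Python helper makes the arithmetic explicit; `percent_of_nyquist` is just a name made up for this example.

```python
def percent_of_nyquist(freq_hz, rate_hz):
    """Express a frequency as a percentage of the Nyquist frequency (rate / 2)."""
    return 100.0 * freq_hz / (rate_hz / 2)


# 20 kHz audible limit against Red Book's 44.1 kHz sample rate
redbook_audible = percent_of_nyquist(20000, 44100)

# The 3999 Hz test tone against the 8 kHz sample rate
test_tone = percent_of_nyquist(3999, 8000)

print(f"{redbook_audible:.1f}%  {test_tone:.3f}%")
```

The first figure is the 90.7% mentioned above. The second is the 99.975% from earlier in the thread, which is beyond the 99% that SoX's manual recommends as a ceiling for normal use.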

C.R.Helmrich
post Aug 31 2013, 11:01
Post #13





Group: Developer
Posts: 694
Joined: 6-December 08
From: Erlangen Germany
Member No.: 64012



QUOTE (xnor @ Aug 31 2013, 02:40) *
When I generate a clean 3999 Hz tone sampled at 8000 Hz quantized to 16 bits with dither and play it back in foobar2000 using the SoX resampler plugin (95% passband, min phase, best quality, no aliasing) I can hear nothing.

QUOTE (bandpass @ Aug 31 2013, 11:05) *
For better or for worse, SoX works to the 3dB point; so by default, you're 3dB down at 95%. With SoX, the corresponding 0dB point is at 91.x% ...

Exactly, meaning that the SoX resampler will fully attenuate the entire mirror image above Nyquist (here 4 kHz). Most resamplers I know put their 3-dB point right on the Nyquist frequency, so you get some aliasing into your target base-band. Probably best to do it like SoX, i.e. fully attenuate near Nyquist: no tone at all, but no beating tone either.

Edit: Indeed, the beating is also present if you create a 23999- or 22049-Hz tone at 48 or 44.1 kHz sample rate, respectively, so it has nothing to do with upsampling. And the frequency of the tone is correct in both cases, I checked with Audition. So what's the explanation? Intermodulation distortion between tone and sampling frequency?

Chris

This post has been edited by C.R.Helmrich: Aug 31 2013, 11:20


--------------------
If I don't reply to your reply, it means I agree with you.
xnor
post Aug 31 2013, 14:24
Post #14





Group: Developer
Posts: 1016
Joined: 29-April 11
From: Austria
Member No.: 90198



A very pragmatic explanation:

When you sample a sine with just above 2 samples per cycle (2.0005 in the case of 3999 Hz sampled at 8 kHz) then the sample values will slowly "drift".
You start out at zero, the following samples will be close to zero, after some time you reach the maximum amplitude of the sine wave, then you're approaching zero again, and the cycle begins anew.
In the example above this cycle is 0.5 s.
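
The drift is easy to see by looking at how large the sample values get at different points in the file. The Python sketch below assumes ideal samples of a phase-zero 3999 Hz sine. `local_peak` is a helper made up for this example; it returns the largest magnitude among a handful of consecutive samples.

```python
import math

RATE = 8000
FREQ = 3999


def sample(n):
    """Ideal sample of the tone at integer index n."""
    return math.sin(2 * math.pi * FREQ * n / RATE)


def local_peak(start, width=8):
    """Largest sample magnitude among `width` samples beginning at `start`."""
    return max(abs(sample(n)) for n in range(start, start + width))


samples_per_cycle = RATE / FREQ

# Sample indices 0, 4000, 8000 are 0.5 s apart; 2000 and 6000 lie halfway between.
valleys = [local_peak(n) for n in (0, 4000, 8000)]
crests = [local_peak(n) for n in (2000, 6000)]
print(samples_per_cycle, valleys, crests)
```

The stored values are near zero every 4000 samples (0.5 s at 8 kHz) and near full scale halfway between, even though the underlying sine has constant amplitude throughout.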

Now obviously if you have a short filter it will follow this cycle. Only if the filter is long enough (steep enough) will it include samples from the previous/next cycle even when you're in the middle of such a close-to-zero valley.
Another option is a short filter that reaches its stopband at 4 kHz (attenuates a lot including the 3999 Hz tone).

Low quality resampling won't do either. It will usually use a really short filter, and one that doesn't attenuate properly at that.


A more correct explanation:
There will be an aliased sine at 4001 Hz at a not much lower level than the 3999 Hz one. Try generating and mixing a 3999 and 4001 Hz tone in a 44.1 kHz file.
Destructive interference will cause the valleys and effectively the beating.
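
The suggested experiment can also be run numerically. The Python sketch below mixes the two tones at 44.1 kHz with an amplitude of 0.5 each (an arbitrary choice so that the sum stays within full scale) and measures the peak level in a short window around two instants. `peak_around` and its 2.5 ms half-window are made up for this example.

```python
import math

RATE = 44100


def mix(t):
    """Equal-level sum of a 3999 Hz and a 4001 Hz sine at time t (seconds)."""
    return (0.5 * math.sin(2 * math.pi * 3999 * t)
            + 0.5 * math.sin(2 * math.pi * 4001 * t))


def peak_around(centre_s, half_window_s=0.0025):
    """Peak magnitude of the mix in a short window centred on centre_s."""
    first = round((centre_s - half_window_s) * RATE)
    last = round((centre_s + half_window_s) * RATE)
    return max(abs(mix(n / RATE)) for n in range(first, last + 1))


loud = peak_around(0.5)    # the two tones are in phase here
quiet = peak_around(0.25)  # the two tones cancel here
print(loud, quiet)
```

Since sin(a) + sin(b) = 2 sin((a+b)/2) cos((a-b)/2), the mix is a 4000 Hz tone multiplied by cos(2*pi*t), which has a null every 0.5 s. That is the same beat period heard in the 8 kHz file.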

This post has been edited by xnor: Sep 1 2013, 03:30


--------------------
"we are having an educated and deep technical discussion"-amirm
ktf
post Aug 31 2013, 15:54
Post #15





Group: Members
Posts: 433
Joined: 22-March 09
From: The Netherlands
Member No.: 68263



QUOTE (C.R.Helmrich @ Aug 31 2013, 12:01) *
Edit: Indeed, the beating is also present if you create a 23999- or 22049-Hz tone at 48 or 44.1 kHz sample rate, respectively, so it has nothing to do with upsampling. And the frequency of the tone is correct in both cases, I checked with Audition. So what's the explanation? Intermodulation distortion between tone and sampling frequency?

Most DACs do upsample, and if they don't they'll need a reconstruction filter. In both cases, the filtering can't be IIR because of the delay; a DAC should have as little delay as possible. So in both cases, the filter can't be 'steep' enough to prevent aliasing.

edit: I've been confusing non-causal filters with IIR filters. Anyway, think of it like this: if you have a sample-and-hold DAC with an nth-order lowpass filter behind it, you get the same problems: the filter isn't steep enough, so it will either alias or attenuate the sine. Some sort of filter is necessary for both sampling and reconstruction.

This post has been edited by ktf: Aug 31 2013, 15:58


--------------------
Music: sounds arranged such that they construct feelings.
xnor
post Aug 31 2013, 16:27
Post #16





Group: Developer
Posts: 1016
Joined: 29-April 11
From: Austria
Member No.: 90198



QUOTE (C.R.Helmrich @ Aug 31 2013, 12:01) *
Edit: Indeed, the beating is also present if you create a 23999- or 22049-Hz tone at 48 or 44.1 kHz sample rate, respectively, so it has nothing to do with upsampling. And the frequency of the tone is correct in both cases, I checked with Audition. So what's the explanation? Intermodulation distortion between tone and sampling frequency?


Well the 4001 Hz sine is added due to a not very great reconstruction filter.

But yeah, in the end it's just a linear addition of two sine waves. The pulsing has to be there by definition. There's no distortion, nothing non-linear.
plot sin(2*pi*f*t)+sin(2*pi*(f+1)*t) where f = 100 from t=0 to 1


--------------------
"we are having an educated and deep technical discussion"-amirm
saratoga
post Aug 31 2013, 17:56
Post #17





Group: Members
Posts: 5165
Joined: 2-September 02
Member No.: 3264



It's nonlinear because you put in one tone and got two out. The superposition is linear though.

Since modern audio DACs upsample to a few MHz for playback, it's hard to say what will happen to something very close to Nyquist. For scientific applications it's common to have non-oversampled DACs without a reconstruction filter. In this case you can in fact generate tones extremely close to Nyquist, but they will have out-of-band harmonics due to the lack of a filter.
xnor
post Aug 31 2013, 18:38
Post #18





Group: Developer
Posts: 1016
Joined: 29-April 11
From: Austria
Member No.: 90198



Sure but once you have the two tones (which you get directly as a result of upsampling, before interpolation, i.e. low pass filtering) the resulting waveform is the linear addition of those tones.

I find statements like "massive distortion within 1% of nyquist" questionable and potentially highly confusing to those not into signal processing.

With distortion I assume you mean non-linear distortion. (If you included linear distortion it'd get even more confusing.)
So I see two cases:
a) If the reconstruction filter reaches its stopband at or before Nyquist there won't be any added (aliased) tones. If additionally the filter is linear phase (which it usually is) there won't be any non-linear distortion.

b) If the filter reaches its stopband somewhere above Nyquist you'll see aliased tones. Again, a linear phase filter will not cause non-linear distortion. The pulsing is not the result of the filter distorting the signal, but of the filter not suppressing the aliased tones of the upsampled signal.

This is common in most A/D/A converters since these usually use short, "fast" filters that allow such aliasing. It's just not an audible problem since the stuff is usually above 20 kHz. Here it is clearly in the audible range, so plainly audible.

This post has been edited by xnor: Aug 31 2013, 18:51


--------------------
"we are having an educated and deep technical discussion"-amirm
greynol
post Aug 31 2013, 18:51
Post #19





Group: Super Moderator
Posts: 10339
Joined: 1-April 04
From: San Francisco
Member No.: 13167



...so long as the rest of the process up to the sound waves reaching the listener's ear is linear, or are you suggesting that a 23kHz tone and a 25kHz tone with a 48kHz sample rate cannot create a 2kHz tone?

What I find curious is that I see the beating in Adobe Audition where I had assumed it could display what the waveform would look like with perfect reconstruction. Clearly I don't understand how the program is going about connecting the dots.

This post has been edited by greynol: Aug 31 2013, 19:09


--------------------
Your eyes cannot hear.
xnor
post Aug 31 2013, 19:18
Post #20





Group: Developer
Posts: 1016
Joined: 29-April 11
From: Austria
Member No.: 90198



QUOTE (greynol @ Aug 31 2013, 19:51) *
...so long as the rest of the process up to the sound waves reaching the listener's ear is linear, or are you suggesting that a 23kHz tone and a 25kHz tone with a 48kHz sample rate cannot create a 2kHz tone?

I'm talking about the resampling process. Sure, downstream electronics or transducers or even our ear can and will add intermodulation distortion, harmonic distortion, and so on.

(Remember: even when the 8 kHz signal is resampled to <insert high sample rate here> it will show the beating due to aliasing and also sound that way primarily due to aliasing, not intermodulation.)

QUOTE
What I find curious is that I see the beating in Adobe Audition where I had assumed it could display what the waveform would look like with perfect reconstruction. Clearly I don't understand how the program is going about connecting the dots.

The filters used for display are short ones for obvious reasons. Short means slow roll-off but it cannot start rolling off at 10 kHz (for a 44.1 kHz sample rate) because then high frequencies below Nyquist would be attenuated... so I guess it cannot suppress aliasing. In the 44.1 kHz case the stopband may not be reached before, let's say, 24 kHz. In that case you'd see visual beating starting slowly with tones above 20 kHz - getting worse with higher frequency, of course, since the aliasing will get stronger and stronger.

(Audition still does a lot better than other audio editors that "connect the dots".)

This post has been edited by xnor: Aug 31 2013, 19:25


--------------------
"we are having an educated and deep technical discussion"-amirm
bandpass
post Aug 31 2013, 19:21
Post #21





Group: Members
Posts: 361
Joined: 3-August 08
From: UK
Member No.: 56644



Here are some pics which may help.

Generate a tone at close to nyquist:

CODE
sox -r 8000 -n 1.wav synth 8 sine 3990 gain -1


Which looks, at some points in time, like this (i.e. apparent beating):


Resample to 44100 (with max bandwidth allowed by command-line sox):

CODE
sox 1.wav 2.wav rate -b 99.7 44100


Also, generate the tone directly at 44100 sample rate:

CODE
sox -r 44100 -n 3.wav synth 8 sine 3990 gain -1


Make a two channel file from these last two files, and equalise the levels (to account for the resampling roll-off close to nyquist):
CODE
sox -M 2.wav 3.wav 4.wav gain -e


And load the result into audacity (or whatever) for analysis:


There's no difference between the resampled and the directly generated tone. So no aliasing or beating, just a little roll-off.
saratoga
post Aug 31 2013, 19:33
Post #22





Group: Members
Posts: 5165
Joined: 2-September 02
Member No.: 3264



QUOTE (greynol @ Aug 31 2013, 13:51) *
What I find curious is that I see the beating in Adobe Audition where I had assumed it could display what the waveform would look like with perfect reconstruction. Clearly I don't understand how the program is going about connecting the dots.


That beating is just because it's using linear or at most polynomial interpolation to generate the display. It's too slow to use high quality interpolation for display, so almost nothing does.
saratoga
post Aug 31 2013, 19:37
Post #23





Group: Members
Posts: 5165
Joined: 2-September 02
Member No.: 3264



Aliasing is nonlinear distortion so I don't see what's confusing about saying distortion is likely when resampling tones very close to Nyquist.
xnor
post Aug 31 2013, 20:15
Post #24





Group: Developer
Posts: 1016
Joined: 29-April 11
From: Austria
Member No.: 90198



QUOTE (saratoga @ Aug 31 2013, 20:37) *
Aliasing is nonlinear distortion so I don't see what's confusing about saying distortion is likely when resampling tones very close to Nyquist.

I'm sorry if I'm annoying but this:
QUOTE (saratoga @ Aug 30 2013, 22:08) *
Because it's being resampled to 48k, and there is generally massive distortion of anything within 1% of the Nyquist limit, even with very high quality resamplers. There's a reason 44.1 kHz audio is 4.1 kHz above the highest frequency humans are expected to hear.

doesn't sound like you're talking about aliasing.

SoX, for example, is a high quality resampler and does not cause "massive distortion" while processing audio in realtime (somewhere you said such a resampler is going to be very slow).

Also, I find the 4.1 kHz number confusing as well - to laypeople it will sound like humans can hear up to 40 kHz.
I understand what you meant, but I'd say:
44.1 kHz allows sampling a signal up to just below 22.05 kHz, which is 2.05 kHz above the generally accepted limit of human hearing.


--------------------
"we are having an educated and deep technical discussion"-amirm
saratoga
post Aug 31 2013, 20:20
Post #25





Group: Members
Posts: 5165
Joined: 2-September 02
Member No.: 3264



What is the FFT of SoX's output? I suspect that depending on your settings it either completely attenuates the tone or introduces considerable aliasing. Perhaps zeroing the output is not technically distortion, but I think in the colloquial sense I was using it's close enough.