What is the correct way to antialias a log plot?
Axon
post Mar 2 2009, 12:23
Post #1





Group: Members (Donating)
Posts: 1984
Joined: 4-January 04
From: Austin, TX
Member No.: 10933



Say you have a power spectrum. Each sample point is equally spaced from the last. You want to display it on a logarithmic scale instead of a linear one. That means that the density of data points per pixel increases with increasing frequency, and at some point may exceed 1. The plot then encounters some form of aliasing distortion, like in this post, where all the points at high frequencies pile together and display as a nasty mess:

http://www.hydrogenaudio.org/forums/index....st&p=616410

This makes it hard to plot anything wider than 2 decades without having to resort to multiple plots. So some kind of antialiasing filter needs to be employed - basically, a variable-rate resampling operation, but one with very specific requirements:
  • No resampling below some critical frequency, and an exponentially decreasing lowpass above it.
  • Minimal ringing and pulse widening. A large 1-sample pulse in the input spectrum should transform into a 1-2 sample pulse in the output.
  • No scalloping or other response inaccuracy that depends on input signal position. That is, total spectrum power must not change if the frequency (x-coordinate) of a sample position changes, and ideally, the output sample peak should change only minimally if its frequency changes.
I first tried a sinc + Hamming windowed summation, with exponentially increasing window widths and sample spacing. That seemed to work until I realized it probably violated rule #3 - input samples at frequencies between output samples will almost certainly be misrepresented in the final output, because the windows are symmetric - the sinc waves of adjacent windows will not line up the way they do with normal PCM samples.

I then tried pre-warping the inputs to the sinc+hamming window by, essentially, something to the effect of x=>A+B/x etc. The first half of the window is compressed, the second half is expanded, and the window peak moves to the left half. I wound up having to manually change the scale of the window on each side (ie, if x<0 x*=A; else x*=B; y=sinc(x)). This seemed to eliminate the theoretical concerns about missing samples, but there is tremendous ringing in the output extending for many pixels and the pulses tended to widen unacceptably.

My current thinking is that I should just get back down to basics and just do a box average of samples, with perhaps some interpolation/fractional counting of endpoint samples. This eliminates the ringing issue and the scalloping, and clearly makes the power equivalence thing a non-issue, but I am a little concerned there will be hidden costs to this, like an insufficient amount of antialiasing.
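
Concretely, something to this effect (a rough Python sketch of the box average with fractional endpoint counting; the function name and binning details are just illustrative, not tested against real data):

```python
import math

def log_box_average(power, f_lo, f_hi, n_pixels):
    """Average linearly spaced power samples into log-spaced pixel bins.

    power[k] is the power at frequency k (1 hz steps, k >= 1 used here).
    Each sample is split fractionally between the two nearest pixel bins,
    so a sample's contribution moves smoothly as its frequency changes.
    """
    out = [0.0] * n_pixels
    weight = [0.0] * n_pixels
    log_lo, log_hi = math.log(f_lo), math.log(f_hi)
    for k in range(1, len(power)):
        f = float(k)
        if f < f_lo or f > f_hi:
            continue
        # fractional pixel position of this sample on the log axis
        x = (math.log(f) - log_lo) / (log_hi - log_lo) * (n_pixels - 1)
        i = int(x)
        frac = x - i
        out[i] += power[k] * (1.0 - frac)
        weight[i] += 1.0 - frac
        if i + 1 < n_pixels:
            out[i + 1] += power[k] * frac
            weight[i + 1] += frac
    return [o / w if w > 0.0 else 0.0 for o, w in zip(out, weight)]
```

At the dense high end this degenerates to a plain box average; at the sparse low end the fractional weights keep a lone sample from scalloping as it slides between pixels.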

Any suggestions? Am I reinventing the wheel here? How does RMAA etc do this?
saratoga
post Mar 2 2009, 14:50
Post #2





Group: Members
Posts: 5000
Joined: 2-September 02
Member No.: 3264



Spline fit the points onto a log scale before you plot them. It's really easy to do in MATLAB:

plot(log(desired_log_x_axis), log(spline(data_x_axis_points, data_to_plot, desired_log_x_axis)))
Axon
post Mar 2 2009, 15:07
Post #3





But that's interpolation, not filtering, right? That would just throw out data.
saratoga
post Mar 2 2009, 16:22
Post #4





QUOTE (Axon @ Mar 2 2009, 09:07) *
But that's interpolation, not filtering, right?


Interpolation and filtering are essentially the same thing, at least in this context.

QUOTE (Axon @ Mar 2 2009, 09:07) *
That would just throw out data.


Assuming the data is properly sampled, the only loss of information is due to inaccuracies/rounding in the spline fit. Splines work pretty well for most data, though you could use some other form of interpolation (sinc, etc.) if you don't think a spline is good enough.

Edit: By properly sampled, I mean that after you do the spline fit, you still have enough points left to satisfy Nyquist. Obviously if you do a spline fit from 1000 points down to 10 points, then yes, there's a loss of information. But as long as you generate enough points you don't lose anything beyond rounding.
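
If you don't have MATLAB handy, the same idea in plain Python - with linear interpolation standing in for the spline (a sketch only; a real spline just swaps in a higher-order interpolant):

```python
import math

def resample_to_log_axis(x, y, n_out):
    """Resample (x, y) onto n_out log-spaced abscissae.

    Linear interpolation stands in for the spline fit here; same concept,
    lower-order interpolant. Assumes x is sorted ascending and x[0] > 0.
    """
    out_x = [x[0] * (x[-1] / x[0]) ** (i / (n_out - 1)) for i in range(n_out)]
    out_y, j = [], 0
    for xq in out_x:
        # advance to the segment [x[j], x[j+1]] containing xq
        while j + 1 < len(x) - 1 and x[j + 1] < xq:
            j += 1
        t = (xq - x[j]) / (x[j + 1] - x[j])
        out_y.append(y[j] + t * (y[j + 1] - y[j]))
    return out_x, out_y
```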

This post has been edited by Mike Giacomelli: Mar 2 2009, 16:23
Axon
post Mar 2 2009, 17:39
Post #5





QUOTE (Mike Giacomelli @ Mar 2 2009, 09:22) *
QUOTE (Axon @ Mar 2 2009, 09:07) *
But that's interpolation, not filtering, right?
Interpolation and filtering are essentially the same thing, at least in this context.
Well, sort of. They can both be described in terms of FIR filtering, but what's needed here is a result that is "interpolated" from several dozen data points at once, not just 2 or 3. And spline knots make no sense in this situation - no output data point needs to be exactly equal to any input value.

QUOTE
QUOTE (Axon @ Mar 2 2009, 09:07) *
That would just throw out data.
Assuming the data is properly sampled, the only loss of information is due to inaccuracies/rounding in the spline fit. Splines work pretty well for most data, though you could use some other form of interpolation (sinc, etc.) if you don't think a spline is good enough.

Edit: By properly sampled, I mean that after you do the spline fit, you still have enough points left to satisfy Nyquist. Obviously if you do a spline fit from 1000 points down to 10 points, then yes, there's a loss of information.
That's exactly the situation I'm describing.

That is: FFT a 1s window of 16/44 data and make a power spectrum. 22,050 samples from 0hz to 22049hz, in 1hz steps. Now plot it in an 800x600 log graph running from 1hz to 22049hz. That's an average of about 27 data points per pixel - and roughly ten times that at the very top of the range - so aliasing is unavoidable. My objective is to eliminate that aliasing while meeting the above criteria.
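
(A back-of-envelope check on those densities - my arithmetic, nothing more:)

```python
# 22,050 bins (1 hz apart) mapped onto an 800-pixel log axis from 1 hz to 22049 hz.
n_bins, n_px = 22050, 800
ratio = 22049.0 ** (1.0 / n_px)       # frequency ratio spanned by one pixel
avg_per_px = n_bins / n_px            # ~27.6 bins per pixel on average
top_px = 22049.0 - 22049.0 / ratio    # ~274 bins land in the last pixel alone
```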

Or restated: Replace the 1s FFT window with 1,000 1ms FFT windows averaged together to yield a 500 sample power spectrum. That will have far fewer problems on the log plot. The objective is to make the 1s window look as good as the 1ms windows at high frequencies.

This post has been edited by Axon: Mar 2 2009, 17:46
Alexey Lukin
post Mar 3 2009, 01:03
Post #6





Group: Members
Posts: 191
Joined: 31-July 08
Member No.: 56508



Filtering of spectral data is undesirable because it will distort (lower) levels of harmonics in the spectrum. One possible idea is to display only the tops of the spectral energy at each pixel bin.
Axon
post Mar 4 2009, 00:32
Post #7





QUOTE (Alexey Lukin @ Mar 2 2009, 18:03) *
Filtering of spectral data is undesirable because it will distort (lower) levels of harmonics in the spectrum.
Proof?

My experience with averaging is that it's unusually good at isolating low-level info like that. As long as the frequency content is largely randomly distributed, except for the signal you actually care about, it makes perfect sense to average it or otherwise filter it.

QUOTE
One possible idea is displaying only tops of spectral energy at each pixel bin:

Given the existing plots I'm seeing, of (say) flat, unfiltered frequency responses whose peaks are 10db or more above the average, I don't think this is a good approach. Maybe a min/avg/max plot is in order, but the average or filtered response clearly seems the most meaningful to me.

This post has been edited by Axon: Mar 4 2009, 00:51
Alexey Lukin
post Mar 4 2009, 00:48
Post #8





QUOTE (Axon @ Mar 3 2009, 18:32) *
QUOTE (Alexey Lukin @ Mar 2 2009, 18:03) *
Filtering of spectral data is undesirable because it will distort (lower) levels of harmonics in the spectrum.
Proof?

Isn't it obvious that any low-pass filtering of the pulse (which is a spectral peak) lowers the level of the peak, depending on the filter size? You say that you are using an exponentially increasing filter size. This would tilt your displayed peak levels across the frequency range.
Axon
post Mar 4 2009, 00:58
Post #9





QUOTE (Alexey Lukin @ Mar 3 2009, 17:48) *
QUOTE (Axon @ Mar 3 2009, 18:32) *
QUOTE (Alexey Lukin @ Mar 2 2009, 18:03) *
Filtering of spectral data is undesirable because it will distort (lower) levels of harmonics in the spectrum.
Proof?

Isn't it obvious that any low-pass filtering of the pulse (which is a spectral peak) lowers the level of the peak, depending on the filter size? You say that you are using an exponentially increasing filter size. This would tilt your displayed peak levels across the frequency range.

If you have a very high power peak at only one sample position, of course filtering will dampen it. But it will never obscure it entirely if it was all that significant to begin with.

Relying entirely on the peak values is just as bad, if not worse - that will consistently overestimate the response at high frequencies. That might be slightly more sensitive for single large peaks, but it will completely screw with the wideband response.

I think it's extremely important to ensure that the total power represented in the graph is invariant, and any scheme that doesn't rely on a 50th percentile (or something very close to it) is going to violate that.

This post has been edited by Axon: Mar 4 2009, 01:01
Alexey Lukin
post Mar 4 2009, 01:08
Post #10





QUOTE (Axon @ Mar 3 2009, 18:58) *
If you have a very high power peak at only one sample position, of course filtering will dampen it. But it will never obscure it entirely if it was all that significant to begin with.

Of course not. But the level will be reduced, which may be a problem.

QUOTE (Axon @ Mar 3 2009, 18:58) *
I think it's extremely important to ensure that the total power represented in the graph is invariant.

Ok, it seems that we are trying to achieve different objectives.
Considering your objective, you may want to look at multitaper spectrum estimation. It's a technique that uses a set of orthogonal windows (typically prolate spheroidal wave functions) to reduce noise in spectral estimates by averaging the spectra obtained with the different windows.
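
A minimal illustration of the idea - using sine tapers instead of the prolate spheroidal windows, since they have a simple closed form (my substitution, purely for exposition; the DFT here is a slow O(N^2) loop):

```python
import cmath, math

def sine_tapers(n, k_tapers):
    """Orthonormal sine tapers, a simple stand-in for Slepian windows."""
    return [[math.sqrt(2.0 / (n + 1)) * math.sin(math.pi * (k + 1) * (t + 1) / (n + 1))
             for t in range(n)] for k in range(k_tapers)]

def multitaper_psd(x, k_tapers):
    """Average the periodograms of x computed under each orthogonal taper."""
    n = len(x)
    psd = [0.0] * (n // 2 + 1)
    for v in sine_tapers(n, k_tapers):
        xt = [xi * vi for xi, vi in zip(x, v)]   # tapered copy of the signal
        for f in range(n // 2 + 1):
            s = sum(xt[t] * cmath.exp(-2j * math.pi * f * t / n) for t in range(n))
            psd[f] += abs(s) ** 2 / k_tapers
    return psd
```

Because the tapers are orthonormal, the K periodograms are (approximately) independent estimates, which is where the variance reduction comes from.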
Axon
post Mar 4 2009, 02:28
Post #11





Thanks for the tip on multitaper - I'll be sure to give it a whirl. I can see how a massive reduction in variance may make the noise problem a non-issue, which would let me display peaks instead of averages. Actually the variance thing is quite remarkable. Ye Olde Wikipedia says the variance of a N-window averaged periodogram is ~ 1/sqrt(N), but I'm reading elsewhere that the variance for a multitaper analysis with K estimators is ~ 1/K. So if I can cram in K=10 I basically get a 10x reduction in noise for free.

In practice, I must be able to isolate 0.5hz tones clearly, and my present analyses are running out of memory when the windows exceed about 40 seconds in length. So I may not be able to get much beyond K=5 without going through some pain, but still, that might be fine.

Nothing prevents me from using multitaper and periodogram averaging/Welch, right? Since my dataset is way, way external, and I don't want to have to compute an external Fourier transform....
Alexey Lukin
post Mar 4 2009, 02:48
Post #12





QUOTE (Axon @ Mar 3 2009, 20:28) *
Wikipedia says the variance of a N-window averaged periodogram is ~ 1/sqrt(N), but I'm reading elsewhere that the variance for a multitaper analysis with K estimators is ~ 1/K.

I see Wikipedia saying that standard deviation is 1/sqrt(L), which is the same as variance being 1/L.

QUOTE (Axon @ Mar 3 2009, 20:28) *
So if I can cram in K=10 I basically get a 10x reduction in noise for free.

Not quite free. Your peaks are going to become wider in frequency; they approach an interesting rectangle-like shape (IIRC from my experiments).

QUOTE (Axon @ Mar 3 2009, 20:28) *
In practice, I must be able to isolate 0.5hz tones clearly, and my present analyses are running out of memory when the windows exceed about 40 seconds in length.

If you are only interested in a particular frequency range, you may want to consider downsampling the signal prior to spectral analysis or using a Goertzel algorithm.
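
For reference, the Goertzel recursion evaluates a single DFT bin in O(N), so you can pick out one 0.5hz tone without a full transform (generic textbook form, not tied to your data):

```python
import math

def goertzel_power(samples, sample_rate, freq):
    """Squared magnitude of the DFT of `samples` at the bin nearest `freq` (hz)."""
    n = len(samples)
    k = round(freq * n / sample_rate)      # nearest DFT bin index
    w = 2.0 * math.pi * k / n
    coeff = 2.0 * math.cos(w)
    s_prev = s_prev2 = 0.0
    for x in samples:                      # second-order resonator recursion
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2
```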
Axon
post Mar 4 2009, 03:34
Post #13





QUOTE (Alexey Lukin @ Mar 3 2009, 19:48) *
QUOTE (Axon @ Mar 3 2009, 20:28) *
Wikipedia says the variance of a N-window averaged periodogram is ~ 1/sqrt(N), but I'm reading elsewhere that the variance for a multitaper analysis with K estimators is ~ 1/K.
I see Wikipedia saying that standard deviation is 1/sqrt(L), which is the same as variance being 1/L.

Boooooooooooo. I misread things. Here I was so happy that I could just crank up the window length, crank up K to compensate and get huge noise reductions for free.

This makes it a lot harder to justify, actually. If the line width of the original signal is 2/(W dt) for W = number of samples, with multitaper it's now 2p/(W dt), or roughly 2(K+1/2)/(W dt) - so to increase K while maintaining the same line width, I need to increase W by about half the same factor. But increasing W means reducing the number of windows N available for periodogram averaging. Because the variance for averaged periodograms is ~1/N and for multitaper is ~1/K, this means that I can only get at most about a 2x reduction in variance by using multitaper. Previously I thought I could get essentially a sqrt(K) increase.
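
Here's that arithmetic spelled out (my own arbitrary choices for T and W; the scalings are the ones above):

```python
# Hold the total record length T and the displayed line width fixed, and see
# how much variance multitaper can actually buy. Scalings as described above:
# line width ~ 2(K + 1/2)/(W dt), Welch variance ~ 1/N, multitaper ~ 1/K,
# combined ~ 1/(N*K).
T = 1_000_000                      # total samples available (arbitrary)
W1 = 10_000                        # window length chosen at K = 1 (arbitrary)

def combined_variance(K):
    W = W1 * (K + 0.5) / 1.5       # grow W to keep the line width constant
    N = T / W                      # fewer windows left for periodogram averaging
    return 1.0 / (N * K)           # relative variance of the combined estimate

# As K grows, N shrinks in proportion, so N*K saturates: the most this buys
# over K = 1 is a factor of ~1.5-2, not sqrt(K).
```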

That kinda stinks, actually. For the last of the 4 plots linked to above, the p-p variation in the response is 1db, but it needs to be something like 0.03db to avoid bias if I use peak values for decimation. If I can only get that down to 0.5db with multitaper it just isn't going to happen.

Am I missing something? Nothing prevents me from using multitaper and periodogram averaging/Welch, right? Since my dataset is way, way external, and I don't want to have to compute an external Fourier transform.... Am I making any mistakes in trading off multitaper variance for averaged periodogram variance?


QUOTE
QUOTE (Axon @ Mar 3 2009, 20:28) *
In practice, I must be able to isolate 0.5hz tones clearly, and my present analyses are running out of memory when the windows exceed about 40 seconds in length.
If you are only interested in a particular frequency range, you may want to consider downsampling the signal prior to spectral analysis or using a Goertzel algorithm.

I'll need an explanation as to how Goertzel applies, but generally here, I'm interested in plots that are between 2-4 decades wide. For the truly LF stuff though, I often only care about 0-20hz and resampling down to that is a distinct possibility.

This post has been edited by Axon: Mar 4 2009, 03:39
Axon
post Mar 4 2009, 03:41
Post #14





Also, on the power subject: I mostly made up that requirement for power invariance; I thought it was required to ensure some kind of accuracy in the solution. As you indicated though, in practice, if the variance is low enough that the peak level differs from the average by less than a pixel, using the peak level for decimation is OK. I hadn't realized that variance was the key issue here.
[JAZ]
post Mar 4 2009, 20:44
Post #15





Group: Members
Posts: 1783
Joined: 24-June 02
From: Catalunya(Spain)
Member No.: 2383



Edit: Discard the method. It does have the problem you describe (going up with a note doesn't always give the same band amplitude).



Hello Axon.

I have another approach that you could try out.

I am not sure if this is mathematically correct, but for my intended application (a realtime power spectrum) it fits well.

The concept is modifying the angle of the DFT analysis so that it increments exponentially instead of linearly.

Note: I know this doesn't look like an FFT analysis. It was some code I improved upon.

Input is pSamples, output is amp; l = left, r = right.

scope_spec_bands is an adjustable value to show more or fewer bars (from 16 to MAX_SCOPE_BANDS).

CODE

#include <cmath>
#include <cstring>

const int MAX_SCOPE_BANDS = 128;
const int SCOPE_SPEC_SAMPLES = 1024;

int scope_spec_bands = 64; // adjustable, 16 .. MAX_SCOPE_BANDS

float *pSamplesL;
float *pSamplesR;

// Precomputed, windowed cos/sin basis for each (sample, band) pair.
float cth[SCOPE_SPEC_SAMPLES][MAX_SCOPE_BANDS];
float sth[SCOPE_SPEC_SAMPLES][MAX_SCOPE_BANDS];

float aal[MAX_SCOPE_BANDS];
float aar[MAX_SCOPE_BANDS];
float bbl[MAX_SCOPE_BANDS];
float bbr[MAX_SCOPE_BANDS];
float ampl[MAX_SCOPE_BANDS];
float ampr[MAX_SCOPE_BANDS];

void InitSpectrum()
{
    const float barsize = float(SCOPE_SPEC_SAMPLES>>1)/scope_spec_bands;
    const float samplesFactor = SCOPE_SPEC_SAMPLES/16.0f;
    for (int i=SCOPE_SPEC_SAMPLES-1;i>=0;i--)
    {
        //Linear -pi to pi, scaled per band below.
        const float constant = 2.0f * float(M_PI) * (-0.5f + ((float)i/(SCOPE_SPEC_SAMPLES-1)));
        //Hanning window
        const float window = 0.5f - 0.5f * cosf(2.0f * float(M_PI) * i / (SCOPE_SPEC_SAMPLES - 1));
        float j=0.0f;
        for(int h=0;h<scope_spec_bands;h++)
        {
            //this makes it somewhat logarithmic:
            //float th=(j*j/SCOPE_SPEC_SAMPLES)*constant;
            //This simulates a constant note scale:
            float th = powf(2.0f,j/samplesFactor)*constant;
            //this is linear:
            //float th=j*constant;
            cth[i][h] = cosf(th) * window;
            sth[i][h] = sinf(th) * window;
            j+=barsize;
        }
    }
}

void run() {
    memset (aal,0,sizeof(aal));
    memset (bbl,0,sizeof(bbl));
    memset (aar,0,sizeof(aar));
    memset (bbr,0,sizeof(bbr));

    for (int i=0;i<SCOPE_SPEC_SAMPLES;i++)
    {
        const float wl=pSamplesL[i];
        const float wr=pSamplesR[i];
        for(int h=0;h<scope_spec_bands;h++)
        {
            aal[h]+=wl*cth[i][h];
            bbl[h]+=wl*sth[i][h];
            aar[h]+=wr*cth[i][h];
            bbr[h]+=wr*sth[i][h];
        }
    }

    for (int h=0;h<scope_spec_bands;h++)
    {
        ampl[h]= sqrtf(aal[h]*aal[h]+bbl[h]*bbl[h])/(SCOPE_SPEC_SAMPLES>>1);
        ampr[h]= sqrtf(aar[h]*aar[h]+bbr[h]*bbr[h])/(SCOPE_SPEC_SAMPLES>>1);
    }
}






This post has been edited by [JAZ]: Mar 4 2009, 21:07
Axon
post Mar 4 2009, 21:46
Post #16





So I'm vacillating on the whole power invariance vs accurate peak estimation thing, this time on psychoacoustic grounds.

Asymptotically the ERB is 11% of the signal frequency. Let's be extremely generous and call the JND for power changes within an ERB 0.1db. At 20khz the ERB is therefore 2.18khz, and for a single tone at that frequency to be audible, it must be

b = 0.108 * 20000 + 24.7 // ERB in hz
A = b*10^(0.1/20) - b // just-noticeable tone power
G = 20*log10(A) // in db
G = 28.1db

higher in amplitude than the noise PSD. That's ludicrously high. At 10khz it's still lower, but only down to 22db. At 1khz we're down to about 3.7db, but we certainly tend not to have peak detection problems at that frequency if we're still concerned about higher ones.
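
(The same numbers as a quick script - identical formulas, just re-run:)

```python
import math

def tone_audibility_db(f_hz, jnd_db=0.1):
    """Level (db re: the noise density) a tone needs to lift one ERB by jnd_db."""
    erb = 0.108 * f_hz + 24.7               # Glasberg-Moore ERB approximation, hz
    delta = erb * 10 ** (jnd_db / 20) - erb  # just-noticeable tone power
    return 20 * math.log10(delta)            # in db
```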

My point here is that low level harmonics might look scary, but they don't actually mean anything unless they are so ludicrously more powerful than the surrounding noise as to be impossible to miss in analyses like these. Even if I did a 5-decade spectrum 1500px wide, each pixel at 20khz represents only 154hz, which is a small fraction of the ERB. If power invariance is maintained, audible tones at high frequencies will always be visible at adequate plot widths, and their power can be compared against the nearby noise density to establish the level of audibility.

Long story short: I will probably still do multitaper because of the nearly-free variance benefits, but I will stick with per-pixel box averaging for final display.
