interpreting frequency plots
krabapple
post Mar 14 2013, 17:48
Post #1





Group: Members
Posts: 2216
Joined: 18-December 03
Member No.: 10538



I've always wanted to understand frequency plots better -- these are plots derived from e.g. Audition's 'frequency analysis' tool. You select a waveform (or a segment thereof), choose an FFT window size and type (e.g. 4096 Blackman-Harris) and it generates a plot (either linear or logarithmic) showing both channels. The numerical data can be exported and dumped into Excel for further chart-tweaking.


I have uploaded an example pdf of such an Excel chart to HA

http://www.hydrogenaudio.org/forums/index....showtopic=99939

(This is a track from a CD. The two curves are the left and right channels. I used the Audition FFT size and type indicated above)

What I want to understand is why the graphs are typically so 'jagged' -- this cannot possibly represent actual EQ moves at each frequency bin. So what causes it? (I used the FFT size and type indicated above)

I understand that there is less FFT sampling at bass frequencies than at midrange and treble, which is why the bass part of the curves is smooth (if I raise the FFT size, there are more points in bass frequencies to plot, and the curve gets less smooth).
But I don't really understand why there's so much up-and-down in the upper frequencies, mainly because I don't really understand with the FFT tool is doing.

This post has been edited by krabapple: Mar 14 2013, 17:51
Alexey Lukin
post Mar 14 2013, 18:00
Post #2





Group: Members
Posts: 191
Joined: 31-July 08
Member No.: 56508



Smooth spectra are only produced by very transient sounds, like clicks. Take a single note — its spectrum will be a harmonic series, i. e. very jagged. When you mix many notes in a song, all their spectra are added together, hence the jaggedness. So, it is not just a property of the FFT — it's a property of the sound that you are analyzing.
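To make the click-versus-note contrast concrete, here is a minimal sketch in plain Python. The 64-sample length, the harmonic amplitudes and the helper names `dft_magnitudes` and `spread_db` are all assumptions chosen for illustration; the DFT is the naive textbook sum, not whatever Audition uses internally.

```python
import cmath
import math

def dft_magnitudes(x):
    """Magnitudes of the one-sided DFT (bins 0..N/2) of a real sequence."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
            for k in range(n // 2 + 1)]

N = 64  # toy frame length (assumption), small enough for a naive DFT

# A click: a single non-zero sample.
click = [0.0] * N
click[0] = 1.0

# A "note": fundamental at bin 4 plus two weaker harmonics at bins 8 and 12.
note = [math.sin(2 * math.pi * 4 * t / N)
        + 0.5 * math.sin(2 * math.pi * 8 * t / N)
        + 0.25 * math.sin(2 * math.pi * 12 * t / N)
        for t in range(N)]

click_mag = dft_magnitudes(click)
note_mag = dft_magnitudes(note)

def spread_db(mags, floor=1e-9):
    """Difference in dB between the largest and smallest bin magnitude."""
    db = [20 * math.log10(max(m, floor)) for m in mags]
    return max(db) - min(db)

print("click spread (dB):", round(spread_db(click_mag), 1))
print("note  spread (dB):", round(spread_db(note_mag), 1))
print("note peaks at bins:", [k for k, m in enumerate(note_mag) if m > 1.0])
```

The click has the same magnitude in every bin, while the note puts its energy into a few isolated bins with near-silence in between. A mix of many notes is a sum of such combs, which is where the jaggedness comes from.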

This post has been edited by Alexey Lukin: Mar 14 2013, 18:00
DonP
post Mar 14 2013, 18:05
Post #3





Group: Members (Donating)
Posts: 1471
Joined: 11-February 03
From: Vermont
Member No.: 4955



QUOTE (krabapple @ Mar 14 2013, 11:48) *
What I want to understand is why the graphs are typically so 'jagged' -- this cannot possibly represent actual EQ moves at each frequency bin. So what causes it? (I used the FFT size and type indicated above)


Maybe because your source material is music instead of noise?

At least with "standard" (western/European) music, notes fall on particular frequencies separated by a ratio of at least 2**(1/12) (one semitone), mostly more.
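As a rough illustration of how that semitone ratio compares with the analyzer's bin width, the sketch below assumes A4 = 440 Hz tuning and the 44.1 kHz / 4096-point settings mentioned in the first post. The helper name `semitone_gap_hz` is made up for this example.

```python
SEMITONE = 2 ** (1 / 12)   # frequency ratio between adjacent notes
FS = 44100                 # sample rate in Hz (CD audio)
N = 4096                   # FFT size used in the first post
BIN_WIDTH = FS / N         # about 10.77 Hz

def semitone_gap_hz(freq):
    """Distance in Hz from a note at `freq` to the note one semitone above."""
    return freq * (SEMITONE - 1)

# Three A notes, assuming A4 = 440 Hz.
for name, freq in [("A2", 110.0), ("A4", 440.0), ("A6", 1760.0)]:
    gap = semitone_gap_hz(freq)
    print(f"{name}: next semitone is {gap:6.1f} Hz higher, "
          f"about {gap / BIN_WIDTH:4.1f} bins")
```

In the bass, adjacent notes are closer together than one bin, so they blur into a smooth curve; a couple of octaves up, each semitone spans several bins, so individual notes and the gaps between them can show up as separate peaks and dips.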

xnor
post Mar 14 2013, 18:09
Post #4





Group: Developer
Posts: 565
Joined: 29-April 11
From: Austria
Member No.: 90198



You only get a "flat" line if you analyze something like an impulse or full range sine sweep.

Regarding resolution, it is defined as Fs/N, so for 44.1 kHz and N=4096 the resolution is 10.77 Hz.
So from 100 Hz to 200 Hz you have about 9 bins, but from 1 kHz to 2 kHz you have about 93 bins. This makes the spectrum at higher frequencies look a lot more jagged.
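The bin counts above can be checked with a few lines of Python. This is only a sketch: the function name `bins_in_range` is hypothetical, and it assumes bin centres sit at k*Fs/N and counts those falling in a half-open range.

```python
import math

def bins_in_range(fs, n, f_lo, f_hi):
    """Count FFT bin centres k*fs/n that fall in [f_lo, f_hi)."""
    res = fs / n
    first = math.ceil(f_lo / res)       # first bin at or above f_lo
    last = math.ceil(f_hi / res) - 1    # last bin strictly below f_hi
    return max(0, last - first + 1)

FS, N = 44100, 4096
print("resolution:", round(FS / N, 2), "Hz")
print("100-200 Hz:", bins_in_range(FS, N, 100, 200), "bins")
print("1-2 kHz:   ", bins_in_range(FS, N, 1000, 2000), "bins")
```

Both ranges are one octave wide, so they take up the same width on a logarithmic frequency axis, yet the upper one holds roughly ten times as many data points.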

This post has been edited by xnor: Mar 14 2013, 18:10
krabapple
post Mar 14 2013, 19:22
Post #5





Group: Members
Posts: 2216
Joined: 18-December 03
Member No.: 10538



Thanks people.

I understand about there being different number of bins in different frequency ranges, due to the resolution.

There's a typo in my first post: "I don't really understand with the FFT tool is doing." should be "I don't really understand WHAT the FFT tool is doing."

NB the tool is working over the whole file -- this isn't a snapshot taken while the file is playing. It's the whole track, selected then analyzed.

So let's say for the bins between 1400 and ~1500 Hz -- here's the actual data (rounded to the nearest integers). Col 1 is frequency (Hz), col 2 is the left channel and col 3 is the right channel (both in dBFS):

1400 -40 -38
1410 -40 -39
1421 -42 -41
1432 -43 -42
1443 -43 -42
1453 -41 -39
1464 -38 -36
1475 -36 -34
1486 -34 -32
1497 -34 -32
1507 -35 -35

What has the tool actually done? For the first bin (1400-1409), has it determined an average level for all those frequencies, across the whole track, for each channel? 1400 Hz is ~F two octaves above middle C (i.e. it's F6). The next semitone (F#6) is ~80 Hz above that, so 1400-1409 Hz is still pretty much 'F6'. Is the analyzer therefore essentially averaging the levels of all the 'F6's in the track? If that's how it works, I would expect to see peaks at pitches belonging to the key of the track, versus pitches outside the key, which would tend to be played less often (or exist as harmonics).

(This track happens to be by Yes, so I can't really test this here... they liked to change key a bit within a track.)
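For anyone wanting to check which note a given bin sits closest to, here is a small sketch. It assumes twelve-tone equal temperament with A4 = 440 Hz and the usual MIDI numbering (A4 = 69); `nearest_note` is a hypothetical helper, not part of any tool mentioned in the thread.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def nearest_note(freq, a4=440.0):
    """Return (name, exact frequency) of the equal-tempered note nearest to freq."""
    midi = round(69 + 12 * math.log2(freq / a4))   # MIDI note number, A4 = 69
    name = NOTE_NAMES[midi % 12] + str(midi // 12 - 1)
    return name, a4 * 2 ** ((midi - 69) / 12)

# A few of the bin frequencies from the table above.
for f in (1400, 1443, 1486, 1507):
    name, exact = nearest_note(f)
    print(f"{f} Hz -> {name} ({exact:.1f} Hz)")
```

Under those assumptions F6 comes out near 1397 Hz and F#6 near 1480 Hz, so the eleven bins in the table cover roughly one and a half semitones.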

This post has been edited by krabapple: Mar 14 2013, 19:37
Alexey Lukin
post Mar 14 2013, 21:14
Post #6





Group: Members
Posts: 191
Joined: 31-July 08
Member No.: 56508



When you press the Scan button in Audition, it computes multiple spectra in your selection and averages them together. Each FFT splits the signal into a number of frequency bands (like a crossover does) and computes RMS of the signal in each band. So, your overall spectrum displays the time-averaged energy (RMS) in every frequency band of your file. If your FFT size is 4096, then the number of frequency bands is 2047 and their central frequencies are given as k*44100/4096 (here k = 0, ..., 2048). The shape of each crossover band and inter-band signal leakage is controlled by the choice of a weighting window (Blackman-Harris in your case).
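The averaging described here can be sketched in plain Python. This is not Audition's actual code: the non-overlapping frames, the lack of dBFS scaling, the tiny 64-point frame and all function names are assumptions made to keep the example short and runnable. The window uses the standard 4-term Blackman-Harris coefficients.

```python
import cmath
import math

def blackman_harris(n):
    """4-term Blackman-Harris window of length n (periodic form)."""
    a0, a1, a2, a3 = 0.35875, 0.48829, 0.14128, 0.01168
    return [a0 - a1 * math.cos(2 * math.pi * i / n)
               + a2 * math.cos(4 * math.pi * i / n)
               - a3 * math.cos(6 * math.pi * i / n) for i in range(n)]

def power_spectrum(frame, window):
    """Power in each of the n/2 + 1 bins of one windowed frame (naive DFT)."""
    n = len(frame)
    xw = [s * w for s, w in zip(frame, window)]
    return [abs(sum(xw[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))) ** 2
            for k in range(n // 2 + 1)]

def averaged_spectrum_db(signal, n):
    """Average the power spectra of consecutive n-sample frames; return dB per bin."""
    window = blackman_harris(n)
    frames = [signal[i:i + n] for i in range(0, len(signal) - n + 1, n)]
    total = [0.0] * (n // 2 + 1)
    for frame in frames:
        for k, p in enumerate(power_spectrum(frame, window)):
            total[k] += p
    return [10 * math.log10(max(t / len(frames), 1e-20)) for t in total]

# Toy input: a steady tone centred on bin 8 of a 64-point FFT, 4 frames long.
N = 64
signal = [math.sin(2 * math.pi * 8 * t / N) for t in range(4 * N)]
spectrum = averaged_spectrum_db(signal, N)
peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
print("bins:", len(spectrum), "peak at bin", peak_bin)
```

Because the result is an average over every frame in the selection, a note only shows up as a peak in proportion to how loud it is and how much of the track it occupies.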
Alexey Lukin
post Mar 15 2013, 03:47
Post #7





Group: Members
Posts: 191
Joined: 31-July 08
Member No.: 56508



Made a mistake above: the number of bands is 2049.
xnor
post Mar 15 2013, 04:59
Post #8





Group: Developer
Posts: 565
Joined: 29-April 11
From: Austria
Member No.: 90198



Audition seems to display the bins incorrectly. Let's define res = Fs/N.

FFT gives results like this:
0 - res/2
res/2 - 3*res/2
and so on, so the bin centers are 0, 1*res, 2*res ... N/2*res which is a total of N/2+1 bins

Audition displays:
0 - res
res - 2*res
and so on, so the bin centers are res/2, 3*res/2 which is a total of N/2 bins
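The two layouts can be written out as follows. This is a sketch with hypothetical function names; the second function only models the display convention described in this post and is not taken from Audition itself.

```python
def fft_bin_centres(fs, n):
    """Centres of the n/2 + 1 bins a real-input FFT produces: 0, res, ..., fs/2."""
    res = fs / n
    return [k * res for k in range(n // 2 + 1)]

def shifted_bin_centres(fs, n):
    """Centres if bins are instead drawn as [0, res), [res, 2*res), ...: n/2 values."""
    res = fs / n
    return [(k + 0.5) * res for k in range(n // 2)]

FS, N = 44100, 4096
standard = fft_bin_centres(FS, N)
shifted = shifted_bin_centres(FS, N)
print(len(standard), standard[0], standard[-1])
print(len(shifted), round(shifted[0], 2), round(shifted[-1], 2))
```

The two conventions differ by half a bin (about 5.4 Hz here), which matters mostly at the lowest frequencies, where half a bin is a large fraction of the frequency being read off.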

This post has been edited by xnor: Mar 15 2013, 04:59
krabapple
post Mar 15 2013, 19:10
Post #9





Group: Members
Posts: 2216
Joined: 18-December 03
Member No.: 10538



FWIW, in my hands, Audition generated 2046 data points per channel when the FFT window size was set to 4096.

//

So, given what's said here, what can be said about results from frequency profile comparison derived from the same method?

Here's a link to a graph I made of two different remasters of the same track, compared to a third (earliest) version.
This is a 'difference' graph, where I just graph the difference for each bin between the remaster and the earliest version. So the earliest version is the X-axis, and the curves are the differences between remaster 'HD' (both channels) and the earliest version, and between remaster 'AF' (both channels) and the earliest version. What seems curious to me is how smooth the difference is between AF/earliest, versus HD/earliest. What might explain this?

http://www.hydrogenaudio.org/forums/index....st&p=827509

This post has been edited by krabapple: Mar 15 2013, 19:20
saratoga
post Mar 15 2013, 19:34
Post #10





Group: Members
Posts: 4903
Joined: 2-September 02
Member No.: 3264



QUOTE (krabapple @ Mar 15 2013, 13:10) *
Here's a link to a graph I made of two different remasters of the same track, compared to a third (earliest) version.
This is a 'difference' graph, where I just graph the difference for each bin between the remaster and the earliest version. So the earliest version is the X-axis, and the curves are the differences between remaster 'HD' (both channels) and the earliest version, and between remaster 'AF' (both channels) and the earliest version. What seems curious to me is how smooth the difference is between AF/earliest, versus HD/earliest. What might explain this?

http://www.hydrogenaudio.org/forums/index....st&p=827509


What are the units on that plot? dB relative to the quantization size? So the 'AF' version is ~1 bit different than the original? Or am I misunderstanding?
krabapple
post Mar 15 2013, 20:51
Post #11





Group: Members
Posts: 2216
Joined: 18-December 03
Member No.: 10538



QUOTE (saratoga @ Mar 15 2013, 14:34) *
QUOTE (krabapple @ Mar 15 2013, 13:10) *
Here's a link to a graph I made of two different remasters of the same track, compared to a third (earliest) version.
This is a 'difference' graph, where I just graph the difference for each bin between the remaster and the earliest version. So the earliest version is the X-axis, and the curves are the differences between remaster 'HD' (both channels) and the earliest version, and between remaster 'AF' (both channels) and the earliest version. What seems curious to me is how smooth the difference is between AF/earliest, versus HD/earliest. What might explain this?

http://www.hydrogenaudio.org/forums/index....st&p=827509


What are the units on that plot? dB relative to the quantization size? So the 'AF' version is ~1 bit different than the original? Or am I misunderstanding?



X axis is frequency (logarithmic)

Y is dB

Dynamic
post Mar 16 2013, 10:35
Post #12





Group: Members
Posts: 803
Joined: 17-September 06
Member No.: 35307



QUOTE (krabapple @ Mar 15 2013, 18:10) *


The frequency resolution depends on both the sample rate and the number of points in the FFT, so if the HD version had a sampling rate of, say, 96 kHz, the resolution of a 4096 point FFT-derived power spectrum would be roughly 48000Hz / 2047 = 23.45 Hz (albeit that there's an influence from the Windowing Function on the trade-off between stop-band ripple and frequency resolution). The CD version would be 22050Hz / 2047 = 10.77 Hz.

For that reason, I find it odd that both plots seem to show the same frequency bin points where the point-to-point fit line changes angle, if indeed the sampling rate is different between the two tracks. It might be that one is an HDCD and one is a regular CD or at least that both are 44.1kHz sampled, in which case the frequency resolution should be the same.

Anything above 16 kHz is probably best ignored as there are various dither and noise shaping schemes that make changes there that are in all likelihood inaudible.

There may well be different EQ between the two masterings and quite likely different playback volume levels (ReplayGain info would help in that regard) and there might even be clipping in one or the other.
Alexey Lukin
post Mar 16 2013, 14:51
Post #13





Group: Members
Posts: 191
Joined: 31-July 08
Member No.: 56508



QUOTE (Dynamic @ Mar 16 2013, 05:35) *
The frequency resolution depends on both the sample rate and the number of points in the FFT, so if the HD version had a sampling rate of, say, 96 kHz, the resolution of a 4096 point FFT-derived power spectrum would be roughly 48000Hz / 2047 = 23.45 Hz. The CD version would be 22050Hz / 2047 = 10.77 Hz.

More accurately, 96000/4096 = 23.4375 Hz, 44100/4096 = 10.7666 Hz.
krabapple
post Mar 16 2013, 23:49
Post #14





Group: Members
Posts: 2216
Joined: 18-December 03
Member No.: 10538



QUOTE (Dynamic @ Mar 16 2013, 05:35) *
QUOTE (krabapple @ Mar 15 2013, 18:10) *


The frequency resolution depends on both the sample rate and the number of points in the FFT, so if the HD version had a sampling rate of, say, 96 kHz, the resolution of a 4096 point FFT-derived power spectrum would be roughly 48000Hz / 2047 = 23.45 Hz (albeit that there's an influence from the Windowing Function on the trade-off between stop-band ripple and frequency resolution). The CD version would be 22050Hz / 2047 = 10.77 Hz.

For that reason, I find it odd that both plots seem to show the same frequency bin points where the point-to-point fit line changes angle, if indeed the sampling rate is different between the two tracks. It might be that one is an HDCD and one is a regular CD or at least that both are 44.1kHz sampled, in which case the frequency resolution should be the same.

Anything above 16 kHz is probably best ignored as there are various dither and noise shaping schemes that make changes there that are in all likelihood inaudible.

There may well be different EQ between the two masterings and quite likely different playback volume levels (ReplayGain info would help in that regard) and there might even be clipping in one or the other.


The 'HD' track is in fact 96kHz/24 bit. The 'AF' track and the reference track are 44/16. I'll check the Audition numerical output and see if the bin sizes were different for the HD vs the other two.

It's almost certain that there are EQ differences between the three versions -- they are different remasters. The question I have is why one difference graph is so jagged and the other so smooth.
Level matching would not change that.

This post has been edited by krabapple: Mar 16 2013, 23:51
saratoga
post Mar 17 2013, 05:09
Post #15





Group: Members
Posts: 4903
Joined: 2-September 02
Member No.: 3264



QUOTE (krabapple @ Mar 16 2013, 17:49) *
The question I have is why one difference graph is so jagged and the other so smooth.
Level matching would not change that.


It's jagged because it's been redithered or at least had some kind of digital effect applied. That introduces a random 1 bit error. Since the bin size in that FFT is constant, but you've plotted it on a log scale, it gets more jagged as the frequency increases. The smoother one just looks like the level is a little different.
Dynamic
post Mar 17 2013, 09:11
Post #16





Group: Members
Posts: 803
Joined: 17-September 06
Member No.: 35307



Something doesn't look right. I'm half wondering whether the 96 kHz one is not plotted with the correct frequencies. It's also somewhat awkward to read a log-F scale where each decade starts with a 2 (20, 200, 2000, 20000 Hz), when it's more usual to plot 10, 100, 1000, 10000 so that the minor marks fall at 2,3,4,5,6,7,8,9 times the preceding decade value.

If you're trying to compare ripple it's important to avoid having different frequency resolution from one track to the other and compare like with like, and also to average over the same time (e.g. the entire track).

You could increase the FFT size to 16384 to improve the resolution (if Audition is like its predecessor, Cool Edit as I remember it). It's best to compare at various resolutions to see if it's an artifact of the smoothing effect from FFT bin width.

You could also resample the HD version to 44.1 kHz (or the others to 96k) and then plot them all on the same axes in case the frequencies were not the same.


--------------------
Dynamic the artist formerly known as DickD
krabapple
post Mar 17 2013, 18:26
Post #17





Group: Members
Posts: 2216
Joined: 18-December 03
Member No.: 10538



You're right, I mistakenly applied the same bins to the 96kHz and 44.1 remasters -- they should be different.
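A quick sketch of why reusing one frequency column for both sample rates goes wrong: with the same FFT size, bin number k lands on a different frequency in each file. The function name `bin_frequency` is hypothetical and bin centres are assumed to be k*Fs/N.

```python
N = 4096  # FFT size used throughout the thread

def bin_frequency(k, fs, n=N):
    """Centre frequency in Hz of FFT bin k at sample rate fs."""
    return k * fs / n

for k in (10, 100, 1000):
    f_cd = bin_frequency(k, 44100)
    f_hd = bin_frequency(k, 96000)
    print(f"bin {k:4d}: {f_cd:8.1f} Hz at 44.1 kHz, {f_hd:8.1f} Hz at 96 kHz")
```

Subtracting the two exports row by row therefore compares levels at frequencies that differ by a factor of about 2.18 (96000/44100), which would be expected to produce a ragged difference curve even if the masterings were identical.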



Dynamic
post Mar 17 2013, 21:54
Post #18





Group: Members
Posts: 803
Joined: 17-September 06
Member No.: 35307



My guess is that the best comparison would be to resample the 96 kHz version to 44.1kHz (16 bit will be fine) and save that as a temporary lossless file. Then run the same FFT Power Spectrum on comparable files and see if there's much difference.

Even if they're virtually identical, it would be useful information for others wanting to compare different materials properly.

This post has been edited by Dynamic: Mar 17 2013, 21:59
krabapple
post Mar 18 2013, 16:30
Post #19





Group: Members
Posts: 2216
Joined: 18-December 03
Member No.: 10538



Thanks again. Here are updated images:

(1) conversion settings used to change the 96/24 file to 44/16

http://www.hydrogenaudio.org/forums/index....st&p=827804

(2) difference plot of 2 remasters versus the original CD issue, linear scale

http://www.hydrogenaudio.org/forums/index....st&p=827805

(3) same as (2), log scale

http://www.hydrogenaudio.org/forums/index....st&p=827806


This is again with an FFT window of 4096, Blackman-Harris. The two sets of curves are much more comparable now in terms of 'smoothness'. I made no attempt to level-match. Right now all three 'match' circa 200 Hz and the two newer remasters 'match' each other again circa 4 kHz.

My interpretation would be: Compared to the original issue, it looks like a mostly steady gain in the HD version through the midrange, whereas the AF has a 'hump' @3kHz and a 'dip' @7kHz. Also there's more low bass in both compared to the old CD (considerably more in the case of the AF, though at this resolution there's little detail).


(edit: 'dbFS' replaced with 'dB' on axis title)

This post has been edited by krabapple: Mar 18 2013, 16:34
Dynamic
post Mar 18 2013, 18:43
Post #20





Group: Members
Posts: 803
Joined: 17-September 06
Member No.: 35307



Yes, at least they're fairly smooth.

To be clear, I'm assuming you have a spreadsheet that contains columns Frequency (Hz), Left (dBFS), Right (dBFS) for all frequency bins of the power spectrum of each file.

First, there's the original file from the first CD release, which you plotted in your first upload here.

Then there's the file from the CD remaster that you've called 'AF'.

Then there's the file from the HD release (downsampled from 96kHz stereo to 44.1kHz stereo to make it comparable to the CDs - your downsampling settings looked fine, by the way).

As you want to compare ORIG, AF and HD, you then subtract the values of ORIG from the values in AF and the values in HD to obtain the difference in dB for each version compared to the original CD release, and it's these difference values that you're plotting against frequency. (Difference in dB in logarithmic domain is equivalent to dividing in the linear domain, hence I use the term 'normalize' below)
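The equivalence between subtracting in dB and dividing in the linear domain can be checked numerically. The two amplitudes below are made-up values for a single bin, not data from the thread, and `to_db` is a hypothetical helper.

```python
import math

def to_db(amplitude):
    """Convert a linear amplitude (1.0 = full scale) to dBFS."""
    return 20 * math.log10(amplitude)

# Hypothetical RMS amplitudes for one frequency bin in two masterings.
orig_amp = 0.010
remaster_amp = 0.020

diff_db = to_db(remaster_amp) - to_db(orig_amp)   # subtract in the dB domain
ratio_db = to_db(remaster_amp / orig_amp)         # divide in the linear domain
print(round(diff_db, 2), round(ratio_db, 2))
```

Both routes give the same number (a doubling of amplitude is about 6 dB), which is why the difference plot behaves like the original spectrum divided out of each remaster.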

In these difference plots you cannot identify the notes of the musical scale that are present, because you've normalized against the original CD's frequency response to measure only the difference in EQ (and the difference in filtering at the low and high ends), and thus you've normalized out any ripple that was originally present from tones that are notes on the musical scale (albeit averaged over the whole file).

Note that many sounds that aren't on the musical scale will be present thanks to noiselike sounds (sibilant vocal sounds and hissing noises, drums and hi-hat hits and similar untuned percussion all have a fairly spread spectrum showing no peaks around the notes of the musical scale), so when averaging over the whole file, there's likely to be plenty of power in those parts of the spectrum to successfully normalize against.

The original frequency response shows a 20 kHz low pass filter, which is recommended by the Red Book standard for audio CDs but isn't applied by Audition's downsampling filter or in many modern CD releases that often use more of a brickwall filter just below 22.05kHz. It also shows relatively little content in the deep bass (open bass-E on a 4-string bass guitar has a fundamental frequency of about 41Hz).

Your plots now show the EQ curve you'd have to apply to the original CD to obtain the same tonal balance as the AF or HD releases (and the negative of those dB values will turn AF or HD into the original tonal balance). The exception is in ranges where the original or the reissued versions have very low (essentially zero) audible content, which I guess means below about 40Hz and above about 16 kHz to 20 kHz, where you're comparing very small numbers as a ratio (I'm actually surprised the differences are so low - within 10 dB). If you wanted to match the EQ between various recordings, you'd probably get close by reading off values and plugging them into an EQ.

It may be that certain aspects of stereo separation have changed slightly between mixes, causing some of the left-right differences shown in your plots, although these are fairly minor and barely audible (mostly < 1 dB).
krabapple
post Mar 18 2013, 22:42
Post #21





Group: Members
Posts: 2216
Joined: 18-December 03
Member No.: 10538



QUOTE (Dynamic @ Mar 18 2013, 13:43) *
Yes, at least they're fairly smooth.

To be clear, I'm assuming you have a spreadsheet that contains columns Frequency (Hz), Left (dBFS), Right (dBFS) for all frequency bins of the power spectrum of each file.


First, there's the original file from the first CD release, which you plotted in your first upload here.

Then there's the file from the CD remaster that you've called 'AF'.

Then there's the file from the HD release (downsampled from 96kHz stereo to 44.1kHz stereo to make it comparable to the CDs - your downsampling settings looked fine, by the way).

As you want to compare ORIG, AF and HD, you then subtract the values of ORIG from the values in AF and the values in HD to obtain the difference in dB for each version compared to the original CD release, and it's these difference values that you're plotting against frequency. (Difference in dB in logarithmic domain is equivalent to dividing in the linear domain, hence I use the term 'normalize' below)


Correct on all points.


QUOTE
In these difference plots you cannot identify the notes of the musical scale that are present, because you've normalized against the original CD's frequency response to measure only the difference in EQ (and the difference in filtering at the low and high ends), and thus you've normalized out any ripple that was originally present from tones that are notes on the musical scale (albeit averaged over the whole file).



But that could be done if one just graphs the actual values, rather than differences. i.e., not comparing to a reference?


QUOTE
Note that many sounds that aren't on the musical scale will be present thanks to noiselike sounds (sibilant vocal sounds and hissing noises, drums and hi-hat hits and similar untuned percussion all have a fairly spread spectrum showing no peaks around the notes of the musical scale), so when averaging over the whole file, there's likely to be plenty of power in those parts of the spectrum to successfully normalize against.

The original frequency response shows a 20 kHz low pass filter, which is recommended by the Red Book standard for audio CDs but isn't applied by Audition's downsampling filter or in many modern CD releases that often use more of a brickwall filter just below 22.05kHz. It also shows relatively little content in the deep bass (open bass-E on a 4-string bass guitar has a fundamental frequency of about 41Hz).




By 'original frequency response', you mean you're inferring the FR of the reference track? Or do you mean the non-difference response (i.e., actual values, if they were graphed)?

Anyway, regarding filters: I actually cut off the graph below 20 Hz and above 20 kHz for convenience; for example, there is more to the curves out to ~22 kHz, as one would expect, as well as one sample point at 10 Hz not shown. Highest bin value is 22,039 Hz. Level at those frequencies is way down, as one would expect ;> And for examining comparative lowest bass, I think a larger FFT size would be the thing to do.



QUOTE
Your plots now show the EQ curve you'd have to apply to the original CD to obtain the same tonal balance as the AF or HD releases (and the negative of those dB values will turn AF or HD into the original tonal balance). The exception is in ranges where the original or the reissued versions have very low (essentially zero) audible content, which I guess means below about 40Hz and above about 16 kHz to 20 kHz, where you're comparing very small numbers as a ratio (I'm actually surprised the differences are so low - within 10 dB). If you wanted to match the EQ between various recordings, you'd probably get close by reading off values and plugging them into an EQ.

It may be that certain aspects of stereo separation have changed slightly between mixes, causing some of the left-right differences shown in your plots, although these are fairly minor and barely audible (mostly < 1 dB).


Thanks for these observations!

This post has been edited by krabapple: Mar 18 2013, 22:44
Dynamic
post Mar 19 2013, 11:50
Post #22





Group: Members
Posts: 803
Joined: 17-September 06
Member No.: 35307



QUOTE (krabapple @ Mar 18 2013, 21:42) *
But that could be done if one just graphs the actual values, rather than differences. i.e., not comparing to a reference?


Yes, the actual values. Well, you'd pick out some of the notes. You can see the peaks corresponding to tonal frequencies (the fundamental frequency of a note and its overtones, or multiples of its frequency) more easily on a spectrogram. When you average over the whole song, the various notes become averaged down towards the flat overall response, but may still show up enough to be picked out.

It also depends on the frequency resolution. A larger FFT size (16384, say) would show them up better.


--------------------
Dynamic the artist formerly known as DickD
