Topic: Visualization of signal distortion

Visualization of signal distortion

I would like to propose a method for visualizing the differences between two signals – input and output. Both signals are divided into time blocks (50+ ms), and the difference between each pair of corresponding blocks is calculated in dB and indicated by color. Such a representation of the difference signal makes it easy to spot visually which parts of the signal are transferred less accurately through the device or processing under test. For example, it easily shows that the sound section of the iPhone 6 is exactly the same as that of the iPhone 5S.
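
For anyone who wants to experiment, below is a minimal sketch (Python/NumPy) of the per-block calculation described above. The 50 ms block length comes from the description; the energy-ratio formula and the assumption that the two signals are already aligned are simplifications of mine, not the exact SoundExpert implementation (the actual Df calculation is discussed later in the thread).

Code:
import numpy as np

def block_differences_db(reference, output, fs, block_ms=50):
    """Per-block difference level in dB for two already aligned signals.
    Values near 0 dB mean the block is heavily distorted; very negative
    values mean the block passed through almost unchanged."""
    n = int(fs * block_ms / 1000)                 # samples per block
    n_blocks = min(len(reference), len(output)) // n
    levels = np.empty(n_blocks)
    for i in range(n_blocks):
        ref = reference[i * n:(i + 1) * n]
        out = output[i * n:(i + 1) * n]
        diff_energy = np.sum((out - ref) ** 2)
        ref_energy = np.sum(ref ** 2) + 1e-20     # guard against silent blocks
        levels[i] = 10 * np.log10(diff_energy / ref_energy + 1e-20)
    return levels   # map each value to a color to draw the diffrogram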

An example diffrogram for the iPhone 6 and nine SE sound samples is below.



Details of the method and diffrograms for other devices are in the article “Diffrogram: visualization of signal differences in audio research”. Your critique and opinions would be very helpful for me. Thanks in advance.
keeping audio clear together - soundexpert.org

Visualization of signal distortion

Reply #1
At this point we are not trying to explain distinctions on these diffrograms and their impact on perceived audio quality of the devices. We can just establish the fact that various signals are transferred through these devices with various precision.


Perhaps you could put this statement or something like it at the beginning of the article.

Visualization of signal distortion

Reply #2



Both link and picture seem to have gone missing for me.

Visualization of signal distortion

Reply #3
Same here, server down?

edit: working now
"I hear it when I see it."


Visualization of signal distortion

Reply #5
How about a mesh or color 3D plot of the noise spectrum?

Even better, a plot of SNR vs. frequency as a function of time. Then you have a chance of making minimal sense of it.
-----
J. D. (jj) Johnston

Visualization of signal distortion

Reply #6
Even better, a plot of SNR vs. frequency as a function of time. Then you have a chance of making minimal sense of it.

Probably the difference between two spectrograms – input and output – will do what you suggest. I will try it a bit later, as it is not difficult. For now, though, I would like to derive as much information as possible from the comparison of waveforms: the Input–Output Difference. This does not seem senseless to me. Not perfect – yes, but no objective audio metric is perfect.
keeping audio clear together - soundexpert.org

Visualization of signal distortion

Reply #7
Part 2 of the article is ready - http://soundexpert.org/news/-/blogs/visual...istortion#part2
Testing of iBasso DX50 with standard test signals and real-life music material.

p.s. with color pictures!!!


Starting with Figure 21, things that are clear to me in other presentation formats start looking like mud to me. I can get spectrograms and obtain useful knowledge from them, but they are not as good for me as other formats for the same information.

Visualization of signal distortion

Reply #8
An example of testing a digital audio player with various audio signals and a single-measurement approach based on the Difference level parameter.



Starting with Figure 21, things that are clear to me in other presentation formats start looking like mud to me. I can get spectrograms and obtain useful knowledge from them, but they are not as good for me as other formats for the same information.

Spectrograms are supplementary here; they just help to better understand the nature of the signal being examined. The main idea of the diffrogram is to present a map of output waveform degradation (and to measure it). So the useful knowledge is which parts of a signal (or which types of signal) are distorted the most (or the least) by some device under test.
keeping audio clear together - soundexpert.org

Visualization of signal distortion

Reply #9

It is well known that difference testing is fatally flawed by the fact that every kind of noise, distortion, and other artifact (some irrelevant to fidelity) is conflated into one number or one graph.

The whole purpose of over 100 years of audio testing has been to separate things that are of interest and relevant from those that are not.

For example, difference testing makes a huge point out of latency when, due to the nature of the recording process itself, latency is of no interest at all to almost all listeners.

Visualization of signal distortion

Reply #10

The difference calculation I use is not affected by latency or any other phase shift, nor by time stretching/shrinking. In general, all isomorphic distortions of the output waveform are not counted; the Difference level (Df) measures only non-isomorphic differences between two signals.

Below are the results of Df measurements for the codecs from Kamedo2's personal listening test (http://www.hydrogenaud.io/forums/index.php?showtopic=109716). The sound material is the whole album “The Dark Side of the Moon”, which was divided into 400 ms pieces (N = 6430); a Df value was calculated for each piece. Histograms of the Df values for the codecs are on the left, with the medians marked by dots. Those medians were then mapped to the results of the listening test, after being centered and scaled using the mean and standard deviation of the resulting quality scores. The center-and-scale procedure was performed twice – for all 5 points (red dots) and for 4 points, without Opus (blue dots).
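
For illustration only, here is a rough sketch of the per-piece procedure. The 400 ms piece length and the Df formula, Df = 10*log10(1 - |p|), come from the posts; the zero-lag cross-correlation below is my simplification and ignores the time/phase alignment (warping) that the real measurement performs first.

Code:
import numpy as np

def piece_df_values(reference, decoded, fs, piece_ms=400):
    """Cut two aligned signals into 400 ms pieces and return the Df value
    Df = 10*log10(1 - |p|) for each piece, p being the normalized
    cross-correlation at zero lag.  The median of the result is the
    statistic compared against the listening-test scores."""
    n = int(fs * piece_ms / 1000)
    values = []
    for i in range(min(len(reference), len(decoded)) // n):
        r = reference[i * n:(i + 1) * n]
        d = decoded[i * n:(i + 1) * n]
        p = np.dot(r, d) / (np.linalg.norm(r) * np.linalg.norm(d) + 1e-20)
        values.append(10 * np.log10(1 - abs(p) + 1e-20))
    return np.array(values)   # histogram these; np.median(values) gives the dot position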



Taking into account that different sound material was used for the calculation of Df and for the listening test, the correlation is surprising; the blue dots are almost inside the confidence intervals of the audio quality scores. Is it just a coincidence, or a sign of some regularity that went unnoticed during “over 100 years of audio testing”? Today I'm pretty sure that difference testing has huge research potential, not as a substitute for but as a supplement to listening tests. When the nature of the waveform degradation is the same, Df correlates well with the results of listening tests. In our case Opus has a substantially different psy-model (which is also seen in its histogram) and the Df magic does not work for it.

So, difference testing is not “fatally flawed”; it was just not used correctly. It is precisely its ability to accumulate and measure all types of artifacts that results in good correlation with subjective measurements, even in the case of psychoacoustic coding, where traditional audio measurements are absolutely helpless.
keeping audio clear together - soundexpert.org

Visualization of signal distortion

Reply #11
Finally, the new measurement procedure and the presentation of results have been defined. I have started a collection of audio measurements for portable players here – http://soundexpert.org/portable-players. The Df-slides are self-explanatory; the top ones represent devices that are supposed to have better perceived quality (more transparent).
keeping audio clear together - soundexpert.org

Visualization of signal distortion

Reply #12
Wow, this is awesome. I'd love to see how the Pono stacks up to these other devices.

Visualization of signal distortion

Reply #13
Wow, this is awesome. I'd love to see how the Pono stacks up to these other devices.

I would love that too ... and many other players as well. I need to work out some procedure for obtaining those devices; I can't buy them all.
keeping audio clear together - soundexpert.org

Visualization of signal distortion

Reply #14
So in
Code:
Df [dB] = 10 * Log10(1 - |p|)
is p simply the cross-correlation of the reference and recorded signal?

I don't see how that would correspond well to subjective evaluation at all. Small audibly irrelevant differences would degrade Df.
"I hear it when I see it."

Visualization of signal distortion

Reply #15
So in
Code:
Df [dB] = 10 * Log10(1 - |p|)
is p simply the cross-correlation of the reference and recorded signal?

I don't see how that would correspond well to subjective evaluation at all. Small audibly irrelevant differences would degrade Df strongly.


p is the cross-correlation, but not a simple one. It is calculated only after the output signal has been time- and phase-aligned (warped) with the desired accuracy, so such "irrelevant differences" are not counted at all. Warping is computationally expensive – one minute of audio requires 25 minutes of processing time on my notebook (which is not very fast, though).
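
For concreteness, a minimal sketch of such a calculation, using only a coarse integer-sample alignment found via cross-correlation. The actual warping described above (fractional delays, drift, phase) is more elaborate, so treat this as an approximation rather than the SoundExpert implementation.

Code:
import numpy as np
from scipy.signal import correlate, correlation_lags

def df_db(reference, output):
    """Df = 10*log10(1 - |p|), where p is the normalized cross-correlation
    of the two signals after a coarse integer-sample time alignment."""
    # Find the lag that maximizes the cross-correlation and trim accordingly.
    xc = correlate(output, reference, mode="full")
    lag = correlation_lags(len(output), len(reference), mode="full")[np.argmax(np.abs(xc))]
    if lag > 0:
        output = output[lag:]
    elif lag < 0:
        reference = reference[-lag:]
    n = min(len(reference), len(output))
    ref, out = reference[:n], output[:n]
    # Normalized zero-lag cross-correlation of the aligned signals.
    p = np.dot(ref, out) / (np.linalg.norm(ref) * np.linalg.norm(out) + 1e-20)
    return 10 * np.log10(1 - abs(p) + 1e-20)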
keeping audio clear together - soundexpert.org

Visualization of signal distortion

Reply #16
So in
Code:
Df [dB] = 10 * Log10(1 - |p|)
is p simply the cross-correlation of the reference and recorded signal?

I don't see how that would correspond well to subjective evaluation at all. Small audibly irrelevant differences would degrade Df strongly.


p is the cross-correlation, but not a simple one. It is calculated only after the output signal has been time- and phase-aligned with the desired accuracy, so such "irrelevant differences" are not counted at all.


How do you account for irrelevant differences in the amplitude or phase response of the device's output?

Looking at some of those results where you have near 16 bit limited performance for pure tones but single digit dB for white noise, it looks like you are not.

Visualization of signal distortion

Reply #17
I am not talking about phase but about, e.g., ripple in low-pass filters, the effects of low-frequency roll-off, etc.
"I hear it when I see it."

Visualization of signal distortion

Reply #18
How do you account for irrelevant differences in the amplitude or phase response of the device's output?

I am not talking about phase but about, e.g., ripple in low-pass filters, the effects of low-frequency roll-off, etc.

The Df approach, among other things, helps to avoid exactly this kind of question and analysis (how amplitude or phase response affects the listening experience). The problem with such analysis is that its results can't be generalized to real-life audio material, as perception of the same drawbacks in audio reproduction varies with the sound material. The Df approach is different: it deals only with the relationship between waveform and perception. Researching such a relationship is much easier – you just compare the degradation of the waveform with the degradation of perceived audio quality and try to find a regularity. So this is just another attempt to find a correlation between objective measurements and subjective impressions. At this stage of my research the most promising objective parameter is the average (median) of Df values calculated for a relatively large population of short sound samples (6430 in the Df-slides). It correlates well with the results of listening tests. In some cases it does not correlate, but those cases can be identified beforehand (this is another research task).

So, returning to your questions, I can only say that such questions do not arise in Df metrics. Your questions belong to standard audio metrics, which can tell you how various specific and isolated audio parameters affect perceived audio quality, but fail to predict the result when those parameters are combined. Also, the current audio metrics are less suitable for the digital domain and completely helpless for psychoacoustic coding. I have a strong feeling that they have reached a dead end and we need to try a different approach.

Looking at some of those results where you have near 16 bit limited performance for pure tones but single digit dB for white noise, it looks like you are not.

Right you are – reproduction of white noise is heavily affected by phase response. As you can see, reproduction of real-life sound material is affected to a substantially lesser degree (exactly as you want it to be).
keeping audio clear together - soundexpert.org

Visualization of signal distortion

Reply #19
Looking at some of those results where you have near 16 bit limited performance for pure tones but single digit dB for white noise, it looks like you are not.

Right you are – reproduction of white noise is heavily affected by phase response. As you can see, reproduction of real-life sound material is affected to a substantially lesser degree (exactly as you want it to be).


I don't think that's the right interpretation of those results. I would say that your metric is extremely sensitive to what are essentially irrelevant differences in phase/amplitude response vs. frequency, which is why the white noise test fails to give meaningful results. The very large difference between the harmonic tests and the broadband tests suggests that this is actually not going to work very well at all with real-world signals, which are not simple harmonics.

Visualization of signal distortion

Reply #20
Those medians were then mapped to the results of the listening test
How?

Taking into account that different sound material was used for the calculation of Df and for the listening test, the correlation is surprising; the blue dots are almost inside the confidence intervals of the audio quality scores.
OK, so your metric fails completely to correlate with the perceived quality in this test, since only a single one of your mapped red dots fits into the confidence interval of one of the results, and even that single point might only fit because of how you do your mapping. Then you artificially removed Opus from your data, which is in fact the most interesting data point of all, because the Df metric differs so vastly from the perception-based result. This shifts your dots around, and all of a sudden more dots fit the confidence intervals. So...

Is it just a coincidence, or a sign of some regularity that went unnoticed during “over 100 years of audio testing”?
It's pure coincidence, and it's also not a very good correlation at all. You provided no rationale for why your Df metric should correlate at all with the results of Kamedo's test. All I see is random cherry-picking and doctoring of the data. Again.

Today I'm pretty sure that difference testing has huge research potential, not as a substitute for but as a supplement to listening tests. When the nature of the waveform degradation is the same, Df correlates well with the results of listening tests. In our case Opus has a substantially different psy-model (which is also seen in its histogram) and the Df magic does not work for it.
It does not correlate well, and it doesn't fit the Opus result. I realize that you're trying very hard to "sell" your Df idea to yourselves and others, but at this point it should simply have occurred to you that the metric is not useful for assessing audio quality. I'd still be interested in seeing a metric which does correlate with perceived audio quality, something like PSNR or SSIM in the video world, but this is not it.

So, difference testing is not “fatally flawed”; it was just not used correctly.
If your example isn't a shining example of how badly it fails, I do not know what is. You need "a million" cases to "prove" that your hypothesis is correct (in fact, you cannot), but just a single one to tear it down. You have provided the latter yourself. If you think this is a good example, I don't want to see the bad ones.

I think you should see this failure as an opportunity to rethink the Df metric, especially with respect to the Opus result. Maybe you can improve it somehow to make it more meaningful? Don't get me wrong, I appreciate the effort to find a useful metric.
It's only audiophile if it's inconvenient.

Visualization of signal distortion

Reply #21
Those medians were then mapped to the results of the listening test
How?

Linear mapping with the mean and standard deviation of the subjective scores as coefficients. The method is used when you need to compare values from different linear scales.
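
A minimal sketch of that mapping as I understand it (the function and variable names are only illustrative). Depending on the sign convention of Df, the standardized values may need to be negated so that “more transparent” maps to a higher score.

Code:
import numpy as np

def map_to_subjective_scale(df_medians, subjective_scores):
    """Standardize the Df medians, then rescale them with the mean and
    standard deviation of the subjective scores (linear mapping between
    two linear scales)."""
    df_medians = np.asarray(df_medians, dtype=float)
    scores = np.asarray(subjective_scores, dtype=float)
    z = (df_medians - df_medians.mean()) / df_medians.std()
    return z * scores.std() + scores.mean()   # negate z first if the scales run in opposite directions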

OK, so your metric fails completely to correlate with the perceived quality in this test, since only a single one of your mapped red dots fits into the confidence interval of one of the results, and even that single point might only fit because of how you do your mapping. Then you artificially removed Opus from your data, which is in fact the most interesting data point of all, because the Df metric differs so vastly from the perception-based result. This shifts your dots around, and all of a sudden more dots fit the confidence intervals. So...

I'm pretty sure that it is not possible to find an objective audio metric that will work in all cases. I'm trying to reach a less ambitious goal – to define a class of cases where the Df metric will work for sure. The example with Opus is very indicative in this sense. The problem with the exclusion of Opus is this – it should be excluded before the results of the listening test are ready. In other words, there should be a method to decide whether the Df metric works in a particular case or not. Such a method exists, and I mentioned its basic idea: when the nature of the waveform degradation is the same, Df correlates well with the results of listening tests.

One purpose of the Df metric is to reduce the number of required listening tests. Listening tests are necessary only when the nature of the waveform degradation changes substantially (as in the case of Opus). In other cases, subjective scores can be computed analytically with some degree of confidence, and that confidence can itself be computed.

It's pure coincidence, and it's also not a very good correlation at all. You provided no rationale for why your Df metric should correlate at all with the results of Kamedo's test. All I see is random cherry-picking and doctoring of the data. Again.

Taking into account the above, it is a good correlation. Maybe it is exactly due to the ability of waveform degradation to accumulate all possible types of distortion... I don't know. I'd rather show different examples of correlation than try to think up a rationale for why it should work.

And …
  • sometimes “cherry-picking and doctoring of the data” means research
  • you don't need “a million” cases to prove something, a few good ones are usually enough
  • all scientific knowledge is falsifiable by definition

Anyway, I appreciate your support for the search for new audio metrics.
keeping audio clear together - soundexpert.org

Visualization of signal distortion

Reply #22
I don't think that's the right interpretation of those results. I would say that your metric is extremely sensitive to what are essentially irrelevant differences in phase/amplitude response vs. frequency, which is why the white noise test fails to give meaningful results. The very large difference between the harmonic tests and the broadband tests suggests that this is actually not going to work very well at all with real-world signals, which are not simple harmonics.

I think that a sine wave is the easiest signal for any device to reproduce accurately, and white noise is the hardest, as its waveform is the most complicated. Real-world signals are somewhere in between (the histogram shows an example of such real-world signals). White noise testing is the worst-case testing for any device: all types of distortion introduced by the device contribute to its degradation. I'm not sure about further interpretation of the sine/white noise measurements.
keeping audio clear together - soundexpert.org

Visualization of signal distortion

Reply #23
Difference testing works in the sense that it detects any difference, sure, but it fails to weigh these differences.


I don't know if you account for that, but if you don't, could you try comparing some noise to the same noise filtered with a 1st-order all-pass, and again to the same noise filtered with a 2nd-order all-pass? What are the two resulting Df values?
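
A minimal sketch of that experiment; the all-pass coefficients are arbitrary example values, and df_db() stands for some Df calculation such as the cross-correlation sketch earlier in the thread.

Code:
import numpy as np
from scipy.signal import lfilter

rng = np.random.default_rng(0)
noise = rng.standard_normal(48000)             # 1 s of white noise at 48 kHz

# 1st-order all-pass: H(z) = (a + z^-1) / (1 + a*z^-1), flat magnitude response
a = 0.5
ap1 = lfilter([a, 1.0], [1.0, a], noise)

# 2nd-order all-pass: numerator is the reversed denominator
a1, a2 = -0.9, 0.5
ap2 = lfilter([a2, a1, 1.0], [1.0, a1, a2], noise)

print("Df vs 1st-order all-pass:", df_db(noise, ap1))   # df_db() as sketched earlier
print("Df vs 2nd-order all-pass:", df_db(noise, ap2))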
"I hear it when I see it."

Visualization of signal distortion

Reply #24
I think that a sine wave is the easiest signal for any device to reproduce accurately, and white noise is the hardest, as its waveform is the most complicated. Real-world signals are somewhere in between (the histogram shows an example of such real-world signals). White noise testing is the worst-case testing for any device: all types of distortion introduced by the device contribute to its degradation. I'm not sure about further interpretation of the sine/white noise measurements.

I'm not so sure.

A sine is just as hard to reproduce accurately, if not harder due to its lower crest factor. The problem is the definition of "accurate".
One big difference between these signals is the bandwidth. So of course, with a wide-bandwidth signal such as noise you will catch a lot more reproduction imperfections.

Now, if these imperfections were about the same in magnitude across the entire bandwidth, then I'd think that the noise signal would give you a worse Df, but if you went through the whole bandwidth with one sine (one frequency) at a time you'd get better Df values.
"I hear it when I see it."