SoundExpert explained, Methodology issues
Serge Smirnoff
post Nov 24 2010, 13:27
Post #1





Group: Members
Posts: 371
Joined: 14-December 01
Member No.: 641



I found this thread among SoundExpert referrals and was a bit surprised by the almost complete misunderstanding of the SE testing methodology, particularly of how the diff signal is used in SE audio quality metrics. The discussion on the topic from 2006 actually seems more meaningful. So I decided to post some SE basics here for reference. I will use a thought experiment that is nonetheless close to reality.

Suppose we have two sound signals, a main one and a side one. They could be, for example, a short piano passage and some noise. We can prepare several mixes of them in different proportions:
  • equal levels of main and side signals (0dB RMS)
  • half level of side signal (-6dB RMS)
  • quarter level of side signal (-12dB RMS)
  • 1/8 level of side signal (-18dB RMS)
  • 1/16 level of side signal (-24dB RMS)
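
The dB figures in the list above follow from the standard amplitude-to-decibel conversion (each halving of amplitude is roughly 6 dB). A minimal Python sketch of preparing such mixes, assuming signals are plain lists of float samples; the function names, the sine-tone stand-ins and the normalization target are illustrative assumptions, not part of SE's tooling:

```python
import math

def db_to_gain(level_db):
    """Convert a relative level in dB to a linear amplitude factor."""
    return 10.0 ** (level_db / 20.0)

def rms(samples):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def mix_at_level(main, side, side_level_db):
    """Add `side` to `main` with the side RMS set `side_level_db` relative to the main RMS."""
    gain = db_to_gain(side_level_db) * rms(main) / rms(side)
    return [m + gain * s for m, s in zip(main, side)]

def normalize(samples, target_rms=0.1):
    """Scale samples so that their RMS equals target_rms (hypothetical target)."""
    scale = target_rms / rms(samples)
    return [s * scale for s in samples]

# Stand-in signals: two sine tones instead of a piano passage and noise.
main = [math.sin(2 * math.pi * 440 * n / 44100) for n in range(4410)]
side = [0.3 * math.sin(2 * math.pi * 1000 * n / 44100) for n in range(4410)]

# One normalized mix per side-signal level from the list above.
mixes = {db: normalize(mix_at_level(main, side, db)) for db in (0, -6, -12, -18, -24)}
```

Scaling the side signal relative to the main signal's RMS first, and normalizing the finished mix afterwards, keeps the two steps (proportion, then overall level) separate.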

After normalization all mixes have equal levels, and we can evaluate the perceptibility of the side signal in each mix. Here at SE we found that this perceptibility is a monotonic function of the side signal level and looks like this:

Figure: Side signal perception

(1) In other words, there is a relationship between the objectively measured level of the side signal and its subjectively estimated perceptibility in the mix. What is more:
(a) this relationship is well described by a second-order curve (assuming levels are in dB)
(b) the relationship holds for any sound signals, whether they are correlated or not; the only differences are the position and curvature of the curve.

(2) These side stimulus perceptibility curves are the core of the SE rating mechanism. Each device under test has its own curve, plotted on the basis of SE online listening tests.
(3) Side signals are the difference signals of the devices being tested. Levels of side signals are expressed in dB of the Difference level parameter, which is exactly equal to the RMS level of the side signal in our case.
(4) Subjective grades of perceptibility are anchor points of the 5-grade impairment scale.
(5) The audio metric beyond the threshold of audibility is determined by extrapolation of these second-order curves. Virtual grades in the extrapolated area can be considered objective quality parameters that take the peculiarities of human hearing into account.
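
Points (1) to (5) can be illustrated with a small Python sketch: fit a second-order curve to (level in dB, grade) pairs and evaluate it beyond the tested range. The anchor points below are invented for illustration and are not SE data; the least-squares solver and the names `fit_quadratic` and `grade_at` are assumptions of this sketch, not SE's implementation.

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y = a*x^2 + b*x + c; returns (a, b, c)."""
    # Normal equations (A^T A) coeffs = A^T y, with rows of A = [x^2, x, 1].
    rows = [[x * x, x, 1.0] for x in xs]
    m = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    v = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    # Gaussian elimination with partial pivoting on the 3x3 system.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        v[col], v[pivot] = v[pivot], v[col]
        for r in range(col + 1, 3):
            f = m[r][col] / m[col][col]
            for c in range(col, 3):
                m[r][c] -= f * m[col][c]
            v[r] -= f * v[col]
    # Back substitution.
    coeffs = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        acc = v[i] - sum(m[i][j] * coeffs[j] for j in range(i + 1, 3))
        coeffs[i] = acc / m[i][i]
    return tuple(coeffs)

def grade_at(coeffs, level_db):
    """Evaluate the fitted curve at a given difference level in dB."""
    a, b, c = coeffs
    return a * level_db ** 2 + b * level_db + c

# Hypothetical anchor points: (difference level in dB, mean grade on the 1..5 scale).
levels = [-10.0, -15.0, -20.0, -25.0]
grades = [1.2, 2.35, 3.4, 4.35]

curve = fit_quadratic(levels, grades)
# Evaluating below the tested levels gives a "virtual grade", which may exceed 5.
virtual_grade = grade_at(curve, -40.0)
```

With these made-up points the extrapolated value at -40 dB lands above grade 5, which is what the extrapolated part of the curve represents; how far such a value can be trusted is exactly the open question.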

So, yes, the difference signal is used in SE testing. We take into account both its level and how the human auditory system perceives it together with the reference signal. Some difference signals with fairly high levels still remain almost imperceptible against the background of the reference signal, and vice versa; perceptibility curves reflect this.

This is the concept. Many parts of it still need thorough verification in carefully designed listening tests, which are beyond SE's means. All we can do is analyze the grades returned by SE visitors. This will certainly be done, yet it cannot replace properly organized listening tests.

The SE testing methodology is new and questionable, but all the assumptions look reasonable and the SE ratings promising, at least to me. Time will tell.
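
As a rough sketch of the level part only, the difference signal and its level could be computed as follows in Python. This assumes time-aligned, equal-length signals and expresses the level relative to the reference RMS; that convention and the function names are assumptions of the sketch, and the perceptual part is not modelled at all.

```python
import math

def rms(samples):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def difference_signal(reference, processed):
    """Sample-by-sample difference; assumes time-aligned, equal-length inputs."""
    return [p - r for r, p in zip(reference, processed)]

def difference_level_db(reference, processed):
    """RMS of the difference relative to the RMS of the reference, in dB (assumed convention)."""
    diff = difference_signal(reference, processed)
    return 20.0 * math.log10(rms(diff) / rms(reference))

reference = [math.sin(2 * math.pi * 440 * n / 44100) for n in range(4410)]
# Toy "device": adds a weak 3 kHz tone at 1/100 of the reference amplitude.
processed = [r + 0.01 * math.sin(2 * math.pi * 3000 * n / 44100)
             for n, r in enumerate(reference)]

level = difference_level_db(reference, processed)
```

For this toy device the level comes out at about -40 dB. The number alone says nothing about audibility, which is why the perceptibility curve is needed on top of it.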


--------------------
keeping audio clear together - soundexpert.org
drewfx
post Nov 24 2010, 18:20
Post #2





Group: Members
Posts: 93
Joined: 17-October 09
Member No.: 74078



What is the justification for the "dashed" portion of the curve?

Shouldn't it be a flat line once you reach "imperceptible"? If not, once something is imperceptible, how can it become "more imperceptible"?
Porcus
post Nov 27 2010, 15:49
Post #3





Group: Members
Posts: 1912
Joined: 30-November 06
Member No.: 38207



QUOTE (drewfx @ Nov 24 2010, 18:20) *
What is the justification for the "dashed" portion of the curve?

Shouldn't it be a flat line once you reach "imperceptible"? If not, once something is imperceptible, how can it become "more imperceptible"?



Matter of definition, interpretation and use.

1) Consider three chess games which are all "theoretically lost". One is a simple mate in one; the second is so hard that if you put 1000 chess players at the task, you won't be able to distinguish it from the starting position by statistical analysis of the outcome. And the third is so hard that it won't be solved in fifty years. To make the logic clear-cut, assume that the second is like the third, except with 70 intermediary "only moves" (which do not constitute any learning curve for the subsequent ones).

Now, everything else being equal, you will still have a clear strict preference, because you could risk meeting one of the very few chess players who can actually win this. You might not know that it is "humanly winnable", but you will absolutely want to insure against the uncertainty if it is free.

Now consider a step-by-step sequence of chess positions, starting from the "third" one above. We index them by "# of very hard moves until the win is clear, as measured by statistics within confidence level [say, p]". How do you define the human-winnability threshold?


2) Consider a 32-bit sound file, then a 31-bit (LSB-truncated) file, etc. Rank these. You may claim that all files above a "hearing threshold" of slightly below T bits are equivalent. However, what if it is an unfinished product? Are you sure that the final mix is going to have the same hearing threshold? If not, then the high-resolution file could very well be more robust -- there might be manipulations which would enable you to hear a difference between the final and its T-bit version, although not between the original and its T-bit version. Most 16-bit CDs are mixed at higher word length, right?
Solution? A "robustness-to-manipulations" measure?
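
The word-length sequence in 2) can be sketched in Python as plain LSB truncation of signed integer samples. The helper names and the sample values are hypothetical, and real converters would normally dither rather than truncate:

```python
def truncate_to_bits(sample, bits, full_bits=32):
    """Zero the lowest (full_bits - bits) bits of a signed integer sample."""
    if not 1 <= bits <= full_bits:
        raise ValueError("bits must be between 1 and full_bits")
    drop = full_bits - bits
    # Shifting right then left clears the low bits (rounds toward minus infinity).
    return (sample >> drop) << drop

def max_truncation_error(bits, full_bits=32):
    """Largest possible absolute error from truncation, in units of the full_bits LSB."""
    return (1 << (full_bits - bits)) - 1

# Arbitrary example samples within the signed 32-bit range.
samples = [123456789, -123456789, 2**31 - 1, -2**31]

# The same samples at decreasing word lengths.
by_depth = {bits: [truncate_to_bits(s, bits) for s in samples]
            for bits in (32, 31, 24, 16)}
```

Each dropped bit doubles the worst-case error, so the ranking of the files by word length is strict even where the difference is far below anything audible.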


Of course:
- if no such issues apply, then assigning zero value to superfluous information is at least as good a measure as anything else
- if anyone makes a selling claim, then they have the burden of proof. Then "inaudible difference" is the null hypothesis. You would grab the extra measured quality if it were free, as an insurance against audibility, but you would frown upon someone trying to sell you an insurance against a disaster that no one has ever substantiated as having happened or being able to happen. (... well ...: http://en.wikipedia.org/wiki/Alien_abduction_insurance )
- even if we assume that there is some worth to this not-justified-as-generally-audible quality, it is hard to quantify. Justifying that it exists (by measurement) does not mean we can justify a reasonably narrow confidence interval for a particular point on the graph.


--------------------
One day in the Year of the Fox came a time remembered well
drewfx
post Nov 29 2010, 18:43
Post #4





Group: Members
Posts: 93
Joined: 17-October 09
Member No.: 74078



QUOTE (Porcus @ Nov 27 2010, 09:49) *
QUOTE (drewfx @ Nov 24 2010, 18:20) *
What is the justification for the "dashed" portion of the curve?

Shouldn't it be a flat line once you reach "imperceptible"? If not, once something is imperceptible, how can it become "more imperceptible"?



Matter of definition, interpretation and use.

1) Consider three chess games which are all "theoretically lost". One is a simple mate in one; the second is so hard that if you put 1000 chess players at the task, you won't be able to distinguish it from the starting position by statistical analysis of the outcome. And the third is so hard that it won't be solved in fifty years. To make the logic clear-cut, assume that the second is like the third, except with 70 intermediary "only moves" (which do not constitute any learning curve for the subsequent ones).

Now, everything else being equal, you will still have a clear strict preference, because you could risk meeting one of the very few chess players who can actually win this. You might not know that it is "humanly winnable", but you will absolutely want to insure against the uncertainty if it is free.

Now consider a step-by-step sequence of chess positions, starting from the "third" one above. We index them by "# of very hard moves until the win is clear, as measured by statistics within confidence level [say, p]". How do you define the human-winnability threshold?


2) Consider a 32-bit sound file, then a 31-bit (LSB-truncated) file, etc. Rank these. You may claim that all files above a "hearing threshold" of slightly below T bits are equivalent. However, what if it is an unfinished product? Are you sure that the final mix is going to have the same hearing threshold? If not, then the high-resolution file could very well be more robust -- there might be manipulations which would enable you to hear a difference between the final and its T-bit version, although not between the original and its T-bit version. Most 16-bit CDs are mixed at higher word length, right?
Solution? A "robustness-to-manipulations" measure?


I would certainly agree it is fair to allow for a reasonable margin of error near the threshold of perception.

QUOTE
- if anyone makes a selling claim, then they have the burden of proof. Then "inaudible difference" is the null hypothesis.


And this really was my concern - if you have a "quality factor" metric that seems to imply one product is "better" than another based on the extrapolated portion of the curve, it is ripe for someone to misuse. For this reason, I think the information on the threshold of perception needs to be preserved.
greynol
post Nov 29 2010, 19:18
Post #5





Group: Super Moderator
Posts: 10040
Joined: 1-April 04
From: San Francisco
Member No.: 13167



QUOTE (drewfx @ Nov 29 2010, 09:43) *
And this really was my concern - if you have a "quality factor" metric that seems to imply one product is "better" than another based on the extrapolated portion of the curve, it is ripe for someone to misuse. For this reason, I think the information on the threshold of perception needs to be preserved.

Precisely (and on this forum, SE results do get misused)!

There has been a lot of talk about psychometrics, but little to none about psychoacoustics. When it comes to perceptual coding it is the latter that is king.

Someone, anyone, provide some data showing a direct correlation between across-the-board "artifacts" amplification and the real-world application of lossy audio compression. I've seen claims that SE results are good for those interested in applications such as surround-sound processing, transcoding and equalization. Evidence, please!!!

NB: the word artifacts was put in silly quotes for a reason. We already had the discussion about what constitutes artifacts and the role masking plays. I am not denying that they can become unmasked through typical real-world usage, but I am denying that across-the-board amplification of a difference signal that is subsequently added back in constitutes real-world usage.
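
For reference, the operation being objected to can be written down in a few lines of Python. This is a sketch of the operation as described in this thread (one uniform gain applied to the whole difference signal, which is then added back to the reference), not SE's actual code; the function name is hypothetical.

```python
def amplify_difference(reference, processed, gain_db):
    """Return reference + g * (processed - reference), with g = 10^(gain_db / 20)."""
    g = 10.0 ** (gain_db / 20.0)
    # The same gain is applied to every sample of the difference, regardless of
    # frequency or time, which is what "across the board" means here.
    return [r + g * (p - r) for r, p in zip(reference, processed)]

# Tiny hand-made example: the difference is [0.01, -0.01, 0.0].
reference = [0.0, 0.5, -0.5]
processed = [0.01, 0.49, -0.5]
louder = amplify_difference(reference, processed, 20.0)  # difference scaled by 10
```

At 0 dB the function returns the processed signal itself (up to rounding); any positive gain raises the difference uniformly against an unchanged reference.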

AFAICT none of the criticisms put forth by people like Garf, Sebastian, Woodinville and Saratoga have been sufficiently addressed since they've been raised. It seems we've made no progress over the last four years.

This post has been edited by greynol: Nov 29 2010, 19:31


--------------------
Your eyes cannot hear.
Serge Smirnoff
post Nov 29 2010, 20:21
Post #6





Group: Members
Posts: 371
Joined: 14-December 01
Member No.: 641



QUOTE (greynol @ Nov 29 2010, 22:18) *
Someone, anyone, provide some data showing a direct correlation between across-the-board "artifacts" amplification and the real-world application of lossy audio compression. I've seen claims that SE results are good for those interested in applications such as surround-sound processing, transcoding and equalization. Evidence, please!!!

I can well agree that a parameter such as quality margin might not be very useful in the practical use of lossy codecs. The metric was developed for assessing a wider class of low impairments. Eventually it could become a substitute for (a further development of) the current audio metrics based on THD, SNR, IMD ... parameters. As opposed to the current metrics, the new one has to be sensitive to the psychoacoustic features of human hearing. That is why lossy coders are perfect for test-driving the metric. They also produce time-accurate output, so the diff signal is easy to extract.

So I prefer to separate the questions:
  1. Do we need an audio metric capable of assessing the quality margin of various devices and DSPs?
  2. Do we need such a metric for lossy encoders?
  3. Is it possible to develop such a metric in principle?


This post has been edited by Serge Smirnoff: Nov 29 2010, 21:13


--------------------
keeping audio clear together - soundexpert.org