44.1 vs 88.2 ABX report at AES
Arnold B. Kruege...
post Jul 29 2010, 13:04
Post #76





Group: Members
Posts: 3930
Joined: 29-October 08
From: USA, 48236
Member No.: 61311



QUOTE (AndyH-ha @ Jul 28 2010, 17:42) *
QUOTE (Arnold B. Krueger @ Jul 28 2010, 04:52) *
Modern ADCs do have analog anti-aliasing filters, but they are relatively simple and operate at ultrasonic frequencies. The brick wall that is right up against the audio band is digital and therefore the overall performance can be very similar to what you get if you record at a higher sample rate and downsample in the digital domain. Note that there can be considerable technical variation in the details of how the digital filtering is implemented, whether in the ADC or applied later on.


The part about the final filtering being digital seems right, as far as I understand from reading. But based on my experiments, and those of a few others, the result of recording at 44.1 is never like that from recording at 88.2 or 96 and downsampling with good software, as I pointed out earlier in this thread and in at least two others on HA (based on results using test tones, the only way to actually observe the final product). Do you have evidence that some soundcards really do better?


Virtually every audio interface is different from all the rest at some level of detail. So the question is not whether they are different in any way, but rather whether the differences are significant.

These days most audio products are sold without complete or even representative specifications, and technical tests are rare compared to the size of the marketplace.

One of the areas in which I have observed possibly significant differences among audio interfaces is high-frequency nonlinear distortion.

In general, audio interfaces aren't significantly nonlinear above their normal passband.

It has long been observed that in audio, "The wider you open the windows, the more dirt blows in".
2Bdecided
post Jul 29 2010, 22:11
Post #77


ReplayGain developer


Group: Developer
Posts: 5171
Joined: 5-November 01
From: Yorkshire, UK
Member No.: 409



QUOTE (AndyH-ha @ Jul 28 2010, 22:42) *
The part about the final filtering being digital seems right, as far as I understand from reading. But based on my experiments, and those of a few others, the result of recording at 44.1 is never like that from recording at 88.2 or 96 and downsampling with good software, as I pointed out earlier in this thread and in at least two others on HA (based on results using test tones, the only way to actually observe the final product). Do you have evidence that some soundcards really do better?
I agree Andy - I've never seen an A>D (or D>A) that comes close to achieving the kind of truly brick wall filtering that you get in Cool Edit Pro's resampling.

Whether this matters at all is another question, but it's an easily measurable difference.

I take Arny's point that they're all different (+ many are programmable), but none seem to make the effort to include a several thousand tap FIR filter.

Cheers,
David.

SebastianG
post Jul 30 2010, 09:55
Post #78





Group: Developer
Posts: 1318
Joined: 20-March 04
From: Göttingen (DE)
Member No.: 12875



QUOTE (2Bdecided @ Jul 29 2010, 22:11) *
but none seem to make the effort to include a several thousand tap FIR filter.

Right. And I don't see a real need for those in A/D or D/A. As a rule of thumb, for a filter with a transition band width of W (kilo)Hertz you need an impulse response of length L (milli)seconds with L*W = 12 (for about 100 dB stopband rejection). So, with fs=44 kHz and a transition band width of 2 kHz this comes down to 6 milliseconds. Of course, if you can tolerate a bit of aliasing within 20-22 kHz or some imaging during interpolation within 22-24 kHz you can halve that to 3 milliseconds. That's rather short in comparison to what Cool Edit is doing by default.
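SebastianG's rule of thumb is easy to turn into a couple of lines — a minimal sketch, with helper names of my own invention:

```python
# Rule of thumb from the post: for roughly 100 dB stopband rejection,
# impulse-response length L [ms] times transition-band width W [kHz] is
# about 12. The helper names are hypothetical, not from any real library.
def fir_length_ms(transition_khz, lw_product=12.0):
    """Required impulse-response length in milliseconds."""
    return lw_product / transition_khz

def fir_length_taps(transition_khz, fs_khz, lw_product=12.0):
    """The same length expressed as FIR taps at sample rate fs."""
    return round(fir_length_ms(transition_khz, lw_product) * fs_khz)
```

With a 2 kHz transition band this gives 6 ms (about 265 taps at 44.1 kHz), and tolerating a bit of aliasing doubles W to 4 kHz and halves the length to 3 ms, as in the post.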

Cheers!
SG
lvqcl
post Jul 30 2010, 11:42
Post #79





Group: Developer
Posts: 3411
Joined: 2-December 07
Member No.: 49183



Audition 1.5: 44.1 -> 96 kHz resampling, Quality = 999, Pre/Post Filter = off.

Impulse response length = 542 samples = 5.6 ms.
Arnold B. Kruege...
post Jul 30 2010, 12:26
Post #80





Group: Members
Posts: 3930
Joined: 29-October 08
From: USA, 48236
Member No.: 61311



QUOTE (2Bdecided @ Jul 29 2010, 17:11) *
I take Arny's point that they're all different (+ many are programmable), but none seem to make the effort to include a several thousand tap FIR filter.


That's because DACs are usually designed by engineers with some notion of costs and benefits. Some of the best converter chips seem to sell for under $7 (single unit!) and that's still too low to allow including a 2 GHz processor on the same chip.

I can still remember when a first rate converter chip cost more than $20!


WernerO
post Aug 2 2010, 08:19
Post #81





Group: Members
Posts: 74
Joined: 21-November 06
Member No.: 37858



QUOTE (Arnold B. Krueger @ Jul 20 2010, 14:13) *
It is my understanding that downsampling uses brick wall filtering, but that upsampling either uses no brick wall filtering at all, or uses brick wall filtering at the Nyquist frequency of the higher sample rate. It is very hard to avoid brick wall filtering in digital, so that's rarely the goal. As I understand it, the major goal of higher sample rates is raising the frequency of any brick wall filters.

I think that the frequency of the corner frequency of the brick walls is highly significant. I don't think that anybody disagrees with the idea that in general, the higher the better. The only questions I'm aware of are how high, and what phase response is required for sonic transparency.


Your understanding is wrong. In fact so fundamentally wrong that I urge you to re-assess your complete knowledge of signal theory and of the sampling theorem.

Up- and oversampling are both terms for the same mathematical process, and if that process is executed with the intent to obey the sampling theorem (i.e. as opposed to doing funny stuff for the sake of it), then it will include a brickwall filter at half the original sampling rate. It simply has to, as this constitutes the bulk of that signal's reconstruction.

When oversampling, the goal is not to raise the cutoff frequency of the brickwall (reconstruction) filter. That cutoff has to remain at half the original rate, as otherwise the first images are allowed to creep out. The goal, at least for a DAC, is to implement most of the reconstructor in the digital domain (i.e. cheap, steep, linear, and linear-phase), with only the remainder in the analogue domain (there to suppress the images of the oversampled signal), indeed potentially at a higher cut-off frequency and with a shallow slope (i.e. cheap and simple).


It is similar in ADCs, where the modulator runs at several MHz and hard aliasing can be avoided with a simple analogue filter cutting in at a couple of hundred kHz (which does not mean that designing analogue front-ends for today's delta-sigma ADCs is simple). After the conversion in the low-bit modulator the signal is noise-shaped and decimated, which (ignoring the noise shaping) consists of brickwall anti-alias filtering at the target Nyquist frequency.

In this sense running a delta-sigma ADC at 44.1 kHz is the same as running it at 88.2 kHz followed by off-line downsampling. The only difference is in the implementation of the two (actually three!) anti-alias filters involved, where the off-line solution can be of arbitrarily high quality, while the hardware solution often is not that good at all. Indeed, most commercial ADC chips use half-band AA filters and thus allow some aliasing to happen in the, say, 20-22 kHz band.
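The decimation step described above can be sketched in a few lines of Python — purely illustrative, with a plain Blackman-windowed sinc standing in for the much better (polyphase, half-band) filters a real chip or resampler would use:

```python
import math

def lowpass_taps(n, cutoff):
    """Windowed-sinc lowpass prototype; cutoff is a fraction of the input
    sample rate. Blackman window; use odd n for a symmetric centre tap."""
    m = n - 1
    taps = []
    for i in range(n):
        x = i - m / 2
        core = 2 * cutoff if x == 0 else math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        window = 0.42 - 0.5 * math.cos(2 * math.pi * i / m) + 0.08 * math.cos(4 * math.pi * i / m)
        taps.append(core * window)
    total = sum(taps)
    return [t / total for t in taps]  # normalize for unity gain at DC

def decimate2(signal, taps):
    """The 'brickwall at the target Nyquist, then decimate' step for a 2:1
    conversion: anti-alias filter (taps designed at a quarter of the input
    rate), then keep every second filtered sample."""
    half = len(taps) // 2
    out = []
    for k in range(0, len(signal), 2):
        acc = 0.0
        for j, t in enumerate(taps):
            i = k + half - j  # centre the filter so there is no net delay
            if 0 <= i < len(signal):
                acc += t * signal[i]
        out.append(acc)
    return out
```

A tone below the target Nyquist comes through essentially untouched, while one above it lands tens of dB down. A half-band design, by contrast, deliberately puts its -6 dB point right at the target Nyquist — which is exactly where the 20-22 kHz aliasing mentioned above comes from.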




Arnold B. Kruege...
post Aug 2 2010, 11:56
Post #82





Group: Members
Posts: 3930
Joined: 29-October 08
From: USA, 48236
Member No.: 61311



QUOTE (WernerO @ Aug 2 2010, 03:19) *
QUOTE (Arnold B. Krueger @ Jul 20 2010, 14:13) *
It is my understanding that downsampling uses brick wall filtering, but that upsampling either uses no brick wall filtering at all, or uses brick wall filtering at the Nyquist frequency of the higher sample rate. It is very hard to avoid brick wall filtering in digital, so that's rarely the goal. As I understand it, the major goal of higher sample rates is raising the frequency of any brick wall filters.

I think that the frequency of the corner frequency of the brick walls is highly significant. I don't think that anybody disagrees with the idea that in general, the higher the better. The only questions I'm aware of are how high, and what phase response is required for sonic transparency.


Your understanding is wrong. In fact so fundamentally wrong that I urge you to re-assess your complete knowledge of signal theory and of the sampling theorem.


Right, and this was all corrected a week ago. Read on...
C.R.Helmrich
post Aug 9 2010, 21:41
Post #83





Group: Developer
Posts: 690
Joined: 6-December 08
From: Erlangen Germany
Member No.: 64012



I finally found time to read the entire paper. It's quite well written in my opinion, but there are three points I'd like to add to the discussion by krabapple and hciman77.

  1. Different clocks were used for the 88.2 and 44.1 kHz recordings (RME ADC internal clock for 44.1, external Mutec master clock for 88.2). So the recordings were actually not done with the "exact same audio gear and settings", as claimed. I wonder if, since 88.2 = 44.1 x 2, it would be possible to construct a master clock which can serve both sampling rates simultaneously, i.e. a clock with two outputs, one of which provides only every second clock pulse. Maybe Arnold, Werner, or some other knowledgeable person can comment on that.
  2. The paper doesn't specify how precise the delay alignment was for the ABX of the excerpts with different sampling rates, just this: "We made sure that the selected files at 44.1 kHz and 88.2 kHz had the exact same fades (in and out) and length." I guess it doesn't matter, though, since switching sampling rate upon playback supposedly introduced a significant pause.
  3. The procedure of separating the 3 listeners who "significantly selected the wrong answer" from the rest sounds highly questionable to me. I think that, in order not to introduce bias, you'd also have to exclude the 3 listeners (or in general, the same right-hand percentile) who significantly selected the right answer. Or even better: don't separate the 16 listeners at all! Luckily, this was actually done by the authors. Their report: "When collapsing over all 16 participants, the results of the comparison between Orchestra files recorded at 88.2 and 44.1 kHz is still significant, p = .01." Concluding from the figures in the paper, I can add with confidence: when collapsing over all 16, the other two significant results (Guitar and Voice excerpts) disappear.
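To put numbers on point 3, here is a quick Monte Carlo sketch. The 16 listeners match the paper, but the 16-trials-per-listener figure and the p < 0.05 criterion are placeholders of mine:

```python
import math, random

def binom_p_one_sided(correct, trials):
    """P(X >= correct) for X ~ Binomial(trials, 0.5): the chance that pure
    guessing does at least this well."""
    return sum(math.comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

def worst_case_listener_rate(n_listeners=16, n_trials=16, runs=2000, seed=42):
    """Fraction of simulated experiments -- everyone guessing at random --
    in which at least one listener looks individually 'significant'."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(runs):
        for _ in range(n_listeners):
            correct = sum(rng.random() < 0.5 for _ in range(n_trials))
            if binom_p_one_sided(correct, n_trials) < 0.05:
                hits += 1
                break
    return hits / runs
```

With everyone guessing, roughly every second simulated experiment contains at least one listener whose individual score clears p < 0.05 — so singling out extreme listeners after the fact, in either tail, is bound to find something.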

Which leaves us with: a difference between different recording sampling rates (and different clocks) was heard on the Orchestra item with a significant level of confidence. Which is interesting. The authors themselves speculate that this might be due to more detailed reproduction of transients in case of high-resolution sampling rates, which seems to be in line with our own 20-kHz brick wall test and which leads me to the question:

Amandine, would you mind sharing the low- and high-resolution Orchestra excerpt?

Comments welcome smile.gif

Chris


--------------------
If I don't reply to your reply, it means I agree with you.
2Bdecided
post Aug 9 2010, 22:44
Post #84


ReplayGain developer


Group: Developer
Posts: 5171
Joined: 5-November 01
From: Yorkshire, UK
Member No.: 409



I think before you pull out one positive result and say it's interesting, you must go back to this post...
http://www.hydrogenaudio.org/forums/index....st&p=714650
...which shows how amazing it would be to have no positive results, considering how many different ways the results are combined.

I'm really not an expert, but it seems that the statistical analysis isn't sufficient.

Another way to make the paper more satisfactory would be to re-do the apparently "good" combinations in isolation, using a pre-defined number of trials. No picking and choosing in this "second round".
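The multiple-comparisons worry can be put in one line — a sketch assuming the comparisons are independent (they aren't quite, and the count of 15 is purely for illustration, not taken from the paper):

```python
def familywise_rate(alpha, n_comparisons):
    """Chance of at least one false positive among n independent
    comparisons, each tested at significance level alpha."""
    return 1 - (1 - alpha) ** n_comparisons
```

At alpha = 0.05, fifteen comparisons already give better-than-even odds (about 0.54) of at least one spurious "significant" result.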

Cheers,
David.

This post has been edited by 2Bdecided: Aug 9 2010, 22:47
Pio2001
post Aug 9 2010, 23:33
Post #85


Moderator


Group: Super Moderator
Posts: 3936
Joined: 29-September 01
Member No.: 73



I finally got the article. Actually, they say something about considering the "false alarm rate" (false positives) when picking positive results.

They refer to Boley and Lester, "Statistical Analysis of ABX Results Using Signal Detection Theory" (AES 127th Convention), and to Macmillan and Creelman, "Detection Theory: A User's Guide", Cambridge University Press, 1991, without giving further details.

I don't have these references.

C.R.Helmrich
post Aug 10 2010, 01:37
Post #86





Group: Developer
Posts: 690
Joined: 6-December 08
From: Erlangen Germany
Member No.: 64012



QUOTE (2Bdecided @ Aug 9 2010, 23:44) *
I think before you pull out one positive result and say it's interesting, you must go back to this post...
http://www.hydrogenaudio.org/forums/index....st&p=714650
...which shows how amazing it would be to have no positive results, considering how many different ways the results are combined.

I saw that but didn't quite get the message. Does it mean that the supposedly significant results (p = .01 etc.) are in fact not significant, i.e. the calculation of the level of confidence is wrong in the paper? At least the separation by musical excerpt sounds reasonable to me.

QUOTE
Another way to make the paper more satisfactory would be to re-do the apparently "good" combinations in isolation, using a pre-defined number of trials. No picking and choosing in this "second round".

That's essentially what I'm aiming at by asking Amandine for the items. Then we can try to repeat the experiment here. But it's worth adding that for the paper, the number of trials was also fixed in advance (each possible combination of sampling rate configurations was listened to four times for each musical excerpt).

Chris


--------------------
If I don't reply to your reply, it means I agree with you.
Pio2001
post Aug 10 2010, 12:33
Post #87


Moderator


Group: Super Moderator
Posts: 3936
Joined: 29-September 01
Member No.: 73



The unknown thing is the origin of the p values. If they are associated with the ABX tests, like the ones given in ABX software, then nothing is significant, because p = 0.05 means, by definition, that such a success occurs by chance one time out of 20 on average, p = 0.01 one time out of 100, etc.

If, on the other hand, some kind of "signal detection" algorithm was applied to all comparisons, including the subsampling of the listeners, in accordance with references [1] and [4] in the paper, and if the p values result from this signal detection technique, that might work.
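For reference, the p value an ABX tool reports is just a binomial tail — a minimal sketch:

```python
import math

def abx_p_value(correct, trials):
    """Probability that coin-flip guessing (p = 0.5 per trial) scores at
    least `correct` out of `trials` -- the p value ABX software shows."""
    return sum(math.comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials
```

So 15 right out of 20 gives p of about 0.021 — significant on its own, but, as discussed above, not once it has been selected from many such comparisons.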
WernerO
post Aug 11 2010, 13:51
Post #88





Group: Members
Posts: 74
Joined: 21-November 06
Member No.: 37858



QUOTE (krabapple @ Jul 16 2010, 23:23) *
The 88.2 excerpts were also then downsampled to 44.1 via Pyramix software,


I don't know which version of Pyramix they used, but somehow 6.2.3 leaves me less than impressed:

[attached images: SRC measurement plots, not preserved]

This is a missed opportunity, as a software SRC offers the chance to improve numerically on the performance of the 44.1kHz anti-alias filter of the average ADC chip.
lrossouw
post Sep 8 2010, 09:39
Post #89





Group: Members
Posts: 8
Joined: 23-August 09
From: Singapore
Member No.: 72561



Did they test for difference (check whether you can tell X=A or X=B, i.e. ABX) or for preference (prefer A or B) in the 44 vs 88 testing? I don't have the paper for this, but the slides of the mp3 vs CD test show they tested for preference: http://www.music.mcgill.ca/~hockman/docume...ntation2009.pdf

If they repeated that method here, it could explain how people got it consistently wrong and also why a two-tailed value needs to be used. The "wrong" answer is to prefer the lower-quality version, and consistently choosing the lower-quality version means there is a difference but you prefer the lower-quality version. I.e. if you consistently pick one, you can tell a difference, but the difference may not be good or bad.

However, if they want to look at the statistics of sub-groups, the sub-groups need to be chosen on prior information. They shouldn't choose sub-groups based on the results of the tests. So they shouldn't look at the stats of the 3 listeners whose results were different and then try to choose a tail value for them, as that introduces bias.

This post has been edited by lrossouw: Sep 8 2010, 09:45
mzil
post Jul 22 2012, 02:47
Post #90





Group: Members
Posts: 624
Joined: 5-August 07
Member No.: 45913



QUOTE (krabapple @ Jul 16 2010, 17:23) *
2) I bought the paper. Here's a paraphrase of the methods and results. Note that the test signals were recorded by the authors...

equipment: the recording microphones (a pair of Sennheiser MKH 8020) had a FR of 10Hz-60kHz. Two stereo feeds from the mic preamp (Millennia HV-3D) to two Micstasy ADCs, one set to 44.1/24 the other to 88.2/24; then the 44.1/24 digital signal was recorded (at 44.1) on a Sound Devices 744T portable recorder, while the 88.2 output was recorded on a MacBook Pro at 88.2 using Logic Studio software. The recording diagram also shows that the 44.1 ADC used its internal clock, while the 88.2 ADC's master clock was a Mutec.


test signals: five musical/instrumental (orchestra, classical guitar, cymbals , voice, violin) recordings by the authors, from live performances ...
[bold text added by me]

Krabapple, do you still have the paper? [And are you still subscribed to this old thread and seeing this question, I wonder?] Do the authors make any mention of level matching (using instrumentation) for the stage I have indicated above in bold text?

They rather oddly decided to use live music as their test source, and not a high resolution recording as Arnold Krueger correctly mentions they "should have" (that is then manipulated to create the different competing signals), but what assurance do we have that the input stages of the two Micstasy ADCs (having variable levels of gain with both manual and "auto" modes, as I understand it), successfully recorded the two analog signals at exactly the same, precise level in the digital domain? If one was a fraction of a dB different than the other, that could be the difference listeners heard, right there! Furthermore, even if both machine's inputs were set to the exact same attenuation values, do we know for a fact that simply selecting a different sampling rate won't in itself alter the actual level of the digital signal, by a small amount?

I often see people in the subjective evaluation world naively assume that there's no need to introduce level matching when comparing, say, the output of two CD players, because "the spec sheets say they both have a fixed, 2.0V output", but in truth they often do vary (slightly) when measured using instrumentation, and that small level change could easily be the difference they are hearing (but often mistakenly attribute to quality, not simply level). I wonder if maybe this study suffered from the same sort of flaw [an assumption that levels are matched by default, when they aren't]?

edit to add: Of course the playback chain would also need to be tested for exactly matched output levels. Just like CD players vary in output, despite almost all claiming "2.0V", DACs also may vary slightly depending on the sampling rate they receive.
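The level-matching point is easy to quantify — a one-liner (hypothetical function name) for how far apart two "identical" outputs really are:

```python
import math

def level_mismatch_db(v_a, v_b):
    """Level difference in dB between two nominally identical outputs."""
    return 20 * math.log10(v_a / v_b)
```

Two players measuring 2.00 V and 1.95 V — both plausibly within a "2.0 V" spec — differ by about 0.22 dB, and level offsets of a few tenths of a dB are enough to bias a comparison if left uncorrected.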

This post has been edited by mzil: Jul 22 2012, 03:27
C.R.Helmrich
post Mar 8 2014, 01:21
Post #91





Group: Developer
Posts: 690
Joined: 6-December 08
From: Erlangen Germany
Member No.: 64012



Sorry for bumping this thread, but I thought the following might fit in nicely:

At the next AES Convention in Berlin, some researchers from Tokyo will present a paper on the results of a "double-blind A/B comparison listening test" which supposedly revealed statistically significant differences between PCM (192 kHz, 24-bit) and DSD (2.8 and 5.6 MHz).

http://www.aes.org/events/136/papers/?displayall

P1-2 Subjective Evaluation of High Resolution Recordings in PCM and DSD Audio Formats—Atsushi Marui, Tokyo University of the Arts - Adachi-ku, Tokyo, Japan; Toru Kamekawa, Tokyo University of the Arts - Adachi-ku, Tokyo, Japan; Kazuhiko Endo, TEAC Corporation - Tokyo, Japan; Erisa Sato, TEAC Corporation - Tokyo, Japan

Abstract:

High-resolution audio production and consumption are increasing attraction supported by releases of the relatively affordable audio recorders from multiple manufacturers and broader bandwidth of the Internet. However, differences in audio quality between high-resolution audio formats are still not well known, especially between the different audio formats available for the audio recorders. In order to evaluate the differences between subjective impression of the sounds recorded using high resolution audio formats, three audio formats—PCM (192 kHz/24 bits), DSD (2.8 MHz), and DSD (5.6 MHz)—recorded with multiple studio-quality audio recorders were evaluated in a double-blind A/B comparison listening test. Six sound programs evaluated by forty-six participants on eight attributes revealed statistically significant differences between PCM and DSD but not between the two sampling frequencies (2.8 MHz and 5.6 MHz) of DSD.

Convention Paper 9019

Is anyone here planning to attend the conference and the presentation?

Chris


--------------------
If I don't reply to your reply, it means I agree with you.
Mach-X
post Mar 8 2014, 08:08
Post #92





Group: Members
Posts: 275
Joined: 29-July 12
From: Windsor, On, Ca
Member No.: 101859



I am going to loosely quote a well known poster here: "all this fuss about what happens at 22.05 kHz when 99% of the world has no issue with 16 kHz brickwalled lossy encoding". Arnie?
bandpass
post Mar 8 2014, 09:06
Post #93





Group: Members
Posts: 327
Joined: 3-August 08
From: UK
Member No.: 56644



TEAC -- a slight conflict of interest, perhaps?

Guessing their method falls short of M&M - the gold standard.
[JAZ]
post Mar 8 2014, 18:38
Post #94





Group: Members
Posts: 1785
Joined: 24-June 02
From: Catalunya(Spain)
Member No.: 2383



QUOTE (C.R.Helmrich @ Mar 8 2014, 01:21) *
...recorded with multiple studio-quality audio recorders ...


So... were they evaluating the formats, or the ADC's on those recorders?

(And that is assuming that when they say "Six sound programs" they mean six sound samples. Not so sure what the "eight attributes" part really means... were they answering "which one sounds fuller?" type questions? That would not be an ABX in any way.)
Arnold B. Kruege...
post Mar 10 2014, 12:58
Post #95





Group: Members
Posts: 3930
Joined: 29-October 08
From: USA, 48236
Member No.: 61311



QUOTE (Mach-X @ Mar 8 2014, 03:08) *
I am going to loosely quote a well known poster here: "all this fuss about what happens at 22.05 kHz when 99% of the world has no issue with 16 kHz brickwalled lossy encoding". Arnie?



If I didn't say it, I'll take credit for it anyway, because it is relevant and factual. ;-)

The classic trap related to high sample rate tests is monitoring equipment with intermodulation distortion (IM).

WernerO
post Mar 11 2014, 08:04
Post #96





Group: Members
Posts: 74
Joined: 21-November 06
Member No.: 37858



Perhaps.

Another weakness is their apparent use of a less than blameless sample rate convertor (Pyramix 6).
Arnold B. Kruege...
post Mar 11 2014, 13:44
Post #97





Group: Members
Posts: 3930
Joined: 29-October 08
From: USA, 48236
Member No.: 61311



QUOTE (WernerO @ Mar 11 2014, 03:04) *
Perhaps.

Another weakness is their apparent use of a less than blameless sample rate convertor (Pyramix 6).


Pyramix 6.2.3

http://src.infinitewave.ca/



Yeccch!
Kees de Visser
post Mar 11 2014, 14:28
Post #98





Group: Members
Posts: 685
Joined: 22-May 05
From: France
Member No.: 22220



QUOTE (Arnold B. Krueger @ Mar 11 2014, 13:44) *
Yeccch!
Are you sure? I agree it's not state of the art anymore (Pyramix v.6 is an old version, btw), and for this kind of testing there's no excuse for not using the best SRC available, even for DSD to 24/192.
The purple colour means the distortion products are around -120dBFS. You posted recently in another thread:
QUOTE (Arnold B. Krueger @ Feb 27 2014, 13:56) *
What this seems to completely miss is that when measured performance is sufficiently high (e.g. the 100 dB rule), subjective tests are a complete and total waste of time.
I'd be interested to see more test details.
Wombat
post Mar 11 2014, 15:51
Post #99





Group: Members
Posts: 1055
Joined: 7-October 01
Member No.: 235



QUOTE (Kees de Visser @ Mar 11 2014, 14:28) *
Are you sure? I agree it's not state of the art anymore (Pyramix v.6 is an old version, btw), and for this kind of testing there's no excuse for not using the best SRC available, even for DSD to 24/192.
The purple colour means the distortion products are around -120dBFS.

Looking at the passband behaviour, I really wonder what is happening there. I have no clue about the math, but none of the SRCs known to work correctly shows this weird pattern.
This may hint at a problem with the v.6 resampler that we can't exactly pin down from the 3 graphs offered.


C.R.Helmrich
post Mar 11 2014, 23:55
Post #100





Group: Developer
Posts: 690
Joined: 6-December 08
From: Erlangen Germany
Member No.: 64012



QUOTE (WernerO @ Mar 11 2014, 08:04) *
Another weakness is their apparent use of a less than blameless sample rate convertor (Pyramix 6).

This might be a dumb question, but: How do you know this? Did you review the paper?

Btw, happy (belated) 35th birthday to the Compact Disc! That format was way ahead of its time, I would say (notice I say format, not implementations thereof).

Apparently Sony also presented some CD prototype at an AES convention 35 years ago today.

Chris


--------------------
If I don't reply to your reply, it means I agree with you.