24/96 digitalization - can it be audible, blind test results
pawelq
post Jun 18 2011, 23:14
Post #1





Group: Members
Posts: 541
Joined: 20-December 05
From: Springfield, VA
Member No.: 26522



I would like to share results of a blind listening test that was conducted on June 16th, 2011 in Warsaw, Poland.

The full description of the test is available at http://www.audio.e-snp.net/, but it’s in Polish, so I’ll provide a synopsis below.

The test question was "can we hear any effect of A/D/A conversion at 24/96 performed with a studio recorder". Frankly, both the test organizer and I (let's say, the test statistician) were pretty much convinced that no difference would be detectable.

The test music was played using a gramophone (Bergman Audio Sindre with Air Tight PC-1 cartridge), gramophone preamp (Air Tight ATE-2005), * , amplifier (Soulution 720 / 710), and loudspeakers (Hansen Audio Prince V2). In “analog” trials, the path was as described above. In “digital” trials, a studio recorder (Tascam DV-RA1000) set into “monitor” mode (i.e., A/D then D/A, at 24 bit and 96 kHz) was inserted in place of the asterisk.

The participants listened to about 1 minute of “Falling Alice” from Chick Corea’s “Mad Hatter” LP from 1976, mint condition. The test was conducted in a 5.8 m x 7.8 m acoustically adapted room.

There were 10 listeners (neither the test organizer nor I participated; he was switching the connections, and I was several thousand kilometers away). All listeners listened together in the same room. They left the room while the connections were switched. During listening, the test organizer remained at the back of the room, out of the listeners' sight. The listeners (as well as the organizers, myself included) are members of a small Polish “sensible audiophile” internet forum.

There were 13 test trials: 7 of them used A/D/A conversion (D) and 6 used the purely analog path (A). The order of D and A trials was random.

Prior to the test, the listeners familiarized themselves with the supposed difference in A vs. D sound and with the recording, which was played a few times in A and D configuration. The listeners received answer cards on which they marked the trials as A or D. They were asked to answer in each trial, even if they were unsure. Prior to the test, they also provided answers to three questions: “Do you think that the effect of digitalization will be audible (yes/no)?”, “Do you consider yourself an experienced listener of vinyl records, using high-quality equipment (yes/no)?”, “How much of your listening time is spent listening to vinyl records (in %)”.

The results were analyzed in two main ways.
In the individual analysis, we checked whether any participant identified A vs. D at a statistically significant level. Using a one-sided binomial test with Šidák correction (because multiple listeners mean multiple tests), we determined that at most one error (12/13 correct) is allowed to pass this test (p=0.017; for 11/13, p=0.107).
In the group analysis, we converted the results to proportion correct (e.g., 8/13=0.615) and used a one-sided one-sample Wilcoxon test to determine whether the median proportion correct was significantly higher than 0.5. (Additionally, we calculated a one-sided one-sample t-test; however, because of the small sample size, the normality assumption could not be reliably tested, and we consider the t-test results less trustworthy.)
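
For readers who want to check the individual-analysis threshold, the calculation can be sketched in Python as below. This is an illustration, not the script actually used for the test: the function names are made up here, and it assumes a chance level of 0.5, 13 trials per listener, and a Šidák correction across the 10 listeners.

```python
from math import comb

def binom_tail(k, n, p=0.5):
    """One-sided binomial p-value: P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def sidak(p_single, m):
    """Sidak-adjusted p-value for m independent tests."""
    return 1 - (1 - p_single) ** m

N_TRIALS = 13     # trials per listener (from the post)
N_LISTENERS = 10  # number of simultaneous tests (assumed equal to listeners)

for correct in (11, 12, 13):
    raw = binom_tail(correct, N_TRIALS)
    adj = sidak(raw, N_LISTENERS)
    print(f"{correct}/13: raw p = {raw:.4f}, Sidak-adjusted p = {adj:.3f}")
```

Under these assumptions the adjusted values come out at about 0.107 for 11/13 and 0.017 for 12/13, matching the figures quoted above.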

The results table, a plot of proportion correct for each listener, and a more detailed description of the analysis are provided at http://www.audio.e-snp.net/wyniki.php (FYI, in the result table A stands for “analog” trials, C stands for “digital” trials, column 1 is listener number, columns 2-4 are answers to the three questions (NIE=no, TAK=yes), the bottom row shows what actually happened in the trials.)
No listener answered with 0 or 1 error, which was required for statistically significant outcome of the individual analysis. There was one 11/13 (0.846) result, and two 10/13 (0.769) results. No one scored below 6/13 (0.462).

The interesting thing is that the average proportion correct was 0.631, which was significantly higher than chance (p=0.0322, Wilcoxon test; possibly unreliable t-test: p=0.0093). My interpretation is that the A/D/A process done with the Tascam recorder did audibly influence the signal.
No association of answers to any of the three questions with the proportion of correct answers was found.

Any comments?



--------------------
Ceterum censeo, there should be an "%is_stop_after_current%".
Notat
post Jun 19 2011, 01:01
Post #2





Group: Members
Posts: 581
Joined: 17-August 09
Member No.: 72373



Did you carefully match the level of the system with and without the Tascam inserted, i.e., are you sure that the Tascam had exact unity gain? Listeners can hear minute level differences and usually perceive them as qualities other than a difference in level.
Juha
post Jun 19 2011, 07:23
Post #3





Group: Members
Posts: 478
Joined: 14-February 07
From: EU-FIN
Member No.: 40610



By the specs, Tascam DV-RA1000 does not have very good A/D convertor(s) so, it would be nice to see RMAA results of it.

Juha
hlloyge
post Jun 19 2011, 09:21
Post #4





Group: Members
Posts: 701
Joined: 10-January 06
From: Zagreb
Member No.: 27018



QUOTE (Juha @ Jun 19 2011, 07:23) *
By the specs, Tascam DV-RA1000 does not have very good A/D convertor(s) so, it would be nice to see RMAA results of it.

Juha


By what specs?
Juha
post Jun 19 2011, 10:29
Post #5





Group: Members
Posts: 478
Joined: 14-February 07
From: EU-FIN
Member No.: 40610



QUOTE (hlloyge @ Jun 19 2011, 11:21) *
QUOTE (Juha @ Jun 19 2011, 07:23) *
By the specs, Tascam DV-RA1000 does not have very good A/D convertor(s) ... .

Juha


By what specs?



DV-RA1000 - http://tascam.com/product/dv-ra1000/downloads/
DV-RA1000HD - http://tascam.com/product/dv-ra1000hd/downloads/

IMO, for this type of comparison of an analog source against its digitized version, the digital side should be prepared using high-quality A/D equipment (at minimum a Mytek ADC, Benchmark ADC1, Prism Sound AD-2, etc.).

Juha

This post has been edited by Juha: Jun 19 2011, 10:43
DonP
post Jun 19 2011, 11:50
Post #6





Group: Members (Donating)
Posts: 1473
Joined: 11-February 03
From: Vermont
Member No.: 4955



Just throwing out some guesses here of things that might give clues:

Levels set so clipping might have happened (like when the stylus hit the record)?

Sometimes with equipment like that "monitor mode" is meant for monitoring, not an output used for production, and is not up to the full specs.

What sort of time delay is there in the A/D/A process? Could there be some crosstalk or bleed through that would give a subtle pre-echo of the original analog?
WernerO
post Jun 19 2011, 16:08
Post #7





Group: Members
Posts: 74
Joined: 21-November 06
Member No.: 37858



QUOTE (Juha @ Jun 19 2011, 07:23) *
By the specs, Tascam DV-RA1000 does not have very good A/D convertor(s)


It is a textbook implementation of the PCM1804. It measures very well. Please point in detail at the specs that told you that it was not very good.

pawelq
post Jun 19 2011, 16:17
Post #8





Group: Members
Posts: 541
Joined: 20-December 05
From: Springfield, VA
Member No.: 26522



Thanks everyone for your comments. And sorry for my slow response: I have to discuss the details with the other test organizer, and since we are on different continents and in different time zones, this takes time.

QUOTE (Notat @ Jun 18 2011, 20:01) *
are you sure that the Tascam had exact unity gain?

The difference was <0.2 dB.


QUOTE (Juha @ Jun 19 2011, 02:23) *
it would be nice to see RMAA results of it.

Wouldn't it require using measurement equipment that has better specs than the Tascam?


QUOTE (DonP @ Jun 19 2011, 06:50) *
Levels set so clipping might have happened (like when the stylus hit the record)?

The preamplifier was muted before lowering the stylus and unmuted afterwards. It's a slow, ~2s mute/unmute. Placing the stylus and muting/unmuting was done by a person unaware whether the current trial was A or D.


QUOTE (DonP @ Jun 19 2011, 06:50) *
Sometimes with equipment like that "monitor mode" is meant for monitoring, not an output used for production, and is not up to the full specs.

We used the regular Analog Out RCA outputs. As far as we know, the procedure is identical to normal "production" recording except that there is no recording.


QUOTE (DonP @ Jun 19 2011, 06:50) *
What sort of time delay is there in the A/D/A process? Could there be some crosstalk or bleed through that would give a subtle pre-echo of the original analog?

We'll look into the possibility of measuring this delay. On the other hand, the Tascam specs say that crosstalk is <-97 dB.


At this point we have to admit that we have found a potential confound, although it might be irrelevant. We ensured that the levels were below clipping by observing the Tascam's clipping indicators during pre-test runs. Everything seemed fine. But we also recorded the test music using the same configuration and settings, and there was a tiny amount of slight clipping in the recorded file: 21 and 98 clipped samples in the left and right channels, respectively, out of more than 17 million samples in each channel. We are not sure whether this can be audible or significant, but it is a methodological flaw, and we'll try to find a way of avoiding it should we redo the test or do similar tests.
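
One way to quantify clipping in such a recording is to count samples that sit at digital full scale. The sketch below is illustrative only: it assumes the recording has already been decoded into a list of signed 24-bit integer samples, and the helper name `count_clipped` is hypothetical. Note that a sample at full scale is only a hint of clipping; runs of consecutive full-scale samples are stronger evidence.

```python
FULL_SCALE_24 = 2**23 - 1  # largest positive value of a signed 24-bit sample

def count_clipped(samples, full_scale=FULL_SCALE_24):
    """Count samples at positive or negative digital full scale."""
    return sum(1 for s in samples if s >= full_scale or s <= -full_scale - 1)

# Tiny synthetic example (not the test recording)
demo = [0, 1200, FULL_SCALE_24, -FULL_SCALE_24 - 1, -5000]
print(count_clipped(demo))  # 2

# Proportion of clipped samples using the counts reported in the post,
# taking 17 million as a lower bound on the samples per channel.
for channel, clipped in (("L", 21), ("R", 98)):
    print(f"{channel}: at most {clipped / 17_000_000:.2e} of all samples")
```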





--------------------
Ceterum censeo, there should be an "%is_stop_after_current%".
Juha
post Jun 19 2011, 20:13
Post #9





Group: Members
Posts: 478
Joined: 14-February 07
From: EU-FIN
Member No.: 40610



QUOTE (WernerO @ Jun 19 2011, 18:08) *
QUOTE (Juha @ Jun 19 2011, 07:23) *
By the specs, Tascam DV-RA1000 does not have very good A/D convertor(s)


It is a textbook implementation of the PCM1804. It measures very well. Please point in detail at the specs that told you that it was not very good.


If it's the BB PCM1804 chip, as you suggest, then its specs are:

Dynamic Range: 112 dB (Typical)
SNR: 111 dB (Typical)

According to the DV-RA1000HD documentation, the dynamic range of the DAC is 120 dB.

Specs from Owners Manual (Analog to Analog)

Dynamic Range: >103 dB (DVD +RW) (>96 CD-R/RW)
SNR: >103 dB

Bench document results (Analog to Analog)

Dynamic Range: 107 dB
SNR: ~107 dB


As comparison, E-MU 0404 USB - http://www.emu.com/products/product.asp?ca...lSpecifications
RMAA - http://ixbtlabs.com/articles2/proaudio/emu-0404-usb-p2.html

So it's not the worst implementation found in a Tascam, but isn't it just like using an 18-bit A/D stage that supports 24-bit resolution? (Though would the extra 2-3 bits of range you could get by using the ADCs I mentioned mean much in this type of test?)


Juha
Notat
post Jun 20 2011, 00:14
Post #10





Group: Members
Posts: 581
Joined: 17-August 09
Member No.: 72373



QUOTE (pawelq @ Jun 19 2011, 09:17) *
QUOTE (Notat @ Jun 18 2011, 20:01) *
are you sure that the Tascam had exact unity gain?

The difference was <0.2 dB.

My recollection is that best practice for a sensitive test like this is <0.1 dB.
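
To put the two figures in perspective, a level difference in dB can be converted to an amplitude ratio with the standard relation ratio = 10^(dB/20). A minimal sketch (the function name is just for illustration):

```python
def db_to_amplitude_ratio(db):
    """Convert a level difference in dB to a voltage (amplitude) ratio."""
    return 10 ** (db / 20)

for db in (0.1, 0.2):
    ratio = db_to_amplitude_ratio(db)
    print(f"{db} dB -> amplitude ratio {ratio:.4f} ({(ratio - 1) * 100:.2f}% higher)")
```

So 0.2 dB corresponds to roughly a 2.3% difference in signal voltage, and 0.1 dB to roughly 1.2%.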
saratoga
post Jun 20 2011, 00:24
Post #11





Group: Members
Posts: 5116
Joined: 2-September 02
Member No.: 3264



QUOTE (Juha @ Jun 19 2011, 15:13) *
So, it's not the worst implementation found in Tascam but, isn't it just as using a 18 -bit A/D stage that supports 24-bit resolution (though, would that extra 2-3 bit range you could get by using ADCs I mentioned have much mean in this type of test?).


I don't know what you're trying to say with "18 -bit A/D stage that supports 24-bit resolution", but according to Google it's a 24-bit converter.

Northpack
post Jun 20 2011, 00:48
Post #12





Group: Members
Posts: 455
Joined: 16-December 01
Member No.: 664



QUOTE (saratoga @ Jun 19 2011, 23:24) *
QUOTE (Juha @ Jun 19 2011, 15:13) *
So, it's not the worst implementation found in Tascam but, isn't it just as using a 18 -bit A/D stage that supports 24-bit resolution (though, would that extra 2-3 bit range you could get by using ADCs I mentioned have much mean in this type of test?).


I don't know what you're trying to say with "18 -bit A/D stage that supports 24-bit resolution", but according to Google its a 24 bit converter.

I think this is nit-picking anyway. Remember that the analogue source for this test is a record from 1975. It would be spectacular enough if the test suggested that a properly done 16-bit A/D/A conversion were audible.

This post has been edited by Northpack: Jun 20 2011, 00:50
krabapple
post Jun 20 2011, 03:55
Post #13





Group: Members
Posts: 2418
Joined: 18-December 03
Member No.: 10538



Just to be clear, this is all about an average score? All the individual scores failed to reach the statistical significance threshold?


2Bdecided
post Jun 20 2011, 10:11
Post #14


ReplayGain developer


Group: Developer
Posts: 5254
Joined: 5-November 01
From: Yorkshire, UK
Member No.: 409



82/130.

Assuming you didn't cherry pick the data, I reckon it's fair to take that as one block. I'm sure that results in a very small p value, but I can't find a large enough p-value table to check.
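
In place of a printed table, the pooled figure can be checked with an exact one-sided binomial tail. This is only a sketch: it treats all 130 answers as independent fair-coin guesses under the null hypothesis, which is an assumption that may not hold here because the listeners sat in the same room.

```python
from math import comb

def binom_tail(k, n):
    """Exact one-sided p-value P(X >= k) for a fair-coin guesser over n trials."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2**n

p = binom_tail(82, 130)
print(f"82/130 correct: one-sided p = {p:.4f}")
```

Under that independence assumption the p-value comes out at roughly 0.002.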


I'd be very worried about a 0.2dB level difference, and quite worried about clipping (depending on the content).

If X should not be audible, but apparently might have been audible, the fact you also have Y and Z which are known to be (sometimes) audible is rather significant.

Cheers,
David.
WernerO
post Jun 20 2011, 11:18
Post #15





Group: Members
Posts: 74
Joined: 21-November 06
Member No.: 37858



QUOTE (Juha @ Jun 19 2011, 20:13) *
isn't it just as using a 18 -bit A/D stage that supports 24-bit resolution (though, would that extra 2-3 bit range you could get by using ADCs I mentioned have much mean in this type of test?).


Yes, the noise performance of the ADC side of the DV-RA1000 is about 18 bit equivalent. I measured input SNR at 106dB or so, unweighted, 22kHz, IIRC. I only ever measured one ADC better under the exact same conditions; that one was 19 bit equivalent.
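
The "bit equivalent" figures can be roughly reproduced with the textbook relation for an ideal quantizer driven by a full-scale sine, SNR = 6.02 N + 1.76 dB. This is a sketch under that assumption; the exact convention behind the numbers quoted in this thread is not stated, so small differences are expected.

```python
def snr_to_bits(snr_db):
    """Equivalent bits for an ideal quantizer: SNR = 6.02 * N + 1.76 dB."""
    return (snr_db - 1.76) / 6.02

# SNR figures mentioned in this thread
for label, snr in (("measured ADC input SNR", 106),
                   ("bench document, analog to analog", 107),
                   ("PCM1804 typical", 111)):
    print(f"{label}: {snr} dB -> about {snr_to_bits(snr):.1f} bits")
```

By this formula 106 dB works out to a little over 17 bits, and even the chip's typical 111 dB is only about 18 bits, so none of these figures is close to a full 24 bits.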

The products you show are not 2-3 bits better.






usernaim
post Jun 20 2011, 16:21
Post #16





Group: Members
Posts: 61
Joined: 4-May 08
Member No.: 53291



Interesting that no one did worse than 6/13. Just by intuition I find that to be potentially informative.

But the multiple listeners issue is bothersome. Is there any possibility that the listeners influenced each other?

Also, as far as applicability, we don't tend to listen to our hi-fi that way, with all that acoustic interference. I don't know if that makes the test more or less difficult or doesn't matter, but it might matter.
[JAZ]
post Jun 20 2011, 18:41
Post #17





Group: Members
Posts: 1796
Joined: 24-June 02
From: Catalunya(Spain)
Member No.: 2383



QUOTE (usernaim @ Jun 20 2011, 17:21) *
Interesting that no one did worse than 6/13. Just by intuition I find that to be potentially informative.


Scoring 6/13 is not an indication of being on the right track. 6/13 (and 7/13) is indicative of chance. If you flip a coin, either side is equally likely. If you flip it twice, randomness suggests that on average you get each side once, so achieving 1/2.

That's why here at hydrogenaudio we try to make people aware of ABX and of how to interpret the scores (asking for 95% or, in some cases, even 99% confidence).

In fact, 16/16 is exactly as likely as 0/16 by chance. A 0/16, if not obtained by chance, would suggest that a difference was clearly noticed but the user mixed up which was which when answering.
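
As a rough illustration of those thresholds, the sketch below finds the smallest ABX score whose one-sided binomial p-value falls at or below a given level. Reading "95%" and "99%" as significance levels of 0.05 and 0.01 is an assumption about what is meant here, and the function names are illustrative.

```python
from math import comb

def binom_tail(k, n):
    """P(X >= k) when each of n trials is a 50/50 guess."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2**n

def min_correct(n, alpha):
    """Smallest number of correct answers k with P(X >= k) <= alpha."""
    for k in range(n + 1):
        if binom_tail(k, n) <= alpha:
            return k
    return None

print(min_correct(16, 0.05))  # 12
print(min_correct(16, 0.01))  # 14

# 0/16 and 16/16 are equally likely under pure guessing
print(comb(16, 0) / 2**16 == comb(16, 16) / 2**16)  # True
```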


About the test itself, I am not knowledgeable enough to find what would make it incorrect, but having 3 out of 10 listeners with a result of 10/13 or better is something to try to understand. There is no proof (they didn't pass the test) that they could hear a difference, but the results imply that there might have been something.

This post has been edited by [JAZ]: Jun 20 2011, 18:43
usernaim
post Jun 20 2011, 19:46
Post #18





Group: Members
Posts: 61
Joined: 4-May 08
Member No.: 53291



QUOTE ([JAZ] @ Jun 20 2011, 13:41) *

QUOTE (usernaim @ Jun 20 2011, 17:21) *
Interesting that no one did worse than 6/13. Just by intuition I find that to be potentially informative.


Doing 6/13 is not an indicative of being on the good track of things. 6/13 (and 7/13) is an indicative of chance. If you flip a coin, there's as much possibilities to get any of each sides. If you flip it twice, the randomness would suggest you get different sides each time, so achieveing a 1/2.

That's why here at hydrogenaudio we try to make people aware of ABX, and the way to understand the values ( asking to reach 95% or in some cases even 99% of success).

In fact, there is as much possibilities to reach 16/16 than to reach 0/16. 0/16, if not done by chance, would suggest that a difference was clearly noticed, but the user misinterpreted which was which when answering.


About the test itself, I am not knowledgeable enough to find what would make it incorrect, but having 3 out of 10 listeners with a result of 10/13 or better is something to try to understand. There is no proof (they didn't pass the test) that they could hear a difference, but the results imply that there might have been something.

I disagree. If the listeners were guessing, you would expect a roughly bell-shaped (binomial) distribution centered at 50/50. That we have people who got 10/13 would be expected. That no one got 3/13 or 4/13 or even 5/13 is what stands out about the distribution. It is also, ultimately, why the overall mean was statistically significant. If bad guessers balanced out good guessers, the mean would be 50/50. They didn't.
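
That intuition can be given a rough number. The sketch below computes how likely it is that all 10 listeners would score at least 6/13 if every answer were an independent coin flip. Independence between listeners is an assumption (they were in the same room), and because the cutoff was chosen after seeing the data, the result is illustrative rather than a formal test.

```python
from math import comb

N_TRIALS = 13
N_LISTENERS = 10

def prob_at_least(k, n):
    """P(X >= k) for n fair-coin guesses."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2**n

p_one = prob_at_least(6, N_TRIALS)   # one guesser scores >= 6/13
p_all = p_one ** N_LISTENERS         # all ten do, if independent
print(f"P(one guesser scores >= 6/13)   = {p_one:.3f}")
print(f"P(all 10 guessers score >= 6/13) = {p_all:.3f}")
```

Under these assumptions a single guesser reaches 6/13 or better about 71% of the time, but all ten doing so happens only about 3% of the time.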

This post has been edited by usernaim: Jun 20 2011, 19:47
Parelius
post Jun 20 2011, 20:25
Post #19





Group: Members
Posts: 5
Joined: 31-July 09
Member No.: 71909



QUOTE (pawelq @ Jun 19 2011, 17:17) *
We ensured that the levels were below clipping by observing Tascam's clipping indicators during pre-test runs. Everything seemed fine. But we also recorded the test music using the same configuration and settings, and there was tiny amount of slight clipping in the recorded file, namely 21/98 samples in L/R channels, for a total of >17 million samples in each channel. We are not sure if this can be audible/significant, but it is a methodological flaw, and we'll try to find a way of avoiding it, should we re-do the test, or do similar tests.


I don't have any backing for this, so please just overlook if it is too far out.

I've been told that my MH ULN8 should operate at -6 dB (if I'm not wrong) at the input for the A/D converters to perform at their best. Is that just a «fairytale»? And if not, is this something that is common to A/D converters and could also apply to the Tascam unit? (Nothing is said about the input level in this test, as far as I can see.)

Apologies if I'm breaking any rules here.
Notat
post Jun 20 2011, 23:09
Post #20





Group: Members
Posts: 581
Joined: 17-August 09
Member No.: 72373



QUOTE (Parelius @ Jun 20 2011, 13:25) *
QUOTE (pawelq @ Jun 19 2011, 17:17) *
We ensured that the levels were below clipping by observing Tascam's clipping indicators during pre-test runs. Everything seemed fine. But we also recorded the test music using the same configuration and settings, and there was tiny amount of slight clipping in the recorded file, namely 21/98 samples in L/R channels, for a total of >17 million samples in each channel. We are not sure if this can be audible/significant, but it is a methodological flaw, and we'll try to find a way of avoiding it, should we re-do the test, or do similar tests.


I don't have any backing for this, so please just overlook if it is too far out.

I've been told that my MH ULN8 should operate at -6db (if I'm not wrong) at the input for the A/D converters to prove their best. Is that just a «fairytale»?; and if not, is this something that is common to A/D converters and could also apply to the Tascam unit? (Nothing said about the input in this test, as I can see.)

Apology if I'm breaking any rules here.

Professional recording is usually done at -24 dBFS or so. As has apparently been demonstrated here, it does not pay to be stingy with headroom. Watch your meters. There is no standard for what a clipping indicator means.
Parelius
post Jun 20 2011, 23:10
Post #21





Group: Members
Posts: 5
Joined: 31-July 09
Member No.: 71909



^
Sleeping in class. «overlook» should be «ignore». Sorry for that. (Didn't find any edit button.)
AndyH-ha
post Jun 21 2011, 04:56
Post #22





Group: Members
Posts: 2223
Joined: 31-August 05
Member No.: 24222



In case it isn't clear, leaving headroom when recording is to prevent unexpectedly high input levels from clipping. It has nothing to do with the quality of the A-to-D conversion (barring some unusual ADC) or with how the converter operates. As long as there is no clipping, the input can be extremely near, or even at, 0 dBFS.
WernerO
post Jun 21 2011, 07:03
Post #23





Group: Members
Posts: 74
Joined: 21-November 06
Member No.: 37858



QUOTE (Parelius @ Jun 20 2011, 21:25) *
I've been told that my MH ULN8 should operate at -6db (if I'm not wrong) at the input for the A/D converters to prove their best.


Many delta-sigma ADC chips have a slightly rising distortion in the top half of their input range.

AndyH-ha
post Jun 21 2011, 09:01
Post #24





Group: Members
Posts: 2223
Joined: 31-August 05
Member No.: 24222



What is the definition of a half of the input range?

Does this problem ever show up in tests of soundcards?

If so, where can some revealing results be found?
Kees de Visser
post Jun 21 2011, 09:46
Post #25





Group: Members
Posts: 707
Joined: 22-May 05
From: France
Member No.: 22220



QUOTE (AndyH-ha @ Jun 21 2011, 05:56) *
leaving headroom when recording is to prevent unexpectedly high input levels from clipping.
Under studio conditions it's quite possible to make an educated guess about the max SPL and take some risk. And if clipping happens, it's often no problem to record that part again. Live recording is different, so a larger headroom margin can be required. Last weekend I recorded airplanes at an airshow, and even with plenty of headroom the large 3-engine airplane took me by surprise. A 24-bit ADC is no luxury under these conditions, because the quieter parts will require at least 20 dB of gain during post-production.