Results for 24bit/96KHz test, vs. 16bit/44.1KHz
listen
post Jan 4 2004, 06:20
Post #1





Group: Members
Posts: 56
Joined: 12-September 03
Member No.: 8809



I've been trying tigre's 24/96 test proposed in this thread, and also discussed at Afterdawn.

High definition stuff is also discussed here, here, and samples are here, but yeah, we've got a listening test thread now, so might as well use it...

My equipment is an M-Audio Revolution 7.1 feeding straight to Sennheiser HD 200 headphones. I downloaded Lovely_1.wv and used foobar2000 to do resampling, replaygaining, and ABXing. At first I was using waveOut, but then I retested them all using Kernel Streaming.

Anyway, I can ABX (with less than 1% chance of guessing):

[24/96] vs. [24/96->16/44.1->24/96] (slow resampling, dither)
[24/96] vs. [24/96->24/44.1->24/96] (slow resampling)
[24/96] vs. [24/96->16/96->24/96] (dither)

My results varied a bit, but all were significant. The first test I did, I was not expecting to hear any difference, so I was very careful, and got 12/12. Since then I've had 12/12s, 11/12s, a 10/10, and an 8/8 (got interrupted but still a valid result, and it was only a retest...)
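For reference, the "less than 1% chance of guessing" figure can be checked with a one-sided binomial test, treating every ABX trial as a coin flip if no difference is audible. A minimal sketch in Python; the name `abx_p_value` is made up for illustration, and this is not necessarily how foobar2000's ABX tool computes its own number:

```python
from math import comb

def abx_p_value(correct, trials):
    """Chance of scoring at least `correct` out of `trials` by pure guessing."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

# The scores mentioned in the post above.
for correct, trials in [(12, 12), (11, 12), (10, 10), (8, 8)]:
    print(f"{correct}/{trials}: p = {abx_p_value(correct, trials):.4f}")
```

Each of these scores comes out below 0.01 on its own, which only holds if every session was fixed in length beforehand and all sessions are reported.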

The most consistently hearable difference for me is when I listen between 5.2 and 7.2 seconds. Some sort of drum gets hit at about 5.7s. The high definition one is somehow more convincing. Today I was thinking of the good one as a push and the bad one as a pull, but yeah, that's not a very helpful description.

I'm also hearing other differences, but it's hard to know whether I'm being tipped off by something while focussing on something else, or even what the actual difference is in objective terms.

So, what could be wrong? What else would be worth testing? I was thinking of noise shaping the output maybe...
I'm not that keen to do a huge amount of retesting with every possible combination, but if someone thinks of something important I'll be sure to check it.
tigre
post Jan 4 2004, 18:53
Post #2


Moderator


Group: Members
Posts: 1434
Joined: 26-November 02
Member No.: 3890



Thanks for the effort, listen. I hope this will encourage others to perform the test too - and some knowledgeable people around here to share their ideas about 'what could be wrong'.

I have tried to reproduce your findings, but never got better than 4/4, then results got messed up. I need to take breaks after a few trials but haven't had enough patience yet.


--------------------
Let's suppose that rain washes out a picnic. Who is feeling negative? The rain? Or YOU? What's causing the negative feeling? The rain or your reaction? - Anthony De Mello
listen
post Jan 5 2004, 08:11
Post #3





Group: Members
Posts: 56
Joined: 12-September 03
Member No.: 8809



I just tried another ABX test.

This time it was 96kHz with a lowpass around 21-22kHz, against 44.1kHz.

I tried this for two reasons.
  • To ensure that really high frequency content was not messing stuff up and causing an audible difference lower down in the spectrum. For example my headphones probably don't respond well to things much above the audible spectrum. Low passing eliminates the possibility that the high-def one actually sounds worse because of my equipment.
  • Because I think the difference I'm hearing is something other than really high frequency sound. In fact, at the volume I'm listening at, I wouldn't be surprised at all if I couldn't even hear to 18kHz. Come to think of it, I don't even know if I can hear that high anyway.
Anyway, I got 11/12... good enough... and not hugely difficult (I must be getting used to it). I heard exactly the same difference as I did without the low-pass. Well, that's what I think at least. The bit of percussion that I mentioned above seems more defined and convincing, and it just sits better. It sounds slightly more like someone playing it, rather than just a recording of the sound it can make.

I shouldn't forget though, that filters aren't perfect.
I'm guessing nobody knows a foolproof way of low-passing that I could use (not resampling to 44.1 ;) )

edit: what a demented winkie.. :huh:

This post has been edited by listen: Jan 5 2004, 08:13
tigre
post Jan 5 2004, 13:11
Post #4


Moderator


Group: Members
Posts: 1434
Joined: 26-November 02
Member No.: 3890



Interesting. Maybe we're getting closer to tracking it down. What kind of lowpass did you use for your last test?

Can you please run the sample I attach (one second of silence with a single click) through this lowpass and post it here? (If you can't upload, use the upload forum or tell me and I'll PM you my email address.)
Attached File(s)
Attached File  singleclick_96.zip ( 32.68K ) Number of downloads: 765
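A click file like this is useful because a single non-zero sample run through a filter comes out as that filter's impulse response, so any ringing or phase smearing becomes visible in a wave editor. A rough sketch of how such a file could be built with Python's standard `wave` module; the click position, the full-scale level and the function name are assumptions, not a description of the attached file:

```python
import io
import wave

def single_click_wav(sample_rate=96000, seconds=1.0, click_at=0.5, bytes_per_sample=3):
    """Return a mono PCM WAV (as bytes): silence with one full-scale positive sample."""
    total = int(sample_rate * seconds)
    click_index = int(total * click_at)                 # assumed position: the middle
    peak = (1 << (8 * bytes_per_sample - 1)) - 1        # 0x7FFFFF for 24-bit audio
    frames = bytearray(total * bytes_per_sample)        # all zero bytes = digital silence
    start = click_index * bytes_per_sample
    frames[start:start + bytes_per_sample] = peak.to_bytes(bytes_per_sample, "little", signed=True)
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(bytes_per_sample)
        w.setframerate(sample_rate)
        w.writeframes(bytes(frames))
    return buf.getvalue()

data = single_click_wav()
print(len(data), "bytes of WAV data")
```

Writing `data` to a `.wav` file gives something that can be loaded into an editor and run through the lowpass under test.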
 


listen
post Jan 6 2004, 07:48
Post #5





Group: Members
Posts: 56
Joined: 12-September 03
Member No.: 8809



Hi tigre,
I'm not sure why I didn't check this before, but I just discovered that none of the low-pass methods I used are transparent. Even though I think they still sound better than 44.1, to prove that it's better (not just different) I guess I would need to use a low-passed file that I can't tell apart from the original 96.

I don't know a huge amount about these filters, so please suggest a better way if you know. I'm using Audition (same as CoolEdit), and so far I've used the 'Butterworth' filter to make three files. One has a very steep rolloff starting at 21K, and practically disappearing just over 22K. Then I tried with a slow rolloff starting at 19K, with a fair bit of sound remaining past 22K. And I also tried one somewhere in between. None of them was transparent :mad:
I don't want to start any lower than 19K, and I don't want to let in very much above 22K, because that would defeat the purpose of the test. What can I do?
KikeG
post Jan 6 2004, 16:22
Post #6


WinABX developer


Group: Developer
Posts: 1578
Joined: 1-October 01
Member No.: 137



Such rate of success in ABX makes the results a little bit suspicious, in my opinion. Suspicious mostly about something going wrong during the file generation process or the ABX procedure, possibly the RG process. But I can't say for sure, maybe it's all ok.

One of the things that looks strange is that you can easily ABX 16/96 vs. 24/96, that is, a change in bit depth alone. This is the first time I've seen something like this, here or at any other forum I know, and it can't be reasonably explained by any kind of poor transducer (headphone) or amp performance, mostly intermodulation. So I think more tests should be carried out to find out what's really going on.

I still haven't looked at the mentioned test files or verified the actual high frequency content and noise floor at the parts that are ABXable. When I find some time and feel like working on this, I will look into that, and generate some controlled test files over these parts, changing just bit depth and lowpass, to see if you can ABX them.

About Audition's Butterworth lowpass filter: it distorts phase at frequencies somewhat below the cutoff point. Better to use a Chebyshev 2 filter, or even better, the FFT filter with as much as 1024 points and Blackman windowing.
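To illustrate the idea of a lowpass that leaves phase alone, here is a sketch of a Blackman-windowed sinc FIR filter. Its taps are symmetric, so it delays every frequency by the same amount instead of shifting phase near the cutoff. This is a stand-in written for illustration, not Audition's FFT filter, and the 21.5 kHz cutoff and 1023 taps are assumed values:

```python
import math

def blackman_sinc_lowpass(cutoff_hz, sample_rate, num_taps=1023):
    """Linear-phase low-pass FIR: ideal sinc response shaped by a Blackman window."""
    fc = cutoff_hz / sample_rate                  # normalised cutoff, cycles per sample
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        ideal = 2 * fc if x == 0 else math.sin(2 * math.pi * fc * x) / (math.pi * x)
        window = (0.42 - 0.5 * math.cos(2 * math.pi * n / (num_taps - 1))
                  + 0.08 * math.cos(4 * math.pi * n / (num_taps - 1)))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]              # normalise to unity gain at DC

def gain_db(taps, freq_hz, sample_rate):
    """Magnitude response of the FIR at a single frequency, in dB."""
    w = 2 * math.pi * freq_hz / sample_rate
    re = sum(t * math.cos(w * n) for n, t in enumerate(taps))
    im = sum(t * math.sin(w * n) for n, t in enumerate(taps))
    return 20 * math.log10(math.hypot(re, im) + 1e-300)

taps = blackman_sinc_lowpass(21500, 96000)
for f in (1000, 19000, 21000, 22050, 24000):
    print(f"{f:>6} Hz: {gain_db(taps, f, 96000):8.2f} dB")
```

Convolving the 96 kHz audio with `taps` applies the filter. Longer filters give a narrower transition band at the cost of more pre- and post-ringing around transients.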

Also, if you feel like it, try disabling RG on foobar ABX tool, and verify you are not using any DSPs when generating test files. Try also using flat dither instead of noiseshaping dither, and fast resampling when downsampling.

In any case, it's good to know of your experience and results.

(I wish I had already fixed WinABX for 24-bit playback in W2K and XP. I'm on the way, but haven't done it yet, sorry)

This post has been edited by KikeG: Jan 6 2004, 16:30
KikeG
post Jan 7 2004, 10:52
Post #7


WinABX developer


Group: Developer
Posts: 1578
Joined: 1-October 01
Member No.: 137



Ok, finally I got to generate the test files.

Download this 1.8M zip: http://www.kikeg.arrakis.es/various/lovely_test.zip

It includes a flac file, flac decoder and SSRC. Extract to a folder, and execute (click) the 'generate.bat' file. 4 files will be generated:

- A: lovely_short.wav: original

- B1: lovely_16bit_dshaped.wav: dithered to 16 bit using noiseshaping dither, then back to 24 bit.

- B2: lovely_lowpass.wav: resampled to 44.1 KHz, then back to 96 KHz, all at 24 bit.

- B3: lovely_16bit_dflat.wav: dithered to 16 bit using flat dither, then back to 24 bit.

Edit: the flac file goes from approx. 4.2 sec to 8.2 sec. of original lovely_1.wv file.

Now, try ABXing the original from any of the other 3. Don't use RG since it's not needed at all. Please post the total number of trials and correct identifications.

Edit: of the two 16-bit converted files, try anyone you wish. I'd try the other varying dither alternative only if I could ABX the first dither option tried.
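The "dithered to 16 bit, then back to 24 bit" step can be sketched roughly as below. This is not what SSRC does internally; it assumes spectrally flat dither with a triangular PDF of plus or minus one 16-bit LSB, and the function name is made up. No noise shaping is applied:

```python
import random

def requantize_24_to_16_in_24(samples, rng=None):
    """Reduce 24-bit integer samples to 16-bit resolution with TPDF dither.

    The result stays on the 24-bit scale, so the low 8 bits of every
    output sample are zero.
    """
    rng = rng or random.Random()
    step = 1 << 8                                  # one 16-bit LSB in 24-bit units
    lo, hi = -(1 << 23), (1 << 23) - step          # 16-bit range on the 24-bit scale
    out = []
    for s in samples:
        dither = (rng.random() - rng.random()) * step   # triangular PDF, +/- 1 LSB
        q = round((s + dither) / step) * step
        out.append(max(lo, min(hi, q)))                 # clip instead of wrapping
    return out

print(requantize_24_to_16_in_24([0, 1000, -1000, 123456], random.Random(1)))
```

The only difference between the input and the output is the added noise floor at roughly the 16-bit level, which is what the B1 and B3 files are meant to isolate.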

This post has been edited by KikeG: Jan 7 2004, 12:16
2Bdecided
post Jan 7 2004, 11:26
Post #8


ReplayGain developer


Group: Developer
Posts: 5089
Joined: 5-November 01
From: Yorkshire, UK
Member No.: 409



QUOTE (KikeG @ Jan 7 2004, 09:52 AM)
Don't use RG since it's not needed at all.

That's very good advice! I don't know why foobar2k suggests using ReplayGain when ABXing. It's useful if you're comparing a codec which has intentionally (or unintentionally) scaled the audio, and it will probably prevent clipping - but otherwise it's a bad idea when ABXing! If you're sure that neither of the above can happen, then there is simply no need to ReplayGain in an ABX test, and it only adds another chance for error. You need to avoid all possible errors in a 16-bit vs 24-bit test!

Replay Gain cannot possibly help here if you're doing things correctly, but there's always the chance that it might scale one file fractionally differently from another, which introduces an extra variable that you don't want.


(If you're ABXing codecs which have (or might have) changed the volume of the file, then of course ReplayGain is very useful, but that's a different matter).


Cheers,
David.
KikeG
post Jan 7 2004, 12:06
Post #9


WinABX developer


Group: Developer
Posts: 1578
Joined: 1-October 01
Member No.: 137



Now, when you have tried the test files at my previous post and want to try the true test at issue, download the bat file at http://www.kikeg.arrakis.es/various/generate2.bat , put it on the same folder of the previous test, and execute it. It will generate 2 additional files to try to ABX from the original:

- B4: lovely_downs_dflat.wav: resampled to 16/44.1 and back to 24/96, using flat dither.

- B5: lovely_downs_dath.wav: resampled to 16/44.1 and back to 24/96, using soft ATH noiseshaping dither.

Try one of the files first. I'd try the other varying dither alternative only if I could ABX the first dither option tried.

This post has been edited by KikeG: Jan 7 2004, 12:13
listen
post Jan 7 2004, 23:49
Post #10





Group: Members
Posts: 56
Joined: 12-September 03
Member No.: 8809



Thanks for the input, I'll start working through your files soon...

I didn't mean to give the impression that 16bit vs. 24bit was easy, although there was one time I tried in the middle of the night (very relaxed, and no highway noise, background music etc..) when it was easy for a couple of minutes. Usually it is more difficult, in fact yesterday I couldn't seem to do it at all (with the same files and settings that I could tell apart before). Maybe I was just impatient..
Also, I've discovered that my left ear is a bit blocked, and hears a couple of dB less than my right. Yesterday it 'popped', so it's a relief to know it's not deafness, but everything in the headphones sounded quite different. Now it's back to how it was before. It feels like a blockage starting at my nose (if that makes sense)... in fact even when I swallow, I feel it more on the right, or maybe that's just what I'm hearing.

Yeah, replaygain... it seemed like foobar wouldn't let me ABX unless they were replaygained, but now I see there is an option to turn it off. I hope that's not all I was hearing..
Pio2001
post Jan 9 2004, 00:31
Post #11


Moderator


Group: Super Moderator
Posts: 3936
Joined: 29-September 01
Member No.: 73



I tried to ABX KikeG's files : lovely_short vs lovely_downs_dath
I thought I could hear the difference. I was carefully performing the ABX sessions...
After 5 trials I took a break and looked at my results: 3/5. Since I had decided to go to 8, this is a failure.


WindowsXP, Wave Out, Marian Marc 2 soundcard, Sennheiser HD-600 headphones.
The PC was running in one room, and I was listening in the next room, with the PC picture displayed on a screen by a very silent videoprojector in low lamp mode.
EDIT : no cable extension was used, the Sennheiser cable was just long enough to run under the door from one room to the next, with the mouse, keyboard and video extension cables :)

This post has been edited by Pio2001: Jan 9 2004, 00:33
listen
post Jan 9 2004, 02:13
Post #12





Group: Members
Posts: 56
Joined: 12-September 03
Member No.: 8809



Ok, I had a session last night and got some results. I didn't spend all night trying for high-scores, these are more like 'first concentrated attempt once I figured out the difference' scores.

First I tested 16 bit, flat dither:
The spot I had found the best for 16 bit testing was at 9s (in lovely_1), but doesn't matter, I used another guitar chord, at 2.1s in lovely_short.
This was not easy, and I only recorded a 10/12. I might re-do this at some stage to try for a better result, just to make sure.

Then 44.1KHz:
Using the tambourine at 3.1s (in short file), I was able to work through this in just a few minutes. I made a couple of stupid mistakes and didn't want to record a 10/12, so I took it up to 14/16. I also loaded up Audition and discovered that at the volume level I'm working with, 18KHz by itself is completely inaudible to me.

16 bit, noiseshaping dither:
I recorded that this was a dead giveaway. Sorry I didn't write down what part I listened to, but I would guess it was either 2.1s again, or my main spot around 3s. I got it to 8/8 without problems, but suddenly lost it completely, so I took a break.

lovely_downs_dflat:
This was easy, sounded like a lowpass (listening to the tambourine at 3.1s).
8/8, pretty hard to miss.

lovely_downs_dath:
I found this sample very confusing. I'm not in the habit of regularly comparing X and Y to A, I usually just compare X and Y and say that the best one is A. Well, this time, I got nine wrong in a row! It seems that listening to the ath shaped file makes the original sound bad :huh: ... I kept choosing downs_dath as the good one. Then I realised I should compare to A, and I eventually ended up with a 12/14, listening to the tambourine. The 'lowpass' problem was not as easy to pick up here though, or maybe I was just getting tired by this stage.

Well, this was not meant to be a dither test, but it's still interesting :)

For any sceptics, all I can say is that nobody knows who I am, so I don't really have anything to gain. If I was a well known sound engineer it would be different.

Anyway, I'm sure that more people can hear it than they realise. Maybe I will write an ABX guide...

Pio, have you tried kernel streaming? It was suggested to me because of waveout possibly not getting through windows untainted.
I'm quite envious of your setup... even my computer fan is really starting to irritate me now.
Pio2001
post Jan 9 2004, 12:22
Post #13


Moderator


Group: Super Moderator
Posts: 3936
Joined: 29-September 01
Member No.: 73



My soundcard doesn't support kernel streaming. But I checked long ago that the PCabx playback was bit perfect. Anyway it is bit perfect in Winamp with wave out.
Continuum
post Jan 9 2004, 14:59
Post #14





Group: Members
Posts: 473
Joined: 7-June 02
Member No.: 2244



OT:
QUOTE (Pio2001 @ Jan 9 2004, 12:31 AM)
...a very silent videoprojector...

Unbelievable! What is this prodigy of engineering?! :lol:
KikeG
post Jan 10 2004, 12:25
Post #15


WinABX developer


Group: Developer
Posts: 1578
Joined: 1-October 01
Member No.: 137



Very interesting...

listen, could you try one more test, with alternative processing algorithms?

Download this 1.5M file: http://www.kikeg.arrakis.es/various/lovely_test2.zip

Extract its contents to same folder of my previous test files, and execute (click) the 'generate3.bat' file. 3 new files will be generated:

- B6: lovely_lowpass2.wav: a different lowpass.

- B7: lovely_dith2.wav: a different bitdepth reduction type.

- B8: lovely_lowpass2_dith2.wav: lowpass and bitdepth reduction simultaneously.

Again, try to ABX them from the 'lovely_short.wav' original of my previous test files.

Now, could you please verify and notify us that RG and DSP (except maybe volume control) in foobar2000 are disabled? Also, do you use something to control output volume? If so, what is it? Foobar volume control, or Revo control panel? If so, what are the settings?

Thanks for the testing.
Garf
post Jan 10 2004, 12:47
Post #16


Server Admin


Group: Admin
Posts: 4884
Joined: 24-September 01
Member No.: 13



QUOTE (listen @ Jan 9 2004, 03:13 AM)
lovely_downs_dath:
I found this sample very confusing. I'm not in the habit of regularly comparing X and Y to A, I usually just compare X and Y and say that the best one is A. Well, this time, I got nine wrong in a row! It seems that listening to the ath shaped file makes the original sound bad :huh: ... I kept choosing downs_dath as the good one. Then I realised I should compare to A, and I eventually ended up with a 12/14, listening to the tambourine

How can you end up with 12/14 if you got the first nine wrong?

Please mention _ALL_ ABX results, not just the ones you like. Otherwise these tests are worthless.
listen
post Jan 11 2004, 00:57
Post #17





Group: Members
Posts: 56
Joined: 12-September 03
Member No.: 8809



Sure I will try the next batch, and double check foobar when I do. About volume, yes I use just the master volume on Revo control panel. It's set on about 3/4. Sensaura mode is not on, and the sample rate selector shows 96000.
Might be a bit longer this time, I'm trying the tests on other people too.. although a result from somewhere else would be better, just to show it's not my setup going wrong.


Garf,
the difference I hear is subtle, and most of the time I have to play around with the files for a while before I can rely on what I think I'm hearing. Once I figure it out, of course I use the reset button before trying for a good result. I don't see how a result of say 10 in a row is worthless in any context, assuming I haven't made hundreds (or thousands :o ) of attempts before it.
Continuum
post Jan 11 2004, 08:43
Post #18





Group: Members
Posts: 473
Joined: 7-June 02
Member No.: 2244



QUOTE (listen @ Jan 11 2004, 12:57 AM)
the difference I hear is subtle, and most of the time I have to play around with the files for a while before I can rely on what I think I'm hearing.  Once I figure it out, of course I use the reset button before trying for a good result.

If you do this every time, i.e. resetting before the true test starts, and never count the earlier trials, then it is no problem. But if you choose to reset based on your previous score, the results will lose some of their statistical significance.
Garf
post Jan 11 2004, 10:51
Post #19


Server Admin


Group: Admin
Posts: 4884
Joined: 24-September 01
Member No.: 13



QUOTE (Continuum @ Jan 11 2004, 09:43 AM)
If you do this everytime, i.e. resetting before the true test starts,

Another problem is determining when 'the true test' starts.

In any case, you should always count all trials. Even if you were just trying at first you will still get a significant result, provided you're really hearing a difference in the later trials. A score like 35/50 may not look as impressive as 10/10 but it's still significant!
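The point about long runs can be checked numerically: the score needed for significance grows much more slowly than the number of trials. A small sketch, assuming a one-sided binomial test and a 1% threshold (the threshold and the function names are illustrative choices):

```python
from math import comb

def tail(correct, trials):
    """Chance of at least `correct` right answers out of `trials` by guessing."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

def min_correct(trials, alpha=0.01):
    """Smallest score whose guessing probability falls below alpha."""
    for c in range(trials + 1):
        if tail(c, trials) < alpha:
            return c
    return None   # too few trials to ever get below alpha

for n in (10, 16, 25, 50):
    c = min_correct(n)
    print(f"{n} trials: need {c} correct ({100 * c / n:.0f}%)")
```

So a handful of early mistakes does little damage to a long run, provided the difference really is audible in the later trials.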

Throwing out results is a big no-no in a sensitive test like this, and can very easily flaw the results.

PS. If I read the comments above, it seems to me that you did do more tests and those didn't give significant results. You must mention this! If you do 6 tests and 1 comes back significant, the overall result isn't necessarily valid with the same degree of confidence!
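How much confidence is lost can be estimated by asking how likely it is that at least one of several sessions looks good by luck alone. A sketch under stated assumptions: independent sessions of 12 trials each, pure guessing throughout, and 10/12 counted as a "success" (these numbers are picked for illustration, not taken from anyone's actual log):

```python
from math import comb

def tail(correct, trials):
    """Chance of at least `correct` right answers out of `trials` by guessing."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

def chance_of_a_lucky_session(sessions, correct, trials):
    """Probability that at least one of `sessions` independent guessing runs
    reaches `correct`/`trials` or better."""
    p = tail(correct, trials)
    return 1 - (1 - p) ** sessions

print(f"one session:  {chance_of_a_lucky_session(1, 10, 12):.3f}")
print(f"six sessions: {chance_of_a_lucky_session(6, 10, 12):.3f}")
```

With six sessions the chance of at least one lucky 10/12 is several times higher than for a single session, which is why unreported sessions matter.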
Continuum
post Jan 11 2004, 11:35
Post #20





Group: Members
Posts: 473
Joined: 7-June 02
Member No.: 2244



QUOTE (Garf @ Jan 11 2004, 10:51 AM)
Another problem is determining when 'the true test' starts.

Doesn't matter, as long as he resets the counter before taking the test.
(...and there is only one true test :rolleyes:)
Garf
post Jan 11 2004, 14:02
Post #21


Server Admin


Group: Admin
Posts: 4884
Joined: 24-September 01
Member No.: 13



Some more stuff to test with:

http://sjeng.org/ftp/Orig.wv
http://sjeng.org/ftp/NoTrunc.wv
http://sjeng.org/ftp/Trunc16.wv

First one is the original, padded with 2 secs of silence to either edge (to prevent edge artifacts).

Second one is the original, resampled to 44.1, and then back to 96k, in full 32 bit float precision with my own resampling filter.

Third one is the original, resampled to 44.1, truncated to 16 bits, and then again upsampled to 96k at 24 bit precision.

The resampling filter should have better quality than SSRC slow mode. If you can ABX the first against the second, I don't know what the heck could be wrong :(
listen
post Jan 14 2004, 03:38
Post #22





Group: Members
Posts: 56
Joined: 12-September 03
Member No.: 8809



Just thought I should say that I'm leaving town tomorrow, and will be away from my computer for at least six weeks. I've been busy recently, and haven't had much chance for listening.. I did try your most recent files KikeG, but without success yet. I will give them some more time in March though. So far, I have no interesting results for lovely_dith2, and while lovely_lowpass2 is pointing my way a bit, it could just be luck. Still, I need to sit down and really concentrate before I discount them completely. Garf, I will try yours in March too.

So far, I've also tried the test on about 10 people, who were unable to hear anything. But the other day, one of my friends got a 9/12 on lovely_16bit_dflat. That's not a great result, but it was the only result, and also the first time she had listened to it. It took more than an hour, so for anyone still trying it, don't just start guessing and give up after 10 minutes...

I think the most important thing to remember when testing these files is that the same file will gradually sound different as you listen to it more and more. So you can't just recognise a certain problem straight away every time you listen. A good way to test is to decide on which file (X or Y) you think is better, and compare it to the other. Listen to it a few times in a row and then switch to the other one. Take note of how much worse the second one was. Then repeat it a few times. After this, swap the files over.. that is, decide that actually the other file is the good one. Then repeat the process. After that, you might want to swap back again. How long it takes probably depends on a lot of things, but eventually you will notice that one of them takes a bigger boost from your imagination than the other. Or you might notice that one of them seems reluctant to be the bad one. You also should listen to A occasionally, to keep your perspective right. Once you begin to notice some consistency in all the little hints that you pick up from this process there is a very good chance that you will get it right.

Hey tigre, where does this sample come from? I'm really getting into it now, and I think I'll buy the DVD(?) if it's available.
listen
post Mar 31 2004, 04:57
Post #23





Group: Members
Posts: 56
Joined: 12-September 03
Member No.: 8809



I got motivated by a thread I saw the other day... and it's still March, just.. :)

I tested lovely_short against lovely_lowpass2 last night. I had a single attempt, listening to the last percussion sound in the file. 11/12. I also checked it out with a spectrogram, and was surprised to see frequencies represented right up to 29KHz! Since I can't hear a lone sine-wave even at 18KHz, I would say this might suggest there is more to hearing than we think..

Still no results for lovely_dith2.
I'm very busy this year, but if dith2 is important I can spend some more time with it... it seems more worthwhile testing files that I do get results for though.
tigre
post Mar 31 2004, 08:39
Post #24


Moderator


Group: Members
Posts: 1434
Joined: 26-November 02
Member No.: 3890



listen, thanks for still spending time on this.

To answer your question from January: The samples are from this Chesky DVD.

IIRC you weren't able to ABX 24/96 vs. 16/96 so far (I've read through the thread again, but maybe I've missed it), so it would be interesting to perform some how-high-can-you-hear test. If you don't have the necessary software, I can create some samples for this if you want.
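Samples for such a test could be generated along these lines. This is only a sketch: the frequency ladder, the -20 dB level, the fade length and the function name are all assumptions, and the 16-bit output is written without dither to keep the example short. The fades matter, because a tone that starts abruptly produces a broadband click that is audible even when the tone itself is not:

```python
import io
import math
import struct
import wave

def sine_tone_wav(freq_hz, sample_rate=96000, seconds=2.0, level_db=-20.0, fade_s=0.05):
    """Return a mono 16-bit WAV (as bytes) holding one sine tone with fades."""
    n_total = int(sample_rate * seconds)
    n_fade = max(1, int(sample_rate * fade_s))
    amp = 32767 * 10 ** (level_db / 20)
    samples = []
    for n in range(n_total):
        # Linear fade in and out so the tone starts and stops without a click.
        env = min(1.0, n / n_fade, (n_total - 1 - n) / n_fade)
        samples.append(round(amp * env * math.sin(2 * math.pi * freq_hz * n / sample_rate)))
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(sample_rate)
        w.writeframes(struct.pack(f"<{n_total}h", *samples))
    return buf.getvalue()

# Hypothetical ladder of test frequencies, half a second each, kept in memory.
tones = {freq: sine_tone_wav(freq, seconds=0.5) for freq in (14000, 16000, 18000, 20000, 22000)}
for freq, data in tones.items():
    print(freq, "Hz:", len(data), "bytes")
```

Each tone can then be ABXed against silence of the same length at the usual listening volume.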


tigre
post Mar 31 2004, 08:47
Post #25


Moderator


Group: Members
Posts: 1434
Joined: 26-November 02
Member No.: 3890



QUOTE (listen @ Mar 31 2004, 05:57 AM)
I had a single attempt, listening to the last percussion sound in the file.  11/12.

I just noticed this. I hope this doesn't mean that you've performed multiple ABX sessions before and only reported successful results. In this case you need to add all results (e.g. 7/12 + 11/12 + 8/12 = 26/36) and the p-value must be calculated from the total score. If such "cherry-picking" is involved the p-value of the "successful" attempt is not statistically valid.
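The pooling described here is simple to do by hand or in a few lines of code. A minimal sketch using the example scores from this post (they are an illustration, not listen's actual results), with a one-sided binomial test and a made-up function name:

```python
from math import comb

def abx_p_value(correct, trials):
    """Chance of scoring at least `correct` out of `trials` by pure guessing."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

sessions = [(7, 12), (11, 12), (8, 12)]          # example scores: 7/12, 11/12, 8/12
total_correct = sum(c for c, _ in sessions)
total_trials = sum(n for _, n in sessions)

print(f"best session alone: 11/12, p = {abx_p_value(11, 12):.4f}")
print(f"pooled: {total_correct}/{total_trials}, p = {abx_p_value(total_correct, total_trials):.4f}")
```

In this example the pooled p-value is larger than that of the best session taken alone, which is the effect of including the weaker sessions.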

