16bit vs 24bit, Rubbish or Truth?
AndyH-ha
post Jul 7 2008, 22:31
Post #51





There is a lot of (mostly one-sided) talk about raw 24 bit vs raw 16 bit, recorded at a fairly low level, normalized, then compared. There is a simple explanation as to why one might hear differences, one that is very easy to demonstrate “in the laboratory”: quantization distortion.

When the background noise is low enough (e.g. you are not so likely to notice this in a live recording of a rock concert as in a studio recording of a single acoustic instrument), one transform on 16 bit data (e.g. amplification) -- done without dither -- creates enough distortion to be readily noticeable on low level signals. The wrong, non-noise shaped dither can eliminate the distortion but can itself be audible. The cure is properly noise shaped dither - basic audio processing 101. If you have not assured that, you don’t have an argument.
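The effect described here is easy to demonstrate. Below is a minimal sketch (plain Python, standard library only, illustrative rather than any particular product's dither): quantizing a signal that sits a fraction of an LSB above zero erases it entirely without dither, while TPDF dither converts the same quantization error into benign noise whose average preserves the signal.

```python
import random

def quantize(x, dither=False):
    """Round x (expressed in LSB units) to the nearest code.
    With dither=True, TPDF dither (the sum of two uniform
    [-0.5, 0.5) values) is added before rounding."""
    if dither:
        x = x + random.uniform(-0.5, 0.5) + random.uniform(-0.5, 0.5)
    return round(x)

random.seed(0)
level = 0.3                 # a signal sitting 0.3 LSB above zero
n = 200_000

undithered = [quantize(level) for _ in range(n)]
dithered = [quantize(level, dither=True) for _ in range(n)]

print(sum(undithered) / n)  # 0.0 -- the low-level signal is simply erased
print(sum(dithered) / n)    # ~0.3 -- preserved, at the cost of added noise
```

The undithered quantizer is deterministic, so its error is correlated with the signal and shows up as distortion; the dithered quantizer is unbiased on average, which is why properly dithered low-level detail survives as noise rather than distortion.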
euphonic
post Jul 7 2008, 23:56
Post #52





QUOTE (ccryder @ Jul 7 2008, 00:13) *
#2, take 2 recordings, both recorded conservatively so as not to risk ever clipping or hitting 0dBFS for more than 1 sample. Say, -24dBFS is the highest peak on a simultaneously recorded 24-bit and 16-bit recording. I don't care how much dither and noise shaping you add, you're not going to end up with more than 12 significant bits on a 16-bit recording, but with good quality analog components & A/D, you will still end up with 16 significant bits of audio on the 24-bit version. The difference in listening back is effectively the difference between a 12-bit recording and a 16-bit recording (noise floors notwithstanding).


This 12- vs 16-bit comparison is valid only if the listener deliberately jacks the volume way up during quiet bits and turns it down during peaks, i.e. an artificial scenario. Assuming the volume is kept constant throughout the recording, the total resolution is still fully 16-bit. I don't think 16/44.1 can be deemed insufficient based on such manipulation of playback done for the sake of hunting for artifacts.
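The 12-bit figure both posters are arguing over is simple arithmetic: each bit of resolution is worth about 6.02 dB, so headroom left unused subtracts bits. A quick illustrative sketch (plain Python):

```python
import math

def effective_bits(nominal_bits: int, peak_dbfs: float) -> float:
    """Bits actually exercised when the peak sits peak_dbfs below
    full scale; each ~6.02 dB of unused headroom costs one bit."""
    db_per_bit = 20 * math.log10(2)          # ~6.02 dB per bit
    return nominal_bits - (-peak_dbfs) / db_per_bit

print(round(effective_bits(16, -24), 1))   # ~12.0, ccryder's figure
print(round(effective_bits(24, -24), 1))   # ~20.0 before analog noise limits
```

Whether those unused bits matter at a fixed playback level is exactly the disagreement in this exchange; the arithmetic itself is not in dispute.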

This post has been edited by euphonic: Jul 7 2008, 23:57
ccryder
post Jul 8 2008, 05:31
Post #53





OK, I'm gonna try once more to get some of you to understand what I'm talking about.
If you don't get it after this, I would venture to say you may never get it, you don't want to get it, or you just don't like my explanation.

The overriding concept seems quite simple, and would appear to me to render the whole argument about audible differences moot.

AndyH-ha implied that any perceivable differences between 24-bit and 16-bit audio should be resolved with dither and noise shaping, and that if I'm hearing differences, it's because I didn't add "properly applied noise shaping." (I love that phrase, "properly applied"... it says so much, as if there's an improper way to apply it, other than to apply it twice or to perform additional post-processing afterward. Anyway....)

Let's start with an undithered 16-bit quantization from a high-grade 24-bit source. (Sample rate is irrelevant for the topic of discussion, as long as it's 44.1kHz or better.) Most wouldn't argue against the idea that this quantization noise is audible, and would therefore suggest that dither be added to reduce the perception of quantization noise. However, dither itself adds some amount of broad spectrum noise to the signal and raises the noise floor. Not much argument there. So now we have 2 kinds of noise that can possibly appear on a 16-bit recording, and I don't hear anyone debating whether or not either of these noises is detectable. It would appear universally accepted that indeed these noises can be heard... maybe not by everyone, maybe only by people with the right playback equipment and a trained ear... whatever. Nobody argues that, and in fact, many here seem comfortable implying ad infinitum that the panacea, end-all-be-all solution to this is noise shaping.
If only it were that simple.

So now we've established that there is a necessity for noise shaping on a dithered signal, because without it, noise is more easily detectable. Great. Yes, I can hear that noise, and so can many others. Bring on the noise shaping! Oh, but wait... ummmm... which noise shaping algorithm should I use? Hmmmm. Geez, there are so many. Surely they're not all the same, or there wouldn't be so many. What are the differences between them all?

Some noise shaping is available in hardware, and some in software. All of the manufacturers of noise shaping products/algorithms claim increased perceived dynamic range, to varying degrees. Many of the manufacturers make available more than one algorithm, with different perceived dynamic range claims... Well, why wouldn't I want to just use "the best one"? We can make the incorrect layman's assumption that the best one is the one that gives us the greatest perceived dynamic range, based purely on the manufacturers' claims. Or you can ask an audio engineer how they choose the "best one."

(side note: Amazing how many will bash audio companies and producers of high end audio equipment, labeling them mere opportunist marketeers who use vague language to push unnecessary products on unsuspecting customers... But somehow these same people are more open to blindly accepting the noise shaping manufacturer's claims of increased perceived dynamic range numbers as gospel.... perhaps next someone will try to invalidate that statement by trying to convince me now that every noise shaping algorithm ever invented is just some pure audiophile marketing bull, and that it's not necessary... but I digress.)

Anyway... So we're starting from a well established point where it has been determined and widely accepted that quantization noise without dither can in fact be heard. Then you take the next step and consider that it is widely accepted that the best solution to resolving the quantization noise issue is to apply dither. OK. Fine. Now, I don't hear anyone making the argument that this dither noise cannot be heard. If anyone were to make that argument, they would be arguing against every noise shaping algorithm producer out there, and it is generally accepted that noise shaping has much value in this context. And of course, we can see that AndyHA fully believes in the power of noise shaping to reduce audibility of the added dither noise. And so do I. Note that I said *reduce* the audibility, and not eliminate.

Different noise shaping algorithms attempt either to shift noise to specific fixed frequency bands where the ear is less sensitive, according to a fixed curve, or to dynamically modify where the noise is shifted based upon the signal content at any point in time, in an attempt to artfully allow the noise to be masked. I believe Apogee makes this "dynamic noise shaping curve" claim with their UV22 product.

In any case, the choice of a noise shaping algorithm is a personal one. Why? Because the different algorithms have different curves, applying different amounts of "energy" to different frequency bands, and ultimately affecting different people with different hearing sensitivity curves. Simply put, just about every noise shaping algorithm is considered to impart a certain coloration to the signal. (Of course there are manufacturers that think they've come up with the best, non-coloring algorithm, and can't wait to tell you about it.) Some might shift the noise to the 10kHz area, some to the 12kHz area; some put a "clump" of noise around 4kHz and another clump at 18kHz. Some attempt to push all of the noise toward the Nyquist limit, but do so with a smooth slope starting at 18kHz, increasing the level of noise in higher and higher frequency bands. None of these numbers is specific to any one algorithm; they are just examples of how one noise shaping algorithm might differ from another.

So as it turns out, the choice of noise shaping is somewhat of a compromise. If we have more than one algorithm available to us, we choose whichever one we feel gives the most pleasing results on a specific recording and specific type of music. Some people may find they like the sound of an algorithm that pushes all of the noise to higher frequencies on a piece of music that already has a lot of very significant high frequency content, because they like the particular coloration the noise adds. Others might prefer that the noise be located somewhere else in the spectrum, where it would be better masked by more dominant frequency content. The choice of algorithm is a personal one, and there can be many combinations of algorithms and signal types. Some manufacturers will indicate that algorithm X is probably best for quiet classical music, and Y is best for rock and roll. Some will say that Z is best if you ever plan to run any more post processing on the signal (something I'd never do anyway).

Needless to say, all of this implies that the application of different noise shaping algorithms will impart varying degrees of coloration (or that a given individual, i.e. *not everyone*, might not perceive the coloration at all). It is an artful dance the engineer must perform to choose which one he thinks makes the music sound the best. You have hearing curves to consider, the music has a particular EQ, and you try to match the noise shaping algorithm to strike a balance (compromise) between unpleasant coloration and unpleasant noise for the average Joe.
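None of the commercial algorithms named above are public, but the underlying mechanism can be sketched. The illustrative Python below implements a generic first-order error-feedback requantizer (TPDF dither plus feedback of the previous quantization error), which pushes quantization noise toward Nyquist; real products use higher-order, psychoacoustically weighted filters, which is exactly where the "curve" differences discussed here come from.

```python
import math
import random

def requantize(samples, drop_bits):
    """Requantize to a coarser grid (step = 2**drop_bits old LSBs)
    with TPDF dither and first-order error feedback: the previous
    quantization error is subtracted from the next input, shaping
    the noise toward high frequencies. A generic sketch, not any
    commercial algorithm."""
    step = 1 << drop_bits
    err = 0.0
    out = []
    for s in samples:
        x = s - err                                   # error feedback
        d = (random.uniform(-0.5, 0.5) + random.uniform(-0.5, 0.5)) * step
        q = round((x + d) / step) * step              # coarse quantize
        err = q - x                                   # remember the error
        out.append(q)
    return out

random.seed(1)
tone = [round(2000 * math.sin(2 * math.pi * k / 64)) for k in range(512)]
shaped = requantize(tone, 8)                # drop 8 bits of resolution
print(all(q % 256 == 0 for q in shaped))    # True: output is on the coarse grid
```

Changing the feedback filter from this simple one-tap form to a longer weighted filter is what moves the noise "clumps" to different frequency bands.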

Great. So we know we can hear differences between dithered and non-dithered audio. We know that we can hear differences between noise shaped audio and non-noise shaped audio. And we know that a trained ear can hear different colorations of different noise shaping algorithms.

So given all of that, why is it that people think you can't hear the difference between a non-colored raw 24bit recording, and a dithered and/or noise shaped (colored) 16bit rendering from that same 24bit source?

All you folks championing the seemingly limitless merits of noise shaping, in a very generic sense, as the "difference eliminator" are way off base. Nobody mentions the type of music, the type of dither, the type of noise shaping, or the type of listener. They just love to say "I don't believe anyone can hear a difference." Statements like that are just plain hogwash. This is pure logical deduction. If you (or *anyone else, if not you*) claim to be able to hear the difference between one noise shaping (NS for short) algorithm and another, then it follows that you can hear the difference between NS audio and non-NS audio, and between dithered-only and non-dithered audio; that pretty much covers the full spectrum of discernibility. That's it. Period. No need to belabor the whole 16-bit vs 24-bit difference argument. The differences cannot be boiled down to any one thing for everyone, but for many, there are audible differences. Accept it, and get over it. Whether or not I (or anyone else) am more perceptive to these types of noise and coloration is not for anyone to argue. If you want to talk about specific combinations of recordings, levels, listeners, technology, and experience, then perhaps there is a basis for discussion and specific analysis of perceptibility for that combination only.

So, as I did many years ago, I performed another test myself tonight. Yes, it was a blind test... or blind enough, anyway (can't wait for the reaction I get for this one, but don't expect a response from me). Very simple. I took a 24-bit/48kHz recording I recently made of a friend's classical guitar recital. It is a very dynamic recording, with signals ranging from "totally in your face" and yet not clipping, to extremely quiet (sneezing, breathing, sniffling, coughing, squeaky-chaired audience notwithstanding). The kind of recording where you can hear people whispering from the back of the room, intelligibly, if you turn up the amp.

It was recorded with a stereo pair of Schoeps CMC64V, ORTF, onstage, about 4 feet from the performer, 2 feet off the floor. A Grace Lunatec V2 preamp (130dB dynamic range, per specs), Benchmark AD2402-96 A/D converter (118dB dynamic range, per specs), and some expensive (I'll make no claims of them being particularly mindblowing at this time) mic and signal cables were employed. I played it back in Winamp with no altering plugins or processing of any kind, which fed the signal to a Soundscape Mixtreme 24/96 PCI card on a known bit-transparent audio workstation, with the internal sample rate set at 48kHz to match the 24/48 recording. The Soundscape card is particularly cool because it allows me to insert live VST plugins into the signal chain. So I chose to insert the old standby plugin: Waves L2 Ultramaximizer. I disabled all processing in the plugin other than the dither section. No gain changes, no limiting, no ARC... just 3 settings: bit depth, dither algorithm (Type 1 and Type 2), and noise shaping (None, Moderate, Normal, and Ultra). There is an A/B button in the plugin that allows one to set up 2 different configs. I set "A" to 24-bit, no dither, no noise shaping. I first set "B" to 16-bit, Type 1, Moderate. I started the music playing, put on my Sennheiser HD650 headphones, plugged them into my Benchmark DAC1 D/A, positioned my mouse over the A/B button, closed my eyes, and clicked the mouse over the button some random number of times in rapid succession, with the specific intention of losing track of which one I was on: A or B.

My goal? 3 goals, which many, it seems, would like to confuse. #1, can I hear a difference? #2, can I pick out the 24-bit version consistently? And #3, can I pick the best sounding one consistently? Truth be told, I didn't start out with all three goals in mind. #2 implies #1, so initially #2 was the goal. #3 was something I fell into and had to train my ear more to discern... but that test was not quite solid, for reasons that can be inferred from the rest of the test results below.

Anyway, so there I was clicking the A/B button, over and over again, not knowing which was which, with my eyes closed, trying to pick out the 24-bit recording from the 16-bit with Type 1 dither and Moderate noise shaping. If I heard something in the music that gave me the impression that whichever setting I was on was definitely the 24-bit setting, I opened my eyes, wrote down which setting I was on, and started the process over again, clicking the button with eyes closed until I could effectively randomize which setting I was starting on. I flipped back and forth many times while I listened. Sometimes it took 20 A/B flips or more in rapid succession, and the right music passage combining in time, to give me the immediate "aha!" moment where I thought I was sure which one was the 24-bit material. I did this whole round 20 times for each possible noise shaping algorithm, which I thought would be a decent sampling of attempts to start with. On that first test, I properly selected the 24-bit recording 18 out of 20 times. The next test was Normal noise shaping, Type 1 dither, and the results were 19 out of 20 correct. My ear was also becoming trained in hearing the differences. The 3rd test was Type 1, Ultra noise shaping, and the results were 14 out of 20 correct. The Ultra test was definitely harder, and when I asked myself why, I discovered that I just had a personal preference for the way that particular coloration enhanced my perception of what was better sounding for this recording, and my brain more often made the erroneous assumption that the better sounding one had to be 24-bit. Being familiar with the curve the Ultra algorithm uses, I basically trained myself to recognize the increase in high frequency energy that is characteristic of that algorithm, went back and did the 3rd test again, and the results improved to match the other tests at 18 out of 20 correct.

When I did the same tests with Type 2 dither, the time it took to discern the differences continued to go down, and my accuracy remained the same or better. I also did the test with no noise shaping, and it was a no-brainer. I could too easily tell that there was dither noise every time the levels died down, and found it frustrating that it was this noise that continued to tip me off before other factors.

Is my test exactly the same as a double blind ABX test? Not quite. I didn't have the ability to make it so that the A's and B's were randomly not actually switched in my test. Considering that I switched back and forth with closed eyes until I knew for sure which was which, I feel that the methodology well served the goal of answering the question "can I hear the difference?" I also didn't have the ability to mask the results from myself, since I performed the test alone and had to write down the results as I went along. I don't believe that to be a significant problem, since each next sample was otherwise an independent trial, and there was something to be learned from that as well.
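One thing the methodology discussion leaves out is whether scores like 18/20 could plausibly be luck. Under the usual null hypothesis of coin-flip guessing, the one-sided binomial probability is easy to compute (plain Python, standard library only):

```python
from math import comb

def binomial_p(successes: int, trials: int) -> float:
    """One-sided p-value: the chance of doing at least this well
    by pure guessing (p = 0.5 per trial)."""
    return sum(comb(trials, k) for k in range(successes, trials + 1)) / 2 ** trials

print(binomial_p(18, 20))   # ~0.0002: very unlikely to be guessing
print(binomial_p(14, 20))   # ~0.058: the 'Ultra' round is borderline
```

This is the same statistic ABX tools report; it does not address the sighted-scoring and ear-training caveats discussed in the thread, only the guessing baseline.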

What did *I* learn from this? I learned that my assertion that the ear can be trained to hear differences is valid, and that there is something to be said for allowing time for the ear to be trained. If the ear could not be trained, over any length of time or number of trials, to achieve statistically significant results, then it might prove all of these assumptions about indiscernibility true. However, this was not the case. Further, I learned that what one uses to discern one from another may vary from passage to passage, person to person, and configuration to configuration. Sometimes it was the perceived depth of the soundfield due to seemingly more persistent reverberations, sometimes it was specific colorations I detected in certain frequency bands, and sometimes it was the ability to perceive the noise. It was not the same thing each time that tipped me off, but each time I was convinced I was listening to the 24-bit recording, for some reason or by some method on some passing passage, I looked up and saw that I was correct often enough that it wasn't a fluke. The times I wasn't correct, I can only attribute to the specific combination of signal content, level, and coloration that made me think it sounded better in the moment. Finally, I learned that one second it might not be discernible, and in the next second it is.

The final test, #3, was only slightly necessary for me, as it was an afterthought... can I pick out the better sounding one consistently? In all cases except with the Ultra shaping, I again chose the 24-bit recording with regularity. For whatever reason, the Ultra was adding something akin to an EQ that was making it particularly pleasing for that recording, and I continued to be pulled towards that 16-bit configuration for it. I know from previous experience on other recordings I've mastered that the Ultra algorithm adds too much coloration in high frequencies to be pleasing enough for me to call it better for those recordings (not apparently so in this one).

Bottom line for me:
24-bit vs 16-bit, no dither = easily discernible; most agree
24-bit vs 16-bit with dither, no NS = easily discernible based purely on perceptibility of the dither noise alone; most agree
24-bit vs 16-bit with dither and varying NS algorithms = generally discernible, for many reasons that varied from moment to moment, with notable improvement of accuracy over time due to unpreventable ear training. And yet, for some seriously odd, stubborn reason, many continue to be in denial of this.

If you can hear differences in NS algorithms for a given audio source, then you can hear the difference between NS'ed and non-NS'ed audio.

Find flaws with my tests? Disagree with the premise entirely? Think what I wrote is too damn redundant (no doubt it is at times)? Then do your own tests, on your own 24-bit recordings, with your own high-grade equipment, make an effort to train your ear to hear the different noises, artifacts, colorations (whatever you want to call them), and then let's talk turkey. It's just a silly, baseless debate otherwise, typically accompanied by a lack of disclosure of all of the variables that truly matter and a lack of significant hands-on experience with 24-bit audio, and completely invalidated by the simple premise that audio engineers every day choose NS algorithms by sampling with their ears which ones "sound the best" for a particular recording.

-DH

This post has been edited by ccryder: Jul 8 2008, 07:58
ccryder
post Jul 8 2008, 05:46
Post #54





QUOTE (euphonic @ Jul 7 2008, 17:56) *
This 12- vs 16-bit comparison is valid only if the listener deliberately jacks the volume way up during quiet bits and turns it down during peaks, i.e. an artificial scenario. Assuming the volume is kept constant throughout the recording, the total resolution is still fully 16-bit. I don't think 16/44.1 can be deemed insufficient based on such manipulation of playback done for the sake of hunting for artifacts.


Really? And what if *all* of the bits are "quiet bits"? What if 98% of the time the recorded signal never peaks above -48dBFS? And that's just peak we're talking about there. What if 99.9% of the time, peak RMS levels never reach above -60dBFS? I ask you... how many significant bits is 99.9% of your audio getting in the 16-bit domain on that recording? If you've never been asked to record extremely dynamic material, I can understand why you might think that. But you're wrong, because a 16-bit medium used to capture a signal that never peaks beyond -48dBFS will never have less than 8 zeros for MSBs, and never more than 8 significant bits overall, unless post processing is performed to normalize it, or it is otherwise processed in a way that creates a mantissa based upon those 8 significant bits. Dither it all you want, noise shape it all you want, process it all you want: you ain't gonna get blood from a stone, and you can't polish a turd.

This post has been edited by ccryder: Jul 8 2008, 05:47
ccryder
post Jul 8 2008, 06:27
Post #55





The actual realized signal-to-noise ratio of the recording is the fundamental concept there, not the available dynamic range of the medium. These are two very different things, often confused.
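The distinction can be put in numbers. A quick sketch with hypothetical levels (the -6 dBFS peak and -70 dBFS noise floor below are invented figures for illustration, not measurements from any recording in this thread):

```python
import math

def db(ratio: float) -> float:
    """Amplitude ratio expressed in decibels."""
    return 20 * math.log10(ratio)

# What the medium offers:
medium_range_db = db(2 ** 16)            # ~96 dB for 16-bit
# What this hypothetical recording actually realizes
# (room + mic + preamp noise sets the real floor):
peak_dbfs = -6
noise_floor_dbfs = -70
realized_snr_db = peak_dbfs - noise_floor_dbfs

print(round(medium_range_db), realized_snr_db)   # 96 64
```

With numbers like these, the acoustic noise floor, not the quantizer, is the limiting factor, which is the confusion being pointed out.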

This post has been edited by ccryder: Jul 8 2008, 06:29
Nick.C
post Jul 8 2008, 07:54
Post #56


lossyWAV Developer



QUOTE (ccryder @ Jul 8 2008, 05:46) *
Really? And what if *all* of the "bits are quiet bits"? What if 98% of the time the recorded signal never peaks above -48dBFS? [...] how many significant bits is 99.9% of your audio getting in the 16-bit domain on that recording? [...]
Why would you be in the situation where you are only using the lowest 8 bits of a 16 bit sampler to sample a signal that you actually wanted (or the lower 16 bits of a 24 bit sampler)? It seems a bit contrived that 99.9% of the time the signal does not exceed -60dBFS. Why not just increase the gain pre-sampling? What you seem to be trying to do is use a 16 bit sampler as an 8 bit sampler and then imply that 16 bit is bad, when the sampling situation causing the 8 bit effective sampling is seemingly artificial.


knutinh
post Jul 8 2008, 09:27
Post #57





QUOTE (ccryder @ Jul 8 2008, 06:46) *
Really? And what if *all* of the "bits are quiet bits"? What if 98% of the time the recorded signal never peaks above -48dBFS? [...] Dither it all you want, noise shape it all you want, process it all you want you ain't gonna get blood from a stone, and you can't polish a turd.

One has to assume that during recording or mastering, a sane engineer would try to make use of all available bits in peak passages. I put a stress on "try", because musicians can seldom be instructed to produce a given peak sound pressure, so some headroom is needed in the recording process.

For the remaining post, I will focus on playback technology, since the art of music production commonly involves toys that can expose any technical flaw in the recording equipment, and we have only to assume that any improvement (even normally non-distinguishable) could be a benefit for a recording engineer playing with the latest pro-tools plugins.

The flip side of this argument is that if you don't assume that the audio engineer makes use of all bits as much as possible, then clearly any resolution is too coarse. 128 bits of as-yet unrealized DAC technology will produce obviously flawed signals if only the 4 LSBs are ever used.


Now, if we assume that there is a peak passage in the music that is close to 0dBFS, we can do some thinking. What is the pain threshold of humans? What is the loudest static playback-gain that can ever be applied to a recording, given that a part of it is 0dBFS, without causing listener discomfort or even damaged hearing? Then, using this playback gain, you may very well assume that peaks occur for 0.01% of the time and that the rest of the disk contains real low levels (even though there are practical limitations to that - how much music has been composed where there is a 120dB level difference between parts?)

So the exercise of estimating the necessary number of bits can be reduced to finding:
1) The dynamic range of human hearing (hearing threshold vs uncomfortable/dangerous/painful levels)
2) The number of bits that, when dithered using a given algorithm, produces a signal error*) that is non-detectable

Of course, you may choose a dithering algorithm that is sub-optimal, or even no dithering. But the question that you are answering then isn't "what number of bits is necessary", but "what number of bits is necessary if audio engineers are stupid". Since there are no limits to human stupidity, your answer is already given :-)

In addition, most recording and playback rooms have no chance of giving you noise levels even that low, but that's another story.

-k
*)Quantization noise, dithering noise, and all other error forms
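The two-step exercise above reduces to one division: the dynamic range to be covered over the ~6.02 dB each bit buys (ignoring the extra margin that dither and noise shaping can win back). An illustrative sketch:

```python
import math

DB_PER_BIT = 20 * math.log10(2)      # ~6.02 dB per bit

def bits_for_range(dynamic_range_db: float) -> float:
    """Bits needed for quantization noise to sit at the bottom of
    the given dynamic range (ignores the extra audible margin that
    noise-shaped dither buys back)."""
    return dynamic_range_db / DB_PER_BIT

# Hypothetical span: hearing threshold (0 dB SPL) up to the
# threshold of pain (~120 dB SPL):
print(math.ceil(bits_for_range(120)))   # 20
```

The 120 dB figure is a commonly cited round number for the threshold-to-pain span, used here only to show the shape of the calculation; with noise-shaped dither the perceptual answer comes out lower.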

QUOTE (ccryder @ Jul 8 2008, 06:46) *
Dither it all you want, noise shape it all you want, process it all you want you ain't gonna get blood from a stone, and you can't polish a turd.

Dither can be used to improve the "perceived number of bits" when decreasing resolution.

Dither cannot be used to improve a signal that was recorded using too few bits in the first place, such as your example. There is no knowledge of the information (it is lost), and therefore it cannot be baked into the output as high-frequency noise either.

-k

QUOTE (ccryder @ Jul 8 2008, 06:31) *
Bottom line for me:
24-bit vs 16 bit no dither = easily discernible, most agree
24-bit vs 16 bit with dither, no NS = easily discernible based purely on perceptibility of dither noise alone, most agree
24-bit vs 16-bit with dither and varying algorithms of NS = generally discernible for many reasons that varied from moment to moment, with notable improvement of accuracy over time due to unpreventable ear training. And yet, for some seriously odd stubborn reason, many continue to be in denial of this.

If your can hear differences in NS algorithms for a given audio source, then you can hear the difference between NS'ed and non-NS'ed audio.

In a recent test published in the JAES, listeners could not distinguish between SACD, DVD-A, and the same signal degraded by a CD recorder using non-dithered 16-bit/44.1kHz, in the normal usage scenario.

However, when levels were cranked up to the limit of pain, and the source was silent, listeners reportedly could hear the elevated noise levels.
QUOTE
completely invalidated by the simple premise that audio engineers every day choose NS algorithms by sampling with their ears which ones "sound the best" for a particular recording.

Some sound engineers choose expensive power cables because they give their sound "more 3-dimensionality" as well. Clearly, such engineers cannot be used as sources of scientific knowledge until testing methods that take the human mind into consideration are used.

I am amazed at the lack of correlation between:
A)The ability to make recordings that sound very good
B)The ability to analyze technology from experience and sighted listening

It seems to me that many recording engineers are able to use their equipment intuitively to make good recordings, but often for very different reasons than what they believe themselves. Using them as witnesses in a scientific debate therefore has limited value, unless their (probably) above-average hearing is put to use in a blind-test.

-k

This post has been edited by knutinh: Jul 8 2008, 09:31
SebastianG
post Jul 8 2008, 09:49
Post #58





QUOTE
Simply put, just about every noise shaping algorithm is considered to impart a certain coloration to the signal.

Just so we're clear: Noise shaping only acts as a filter on the dithering+quantization noise. I'm sure you were only talking about the effect of audible dither/quantization noise.

QUOTE
So given all of that, why is it that people think you can't hear the difference between a non-colored raw 24bit recording, and a dithered and/or noise shaped (colored) 16bit rendering from that same 24bit source?

I guess they're assuming that a sane person wouldn't create 16-bit signals which stay below -30dBFS, forcing the listener to crank up the volume so much to be able to hear something. I don't know if 16 bit -- as a final delivery format -- is good enough for everyone, and I'm not saying that it is or isn't. But I certainly don't have a problem with it. I checked that I could go as low as 12 (*) bits at 44 kHz with "proper dithering and noise shaping" without hearing any of the noise at "normal" listening levels during quiet parts. Read "proper" as dither being halfway between rectangular and triangular + using this noise shaping filter.

edit: I just tested it again: If I use a playback level that's almost uncomfortable, I can't differentiate between silence (all zeros) and dithered silence at 13 bits (triangular dither, noise shaper from above; this is not all zeros due to the triangular dither). So I guess 16 bit offers me about 3 bits of headroom.
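The kind of requantization being discussed is easy to sketch. Below is a minimal illustration in Python (my choice of language, not SebastianG's actual tool) of plain triangular-PDF (TPDF) dithered quantization; his test additionally used a noise shaping filter, which is omitted here:

```python
import random

def tpdf_dither_quantize(samples, bits):
    """Requantize float samples (full scale +/-1.0) to `bits` bits using
    triangular-PDF dither, 2 LSB peak-to-peak: the sum of two independent
    rectangular draws.  This decouples the first two moments of the error
    from the signal, trading quantization distortion for a small,
    constant noise floor."""
    q = 1.0 / (1 << (bits - 1))                       # one LSB
    out = []
    for x in samples:
        d = (random.random() - random.random()) * q   # TPDF in (-q, +q)
        out.append(round((x + d) / q) * q)            # snap to the grid
    return out
```

Dithered silence, `tpdf_dither_quantize([0.0] * n, 16)`, yields samples at most one LSB either side of zero — the constant noise floor being discussed.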

Cheers,
SG

This post has been edited by SebastianG: Jul 8 2008, 20:14
MLXXX
post Jul 8 2008, 13:50
Post #59





Group: Members
Posts: 186
Joined: 25-February 08
From: Australia
Member No.: 51585



QUOTE (ccryder @ Jul 8 2008, 14:31) *
Bottom line for me:
24-bit vs 16 bit no dither = easily discernible, most agree
24-bit vs 16 bit with dither, no NS = easily discernible based purely on perceptibility of dither noise alone, most agree
24-bit vs 16-bit with dither and varying algorithms of NS = generally discernible for many reasons that varied from moment to moment, with notable improvement of accuracy over time due to unpreventable ear training. And yet, for some seriously odd stubborn reason, many continue to be in denial of this.

The "denial" is based on the fact that no-one has uploaded a test sample to this forum demonstrating that a 24-bit version can be distinguished from a properly noise-shaped dithered 16-bit mixdown, assuming that the 24bit version is not at an abnormally quiet level for distribution to consumers. AndyH-ha laid down the challenge quite some time ago. SebastianG has suggested a methodology for the dither in his post just above. [ABX software abounds, e.g. a full free download of foobar.]

The empirical evidence to date indicates that the "colour" of the dither is only audible at unrealistically high levels of playback gain, such levels being uncomfortable for listening to real music at normally mastered levels.

Surprising though this is, with the current clamour for 24-bit lossless sound streams for Blu-ray discs!

This post has been edited by MLXXX: Jul 9 2008, 09:26
krabapple
post Jul 8 2008, 17:29
Post #60





Group: Members
Posts: 2422
Joined: 18-December 03
Member No.: 10538



QUOTE (ccryder @ Jul 8 2008, 00:31) *
Is my test the exact same as a double blind ABX test? Not quite. I didn't have the ability to make it so that the A & B's were randomly not actually switched in my test. Considering that I switched back and forth with closed eyes until I knew for sure which was which, I feel that the methodology well served the goal of answering the question "can I hear the difference?" I also didn't have the ability to mask the results from myself, since I performed the test alone and had to write down the results as I went along. I don't believe that to be a significant problem, since the next sample was otherwise an independent trial, and there was something to be learned from that as well.



Let's leave aside the problem of the 'blinding' method used in this test (is there a reason you could not use something like WinABX or foobar2000's ABX tool, which would truly blind the test and randomize it?). I'd note, though, that you sneer at the phrase 'properly applied noise shaping', but ask us to let 'blind enough' pass.

Instead, tell us about the playback levels. You were listening via headphones, and rather fine ones at that, which is already going to increase discriminative power. I presume levels were nominally matched between A and B going into each comparison. Did you adjust volume during listening, e.g., to 'hear better' during quiet parts?

If in the end, what you achieved was successful training to hear low-level differences played back at high volume, using headphones... hopefully you can see the problem in trying to extrapolate that to practical effects during normal listening.

Care to speculate on how you'd perform in an open-air ABX, listening at normal levels?

This post has been edited by krabapple: Jul 8 2008, 17:35
AndyH-ha
post Jul 9 2008, 12:42
Post #61





Group: Members
Posts: 2223
Joined: 31-August 05
Member No.: 24222



“properly noise shaped dither” (what I wrote) is not the same thing as “properly applied noise shaping” (what you wrote, whatever it means). What I mean is noise shaped dither (vs not noise shaped) and a reasonable noise shaping (vs something really weird and possibly easily heard).

Dither comes in various types, flavors, and amounts. I realize flavor isn’t the correct term but I don’t recall it at the moment. One can certainly create dither that will be audible. Also, while I don’t know if anyone does, one could noise shape dither so it would be even more audible than without the noise shaping. You are using dither as a straw man to argue for subjectivity.

I do challenge the idea that all dither, and particularly all noise shaped dither, is audible. It is an easy thing to test. Generate some silence, say 10 seconds, at 24 bit or in floating point. Convert to 16 bit, dithered with noise shaping. What you now have is a 16 bit file of noise shaped dither, nothing else. For comparison, at the end (or beginning) generate an additional 10 seconds of silence (obviously at 16 bit this time).

Now try to hear where one ends and the other begins, or try any listening test that satisfies you. If you’ve chosen well, you will probably not hear the dither. Quite a few variations will be below audibility, not only “the best one.”
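The test described above is easy to script. Here is a hedged sketch using Python's stdlib `wave` module (with plain TPDF dither rather than a noise-shaped variety, so the noise will sit somewhat higher than in the preferred setup):

```python
import wave, struct, random

def make_dither_test_wav(path, seconds=10, rate=44100):
    """Write a 16-bit mono WAV: `seconds` of TPDF-dithered silence
    (samples in {-1, 0, +1} LSB) followed by `seconds` of true digital
    silence.  Listen for the point where one becomes the other."""
    n = seconds * rate
    frames = bytearray()
    for _ in range(n):
        d = random.random() - random.random()     # TPDF in (-1, +1) LSB
        frames += struct.pack('<h', round(d))     # dithered zero sample
    frames += struct.pack('<h', 0) * n            # true silence
    with wave.open(path, 'wb') as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(bytes(frames))
```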

I chose the dither and noise shaping I prefer based on my own testing, but if I remember that testing well enough, other specifications also worked very well. This one just appealed most to my aesthetic sense, so I've continued to use it. In this test I don't hear it even using high quality closed-back headphones with my headphone amplifier turned up all the way. No music could be listened to at anywhere near that setting without pain and damage.

Some people with extended high frequency hearing might hear the dither. I can easily see it in spectral view, but my hearing no longer extends to those higher frequency limits. However, the RMS measure on the file is -88dB. Put this in any real music and I think the possibility of someone hearing it without damaging their hearing in the process is very low (and maybe not even by risking hearing damage). This is obviously not dependent on my belief, or that of anyone else, it is easily tested objectively.

Now, the point of all that was your assertion that you can make the same recording in 16 bit and 24 bit, keeping the levels extra low to avoid even the appearance of approaching clipping, normalize the recordings for listening, and tell which is which from the sound. Maybe I misunderstood, I don’t have the time or energy right now to re-read it all.

My meaning was that, if you are doing a transform on the 16 bit data (amplifying), without dithering,
(1) you are creating quantization distortion that will differentiate it from the 24 bit version, at least in the very low level sections.
(2) A bad choice for the dither will make the dither itself audible, and thus still differentiate the two.
(3) A good choice of dither is very unlikely to make the dither audible and it will at least eliminate this very significant source of audible difference (the quantization distortion).
(4) If you can then tell them apart in a normal blind ABX test, you have something to talk about.

As stated, I’ve been searching for a good sample for years. I would like a chance at my own listening tests.
SebastianG
post Jul 9 2008, 14:07
Post #62





Group: Developer
Posts: 1318
Joined: 20-March 04
From: Göttingen (DE)
Member No.: 12875



Hi Andy!

QUOTE (AndyH-ha @ Jul 9 2008, 13:42) *
... noise shaped dither ...

I just want to mention that I'm not a fan of this expression because it isn't obvious which of the following two approaches is meant:
  • shaped dither / colored dither: Only the dither signal is filtered. Quantization noise remains white. Example: UV22 (Yuck!)
  • noise shaping quantizer: The overall error (X) is filtered which may include dither noise. This is way more powerful than the previous approach.
Note: I'm separating dither and quantization noise here (dither + quantization noise = overall noise).
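The second bullet can be made concrete with a minimal first-order error-feedback quantizer in Python (an illustration only; real shapers use higher-order, psychoacoustically weighted filters):

```python
import random

def noise_shaped_quantize(samples, bits):
    """First-order noise shaping quantizer: the *total* error of the
    previous step (quantization error + dither) is fed back and
    subtracted from the next input.  The output noise then has a
    (1 - z^-1) highpass shape, so its DC component cancels entirely."""
    q = 1.0 / (1 << (bits - 1))                      # one LSB, full scale +/-1.0
    e_prev = 0.0
    out = []
    for x in samples:
        u = x - e_prev                               # error feedback
        d = (random.random() - random.random()) * q  # TPDF dither
        y = round((u + d) / q) * q                   # quantize
        e_prev = y - u                               # total error, incl. dither
        out.append(y)
    return out
```

Because each output sample is the first difference of successive errors, the summed (DC) noise telescopes away — the shaping in its simplest form.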

QUOTE (AndyH-ha @ Jul 9 2008, 13:42) *
...if you are doing a transform on the 16 bit data (amplifying), without dithering,
(1) you are creating quantization distortion that will differentiate it from the 24 bit version, at least in the very low level sections.

He mentioned bit shifting as a noiseless way of amplification.

QUOTE (AndyH-ha @ Jul 9 2008, 13:42) *
As stated, I’ve been searching for a good sample for years. I would like a chance at my own listening tests.

Since a triangular dither makes the perceivable noise properties independent of the signal, you might as well pick the "hardest" test sample: silence wink.gif .... which is what you suggested earlier.

Cheers,
SG

This post has been edited by SebastianG: Jul 9 2008, 17:43
AndyH-ha
post Jul 9 2008, 22:08
Post #63





Group: Members
Posts: 2223
Joined: 31-August 05
Member No.: 24222



Here is my ignorance showing. How does one “bit shift” to amplify? And is the claim being made that one can thus achieve amplification with no quantization error?

The sample I want is the 24 bit music recording that can be “properly” converted to 16 bit and the two distinguished, one from the other.
Nick.C
post Jul 9 2008, 22:12
Post #64


lossyWAV Developer


Group: Developer
Posts: 1807
Joined: 11-April 07
From: Wherever here is
Member No.: 42400



If a signal whose amplitude is no more than half the maximum permissible value is bitshifted one place to the left (i.e. multiplied by two), the resulting signal has no (further) added noise. A side effect is that the least significant bit of every sample becomes zero.
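In code (Python used purely for illustration), the point is that a shift is exact integer arithmetic:

```python
def amplify_by_shift(sample16, n=1):
    """Left-shifting a signed 16-bit sample by n multiplies it by 2**n
    exactly: no rounding occurs, so no quantization error is created and
    no dither is needed.  The trade-offs are possible clipping and n
    zeroed low bits."""
    out = sample16 << n
    if not -32768 <= out <= 32767:
        raise OverflowError("amplified sample would clip 16 bits")
    return out
```

Because no information is discarded, the operation is trivially reversible: `(x << n) >> n == x`.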


--------------------
lossyWAV -q X -a 4 --feedback 4| FLAC -8 ~= 320kbps
MichaelW
post Jul 10 2008, 07:16
Post #65





Group: Members
Posts: 631
Joined: 15-March 07
Member No.: 41501



QUOTE (Nick.C @ Jul 10 2008, 10:12) *
If a signal with an amplitude of less than or equal to the maximum permissible value is bitshifted by one to the left (i.e. multiplied by two) then the resulting signal has no (further) added noise. This will have a side effect of making all of the lowest significant bits equal to zero.

My ignorance is a LOT deeper than AndyH-ha's. So I ask:

Does this mean that if you have a 24-bit recording, with the "top" 8 bits unused (because it's been left for headroom, or whatever), you can losslessly convert it into a 16-bit recording, using all the bits?

If the question in itself betrays hopeless ignorance, please be gentle and just refer me to the right part of the Wiki or whatever; I only ask because I think it relates to what's at issue after ccryder's posts.
Nick.C
post Jul 10 2008, 08:00
Post #66


lossyWAV Developer


Group: Developer
Posts: 1807
Joined: 11-April 07
From: Wherever here is
Member No.: 42400



QUOTE (MichaelW @ Jul 10 2008, 07:16) *
Does this mean that if you have a 24-bit recording, with the "top" 8 bits unused (because it's been left for headroom, or whatever), you can losslessy convert it into a 16-bit recording, using all the bits?
Basically yes. How you would determine the maximum significant bit used in the original to then work out how many bits to shift left by is up for debate....
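A sketch of that idea (a hypothetical helper in Python): if every 24-bit sample already fits in a signed 16-bit range, keeping the same integers in a 16-bit container loses nothing — relative to full scale it amounts to an exact 8-bit (48 dB) gain.

```python
def narrow_24_to_16(samples24):
    """If the top 8 bits of every 24-bit sample are pure headroom, the
    identical integers fit in 16 bits: nothing is rounded away, so no
    dither is required.  Otherwise real signal bits would be discarded
    and a dithered requantization is needed instead."""
    if any(not -32768 <= s <= 32767 for s in samples24):
        raise ValueError("more than 16 significant bits: requantize with dither")
    return list(samples24)  # same integers, narrower container
```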


--------------------
lossyWAV -q X -a 4 --feedback 4| FLAC -8 ~= 320kbps
Chromatix
post Jul 10 2008, 13:52
Post #67





Group: Members
Posts: 62
Joined: 16-June 08
Member No.: 54419



Let's inject some common sense into this, shall we?

I just used an SPL meter to measure the background noise level in my "living room".

First, I turned off all my computers (except the silent firewall in the next room), closed the triple-glazed windows (Finland gets cold in winter) and fire doors, and turned off the fridge in the kitchen. I even tried not to breathe too loudly while watching the meter display.

I was essentially unable to achieve a noise level lower than 33.5 dB(A). This was dominated by a small child playing in the courtyard, behind the aforementioned triple-glazing and some distance away. The meter did read lower than this occasionally, but not by much; I suspect that without the child in the background, it would still have been above 30 dB(A).

Combining this with the 120 dB pain threshold, it is clear that about 90dB of dynamic range is all that you can reasonably expect to reproduce in an average living room.

This is adequately supplied by a properly-mastered 16-bit CDDA recording. "Properly mastered" in this case means treating 0dBFS as the pain threshold, not the peak goal. This would put the LSB at 30dB SPL, which is below the noise floor of my living room.

In a dedicated, soundproofed, anechoic listening room, you might be able to achieve better, but I suspect only by about 10dB. Actual best-practice mastering would treat full-scale as about 107 dB SPL, which corresponds nicely to this. Again, 16 bits is clearly adequate, with the LSB being at about 17dB SPL.

For *recording*, it is necessary to use more than 16 bits, simply because it is impossible to accurately predict where the noise floor and peak excursion will be under live conditions. Using 24 bits for this lets you set up excess headroom and footroom to be on the safe side, and keeping this resolution throughout processing reduces cumulative distortion. The mastering process is where you correct for the uncertainties and provide a polished recording for living-room listening conditions, consuming some of the excess headroom and footroom in the process.
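As a cross-check on the arithmetic above, the textbook dynamic range of an ideal b-bit quantizer driven by a full-scale sine is 6.02·b + 1.76 dB (a sketch in Python, my language choice):

```python
import math

def sine_snr_db(bits):
    """SNR of an ideal quantizer with a full-scale sine input: signal
    RMS = FS/sqrt(2), quantization-noise RMS = q/sqrt(12) with q = one
    LSB.  Works out to about 6.02*bits + 1.76 dB."""
    full_scale = float(1 << (bits - 1))      # in LSBs
    signal_rms = full_scale / math.sqrt(2)
    noise_rms = 1.0 / math.sqrt(12)          # one-LSB quantization noise
    return 20 * math.log10(signal_rms / noise_rms)
```

`sine_snr_db(16)` is about 98.1 dB — comfortably above the ~90 dB a quiet living room allows.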
SebastianG
post Jul 10 2008, 15:47
Post #68





Group: Developer
Posts: 1318
Joined: 20-March 04
From: Göttingen (DE)
Member No.: 12875



QUOTE (Chromatix @ Jul 10 2008, 14:52) *
... Again, 16 bits is clearly adequate, ...

I guess that throwing in the details about equal loudness contours, typical spectral characteristics of music and what noise shaping can do would make 16 bits (at 44kHz or higher, LPCM) look even "more adequate" as delivery format. Just think of applying A-weighting on the music and on the 16bit induced noise floor in isolation for comparison. The "A-weighted" SNR is likely to be higher than the "plain" SNR.
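For the curious, the standard A-weighting curve (IEC 61672 constants) is easy to evaluate; applying it to the flat 16-bit noise floor and to typical music spectra separately is the comparison being suggested (sketch in Python):

```python
import math

def a_weight_db(f):
    """A-weighting gain in dB at frequency f in Hz (IEC 61672 constants;
    0 dB at 1 kHz by construction).  Low and very high frequencies,
    where the ear is insensitive, are strongly attenuated."""
    f2 = f * f
    ra = (12194.0**2 * f2 * f2) / (
        (f2 + 20.6**2)
        * math.sqrt((f2 + 107.7**2) * (f2 + 737.9**2))
        * (f2 + 12194.0**2))
    return 20 * math.log10(ra) + 2.00
```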

Cheers,
SG

This post has been edited by SebastianG: Jul 10 2008, 15:58
Axon
post Jul 10 2008, 16:52
Post #69





Group: Members (Donating)
Posts: 1985
Joined: 4-January 04
From: Austin, TX
Member No.: 10933



Just to throw another log on the fire: My phono preamp has an SNR of 55 dB (!!!) before RIAA eq right now, and I've found that even that is adequate for virtually all records.
Canar
post Jul 10 2008, 17:12
Post #70





Group: Super Moderator
Posts: 3372
Joined: 26-July 02
From: To:
Member No.: 2796



QUOTE (Axon @ Jul 10 2008, 08:52) *
Just to throw another log on the fire: My phono preamp has an SNR of 55db (!!!) before RIAA eq right now, and I've found that even that is adequate for virtually all records.
Yeah, but hey, it's vinyl, what do you expect, actual fidelity? tongue.gif


--------------------
You cannot ABX the rustling of jimmies.
No mouse? No problem.
pdq
post Jul 10 2008, 17:28
Post #71





Group: Members
Posts: 3442
Joined: 1-September 05
From: SE Pennsylvania
Member No.: 24233



QUOTE (Axon @ Jul 10 2008, 11:52) *
Just to throw another log on the fire: My phono preamp has an SNR of 55db (!!!) before RIAA eq right now, and I've found that even that is adequate for virtually all records.

I think you will find that after RIAA equalization the SNR is probably 10 to 20 dB higher.
Axon
post Jul 10 2008, 17:33
Post #72





Group: Members (Donating)
Posts: 1985
Joined: 4-January 04
From: Austin, TX
Member No.: 10933



About 10, but still.
hellokeith
post Jul 10 2008, 19:25
Post #73





Group: Members
Posts: 288
Joined: 14-August 06
Member No.: 34027



QUOTE (Chromatix @ Jul 10 2008, 07:52) *
I was essentially unable to achieve a noise level lower than 33.5 dB(A). This was dominated by a small child playing in the courtyard, behind the aforementioned triple-glazing and some distance away. The meter did read lower than this occasionally, but not by much; I suspect that without the child in the background, it would still have been above 30 dB(A).

Combining this with the 120 dB pain threshold, it is clear that about 90dB of dynamic range is all that you can reasonably expect to reproduce in an average living room.


Chromatix,

Have you by chance measured inside a car? I'd be interested in knowing what is the noise floor for a typical car while driving.

P.S. Also, if I could impose, I have something fairly short in Finnish I've been hunting someone to translate to English for me. If you agree, could you PM me.
greynol
post Jul 10 2008, 19:30
Post #74





Group: Super Moderator
Posts: 10249
Joined: 1-April 04
From: San Francisco
Member No.: 13167



C'mon now. This can be handled via PM.

Let's keep this discussion on topic, please!


--------------------
Your eyes cannot hear.
cabbagerat
post Jul 10 2008, 21:08
Post #75





Group: Members
Posts: 1018
Joined: 27-September 03
From: Cape Town
Member No.: 9042



QUOTE (hellokeith @ Jul 10 2008, 10:25) *
Have you by chance measured inside a car? I'd be interested in knowing what is the noise floor for a typical car while driving.


CAR magazine publish noise measurements in all their road tests. Here is a sample from their May 2008 issue (which was in the pile of junk on my desk). Figures are in dB(A); I have no idea how accurate they are, except that it's a pretty good magazine.

Mercedes C180 Kompressor
Idle: 59
120 km/h: 67

Honda Civic 2.2 i-CTDi (the hatch you get in South Africa and Europe, not the one you get in the states):
Idle: 47
120 km/h: 65

Nissan X-Trail LE:
Idle: 37
120 km/h: 67

VW CrossPolo 1.9 TDI:
Idle: 43
120 km/h: 69

Ok, so at idle in the quietest cars you probably are around 35dB or higher, while at highway cruising speeds it seems to be pretty consistent around 65dB or so. That only gives 55dB of range before the pain threshold.

Edit: My oldschool CitiGolf is rated at 57dB (A) at idle and 75dB (A) at 120km/h, which is a little too loud. Inside a motorbike helmet, of course, is much louder.

Edit: It's my 1000th post. Insert fireworks here.

This post has been edited by cabbagerat: Jul 11 2008, 13:24


--------------------
Simulate your radar: http://www.brooker.co.za/fers/
