Resampling down to 44.1kHz: Is there a method that will not colour the sound?
MLXXX
post May 14 2008, 14:34
Post #51





Group: Members
Posts: 207
Joined: 25-February 08
From: Australia
Member No.: 51585



[Martel, I have not tried to find out exactly how SSRC performs.]

Based on the contents of this thread so far, I'd be inclined to prefer a 48 kHz sampling rate over 44.1 kHz, as it gives more margin for error with only a relatively slight (less than 10%) increase in raw file size.

[A range of 48 kHz sound cards could be used for the playback, and the precise characteristics of the filter would not be all that critical. Similarly, the recording could be made with a range of recording devices, without undue concern about the filter characteristics.]

But there is another concern that is sometimes raised, beyond mere frequency response. It is a concern about relative timing and phase.

Is it good enough to shoehorn everything into a strict timing regimen of say 48000 samples a second, if some waveforms are slightly out of phase with each other, as captured by different microphones?

Arguably, if 96 kHz is used, any natural or artificial reverberation can be richer, as the instantaneous wave cancellations are subtly recorded and reproduced without the constraint of a coarser time structure (e.g. the volume levels of different recorded tracks could be changed when creating a new mix, generating a whole new set of complex phase additions and cancellations, arguably more complex than if 48 kHz had been used when recording).

Put another way, if an analogue source is captured simultaneously at 48 kHz by two soundcards that are not locked in phase with each other, one card may be triggered by its sampling oscillator to take its sample* as much as 1/96000th of a second after the other. In such a case, will the played-back sound be perceptibly different in an A/B comparison? This could be similar to comparing the sound from two microphones placed a distance apart equal to the distance sound travels in 1/96000th of a second. At 25 degrees Celsius, sound travels at about 346 m/s. In 1/96000 s, it would travel about 3.6 mm, or a bit over a third of a centimetre.
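For what it's worth, the distance figure above is easy to check with a couple of lines (a trivial sketch; 346 m/s is the approximate speed of sound quoted in the post):

```python
# Distance sound travels in half of a 48 kHz sample period (1/96000 s)
c = 346.0                # approx. speed of sound in m/s at 25 degrees C
dt = 1 / 96000           # seconds
print(round(c * dt * 1000, 1))   # in millimetres -> 3.6
```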

A similar small difference due to sampling phase could also apply when downsampling a 192 kHz recording to 48 kHz. There will be 4 samples at 192 kHz for every 1 at 48 kHz. What if a 192 kHz recording has 2 samples shaved off the start of it? If it is then converted to 48 kHz, it will give a slightly different result compared with a version that has not been shaved. Subtraction of the two conversions will leave a small residue. But will the two conversions sound different to the ear in an A/B comparison?

Even if they do sound different, is this not comparable with the difference we experience if we move our head back by a third of a centimetre [not applicable when listening on headphones]? A practically negligible difference?

Are there any situations where it could make a material difference to the listening experience if the sound is captured at 48 kHz and not, say, 96 kHz?

_______________________

* Even with oversampling, there is subsequent decimation/averaging. After all of the processing, there exists but one sample value per channel, for each arbitrarily selected period of 1/48000 sec.
2Bdecided
post May 14 2008, 14:54
Post #52


ReplayGain developer


Group: Developer
Posts: 5362
Joined: 5-November 01
From: Yorkshire, UK
Member No.: 409



To me, it sounds like you still haven't read the relevant threads in the FAQ. The subject of timing issues, or rather the lack of them, is quite well covered.


Also, IIRC there were cheap converters that did left then right, but I think we're talking decades ago, and they were quite rare. AFAIK no one is using them now in anything like a high quality application.

You can check for this fault quite trivially by recording or playing back the same thing on both channels. Impulses are an ideal test signal.
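As a toy illustration of that impulse test (a minimal numpy sketch with made-up numbers, not real capture data):

```python
import numpy as np

# Hypothetical two-channel capture of the same impulse,
# with the right channel arriving 3 samples late
left = np.zeros(256)
right = np.zeros(256)
left[100] = 1.0
right[103] = 1.0

# With impulses, the interchannel delay is simply the offset between the peaks
delay = int(np.argmax(right)) - int(np.argmax(left))
print(delay)   # -> 3
```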


Even a sub-sample interchannel delay would cause audible high frequency loss if the output was combined to mono, which is one good reason why they are avoided. The other is that there is little reason to introduce them!

Cheers,
David.

This post has been edited by 2Bdecided: May 14 2008, 14:54
pdq
post May 14 2008, 15:36
Post #53





Group: Members
Posts: 3450
Joined: 1-September 05
From: SE Pennsylvania
Member No.: 24233



QUOTE (MLXXX @ May 14 2008, 09:34) *
Put another way, if an analogue source is captured simultaneously at 48 kHz by two soundcards that are not locked in phase with each other, one card may be triggered by its sampling oscillator to take its sample* as much as 1/96000th of a second after the other. In such a case, will the played-back sound be perceptibly different in an A/B comparison? This could be similar to comparing the sound from two microphones placed a distance apart equal to the distance sound travels in 1/96000th of a second. At 25 degrees Celsius, sound travels at about 346 m/s. In 1/96000 s, it would travel about 3.6 mm, or a bit over a third of a centimetre.

If the two soundcards have clock frequencies that differ by only 0.001% (10 ppm) then the phase shift between them will reach 1/96000 second after only one second and will increase by this amount every second.
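pdq's arithmetic can be sanity-checked in a few lines (a sketch using the figures from the post, not measured hardware):

```python
# How long until two free-running clocks differing by 10 ppm
# drift apart by half of a 48 kHz sample period (1/96000 s)?
drift_per_second = 10e-6     # 10 ppm relative frequency error
offset = 1 / 96_000          # half a sample at 48 kHz, in seconds
seconds_to_drift = offset / drift_per_second
print(round(seconds_to_drift, 2))   # -> 1.04: about one second, as stated
```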
MLXXX
post May 14 2008, 15:52
Post #54





Group: Members
Posts: 207
Joined: 25-February 08
From: Australia
Member No.: 51585



2Bdecided, I have looked through the FAQs but there is not a lot that seems conclusive. Several times in my searches I came across this interesting report of yours from 5 years ago, which seems quite relevant to the current topic:


QUOTE (2Bdecided @ May 21 2003, 02:35) *
... The next day, while the demo was being run for the Nth time, I was at the back of the room talking with someone. Suddenly, I heard a difference as the source switched. I was surprised, having failed to hear a difference the previous day listening in the sweet spot. I listened as it switched again, and heard it switch back – ah ha, it must have just gone analogue / digital / analogue. I kept listening – I couldn’t hear the difference next time it switched.

I went to the middle/back of the room, and listened through the next demo. Without being told, I could pick out 44.1 and 48kHz. The difference was more obvious back from the sweet spot than in the sweet spot itself. More importantly, the difference wasn’t what I (or the other people who failed to hear it) had been listening for. It didn’t make any difference to the frequency response at all, or to the clarity of the high frequencies.

What 44.1kHz and 48kHz did do was to make the sound slightly less realistic, like the difference between a good and bad CD player. If the lower sampling rate had any defined “quality” it was a glassy kind of sound – I’d heard that word associated with CD before and thought it was complete rubbish – but now I actually heard the difference, I understood exactly what people had meant.

The change from 44.1 or 48kHz to analogue to 96kHz slightly increased the depth of the sound stage. I’d been listening to the amazing demo 1 for 2 days, so it was hardly an impressive difference, but it was still there.

If you’re counting, that’s only two blind detections – once when I wasn’t even listening, and again when I went back to the middle of the room to check – I confirmed which had been which with Kevin afterwards – “The next to last one was 40something, wasn’t it?”


You can say many things about this. You could say it was just luck, but I don’t think it was – I wasn’t even listening for the difference because (having listened the previous day) I didn’t think there was one to hear! You can say that I was hearing sonic deficiencies in the equipment. Well, maybe. That may be what the whole 44.1/96k debate is based on. All I can say is that, if there are sonic deficiencies in this equipment (I think the dCS boxes are around 5k each, and are used in many recording studios) then there isn’t much hope for the rest of us!

What you could say, with some justification, is that the “character” of 44.1 was more obvious outside the sweet spot, so maybe it’s not such a big issue. That’s probably true – except that maybe I was just listening for the wrong thing when I was “in the sweet spot”. Maybe I had to stop listening to the Hi-Fi, and start listening to the music and the performance to hear what was happening.

What is significant is that the 44.1kHz version wasn’t just different from the 96k and analogue version, it was [i]worse[/i]. As the analogue was the master, any difference would be bad news, but for it to be subjectively worse makes matters even, well, worse!

I was upset to think how much recorded music only exists as a 44.1kHz or 48kHz sampled digital master tape. I discussed the subjective imperfections (the improved depth and realism of the 96k version) with Kevin, and he agreed. He was surprised that I’d noticed it that day, but couldn’t even hear anything wrong with 32kHz the previous day! I asked him what he heard with 16-bit (we’d been using 24-bit all along) and DSD. He said 16-bit was even worse – it made the whole sound “grungy”, and that DSD sounded nice, but added its own signature. “You can tell when you’re playing DSD through this system – the room heats up ;)” he said – I looked at the huge amps, and could believe it.

One thing I should note: I didn’t think the analogue master was particularly good quality. It was a gorgeous recording, but it had obvious flaws – e.g. background noise, and some audible edits. Also, I didn’t hear any difference between analogue, 96k and 192k. I can’t explain why 44.1kHz and 48kHz sounded worse, but they did. No one responsible for the demo had any reason to rig the results, and I played with enough of the equipment to know that everything was above board and fair, even though some of the cables we used might not have met with audiophile approval. ...


QUOTE (pdq @ May 15 2008, 00:36) *
If the two soundcards have clock frequencies that differ by only 0.001% (10 ppm) then the phase shift between them will reach 1/96000 second after only one second and will increase by this amount every second.

I think this goes towards explaining why it is good practice to have a master synchronising signal, if more than one sound card is used for a recording.
cabbagerat
post May 14 2008, 15:57
Post #55





Group: Members
Posts: 1018
Joined: 27-September 03
From: Cape Town
Member No.: 9042



2Bdecided is right - you need to do some background reading. I will try to answer your questions as best I can.

QUOTE (MLXXX @ May 14 2008, 05:34) *
But there is another concern that is sometimes raised, beyond mere frequency response. It is a concern about relative timing and phase.
If the waves are slightly out of phase with each other, they will be captured slightly out of phase. The ability to distinguish two phases in a sampled waveform is not directly limited by the sample rate – the SNR comes into play, too. This has been covered before in a number of threads. A recent thread on time resolution in PCM has all the answers.
QUOTE (MLXXX @ May 14 2008, 05:34) *
Arguably if 96KHz is used, any natural or artificial reverberation can be richer as the instantaneous wave cancellations are subtly recorded and reproduced without the constraint of a time structure (e.g. the volume level of different recorded tracks could be changed when creating a new mix and this could generate a whole new set of complex phase additions and cancellations, arguably more complex than if 48KHz had been used when recording).
You could argue that, but you would be wrong. No matter how "complex", "rich" or "nuanced" a signal is, it can still be described by its bandwidth and SNR.
QUOTE (MLXXX @ May 14 2008, 05:34) *
Put another way, if an analogue source is captured simultaneously at 48KHz by two soundcards that are not locked in phase with each other, one card may be triggered by its sampling oscillator to take its sample* as much as 1/96000th sec after the other.
Yeah, and it probably will be. Quartz clocks suck at long-term stability, so you are going to be sampling at different instants. It's not a bad assumption that, given two arbitrary clocks at 96kHz, the difference between them will be distributed evenly across 1/96000th of a second.
QUOTE (MLXXX @ May 14 2008, 05:34) *
In such a case, will the played back sound be perceptibly different in an A B comparison?
No, because the output of reconstruction will be the same in both cases, given a bandlimited signal. Kotelnikov's original paper (one of the first in the field) actually discusses this, and it can be proven without difficulty. There is a small theoretical problem with the turn on condition (the beginning of time), but this can be ignored in audio.
QUOTE (MLXXX @ May 14 2008, 05:34) *
This could be similar to comparing the sound from two microphones placed a distance apart equal to the distance sound travels in 1/96000th second. At 25 degrees Celsius, sound travels at about 346m/s. In 1/96000 sec, it would travel about 3.6mm, or a bit over a third of a centimetre.
Back in the mists of time, some radar signal processing was done with things called "acoustic delay lines", which worked in exactly this way. They worked amazingly well, for the time.

QUOTE (MLXXX @ May 14 2008, 05:34) *
A similar small difference due to sampling phase could also apply when downsampling a 192 kHz recording to 48 kHz. There will be 4 samples at 192 kHz for every 1 at 48 kHz. What if a 192 kHz recording has 2 samples shaved off the start of it? If it is then converted to 48 kHz, it will give a slightly different result compared with a version that has not been shaved. Subtraction of the two conversions will leave a small residue. But will the two conversions sound different to the ear in an A/B comparison?
Blindly subtracting one digital signal from another isn't a good idea for just this reason. The two downsampled versions will differ by a "group delay", which you can correct digitally, in analogue, or by moving your speakers back a few millimetres. After reconstruction, the two signals will be identical. There can be a slight difference at turn-on, but after that they'll be the same.
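The "pure delay" claim can be illustrated numerically. Below is a numpy sketch under simplifying assumptions (a contrived periodic, bandlimited test signal, so that an FFT-domain fractional delay is exact): shaving 2 samples off a 192 kHz stream before decimating by 4 gives a 48 kHz stream that is just the unshaved stream's waveform read 1/96000 s later.

```python
import numpy as np

fs192 = 192_000
n = np.arange(1920)
# Periodic test signal bandlimited well below 24 kHz (tones at 5, 12 and 20 kHz)
x = sum(np.cos(2 * np.pi * f * n / fs192) for f in (5_000, 12_000, 20_000))

a = x[0::4]   # decimated to 48 kHz from sample 0
b = x[2::4]   # decimated to 48 kHz after "shaving off" 2 samples (1/96000 s)

# Fractionally delay stream a by 1/96000 s in the FFT domain
# (exact here because the test signal is periodic and bandlimited)
fs48, dt = 48_000, 1 / 96_000
f_k = np.fft.fftfreq(len(a), d=1 / fs48)
b_rec = np.fft.ifft(np.fft.fft(a) * np.exp(2j * np.pi * f_k * dt)).real

# The residue is at floating-point level: the two streams describe the
# same continuous waveform, offset by a pure delay
print(np.max(np.abs(b_rec - b)))
```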

QUOTE (MLXXX @ May 14 2008, 05:34) *
Even if they do sound different, is this not comparable with the difference we experience if we move our head back by a third of a centimetre [not when listening to headphones]. A practically negligible difference?

Yes. Take the function f(x) = cos(x)·u(x), where u(x) is zero for negative x and 1 for positive x. Start sampling at time zero, and again starting at time 0 + 1/96000. When you have those samples, reconstruct the original wave. Notice that the reconstructions will be different at the beginning; after this turn-on period they will be the same. Due to the antialiasing filter, this example is even a little more subtle than that – the reconstructions will still differ slightly, as the whole process has to be causal. Does it matter in the real world? No.

QUOTE (MLXXX @ May 14 2008, 05:34) *
Are there any situations where it could make a material difference to the listening experience if the sound is captured at 48KHz and not, say, 96KHz?
In an ideal world, no. With real hardware, I don't know.


--------------------
Simulate your radar: http://www.brooker.co.za/fers/
2Bdecided
post May 14 2008, 16:17
Post #56


ReplayGain developer


Group: Developer
Posts: 5362
Joined: 5-November 01
From: Yorkshire, UK
Member No.: 409



QUOTE (MLXXX @ May 14 2008, 15:52) *
2Bdecided, I have looked through the FAQs but there is not a lot that seems conclusive. Several times in my searches I came across this interesting report by yourself from 5 years ago, which seems quite relevant to the current topic:-
Oh, I stand by that report (though I agree with the criticisms in the same thread).

It was subsequent reading, research, and experiments that cleared up (for me) most of the issues that you are working through. They are non-issues (at least in theory).


There are only two slightly credible explanations: human ears don't quite work in the way we think, or the well known and understood imperfections in real equipment combine together to create audible differences.

There is an even more relevant point: no one has ABXed CD vs anything else, except by using seriously faulty equipment or by turning the volume up so high on near-silent passages that "normal" recordings would deafen you.

Cheers,
David.
MLXXX
post May 15 2008, 02:51
Post #57





Group: Members
Posts: 207
Joined: 25-February 08
From: Australia
Member No.: 51585



Thanks cabbagerat; your specific explanations in response to my post are appreciated.

QUOTE (2Bdecided @ May 15 2008, 01:17) *
There are only two slightly credible explanations: human ears don't quite work in the way we think, or the well known and understood imperfections in real equipment combine together to create audible differences.

There is an even more relevant point: no one has ABXed CD vs anything else, except by using seriously faulty equipment or by turning the volume up so high on near-silent passages that "normal" recordings would deafen you.

It's relatively easy, equipment-wise, to test 24 bits against a dither to 16 bits, because you have exactly the same timing of the samples and can use the same sound card for playback, operating with the same filter, whether reproducing 24 bits, 16 bits dithered, or a truncation to 16 bits. [I have done this myself with my own equipment at home.]

It's much harder to compare 96 kHz against 44.1 kHz, and any differences that were heard could be ascribed to deficiencies in the equipment. I assume that is how you might now primarily explain that report of your own listening experience in 2003, at different sample rates.

But I wonder whether there are any recent tests with highly evolved equipment that have concentrated on the 44.1 vs 48 vs 96+ issue with audio clips designed to highlight differences.

I could imagine that if six violinists each played in front of their own microphone and the sound was mixed in analogue, the result would be quite complex. Alternatively, if each of the six sources were separately converted to digital at just 44.1 kHz and mixed digitally with the other violins, the result seems likely to be different compared with sampling each at, say, 96 kHz and mixing; even if the final mixdown of the 96 kHz sources were at 44.1 kHz.

People may ask, 'Why bother to use a separate ADC for each microphone? Just mix in an analogue mixer.' Well, as technology advances, ADCs are becoming quite cheap, and it may be an attractive proposition to fit out a microphone with its own ADC (and perhaps some sort of wireless data link) and dispense with any analogue mixer.

There may be other recording situations that would be more demanding and have greater potential to be affected by phase differences.

If we really are sure that 96 kHz is of no benefit now, are recording engineers using it just in case it may make a difference with loudspeakers of the future; or is the use of 96 kHz driven by (i) flawed technical assumptions, and/or (ii) a market demand fostered by advertising hype?

This post has been edited by MLXXX: May 15 2008, 09:10
Martel
post May 15 2008, 09:46
Post #58





Group: Members
Posts: 564
Joined: 31-May 04
From: Czech Rep.
Member No.: 14430



QUOTE (MLXXX @ May 14 2008, 17:51) *
But I wonder whether there are any recent tests with highly evolved equipment that have concentrated on the 44.1 vs 48 vs 96+ issue with samples designed to highlight differences.

Well, those tests would merely prove or disprove the equipment's ability to play back those sample rates. I guess there are some tests of CD players versus SACD on some audiophile pages. Since the differences between the formats are theoretically negligible, the real difference should lie only in playback equipment quality (or in different mastering of the CD and SACD versions, so beware).
QUOTE (MLXXX @ May 14 2008, 17:51) *
I could imagine that if six violinists played in front of a microphone each and the sound was mixed in analogue the result would be quite complex. If each of the six sources were separately converted to digital at just 44.1KHz and mixed digitally with the other violins each at 44.1KHz, the result seems likely to be different, compared with sampling each at say 96KHz and mixing; even if the final mixdown is to 44.1KHz.

I think there's no theoretical reason why it should be different. Theoretically, you should be able to do filtering, analogue-to-digital conversion, resampling and mixing in arbitrary order and get the same result. Practically, there is a preferred order, since equipment is not ideal (linear, with unlimited dynamic range, etc.) and the effort is to minimize the overall distortion. Just look at the results of those software resamplers: it is all about lowpass filtering, and most of the resamplers fail at that utterly. It is problematic to properly design an analogue antialiasing filter for a 44 kHz ADC, so a 96 kHz one is a much better choice.
QUOTE (MLXXX @ May 14 2008, 17:51) *
Of course we could ask 'why bother to use a separate ADC for each microphone?': just mix in an analogue mixer.

Because it is not practical to have a million ADCs in a studio and have to mix a million different tracks in software.
QUOTE (MLXXX @ May 14 2008, 17:51) *
If we really are sure that 96KHz is of no benefit now, are recording engineers using it just in case it may make a difference with loudspeakers of the future; or is the use of 96KHz driven by (i) flawed technical assumptions, and/or (ii) a market demand fostered by advertising hype?

96kHz ADCs are less likely to be plagued by the analogue antialiasing filter they need to include. You may (relatively) easily design something like SSRC's lowpass in software, but it is virtually impossible using an analogue circuit.


--------------------
IE4 Rockbox Clip+ AAC@192; HD 668B/HD 518 Xonar DX FB2k FLAC;
Kees de Visser
post May 15 2008, 10:55
Post #59





Group: Members
Posts: 737
Joined: 22-May 05
From: France
Member No.: 22220



QUOTE (MLXXX @ May 15 2008, 02:51) *
If we really are sure that 96KHz is of no benefit now, are recording engineers using it just in case it may make a difference with loudspeakers of the future; or is the use of 96KHz driven by (i) flawed technical assumptions, and/or (ii) a market demand fostered by advertising hype?
On a recording budget, the cost difference between using 44.1 and 96 kHz (or higher) is really benign these days. Since there seems to be no evidence that using 44.1 gives better results, there is very little reason not to use 96 kHz or higher as a production format.
There seems to be anecdotal evidence that some plug-ins perform (sound) better at the 96 kHz rate. A possible explanation is that the code has been optimized for that rate and not for 44.1. This "shouldn't" be a reason to record at 96, but it's probably the most practical workflow.
2Bdecided
post May 15 2008, 11:34
Post #60


ReplayGain developer


Group: Developer
Posts: 5362
Joined: 5-November 01
From: Yorkshire, UK
Member No.: 409



QUOTE (MLXXX @ May 15 2008, 02:51) *
I could imagine that if six violinists played in front of a microphone each and the sound was mixed in analogue the result would be quite complex. Alternatively, if each of the six sources were separately converted to digital at just 44.1KHz and mixed digitally with the other violins each at 44.1KHz, the result seems likely to be different, compared with sampling each at say 96KHz and mixing; even if the final mixdown of the 96KHz sources were at 44.1KHz.
Let's think this through. Firstly, nothing samples at 44.1kHz in the 21st century - ADCs are always oversampled. So what you have is at least 352.8kHz resampled to 44.1kHz, versus at least 384kHz resampled to 96kHz resampled to 44.1kHz. The mixing is not the only (or even the main) difference here. It's bad experimental practice to introduce multiple variables: you should compare sample rates and the associated resampling, or mixing - not both at once.


Here is a comparison which at least has analogue vs digital mixing (and the inevitable circuit differences) as the only variable:

Situation 1 = 6 ADCs, 96kHz, resample to 44.1kHz, mix signals
Situation 2 = mix signals, 1 ADC, 96kHz, resample to 44.1kHz

The problem with this experiment in practice is that the digital gains could be matched perfectly, whereas the analogue gains could not. Still, let us forget that for a moment. Let us assume we can do a perfect summation in both digital and analogue, use unity gain for each, and not clip. Let us make the equations easier by simply having two violin players yielding two microphone feeds, x and y. Let us denote the function of the ADC and the resampling by f. Let us denote simple summation by +.

Situation 1: 2 ADCs, digital mixing
final output = f(x) + f(y)

Situation 2: analogue mixing, 1 ADC
final output = f(x+y)


The question then becomes simple, because the very definition of a linear system (in this case, system f) is that these two situations yield an identical result for any value of x and y. In reality, we would put limits on x and y and say that the system was linear within these limits (no use considering levels that would blow up the equipment!).

So, if x and y are sensible voltages from real microphones, is f a linear system? Let's pull it apart and check each part in turn, since a concatenation of linear systems is by definition also linear.

ADC:
0. the buffer amplifier might(!) be linear
1. low pass filtering is linear
2. straight quantisation is not linear - so we won't use that!
2a. dithered quantisation is still not linear, but breaks down into a linear-on-average system, and a noise source
3. Nyquist sampling is linear, but that assumes a perfect filter
3a. non Nyquist sampling creates aliases - however, this is linear distortion, so is still linear
Resampling: conceptualised as a resample up to a common multiple, filtering, and decimation to the desired rate
4. adding zero samples to pad the sample rate to the desired one is linear
5. low pass filtering is linear
6. throwing away samples is linear

The only part which may be mathematically non-linear is the dithered quantisation, and that can be made arbitrarily good by increasing the bit depth - which you already seem unconcerned by.


To summarise, the systems involved are linear, and it doesn't make any difference whether you have 6 ADCs and a digital mixer, or an analogue mixer followed by 1 ADC. All these superfine details that you are imagining are perfectly captured (to within the parameters of the system, namely bandwidth and noise floor) - whichever way around you do it.
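That linearity argument is easy to demonstrate numerically. Here is a minimal numpy sketch with a toy stand-in for f (an arbitrary, hypothetical FIR lowpass followed by decimation, both of which are linear operations); it is an illustration of the identity f(x) + f(y) = f(x + y), not a model of any real ADC:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(8000)   # "microphone" feed 1 (arbitrary test signal)
y = rng.standard_normal(8000)   # "microphone" feed 2

# Toy stand-in for f (ADC + resample): FIR lowpass, then decimate by 2.
h = 0.5 * np.sinc(0.5 * np.arange(-31, 32)) * np.hamming(63)

def f(s):
    return np.convolve(s, h)[::2]

digital_mix = f(x) + f(y)   # Situation 1: two ADCs, then digital mixing
analogue_mix = f(x + y)     # Situation 2: "analogue" mixing, then one ADC

# Identical up to floating-point rounding, as linearity demands
print(np.max(np.abs(digital_mix - analogue_mix)))
```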

Non-linearities (e.g. that first buffer amplifier) would break this - but they'd also introduce signals that weren't supposed to be there anyway! Depending on where in the chain you introduced non-linearities, either version could be closer to the "correct" version.

Cheers,
David.

This post has been edited by 2Bdecided: May 15 2008, 11:40
Kees de Visser
post May 15 2008, 11:37
Post #61





Group: Members
Posts: 737
Joined: 22-May 05
From: France
Member No.: 22220



QUOTE (Martel @ May 15 2008, 09:46) *
96kHz ADCs are less likely to be plagued by the analogue antialiasing filter they need to include. You may (relatively) easily design something like SSRC's lowpass in software, but it is virtually impossible using an analogue circuit.
That's why almost all modern ADCs use oversampling and digital filtering. I think it's the need for low latency that restricts the complexity of digital filtering in recording equipment.
Martel
post May 16 2008, 08:48
Post #62





Group: Members
Posts: 564
Joined: 31-May 04
From: Czech Rep.
Member No.: 14430



QUOTE (Kees de Visser @ May 15 2008, 02:37) *
QUOTE (Martel @ May 15 2008, 09:46) *
96kHz ADCs are less likely to be plagued by the analogue antialiasing filter they need to include. You may (relatively) easily design something like SSRC's lowpass in software, but it is virtually impossible using an analogue circuit.
That's why almost all modern ADCs use oversampling and digital filtering. I think it's the need for low latency that restricts the complexity of digital filtering in recording equipment.

Oh, sorry, I completely forgot that they are mostly based on delta-sigma. I must have been outside the audio territory for far too long. :(
But I guess the claim about (lowpass) filtering quality and its impact still holds, be it digital or analogue. :)


MLXXX
post May 18 2008, 16:29
Post #63





Group: Members
Posts: 207
Joined: 25-February 08
From: Australia
Member No.: 51585



QUOTE (cabbagerat @ May 15 2008, 00:57) *
No, because the output of reconstruction will be the same in both cases, given a bandlimited signal. Kotelnikov's original paper (one of the first in the field) actually discusses this, and it can be proven without difficulty. There is a small theoretical problem with the turn on condition (the beginning of time), but this can be ignored in audio.

I've noticed in several other threads in other forums that when a "What about different phases?" question is raised, it is dealt with by reference to steady waveforms and Nyquist. The argument goes that you can represent waveforms accurately by sampling at twice the maximum frequency of the Fourier series for a particular source. What is not dealt with is the quality of representation of interactions between waveforms from independent sources with continuously varying phase relationships. (I imagine I would not be in a position to understand a detailed mathematical explanation anyway!) Perhaps my query does seek to explore the "turn on condition".

QUOTE (2Bdecided @ May 15 2008, 20:34) *
To summarise, the systems involved are linear, and it doesn't make any difference whether you have 6 ADCs and a digital mixer, or an analogue mixer followed by 1 ADC.


I do not understand the beginning of the explanation as these formulae appear to anticipate the conclusion reached:-

Situation 1: 2 ADCs, digital mixing
final output = f(x) + f(y)

Situation 2: analogue mixing, 1 ADC
final output = f(x+y)


They seem to be declarations that a sampled output resolves to the same thing as an analogue output, for bandlimited input.

A 96KHz extract

I have always found combined strings a good test for audio equipment. I have come across a recording of an orchestra playing The Earth Overture by Kosuke Yamashita.

The format is 7.1-channel 96 kHz 24-bit linear PCM. (The Blu-ray reference disc has been released by Q-TEC.)

The audio quality is very good. I found that when I converted a short extract to 48 kHz with Audition 3, the quality was reduced slightly (at least as played back by my AVR). In contrast, many other recordings I have experimented with have revealed no differences audible to me when downsampled to 48 kHz.

The 48 kHz version is not quite as smooth-sounding. I find this noticeable in the harmony between the string sections. With the 96 kHz version, the sounds blend such that the strings taking the lower part are less noticeable. I'll upload a 9-second extract in this post if possible.

Now I imagine 2Bdecided and many others will assume my playback equipment is responsible for the difference, and that is distinctly possible; but it is also possible that a conversion to 48KHz of this particular recording will impair it.

ABXing was not easy. Loudspeakers revealed the differences (not my headphones). Here are my results:

foo_abx 1.3.1 report
foobar2000 v0.9.5.1
2008/05/18 22:33:06

File A: C:\Users\Public\earthsong_9seconds.wav
File B: C:\Users\Public\earthsong_9secondsAuditionConvertedto48KHz.wav

22:33:06 : Test started.
22:35:11 : 01/01 50.0%
23:01:19 : 02/02 25.0%
23:02:28 : 03/03 12.5%
23:03:18 : 04/04 6.3%
23:03:37 : 05/05 3.1%
23:03:44 : Test finished.

----------
Total: 5/5 (3.1%)
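The percentages in the foobar2000 ABX log are one-sided binomial p-values: the chance of scoring at least that well by pure guessing. A minimal sketch of the arithmetic (the function name is mine, not foobar2000's):

```python
from math import comb

def abx_p_value(correct, trials):
    """Chance of getting at least `correct` of `trials` right by
    coin-flipping: the one-sided binomial tail probability."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

# A clean run of n/n reduces to 0.5**n;
# 5/5 -> 0.03125, the "3.1%" on the final line of the log above.
for n in range(1, 6):
    print(n, abx_p_value(n, n))
```

The running percentages in the log (50.0%, 25.0%, 12.5%, ...) follow the same halving pattern, since each trial is an independent 50/50 guess under the null hypothesis.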


This post has been edited by MLXXX: May 18 2008, 18:12
Martel
post May 18 2008, 19:25
Post #64





Group: Members
Posts: 564
Joined: 31-May 04
From: Czech Rep.
Member No.: 14430



A 44 kHz digital waveform PERFECTLY describes ANY signal (or mixture of signals), including phase, from 0 to 22049 Hz, if you do not consider the distortion caused by the finite number of amplitude quantization steps.
Just looking at the waveform, you might get suspicious about accuracy at frequencies near the Nyquist one, since the signal hardly gets 3-4 samples per period. Try zooming in on the waveform in Cool Edit down to sub-sample accuracy. There you will see some interpolated points between actual samples. These are calculated solely by upsampling. No information is lost; you may recalculate the "missing" samples any time. This "upsampling" also happens naturally in the DAC upon conversion to the continuous-time domain.


--------------------
IE4 Rockbox Clip+ AAC@192; HD 668B/HD 518 Xonar DX FB2k FLAC;
cabbagerat
post May 19 2008, 08:33
Post #65





Group: Members
Posts: 1018
Joined: 27-September 03
From: Cape Town
Member No.: 9042



QUOTE (MLXXX @ May 18 2008, 07:29) *
I've noticed in several other threads on other forums that when a "What about different phases?" question is raised, it is dealt with by reference to steady waveforms and Nyquist. The argument goes that you can represent waveforms accurately by sampling at twice the maximum frequency of the Fourier series for a particular source. What is not dealt with is how well interactions between waveforms from independent sources, with continuously varying phase relationships, are represented. (I imagine I would not be in a position to understand a detailed mathematical explanation anyway!) Perhaps my query does seek to explore the "turn-on condition".
You need to read some of the background theory, because I am not sure I can explain this clearly in a forum post. Essentially, the idea is that the sum of two bandlimited signals is a bandlimited signal. Therefore, in an ideal (no quantization, no clipping) system, if x is properly sampled and y is properly sampled, then x+y will be properly sampled. With clipping and quantization, this becomes a little more grey, because (as detailed in 2Bdecided's post) we can't really assume the system is linear any more - but it's probably close enough. But the fact remains: there are no bandlimited signals whose "continuously varying phase relationships" cannot be captured by a sampled system - within the limits of the system SNR. It might seem logical that there are, but there really aren't.

As for the turn-on condition - this is the question of: if your first discrete sample is sample x[0] of x(0), then what do you assume x[-1] to be during the reconstruction process? There is a mathematically correct way of doing it, and there is the way it's done in real systems.
QUOTE (MLXXX @ May 18 2008, 07:29) *
QUOTE (2Bdecided @ May 15 2008, 20:34) *

To summarise, the systems involved are linear, and it doesn't make any difference whether you have 6 ADCs and a digital mixer, or an analogue mixer followed by 1 ADC.


I do not understand the beginning of the explanation, as these formulae appear to anticipate the conclusion reached:

Situation 1: 2 ADCs, digital mixing
final output = f(x) + f(y)

Situation 2: analogue mixing, 1 ADC
final output = f(x+y)


They seem to be declarations that a sampled output resolves to the same thing as an analogue output, for bandlimited input.
Yes, as 2Bdecided said in his (excellent) post - the process is for the most part linear. If f(x) is a linear function, then f(x+y) = f(x)+f(y) and f(ax) = af(x) for constant a. The post goes on to develop an argument for why the sampling process can reasonably be considered to be linear - hence these relationships hold. Obviously this only holds below clipping and above the noise floor - but that is a fair enough assumption about *reasonable* signals.

Please read his post again.
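The "2 ADCs + digital mixer" versus "analogue mixer + 1 ADC" equivalence can be sanity-checked numerically, since an ideal sampler just evaluates the signal at t = n/FS and evaluation is linear. A sketch under ideal assumptions (no quantization or clipping; the tone frequencies and the 0.7 rad phase offset are arbitrary choices of mine):

```python
import math

FS = 44100  # sample rate in Hz

def x(t):  # one bandlimited source: a 19 kHz tone
    return math.sin(2 * math.pi * 19000 * t)

def y(t):  # an independent source: 20 kHz, with an arbitrary phase offset
    return math.sin(2 * math.pi * 20000 * t + 0.7)

def sample(sig, n):  # an ideal ADC: evaluate the signal at t = n / FS
    return sig(n / FS)

for n in range(1000):
    two_adcs_then_digital_mix = sample(x, n) + sample(y, n)
    analogue_mix_then_one_adc = sample(lambda t: x(t) + y(t), n)
    # Sampling commutes with addition: the two paths agree exactly.
    assert two_adcs_then_digital_mix == analogue_mix_then_one_adc
```

Whatever phase relationship the two sources happen to have at each instant t = n/FS is what both paths record; linearity means the mix cannot lose it.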


--------------------
Simulate your radar: http://www.brooker.co.za/fers/
MLXXX
post May 19 2008, 12:06
Post #66





Group: Members
Posts: 207
Joined: 25-February 08
From: Australia
Member No.: 51585



Rereading the post leaves me with the same impression. The conclusion of 2Bdecided's (excellent) post appears to flow from the mathematical basis it establishes at the beginning.

I note that in the analogue domain the sources to be mixed are not as severely bandlimited as they end up being when converted to the digital domain (assuming use of microphones that respond to frequencies exceeding 22050Hz, and assuming the use of a nominal digital sampling rate of 44.1KHz).

This difference between the bandwidths of the analogue and digital mixing processes must, I presume, be contemplated in the equations used at the beginning of the presentation, and must be considered to have no ultimate impact.

QUOTE (Martel @ May 19 2008, 04:25) *
Try zooming in on the waveform in Cool Edit down to sub-sample accuracy.
With this particular sample clip, the 96KHz and 48KHz waveforms (at a given elapsed time from the start of the clip) often differ dramatically, presumably because there is so much content above 24KHz in the 96KHz version.

But I can see that if a continuous high-frequency sine wave not far below the Nyquist limit were being sampled, one could verify performance near the Nyquist limit by inspection of the Cool Edit waveform graphs, and this would be an interesting exercise. The waveform would approximate a sine wave, possibly with a bit of phase delay introduced by digital filtering. I guess the phase delay could be observed by generating a waveform at 10.5KHz with a weak 2nd harmonic and observing the [average] displacement of the zero crossing of the 21KHz component relative to the zero crossing of the fundamental, though I've never tried this.


[Will upload my sample clip if possible within the next 24 hours.]

This post has been edited by MLXXX: May 19 2008, 14:29
pdq
post May 19 2008, 16:10
Post #67





Group: Members
Posts: 3450
Joined: 1-September 05
From: SE Pennsylvania
Member No.: 24233



Let me see if I can provide an analog-domain equivalent to what we are discussing (and somebody correct me if I'm wrong).

Let's say that you start with some waveform, and then you add a 22.05 kHz sine wave to it. Now lowpass the result to 22049 Hz.

You will now have one of two things. Either the original waveform had no content above 22049 Hz, in which case you have back the original waveform, no matter how complex it was; or else the original waveform had content above 22049 Hz, in which case you now have intermodulation products between the original waveform and the 22.05 kHz sine wave.

When you translate this to A/D conversion followed by D/A conversion and bandwidth limiting the result is exactly the same except for clipping and quantization.


Apparently this only applies if you are multiplying by a 22050 Hz sine wave.

This post has been edited by pdq: May 19 2008, 17:24
2Bdecided
post May 19 2008, 16:56
Post #68


ReplayGain developer


Group: Developer
Posts: 5362
Joined: 5-November 01
From: Yorkshire, UK
Member No.: 409



QUOTE (pdq @ May 19 2008, 16:10) *
Let me see if I can provide an analog-domain equivalent to what we are discussing (and somebody correct me if I'm wrong).

Let's say that you start with some waveform, and then you add a 22.05 kHz sine wave to it. Now lowpass the result to 22049 Hz.

You will now have one of two things. Either the original waveform had no content above 22049 Hz, in which case you have back the original waveform, no matter how complex it was; or else the original waveform had content above 22049 Hz, in which case you now have intermodulation products between the original waveform and the 22.05 kHz sine wave.
Why would you have intermodulation products? Is this analogue circuit broken or something?

As long as everything is working, and you choose a sensible filter (let's say 20kHz), you won't know whether you added a 22.05kHz sine wave before filtering or not. It won't interact with anything, and it'll be gone after you filter.

Cheers,
David.
greynol
post May 19 2008, 17:06
Post #69





Group: Super Moderator
Posts: 10338
Joined: 1-April 04
From: San Francisco
Member No.: 13167



Key word here is product.

Simply summing two signals will not result in intermodulation.


--------------------
Your eyes cannot hear.
pdq
post May 19 2008, 17:08
Post #70





Group: Members
Posts: 3450
Joined: 1-September 05
From: SE Pennsylvania
Member No.: 24233



I could be wrong about this, but I thought that when you sum two frequencies the waveform is the same as if you had the sum and the difference of the two frequencies, and that when you filter out the sum of the frequencies you are left with the difference, which is an intermodulation product.
greynol
post May 19 2008, 17:18
Post #71





Group: Super Moderator
Posts: 10338
Joined: 1-April 04
From: San Francisco
Member No.: 13167



You have to multiply the two signals or subject them to some other non-linear process during the summation in order to get sum and difference frequencies.
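The sum/product distinction can be verified numerically: a plain sum contains only the original two frequencies, while a product literally is a difference tone plus a sum tone, by the product-to-sum identity. A quick check (frequencies chosen to match the 19/20 kHz example elsewhere in the thread):

```python
import math

f1, f2 = 19000.0, 20000.0  # Hz

for n in range(1000):
    t = n / 96000
    a = 2 * math.pi * f1 * t
    b = 2 * math.pi * f2 * t
    # Multiplying the two sines IS a pair of new frequencies:
    # sin(a)*sin(b) = 0.5*(cos(a-b) - cos(a+b)), i.e. 1 kHz and 39 kHz here.
    product = math.sin(a) * math.sin(b)
    sum_and_difference = 0.5 * (math.cos(a - b) - math.cos(a + b))
    assert math.isclose(product, sum_and_difference, abs_tol=1e-9)
    # A plain sum, sin(a) + sin(b), needs no such rewriting: it still
    # contains only f1 and f2, so filtering one out leaves the other intact.
```

This is why a linear mixer (summation) creates no intermodulation, while any multiplicative or otherwise non-linear stage does.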

This post has been edited by greynol: May 19 2008, 17:33


--------------------
Your eyes cannot hear.
pdq
post May 19 2008, 17:25
Post #72





Group: Members
Posts: 3450
Joined: 1-September 05
From: SE Pennsylvania
Member No.: 24233



Sorry, post corrected.
MLXXX
post May 19 2008, 18:11
Post #73





Group: Members
Posts: 207
Joined: 25-February 08
From: Australia
Member No.: 51585



QUOTE (Martel @ May 19 2008, 04:25) *
A 44 kHz digital waveform PERFECTLY describes ANY signal (or mixture of signals), including phase, from 0 to 22049 Hz, if you do not consider the distortion caused by the finite number of amplitude quantization steps.
Just looking at the waveform, you might get suspicious about accuracy at frequencies near the Nyquist one, since the signal hardly gets 3-4 samples per period. Try zooming in on the waveform in Cool Edit down to sub-sample accuracy. There you will see some interpolated points between actual samples. These are calculated solely by upsampling. No information is lost; you may recalculate the "missing" samples any time. This "upsampling" also happens naturally in the DAC upon conversion to the continuous-time domain.

I do get suspicious when I look at a digital mixdown of 19KHz and 20KHz sine waves that were created at 44.1KHz. There are so few sample points, and yet, as you say, Cool Edit manages to create a realistic graphical interpolation (with this relatively simple waveform).

In contrast, when I look at 19KHz and 20KHz sine waves created at 96KHz and mixed digitally in Cool Edit, there are so many more sample points in the mixdown that sophisticated interpolation would not be necessary: you could simply join the dots with a most basic form of integration (a resistor and capacitor). The undulations in overall amplitude at a rate of 1KHz appear relatively smooth at this higher sampling rate. I could readily imagine this undulating signal surviving, despite the addition of other high frequency signals into the digital mix, each needing to be 'interpolated'.

This post has been edited by MLXXX: May 20 2008, 09:55
greynol
post May 19 2008, 18:25
Post #74





Group: Super Moderator
Posts: 10338
Joined: 1-April 04
From: San Francisco
Member No.: 13167



Reconstruction using a sinc pulse at every sample is perfect (ignoring quantization error and possible distortion at the edges) so long as the original signal is BW limited to half the sample rate. I am pretty sure this is exactly what cool edit and adobe audition are doing with their graphical representation. The software isn't Spice; it doesn't care about resistors and capacitors.

This is all that needs to be said. The number of sample points used is extraneous and therefore irrelevant.
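The "3-4 samples per period" suspicion can be tested directly: take the 19 kHz + 20 kHz mix sampled at 44.1 kHz and reconstruct it at off-grid instants with the sinc method described above, then compare against the true analogue value. A sketch (the sinc sum is truncated and Hann-windowed, so it is only near-perfect; the tolerance and window choice are mine):

```python
import math

FS = 44100.0

def analogue(t):
    # The 19 kHz + 20 kHz beating pair discussed above
    return math.sin(2 * math.pi * 19000 * t) + math.sin(2 * math.pi * 20000 * t)

def reconstruct(t, half_width=300):
    """Whittaker-Shannon reconstruction from the 44.1 kHz samples,
    truncated to +/- half_width samples and Hann-windowed to keep
    the truncation error small."""
    centre = round(t * FS)
    total = 0.0
    for n in range(centre - half_width, centre + half_width + 1):
        u = t * FS - n
        sinc = 1.0 if u == 0 else math.sin(math.pi * u) / (math.pi * u)
        window = 0.5 * (1.0 + math.cos(math.pi * (n - centre) / half_width))
        total += analogue(n / FS) * sinc * window
    return total

# Compare at arbitrary off-grid instants, well away from any clip edge:
worst = max(abs(reconstruct(t) - analogue(t))
            for t in (0.10001, 0.1000137, 0.1000171))
assert worst < 0.02  # around 1% of the 2.0 full scale, despite ~2.2 samples/period
```

The 1 kHz beat envelope, and the phase of both tones, come back from the "sparse" 44.1 kHz samples; more terms in the sum push the error down further.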

This post has been edited by greynol: May 19 2008, 19:12


--------------------
Your eyes cannot hear.
Martel
post May 20 2008, 08:38
Post #75





Group: Members
Posts: 564
Joined: 31-May 04
From: Czech Rep.
Member No.: 14430



QUOTE (MLXXX @ May 19 2008, 09:11) *
I do get suspicious when I look at a digital mixdown of 19KHz and 20KHz sine waves that were created at 44.1KHz. There are so few sample points, and yet, as you say, Cool Edit manages to create a realistic graphical interpolation (with this relatively simple waveform).
There's really no reason to get suspicious, as there is EXACTLY ONE WAY to fill in the missing samples; there's NO ambiguity. And this is by inserting an arbitrary number of null samples between the actual samples, then applying a digital lowpass filter which eliminates any frequencies at and above the original Nyquist frequency. Well, the results may vary depending on the filter design quality, but the principle is the same. If you have top quality filters, you are able to almost perfectly reconstruct any signal present in a 44.1kHz digital waveform when going into the continuous-time domain (analogue signal). And this holds vice-versa as well (going from analogue to digital), as pointed out in my previous post.
QUOTE (MLXXX @ May 19 2008, 09:11) *
In contrast, when I look at 19KHz and 20KHz sine waves created at 96KHz and mixed digitally in Cool Edit, there are so many more sample points in the mixdown that sophisticated interpolation would not be necessary: you could simply join the dots with a most basic form of integration (a resistor and capacitor). The undulations in overall amplitude at a rate of 1KHz appear relatively smooth at this higher sampling rate. I could readily imagine this undulating signal surviving, despite the addition of other high frequency signals into the digital mix, each needing to be 'interpolated'.

There is no "sophisticated" interpolation involved. I do not call lowpass filtering a sophisticated method. Well, perhaps the filter design itself might be "sophisticated", but the reconstruction process is not.
The samples that are present in the 96kHz wave and not in the 44.1kHz one are simply redundant and bring no additional information at all, since they can be easily (and almost perfectly, considering the digital filtering limits) recalculated.
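The recipe above — insert null samples, then lowpass at the original Nyquist frequency — is exactly how 2x upsampling works. A minimal sketch (the windowed-sinc design, tap count, and test tone are my choices, not a reference implementation):

```python
import math

def lowpass_taps(num_taps=101, cutoff=0.25):
    """Blackman-windowed-sinc FIR lowpass; `cutoff` is a fraction of
    the NEW sample rate, so 0.25 = the original Nyquist after 2x stuffing."""
    M = num_taps - 1
    taps = []
    for n in range(num_taps):
        u = n - M / 2
        s = 2 * cutoff if u == 0 else math.sin(2 * math.pi * cutoff * u) / (math.pi * u)
        w = 0.42 - 0.5 * math.cos(2 * math.pi * n / M) + 0.08 * math.cos(4 * math.pi * n / M)
        taps.append(s * w)
    g = sum(taps)                      # normalize for unity passband gain
    return [t / g for t in taps]

def upsample_2x(samples):
    # Step 1: insert a null sample between every pair of actual samples.
    stuffed = []
    for s in samples:
        stuffed += [s, 0.0]
    # Step 2: lowpass at the original Nyquist; gain 2 restores the amplitude.
    taps = lowpass_taps()
    out = []
    for i in range(len(stuffed)):
        acc = 0.0
        for k, h in enumerate(taps):
            if 0 <= i - k < len(stuffed):
                acc += h * stuffed[i - k]
        out.append(2.0 * acc)
    return out

# A tone at 0.1 of the old rate; after upsampling it should match the same
# tone at 0.05 of the new rate, delayed by the filter's 50-sample latency.
orig = [math.sin(2 * math.pi * 0.1 * n) for n in range(200)]
up = upsample_2x(orig)
for m in range(150, 250):  # middle region, away from edge effects
    assert abs(up[m] - math.sin(2 * math.pi * 0.05 * (m - 50))) < 0.01
```

The recovered in-between samples agree with the ideal tone to well under 1% here, which is the sense in which the extra 96 kHz samples are "redundant": they can be recalculated from the lower-rate ones to within the filter's limits.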


--------------------
IE4 Rockbox Clip+ AAC@192; HD 668B/HD 518 Xonar DX FB2k FLAC;