Alt-Preset Standard/Extreme Flamebait
idfubar
post Apr 30 2003, 18:53
Post #1





Forgive me, but a quick question:

Is there any way to determine the difference between alt-preset standard/extreme settings mathematically?

The reason I ask is that a CD is limited to a signal-to-noise ratio of 96 dB. Since the limiting factor in encoding is the fidelity of the source, is there some way to determine a "threshold" where using extreme vs. standard can be shown mathematically to be advantageous (i.e. shown to give an audible improvement in resolution)?
Doctor
post Apr 30 2003, 19:27
Post #2





I'll byte (IANA Lame developer).

General question: is there a general measurement of SNR for aps/ape? Answer: no, because psychoacoustic coding tries to maintain the listener's experience, not acoustical properties (cf. codec comparisons by frequency graphs). There may be very vague limits.

Specific question: how does MP3 fidelity relate to CD fidelity? Answer: the 96 dB you quoted represents the bit depth (16) of CD audio. MP3 drastically transforms the signal, and then sends as many bits as it finds necessary. It is unlikely to send more than it got originally, so generally MP3 offers less fidelity than CD.

More specific question: can one choose the compression level mathematically? Answer: probably not. Since the coding is psychoacoustic, the question is not "is this sufficient SNR" but "will I hear any difference", and the codec is already trying very hard to make sure you do not. So the compression level relates more to your equipment/ears/bandwidth requirements than to the material.
DigitalMan
post Apr 30 2003, 20:52
Post #3





IAANA L.A.M.E. developer - feel free to correct any of this if you know the technology better:

The s/n ratio is just one of the measures of sound quality. It is a steady state measurement that does not directly apply to a lossy codec.

CD uses linear pulse code modulation, which means each sample is coded linearly. In the case of CD the linear resolution is 16 bits, yielding 2^16 possible amplitude levels, which translates to a range (ratio) of about 96 dB.

My understanding is that MP3 actually uses 24 bits of equivalent precision as a format and is not linear, so theoretically you could code a signal 144 dB above the smallest representable level. This would mean that the 16-bit dynamic range of CD fits easily into the 24-bit range of MP3: you should not lose any dynamic range at all when encoding to MP3.
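
For anyone who wants to check the 96 dB and 144 dB figures, here is a minimal sketch of the arithmetic in Python. The function name is made up for illustration; it just takes the ratio between full scale and one quantisation step (2^bits) and expresses it in decibels.

```python
import math


def pcm_dynamic_range_db(bits: int) -> float:
    """Ratio between full scale and one quantisation step, in dB.

    Hypothetical helper for illustration: 20 * log10(2 ** bits).
    """
    return 20 * math.log10(2 ** bits)


# 16 bits is CD audio; 24 bits is the equivalent precision quoted for MP3.
for bits in (16, 24):
    print(f"{bits}-bit: {pcm_dynamic_range_db(bits):.1f} dB")
```

Each extra bit adds roughly 6 dB, which is where the rounded 96 and 144 come from.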

However, to reduce the data rate MP3 dynamically compromises the s/n ratio to lower the typical resolution while allowing the quantization artifacts ("noise") to be as high as possible without becoming audible in the presence of a signal (music). So the internal precision of MP3 dynamic range may be 144dB but in practice it would be significantly lower, although the nature of the perceptual codec is designed to mask the noise so that you can't hear it.

Bottom line: dynamic range (s/n ratio) is not typically a problem for MP3. The typical difference between MP3 bit rates lies in how well the encoder handles difficult "killer" signals, which cause it to screw up in unsubtle ways, or in high-frequency bandwidth, not in higher steady-state noise levels. I recommend browsing the MP3 area on this forum for background on what you "give up" at lower bit rates.


--------------------
Was that a 1 or a 0?
Gabriel
post May 1 2003, 13:38
Post #4


LAME developer


SNR is not relevant for a lossy codec using psychoacoustics.
In a lossy codec not using psychoacoustic properties, like a dpcm codec, it might be relevant.

Back to mp3.
SNR is not relevant, because the job of the codec is to use, at any moment, the lowest SNR for a given frequency band that is not noticeable to a human ear. The lower the SNR, the more bits are saved.

There is no theoretical upper limit to the SNR of MP3 either, because it uses as many bits as needed (as determined by the psychoacoustic model), so it is not limited to 16, 24 or even 32 bits. It has unlimited resolution.

I know that several "hi-fi" magazines have tried to measure the SNR of MP3. This is just an indication that they do not understand the underlying technology. (No offence intended; everyone has their own field of knowledge.)

The dynamic range of the codec can, however, be measured. For LAME it should be around 110-120 dB.
If I remember correctly, someone here measured it in the past.
Gabriel
post May 1 2003, 13:44
Post #5


LAME developer


Here is the link:
http://www.hydrogenaudio.org/forums/index....namic,and,range
DickD
post May 1 2003, 14:16
Post #6





Your original guesses aren't sufficient to tell the difference between standard and extreme. In fact, audible quality is barely different: most artifacts come from flaws in the psychoacoustic model (or limitations of the MP3 format), so most of the extra bits used by extreme are not directed at fixing them, because the encoder doesn't think there are any flaws. Extreme might simply happen to make the artifacts quieter, though they're still audible. That's why standard is the standard. Insane is the best that MP3 can do, and is recommended for problem samples where LAME APS (or APS-Z) still has problems.

One difference that you might be able to analyse mathematically is the low-pass filter frequency. For music with insufficient high frequency content this won't work, but standard uses a fixed 19.0 kHz lowpass (polyphase, transition band 18671-19205 Hz), and extreme uses a 19.6 kHz lowpass (polyphase, transition band 19383 - 19916 Hz). Encspot will read the lowpass value from the LAME header, and 19000 Hz or 19600 Hz are the values for standard and extreme respectively. There's likely to be no audible difference on most samples.
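
As a rough sketch of that check in Python: given a lowpass value that some tool (EncSpot, for instance) has already read from the LAME header, compare it against the two figures quoted above. The function name and the tolerance are assumptions made up for illustration, and reading the header itself is not shown. A match is only a hint, since other LAME settings could in principle produce the same lowpass.

```python
# Lowpass values in Hz for the two presets, as quoted in the post above.
PRESET_LOWPASS_HZ = {"standard": 19000, "extreme": 19600}


def guess_preset(lowpass_hz, tolerance_hz=50):
    """Return the preset whose quoted lowpass matches, or None.

    Hypothetical helper: tolerance_hz is an arbitrary allowance for
    rounding in whatever tool reported the value.
    """
    for name, expected in PRESET_LOWPASS_HZ.items():
        if abs(lowpass_hz - expected) <= tolerance_hz:
            return name
    return None


for value in (19000, 19600, 16000):
    print(value, "->", guess_preset(value))
```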

BTW, a lot of people use MP3 for encoding because they believe they're restricted to MP3 for a portable hardware MP3 player. If so, you might actually be able to use MP2 as well using the same decoder (and usually get away with renaming to MP3 if the MP2 extension is not recognised). TooLame can encode .MP2, which isn't a transform codec, and won't suffer the same types of artifact as MP3 if MP3 simply isn't good enough. It's available on the MP3 page of Rarewares.hydrogenaudio.org. I've heard that this works pretty well at very high bitrates (such as 384 kbps) and has pre-echo and transient resolution closer to MusePack (but it's less efficient than MPC and lacking in joint stereo). Its psychoacoustics aren't mature, well optimized and well-tested like LAME's, but you could try a CBR 224, 256, 320 or 384 kbps commandline like
CODE
toolame -p 3 -b 256 file.wav file.mp2

or try VBR (though it's not guaranteed to be supported by the decoder) such as
CODE
toolame -p 3 -v 5 file.wav file.mp2
for VBR bitrates probably a fraction above those of lame --alt-preset extreme -Z. I wouldn't say either of these toolame settings is a secure mode, but it's a plausible alternative to Lame insane or extreme for portable players when you still have an artifact.
2Bdecided
post May 1 2003, 16:20
Post #7


ReplayGain developer


Hi DickD!

I'm still having fun with mp2. At low bitrates (e.g. 128kbps), I prefer toolame -p 2. (or -p 4 - sounds the same to me, but I haven't tested it much).

At high bitrates, I can't hear a difference between the psychoacoustic models. On the version I was testing, -p 3 crashed anyway.

At any reasonable (i.e. hopefully transparent) bitrate, "true" stereo is essential with mp2 - its joint stereo is quite bad (intensity stereo only), and rarely transparent.


toolame -b 192 -p 2 -m s
is really good. However, I'm not sure
1) what it's doing above 16kHz (not a lot, I think - but I can't hear that high anyway)
2) if it's coping with harpsichord music very well (I think it maybe isn't - but neither does lame mp3)

Whatever - as you say, it's worth a try. mp2s play fine on my CD-mp3 walkman. CBR only - almost nothing plays mp2 VBR so it's not really worth the effort. :(

If anyone else is thinking of trying it, I'd suggest 192kbps or 256kbps for starters. Lower is audibly worse. Higher hasn't given any improvement for me - but some serious harpsichord ABXing may change that! (Strangely, I have better and more enjoyable things to do! )

Cheers,
David.

P.S. - mathematical comparison of aps vs ape? Subtract the original from each encoded file (decoded back to WAV), and measure the remaining noise. What does this tell you? Absolutely nothing useful! It doesn't tell you what you can or cannot hear, and the file with the greater (inaudible) noise could even be the one which sounds best. But you asked for a mathematical comparison, and that is one. Now go and listen! ;)
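
A minimal sketch of that subtraction in Python, for the curious. It assumes the original and the decoded file have already been loaded as equal-length, time-aligned lists of samples (loading and aligning are not shown, and decoders normally add a delay that would have to be removed first). The function names are made up for illustration, and, as said above, the number it produces says nothing about what you can hear.

```python
import math


def rms(samples):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def residual_snr_db(original, decoded):
    """SNR in dB of the original relative to the residual (decoded - original).

    Assumes both inputs are equal-length and sample-aligned.
    """
    if len(original) != len(decoded):
        raise ValueError("inputs must have the same number of samples")
    noise_rms = rms([d - o for o, d in zip(original, decoded)])
    signal_rms = rms(original)
    if noise_rms == 0:
        return math.inf   # bit-identical: no residual at all
    if signal_rms == 0:
        return -math.inf  # silence in, noise out
    return 20 * math.log10(signal_rms / noise_rms)


# Synthetic stand-ins for real audio: a 1 kHz tone at 44.1 kHz, and the
# same tone with a small made-up error added to every sample.
tone = [math.sin(2 * math.pi * 1000 * n / 44100) for n in range(4410)]
altered = [s + 0.001 * math.cos(n) for n, s in enumerate(tone)]
print(f"residual SNR: {residual_snr_db(tone, altered):.1f} dB")
```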

This post has been edited by 2Bdecided: May 1 2003, 16:23
DickD
post May 2 2003, 11:08
Post #8





Thanks for the info about MP2 support on portable MP3 players, 2Bdecided, including the lack of VBR support for MP2.

I don't want to carry this too far off-topic, but from the TooLAME psychoacoustic page:

QUOTE
Psychoacoustic Model 3
A re-implementation of psychoacoustic model 1. ISO11172 was used as the guide for re-writing this PAM from the ground up.
Pros: No more obscure tables of values from the ISO code. Hopefully a good base to work upon for tweaking PAMs
Cons: At the moment, doesn't really sound any better than PAM1

Psychoacoustic Model 4
A cleaned up version of PAM2.
Pros: Faster than PAM2. No more obscure tables of values from the ISO standard. Hopefully a good base to work from for improving the PAMs
Cons: Still has the same "warbling"/"Davros" problems as PAM2.


I haven't tried TooLAME extensively (only a couple of tests, and prior to that my previous experience was years ago in CoolEdit96), but this comment about warbling and Davros sounds (meaning 'robotic' for those unfamiliar with Dr.Who) in PAM2, made me shy away from PAM2, though I guess it's only a low-bitrate problem and I chose PAM3 for no particular reason. The rarewares version didn't have PAM4, in fact, so I ought to get the version from the TooLame site if I do any more experiments.

The PAMs clearly aren't fine-tuned for transparency. I do note that tooLAME doesn't include low/high pass filtering functionality yet, so it won't be eliminating frequencies >16 kHz (unless it quantizes that band to all zeroes).

BTW, about the original question in this thread: I'm not sure if it's about comparing two files, or about having one file of unknown origin, perhaps even a decoded WAV, and trying to tell what it is.
2Bdecided
post May 2 2003, 11:38
Post #9


ReplayGain developer


I fear it's me who is missing something in p2 - I'd read those descriptions from the site, and I've found that some vocals sometimes get a strange effect with p2 at 128kbps JS. You could say it's like Davros. But it's vastly preferable to the mess p1 makes of everything else at 128kbps!

At 192kbps -m j, one of them messed up Spahm, and the other messed up applause or fatboy or something (I've lost the note I made). However, using -m s, I couldn't find fault with either p1 or p2 at 192kbps.

As for the lowpass - with real music, I've not seen it encode anything above 16kHz at 192kbps SS p2. At 384kbps it sometimes reaches up to 20kHz - but that's very rare - it's usually cutting at 16-17kHz with very occasional blocks above. I'd guess that the ATH rises very sharply at this point.

Still, it sounds great to me. Maybe someone who can hear HF tones can try it on some of the known HF problem samples.

Cheers,
David.
DickD
post May 2 2003, 15:26
Post #10





Ah, my test VBR MP2 didn't go below 192kbps (don't think tooLAME allows it for VBR) and averaged about 240 kbps, so I didn't detect any artifacts in what was probably a fairly easy sample too. I suspect all PAMs would have sounded pretty good at 192 and above with only subtle defects. If I were serious, I'd probably have given them a good test at lower bitrates, but it was more academic interest in case I eventually get a portable MP3 player and want an alternative with better transients. For now, I'm a happy Musepack --quality 5 --xlevel user.
