
-V2 gives way too high bitrate!?!

I use dBpoweramp to encode FLAC tracks with LAME 3.98r -V2. Usually this results in ~170 to ~220 kbps files (close to 192 kbps CBR quality).

Well, not so for 2 As I Lay Dying albums that I have (think REALLY LOUD metal). When I encode those I get 230-270 kbps and of course the resulting files are quite big.

My question is: is this a bug or completely natural?
Furthermore (if it's not a bug), isn't it just better to encode in 192kbps and get a smaller file, since that's the quality I was going for anyway?

Cheers

Reply #1
Natural. VBR targets quality, not bitrate. Loud metal has a lot of HF in cymbals, etc., to which LAME allocates more bits to achieve a given quality.

If you want a lower bitrate, try -V3; it's not rocket science.

Quote
Usually this results in ~170 to ~220 kbps files

The key word is usually. Again, VBR doesn't have to target any bitrate. Any quoted figures are averages from particular test sets only.
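
To put rough numbers on this: the bitrate reported for a VBR file is just the average over its frames, and the file size follows from that average and the track length. The sketch below uses made-up per-frame bitrates (the `quiet` and `loud` lists and both helper functions are hypothetical illustrations, not LAME output) to show how the same quality setting can land at very different averages.

```python
def average_kbps(frame_kbps):
    """Average bitrate of a VBR stream whose frames all have the same duration."""
    return sum(frame_kbps) / len(frame_kbps)


def estimated_size_mb(avg_kbps, seconds):
    """Approximate size in MB (10**6 bytes), ignoring tags and headers."""
    return avg_kbps * 1000 / 8 * seconds / 1_000_000


# Hypothetical per-frame bitrates: a quiet passage and a cymbal-heavy one.
quiet = [128, 160, 160, 192]
loud = [256, 320, 256, 224]

for name, frames in (("quiet", quiet), ("loud", loud)):
    avg = average_kbps(frames)
    size = estimated_size_mb(avg, 240)  # a 4-minute track
    print(f"{name}: {avg:.0f} kbps average, about {size:.1f} MB")
```

Neither average is "wrong"; each is simply what the chosen quality level cost for that material.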

Reply #2
Quote
Natural. VBR targets quality, not bitrate. Loud metal has a lot of HF in cymbals, etc., to which LAME allocates more bits to achieve a given quality.


I also find that cymbals in metal, punk rock, etc. are usually the first place that noticeable artifacts show up.  Furthermore, I haven't tested this formally or researched the theory of it, but I think based on anecdotal evidence that music with a lot of clipping and/or dynamic range compression (the way REALLY LOUD metal is probably mastered) is pretty hard to encode.

Reply #3
Metal (being loud and "noisy"*) runs into one limitation of MP3 as a format (aside from being difficult to encode by nature).

There are two ways to work around this (a workaround is not a fix): either use something lower than -V2 (like -V3), or use -V2, -V1 or -V0 with the additional -Y parameter.

I prefer not to get into the discussion again of what this parameter does and does not do. If you use it, LAME simply will not use as much bitrate for songs that have lots of frequency content above 16 kHz.

*"Noisy" is defined here as a significant amount of signal at the top end of the frequency range.

Reply #4
Damn! That's what I suspected too (compression). It's just that the numbers seemed a bit ridiculous to my eyes! (~260 VBR for -V2, which I thought corresponded to 192 CBR)

Now I have to try other settings because the files are HUGE as it stands.

Anyway thanks!

Reply #5
You could update to LAME 3.98.4, as previous 3.98 versions tried to work around an FhG decoder bug, which resulted in some bloating. Maybe try to reduce file size with mp3repacker too.

Reply #6
Quote
Furthermore (if it's not a bug), isn't it just better to encode in 192kbps and get a smaller file, since that's the quality I was going for anyway?

Wrong! 192 kbps is NOT a quality; it is a bitrate. Quality at this fixed bitrate could be just about anything.

If you really want the best quality while restricting yourself to this approximate bitrate, then use ABR at 192 kbps (--abr 192) instead.

Reply #7
Quote
If you really want the best quality while restricting yourself to this approximate bitrate, then use ABR at 192 kbps (--abr 192) instead.

Or just try a lower VBR setting, which will produce smaller files that may well have an average bitrate more to your liking.

Reply #8
Quote
There are two ways to work around this (a workaround is not a fix): either use something lower than -V2 (like -V3), or use -V2, -V1 or -V0 with the additional -Y parameter.

I forgot to add that my hearing is pretty awful. Specifically, I can barely hear anything above 17k in this test

Is there an option to cut content above 17k instead of 16k as -Y does?

Reply #9
-Y does not cut content above 16k; it is not a lowpass.

[JAZ] really didn't want to get into this, and I don't blame him.  Unfortunately it's opened up the door for these kinds of false assumptions.

Answering your question, you can try --lowpass 17.

Regarding -Y, doesn't it kick in with -V settings greater than 2 (such as 2.1), or is it at 3 and higher?

Reply #10
Quote
Regarding -Y, doesn't it kick in with -V settings greater than 2 (such as 2.1), or is it at 3 and higher?

The latter.
-Y is always on for -V 3 ... -V 9.999 and off by default for -V 0 ... -V 2.999. (and, there can be a noticeable difference in bitrate between -V 2.999 and -V 3)
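
As a minimal sketch of that rule (the helper name `y_on_by_default` is hypothetical and is not part of LAME; it only restates the ranges given above):

```python
def y_on_by_default(v):
    """Return True if -Y is implied at VBR quality v, per the ranges above."""
    if not 0 <= v < 10:
        raise ValueError("-V expects a value from 0 up to 9.999")
    # Off for -V 0 ... -V 2.999, always on for -V 3 ... -V 9.999.
    return v >= 3


for v in (0, 2, 2.999, 3, 5):
    state = "on" if y_on_by_default(v) else "off"
    print(f"-V {v}: -Y {state} by default")
```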

Reply #11
I hate to continue the discussion about -Y since it seems to be one of the least favourite LAME aspects to bring up, but there doesn't seem to be a definitive description of what it does. The best information I've seen is the topic from 2002 where Dibrom talks about its purpose, which I gathered is essentially to tell LAME not to encode frequencies > 16 kHz if doing so will significantly increase the bitrate. Aside from his statement that this means ~80% of 16 kHz or higher frequencies won't get encoded, I haven't found any more in-depth descriptions of exactly how much content gets cut.

The topic also mentions that a person's frequency response in a test scenario like the egopont test page or the ff123 sweep test isn't necessarily the same as what that person can hear in music, i.e. just because you can hear the 17 kHz beep on the "test your hearing" page doesn't mean you will be able to ABX music with a lowpass of 16 or the Y switch. Consequently, is the only real answer to the -Y question to encode some files without -Y and also with -Y, then try to ABX them and if you can't hear a difference, you might as well use -Y to save space?

Also, the -Y description from --longhelp says that it ignores sfb21 noise like in CBR. Does this mean that in CBR mode, if you're encoding at a bitrate which uses a lowpass of > 16 kHz, LAME will discard 80% of the content at > 16 kHz so as to make the content < 16 kHz sound better? Presumably, if this is the case, it's because CBR mode is so restrictive that encoding frequencies at > 16 kHz would consume the majority of the available bitrate, and thus all the frequencies < 16 kHz would sound significantly worse. As a result, I assume CBR mode would ignore ~80% of the > 16 kHz content so that there would be sufficient bitrate to make the lower frequencies, which are usually more important, sound better.

I hope I'm not too far off track with this, haha.

 

Reply #12
Quote
Consequently, is the only real answer to the -Y question to encode some files without -Y and also with -Y, then try to ABX them and if you can't hear a difference, you might as well use -Y to save space?

Pretty much, yes, though you might want to include -V3 in your testing.
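
For anyone who wants to set up that comparison, here is a rough sketch that only builds the three command lines (it does not run them). The file names are placeholders and the `lame_command` helper is hypothetical; the only LAME options used are `-V` and `-Y`, with the usual `lame [options] infile outfile` ordering.

```python
import shlex


def lame_command(wav_in, mp3_out, v, use_y=False):
    """Build an argument list for one encode; nothing is executed here."""
    args = ["lame", "-V", str(v)]
    if use_y:
        args.append("-Y")
    return args + [wav_in, mp3_out]


# The three variants to ABX against the original (placeholder file names).
variants = {
    "v2": lame_command("track.wav", "track_v2.mp3", 2),
    "v2_y": lame_command("track.wav", "track_v2_y.mp3", 2, use_y=True),
    "v3": lame_command("track.wav", "track_v3.mp3", 3),
}

for name, args in variants.items():
    print(f"{name}: {shlex.join(args)}")
```

Each list can be passed to `subprocess.run` as-is once the input file exists; the resulting MP3s are then compared in an ABX tool of your choice.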

Reply #13
Quote
Consequently, is the only real answer to the -Y question to encode some files without -Y and also with -Y, then try to ABX them and if you can't hear a difference, you might as well use -Y to save space?

Quote
Pretty much, yes, though you might want to include -V3 in your testing.

Well, -V 3 would also cause a drop in bitrate as it relates to overall quality, whereas enabling -Y for -V 0 - 2 would still employ the respective qualities that those settings normally use at the frequencies below 16 kHz (and in the > 16 kHz frequencies that were still included), right?

I hope somebody can shed light on the CBR relationship that is mentioned in the -Y help.

Reply #14
I suggest not fiddling with -Y. If you want a lower bitrate and still keep reasonable quality: -V3.
-V2 -Y is hardly better than -V3 on problem samples. Also, -Y will totally change the MP3 bitrate distribution, and quality is affected, whether you hear it or not.

Another approach would be -V2.99 to -V2.5, as you still encode HF content but less aggressively.

-V3 is a more efficient method than CBR 192k encoding because it isn't restricted. It will use 200k when needed and drop to 160k when not.

Reply #15
I am preparing to put this on the wiki (but my user has to be validated still. Also, if someone with more knowledge sees any error in the technical description, feel free to correct me)


The -Y switch of LAME

The short and concise:

It is not:[blockquote]
    The -Y switch is not a lowpass filter, nor does it remove high frequencies per se.
    It does not filter, and it does not prevent frequencies higher than 16 kHz from existing.
[/blockquote]
It is:[blockquote]
    The -Y switch tells the encoder to use a coarser representation for the higher frequencies in the parts where they would cause an over-encoding of all the other bands.
    The -Y switch tells the encoder not to be so strict with the higher frequencies, *IF* they are going to cause an increase in bitrate.
[/blockquote]


The technical and know how:

Preface: [blockquote]
    MP3 audio is stored in the frequency domain instead of the time domain.
    The frequencies are then subdivided into several bands.
    The values for these bands are quantized to increase their compressibility.
    The scale factor is the amount of quantization. (Higher quantization means less resolution. Resolution means the number of possible values between the minimum and the maximum of the range.)
    There is also another quantizer, the global gain, that affects all bands.
[/blockquote]
The problem, described (based on mp3-tech.org):[blockquote]
    The last scalefactor band (sfb21 for long blocks or sfb12 for short blocks) has no scalefactor of its own.
    This scalefactor band covers the range from 16 kHz up to the upper frequency limit when using a 44.1 or 48 kHz sampling frequency.
    If the resolution of this part of the spectrum must be increased (as determined by the psychoacoustic model), the local scalefactor, which is missing, cannot be used to adjust its resolution.
    To increase the resolution in this case, the only solution is to reduce the global gain value (its quantization). This impacts all other scalefactors, which are also reduced.
    But once they reach a value of 0, they cannot be reduced any further, meaning that a higher than needed resolution will locally be used in those bands, leading to an inflated bitrate.
    When encoding sfb21 content, it is common to encounter some scalefactor bands that are encoded with too high a resolution just to accommodate the coding needs of sfb21.
[/blockquote]

The -Y switch in LAME:[blockquote]
    The -Y switch treats the last scalefactor band differently, and tries not to increase its resolution to the point of causing the sfb21 problem.
[/blockquote]
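
The mechanism described above can be illustrated with a deliberately simplified toy model. Everything in it is an assumption made for illustration: step sizes are plain integers (bigger means coarser and cheaper), one scalefactor step is taken to offset two global-gain steps, the upper limits on scalefactors are ignored, and `plan_steps` is a hypothetical helper that does not mirror LAME's actual code.

```python
import math

SF_WEIGHT = 2  # toy assumption: one scalefactor step offsets two gain steps


def plan_steps(required, sfb21_required, use_y):
    """Pick a global gain and per-band scalefactors for one toy frame.

    required maps each band that has a scalefactor to the coarsest step
    the psy-model tolerates (bigger number = coarser = fewer bits).
    """
    # Coarsest gain the bands with scalefactors can still work with.
    gain = max(required.values())
    if not use_y:
        # sfb21 has no scalefactor, so only the global gain can refine it.
        gain = min(gain, sfb21_required)

    scalefactors, excess = {}, {}
    for band, need in required.items():
        # A scalefactor can only refine a band relative to the global
        # gain, and it cannot go below 0.
        sf = max(0, math.ceil((gain - need) / SF_WEIGHT))
        step = gain - SF_WEIGHT * sf
        scalefactors[band] = sf
        excess[band] = need - step  # > 0: finer (costlier) than required
    sfb21_shortfall = max(0, gain - sfb21_required)
    return gain, scalefactors, excess, sfb21_shortfall


required = {"sfb0": 40, "sfb10": 46, "sfb20": 50}
for use_y in (False, True):
    gain, sf, excess, shortfall = plan_steps(required, 42, use_y)
    print(f"-Y={use_y}: gain={gain} scalefactors={sf} "
          f"excess={excess} sfb21 shortfall={shortfall}")
```

In this toy frame, honouring sfb21 pins two other bands at scalefactor 0 with more resolution than they need; ignoring sfb21 removes that excess and leaves sfb21 itself coarser than the psy-model requested.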

Reply #16
Very short version: the -Y switch makes LAME not encode high frequencies accurately, because doing so can cause disproportionate increases in bitrate.

Reply #17
It was not too long and is well worth reading, benski.  If you don't have the time to devote to the entire post, you should have at least read what was labeled "short and concise".  It would have certainly taken no more time than it took you to post what you did.

I think [JAZ]'s entire post is quite worthy of being in the wiki provided that it is technically accurate (I have no reason to think it isn't technically accurate).  I might be wrong, but I think you've pretty much said the same thing that [JAZ] wrote, but more dense, which might be harder for a lay person to understand.  I don't see any reason why that should preclude what you wrote from being in the article if what you wrote can be used to improve it, however.

Reply #18
Quote
It was not too long and is well worth reading, benski; at least what was labeled "short and concise".  I think it's very worthy to be in the wiki provided that it is technically accurate (I have no reason to think it isn't technically accurate).

Sorry, I meant it as "the very short version" and not as a slight at JAZ at all.  I'll edit so no one takes it the wrong way.

Reply #19
I hope you caught my edits above since I think your "the very short version" looks excellent to me as well, provided that it is later expanded upon a bit in order to reach a wider audience.

I wish I could do more than just cheer from the sidelines.  While I feel comfortable reading what you guys wrote, it is clear that the two of you are much more intimate with the subject.

Reply #20
Quote
I suggest not fiddling with -Y. If you want a lower bitrate and still keep reasonable quality: -V3.
-V2 -Y is hardly better than -V3 on problem samples. Also, -Y will totally change the MP3 bitrate distribution, and quality is affected, whether you hear it or not.

I find this to be a bit counter-intuitive. -V 3 has the Y switch enabled by default, so would it not be true that using -V 2 -Y would still yield superior results to -V 3, since -V 3 is actually using -V 3 -Y?  Also, I think I should point out that saying "quality is affected - whether you hear it or not" is rather irrelevant, as the entire point of MP3 is to reduce quality as long as you don't hear it in the name of saving space. 

As probably the resident "lay" person in this topic, I will attempt to repeat [JAZ]'s technical explanation, and hopefully it will give some idea as to how effective that post would be in the wiki to answer the -Y question for people who aren't experts. I fully expect this summary to be less technically accurate than his post, but the purpose is theoretically to show how much of the explanation's content was absorbed more-or-less correctly by someone unfamiliar with the inner-workings of MP3.

In bullet form:

  • MP3 groups audio data by frequency.
  • Frequency range groups are known as bands.
  • Bands are quantized to make them compress better.
  • "Scale factor" refers to how much quantization (compression) is applied to each band, where higher quantization causes greater compression and consequently less variation between the minimum and maximum values (resolution).
  • Each band has its own scale factor, so that its quantization can be adjusted independently from the others.
  • Global gain is the quantizer that affects all bands simultaneously.
  • The only band without a scale factor is sfb21, which stores frequencies >= 16 kHz.
  • Since sfb21 doesn't have a scale factor, if LAME determines that sfb21 needs more resolution, there is no way for LAME to increase the resolution of sfb21 alone, since there is no scale factor.
  • The only way to increase the resolution on sfb21 is therefore to reduce the global gain, which then increases the resolution and lowers the quantization of sfb21.
  • The side effect of doing this is that since global gain applies to all bands, resolution will be increased and quantization will be lowered on every other band, too.
  • The result is that unnecessary resolution is applied to every other band, so the bitrates used in all the other bands will increase, too.
  • This means that LAME is forced to increase the bitrate of the entire file just so that the frequencies >= 16 kHz will be adequately quantized.
  • -Y, then, does not remove frequencies above 16 kHz like a lowpass would-- it only prevents LAME from reducing the global gain value when the psy-model says it should to achieve the desired quality in the 16 kHz + range. The result is that all the 16 kHz + frequencies still get encoded, but the ones that would normally have needed higher resolution to satisfy the criteria of the psy-model don't receive that treatment, while ones that wouldn't need higher resolution are unaffected by the Y switch.
  • In effect, -Y gives you the same global gain that you would have if you used a lowpass of 16 kHz, but you still get to keep the 16 kHz + frequencies. They just get encoded "as-is", without any extra resolution being given to them by lowering the global gain, which is what LAME would do without -Y.


...so, how far off am I?

The one thing that I'm still hoping someone will further enlighten me on is: does CBR mode employ a tactic like the Y switch by default? That is to say, if you encode at a CBR bitrate where the lowpass filter is above 16 kHz, does LAME focus its efforts on the scale factor bands other than sfb21, so that since the available bits are restricted, it can apply what's available to the bands that are deemed more important? If so, I would assume this means that, like with VBR -Y, CBR mode uses lower resolution in sfb21 because it can't afford to lower the global gain to accommodate sfb21's needs, as CBR bitrate limitations mean that LAME has to be frugal in the way it allocates bits.

I continue to bring this up because I think, if my analysis is correct, it might be worthwhile to add the correlation between -Y and CBR to in effect reassure people that if they are generally satisfied with CBR's frequency range reproduction, they need not fear using -Y in VBR mode, as CBR always has a similar behaviour enabled.

Reply #21
I would like an answer from robert, or someone that has studied this part of the code.

My interpretation of what I've heard is that in CBR mode, the code works in a way that increases the resolution only if enough bits are free and only by the amount the free bits allow. (Effectively, trying to maximize the quality for the given amount of bits.)
In other words, at 320 kbps, it would be working more like -V0 (without -Y), and at 192 kbps it would be working more like -V3 (with -Y).

@Aleron : Your explanation of the subject is correct and correlates to what I've written. I will use part of your text to make the technical part more understandable.

Reply #22

Quote
I would like an answer from robert, or someone that has studied this part of the code.

My interpretation of what I've heard is that in CBR mode, the code works in a way that increases the resolution only if enough bits are free and only by the amount the free bits allow. (Effectively, trying to maximize the quality for the given amount of bits.)
In other words, at 320 kbps, it would be working more like -V0 (without -Y), and at 192 kbps it would be working more like -V3 (with -Y).

Would I be correct in assuming that every bitrate except 320 would probably behave as though -Y was enabled, seeing as 320 is the maximum bitrate and -Y isn't really needed there? If so, perhaps 320 isn't the best CBR example to use in the explanation, as it's sort of the exception to the rule.

I figure there must be some correlation between -Y in VBR and CBR mode, seeing as the LAME long help says:

-Y    lets LAME ignore noise in sfb21, like in CBR

Quote
@Aleron : Your explanation of the subject is correct and correlates to what I've written. I will use part of your text to make the technical part more understandable.

If you feel that I've correctly distilled the essence of -Y's function, feel free to use anything that you think would improve the quality and accessibility of the wiki entry. My paraphrase certainly isn't as precise as your post, but I hope it might be slightly more understandable to the layman who could still be somewhat confused after reading your in-depth explanation.

Reply #23
-Y is still 'experimental'.  It was never supposed to be used by non-developers. This is what Dibrom once said. I think these matters are too technical and will never be fully understood.

Another way to look at it is that the original --alt-preset design was to get quality first, THEN reduce bitrate if possible. APS -Y sort of fits that, but not really: you are telling it that bitrate is more important (in the context of sacrificing HF quality). Preset medium took this even further to get a lower bitrate (sacrificing HF plus a more aggressive ATH).

By using -V2 to -V0 with -Y, you are saying that quality is important but bitrate comes first. That is against the original design. You are sacrificing something. So -V3 targets the old APS -Y user by giving a level between APS (-V2) and medium (-V4). That is more or less what APS -Y did in the old days.

Reply #24

Quote
I would like an answer from robert, or someone that has studied this part of the code.

My interpretation of what I've heard is that in CBR mode, the code works in a way that increases the resolution only if enough bits are free and only by the amount the free bits allow. (Effectively, trying to maximize the quality for the given amount of bits.)
In other words, at 320 kbps, it would be working more like -V0 (without -Y), and at 192 kbps it would be working more like -V3 (with -Y).

CBR/ABR: the target bitrate is given, and the quantization step sizes are adjusted until the target bitrate is reached. LAME does this by increasing or decreasing the global gain and evaluating the quantization noise where sfbs are adjustable. It does not evaluate quantization noise in sfb21!
VBR: the target quantization noise within the sfbs is given, and the quantization step sizes are chosen such that they result in the given quantization noise. At higher quality settings, sfb21 gets evaluated, and setting the global gain is the only way to control the quantization step size for it. Now, adding -Y lets LAME ignore whatever quantization noise will be in sfb21, just as CBR does.
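
The difference in direction between the two modes can be sketched with a toy model. All of it is assumed for illustration: a single global step shared by every band (scalefactors are left out entirely), a made-up cost of one bit saved per step of coarsening, and hypothetical function names that do not correspond to LAME's source.

```python
def frame_bits(levels, gain):
    """Toy cost: each band costs (level - gain) bits, never less than 0."""
    return sum(max(0, level - gain) for level in levels.values())


def cbr_choose_gain(levels, budget):
    """Rate-driven: coarsen the global step until the frame fits the budget.

    Noise in sfb21 is never consulted here.
    """
    if budget < 0:
        raise ValueError("budget must be non-negative")
    gain = 0
    while frame_bits(levels, gain) > budget:
        gain += 1
    return gain


def vbr_choose_gain(allowed, use_y):
    """Noise-driven: coarsest global step every evaluated band tolerates.

    With use_y, sfb21 is left out of the evaluation.
    """
    evaluated = [step for band, step in allowed.items()
                 if not (use_y and band == "sfb21")]
    return min(evaluated)


levels = {"sfb0": 60, "sfb10": 55, "sfb21": 50}   # made-up signal levels
allowed = {"sfb0": 20, "sfb10": 24, "sfb21": 18}  # made-up noise limits

print("CBR gain for a 100-bit budget:", cbr_choose_gain(levels, 100))
for use_y in (False, True):
    gain = vbr_choose_gain(allowed, use_y)
    print(f"VBR -Y={use_y}: gain={gain}, bits={frame_bits(levels, gain)}")
```

In the CBR function the bit budget is the input and the resulting noise is whatever falls out; in the VBR function the noise limits are the input and the bit count is whatever falls out, which is why dropping sfb21 from the evaluation lowers the bits spent.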