Topic: MPC VBR flaws (low volume & ringing) (Read 82651 times)

MPC VBR flaws (low volume & ringing)

I’ve recently read some complaints about musepack and distortions occurring with classical music (examples here and here). There were no ABX tests to confirm them. According to my previous listening tests at ~175 kbps, musepack not only performs very well with various kinds of instrumental and vocal samples, but also better than its competitors. But in the past I’ve also noticed one issue with this audio format that my previous test didn’t reveal, and it’s a very big one. I’d like to bring this problem to the attention of the community, which, as far as I know, hasn’t been warned about this kind of flaw.

Before carrying on, and before some zealous users bare their teeth, I have to make clear that this issue only occurs in specific conditions. The problem is confined to low-volume musical content, and is mainly audible when this content is listened to at a higher playback volume. In other words, affected tracks must have low-volume parts, and tracks with high dynamic range are not really affected (you can’t constantly push the volume on such material: your neighbors won’t appreciate it). The problem becomes really critical with low-volume tracks only. People who have to live with the consequences of the “loudness war” are certainly not used to encountering such tracks, but for classical fans, tracks that are replaygained at +10 dB, +20 dB and sometimes +30 dB are anything but rare (tracks with a corrected gain beyond +25 dB are nevertheless very rare). The encoded material exhibits strong artifacts with ReplayGain set to Track mode (they won’t be audible otherwise, except maybe as a subtle form of distortion, which could explain some recent complaints about musepack and classical music). With RG enabled, even untrained people will be shocked by the terrible ringing that runs through this musical material. MPC with the --standard profile, and to some degree --extreme and even --insane, is apparently not sensitive enough to handle low-volume situations.


At this stage of my account, some people would probably be tempted to claim that such an issue is normal with perceptual encoding, and that all other formats will suffer from the same issue in this specific playback condition. But a quick comparison immediately refutes this idea. I’ve compared musepack --standard to comparable MP3, AAC and Vorbis presets, and these competitors proved able to encode the same material properly (no ringing, flat lowpass at a high level). Even stranger, MP3 at 128 kbps, or Vorbis at 90 kbps (!), or AAC (faac!) at 100 kbps perform *much* better than musepack --standard. In other words, perceptual encoders (at least modern ones) can handle this situation transparently at mid/low bitrates, even with VBR; only musepack fails, and badly. It might be interesting to note that the VBR model is apparently flawed: with --standard, the bitrate drops to unusually low values (110…140 kbps), and quality to an even more abnormal level. An illustration (graphical; listening tests were performed beforehand; click for link) could make things easier to understand:



I’ve also uploaded an additional gallery. The last one looks very weird, and sounds even worse than it looks!


The ringing and the severe lowpass are obvious in these screenshots. Quality is objectively worse than MP3@128; subjectively speaking, the audibility is, as usual, linked to various conditions: hardware, player settings (RG or not), the listener’s sensitivity to ringing. Some users won’t notice it, some others will be frightened. The important point to note here is that other audio formats have no such problem; my purpose wasn’t to make a sterile comparison between MPC and the others. Based on this comparison, I’m tempted to say that MPC could catch up with them with some tuning. Anecdotal point: LAME recently had a serious issue (which also concerns 3.90.3 ABR at mid/low bitrate), and it was recently solved by the developers. I think Gabriel worked on an adaptive ATH threshold, and it might be a lead for the MPC developers or for users who are interested in playing with the current encoder switches.


I’ve uploaded some samples. The gain of a short sample is necessarily different from the gain of the complete track, but I’ve tried to cut samples with a similar gain. The uploaded WavPack samples all carry the native gain and the track_peak of the full track. I’ve also duplicated the track gain to the album gain.

http://guruboolez.free.fr/MPC/quiet_tracks_replaygained.zip

Two extras in this zip file: a piano sample for which the track gain of the sample doesn’t really match the track gain of the full track (+40 dB instead of +25 dB); and a very noisy track with which musepack has no problem, despite the high gain correction.


This report is probably the last one I’ll do for MPC (a developer has stated their lack of interest in improving classical at --standard), but I nevertheless hope it will help to improve the encoder. Playing with the command line (in order to change the ATH or noise sensitivity) might be enough to solve or reduce this issue; therefore, every MPC user could contribute. In the meantime, users should be aware of this issue.

Reply #1
I confirm serious problems under these special listening conditions.

Thank you for making it clearer for me. Now I understand it better!

Reply #2
What I understand from your post guru is that at > insane, this effect isn't significant. I hope this is the case.

Edit: Whoa, this is only noticeable with ReplayGain higher than +20dB or so. The lowest classical on my system is +3dB or so. Any rock/metal/rap/electronica is gained to -9dB sometimes. Looks like I shouldn't really be concerned. Phew.
Acid8000 aka. PhilDEE

Reply #3
I think you should put more creativity into your report. Maybe writing it entirely in haiku, and adding pictures of women in bikini next to the spectrograms


Hehe. Anyway, thank you very much for this very enlightening report, Guru.

Reply #4
I haven't seen this mentioned anywhere, so I thought I'd quote Case from IRC:
Quote
<cse> btw, here's something Klemm says about encoding highly dynamic movie tracks. I think this is valid for classical too:
<cse> 3. Also I suggest to use the option
<cse>        --standard --ath_gain -14
<cse>    with movie soundtracks.
<cse>    For other quality settings
<cse>      --quality x --ath_gain 16-6*x
<cse>    This lowers ATH by 14 dB relative to the standard.
<cse>    Feeding of mppenc should be done with 24 bit when possible.
<cse>    Use a 24 bit AC3decoder.
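
The rule of thumb in the quote can be written out as a small helper. This is only a sketch: the function names `ath_gain_for_quality` and `mppenc_args` are made up for illustration, and the only switches used are the `--quality` and `--ath_gain` ones that appear in the quote itself.

```python
from typing import List


def ath_gain_for_quality(quality: float) -> float:
    """ATH gain in dB from the quoted rule: 16 - 6 * quality."""
    return 16.0 - 6.0 * quality


def mppenc_args(quality: float) -> List[str]:
    """Build the option list from the quote (hypothetical helper)."""
    gain = ath_gain_for_quality(quality)
    # ":g" drops a trailing ".0", so quality 5 yields "5" and "-14".
    return ["--quality", f"{quality:g}", "--ath_gain", f"{gain:g}"]


if __name__ == "__main__":
    for q in (4, 5, 6, 7):
        print(" ".join(mppenc_args(q)))
```

For quality 5 the rule gives -14, which agrees with the `--standard --ath_gain -14` line in the quote.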

Reply #5
The obvious workaround is to check the track gain before encoding, then adjust the ATH level according to the gain.

The fix for the encoder would be to adjust its ATH level dynamically. For a VBR encoder this is very important, as you cannot rely on the target bitrate "safeguard" as in CBR.
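
To make the workaround concrete, here is a rough sketch of how a track gain could be turned into an ATH adjustment before encoding. Everything in it is an assumption for illustration: the function name, the linear mapping, the 10 dB threshold and the 30 dB cap are arbitrary placeholders, not tuned values and not anything the encoder does itself.

```python
def ath_gain_from_track_gain(track_gain_db: float,
                             threshold_db: float = 10.0,
                             max_lowering_db: float = 30.0) -> float:
    """Suggest an ATH gain (dB, <= 0) from a ReplayGain track gain.

    ASSUMPTION: lower the ATH by 1 dB for every dB of track gain above
    `threshold_db`, and never by more than `max_lowering_db`.
    """
    excess = max(0.0, track_gain_db - threshold_db)
    lowering = min(excess, max_lowering_db)
    # Avoid returning -0.0 for tracks that need no adjustment.
    return -lowering if lowering else 0.0


if __name__ == "__main__":
    for gain in (-9.0, 3.0, 20.0, 25.0):
        print(f"track gain {gain:+.1f} dB -> ATH gain {ath_gain_from_track_gain(gain):+.1f} dB")
```

Loud tracks (negative or small positive track gain) are left alone, so only the quiet tracks would pay the bitrate cost of a lower ATH.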

Reply #6
Quote
This report is probably the last one I’ll do for MPC (a developer has stated their lack of interest in improving classical at --standard), but I nevertheless hope it will help to improve the encoder.


I'd like to know where you got that claim.
To make things clear, the encoder side is unmaintained. I'm not aware of anyone actually trying to improve it, or increase its transparency even for non-classical music. (feel free to prove me wrong)
Recent releases were bugfixes or nice additions, but no changes were made to the psy-model itself. Some extremely minor patches for tag writing will come soon and that will be all.
So unless Klemm gives some input, nothing will come out of this stuff. The codec is just fading away, losing its relevance little by little (and all mpc haters drop tears of joy and happiness).
It's a 'Jump to Conclusions Mat'. You see, you have this mat, with different CONCLUSIONS written on it that you could JUMP TO.

Reply #7
guruboolez, thank you very much for your input. I think your findings are very valuable.

I downloaded your sample set and tried to test according to Gambit's (Case's ? Frank's ? ) suggestions, but it seems my hearing will need a few more days of peace after I went to The Mars Volta concert on Tuesday.

Reply #8
Quote
Quote
This report is probably the last one I’ll do for MPC (a developer has stated their lack of interest in improving classical at --standard), but I nevertheless hope it will help to improve the encoder.

I'm not aware of anyone actually trying to improve it, or increase its transparency even for non-classical music. (feel free to prove me wrong)
Recent releases were bugfixes or nice additions, but no changes were made to the psy-model itself.


Well, I've been working on encoder changes, including a complete rewrite, but it has kind of taken a back seat to my player for the moment.  I've made many changes to the psymodel code (many functions have been completely rewritten), although they are speed optimization oriented and the output is made to be identical.  I haven't gotten far enough to really be concerned with changing things at the quality level yet.

I think that if someone could get feedback from Frank about the best way to handle this (i.e., algorithms, etc.), I could try to implement his ideas since it seems maybe nobody else will.

Quote
The codec is just fading away, losing its relevance little by little (and all mpc haters drop tears of joy and happiness).


I think a lot of this has to do with a certain lack of visibility.  It also seems that many people (not all of course) involved with MPC don't get along too well with HA these days either, which sort of leads to a strained relationship as far as potential developers might be concerned.

I would personally be a bit sad to see MPC fade into irrelevancy because of lack of development since it was the first codec I used that I was really impressed with once I started paying serious attention to encoding quality.  For the most part, it's still one of the best too.

@guruboolez: Thanks for the report.

Reply #9
Quote
The codec is just fading away, losing its relevance little by little (and all mpc haters drop tears of joy and happiness).


I honestly don't see that as a surprise (but that's maybe because I would probably fall into your definition of "mpc hater")

The codec's biggest selling points were always quality and speed. While these features really set MPC apart during its heyday (2000~2002), nowadays the distinction with other codecs isn't that obvious. In 2001 Vorbis was still at its release candidates, and was slow. We had no Nero or iTunes, so the only option for us AAC lovers was the painfully slow Psytel. Lame represented a format that was probably already at the end of its improvement potential, and was quite slow as well.

Since then, Vorbis reached 1.0 and later 1.1. Quality improved a lot, and Lancer showed us you can have very, very fast Vorbis encoding with minimal quality tradeoffs. iTunes and Nero AAC were released, bringing AAC quality to a whole new level and making encoding much faster while at it. And the Lame developers seem set on amazing us with each new release, pulling MP3's quality much beyond what everybody thought would be the limit. And, of course, speed improved a lot there as well.

Now that the other formats are managing to catch up with MPC in its selling points, its limitations are starting to become evident, as the advantages no longer make up for them. Lack of hardware support, lack of multichannel support, can't be used with movies, can't be split and merged, patenting situation is unclear, development stalled... the list is long.

Reply #10
Quote
I would personally be a bit sad to see MPC fade into irrelevancy because of lack of development since it was the first codec I used that I was really impressed with once I started paying serious attention to encoding quality.  For the most part, it's still one of the best too.
I think, quite a lot of people would... It was exactly the same with me and I still use it exclusively.

Quote
I think that if someone could get feedback from Frank about the best way to handle this (i.e., algorithms, etc.), I could try to implement his ideas since it seems maybe nobody else will.
Thank you, Dibrom.  Even *if* it doesn't work out.


[ Off topic: where's that darn "beer" emoticon ?  ]

Reply #11
Thanks again for that summary, guruboolez. I already sent an email to Frank Klemm about this yesterday. I'm pretty sure he'll send me some comments about it, but as always, it may take a while.

Quote
Off topic: where's that darn "beer" emoticon ?


Ah yes... it got lost during some forum software updates. There, I added it again for you.

Reply #12
Frank replied from work that he will comment as soon as he has some free time.

Reply #13
My sincerest thanks to everyone involved.

Reply #14
Quote
Quote
<cse> btw, here's something Klemm says about encoding highly dynamic movie tracks. I think this is valid for classical too:
<cse> 3. Also I suggest to use the option
<cse>        --standard --ath_gain -14
<cse>    with movie soundtracks.
<cse>    For other quality settings
<cse>      --quality x --ath_gain 16-6*x
<cse>    This lowers ATH by 14 dB relative to the standard.
<cse>    Feeding of mppenc should be done with 24 bit when possible.
<cse>    Use a 24 bit AC3decoder.

Thanks for quoting this information.
--ath_gain -14 works very well, and solves all issues (I didn't carefully try to hear the smallest differences). Good news: bitrate inflation is apparently very limited for most tracks (except low-volume ones, of course).

Also interesting to note: applied to the --radio profile, this additional switch increases quality by reducing the level of audible artifacts (classical samples). Bitrate nevertheless inflates by 15...20 kbps. But it seems that with classical music --radio --ath_gain -8 performs better than --quality 4.xx (at comparable bitrate). Apparently, musepack suffers from ATH issues at lower profiles (< --standard), and could maybe benefit from tuning in this area to improve overall quality.

Reply #15
Quote
This report is probably the last one I’ll do for MPC (a developer has stated their lack of interest in improving classical at --standard), but I nevertheless hope it will help to improve the encoder.

Guruboolez, thanks for the detailed report on this issue in (still) my favorite lossy encoder. Please don't be put off too quickly by the hesitant reaction of "mpc development". We know how the situation is.
The "workaround", as posted by Gambit, alone may have been worth it. Maybe this, which looks like a fixable issue, can spark the interest of developers to have a go at it?

Quote
I've made many changes to the psymodel code, although they are speed optimization oriented and the output is made to be identical.
I'm surprised, I always thought the speed was very good.

/related rant
It was my feeling that Frank Klemm, at the time he released the last alphas, was very concerned about bitrate bloat. Also, the ATHs were redefined when the --quality scales were introduced. Maybe this issue crept in at the same time (but maybe it was waiting all the time to be brought to the fore by Guruboolez)

BTW Replaygain of +20dB is pretty extreme to me. I'm always cautious with positive RG because it can result in unwanted clipping if there are peaks in the same track/album.
I am aware that in this case it's just an indication when the reported issue occurs. And, apart from RG, someone could just play it very LOUD and maybe notice the same thing.
In theory, there is no difference between theory and practice. In practice there is.

Reply #16
dibrom's speed enhancements were focused on PPC/etc AFAIK


later

Reply #17
Quote
dibrom's speed enhancements were focused on PPC/etc AFAIK


A significant portion of them were, but the later changes I've made should improve speed on x86 also, though the gains should be smaller.

A complete rewrite with a more modular codebase could probably allow for a lot more significant optimizations with little hassle (in addition to other pluses like easier maintenance, and an easier transition to a different bitstream like SV8), which is what I was starting on right about the time I shifted to working more on my audio player.

I don't know when I'll release these changes, but input from Frank about possible quality fixes should be pretty independent of most of what I've done so assuming that it's not a huge hassle to fit into the current psymodel (and I can't see why it would), then it should be pretty simple to implement and release.

Reply #18
As promised, here is the answer that I got from Frank Klemm today. I translated it from German.


The calculated masked threshold is indeed depending on the level. It changes if lower levels
are approached. This modification was made sometime between encoder version 1.06 and 1.1.

With high levels, the NMR (noise-to-mask-ratio) was raised by 0.5 dB, with low levels,
it was lowered. The masked threshold (ATH) was lowered by 6 dB in total.

The original behavior was that, up to a certain threshold, things were coded with full NMR,
and after that it would suddenly get muted. A signal around that switching threshold
produced audible artifacts, despite the fact that many bits were used for coding.

The current behavior is that the coding gradually gets worse with very low levels and there's
almost no usable signal in the end. Only when this point is reached, the coding is stopped.

When you're looking at the error signal over the signal strength, there's a slowly declining
function that approaches the ATH from above. The old behavior first caused the error signal to
fall ca. 20 dB below the ATH and only raise to ATH-level again when the coding was stopped.

Extensive listening tests with headphones were conducted (headphones because of the high
listening level). For listening material, among others, the Bolero by M. Ravel was used.
Volume was adjusted to ca. 114 dB SPL at -0 dB signal strength.

At this volume, noise in the recording and quantization artefacts are already an issue with
many 16 bit recordings. As long as this level is not (clearly) exceeded, the quality of the
coding was clearly better, despite the lower bitrate (even though the NMR was raised by 0.5 dB
and the ATH lowered by 6 dB, there are spare bits with almost every kind of music).
The fluctuation in the coding - which was caused by activation and deactivation of subbands -
disappears.

But if you turn up the volume clearly above this level (ca. from 120 dB SPL at -0 dB signal
strength on), you hear the coding errors which are then pretty different from the older versions.

Now, if you disregard the question "what good are replaygains above +10 dB?" (with classical
music, only album-based replaygain should be used anyway), the problem can be solved by
lowering the ATH. It will result in a slightly higher bitrate.

Is this problem relevant for daily use in any way? I dare say "no".
For most pop titles, you can increase the ATH by 30 dB and still not notice anything.
Even with classical music, 10 dB are often possible.

A clean solution is not possible with a 1-pass-coder; you would first need a rough
volume estimation of the whole song to estimate the maximum position of the volume knob -
and even then, you could still re-adjust during the title.

Furthermore, I would recommend corrections within Replaygain. A "quick-to-hack" solution
would be that the title-based replaygain of neighboring tracks in an album must not
differ by more than 6 dB.

From these (calculated) values:

- 7,81 dB
- 6,41 dB
- 7,61 dB
+4,81 dB
- 8,11 dB
- 6,12 dB
+1,12 dB
- 9,12 dB

you will then get:

- 7,81 dB
- 6,41 dB
- 7,61 dB
- 2,11 dB        // limited to -8,11 + 6
- 8,11 dB
- 6,12 dB
- 3,12 dB        // limited to -9,12 + 6
- 9,12 dB


Then, short voice tracks/interludes/preludes etc. don't get boosted to +40 dB anymore.
Because this is currently the only limit: Replaygain values of more than +40 dB are
simply reduced to 0 dB (not really that clean either). This limit should also be
reduced to +12 dB (corresponds to K-26).

If this proposal is taken up, I could send some reasonably tuned example code.
Somewhere in the depths of my hard disk there should be something.
In that code, the increase of these "holes" also depends on the album Replaygain,
the title length and sometimes on more distant neighboring tracks.
A "1 second digital null" before the first title approximately gets the value of the
first track, a "2 second digital null" in between two tracks gets the mean value
of both tracks.



static const Profile_Setting_t  Profiles [16] = {
   { 0 },
   { 0 },
   { 0 },
   { 0 },
   { 0 },
/*    Short   MinVal  EarModel  Ltq_                min   Ltq_  Band-  tmpMask  CVD_  varLtq    MS   Comb   NS_        Trans */
/*    Thr     Choice  Flag      offset  TMN   NMT   SMR   max   Width  _used    used         channel Penal used  PNS    Det  */
   { 1.e9f,  1,      300,       30,    3.0, -1.0,    0,  106,   4820,   1,      1,    1.,      3,     24,  6,   1.09f, 200 },  // 0: pre-Telephone
   { 1.e9f,  1,      300,       24,    6.0,  0.5,    0,  100,   7570,   1,      1,    1.,      3,     20,  6,   0.77f, 180 },  // 1: pre-Telephone
   { 1.e9f,  1,      400,       18,    9.0,  2.0,    0,   94,  10300,   1,      1,    1.,      4,     18,  6,   0.55f, 160 },  // 2: Telephone
   { 50.0f,  2,      430,       12,   12.0,  3.5,    0,   88,  13090,   1,      1,    1.,      5,     15,  6,   0.39f, 140 },  // 3: Thumb
   { 15.0f,  2,      440,        6,   15.0,  5.0,    0,   82,  15800,   1,      1,    1.,      6,     10,  6,   0.27f, 120 },  // 4: Radio
   {  5.0f,  2,      550,        0,   18.0,  6.5,    1,   76,  19980,   1,      2,    1.,     11,      9,  6,   0.00f, 100 },  // 5: Standard
   {  4.0f,  2,      560,       -6,   21.0,  8.0,    2,   70,  22000,   1,      2,    1.,     12,      7,  6,   0.00f,  80 },  // 6: Xtreme
   {  3.0f,  2,      570,      -12,   24.0,  9.5,    3,   64,  24000,   1,      2,    2.,     13,      5,  6,   0.00f,  60 },  // 7: Insane
   {  2.8f,  2,      580,      -18,   27.0, 11.0,    4,   58,  26000,   1,      2,    4.,     13,      4,  6,   0.00f,  40 },  // 8: BrainDead
   {  2.6f,  2,      590,      -24,   30.0, 12.5,    5,   52,  28000,   1,      2,    8.,     13,      4,  6,   0.00f,  20 },  // 9: post-BrainDead
   {  2.4f,  2,      599,      -30,   33.0, 14.0,    6,   46,  30000,   1,      2,   16.,     15,      2,  6,   0.00f,  10 },  //10: post-BrainDead
};


The Ltq_offset entry is the alteration of the masked threshold against the standard model.
A reduction by 6 dB decreases the ATH by 6 dB in the whole frequency range.

The value left of that (EarModel) can be used for ATH fine-tuning for higher frequencies.
An increase of 20 results in an ATH decrease of 1.5 dB at 10 kHz and 6 dB at 20 kHz.

--quality 6 against --quality 5 has the following differences in the ATH with this:

- 6,0 dB for low frequencies
- 6,5 dB for 8 kHz
- 7,0 dB for 11 kHz
- 8,0 dB for 16,3 kHz
- 9,0 dB for 20 kHz
-10,0 dB for 23 kHz

If there are further questions or if something was unintelligible, just keep asking.
I still have no time, but when I have 15 minutes of silence, I can answer such things.

Motto of the day: The ingeniousness of a construction lies within its simplicity.
Everyone can build something complicated. (Sergeij P. Koroljow)
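
The Ltq_offset and EarModel rules at the end of Frank's answer can be checked against his --quality 6 versus --quality 5 list with a few lines of Python. This is a sketch only: the function name is made up, and the quadratic frequency dependence is an assumption, chosen because it passes through both stated points (1.5 dB at 10 kHz and 6 dB at 20 kHz for an EarModel step of 20). The real curve inside the encoder may well differ.

```python
def ath_change_db(freq_khz: float, ltq_offset_delta: float,
                  earmodel_delta: float) -> float:
    """Approximate ATH change in dB between two profiles (negative = lower)."""
    # Ltq_offset shifts the threshold over the whole frequency range.
    flat = ltq_offset_delta
    # EarModel: +20 lowers the ATH by 1.5 dB at 10 kHz and 6 dB at 20 kHz.
    # ASSUMPTION: the effect grows with the square of the frequency.
    tilt = -(earmodel_delta / 20.0) * 1.5 * (freq_khz / 10.0) ** 2
    return flat + tilt


if __name__ == "__main__":
    # Table values: profile 5 has EarModel 550 and Ltq_offset 0,
    # profile 6 has EarModel 560 and Ltq_offset -6.
    for f in (0.0, 8.0, 11.0, 16.3, 20.0, 23.0):
        print(f"{f:5.1f} kHz: {ath_change_db(f, -6 - 0, 560 - 550):+.2f} dB")
```

With the table values for profiles 5 and 6, this model stays within about 0.1 dB of every entry in the list above, which suggests the listed differences are simply the 6 dB offset plus the EarModel tilt.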

Reply #19
I'm a bit surprised nobody has anything to say to this, especially guruboolez?

Anyway, to summarize Frank Klemm's comments in a more simple manner:

- In a version between 1.06 and 1.1, the coding of low level (not low frequency!) signals was changed, to avoid artifacts that were caused when such a signal approached a certain lower threshold which made it fluctuate between "encode" and "not encode"

- The new method avoids that fluctuation by gradually decreasing quality towards the lower threshold, leading to a gentle deterioration and no audible artifacts even with quite "silent" music under normal circumstances, which was checked in listening tests

- Ridiculously high Replaygain values however (usually in track gain) can make artifacts with the new method audible again

- Replaygain in its current state has some shortcomings for very dynamic albums

- The new method could be tuned by lowering the ATH (absolute threshold of hearing); basically making the "simulated hearing" a bit more sensitive

- For daily use and normal listening conditions, this problem is not relevant

- Possible solutions include the tweaking of the ATH curves and modifications to Replaygain
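
The neighbor limit that Frank proposes for Replaygain in Reply #18 can be sketched in a few lines of Python. This is only one reading of his example: each track gain is capped at 6 dB above its quietest direct neighbor, using the original (unlimited) neighbor values. His example only shows isolated peaks, so how a run of several boosted tracks should be treated is an open assumption here, and the function name is made up.

```python
from typing import List, Sequence


def limit_track_gains(gains_db: Sequence[float],
                      max_step_db: float = 6.0) -> List[float]:
    """Cap each track gain at (lowest direct neighbor gain + max_step_db)."""
    gains = list(gains_db)
    limited = []
    for i, gain in enumerate(gains):
        # Previous and next track, where they exist.
        neighbors = gains[max(0, i - 1):i] + gains[i + 1:i + 2]
        if neighbors:
            gain = min(gain, min(neighbors) + max_step_db)
        # Round to 0.01 dB to hide floating-point noise.
        limited.append(round(gain, 2))
    return limited


if __name__ == "__main__":
    album = [-7.81, -6.41, -7.61, 4.81, -8.11, -6.12, 1.12, -9.12]
    print(limit_track_gains(album))
```

Run on the eight values from Frank's example, only the +4.81 dB and +1.12 dB tracks change, to -2.11 dB and -3.12 dB, as in his second list.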

Reply #20
Quote
I'm a bit surprised nobody has anything to say to this, especially guruboolez?


Well, Guruboolez already said he's not planning to test Musepack again after the terrible behaviour displayed by the project's maintainer in face of useful and valid test results. So I suspect it makes no difference to him anymore what Klemm says.

Reply #21
Quote
Well, Guruboolez already said he's not planning to test Musepack again after the terrible behaviour displayed by the project's maintainer in face of useful and valid test results. So I suspect it makes no difference to him anymore what Klemm says.


Why don't you let him speak for himself? Some days ago he showed that he still is interested in a fix for this issue. No need trying to verbally divide things further than they already are.

Reply #22
Hahaha, I love it when robert comes in to save the day for musepack. =D

Reply #23
Quote
Hahaha, I love it when robert comes in to save the day for musepack. =D


hehe. I actually have been, from the beginning, defending my good friend Guruboolez from bullshit coming from all sides.

Reply #24
Do we need to split this thread again to stay on topic?