
"MP3Gain: How can it be possible?", It's indicated that the gain adjustments are lossless
post Aug 1 2012, 00:35
Post #1

Group: Members
Posts: 26
Joined: 19-December 10
Member No.: 86635

So I've been thinking of trying to write a similar program from scratch, and there's one main thing where I don't even understand how it's possible, let alone how it's done. As is said, the process MP3Gain uses is lossless. Thinking about it, the only way MP3Gain could work such that any player would play back the songs at whatever the target volume was would be if the change is present in the waveform. After MP3Gain is applied, if it weren't obvious from the beginning, the gain reduction is clearly visible in any audio editing software. I could somewhat understand how the process can be reversed when gain is added, even if the waveform clips, as the information could still somehow be stored (more easily than the other way around). On the other hand, when gain is taken away, don't you permanently lose the dB that you took from the threshold? As an example, if a song starts with some 6 decibel ambient noise and you reduce the song by 6dB, wouldn't that intro just completely disappear? And if the change is undone, wouldn't you not get any of the data back (unless it's stored) and just make the existing data 6dB louder? If that's the case, it isn't really undoing the changes; it's really just adding back the difference in value between the indicated ReplayGain value and what it is now.

Sorry this was kinda long-winded but the last thing though I'd also like to ask about is clipping. If a track's peak values are clipping by default, reducing the loudness now would be too late, wouldn't it? Wouldn't it be clipping no matter what at this point, contrary to what is indicated? The peaks would be chopped off either way since the structure of the waveform is no longer saved after being finalized. And also, the maximized volume indications don't make sense (has to be turned on in the options). For example, I have a file which ReplayGain indicates peaks at about 1.05 (16-bit = 100.8dB) and yet it's marked that only a 1.5dB reduction would be necessary to get it maximized (the loudest point before clipping - 96dB). Is there something I'm missing?

Thanks guys! Answers to these would be extremely helpful.

PS- A lot of the things here indicate to me that the values, whether over or under, remain as part of the data in the container but just doesn't play back, or rather, clips since it's within the 16-bit parameter.
post Aug 2 2012, 06:45
Post #2

Group: Members
Posts: 26
Joined: 19-December 10
Member No.: 86635

QUOTE (mjb2006 @ Aug 1 2012, 14:31) *
If you're worried that you've lost something on the quiet end by reducing the global gain throughout the file, your decoder could output 24-bit instead of 16-bit. I'm doubtful this would matter in most recordings.

Ooh! Thanks for that; that's very interesting! Probably having to do with something along the lines of me not understanding the entirety of this process but how would decoding as such help with data taken from the low end? The idea to me behind this seems to be for clipping which can be worked around with these decoders instead of normalizing to the maximum by passing through the info beyond the 16-bit limitation (96dB FS). On that note, would you happen to know if the newest version of AC3Filter accomplishes this when setting the output to 24-bit? Thanks again.

QUOTE (Typhoon859 @ Jul 31 2012, 17:35) *
And if the change is undone, wouldn't you not get any of the data back (unless it's stored) ... A lot of the things here indicate to me that the values, whether over or under, remain as part of the data in the container but just doesn't play back, or rather, clips since it's within the 16-bit parameter.

That's basically it. Each granule in each frame in the MP3 (2 granules per frame) contains frequency & amplitude info for generating brief sine waves. These waves are summed by the decoder to make an equally brief but much more complex composite waveform. Space is saved at encoding time by (among other things) eliminating waves that don't make an audible difference, and by storing the parameters with less-than-perfect precision, and by using standard lossless compression techniques internally. At decoding time, the global gain field of each frame is used to scale each granule's composite wave. So if you modify the global gain fields, only the amplitude changes; the "shape" (frequency content) of the output wave is unaffected, hence the gain adjustment is "lossless".

Well firstly, thank you for being the first person to acknowledge that statement. In response to your comment, that pretty much explains the general premise of MP3 encoding for me after Googling "frame" and "granule" to understand that better. In the time since the beginning of my post, I sort of got it from context and from the mentioned link I visited about the global gain field, but otherwise, I still didn't understand how audio has frames, lol. All I really thought of it was that it was like digital packets.

But umm, also from that thread I earlier linked, I understood that MP3s have these global gain fields which could be altered to make differences only for decoders, leaving all the information intact. This in fact makes it serve the exact same function as ReplayGain but with accuracy limited to 1.5dB steps. In other words, the amplitude DOESN'T get physically changed, right? So yeah, if that's the case, then the process can be undone. Of course I knew that the actual shape of the waveform isn't changed. Reducing the amplitude at any stage though still removes the quietest levels by the correlated amount, and for music without much compression, it really starts to be noticeable with 4.5dB reduced or more. It could just be the deficiency of ALL my gear but I doubt it, especially when testing out and compensating for those volume changes using my FiiO E17 DAC/Amp. If all I said here is true though, the biggest part of my question is still answered because it's mainly regarding the undoing of these alterations; the mentioned loss would be the case regardless even with ReplayGain. Maybe I'm wrong in saying this; I actually hope so. XD

The MP3's combined waveform data consists of samples which use 32-bit float amplitudes, essentially perfect precision for audio purposes. However, typical audio playback APIs expect LPCM samples which use 16- or 24-bit integer amplitudes, so MP3 decoders convert the 32-bit float to 16-bit signed integer, normally. By definition, 1.0 in the 32-bit float is the maximum range of the integers you're converting to. If 16-bit, that means -1.0 is -32768 and +1.0 is +32767. As pointed out, technically you can't assign a decibel value to a single point, but that's irrelevant for purposes of detecting clipping; if the float exceeds 1.0, there's no choice but to clip when converting to LPCM.

Hopefully with this explanation you can see how reducing the global gain brings the float32 amplitudes under 1.0, which in turn prevents clipping, but is essentially "lossless" in the sense that it doesn't change the shape of the waveform, just its amplitude (volume).
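
To put rough numbers on the clipping point, here is a minimal Python sketch. It is not MP3Gain's or any decoder's actual code. It assumes a decoder that scales the float by 32768 and clamps to the 16-bit range, and it assumes each global gain step scales the amplitude by 2^(1/4), which is where the roughly 1.5dB step size comes from. The function names are made up for illustration.

```python
import math

INT16_MIN = -32768
INT16_MAX = 32767

def float_to_int16(x):
    """Convert a float sample (nominal range -1.0..+1.0) to a 16-bit integer.

    Assumption: scale by 32768 and clamp. Anything beyond full scale clips.
    """
    scaled = round(x * 32768)
    return max(INT16_MIN, min(INT16_MAX, scaled))

def gain_steps_to_avoid_clipping(peak):
    """Number of global gain steps needed to bring `peak` to 1.0 or below.

    Assumption: one step scales amplitude by 2**(1/4), about 1.505 dB.
    """
    if peak <= 1.0:
        return 0
    over_db = 20 * math.log10(peak)          # how far above full scale, in dB
    step_db = 20 * math.log10(2 ** 0.25)     # size of one gain step, in dB
    return math.ceil(over_db / step_db)

peak = 1.05                                   # float peak slightly over full scale
steps = gain_steps_to_avoid_clipping(peak)    # whole steps only
reduced = peak * 2 ** (-steps / 4)            # peak after lowering the gain

print(float_to_int16(peak))     # clipped at the 16-bit maximum
print(steps, round(reduced, 4))
print(float_to_int16(reduced))  # now below the 16-bit maximum, no clipping
```

Under these assumptions, a float peak of 1.05 is only about 0.42dB over full scale, so a single 1.5dB step is enough to bring it under 1.0.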

Why are you the only one that gave a normal and direct response, let alone a good one? Anyway...

"-32768" and "+32767": what is the unit of measure for these values or to what does it relate? From what I know, dB is a relative measurement; from what I remember, it can only be measured as the "average" within a period, although I was never quite sure how that works in non-simple waveforms, unless it's calculated by RMS.

In general I understand these things now individually (unless otherwise noted) but I just don't get how they relate, but I have to know that to understand how the calculations for clipping are done. Thanks again for your help!

What's the reason to want to host a forum if it's not for the reason of spreading knowledge and getting pleasure out of informing people? I have a feeling that even if initially there was good intention here, things just got derailed along the way by the people themselves because they begin using this as their personal platform for their own totalitarian ideals or simply conformist mentality - like in most other places. I wouldn't be at all surprised if this thread were locked, my posts were modified in what they say or edited together into senseless and out of context pieces - sort of like media propaganda, and I was banned with an automated message that I broke the rules. As a matter of fact, I'm so comfortable here that before posting this, I had to save this entire thread so far onto my computer to keep as evidence.

This was stupid on my part to say; I'm sorry... But guys, come on, is it too much to ask for us to learn from each other? Have you ever questioned the reason behind your approach to your responses? Just put yourself in my position. I've been getting more ridicule here from the majority than I have gotten aid. Thanks to anyone that truly contributed though, sincerely thank you!

This post has been edited by Typhoon859: Aug 2 2012, 07:39
post Aug 2 2012, 11:53
Post #3

ReplayGain developer

Group: Developer
Posts: 5663
Joined: 5-November 01
From: Yorkshire, UK
Member No.: 409

QUOTE (Typhoon859 @ Aug 2 2012, 06:45) *
Why are you the only one that gave a normal and direct response, let alone a good one? Anyway...
Because it looks like you're very far from understanding this, yet you think you have good knowledge. Experience suggests that means we're in for a heck of a long thread, and not everyone has the time.

Digital audio comes from ADCs and goes to DACs*. Both use integers: 16 bits, 24 bits, whatever. These can be expressed in various ways: binary (unsigned int, two's complement, etc.), decimal (signed integer, floating point, etc.). As long as the ranges and conversion factors are known and/or understood, these can be entirely equivalent. e.g.
"-32768" and "+32767": what is the unit of measure for these values or to what does it relate?
It's the range of 16-bit values expressed in signed integer decimal.

* no DAC = no way to hear the audio, so this is pretty fundamental. ;)

Lossy codecs don't intend to accurately reproduce audio. A lossy audio file (like an mp3) contains what can be interpreted as a series of instructions to approximate the original waveform, e.g. "generate a sine wave at this frequency and amplitude, for this long, and add it to another one, and another one, etc". How you convert these instructions into 16-bit or 24-bit or whatever audio data is very well defined, as is the amplitude level which matches the full scale of the output (1.0, or 32767/32768, or 8388607/8388608, etc) - but within the mp3 file, there's no concept of 16 bits or 24 bits or any other input or output resolution - there's just the approximated parameters of sine waves which get added together by the decoder to create the output. A decoder can do this to any level of accuracy it wants, generating 64-bit output if it wishes.

The global gain field is just a way in which the amplitude of the sine waves gets expressed in an mp3 file. Instead of specifying all the sine wave amplitudes absolutely, there's the global gain for a given block, and then the amplitude offsets relative to that for each individual sine wave in a given block. It's just a more efficient way of saying exactly the same thing, but it has a side effect that changing the global gain field in an mp3 file is an easy and non-destructive way of changing its volume. You could do it by changing all the amplitudes specified for all the individual sine waves - except that in the amplitude parameter for them, the ranges and steps in those ranges will vary between them (because the different ranges allowed them to be encoded more efficiently), making a simple useful volume control almost impossible to do in that way.
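
A toy Python model of this idea, for anyone who wants to see it in numbers. This is a simplified sketch and not a real MP3 decoder: the block layout, the `decode_block` name and the sample values are made up, per-band scalefactors are left out entirely, and the 4/3 power and the offset of 210 are how I remember the requantization formula, so treat them as assumptions. The only point is that every value in a block shares one gain exponent.

```python
import math

def decode_block(global_gain, quantized):
    """Toy requantizer: all values in the block share one gain exponent.

    Assumption: amplitude = sign(q) * |q|**(4/3) * 2**((global_gain - 210) / 4).
    """
    scale = 2.0 ** ((global_gain - 210) / 4)
    return [math.copysign(abs(q) ** (4 / 3), q) * scale for q in quantized]

# Hypothetical block: one gain field plus the stored (quantized) values
block = {"global_gain": 170, "quantized": [0, 3, -7, 12, -1]}
stored_before = list(block["quantized"])

original = decode_block(block["global_gain"], block["quantized"])

# MP3Gain-style change: touch ONLY the gain field (4 steps down = half amplitude)
block["global_gain"] -= 4
quieter = decode_block(block["global_gain"], block["quantized"])

# Undo: put the gain field back to what it was
block["global_gain"] += 4
restored = decode_block(block["global_gain"], block["quantized"])

print(restored == original)                  # same output as before the change
print(block["quantized"] == stored_before)   # stored values were never modified
```

The stored values are never rewritten, which is why putting the field back gives the same decoded output as before.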

As long as you can put the global gain fields back to what they were, you can't lose data by changing them, because the definitions of all the sine waves remain intact. If you have a (for example) 16-bit decoder, and reduce all the global gain fields dramatically, you could in theory reduce the amplitude of all the sine waves so that the 16-bit output rounded to zero - you'd get absolutely nothing out of a 16-bit decoder, because everything has been pushed more than 96dB below digital full scale. But, either by using a more accurate decoder (e.g. 32 bits) and amplifying the result, or more simply by putting the global gain fields back to what they were, you'd get all the audio data decoded just fine.
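
The extreme case can be sketched in Python too. Again this is only an illustration under assumptions: the sample values are invented, each gain step is taken to scale amplitude by 2^(1/4), and the "16-bit decoder" is modelled as scale by 32768, round and clamp.

```python
import math

def to_int16(x):
    """Model of a 16-bit decoder output stage: scale by 32768, round, clamp."""
    return max(-32768, min(32767, round(x * 32768)))

samples = [0.5, -0.25, 0.8, -0.9]     # made-up float amplitudes before any gain change

steps = 68                            # a dramatic reduction: 68 gain steps
factor = 2 ** (-steps / 4)            # amplitude scale factor, 2**-17 here
reduction_db = 20 * math.log10(factor)

attenuated = [s * factor for s in samples]

# A 16-bit output stage rounds everything to zero ...
as_16bit = [to_int16(s) for s in attenuated]

# ... but a float decoder followed by amplification still has the audio
recovered = [s / factor for s in attenuated]

print(round(reduction_db, 1))   # more than 96 dB below full scale
print(as_16bit)
print(recovered)
```

Putting the gain fields back is the simpler fix. The float path just shows that the information was still in the file the whole time.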

Hope this helps.


P.S. this has all been discussed before.

This post has been edited by 2Bdecided: Aug 2 2012, 11:58

