"MP3Gain: How can it be possible?", It's indicated that the gain adjustments are lossless
post Aug 1 2012, 00:35
Post #1

Group: Members
Posts: 26
Joined: 19-December 10
Member No.: 86635

I've been thinking of trying to write a similar program from scratch, and there's one main thing I don't understand: how it's even possible, let alone how it's done. As is said, the process MP3Gain uses is lossless. Thinking about it, the only way MP3Gain could make any player play back the songs at the target volume is if the change is present in the waveform. And indeed, after MP3Gain is applied, the gain reduction is clearly visible in any audio editing software. I could somewhat understand how the process can be reversed when gain is added, even if the waveform clips, as the information could still somehow be stored (more easily than the other way around). But when gain is taken away, don't you permanently lose whatever falls below the threshold? As an example, if a song starts with some 6 dB ambient noise and you reduce the song by 6 dB, wouldn't that intro just completely disappear? And if the change is undone, wouldn't you get none of that data back (unless it's stored) and just make the existing data 6 dB louder? If that's the case, it isn't really undoing the changes; it's just adding back the difference between the indicated ReplayGain value and the current level.

Sorry this was kinda long-winded, but the last thing I'd like to ask about is clipping. If a track's peaks are already clipping, isn't reducing the loudness now too late? Wouldn't it be clipping no matter what at this point, contrary to what is indicated? The peaks would be chopped off either way, since the shape of the waveform is no longer preserved once the track is finalized. Also, the maximized-volume indications (which have to be turned on in the options) don't make sense to me. For example, I have a file which ReplayGain indicates peaks at about 1.05 (16-bit = 100.8dB), and yet it's marked that only a 1.5dB reduction would be necessary to get it maximized (the loudest point before clipping - 96dB). Is there something I'm missing?

Thanks guys! Answers to these would be extremely helpful.

PS: A lot of the things here indicate to me that the values, whether over or under, remain part of the data in the container but just don't play back, or rather clip, since playback is confined to the 16-bit range.
post Aug 2 2012, 10:20
Post #2

Group: Members
Posts: 2489
Joined: 9-October 05
From: Dormagen, Germany
Member No.: 25015

A short explanation of mp3 technology, as far as it concerns your questions:

a) Input for mp3 encoding is usually a PCM signal (a representation of the music in the time domain). That is, 44100 times a second (for music on CD) the signal is sampled, and each sample value is stored as a signed integer (with 16-bit resolution for music on CD). If the CD recording was done properly there is no clipping, that is, the entire track fits within the range provided by 16-bit integers.
Don't think of loudness, dB, etc. at the moment, as it does not help here.

b) Encoding with mp3 (or another transform codec like AAC) means bringing the signal representation from the time domain to the frequency domain. Representing music in the frequency domain means cutting it into time windows (on the order of 10 msec long in the transform codecs in use) and describing the music within each window by its frequency-amplitude distribution. This distribution is coded efficiently using various lossy and lossless techniques.
In the case of mp3 these time windows are the frames, or more precisely the granules of a frame. A frame consists of 1152 wave samples and is divided into two granules. There are further complications because of the use of long/short/mixed blocks, but those are specific details you should look into yourself once you're really implementing your own thing. They are not necessary for a fundamental understanding. Just think in terms of time windows whose length can be adapted to the current musical situation as needed.
What's important in your context: there is a scale factor (called global gain) for each time window which controls the amplitude of the window's entire frequency-amplitude distribution. This is the spot where a lossless amplitude change of an existing mp3 file can be made.
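To make the "lossless" part concrete, here is a minimal sketch, not MP3Gain's actual code. It assumes the global gain is an 8-bit integer field per granule and that the decoder scales amplitudes by 2^(global_gain/4), so adding or subtracting the same integer everywhere changes the level in steps of about 1.5 dB and can be undone exactly. The function name and the gain values are made up for illustration.

```python
import math

# One global_gain step scales amplitude by 2**(1/4), i.e. about 1.505 dB.
STEP_DB = 20 * math.log10(2 ** 0.25)

def shift_global_gain(gains, steps):
    """Add `steps` to every granule's global_gain (hypothetical helper).

    Only the integer side-info field changes; the coded spectral
    data itself is never touched or re-encoded.
    """
    shifted = [g + steps for g in gains]
    if any(not 0 <= g <= 255 for g in shifted):
        # Assumption: this sketch refuses to leave the 8-bit range.
        raise ValueError("global_gain would leave the 8-bit range")
    return shifted

original = [150, 162, 171, 140]              # made-up per-granule values
quieter = shift_global_gain(original, -4)    # 4 steps down, about -6 dB
restored = shift_global_gain(quieter, +4)    # undo: add the same 4 steps

amplitude_ratio = 2 ** (-4 / 4)              # decoded amplitude is halved
```

Because the undo is plain integer arithmetic on a header field, `restored` is identical to `original`; nothing about the quiet passages is thrown away in between.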

c) Decoding an mp3 file means looking at each time window and transforming its frequency-amplitude distribution back into wave samples. Note that clipping can occur in this process even when there was no clipping in the original music (because the frequency-amplitude distribution stored in the mp3 file is only an approximation). This is where replaygain info shows a peak value > 1.
The decoding machinery can take this into account and scale the frequency-amplitude distribution down accordingly, or, if you use replaygain to scale down the global gain values, you can avoid this situation altogether, making the mp3 file playable on any player without this kind of clipping.
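As a rough illustration of the peak > 1 case, the sketch below works out how many global gain steps are needed to bring a decoded peak back under full scale. It assumes each step scales amplitude by 2^(1/4) (about 1.5 dB) and uses the peak of 1.05 mentioned in the first post; the function name is hypothetical.

```python
import math

def steps_to_avoid_clipping(peak):
    """Smallest number of downward gain steps that brings `peak` to <= 1.0.

    `peak` is relative to full scale (1.0 = loudest value that does not clip).
    """
    if peak <= 1.0:
        return 0
    # Need peak * 2**(-n/4) <= 1, i.e. n >= 4 * log2(peak).
    return math.ceil(4 * math.log2(peak))

peak = 1.05
steps = steps_to_avoid_clipping(peak)        # whole steps only
new_peak = peak * 2 ** (-steps / 4)          # peak after lowering the gain
excess_db = 20 * math.log10(peak)            # how far over full scale
```

A peak of 1.05 is only about 0.4 dB over full scale, but since the gain can only move in whole steps, one full step of roughly 1.5 dB is the smallest reduction that removes the clipping.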

All this has nothing to do with dB, perceived loudness, etc. When those come into play in your context, it's all about normalizing music relative to a standard of perceived loudness. This standard is 89 dB SPL (but you can deviate from it), and a replaygain analysis stage analyses the music and suggests by how many dB to change it to reach this standard perceived loudness. You can use this value to change the global gain values of the mp3 file accordingly (they can be altered only in 1.5 dB steps, so you can do this only approximately).
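The "only approximately" part can be sketched as follows, assuming the suggested change is simply rounded to the nearest whole step of about 1.5 dB. The -7.2 dB suggestion is a made-up example value, and the function name is hypothetical.

```python
import math

STEP_DB = 20 * math.log10(2 ** 0.25)   # about 1.505 dB per global_gain step

def steps_for_gain(suggested_db):
    """Round a suggested dB change to the nearest whole number of gain steps."""
    return round(suggested_db / STEP_DB)

suggested_db = -7.2                      # made-up replaygain suggestion
steps = steps_for_gain(suggested_db)     # integer added to each global_gain
applied_db = steps * STEP_DB             # change the file actually receives
residual_db = suggested_db - applied_db  # what the step size cannot express
```

With nearest-step rounding the residual never exceeds half a step, so the file ends up within about 0.75 dB of the target loudness.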

I suggest concentrating on the fundamentals for now, and dealing with the irritating details like granules and short blocks, or the special considerations for sfb21, once you're really implementing your own method. At that point you should get familiar with the exact mp3 specs, information on which you can find on the net and/or in books (but please don't ask HA members to explain this to you other than for isolated specific questions). But why bother, BTW, when the tools for doing these things already exist?

This post has been edited by halb27: Aug 2 2012, 10:32

lame3100m -V1 --insane_factor 0.75

