
Gapless Playback Detection

Hi,
I'm trying to detect whether an MP3 track has been encoded with the gapless option enabled. I'm writing a simple player that needs to detect this quickly. I'm currently testing with tracks encoded using LAME. Can someone point me to exactly what I should be looking for in the LAME header?

Thanks!


 

Reply #2
I believe there are actually at least 3 places to look for delay and padding:
  • for LAME-encoded MP3s (since around 2003), it's in the LAME tag, as lvqcl referred to.
  • for iTunes-encoded MP3s, it's in the iTunSMPB tag, and maybe also iTunPGAP (documentation seems hard to come by).
  • for Fraunhofer-encoded MP3s, in the VBRI tag, after "VBRI", the first 2 bytes are the version and the next 2 are the combined delay & padding (delay varies with version; see http://mp3decoders.mp3-tech.org/decoders_lame.html#delays ).
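For the first two cases, here is a rough Python sketch of pulling out the delay and padding values. It rests on assumptions not stated above: that the LAME tag stores delay and padding as two 12-bit values packed into 3 bytes, located 21 bytes after the start of the 9-byte encoder version string, and that the iTunSMPB value is a run of space-separated hex fields whose second and third fields are delay and padding. The function names are made up for illustration, and searching for the bytes "LAME" is a shortcut; a real player should locate the Xing/Info header first.

```python
from typing import Optional, Tuple

# Assumed offset of the 3-byte delay/padding field, counted from the
# first byte of the encoder version string (e.g. b"LAME3.99r").
LAME_DELAY_FIELD_OFFSET = 21


def decode_lame_delay_field(field: bytes) -> Tuple[int, int]:
    """Split a 3-byte field into (delay, padding), 12 bits each."""
    if len(field) != 3:
        raise ValueError("expected exactly 3 bytes")
    delay = (field[0] << 4) | (field[1] >> 4)
    padding = ((field[1] & 0x0F) << 8) | field[2]
    return delay, padding


def lame_delay_padding(first_frame: bytes) -> Optional[Tuple[int, int]]:
    """Find a LAME version string in the first frame and read delay/padding.

    Returns None if no marker is found or the frame is too short.
    """
    pos = first_frame.find(b"LAME")
    if pos < 0:
        return None
    start = pos + LAME_DELAY_FIELD_OFFSET
    field = first_frame[start:start + 3]
    if len(field) < 3:
        return None
    return decode_lame_delay_field(field)


def parse_itunsmpb(value: str) -> Optional[Tuple[int, int]]:
    """Read (delay, padding) from an iTunSMPB comment value.

    Assumes whitespace-separated hex fields: a leading zero field,
    then delay, then padding.
    """
    fields = value.split()
    if len(fields) < 3:
        return None
    try:
        return int(fields[1], 16), int(fields[2], 16)
    except ValueError:
        return None
```

A file with neither tag simply yields None from both helpers, which is the case where the player has nothing to go on except knowledge of the encoder.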

There's also the older, less widely implemented method of gapless encoding, where one long MP3 stream for a contiguous set of tracks is divided into separate files, such as by the --nogap option in LAME, a similar gapless option in iTunes, or by using MP3 editors/splitters like mp3DirectCut (e.g. to split an album MP3 by cue sheet). The files may (but don't always) include flags to indicate they are part of a gapless set, or it may be possible to deduce this from weird delay & padding values. Since the encoder isn't reset between tracks, these MP3s are theoretically (though probably not ABX-able) better quality at the track boundaries when the files are played in order. However, proper playback requires reassembling the stream before or during decoding; the player must recognize the files as part of a set and avoid resetting the decoder between files. This is only supported by a couple of players (Rockbox and iTunes being the main ones, I think). It's much easier to just use the delay & padding metadata; it's good enough, and is ideal when the tracks aren't going to be played in order.

Reply #3
Thanks mjb2006 and lvqcl.

For the delay / padding -- should I be expecting a 0 delay if it is gapless?


Reply #5
Just in case this wasn't clear: the gaplessness is not a property of the MP3's audio data; it's just some hints in the file that the player uses to know what to do with the output of the decoder.

The output of the decoder will be the samples obtained from the MP3, with some junk at the beginning, and probably also at the end. Some or all of the junk at the beginning will have been added by the decoder; this decoder delay is a fixed amount, so it's easy to strip. The encoder probably added junk to the beginning and end of the MP3 itself, as well; this is the encoder delay and padding, respectively. The question is how much of each there is.

The player will probably be coded to work with a specific decoder, so the decoder delay will be known; for any MP3, the player will strip those extra samples from the beginning of the decoder's output. If the MP3 contains encoder delay & padding values, then the player knows to strip that many additional samples from the beginning and end of what remains, respectively. What's left should be the same number of samples as were originally input to the encoder when the MP3 was made.
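The trimming described above can be sketched in a few lines of Python. This is only an illustration: `trim_gapless` and its parameter names are hypothetical, samples are treated as a flat sequence (one channel, or frames already grouped), and it assumes the decoder output still contains the full encoder padding at the end, which depends on how the decoder flushes its own delay.

```python
from typing import List, Sequence


def trim_gapless(
    decoded: Sequence[int],
    decoder_delay: int,
    encoder_delay: int,
    encoder_padding: int,
) -> List[int]:
    """Strip decoder delay and encoder delay from the start of the decoded
    samples, and encoder padding from the end."""
    if min(decoder_delay, encoder_delay, encoder_padding) < 0:
        raise ValueError("delays and padding must be non-negative")
    start = decoder_delay + encoder_delay
    end = len(decoded) - encoder_padding
    if end < start:
        raise ValueError("delay and padding exceed the decoded length")
    return list(decoded[start:end])


# Toy example: 20 decoded samples, 2 of decoder delay,
# 3 of encoder delay, and 4 of encoder padding.
original = trim_gapless(list(range(20)), 2, 3, 4)
print(len(original))
```

What comes back should have the same length as the audio originally fed to the encoder, which is a handy sanity check when testing against a known source file.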

If delay & padding values aren't available, but the encoder is known, the player might still be able to make an educated guess at the amount of delay to strip from the beginning. LAME's encoder delay is 576 samples, for example. In the Fraunhofer case, with combined delay & padding in one value, I think you have to do a bit of both (deducing how much is delay and how much is padding based on the encoder version).
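One way a player might order these fallbacks, as a hedged Python sketch: `resolve_trim` and the `KNOWN_ENCODER_DELAYS` table are hypothetical names, the table holds only the LAME figure of 576 mentioned above, and padding is left at 0 when guessing because it can't be inferred from the encoder name alone. The Fraunhofer split is deliberately left out, since it depends on per-version values not given here.

```python
from typing import Dict, Optional, Tuple

# Hypothetical table of default encoder delays, in samples.
# Only the LAME value comes from the discussion above.
KNOWN_ENCODER_DELAYS: Dict[str, int] = {"LAME": 576}


def resolve_trim(
    tag_delay: Optional[int],
    tag_padding: Optional[int],
    encoder_name: Optional[str],
) -> Tuple[int, int, bool]:
    """Return (delay, padding, exact).

    exact is True only when both values came from metadata in the file.
    """
    if tag_delay is not None and tag_padding is not None:
        return tag_delay, tag_padding, True
    guess = KNOWN_ENCODER_DELAYS.get((encoder_name or "").upper())
    if guess is not None:
        # Educated guess: strip the known delay, leave the end untouched.
        return guess, 0, False
    # Nothing known: strip nothing beyond the decoder's own delay.
    return 0, 0, False
```

The `exact` flag lets the player report "gapless metadata present" separately from "best-effort trim", which is roughly the detection asked about at the top of the thread.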