Could we add gapless to Apple's AAC encoder?, task for iTunesencode or other?
guruboolez
post Jun 21 2005, 08:11
Post #1





Group: Members (Donating)
Posts: 3474
Joined: 7-November 01
From: Strasbourg (France)
Member No.: 420



I have a question about a possible new iTunesEncode feature (or any other program driving Apple's AAC encoder). Wouldn't it be possible to add a gapless feature? It might sound silly, but after all, nyaochi did it with the latest Fraunhofer MP3 encoder (here and here). I'm not a coding expert -I'm not a coder at all- but I think that doing it wouldn't be an impossible task.

Would it suffice to store the precise offset (constant for Apple's encoder) somewhere in the tags and to calculate the number of padded samples? I don't really know. faac and Nero AAC are gapless, and apparently Apple is in no hurry to implement this feature. It's a pity, because Apple's encoder is pretty good. Gapless playback is not possible on any iPod, but on a computer (playing with foobar2000, for example) users would benefit from it.

Could we reproduce Nyaochi's ACMenc patch to work with iTunes? Or make something similar? What do you think?
Otto42
post Jun 21 2005, 18:25
Post #2





Group: Members
Posts: 1075
Joined: 15-October 03
From: Memphis, TN
Member No.: 9323



Short answer: no.

Long answer: yes, but I'm not going to add it to iTunesEncode for a number of reasons:

1) iTunesEncode is designed to just be a CLI to access the functionality that iTunes itself provides via the COM interface, with regards to encoding. Other than copying the resulting encoded file around and renaming it, the file itself is never touched by iTunesEncode. It doesn't do anything at all with the actual data in the file, so adding that sort of functionality is beyond the scope of what iTunesEncode currently does.

2) I lost the source code to iTunesEncode via an accident, and so what you get is what there is. biggrin.gif Yeah, I could rewrite the missing pieces easily enough, but there's no real compelling reason to do so at this point. I have already added all the functionality I could feasibly add via the iTunes COM interface. Supposedly, the new iTunes will support VB scripting, which might be advantageous to use in some way, but COM is basically a dead end as far as new functionality goes.

Honestly, it makes more sense to make a new program to modify existing iTunes created AAC files to add that info, sort of thing.

This post has been edited by Otto42: Jun 21 2005, 18:28


--------------------
http://ottodestruct.com
Tropican
post Jun 21 2005, 23:55
Post #3





Group: Members
Posts: 68
Joined: 22-May 05
Member No.: 22203



This may be a dumb question, but if the encoding delay is constant, would it be possible to just add gapless info in the tags with a special MP4 remuxer? Forgive my ignorance, but to non-coders it seems simpler than I'm sure it is.

This post has been edited by Tropican: Jun 22 2005, 00:14
nyaochi
post Jun 22 2005, 01:40
Post #4





Group: Members
Posts: 169
Joined: 30-September 01
From: Tokyo, Japan
Member No.: 99



I don't know much about AAC and the MP4 container (I don't even have iTunes installed on my computer tongue.gif ), but technically speaking, it should be possible to write such a frontend program for iTunes. We need: 1) encoder delay of iTunes AAC encoder; 2) number of padded silent samples at the end of stream by iTunes (of course it's not constant); and 3) a container to store the information from 1) and 2).

In the ACMENC implementation, 1) is specified by the user (through a command-line preset); 2) is calculated from the number of samples in the input audio and the number of samples (MP3 frames) in the output MP3 stream. We use the MP3-Info frame to store the encoder delay/padding information. To achieve 2), we must manually count the number of frames in the output stream generated by the F-IIS ACM codec. We cannot reuse my source code because MP3 and AAC/MP4 are totally different.

I'm not sure how AAC gapless is achieved, but the scenario I guess would be:
1) convert an input audio file into AAC file by iTunes' COM interface;
2) open the input audio file and obtain the number of samples;
3) open and parse the output AAC file to count up the number of frames;
4) construct MP4 stream from the AAC stream and delay/padding information by using libmp4v2(?) similarly to FAAC.
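
Steps 2) and 3) boil down to a little arithmetic, which might look roughly like the sketch below. It assumes 1024 samples per AAC frame and an encoder delay that has already been measured by hand. The function name and the numbers in the example call are made up for illustration and are not taken from iTunes or ACMENC.

```python
AAC_FRAME_SIZE = 1024  # assumed number of samples per AAC frame


def gapless_padding(input_samples: int, output_frames: int, encoder_delay: int) -> int:
    """Return how many padded samples sit at the end of the encoded stream.

    input_samples: samples per channel in the source audio (step 2)
    output_frames: AAC frames counted in the encoder output (step 3)
    encoder_delay: samples the encoder prepends (measured once, assumed constant)
    """
    decoded_samples = output_frames * AAC_FRAME_SIZE
    padding = decoded_samples - encoder_delay - input_samples
    if padding < 0:
        raise ValueError("output is shorter than delay + input; check the inputs")
    return padding


# Illustrative values only: a 10-second 44.1 kHz track, 432 frames,
# and a hypothetical encoder delay of 1088 samples.
print(gapless_padding(441000, 432, 1088))
```

The delay and the padding are the two numbers that step 4) would then have to store somewhere in the MP4 file.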

In addition to general audio-programming knowledge, knowledge of the AAC stream format and the MP4 container format will be necessary.

This post has been edited by nyaochi: Jun 22 2005, 01:46
Tropican
post Jun 22 2005, 02:19
Post #5





Group: Members
Posts: 68
Joined: 22-May 05
Member No.: 22203



QUOTE (nyaochi @ Jun 21 2005, 07:40 PM)
We need: 1) encoder delay of iTunes AAC encoder; 2) number of padded silent samples at the end of stream by iTunes (of course it's not constant)
*


How did you find the delay of the FhG encoders nyaochi? How exactly is an encoder delay determined in any format, or is it format specific?

Removing silent samples has been done so many times in audio editors that it shouldn't be a problem. Or will removing silent samples that aren't padded ones by iTunes cause even bigger problems?

QUOTE (nyaochi @ Jun 21 2005, 07:40 PM)
I'm not sure how AAC gapless is achieved
*


There's the FAAC source.

Please no developers take this as an insult. Note that I'm minimizing the amount of work this would take as I'm

1) Not a programmer
2) Someone who really would like to see someone at least attempt to do this, and am worried that the amount of time this would need may scare some talented people away, so is underestimating the time on such a project greatly
3) Am afraid that when iTunes receives VBR support from Quicktime 7, even if the quality ends up being better than Nero, most users here will have to stick with Nero for gapless.


There's always the chance Apple will add support themselves smile.gif
nyaochi
post Jun 22 2005, 03:11
Post #6





Group: Members
Posts: 169
Joined: 30-September 01
From: Tokyo, Japan
Member No.: 99



QUOTE (Tropican @ Jun 22 2005, 10:19 AM)
How did you find the delay of the FhG encoders nyaochi?  How exactly is an encoder delay determined in any format, or is it format specific?

I looked at this page to guess the delay, then measured and confirmed the value using a wave editor.
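
The measurement above was done by eye in a wave editor. As a swapped-in alternative (not the method used here), a brute-force search for the shift that best lines up the decoded audio with the original might look like the sketch below. The function name is hypothetical, and with a lossy codec the match is only approximate, so the result should still be checked by eye.

```python
def estimate_delay(original, decoded, max_delay):
    """Return the lag (in samples) at which `decoded` best matches `original`.

    Both arguments are sequences of sample values for one channel.
    Every lag from 0 to max_delay is tried, and the one with the smallest
    mean squared difference wins.
    """
    best_lag, best_err = 0, float("inf")
    for lag in range(max_delay + 1):
        n = min(len(original), len(decoded) - lag)
        if n <= 0:
            break  # decoded signal is too short to test this lag
        err = sum((original[i] - decoded[i + lag]) ** 2 for i in range(n)) / n
        if err < best_err:
            best_lag, best_err = lag, err
    return best_lag


# Synthetic check: a non-repeating test signal delayed by 7 samples.
signal = [((i * 37) % 101) - 50 for i in range(200)]
delayed = [0] * 7 + signal + [0] * 5
print(estimate_delay(signal, delayed, 20))
```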

QUOTE (Tropican @ Jun 22 2005, 10:19 AM)
Removing silent samples has been done so many times in audio editors that it shouldn't be a problem.  Or will removing silent samples that aren't padded ones by iTunes cause even bigger problems?

You missed the point. An AAC stream seems to have a frame size of 1024 samples, which means that you will/must get 1024*n samples after decoding an AAC stream. That's one reason why iTunes must pad silent samples to fill the last frame. There's another reason that comes from the encoder delay, but I won't go into it here. Anyway, the necessary task is not removing the silence, but telling a decoder the number of samples to be removed for playback. That cannot be achieved with an audio editor.
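
To make the decoder side concrete: once a player knows the two numbers, trimming is just a slice of the decoded samples. A minimal sketch, assuming the delay and padding values have been read from somewhere in the file (where they would be stored is exactly the open question in this thread). The function name is hypothetical.

```python
def trim_decoded(pcm, encoder_delay, padding):
    """Drop the first `encoder_delay` and the last `padding` decoded samples.

    `pcm` is a sequence of decoded samples for one channel; its length is
    expected to be a multiple of the frame size (1024 for AAC).
    """
    if encoder_delay < 0 or padding < 0:
        raise ValueError("delay and padding must be non-negative")
    end = len(pcm) - padding
    if end < encoder_delay:
        raise ValueError("delay + padding exceed the decoded length")
    return pcm[encoder_delay:end]


# Toy example: 10 decoded samples, 2 of delay at the front, 3 of padding at the end.
print(trim_decoded(list(range(10)), 2, 3))
```

Unlike a silence detector, this never has to guess whether quiet samples belong to the music.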

QUOTE (Tropican @ Jun 22 2005, 10:19 AM)
There's the FAAC source.

Please no developers take this as an insult.  Note that I'm minimizing the amount of work this would take as I'm

1) Not a programmer
2) Someone who really would like to see someone at least attempt to do this, and am worried that the amount of time this would need may scare some talented people away, so is underestimating the time on such a project greatly
3) Am afraid that when iTunes receives VBR support from Quicktime 7, even if the quality ends up being better than Nero, most users here will have to stick with Nero for gapless.


There's always the chance Apple will add support themselves  smile.gif
*

Of course I saw the FAAC source. I understand your wish to minimize/simplify the problem. But the problem cannot be simplified the way you expected. The simplest solution would be something like what I wrote in the previous post, which shouldn't scare talented people away.
Tropican
post Jun 22 2005, 04:10
Post #7





Group: Members
Posts: 68
Joined: 22-May 05
Member No.: 22203



Sorry, my last post was poorly written.

QUOTE (nyaochi @ Jun 21 2005, 09:11 PM)
You missed the point. An AAC stream seems to have a frame size of 1024 samples, which means that you will/must get 1024*n samples after decoding an AAC stream. That's one reason why iTunes must pad silent samples to fill the last frame. There's another reason that comes from the encoder delay, but I won't go into it here. Anyway, the necessary task is not removing the silence, but telling a decoder the number of samples to be removed for playback. That cannot be achieved with an audio editor.


I actually didn't miss the point, but used an incredibly bad example. I was just wondering if we could reuse already existing silence cutoff code. A better example would probably be how some Winamp plugins are able to just cut off the silence at the end of a file during decoding, thus achieving gapless playback. Or isn't that true gapless?

QUOTE (nyaochi @ Jun 21 2005, 09:11 PM)
Of course I saw the FAAC source. I understand your wish to minimize/simplify the problem. But the problem cannot be simplified the way you expected. The simplest solution would be something like what I wrote in the previous post, which shouldn't scare talented people away.
*


I didn't doubt you saw the FAAC source, as you specifically mentioned the decoding library used by it and other programs. I just wanted to make sure others reading this thread knew the information was available. My mentioning that I was minimizing the problem was just me apologizing in advance, to ensure no one would take offence at what I was saying.

Talented people may also be scared off if they think that working on this means they themselves must complete it. Publishing the source of whatever they did would be more than good enough. If there is enough interest, others are then able to continue. I think we are putting the cart before the horse though, don't you? After all, this program will be about as popular and useful as your ACMENC and Otto's iTunesencode. I'm not saying they aren't loved here at HA and other select places on the net, but they are nowhere near large enough to have their development or developers questioned as to who's working on them and their progress.

There really shouldn't be any planning, just us making a thread like this with info, and then down the line, if someone ends up working on such an app from what we and hopefully other people write here, they can release it to the community. That was the point I was trying to make: just because there's demand doesn't mean someone who wants such a feature has to make an awesome program, or even little more than a hack. I didn't want people to think I was alluding to that.
saratoga
post Jun 22 2005, 05:35
Post #8





Group: Members
Posts: 5003
Joined: 2-September 02
Member No.: 3264



QUOTE (Otto42 @ Jun 21 2005, 09:25 AM)
Short answer: no.

Long answer: yes, but I'm not going to add it to iTunesEncode for a number of reasons:

1) iTunesEncode is designed to just be a CLI to access the functionality that iTunes itself provides via the COM interface, with regards to encoding. Other than copying the resulting encoded file around and renaming it, the file itself is never touched by iTunesEncode. It doesn't do anything at all with the actual data in the file, so adding that sort of functionality is beyond the scope of what iTunesEncode currently does.

2) I lost the source code to iTunesEncode via an accident, and so what you get is what there is. biggrin.gif Yeah, I could rewrite the missing pieces easily enough, but there's no real compelling reason to do so at this point. I have already added all the functionality I could feasibly add via the iTunes COM interface. Supposedly, the new iTunes will support VB scripting, which might be advantageous to use in some way, but COM is basically a dead end as far as new functionality goes.

Honestly, it makes more sense to make a new program to modify existing iTunes created AAC files to add that info, sort of thing.
*


How would you get the encoder delay out of iTunes though? Unless it'll give you the exact sample length of the original CD Audio track, I don't see how you could calculate it.
Gabriel
post Jun 22 2005, 08:56
Post #9


LAME developer


Group: Developer
Posts: 2950
Joined: 1-October 01
From: Nanterre, France
Member No.: 138



QUOTE
How would you get the encoder delay out of iTunes though?

Delay is usually constant for an encoder, so you can check it manually once, and you are fine.
However to compute padding value, you have to know the original number of samples.
Tropican
post Jun 22 2005, 22:31
Post #10





Group: Members
Posts: 68
Joined: 22-May 05
Member No.: 22203



QUOTE (Gabriel @ Jun 22 2005, 02:56 AM)
you have to know the original number of samples.
*


That's pretty easy then for encoding audio, as you can calculate the number of samples in a .wav. Just would have to build that into a program. Right Gabriel? Sorry, I know little about implementing such a feature.
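
For what it's worth, counting the samples in a .wav really is the easy part. A minimal sketch using Python's standard `wave` module (which only handles uncompressed PCM files); the helper that builds a silent test file in memory is just there so the example runs on its own, and both function names are made up.

```python
import io
import wave


def count_wav_samples(source) -> int:
    """Return the number of sample frames (samples per channel) in a PCM WAV.

    `source` may be a file path or a binary file-like object.
    """
    with wave.open(source, "rb") as wav:
        return wav.getnframes()


def make_silent_wav(num_frames: int, channels: int = 2) -> io.BytesIO:
    """Build an in-memory 16-bit, 44.1 kHz WAV of silence (demo helper only)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 2 bytes = 16-bit samples
        wav.setframerate(44100)
        wav.writeframes(b"\x00" * (num_frames * channels * 2))
    buf.seek(0)
    return buf


print(count_wav_samples(make_silent_wav(1000)))
```

With a real file you would pass its path instead, e.g. `count_wav_samples("track01.wav")`.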
M
post Jun 23 2005, 02:37
Post #11





Group: Members
Posts: 964
Joined: 29-December 01
Member No.: 830



Pardon my ignorance on the technicalities involved, but doesn't the MPEG-4 structure allow for chapter stops, or index points of some sort? And if so, wouldn't it be simpler to re-encode an album as a single *.m4a file, with possible plugin or hardware support for using those indices? That would eliminate the entire problem of offsets, calculated or actual, and enable true gapless playback for any player that didn't choke on the metadata. (Yes, I realize this would entail encoding each album as a single, large track, and that it would require players to buffer portions of the track - ideally, to buffer until the next index/chapter marker - but in my end-user/non-programmer/non-hardware-designer/feeble brain the method just makes sense!)

- M.
Mono
post Jun 23 2005, 04:13
Post #12





Group: Members (Donating)
Posts: 295
Joined: 4-December 03
From: Alabama
Member No.: 10171



Actually that's Apple's official stance:
QUOTE
Many music CDs contain songs that blend into each other, and importing them to iTunes may create a small gap between songs that interrupts the flow. If you use the iTunes Join Tracks feature, the program melds two or more songs into one, continuous gap-free track. So now you can enjoy listening to classical music, concept rock albums and extended dance mixes without the silent treatment.


--------------------
"Facts do not cease to exist just because they are ignored."
—Aldous Huxley
westgroveg
post Jun 23 2005, 05:55
Post #13





Group: Members
Posts: 1236
Joined: 5-October 01
Member No.: 220



I think guruboolez needs CASE.
soundcheck
post Jun 23 2005, 06:31
Post #14





Group: Members
Posts: 29
Joined: 19-June 05
Member No.: 22845



QUOTE (M @ Jun 22 2005, 09:37 PM)
Pardon my ignorance on the technicalities involved, but doesn't the MPEG-4 structure allow for chapter stops, or index points of some sort? And if so, wouldn't it be simpler to re-encode an album as a single *.m4a file, with possible plugin or hardware support for using those indices?
*


Audible is doing exactly what you describe for their audiobooks... large m4a files with chapter stops. It could potentially be used for true gapless albums.

Unfortunately it's still a mystery as to how the feature is implemented... sad.gif
bond
post Jun 23 2005, 20:20
Post #15





Group: Members
Posts: 881
Joined: 11-October 02
Member No.: 3523



QUOTE (soundcheck @ Jun 23 2005, 07:31 AM)
QUOTE (M @ Jun 22 2005, 09:37 PM)
Pardon my ignorance on the technicalities involved, but doesn't the MPEG-4 structure allow for chapter stops, or index points of some sort? And if so, wouldn't it be simpler to re-encode an album as a single *.m4a file, with possible plugin or hardware support for using those indices?
*


Audible is doing exactly what you describe for their audiobooks... large m4a files with chapter stops. It could potentially be used for true gapless albums.

Unfortunately it's still a mystery as to how the feature is implemented... sad.gif
*


do you have such a sample file?


--------------------
I know, that I know nothing (Socrates)
soundcheck
post Jun 23 2005, 23:06
Post #16





Group: Members
Posts: 29
Joined: 19-June 05
Member No.: 22845



QUOTE (bond @ Jun 23 2005, 03:20 PM)
QUOTE (soundcheck @ Jun 23 2005, 07:31 AM)

Audible is doing exactly what you describe for their audiobooks... large m4a files with chapter stops. It could potentially be used for true gapless albums.

Unfortunately it's still a mystery as to how the feature is implemented... sad.gif
*

do you have such a sample file?
*



Nothing I could legally redistribute.

These Audible files are heavily DRM'ed and can only be played in iTunes, and you're prompted for an Audible login and password when you try to add it to your library.

If one were so inclined, they could always search online for files with extension .aa but they're pretty much useless without an Audible account.
bond
post Jun 24 2005, 09:22
Post #17





Group: Members
Posts: 881
Joined: 11-October 02
Member No.: 3523



QUOTE (soundcheck @ Jun 24 2005, 12:06 AM)
These Audible files are heavily DRM'ed and can only be played in iTunes, and you're prompted for an Audible login and password when you try to add it to your library.

hm could this mean that the .aa files are simply .m4p files using apples drm?

try the following plz:

grab a copy of the mp4box tool and run the following commandline on the .aa file:
MP4Box -info input.aa
and post the output

get mp4box here


--------------------
I know, that I know nothing (Socrates)
soundcheck
post Jun 24 2005, 21:00
Post #18





Group: Members
Posts: 29
Joined: 19-June 05
Member No.: 22845



QUOTE (bond @ Jun 24 2005, 04:22 AM)
hm could this mean that the .aa files are simply .m4p files using apples drm?

try the following plz:

grab a copy of the mp4box tool and run the following commandline on the .aa file:
MP4Box -info input.aa
and post the output

get mp4box here
*



MP4Box can't open the file -- "extension not supported"...

After renaming the .aa to .mp4:

E:\MP4Box>MP4Box -info test.mp4
Error opening file test.mp4: Invalid IsoMedia File
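
A quick way to sanity-check this kind of result without MP4Box is to look at the first few bytes of the file. The sketch below is only a heuristic and not what MP4Box actually does: it assumes an MP4/M4A file starts with an 'ftyp' box, which is typical but not guaranteed (some older QuickTime-style files begin with a different box). The function name is made up.

```python
def looks_like_iso_media(header: bytes) -> bool:
    """Heuristic: does the data start with an ISO base media 'ftyp' box?

    A box begins with a 4-byte big-endian size followed by a 4-byte type,
    so the type of the first box sits at bytes 4..8.
    """
    return len(header) >= 8 and header[4:8] == b"ftyp"


# Hand-made headers for illustration: one MP4-like, one clearly not.
print(looks_like_iso_media(b"\x00\x00\x00\x18ftypM4A \x00\x00\x00\x00"))
print(looks_like_iso_media(b"\x00" * 16))
```

For a real file you would read the first 8 bytes with `open(path, "rb").read(8)` and pass them in.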

This post has been edited by soundcheck: Jun 24 2005, 21:25
bond
post Jun 25 2005, 10:09
Post #19





Group: Members
Posts: 881
Joined: 11-October 02
Member No.: 3523



ok so it indeed seems to be a non-mp4 file, thx!


--------------------
I know, that I know nothing (Socrates)
soundcheck
post Jun 25 2005, 18:29
Post #20





Group: Members
Posts: 29
Joined: 19-June 05
Member No.: 22845



QUOTE (bond @ Jun 25 2005, 05:09 AM)
ok so it indeed seems to be a non-mp4 file, thx!
*


I think I see what's going on here. It seems that Audible are using a variety of formats. The file I have is directly from Audible.com and uses a proprietary speech-based codec.

However, the Audible files from the iTunes Music Store apparently are AAC and have the same bookmark & chapter-stop features. Check this out:

QUOTE
2. Audible File Formats

A. Enhanced playback features
Audible utilizes a proprietary file format that includes custom features that improve the user experience over regular MP3 formats when listening to spoken word audio.

Audible files provide the ability to bookmark and remember your last heard position on each and every file stored on the iPod.  You can switch between Audible files, exit and listen to music, and go back and forth and the iPod will remember your last position played and pick up where you left off.

Also, Audible files are broken up into different sections, either by timed intervals, chapters, or program segments. These segment markers allow you to quickly advance backward or forward to the next section.

. . .

Audible files purchased from the iTunes music store are encoded in Apple's AAC format, and provide the same enhanced playback improvements as files with .aa extension downloaded directly from Audible.com.

Audible User's Guide
bond
post Jun 25 2005, 19:05
Post #21





Group: Members
Posts: 881
Joined: 11-October 02
Member No.: 3523



can you run mp4box's -info on the itunes audible file and post the results plz


--------------------
I know, that I know nothing (Socrates)
M
post Jun 25 2005, 19:13
Post #22





Group: Members
Posts: 964
Joined: 29-December 01
Member No.: 830



QUOTE (soundcheck @ Jun 25 2005, 12:29 PM)
I think I see what's going on here. It seems that Audible are using a variety of formats. The file I have is directly from Audible.com and uses a proprietary speech-based codec.
*

Audible uses the ACELP.net codec for many audiobooks. (Before anyone asks, no, I do not have any such files... although perhaps this is what soundcheck has? The bitrate should be in the neighborhood of ~16kbps, if so.) Although the Helix implementation includes ACELP.net encoding, documentation of the *.aa container is almost non-existent. If we could find a way to convert Helix-encoded ACELP audio to iPod-compatible files, that would also be useful.

- M.
nyaochi
post Jun 27 2005, 03:04
Post #23





Group: Members
Posts: 169
Joined: 30-September 01
From: Tokyo, Japan
Member No.: 99



Since no one seems to be taking on this task, I read the specification of the MP4 file format available from the spec, installed the latest iTunes on my machine, and downloaded the latest mpeg4ip tools to dump MP4 streams.

I googled and found a thread on implementing the gapless solution (but I later found the original thread, which contained an important piece of information, and realized that I shouldn't have taken this approach crying.gif ) and implemented a tool to set the "ctts" and "stts" MP4 boxes in iTunes' MP4 files (again, don't take this approach).

I measured iTunes’ encoder delay and found it to be probably 1088 (= 1024+64?). Then I modified the MP4 stream to store gapless-playback information. The following is an example dump of an MP4 stream generated by my experimental program:
CODE
     type stts
      version = 0 (0x00)
      flags = 0 (0x000000)
      entryCount = 2 (0x00000002)
       sampleCount = 431 (0x000001af)
       sampleDelta = 1024 (0x00000400)
       sampleCount[1] = 1 (0x00000001)
       sampleDelta[1] = 744 (0x000002e8)

     type ctts
      version = 0 (0x00)
      flags = 0 (0x000000)
      entryCount = 2 (0x00000002)
       sampleCount = 1 (0x00000001)
       sampleOffset = 1088 (0x00000440)
       sampleCount[1] = 431 (0x000001af)
       sampleOffset[1] = 0 (0x00000000)


RESULT:
In short, I could not get gapless playback/decoding with foobar2000/faad, even though foobar2000 displays the song length I expected:
01-itunes-1088.m4a: 441000 (= 431 * 1024 + 744 - 1088)
01-itunes.m4a: 442368 (= 432 * 1024)
foobar2000 and faad won’t remove the samples at the beginning that come from the encoder delay of iTunes’ AAC encoder.
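
The two lengths above follow directly from the 'stts' and 'ctts' entries in the dump. A small sketch of that arithmetic, where the function name is made up and the entries are written as (sampleCount, sampleDelta) pairs:

```python
def playable_samples(stts_entries, initial_offset=0):
    """Sum the durations from 'stts' and subtract the initial 'ctts' offset.

    stts_entries: list of (sampleCount, sampleDelta) pairs
    initial_offset: sampleOffset of the first 'ctts' entry (the encoder delay)
    """
    total = sum(count * delta for count, delta in stts_entries)
    return total - initial_offset


# 01-itunes-1088.m4a: 431 full frames, a last frame of 744, delay of 1088
print(playable_samples([(431, 1024), (1, 744)], 1088))  # 441000

# 01-itunes.m4a: untouched iTunes output, 432 full frames
print(playable_samples([(432, 1024)]))  # 442368
```

The last-frame delta of 744 corresponds to 1024 - 744 = 280 padded samples at the end.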

REASON FOR FAILURE:
I found a post saying, "don't use 'ctts' and 'stts' boxes for gapless playback", in the original HA thread, which is not in Google's cache. Now I understand why foobar and faad did not implement "ctts" for removing samples at the beginning.

HOW GAPLESS PLAYBACK IS ACHIEVED IN FAAC:
I have no idea how faac implements gapless playback. To remove the padded samples, we can use the duration field in the 'mdhd' MP4 box instead of 'stts'. But I was wondering how the decoder removed the samples that come from the FAAC encoder's delay. So I compared the waveforms of: the original wave; iTunes (delay = 1024; for debugging purposes only); iTunes (delay = 1088); iTunes (no delay information); faac (MP4 stream); and faac (AAC stream), in this order: http://nyaochi.sakura.ne.jp/temp/mp4-delay.png
All streams made by iTunes have the same delay even though I added delay information. Another interesting thing is that the AAC stream does not have any delay. AFAIK, an AAC stream does not contain gapless playback information, right? If so, the encoder delay of FAAC is found to be zero...

In conclusion, I could find a solution for removing the padded samples, but no solution for removing the samples at the beginning of a track that come from the encoder's delay. Does anyone know how to store encoder's delay in an MP4 stream? I'm disappointed to have wasted my weekend... sad.gif I've gotta sleep.
rjamorim
post Jun 27 2005, 03:57
Post #24


Rarewares admin


Group: Members
Posts: 7515
Joined: 30-September 01
From: Brazil
Member No.: 81



QUOTE (nyaochi @ Jun 26 2005, 11:04 PM)
Does anyone know how to store encoder's delay in an MP4 stream? I'm disappointed to have wasted my weekend...  sad.gif I've gotta sleep.
*


I think the Nero/Audiocoding guys never bothered to hack into the MP4 container a way to store delay, because both Nero and FAAC have the same delay, and FAAD is compatible with that delay, so it takes it into account automatically when decoding.

If that is correct, you would need to hack a way to store delay in MP4 yourself, and then patch FAAD to take this information into account.

QUOTE
If so, the encoder delay of FAAC is found to be zero...


Nope, but the encoder/decoder pair delay is zero smile.gif

If you decoded the FAAC-generated stream in iTunes, you would probably notice some delay.

This post has been edited by rjamorim: Jun 27 2005, 03:59


--------------------
Get up-to-date binaries of Lame, AAC, Vorbis and much more at RareWares:
http://www.rarewares.org
bond
post Jun 27 2005, 09:59
Post #25





Group: Members
Posts: 881
Joined: 11-October 02
Member No.: 3523



somehow i get the feeling that this "hacking" will not lead to anything good regarding interoperability with normal aac implementations and might break more (cant back this up, just a feeling and experience with private hacks)


therefore i would like to point out that the mp4 container explicitly offers one place where private info of any kind can and has to be stored, and that's the udta (userdata) atom

i would propose to use this for storing private gapless data and make the decoder of your choice (probably faad2) use this data from the udta
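
To give an idea of what that could look like at the byte level, here is a rough sketch. It relies only on the generic MP4 box layout (4-byte big-endian size, 4-byte type, then payload). The inner box type 'gapl' and its two-integer payload are invented purely for illustration; no decoder, faad2 included, knows about them, so a matching decoder patch would still be needed.

```python
import struct


def make_box(box_type: bytes, payload: bytes) -> bytes:
    """Serialize one MP4 box: 32-bit size (header included), type, payload."""
    if len(box_type) != 4:
        raise ValueError("box type must be exactly 4 bytes")
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


def make_gapless_udta(encoder_delay: int, padding: int) -> bytes:
    """Wrap delay and padding in a private box inside 'udta'.

    'gapl' is a hypothetical box type made up for this example.
    """
    private = make_box(b"gapl", struct.pack(">II", encoder_delay, padding))
    return make_box(b"udta", private)


# Illustrative values: delay 1088, padding 280.
print(make_gapless_udta(1088, 280).hex())
```

Writing these bytes into the right place of an existing file (and fixing up the sizes of the enclosing boxes) is the part a real tool or library would have to handle.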


--------------------
I know, that I know nothing (Socrates)