Does AAC use *inter-frame* compression?
Reply #4 – 2014-10-17 15:36:57
Quote:
    Quote:
        At a first glance, this looks like LC-AAC does not use inter-frame compression, but HEv2-AAC does. I suspect SBR could have something to do with this. But I might be wrong here.

    I think you are perfectly right. SBR (and probably also Parametric Stereo) use inter-frame techniques for lossless coding of the side-information.

Thanks a lot for confirming! And thanks for the reminder on Parametric Stereo!

Do you have any more specific information on this? Might it be possible to get rid of this inter-frame dependency, similar to how we can strip the "reservoir" from MP3 frames?

BTW: In the specs I found that SBR stores its data in the "sbr_extension_data" struct. But I found nothing that would clearly indicate an inter-frame dependency...

Quote:
    In (HE-)AAC the bit-reservoir only exists on the encoder side to allocate bits for each frame (which can be of varying size, unlike in MP3), so it does not introduce frame dependencies at the decoder.

This is what I was thinking too. Thus splicing AAC frames is supposed to work fine, in theory. However, it appears that SBR (and/or PS) will introduce some "reservoir" in the MP3 sense.

Regards.

EDIT: I think this whole problem may be related to the "ps_data" used by Parametric Stereo:

    ps_data()
    {
        if (enable_ps_header) {
            if (enable_iid) {
                iid_mode
                nr_iid_par = nr_iid_par_tab[iid_mode]
                nr_ipdopd_par = nr_ipdopd_par_tab[iid_mode]
            }
            if (enable_icc) {
                icc_mode
                nr_icc_par = nr_icc_par_tab[icc_mode]
            }
            enable_ext
        }
        [....]
    }

If the "enable_ps_header" bit is not set, there's no new PS header data and, according to the spec, the last configuration persists in that case.

So it could happen that the frame directly after the junction point doesn't have this bit set and thus will no longer decode correctly, because it uses the configuration of the preceding frames - which have been exchanged?!

If that is the problem, a possible workaround is to enforce that "enable_ps_header" is always TRUE, even if the configuration didn't change.
Would probably require a hacked encoder...
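
To make the suspected failure mode concrete, here is a minimal toy model in Python. It is not a real bitstream parser: `PsHeader`, `Frame`, `encode`, `decode_configs` and `mismatches` are hypothetical names made up for illustration, and the header is reduced to just `iid_mode` and `icc_mode`. The only behaviour taken from the spec excerpt above is that a frame without `enable_ps_header` reuses the last configuration the decoder has seen.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class PsHeader:
    # Simplified stand-in for the PS header fields (assumption: only these two).
    iid_mode: int
    icc_mode: int


@dataclass(frozen=True)
class Frame:
    # None models enable_ps_header == 0: no header is transmitted in this frame.
    header: Optional[PsHeader]
    # Configuration the encoder had in effect when it wrote this frame.
    intended: PsHeader


def encode(configs: List[PsHeader], always_send_header: bool = False) -> List[Frame]:
    """Toy encoder: send a header only when the config changes, unless forced."""
    frames = []
    last = None
    for cfg in configs:
        send = always_send_header or cfg != last
        frames.append(Frame(cfg if send else None, cfg))
        last = cfg
    return frames


def decode_configs(frames: List[Frame]) -> List[Optional[PsHeader]]:
    """Toy decoder: the last received header persists until a new one arrives."""
    current = None
    used = []
    for frame in frames:
        if frame.header is not None:
            current = frame.header
        used.append(current)
    return used


def mismatches(frames: List[Frame]) -> List[int]:
    """Indices of frames decoded with a config other than the intended one."""
    used = decode_configs(frames)
    return [i for i, (f, u) in enumerate(zip(frames, used)) if u != f.intended]


cfg_a = PsHeader(iid_mode=0, icc_mode=0)
cfg_b = PsHeader(iid_mode=3, icc_mode=2)

# Splice the first two frames of stream A onto the tail of stream B.
spliced = encode([cfg_a] * 4)[:2] + encode([cfg_b] * 4)[2:]
print(mismatches(spliced))  # [2, 3]: B's frames inherit A's configuration

# Same splice, but with the header forced into every frame.
forced = (encode([cfg_a] * 4, always_send_header=True)[:2]
          + encode([cfg_b] * 4, always_send_header=True)[2:])
print(mismatches(forced))  # []
```

In this model the frames after the junction pick up the wrong configuration only because their own stream sent its header once, before the cut. Forcing the header into every frame removes that dependency, at the cost of a few extra bits per frame. Whether this is the whole story for real HE-AAC v2 streams is still an open question here.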