pre-echo control of psychoacoustic model II

2004-02-12 13:40:58

The MPEG-4 AAC ISO documentation includes an example psychoacoustic model II, and it has a part on pre-echo control:

nb(n) = max(q_thr(n), min(nb(n), scale*nb_prev(n)))

It also states the scale for each block type:

long block: scale = 2.0
short block: scale = 1.0
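To make the effect of the two scale values concrete, here is a minimal sketch of the clamp above. The function name and the three-partition numbers are made up for illustration; only the formula and the two scale values come from the post.

```python
def pre_echo_control(nb, nb_prev, q_thr, scale):
    """Apply nb(n) = max(q_thr(n), min(nb(n), scale * nb_prev(n))) per partition.

    nb      -- thresholds computed for the current block
    nb_prev -- thresholds of the previous block
    q_thr   -- threshold in quiet, used as a floor
    scale   -- how far the threshold may rise relative to the previous block
    """
    return [max(q, min(cur, scale * prev))
            for cur, prev, q in zip(nb, nb_prev, q_thr)]


# Hypothetical values for three partitions (not taken from the standard).
nb_prev = [1.0, 4.0, 0.5]
nb = [10.0, 5.0, 0.1]
q_thr = [0.2, 0.2, 0.2]

long_result = pre_echo_control(nb, nb_prev, q_thr, scale=2.0)
short_result = pre_echo_control(nb, nb_prev, q_thr, scale=1.0)

print(long_result)   # [2.0, 5.0, 0.2]
print(short_result)  # [1.0, 4.0, 0.2]
```

With scale = 1.0 the threshold can never rise above the previous block's value, which is the behaviour being questioned here.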

This scaling for the short block is interesting. With scale = 1.0 the threshold can never rise above the previous block's value, which suggests that when an attack occurs, the maskers for the 8 short blocks are identical! However, experimental data and theoretical knowledge indicate that for signals with a lot of transients, the maskers vary greatly from frame to frame.

On the other hand, the masker for long blocks doesn't change very much compared to previous frames.

I wonder if there is a documentation error?

Reply #1 – 2004-02-12 13:58:42

I think it is probably a documentation error (in fact, I am sure it is).

In MP3 the values are 2.0 and 2.0 (long, short), IIRC.

These values have to be found empirically, by performing listening tests (if the codec uses this method of pre-echo protection). Since AAC windows are of a different size, and AAC also has other tools like TNS, the numbers would have to be different from those for MP3.

Reply #2 – 2004-02-13 09:33:33

2.0 for short? I personally think that is still too small!

With a start_block, the pre-echo control can in some cases make the masker too low, causing a significant increase in the calculated PE!
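A rough sketch of why a clamped threshold inflates PE. It assumes the usual model II form of perceptual entropy, PE = -sum(width(b) * log10(nb(b) / (e(b) + 1))); that formula, the function name, and all numbers below are assumptions for illustration, not quotes from the standard.

```python
import math


def perceptual_entropy(nb, energy, width):
    """Assumed PE form: -sum(width * log10(nb / (energy + 1))) over partitions."""
    return -sum(w * math.log10(t / (e + 1.0))
                for t, e, w in zip(nb, energy, width))


# One hypothetical partition, 4 lines wide, with energy 100.
energy = [100.0]
width = [4]

nb_free = [10.0]     # threshold the model computed for the current block
nb_clamped = [1.0]   # same threshold after being held down to the previous block's value

pe_free = perceptual_entropy(nb_free, energy, width)
pe_clamped = perceptual_entropy(nb_clamped, energy, width)

print(pe_free, pe_clamped)
```

In this toy case a tenfold drop in the threshold adds width * log10(10) = 4 to the PE of that single partition, so a clamp that bites across many partitions can move the total considerably.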

Removing the pre-echo control entirely doesn't seem to cause any degradation in sound quality.

But why is it necessary to have the pre-echo control anyway? Are there some limitations to the predictive technique used for the tonality calculation?

Reply #3 – 2004-02-13 13:28:37

Hmm, I'm not using this method of pre-echo control at all.

It is there to compensate for the imperfection of the T/F mapping, i.e. to minimize problems related to the size of an MDCT window.

In finding the best parameter you must take into account the TNS tool - which already eliminates a lot of need for overcoding.

Probably something between 6 and 10 would be sufficient for short blocks.
