ITU-R BS.1387 Analysis Tool finally available, under GPL!
ff123
post Jan 6 2002, 18:12
Post #51


ABC/HR developer, ff123.net admin


Group: Developer (Donating)
Posts: 1396
Joined: 24-September 01
Member No.: 12



I'm not sure I buy the argument that averaging results over many different samples will improve the performance of the EAQUAL algorithm.

For example, suppose that EAQUAL in general fails to properly penalize pre-echo artifacts (which seems to be a plausible hypothesis). Then, averaging over many different samples won't necessarily compensate for this failing. Yes the results will be more reliable, but they won't be more valid.
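
The reliable-but-not-valid distinction can be sketched with a toy simulation. Everything here is hypothetical: the "true" quality value, the fixed bias (standing in for an artifact the metric under-penalizes) and the noise level are invented numbers, not EAQUAL measurements.

```python
import random
import statistics


def simulate_scores(n_samples, true_quality, bias, noise_sd, seed=0):
    """Return hypothetical metric readings: truth + a fixed bias + random noise."""
    rng = random.Random(seed)
    return [true_quality + bias + rng.gauss(0.0, noise_sd) for _ in range(n_samples)]


def mean_and_standard_error(scores):
    """Average the readings and estimate how much that average still wobbles."""
    mean = statistics.fmean(scores)
    sem = statistics.stdev(scores) / len(scores) ** 0.5
    return mean, sem


if __name__ == "__main__":
    # Assumed values: true quality -2.0, metric reads 0.8 too optimistic.
    for n in (5, 50, 500):
        mean, sem = mean_and_standard_error(simulate_scores(n, -2.0, 0.8, 0.5))
        print(f"n={n:4d}  mean={mean:+.2f}  standard error={sem:.3f}")
```

As more samples are averaged the standard error shrinks (more reliable), but the mean settles near truth plus bias rather than near the truth (no more valid).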

ff123
Speek
post Jan 6 2002, 19:54
Post #52





Group: Members
Posts: 394
Joined: 31-October 01
Member No.: 386



Alexander,

I wonder why a test sample for EAQUAL should be at least 10 to 20 sec.?
Alexander Lerch
post Jan 6 2002, 20:22
Post #53


zplane.development Compaact! developer


Group: Developer
Posts: 65
Joined: 4-January 02
Member No.: 918



QUOTE
Originally posted by ff123


For example, suppose that EAQUAL in general fails to properly penalize pre-echo artifacts (which seems to be a plausible hypothesis).  Then, averaging over many different samples won't necessarily compensate for this failing.  Yes the results will be more reliable, but they won't be more valid.

ff123


Hi,
Of course you are correct. EAQUAL in its current version is not very sensitive to pre-echo artifacts.
Let me put it this way: the algorithm's results will be more valid within the limitations of the algorithm. This was more or less proved, see Treurniet et al., Evaluation of the ITU-R Objective Audio Measurement Tool, Journal of the AES No.48, 2000

Alexander


--------------------
zplane.development
http://www.zplane.de
Alexander Lerch
post Jan 6 2002, 20:24
Post #54


zplane.development Compaact! developer


Group: Developer
Posts: 65
Joined: 4-January 02
Member No.: 918



QUOTE
Originally posted by Speek
Alexander,

I wonder why a test sample for EAQUAL should be at least 10 to 20 sec.?



For the analysis, a few seconds at the beginning and end do not influence the result. This is presumably because listeners are often less concentrated there, or may think artifacts like clicks are due to their playback equipment starting or stopping. I don't know if this is true or not. Perhaps there is another explanation too, but standards don't explain anything. :)

Alexander


--------------------
zplane.development
http://www.zplane.de
ff123
post Jan 6 2002, 20:41
Post #55


ABC/HR developer, ff123.net admin


Group: Developer (Donating)
Posts: 1396
Joined: 24-September 01
Member No.: 12



QUOTE
For the analysis, a few seconds at beginning and end do not influence the result. This should be because the listeners are often less concentrated or may think artefacts like clicks etc are due to their playing equipment which is starting or stopping. Don't know if this is true or not. Perhaps there is another explanation too, but standards don't explain anything.


There was another objective tool based on the work of Frank Baumgarte, which EarGuy implemented and played with in the r3mix forums. It seemed to consistently show bad results at the beginning of an mp3 file. I don't know of any listening tests which confirm this effect, or what could explain it, but in the files I prepare for listening tests, I sidestep this problem (if it is a real effect) by duplicating the entire sample, and then by cutting it in half (choosing the second half, of course) for the final file.

[tangent]Interestingly, Fraunhofer's mp3enc codec will produce different artifacting depending on what signal went before, so it can be tricky to capture bad sections for this codec by excerpting sections of a song, and the method I use above will yield different results for mp3enc than just using the straight sample.[/tangent]

I also found early on that playing files which don't begin or end with silence can cause clicks initially upon playback. I avoid this problem by adding some silence before and after the sample.
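
The two preparation steps described above (doubling the sample and keeping the second half, then padding with silence) can be sketched as follows. This is only an illustration on lists of PCM sample values: the function names are made up, the encode/decode step in between is left out, and the 0.5 s default padding is an assumption since no duration is given.

```python
def duplicate(samples):
    """Concatenate the sample with itself before it is handed to the encoder."""
    return list(samples) + list(samples)


def keep_second_half(decoded):
    """After decoding the doubled sample, keep only its second half."""
    return list(decoded[len(decoded) // 2:])


def pad_with_silence(samples, sample_rate, seconds=0.5):
    """Add digital silence before and after the sample (duration is an assumed default)."""
    silence = [0] * int(round(sample_rate * seconds))
    return silence + list(samples) + silence


if __name__ == "__main__":
    original = [3, -1, 4, -1, 5, -9]
    doubled = duplicate(original)
    # A real workflow would encode and decode `doubled` here.
    final = pad_with_silence(keep_second_half(doubled), sample_rate=4)
    print(final)
```

With a real codec the second half will generally differ from what encoding the plain sample would give, which is the point of the trick.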

ff123
Speek
post Jan 6 2002, 21:31
Post #56





Group: Members
Posts: 394
Joined: 31-October 01
Member No.: 386



QUOTE
Just out of curiosity, could someone also give --r3mix a shot.


Just added --r3mix to my test results.

Strange things happening on deerhunter.wav (sample from ff123). Lame --r3mix with only 113 kb/s is better than --alt-preset (fast) standard (154 and 163 kb/s). Also Psytel aacenc -normal (153 kb/s) is better than -extreme (183 kb/s). :confused:

Link: http://home.wanadoo.nl/~w.speek/eaqual.htm
Ivan Dimkovic
post Jan 6 2002, 21:31
Post #57


Nero MPEG4 developer


Group: Developer
Posts: 1466
Joined: 22-September 01
Member No.: 8



ITU-R BS.1387 also has an "advanced" ear model, which is implemented in the Opticom OPERA software.

Here is what AES paper #4931 (on PEAQ) says about the "advanced" model:

QUOTE
3.4.3 Advanced Version

Compared to the "basic" version, this model performs the time to frequency warping using a filter bank, thus grouping the signal into 40 auditory bands with a temporal resolution of approximately 0.66 ms. This allows for a very accurate modelling of backward masking effects.


Andree Buschmann told me in an email that he wasn't satisfied with the PEAQ tool used for debugging the AAC codec at Bosch: the tool missed some important artifacts in fatboy.wav. But that's OK from my point of view. BS.1387 has its applications and it is good to some degree, but it can't replace an audiophile's ear :)
Alexander Lerch
post Jan 7 2002, 09:23
Post #58


zplane.development Compaact! developer


Group: Developer
Posts: 65
Joined: 4-January 02
Member No.: 918



QUOTE
Originally posted by Alexander Lerch



For the analysis, a few seconds at beginning and end do not influence the result. 
Alexander


Oops, I was not correct here. It has been some time since I implemented this and I remembered it wrongly; perhaps I had read something about it. It is only 0.5 seconds at the beginning and end that don't influence the result. So the minimum sample length would be a little shorter.

Alexander


--------------------
zplane.development
http://www.zplane.de
Alex
post Jan 7 2002, 12:52
Post #59





Group: Members
Posts: 38
Joined: 7-January 02
Member No.: 954



I'm in the middle of some testing. I'd just like to confirm that 1024 is the correct offset for Psytel AAC. I'll post the results later.

Thanks,
Alex
Speek
post Jan 7 2002, 13:23
Post #60





Group: Members
Posts: 394
Joined: 31-October 01
Member No.: 386



QUOTE
Originally posted by Alex
I'm in the middle of some testing. I'd just like to confirm that 1024 is the correct offset for Psytel AAC. I'll post the results later.


It depends on what decoder you use. With the newest FAAD decoder dated 2002-01-05 the offset is 0.
Alex
post Jan 7 2002, 13:27
Post #61





Group: Members
Posts: 38
Joined: 7-January 02
Member No.: 954



So with 2002-01-05 it's 0 and pre-2002-01-05 it's 1024?
I just want to confirm this. I'm getting some surprising results for Psytel, while the results for other codecs are what I would expect.
I will post my results shortly, but I'd like to make sure they will be accurate.
Jan S.
post Jan 7 2002, 19:56
Post #62





Group: Admin
Posts: 2550
Joined: 26-September 01
From: Denmark
Member No.: 21



So if I wanted to test a LAME encoded and decoded file I shouldn't set any offset?


Jan.
Alex
post Jan 7 2002, 20:00
Post #63





Group: Members
Posts: 38
Joined: 7-January 02
Member No.: 954



Yes, as long as you use lame.exe --decode there is no offset. I'm not sure about other decoders.
Speek
post Jan 7 2002, 20:17
Post #64





Group: Members
Posts: 394
Joined: 31-October 01
Member No.: 386



QUOTE
So with 2002-01-05 it's 0 and pre-2002-01-05 it's 1024?


With FAAD 2002-01-05 the offset is 0
With FAAD 2001-06-06 the offset is 1024
With other FAAD versions I don't know. But you can easily do the test with castanets and CoolEdit (or any other wave editor) that Ivan described at the beginning of this thread.

QUOTE
So if I wanted to test a LAME encoded and decoded file I shouldn't set any offset?


The LAME decoder corrects the offset. So if you decode a LAME MP3 file with LAME the offset is 0. If you use another decoder with no offset correction the offset will probably be 1105.
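
For anyone who would rather not line up castanet clicks by eye, here is a minimal sketch of finding the decoder offset numerically. It is not what EAQUAL or any decoder does internally: it is a plain brute-force search (the function name and the test signal are invented for illustration) for the shift that best aligns the decoded samples with the reference.

```python
def estimate_offset(reference, decoded, max_offset):
    """Return the delay d in 0..max_offset that minimizes the mean squared
    difference between reference[i] and decoded[i + d]."""
    best_offset, best_error = 0, float("inf")
    for d in range(max_offset + 1):
        n = min(len(reference), len(decoded) - d)
        if n <= 0:
            break  # decoded signal too short to test larger delays
        error = sum((reference[i] - decoded[i + d]) ** 2 for i in range(n)) / n
        if error < best_error:
            best_offset, best_error = d, error
    return best_offset


if __name__ == "__main__":
    # Toy "castanet" click surrounded by silence.
    click = [0.0] * 10 + [1.0, -0.8, 0.5, -0.3] + [0.0] * 10
    delayed = [0.0] * 7 + click  # pretend the decoder added 7 samples of delay
    print(estimate_offset(click, delayed, max_offset=20))
```

On real files the search range would have to cover delays like the 1024 or 1105 samples mentioned above, and a sharp transient makes the minimum much easier to find than steady-state music.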
Alex
post Jan 7 2002, 21:05
Post #65





Group: Members
Posts: 38
Joined: 7-January 02
Member No.: 954



I downloaded the 2002-01-05 FAAD and the results were ok, although Psytel AAC still performed worse than Lame in the test that I did.
JohnV
post Jan 8 2002, 00:06
Post #66





Group: Developer
Posts: 2797
Joined: 22-September 01
Member No.: 6



QUOTE
Originally posted by ff123
There was another objective tool based on the work of Frank Baumgarte, which EarGuy implemented

Yeah, I hope Todd (EarGuy) can come up with an improved version of Baumgarte's digital ear model.

Quoting his email: "I discovered a major mistake that I made. I can't read German, so I was using online translators to translate Frank's dissertation to English. In his dissertation, he gives two different sets of model parameters to use. I didn't understand what the difference was between the two until now (online translators have improved in the last two years). And of course I was using the wrong set of parameters! So the previous results are in error. I will definitely have something to show early next year so watch the forums for my posts."

I hope that a correctly working version of Baumgarte's digital ear will be better than the PEAQ (ITU-R BS.1387) basic model.


--------------------
Juha Laaksonheimo
Ivan Dimkovic
post Jan 9 2002, 10:57
Post #67


Nero MPEG4 developer


Group: Developer
Posts: 1466
Joined: 22-September 01
Member No.: 8



QUOTE
Originally posted by JohnV
Yeah, I hope Todd (Earguy) can come up with improved Baumgarte's digital ear model. 

Quoting his email: "I discovered a major mistake that I made. I can't read German, so I was using online translators to translate Frank's dissertation to English. In his dissertation, he gives two different sets of model parameters to use. I didn't understand what the difference was between the two until now (online translators have improved in the last two years). And of course I was using the wrong set of parameters! So the previous results are in error. I will definitely have something to show early next year so watch the forums for my posts."

I hope that correctly working Baumgarte's digital ear will be better than PEAQ ITU-R BS.1387 basic model.


Speaking of the EAQUAL (PEAQ) psychoacoustic model, I finally took a look at the recommendation (the psychoacoustic model description), and I must say that it is based on a more advanced model than ISO Psychoacoustic Model II. I also know that one state-of-the-art compressor (I won't tell which!) uses many properties of this model.

The non-linear spreading function used in this model is described in Baumgarte's "Application of non-linear psychoacoustic model in ISO Layer III codec" (used to avoid tonality estimation problems). It also uses temporal masking.

Its only weakness is the 2048-sample length of the analysis window.
Speek
post Jan 9 2002, 15:49
Post #68





Group: Members
Posts: 394
Joined: 31-October 01
Member No.: 386



QUOTE
Its only weakness is the 2048-sample length of the analysis window.

Ivan, could you explain this a little bit more?

Here's how I imagine things:
A = reference file
B = test file
x = window where the attack/transient is (or begins)

I suppose EAQUAL compares each window of B to the corresponding window of A. So when there's pre-echo in B either Bx or Bx-1 will not be the same as Ax or Ax-1. So EAQUAL will detect the pre-echo. But I guess things don't work as I imagine?
Garf
post Jan 9 2002, 16:12
Post #69


Server Admin


Group: Admin
Posts: 4886
Joined: 24-September 01
Member No.: 13



Basically, the tool sees all audio in 20 ms chunks in the frequency domain. This means that it will not clearly see any distortion shorter than 20 ms in the temporal domain, because the error from the pre-echo is 'spread out' over a 20 ms timeframe.

The pre-echo is audible because it is a relatively big distortion over a very small time interval. The tool will not see the difference between it and a very small distortion over a larger interval, which would be inaudible.

2048 samples is about _twice_ the size of an MP3 _large_ block btw. This would mean it's really completely blind to any temporal artifact.

(Disclaimer: Not sure this is all 100% correct, and I hope those that know will correct the mistakes)
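
To put rough numbers on the window sizes discussed here, this is the plain samples-to-milliseconds arithmetic. The 44.1 kHz sample rate is an assumption (none is stated in the thread), the 1152-sample figure is an assumed reading of "MP3 large block", and the 1024-sample hop is an assumed 50% overlap of the 2048-sample window.

```python
def duration_ms(samples, sample_rate_hz):
    """Convert a length in samples to milliseconds."""
    return 1000.0 * samples / sample_rate_hz


if __name__ == "__main__":
    rate = 44100  # assumed CD sample rate
    for label, n in [
        ("2048-sample analysis window", 2048),
        ("1024-sample hop (assumed 50% overlap)", 1024),
        ("1152-sample MP3 long block (assumed)", 1152),
        ("256-sample short window", 256),
    ]:
        print(f"{label}: {duration_ms(n, rate):.1f} ms")
```

Under these assumptions the full window spans a bit over 46 ms and the hop a bit over 23 ms, so the "20 ms" figure is closer to the step between windows than to the window itself.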

--
GCP
Ivan Dimkovic
post Jan 9 2002, 18:33
Post #70


Nero MPEG4 developer


Group: Developer
Posts: 1466
Joined: 22-September 01
Member No.: 8



QUOTE
Originally posted by Speek
Ivan, could you explain this a little bit more?

Here's how I imagine things:
A = reference file
B = test file
x = window where the attack/transient is (or begins)

I suppose EAQUAL compares each window of B to the corresponding window of A. So when there's pre-echo in B either Bx or Bx-1 will not be the same as Ax or Ax-1. So EAQUAL will detect the pre-echo. But I guess things don't work as I imagine?


Imagine this situation:

Divide the 2048-sample window into 8 blocks of 256 time-domain samples (128 coefficients) each.

Now inject noise, say 10 dB, into any one of the 8 blocks. In every case a tool with a 2048-sample window will detect the same increase in noise, but it can't tell in which segment the problem occurred.

The 2048-sample (1024-coefficient) window is the standard MPEG AAC "LONG" window size. Because it leads to serious pre-echo problems, a smaller window of 256 samples is employed when the signal conditions require it.
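
The thought experiment above can be tried out with a deliberately simplified sketch. It is not how PEAQ/EAQUAL computes anything: it uses a rectangular, non-overlapping 2048-sample window and compares only total error energy, and the noise amplitude is an arbitrary stand-in for the "10 dB" in the example.

```python
import random

BLOCK = 256              # short-window length in samples
BLOCKS_PER_WINDOW = 8    # 8 * 256 = 2048-sample long window


def energy(samples):
    """Sum of squared sample values."""
    return sum(s * s for s in samples)


def inject_noise(block_index, amplitude=0.1, seed=1):
    """Return a 2048-sample error signal that is silent except for a noise
    burst in one 256-sample block. The same burst is used for every position."""
    rng = random.Random(seed)
    burst = [rng.uniform(-amplitude, amplitude) for _ in range(BLOCK)]
    error = [0.0] * (BLOCK * BLOCKS_PER_WINDOW)
    start = block_index * BLOCK
    error[start:start + BLOCK] = burst
    return error


def short_block_energies(error):
    """Error energy measured separately in each 256-sample block."""
    return [energy(error[i:i + BLOCK]) for i in range(0, len(error), BLOCK)]


if __name__ == "__main__":
    for position in range(BLOCKS_PER_WINDOW):
        error = inject_noise(position)
        per_block = short_block_energies(error)
        loudest = per_block.index(max(per_block))
        print(f"burst in block {position}: long-window energy {energy(error):.4f}, "
              f"short windows point to block {loudest}")
```

The long-window energy is the same wherever the burst sits, while the per-block energies identify its position. A burst just before a transient (pre-echo) and one hidden after it therefore look alike to the long window.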
dB
post Jan 9 2002, 19:23
Post #71





Group: Members
Posts: 64
Joined: 1-November 01
Member No.: 388



QUOTE
Originally posted by Ivan Dimkovic

I also know that one state-of-the-art compressor (I won't tell which !)


Fraunhofer aac demo v2.2? :D


bye, dB
cd-rw.org
post Feb 23 2002, 14:32
Post #72





Group: Members
Posts: 176
Joined: 5-October 01
Member No.: 217



Could someone test the GOGO NoCoda with competitive bitrate VBR settings?


--------------------
http://www.bitburners.com - We Burn a Bit
ashok
post Mar 14 2002, 13:21
Post #73





Group: Members
Posts: 9
Joined: 7-March 02
Member No.: 1456



Hello all,
Does anybody have free source code for ITU-R BS.1387, or a link to it? (http://sourceforge.net/projects/eaqual/ does not contain any code.)
If I have to purchase the EAQUAL source code, what is the price and where can I buy it?

You can contact me at


--------------------
thanks alot
Speek
post Mar 14 2002, 13:53
Post #74





Group: Members
Posts: 394
Joined: 31-October 01
Member No.: 386



QUOTE
Originally posted by ashok
Hello all,
Does anybody have free source code for ITU-R BS.1387, or a link to it? (http://sourceforge.net/projects/eaqual/ does not contain any code.)
If I have to purchase the EAQUAL source code, what is the price and where can I buy it?

You can contact me at

Direct download from Mitiok's site: http://home.pi.be/~mk442837/EAQUAL.tar.gz
ashok
post Mar 14 2002, 14:33
Post #75





Group: Members
Posts: 9
Joined: 7-March 02
Member No.: 1456



Thanks, Speek.
The code here builds and runs fine.
Thanks a lot :)


--------------------
thanks alot
