
Alternative Multiformat Listening Test @ 128 kbps

As Sebastian Mares has practically finished accepting test results from participants, I would like to invite anyone interested to take part in an alternative listening test with the same codec contenders but different sound samples. The testing methodology is also a bit different and makes testing much easier, because sound artifacts are clearly audible in most cases. The listener's main task is to grade how annoying those artifacts are.

Test files can be downloaded from here – ftp://www.soundexpert.info
Each time you click the link you’ll get a random test file for one of these codecs:
•   Nero AAC 3.1.0.2
•   iTunes AAC 6.0.1.3
•   LAME 3.97 Beta 2
•   Ogg Vorbis AoTuV 4.51 Beta
•   WMA Professional 9.1
•   Shine 0.1.4 (Low Anchor)
Inside the zip file you’ll find brief instructions and a form for submitting results. The whole testing procedure is not hard and takes approx. 2-3 min., so you could easily test 5-10 files in one session. Each test file is 2-3 MB.

Results of this test will appear on this page at the same time as Sebastian Mares's. After that they can be monitored in real time as new participants add their grades.

Testing methodology used is described here - http://www.soundexpert.info/prozone.htm

I hope the comparison of the two tests' results will be interesting. If anyone has questions, I am glad to answer.
keeping audio clear together - soundexpert.org

Reply #1
Quote
Testing methodology used is described here - http://www.soundexpert.info/prozone.htm

So the test is based on samples with amplified artefacts? Is it fair to compare codecs that way?

I mean, some artefacts may be more audible than others after the amplification treatment. And depending on the listener the presence of a specific amplified artefact may cause a much worse rating than the original encoded sample.

Reply #2
Quote
So the test is based on samples with amplified artefacts? Is it fair to compare codecs that way?
I can’t give a definite YES or NO. The methodology is pretty new, but all preliminary tests and research show it is promising. Comparison with standard listening tests is very important in this respect.

Quote
I mean, some artefacts may be more audible than others after the amplification treatment. And depending on the listener the presence of a specific amplified artefact may cause a much worse rating than the original encoded sample.
Actually the methodology is more complicated than just amplifying artifacts. At least three artificial test stimuli are generated for each codec and sound sample. The final rating is computed on the basis of these stimuli, as graded by listeners.
keeping audio clear together - soundexpert.org

Reply #3
Artificial stimuli? As opposed to music? Isn't that as bad as using a spectral analysis to evaluate codec quality?

Reply #4
Quote
Artificial stimuli? As opposed to music? Isn't that as bad as using a spectral analysis to evaluate codec quality?
“Artificial” here is “made from natural sound samples by means of amplifying artifacts”.
keeping audio clear together - soundexpert.org

Reply #5
Interesting to check out the comparisons at other bit rates... QDesign's mp2 encoder seems to beat LAME -V 0 --vbr-new in the 224 kbps range.    But the diagram might be a bit misleading, I don't know how many people have taken the test.
//From the barren lands of the Northsmen

Reply #6
Quote
Interesting to check out the comparisons at other bit rates... QDesign's mp2 encoder seems to beat LAME -V 0 --vbr-new in the 224 kbps range.    But the diagram might be a bit misleading, I don't know how many people have taken the test.
Right now it is definitely misleading, because too few results have been returned. All statistics will soon be available by clicking the appropriate rating bar. For now, some information about the reliability of the ratings can be derived from the figures in brackets inside the bars, which show, in percent, the width of the error tube for the last 10 values of a rating.
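The post does not define the "error tube", so the following is only a rough sketch of one plausible reading: track the running mean of the grades, and report how far that running mean has wandered over its last 10 values, as a percentage of its latest value. The function names, the use of a running mean, and the max-minus-min definition of width are all assumptions made for illustration, not SoundExpert's actual formula.

```python
def running_means(grades):
    """Mean of all grades received so far, after each new grade."""
    means, total = [], 0.0
    for count, grade in enumerate(grades, start=1):
        total += grade
        means.append(total / count)
    return means


def error_tube_percent(grades, window=10):
    """Spread of the last `window` running means, in percent of the latest one.

    Assumed definition (not taken from SoundExpert): width = max - min.
    Grades are assumed to be positive (e.g. a 1..5 scale).
    """
    means = running_means(grades)
    if not means:
        raise ValueError("need at least one grade")
    recent = means[-window:]
    return 100.0 * (max(recent) - min(recent)) / recent[-1]
```

Under this reading, a small percentage means the rating has stopped moving as new grades arrive, while a large one means the bar could still shift noticeably.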

BTW, are you sure that mp2 has to be worse than mp3 at high bitrates?
keeping audio clear together - soundexpert.org

Reply #7
Quote
BTW, are you sure that mp2 has to be worse than mp3 at high bitrates?
Am I sure? Absolutely not. I haven't compared them. I was just under the impression that QDesign was a rather poor encoder, but that might have been their mp3 encoder.
//From the barren lands of the Northsmen

Reply #8
Quote
I was just under the impression that QDesign was a rather poor encoder, but that might have been their mp3 encoder.
My impression is that QDesign's mp2 is good, at least in comparison with the one from TMPGEnc, which has clearly audible artifacts even without any artifact amplification.

I wouldn't be surprised if mp2 at 224 kbps were better than mp3. It has to be so in theory.
keeping audio clear together - soundexpert.org

Reply #9
Quote
I wouldn't be surprised if mp2 at 224 kbps were better than mp3. It has to be so in theory.


"What theory would that be?"

MP3 has a much more efficient encoding structure than MPEG Layer 2. This doesn't magically change at high bitrates.

 

Reply #10
Quote
Quote
I wouldn't be surprised if mp2 at 224 kbps were better than mp3. It has to be so in theory.


"What theory would that be?"

MP3 has a much more efficient encoding structure than MPEG Layer 2. This doesn't magically change at high bitrates.

What about pre-echo then?
It has often been said that Layer 3 isn't the most efficient layer at high bitrates. One example is Frank Klemm's comments on the subject, here (http://www.hydrogenaudio.org/forums/index.php?showtopic=1189&view=findpost&p=11061):
Quote
Above 256 kbps MPEG-1 Layer 3 makes no sense. If you need
such bitrates, the reason for this high bitrate are flaws introduced with Layer 3. Often Layer 2 performs much better
than Layer 3 at the same bitrate. And note that Layer 2
also supports 384 kbps.

For Fatboys also MPEG-1 Layer 1 performs better than Layer 2
at the same bitrate.

For transparent encodings of typical pop music Layer 2 normally
outperforms Layer 3.

TOS#8 violation from F. Klemm ?

Reply #11
Note the parts "256kbps", "Often"  and "note that Layer 2 also supports 384 kbps.". You have to go high enough for MP2's inefficiency compared to MP3 to be balanced by MP3's limitations (maximum bitrate of 320kbps is one of them). MP3 being more efficient isn't going to help it when it's coding at 320kbps and MP2 is coding at a higher rate

There's no "theory" which says that at 224kbps layer 2 "must" be better than layer 3.

Frank neglected to tell which layer 2 encoder he was thinking of (remember Musepack is based on layer 2)

Reply #12
Quote
Note the parts "256kbps", "Often"  and "note that Layer 2 also supports 384 kbps.". You have to go high enough for MP2's inefficiency compared to MP3 to be balanced by MP3's limitations (maximum bitrate of 320kbps is one of them). MP3 being more efficient isn't going to help it when it's coding at 320kbps and MP2 is coding at a higher rate

There's no "theory" which says that at 224kbps layer 2 "must" be better than layer 3.

Frank neglected to tell which layer 2 encoder he was thinking of (remember Musepack is based on layer 2)

I noted it. But Klemm clearly stated that above 256 kbps Layer 2 is better than MP3 at the same bitrate. He is talking about MPEG Layer 2, not Musepack. You may remember that Frank once talked about releasing an MPEG Layer 2 encoder using various tunings used with Musepack.

But the most important part is that Frank seems to disagree with you about "MP3 has a much more efficient encoding structure than MPEG Layer 2. This doesn't magically change at high bitrates."
MP3 has known flaws. You mentioned one of them, pre-echo, in a recent topic. MP3 can't solve this. MP2 is better:
http://www.personal.uni-jena.de/~pfk/mpp/timeres.html

Reply #13
As far as I recall, Klemm actually considered moving mpc to a transform codec (like mp3) exactly for the reason Garf stated... He seemed to think that a properly tuned transform codec should outperform current mpc. I agree, though, that that opinion is not reflected in the quote you found.

Reply #14
If you spend more bits on the right coefficients, the time smearing gets less. This is true even if the time resolution is low. Since MP3 is more efficient, it can afford to spend more bits to do this (remember GTune Vorbis?). That's why it's not so clear-cut.

On the page you link though, there's this:

Quote
To my mind only MPEG-4 AAC is capable to eliminates all disadvantages of the additional frequency resolution. The result is transparent coding at data rates around 120...130 kbps (instead of 170...180 kbps as MPEGplus). But a

    * high quality MPEG-4 AAC Encoder is much much more difficult to program and to tune than a MPEGplus encoder


I agree strongly with this (it's also true for MP3 vs MP2), and it points out that the "theoretical" advantages any format may have can be problematic, since they must be used fully and correctly. We have not reached this point with MP3, and it will take much longer with AAC.

That's why I also disagree with the statement that format X must be better than format Y. The implementation is the limiting factor.

Reply #15
Quote
That's why I also disagree with the statement that format X must be better than format Y. The implementation is the limiting factor.

I also agree with it, of course
It's just that Frank Klemm's comments make sense. It's like HE-AAC or Parametric Stereo: they outperform LC-AAC or joint stereo in efficiency at low bitrates. But at higher ones, these tools lead to artifacts or distortions you won't get with less efficient tools. This sudden change has nothing to do with magic. A sprinter is always a poor 10.000 meters runner.
There are several samples which may be transparent with Layer 2 at high bitrates, even if the mp2 encoder is not intensively tuned, or which may at least be less distorted than with a very high quality MP3 implementation such as LAME. I can ABX castanets at 640 kbps with LAME freeformat, but I'll probably fail with most other formats at half this bitrate.
Implementation plays a big part in quality, but an inherent flaw in the design can definitely handicap a format. Klemm often criticized the design of Layer 3 (and not transform encoders in themselves) in the past.

Actually, I wouldn't exchange one LAME encoded file for 10 mp2 (toolame...) files at 224 kbps. There's maybe less pre-echo (even that is not certain), but there are several other forms of distortion.

Reply #16
The problem with PS and SBR is that they are parametric tools. You cannot add bits to the SBR or PS layer and expect it to keep improving. They are hard limited.

Normal LC AAC doesn't have such a problem, until it hits the hard bitrate ceiling (500 and something kbps), and MP3 should be the same. I don't like the analogy because of this reason; AAC and MP3 have such a wide usable bitrate range because they work like this and not like PS/SBR.

It's possible that at some point solving MP3 preecho by adding bits is so terribly inefficient that MP2 surpasses it. But I have my doubts that it happens consistently at 224kbps, if only because time resolution is one of the very few areas where MP2 is better and for a lot of things MP3 isn't limited by it.

Reply #17
The testing procedure of this second listening test puzzled me.
I downloaded a file, and it appears that I can only rate one encoder for a given sample. It's a single A-B procedure. There's nothing wrong with that. What perplexed me is the role of the anchor. Isn't it intended to prevent the listener from temperamental rating? That implies the listener can access the anchor while testing the other contenders. But in your testing procedure, you can only access one encoded file and the hidden reference; the anchor, like all the other contenders, is not accessible. The anchor therefore can't play any role, at least not that of an anchor. It's just an additional contender.

It also means that the listener can't rate all competitors in a row. I can download:
- LAME => hearing a distortion => give it a grade of 2/5
- then the ANCHOR => finding it awful => give it a grade of 1/5
- then download LAME again => hearing the same distortion as before => give it a grade of 4/5, because I would consider it much better than the anchor quality I still have in mind.

It's clearly recommended to evaluate all encodings in a row, and to compare them with each other before rating them all. That's how ABC/HR software works. Or at least to have the possibility of ranking all encodings in a short amount of time. With SoundExpert's procedure that looks impossible, and the ranking could vary with the tester's mood, leading to incoherent results.

Reply #18
Quote
What perplexed me is the role of the anchor. Isn't it intended to prevent the listener from temperamental rating? That implies the listener can access the anchor while testing the other contenders. But in your testing procedure, you can only access one encoded file and the hidden reference; the anchor, like all the other contenders, is not accessible. The anchor therefore can't play any role, at least not that of an anchor. It's just an additional contender...
It's clearly recommended to evaluate all encodings in a row, and to compare them with each other before rating them all. That's how ABC/HR software works. Or at least to have the possibility of ranking all encodings in a short amount of time. With SoundExpert's procedure that looks impossible, and the ranking could vary with the tester's mood, leading to incoherent results.

Yes. The absence of a low anchor really does increase the dispersion of results, but that has to be compensated for by broad participation of testers. SoundExpert's target audience is completely unprepared and in most cases has no idea what listening tests are. Instead of educating and training them (which is hard and thankless in the real world), I decided to make the listening procedure as simple as possible. It uses the basic skills of an average listener: just “like” and “dislike” with a few intermediate states. As the artifacts are clearly audible, the influence of “temperamental rating” is not that high. Each person has their own “inborn scale of annoyances”, and in this case it’s better just to use it rather than build one for this particular listening test.
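To make the trade-off between dispersion and participation concrete, here is a minimal sketch based on the textbook standard error of a mean. It assumes the grades are independent and identically distributed, which an anchorless test with mood-dependent grading may not satisfy; the function names and the example numbers are hypothetical and do not come from SoundExpert.

```python
import math


def standard_error(grade_sd, listeners):
    """Standard error of the mean grade, assuming independent grades."""
    return grade_sd / math.sqrt(listeners)


def listeners_needed(noisy_sd, reference_sd, reference_listeners):
    """Listeners needed so that a noisier test matches the standard error
    of a reference test with `reference_listeners` participants."""
    ratio = noisy_sd / reference_sd
    return math.ceil(reference_listeners * ratio ** 2)


if __name__ == "__main__":
    # Hypothetical numbers: grades spread 1.5x wider without an anchor.
    print(listeners_needed(noisy_sd=1.5, reference_sd=1.0, reference_listeners=20))
```

Because the required number of listeners grows with the square of the spread, a modest increase in dispersion already calls for noticeably more participants.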

Of course, it is a compromise between simplicity of procedure and the scientific significance of its results. SoundExpert is a highly experimental research project, and so far it shows that this compromise works. I think a more fruitful discussion will be possible when the raw stats are available.

And now I just ask for volunteers to download and grade a test item. Really, it’s more like fun than a listening test. And as you see, the results are pretty close to Sebastian’s.
keeping audio clear together - soundexpert.org

Reply #19
I understand. The procedure is indeed very easy to follow. That's really important if the purpose is to reach a wide audience through the web.
The testing procedure discards all relevance of the anchor concept, so I wondered about the point of using Shine in your test. I suppose your aim was to fully mimic Sebastian's test, right?

Reply #20
Quote
The testing procedure discards all relevance of the anchor concept, so I wondered about the point of using Shine in your test. I suppose your aim was to fully mimic Sebastian's test, right?
Exactly.   

To be precise, the low anchor is absent because it is totally subjective in SE testing, and that’s why it can’t be called an “anchor”.
keeping audio clear together - soundexpert.org

Reply #21
Just noticed something on the page...

Quote
aac ABR@132.5 (Nero 7.0) - MPEG-4 AAC ABR Low Complexity, 132.5 kbit/s FBR
CODER: Nero Digital Audio AAC Encoder (3.1.0.2) from Nero Burning Rom 7.0.1.2 [emphasis is mine]


Is that really true?

Edit: Also, how did you get MAD to decode the Shine samples? It always failed on my side telling me that it cannot decode Dual Channel or something.  I had to use LAME for decoding.

Reply #22
Quote
Just noticed something on the page...

Quote
aac ABR@132.5 (Nero 7.0) - MPEG-4 AAC ABR Low Complexity, 132.5 kbit/s FBR
CODER: Nero Digital Audio AAC Encoder (3.1.0.2) from Nero Burning Rom 7.0.1.2 [emphasis is mine]


Is that really true?


No, it’s not true. And now it’s not true twice over. I was going to change the Nero version in the future to the one that will contain the dll (or at least the ABR part) used in the test. Now I’m not sure whether to continue testing with an explanation of the Nero AAC problem, or to exclude it from testing (or to include the real AAC encoder from the latest release instead).

Quote
Also, how did you get MAD to decode the Shine samples? It always failed on my side telling me that it cannot decode Dual Channel or something.  I had to use LAME for decoding.
I decoded shine mp3 successfully:
C:\USER\Serge>madplay.exe -v -b 16 -d -S -o wave:.\out.wav out.mp3
MPEG Audio Decoder 0.15.2 (beta) - Copyright © 2000-2004 Robert Leslie et al.
00:01:33 Layer III, 128 kbps, 44100 Hz, dual channel, no CRC
3593 frames decoded (0:01:33.8), -1.5 dB peak amplitude, 0 clipped samples

The problem on my side was with the first second of the encoded Shine mp3: there was a loud click, so I added one second of digital silence at the beginning of the SE test file to work around it.
keeping audio clear together - soundexpert.org

Reply #23
Quote
... A sprinter is always a poor 10.000 meters runner. ....

I like this comparison for high / low bitrate encodings.
Quote
... I can ABX castanets at 640 kbps with LAME freeformat, but I'll probably fail with most other formats at half this bitrate. ...

As I'm pretty insensitive to pre-echo, would you be so kind as to try castanets with
-  Helix -V140 -X2 -HF2 -SBT450 -TX0 -F18600                      or similar
-  Lame 3.90.3 or 3.91 --abr 270 -b224 -h --lowpass 18600    or similar ?

As I'm considering using lower bitrates than I did before, Vorbis comes to mind again (on my iRiver H140, battery drain with Vorbis is unfortunately rather high, but I can compensate for it a bit by using lower bitrates). Can you say something about aoTuV 4.51's pre-echo behavior at -q7 or -q6?

Thanks in advance.
lame3995o -Q1.7 --lowpass 17

Reply #24
Quote
Quote
I wouldn't be surprised if mp2 at 224 kbps were better than mp3. It has to be so in theory.


"What theory would that be?"

MP3 has a much more efficient encoding structure than MPEG Layer 2. This doesn't magically change at high bitrates.

Sorry for the late answer, but now, after guruboolez's posts, I can only add a citation from the well-known paper “MP3 and AAC Explained”:

Code: [Select]
5.6. Bit-rate versus quality
MPEG audio coding does not work with a fixed compression rate. The user can choose the bit-rate and this way the compression factor. Lower bit-rates will lead to higher compression factors, but lower quality of the compressed audio. Higher bit-rates lead to a lower probability of signals with any audible artifacts. However, different encoding algorithms do have "sweet spots" where they work best. At bit-rates much larger than this target bit-rate the audio quality improves only very slowly with bit-rate, at much lower bit-rates the quality decreases very fast. The "sweet spot" depends on codec characteristics like the Huffman codebooks, so it is common to express it in terms of bit per audio sample. For Layer-3 this target bit-rate is around 1.33 bit/sample (i.e. 128 kbit/s for a stereo signal at 48 kHz), for AAC it is around 1 bit/sample (i.e. 96 kbit/s for a stereo signal at 48 kHz). Due to the more flexible Huffman coding, AAC can keep the basic coding efficiency up to higher bit-rates enabling higher qualities. Multichannel coding, due to the joint stereo coding techniques employed, is somewhat more efficient per sample than stereo and again than mono coding...
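The bit-per-sample figures in the quoted passage follow from simple arithmetic, which the short sketch below reproduces. It assumes 1 kbit/s means 1000 bit/s and that "per sample" means per sample of each channel, which is the reading that matches the quoted 1.33 and 1 bit/sample values; the function names are made up for illustration.

```python
def to_bits_per_sample(bitrate_kbps, sample_rate_hz, channels=2):
    """Average coded bits spent on each audio sample of each channel."""
    return bitrate_kbps * 1000 / (sample_rate_hz * channels)


def to_kbps(bits_per_sample, sample_rate_hz, channels=2):
    """Bitrate in kbit/s that corresponds to a bit-per-sample budget."""
    return bits_per_sample * sample_rate_hz * channels / 1000


if __name__ == "__main__":
    # Layer-3 example from the quote: 128 kbit/s, stereo, 48 kHz
    print(round(to_bits_per_sample(128, 48_000), 2))
    # AAC example from the quote: 96 kbit/s, stereo, 48 kHz
    print(to_bits_per_sample(96, 48_000))
    # The same Layer-3 budget at the 44.1 kHz CD sample rate
    print(round(to_kbps(4 / 3, 44_100), 1))
```

Expressing the target per sample is what lets the same figure be carried over to other sample rates or channel counts.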


MPEG layers were designed to provide sufficient sound quality while being efficient in their area of application. For this reason the sweet spot of mp1 (384-448 kbps) provides better audio quality than the sweet spot of mp2 (192-256 kbps), which in turn is better than the sweet spot of mp3 (112-160 kbps). Because sound quality improves only slowly at bitrates above the sweet spot, there are points where the previous layer begins to provide better audio quality (AAC is going to be an exception to this rule; time and tests will show). For mp3/mp2 that point is between 224 and 256 kbps.
So using layers at bitrates above their sweet spot is inefficient and is reasonable only for compatibility.
keeping audio clear together - soundexpert.org