lame 3.97 alpha 5 testing thread, tests & results
guruboolez
post Jan 11 2005, 08:03
Post #1





Group: Members (Donating)
Posts: 3474
Joined: 7-November 01
From: Strasbourg (France)
Member No.: 420



The 12 samples are the ff123 sample suite selected for the 64 kbps listening test.
www.ff123.net

Encoders and settings:
• lame 3.90.3 | John33 compile | --alt-preset 128
• lame 3.96.1 | John33 compile | --preset 128
• lame 3.97.a5 | John33 compile | --preset 128 -X 10,10

Hardware and software configuration
• Audigy2 soundcard
• Onkyo R-A5 FM/AM Tuner Amplifier
• Beyerdynamic DT-531 headphones
• ff123's ABC/HR 1.1 beta 2

Personal mood
• tired... I didn't spend much time on this test, and I only ABXed the important things (the two best encoders, and only when needed).

RESULTS


CODE
                  3.90.3   3.96.1  3.97.a5
ATrain              3.0      2.0     3.5
BachS1007           4.5      3.5     3.5
BeautySlept         1.5      2.5     3.5
Blackwater          4.5      3.5     4.5
FloorEssence        1.8      1.5     2.5
Layla               3.3      1.0     3.0
LifeShatters        3.0      1.5     3.0
LisztBMinor         3.5      1.5     2.0
MidnightVoyage      2.0      1.0     1.8
thear1              4.5      3.0     4.0
TheSource           4.0      3.0     5.0
Waiting             1.0      1.5     3.0
___________MEANS   3.05     2.13    3.27


COMMENTS
• Most often, lame 3.96.1 was clearly behind lame 3.90.3 and lame 3.97.a5.
• I had some trouble differentiating lame 3.90.3 from lame 3.97.a5, and I often changed my ratings. Both are close (in my experience), and it wasn't easy for me to tell which sounded worse or better, even when a difference was audible and ABXable.
• The newest alpha apparently had serious trouble with the LisztBMinor.wav sample: the background noise was severely damaged, which removed precious musical information. The difference can be checked in a spectral view, where it is really striking. IIRC [proxima] also reported problems with this sample and recent lame builds.

ANALYSIS

ANOVA ANALYSIS

CODE
ANOVA
FRIEDMAN version 1.24 (Jan 17, 2002) http://ff123.net/
Blocked ANOVA analysis

Number of listeners: 12
Critical significance:  0.05
Significance of data: 4.54E-004 (highly significant)
---------------------------------------------------------------
ANOVA Table for Randomized Block Designs Using Ratings

Source of         Degrees     Sum of    Mean
variation         of Freedom  squares   Square    F      p

Total               35          45.01
Testers (blocks)    11          27.30
Codecs eval'd        2           8.91    4.46   11.15  4.54E-004
Error               22           8.80    0.40
---------------------------------------------------------------
Fisher's protected LSD for ANOVA:   0.535

Means:

3.97.a5  3.90.3   3.96.1  
 3.27     3.05     2.13  

---------------------------- p-value Matrix ---------------------------

        3.90.3   3.96.1  
3.97.a5  0.393    0.000*  
3.90.3            0.002*  
-----------------------------------------------------------------------

3.97.a5 is better than 3.96.1
3.90.3 is better than 3.96.1



TUKEY PARAMETRIC ANALYSIS

CODE
TUKEY PARAMETRIC

FRIEDMAN version 1.24 (Jan 17, 2002) http://ff123.net/
Tukey HSD analysis

Number of listeners: 12
Critical significance:  0.05
Tukey's HSD:   0.649

Means:

3.97.a5  3.90.3   3.96.1  
 3.27     3.05     2.13  

-------------------------- Difference Matrix --------------------------

        3.90.3   3.96.1  
3.97.a5    0.225    1.150*
3.90.3              0.925*
-----------------------------------------------------------------------

3.97.a5 is better than 3.96.1
3.90.3 is better than 3.96.1


=> 3.96.1 is the worst; 3.90.3 and 3.97.a5 are tied.
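
For readers who want to check the tool's arithmetic: the ANOVA table above is consistent with a standard randomized-block ANOVA in which each of the 12 samples acts as a block (the tool labels them "listeners") and the three encoders are the treatments. What follows is a minimal Python sketch under that assumption. It is not the FRIEDMAN tool's source, and the name `blocked_anova` is made up for illustration. It recomputes the sums of squares and the F ratio only; the p-value would need the F distribution and is left out.

```python
# Ratings from the 128 kbps results table, one row per sample.
# Columns: lame 3.90.3, lame 3.96.1, lame 3.97.a5.
RATINGS = [
    (3.0, 2.0, 3.5),  # ATrain
    (4.5, 3.5, 3.5),  # BachS1007
    (1.5, 2.5, 3.5),  # BeautySlept
    (4.5, 3.5, 4.5),  # Blackwater
    (1.8, 1.5, 2.5),  # FloorEssence
    (3.3, 1.0, 3.0),  # Layla
    (3.0, 1.5, 3.0),  # LifeShatters
    (3.5, 1.5, 2.0),  # LisztBMinor
    (2.0, 1.0, 1.8),  # MidnightVoyage
    (4.5, 3.0, 4.0),  # thear1
    (4.0, 3.0, 5.0),  # TheSource
    (1.0, 1.5, 3.0),  # Waiting
]


def blocked_anova(rows):
    """Sums of squares and F ratio for a randomized block design.

    rows: one tuple of ratings per block (here, per sample);
    each position in the tuple is one treatment (here, one encoder).
    """
    n_blocks, n_treat = len(rows), len(rows[0])
    grand = sum(sum(r) for r in rows) / (n_blocks * n_treat)

    ss_total = sum((x - grand) ** 2 for r in rows for x in r)
    ss_blocks = n_treat * sum((sum(r) / n_treat - grand) ** 2 for r in rows)
    col_means = [sum(r[j] for r in rows) / n_blocks for j in range(n_treat)]
    ss_codecs = n_blocks * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_blocks - ss_codecs

    df_codecs = n_treat - 1
    df_error = (n_blocks - 1) * (n_treat - 1)
    f_ratio = (ss_codecs / df_codecs) / (ss_error / df_error)
    return {
        "ss_total": ss_total,
        "ss_blocks": ss_blocks,
        "ss_codecs": ss_codecs,
        "ss_error": ss_error,
        "f_ratio": f_ratio,
    }


if __name__ == "__main__":
    for key, value in blocked_anova(RATINGS).items():
        print(f"{key:10s} {value:8.3f}")
```

The values it prints agree with the ANOVA table above up to rounding.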



EDIT: log files are here

This post has been edited by guruboolez: Dec 29 2005, 22:01
mithrandir
post Jan 11 2005, 15:00
Post #2





Group: Members
Posts: 669
Joined: 15-January 02
From: SE Pennsylvania
Member No.: 1032



This is what X mode 10 does...in quantize.c

CODE
case 10: {
 if (best->over_count > 0 ) {
   /* there are distorted sfb*/
   better = calc->over_SSD < best->over_SSD;
 } else {
   /* no distorted sfb*/
   better = calc->max_noise <= best->max_noise;
 }
 break;
}
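
To make the decision rule in that fragment easier to play with, here is a small stand-alone Python restatement. It is only a sketch: the `NoiseResult` class and the function name `is_better_x10` are hypothetical, the three field names are taken from the snippet, and the comments on what each field measures are my reading of the names rather than anything confirmed by the LAME source quoted here.

```python
from dataclasses import dataclass


@dataclass
class NoiseResult:
    """Hypothetical stand-in for the noise statistics compared in the snippet."""
    over_count: int   # assumed: number of distorted scalefactor bands (sfb)
    over_SSD: float   # assumed: a distortion total over the distorted bands
    max_noise: float  # assumed: worst-case noise value over all bands


def is_better_x10(best: NoiseResult, calc: NoiseResult) -> bool:
    """Mirror of `case 10`: is the new candidate `calc` better than `best`?"""
    if best.over_count > 0:
        # The best result so far still has distorted bands:
        # prefer the candidate with the strictly smaller over_SSD.
        return calc.over_SSD < best.over_SSD
    # No distorted band in the best result so far:
    # accept the candidate if its max_noise is not larger (ties are accepted).
    return calc.max_noise <= best.max_noise


if __name__ == "__main__":
    distorted_best = NoiseResult(over_count=2, over_SSD=5.0, max_noise=1.0)
    clean_best = NoiseResult(over_count=0, over_SSD=0.0, max_noise=-1.0)
    candidate = NoiseResult(over_count=1, over_SSD=3.0, max_noise=-1.0)
    print(is_better_x10(distorted_best, candidate))
    print(is_better_x10(clean_best, candidate))
```

Note that the branch depends only on the state of the best result so far, not on the candidate's own `over_count`, exactly as in the quoted C code.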


This post has been edited by mithrandir: Jan 12 2005, 06:36
[proxima]
post Jan 11 2005, 20:56
Post #3





Group: Members
Posts: 197
Joined: 12-October 02
From: Italy
Member No.: 3537



QUOTE (guruboolez @ Jan 11 2005, 08:03 AM)
• The newest alpha apparently had serious trouble with the LisztBMinor.wav sample: the background noise was severely damaged, which removed precious musical information. The difference can be checked in a spectral view, where it is really striking. IIRC [proxima] also reported problems with this sample and recent lame builds.

Yes, this problem of altered background noise (along with lost musical information) is noticeable on even more samples with recent lame builds. Sometimes the removed background noise is replaced with HF ringing. Both artifacts really annoy me; this is the only reason I still prefer the old 3.90.3 in these cases. At this time, with 3.97a5, the following samples are affected: obscured, Atom_heart_mother, FloorEssence, LisztBMinor, spahm, rebel, Queen+-+your+my+best, gbtinc, applaud, amnesia, Bayle - Etching. Often the artifacts are even visible as dropouts in a spectral analysis. Of course, I can upload the remaining samples on request.

This post has been edited by [proxima]: Jan 11 2005, 21:08


--------------------
WavPack 4.3 -mfx5
LAME 3.97 -V5 --vbr-new --athaa-sensitivity 1
mithrandir
post Jan 12 2005, 03:24
Post #4








I looked at the output of rebel on a spectrogram. There's a definite dropout in the 10-11 kHz region. Maybe the encoder is not devoting bits to this sfb because it (falsely) thinks the tones are masked by the much stronger tones in the lower midrange (guitar). Just a guess.
LoFiYo
post Jan 12 2005, 03:28
Post #5





Group: Members
Posts: 133
Joined: 2-January 04
Member No.: 10896



Tested Dev0's sample from here.

3.90.3 --alt-preset cbr 128
3.97a5 -b 128 -X 10,10

I liked 3.97a5's CBR128 better than 3.90.3's on this particular sample even though they were both noticeably distorted.

ABC/HR Version 0.9b, 30 August 2002
Testname: 3.97a5 vs 3.90.3

1R = C:\My Test Samples\giveup\3903cbr128.mp3.wav
2L = C:\My Test Samples\giveup\397a5cbr128.mp3.wav

---------------------------------------
General Comments:
The stereo separation (not sure if this is the correct term) on both files sounds broken for the guitar. On the original, the guitar is at mid-left (no echo from the right channel), but on the mp3s, the guitar is echoed from the right.
---------------------------------------
1R File: C:\My Test Samples\giveup\3903cbr128.mp3.wav
1R Rating: 2.5
1R Comment: This sounds rougher and dirtier. The incorrect stereo separation/added echo from the right channel is more noticeable than File 2.
---------------------------------------
2L File: C:\My Test Samples\giveup\397a5cbr128.mp3.wav
2L Rating: 3.0
2L Comment: The distortion is less annoying than File 1. Sounds slightly cleaner.
---------------------------------------
ABX Results:
Original vs C:\My Test Samples\giveup\3903cbr128.mp3.wav
10 out of 10, pval < 0.001
Original vs C:\My Test Samples\giveup\397a5cbr128.mp3.wav
10 out of 10, pval < 0.001
C:\My Test Samples\giveup\3903cbr128.mp3.wav vs C:\My Test Samples\giveup\397a5cbr128.mp3.wav
10 out of 10, pval < 0.001
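
As a side note on the "pval < 0.001" lines: assuming ABC/HR reports the usual one-sided binomial probability (the chance of getting at least that many trials right by pure guessing), the figure can be checked with a few lines of Python. The function name `abx_p_value` is made up for this sketch.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """Probability of at least `correct` right answers out of `trials`
    when every answer is a 50/50 guess (one-sided binomial test)."""
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


if __name__ == "__main__":
    # 10 out of 10 is a single outcome among 2**10 = 1024 equally likely ones.
    print(abx_p_value(10, 10))
```

With 10 correct answers out of 10 this is 1/1024, which is indeed just under 0.001.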
randallrp
post Jan 12 2005, 04:33
Post #6





Group: Members
Posts: 14
Joined: 9-January 05
Member No.: 18950



Thank you

If I may be so bold, was there a reason why you chose the --preset 128 for the 3.96.1 as opposed to -V 5 ?

Regards
Randy
dev0
post Jan 12 2005, 06:18
Post #7





Group: Developer
Posts: 1679
Joined: 23-December 01
From: Germany
Member No.: 731



Because 3.96.1 had ABR/CBR issues, which Gabriel has worked on and which need testing.


--------------------
"To understand me, you'll have to swallow a world." Or maybe your words.
Gabriel
post Jan 12 2005, 20:07
Post #8


LAME developer


Group: Developer
Posts: 2950
Joined: 1-October 01
From: Nanterre, France
Member No.: 138



QUOTE
I looked at the output of rebel on a spectrogram. There's a definite dropout in the 10-11 kHz region. Maybe the encoder is not devoting bits to this sfb because it (falsely) thinks the tones are masked by the much stronger tones in the lower midrange (guitar). Just a guess.


Unfortunately no. The psymodel thinks that this area needs some bits. However, it appears that sfb10 is empty in those parts, which contradicts the psymodel requirements, and even the computed distortion, which indicates no distortion there.

edit: 3.90.3 seems to exhibit the same dropouts on the Rebel sample. Does 3.90.3 sound better on this sample?

This post has been edited by Gabriel: Jan 12 2005, 20:11
Gabriel
post Jan 12 2005, 20:33
Post #9





It seems that lowering the ATH reduces those dropouts (at least in the Atom case).

Would you mind testing "-X 10,10 --athlower 10"?
If this solves the problem, it would be very helpful to determine which athlower value leads to satisfactory results.
[proxima]
post Jan 12 2005, 20:57
Post #10








The rebel sample is only slightly better with 3.90.3; I don't think rebel is a problem case for the new alphas. Even 3.90.3 has distortions.

I will test the setting with the lowered ATH when possible, but I strongly believe this is the way to go because I've already noticed improvements in the past. For VBR, I believe that "--athaa-sensitivity 1" significantly reduces ringing because VBR can then choose to lower the ATH even further than the default. When I proposed "--athaa-sensitivity 1", you seemed to dislike this way of tweaking because the ATH adjustment range was too high for a 128 preset. Maybe I have misunderstood something, but it seems that you're now going for the "lower ATH" solution even with CBR/ABR. Maybe a lower ATH (or a higher ATH sensitivity) is the key for all presets (CBR, ABR, VBR) that target below ~150 kbps (medium included).


Gabriel
post Jan 12 2005, 22:27
Post #11





I'd prefer to use a lower base level for the ATH without increasing the adjustment range too much.
I think that I should also enable ath-aa for CBR/ABR.
guruboolez
post Jan 13 2005, 08:12
Post #12








96 kbps ABR & VBR TEST

Samples:
The first 12 samples are the ff123 sample suite selected for the 64 kbps listening test.
I've also added eight more samples:
- the 4 samples used by ff123 for the 128 kbps collective test (Dogies.wav, Fossiles.wav, Rawhide.wav, Wayitis.wav)
- macabre.wav (full orchestra sample, uploaded by ff123)
- SinceAlways.wav (recently uploaded by Dev0)
- castanets2.wav for testing sharpness & pre-echo
- Orion II.wav (brass instrument) for testing micro-attacks with a real instrument.

Encoders and settings:
• lame 3.90.3 | John33 compile | --alt-preset 96
• lame 3.96.1 | John33 compile | --preset 96
• lame 3.97.a5 | John33 compile | --preset 96 -X 10,10
• lame 3.97.a5 | John33 compile | -V 7
• lame 3.97.a5 | John33 compile | -V 8 [see NOTE ABOUT VBR ENCODINGS for details about these choices]


Hardware and software configuration:
• Audigy2 soundcard
• Onkyo R-A5 FM/AM Tuner Amplifier
• Beyerdynamic DT-531 headphones
• ff123's ABC/HR 1.1 beta 2





NOTE ABOUT VBR ENCODINGS:

There's no VBR preset corresponding to 96 kbps for 'general' music. -V7 seems to produce the closest bitrate on most samples, but -V8 is needed with some kinds of music (I've noticed this in the past with metal/hard rock, and it was also reported here by other members). On the other hand, even -V7 can fall below 96 kbps. It happens here with four samples, with a terrible deviation for the two "classical music" (and low volume) samples: 67 & 72 kbps for BachBWV1007 and LisztBMinor. Of course, -V7 logically produces a bloated bitrate with the first category of music: the seven biggest files of this test were encoded with this VBR setting.

I therefore had two serious possibilities. The first was to discard VBR from this test. This solution is the most convenient, but IMHO not the most useful one: I'd like to see how lame VBR performs at such a bitrate, and whether it can outperform ABR/CBR, at comparable bitrate of course. The second valid possibility was to include BOTH -V7 and -V8 encodings in this test. Everybody can then choose their own comparison strategy (comparing ABR/CBR to a fixed VBR preset, or to whichever preset matches 96 kbps for each sample). I finally went for the second option, despite the difficulty (it means one otherwise unnecessary file to test for each sample...)

NOTE ABOUT SAMPLING RATE.

All settings lead to resampling (to 32000 Hz), except one: lame 3.96.1.


RESULTS

CODE
                  3.90.3   3.96.1  3.97a5  3.97a5  3.97a5
                  ABR 96   ABR 96  ABR 96  VBR V7  VBR V8              

ATrain              2.3      1.0     2.8     1.8     1.0    
BachS1007           4.7      3.8     4.3     1.3     1.0
BeautySlept         3.0      1.5     2.5     2.5     1.0
Blackwater          3.5      1.0     3.5     2.0     1.5
FloorEssence        1.7      1.3     3.0     2.0     1.0
Layla               2.5      1.0     3.0     3.8     1.5
LifeShatters        3.4      1.0     2.3     4.0     3.0
LisztBMinor         4.5      3.5     4.0     1.5     1.3
MidnightVoyage      2.5      1.0     2.0     3.5     1.5
thear1              3.0      1.0     3.0     4.0     2.6
TheSource           2.0      1.5     2.7     1.8     1.0
Waiting             2.0      1.0     2.0     4.0     3.0
__________________________________________________________
Dogies [ff123]      2.5      1.4     2.5     2.0     1.2
Fossiles  [ff123]   3.5      1.5     3.5     2.5     1.0
SinceAlways [Dev0]  2.7      1.0     2.5     4.2     3.5
Macabre  [ff123]    2.5      1.0     2.8     2.0     1.4
Rawhide  [ff123]    3.0      1.0     3.0     1.5     1.0
Wayitis   [ff123]   2.7      1.7     2.7     2.2     1.2
__________________________________________________________
Casta.2 [preecho]   1.4      1.4     2.3     2.7     1.0
OrionII [micro-att] 3.5      1.0     2.5     1.4     1.2

-----------------------------------------------------------
· · · · · · MEANS   2.85     1.43    2.84    2.54    1.55 |
-----------------------------------------------------------

click for log files


COMMENTS:

• VBR encoding at low bitrate apparently can’t be recommended. Bitrate fluctuates too much (that’s normal), but so does quality (that’s not normal). VBR should provide constant quality at a fluctuating bitrate, whereas ABR/CBR should lead to a constant bitrate and erratic quality. At low bitrate, ABR is clearly more robust, despite its limited bitrate allocation. VBR suffers too much from ringing and tons of other artefacts. -V 8 is most often awful; -V7 can’t avoid some difficulties (on LisztBMinor for example).
• ABR at 96 kbps with lame 3.96.1 is a complete tragedy. The lack of resampling could maybe explain part of this disaster.
• There’s no winner at the end of the 3.90.3 / 3.97a5 competition. Overall results are identical (2.85 vs 2.84!), but this final result hides many differences. Each encoder reacts in its own way. With micro-attacks [Orion II.wav], 3.90.3 is apparently clearly better. This could be confirmed by other tests already performed by other people on comparable samples [fatboy.wav and Bayle - Etching.wav]. With the pure pre-echo sample, the new alpha is apparently better. 3.90.3 also has fewer ringing problems (though the problem is not totally absent). Nevertheless, 3.97a5 is now close to 3.90.3 in my opinion, but future progress is of course welcome.


STATISTICS:

If anyone wants to play with the statistics tool, just copy and paste the following table:
CODE
390ABR  396ABR  397ABR  397VBR7 397VBR8
2.3     1.0     2.8     1.8     1.0    
4.7     3.8     4.3     1.3     1.0
3.0     1.5     2.5     2.5     1.0
3.5     1.0     3.5     2.0     1.5
1.7     1.3     3.0     2.0     1.0
2.5     1.0     3.0     3.8     1.5
3.4     1.0     2.3     4.0     3.0
4.5     3.5     4.0     1.5     1.3
2.5     1.0     2.0     3.5     1.5
3.0     1.0     3.0     4.0     2.6
2.0     1.5     2.7     1.8     1.0
2.0     1.0     2.0     4.0     3.0
2.5     1.4     2.5     2.0     1.2
3.5     1.5     3.5     2.5     1.0
2.7     1.0     2.5     4.2     3.5
2.5     1.0     2.8     2.0     1.4
3.0     1.0     3.0     1.5     1.0
2.7     1.7     2.7     2.2     1.2
1.4     1.4     2.3     2.7     1.0
3.5     1.0     2.5     1.4     1.2



ANOVA Analysis:

CODE
FRIEDMAN version 1.24 (Jan 17, 2002) http://ff123.net/
Blocked ANOVA analysis

Number of listeners: 20
Critical significance:  0.05
Significance of data: 1.95E-009 (highly significant)
---------------------------------------------------------------
ANOVA Table for Randomized Block Designs Using Ratings

Source of         Degrees     Sum of    Mean
variation         of Freedom  squares   Square    F      p

Total               99         102.70
Testers (blocks)    19          16.33
Codecs eval'd        4          39.16    9.79   15.76  1.95E-009
Error               76          47.20    0.62
---------------------------------------------------------------
Fisher's protected LSD for ANOVA:   0.496

Means:

390ABR   397ABR   397VBR7  397VBR8  396ABR  
 2.85     2.84     2.54     1.55     1.43  

---------------------------- p-value Matrix ---------------------------

        397ABR   397VBR7  397VBR8  396ABR  
390ABR   1.000    0.217    0.000*   0.000*  
397ABR            0.217    0.000*   0.000*  
397VBR7                    0.000*   0.000*  
397VBR8                             0.646    
-----------------------------------------------------------------------

390ABR is better than 397VBR8, 396ABR
397ABR is better than 397VBR8, 396ABR
397VBR7 is better than 397VBR8, 396ABR




TUKEY PARAMETRIC Analysis:



CODE
FRIEDMAN version 1.24 (Jan 17, 2002) http://ff123.net/
Tukey HSD analysis

Number of listeners: 20
Critical significance:  0.05
Tukey's HSD:   0.699

Means:

390ABR   397ABR   397VBR7  397VBR8  396ABR  
 2.85     2.84     2.54     1.55     1.43  

-------------------------- Difference Matrix --------------------------

        397ABR   397VBR7  397VBR8  396ABR  
390ABR     0.000    0.310    1.300*   1.415*
397ABR              0.310    1.300*   1.415*
397VBR7                      0.990*   1.105*
397VBR8                               0.115  
-----------------------------------------------------------------------

390ABR is better than 397VBR8, 396ABR
397ABR is better than 397VBR8, 396ABR
397VBR7 is better than 397VBR8, 396ABR


Bitrate table:


CODE
            3.90.3  3.96.1  3.97   3.97   3.97
            ABR96   ABR96  ABR96   -V 7   -V 8
ATrain         91     90     91     112    94
BachS1007      97     98     97     67     56
BeautySlept    93     92     93     98     84
Blackwater     93     94     93     95     78
castanets2     91     92     94     99     86
doggies        94     93     95     107    90
FloorEssence   95     101    99     120    101
Fossils        92     92     94     104    84
SinceAlways    95     95     96     124    105
Layla          93     96     97     121    103
LifeShatters   95     96     95     107    89
LisztBMinor    91     90     91     72     59
Macabre        91     90     91     108    92
MidnightVoyage 92     92     94     121    99
Orion II       90     94     100    102    83
Rawhide        92     92     93     100    82
thear1         92     92     93     110    91
TheSource      98     98     97     93     79
Waiting        90     90     92     117    98
Wayitis        91     91     93     92     75
MEAN         92.8   93.4   94.4   103.4  86.4
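
The second comparison strategy described in the NOTE ABOUT VBR ENCODINGS (use, for each sample, whichever VBR preset lands nearest 96 kbps) can be sketched in Python by joining the bitrate table above with the -V 7 and -V 8 columns of the results table. This is only an illustration: taking "matching 96 kbps" to mean "smallest absolute distance from 96" is my assumption, and the function names are invented.

```python
TARGET_KBPS = 96

# (sample, -V7 kbps, -V8 kbps, -V7 rating, -V8 rating), copied from the
# bitrate table and the RESULTS table of this post.
ROWS = [
    ("ATrain",         112,  94, 1.8, 1.0),
    ("BachS1007",       67,  56, 1.3, 1.0),
    ("BeautySlept",     98,  84, 2.5, 1.0),
    ("Blackwater",      95,  78, 2.0, 1.5),
    ("castanets2",      99,  86, 2.7, 1.0),
    ("doggies",        107,  90, 2.0, 1.2),
    ("FloorEssence",   120, 101, 2.0, 1.0),
    ("Fossils",        104,  84, 2.5, 1.0),
    ("SinceAlways",    124, 105, 4.2, 3.5),
    ("Layla",          121, 103, 3.8, 1.5),
    ("LifeShatters",   107,  89, 4.0, 3.0),
    ("LisztBMinor",     72,  59, 1.5, 1.3),
    ("Macabre",        108,  92, 2.0, 1.4),
    ("MidnightVoyage", 121,  99, 3.5, 1.5),
    ("Orion II",       102,  83, 1.4, 1.2),
    ("Rawhide",        100,  82, 1.5, 1.0),
    ("thear1",         110,  91, 4.0, 2.6),
    ("TheSource",       93,  79, 1.8, 1.0),
    ("Waiting",        117,  98, 4.0, 3.0),
    ("Wayitis",         92,  75, 2.2, 1.2),
]


def closest_preset(v7_kbps, v8_kbps, target=TARGET_KBPS):
    """Return "V7" or "V8", whichever bitrate is nearer the target.
    A tie goes to V7 (arbitrary choice for this sketch)."""
    return "V7" if abs(v7_kbps - target) <= abs(v8_kbps - target) else "V8"


def matched_vbr(rows, target=TARGET_KBPS):
    """Map each sample to (chosen preset, rating given to that preset)."""
    picks = {}
    for name, v7_kbps, v8_kbps, v7_rating, v8_rating in rows:
        preset = closest_preset(v7_kbps, v8_kbps, target)
        picks[name] = (preset, v7_rating if preset == "V7" else v8_rating)
    return picks


if __name__ == "__main__":
    picks = matched_vbr(ROWS)
    for name, (preset, rating) in picks.items():
        print(f"{name:15s} {preset}  {rating:.1f}")
    mean = sum(rating for _, rating in picks.values()) / len(picks)
    print(f"mean rating of the bitrate-matched VBR encodings: {mean:.2f}")
```

The resulting per-sample mix of -V 7 and -V 8 ratings can then be compared against the ABR columns, either by eye or by pasting it into the statistics tool as an extra column.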


This post has been edited by guruboolez: Dec 29 2005, 22:05
guruboolez
post Jan 13 2005, 08:13
Post #13








160 kbps ABR & VBR TEST [including --vbr-new vs default VBR mode]

Samples:
20 samples - same as 96 kbps listening test above.

Encoders and settings:
• lame 3.90.3 | John33 compile | --alt-preset 160
• lame 3.97.a5 | John33 compile | --preset 160 -X 10,10
• lame 3.97.a5 | John33 compile | -V 4
• lame 3.97.a5 | John33 compile | -V 4 --vbr-new [see NOTE ABOUT ENCODINGS for details about these choices]


Hardware and software configuration:
…they didn’t change.





NOTE ABOUT ENCODINGS:

• Modern encoders at 160 kbps should all reach near-transparency to my ears. Tests are therefore more difficult. In order to avoid unnecessary exhaustion, I’ve tried to limit the number of challengers. 3.96.1 ABR, which clearly appeared buggy during the previous tests, was consequently removed.
• As in the previous test, the main epistemological issue was to justify the choice of a VBR setting. -V 4 is the setting closest to 160 kbps… but often higher. The average bitrate for the 20 samples reaches for example 171 kbps, with 197 and 134 kbps for the extreme samples. This time I can’t introduce a lower setting: the full sample suite drops to 135 kbps with the -V 5 setting, and testing it would be completely absurd.
I had another idea at that moment: using the alternative VBR engine, aka --vbr-new, commonly used with the --alt-preset fast routines. I’ve noticed in the past that --vbr-new encodings are slightly smaller than those of the default VBR mode. Other people have also reported the same thing. I’ve therefore encoded the 20 samples with -V 4 --vbr-new, and the average bitrate reached 160.4 kbps. Nice, isn’t it ;-)
I didn’t choose between -V 4 and -V 4 fast, and simply put both in the arena. It will be a good occasion to compare the performance of these two VBR engines.

P.S. I took a long rest during this test, and resumed it after 8 hours of sleep.


RESULTS

CODE
                  3.90.3   3.97a5  3.97a5  3.97a5
                  ABR160   ABR160  VBR4    VBR4NEW            

ATrain              4.0      3.0     4.3     4.0
BachS1007           4.5      5.0     4.0     5.0
BeautySlept         3.5      4.0     4.0     2.5
Blackwater          4.5      4.5     4.0     3.7
FloorEssence        2.9      3.2     3.4     4.0

  SLEEPING - SLEEPING - SLEEPING - SLEEPING

Layla               3.7      4.0     4.7     4.3
LifeShatters        4.0      4.5     3.5     4.0
LisztBMinor         5.0      4.5     3.5     5.0
MidnightVoyage      3.0      4.0     4.9     4.3
thear1              4.8      4.8     5.0     5.0
TheSource           4.2      4.2     3.5     3.2
Waiting             3.0      3.5     4.0     4.5
_________________________________________________
Dogies [ff123]      3.7      3.0     2.0     4.5
Fossiles  [ff123]   4.0      3.5     1.8     3.0
SinceAlways [Dev0]  2.0      3.0     2.3     3.5
Macabre  [ff123]    3.8      3.5     4.5     5.0
Rawhide  [ff123]    4.0      4.5     4.5     4.5
Wayitis   [ff123]   4.0      3.5     2.0     4.7
_________________________________________________
Casta.2 [preecho]   2.8      3.2     2.5     1.8
OrionII [micro-att] 3.0      3.5     2.5     4.3

---------------------------------------------------
· · · · · · MEANS   3.72     3.84    3.54    4.04 |
---------------------------------------------------

click for log files


COMMENTS:

• On average, 3.90.3 ABR was slightly inferior to 3.97.a5 ABR. Both are very close, and similar, except in speed: 3.97a5 is obviously faster.
• The VBR comparison is more interesting. First, the --vbr-new engine produces better results than the default VBR mode on most samples. Statistically (see below), this only shows up when running the friedman.exe tool with the Tukey parametric analysis and the -s 0.1 option (10% significance level, instead of the default 5%). But it’s probably more interesting to look at individual results. VBR NEW was much better with LisztBMinor, Dogies, Fossiles, SinceAlways, Wayitis and Orion II. For strange reasons, the default VBR suffers a lot from ringing: background noise becomes irregular, and the distortions also affect some precious musical information (especially at low volume). I’d like to illustrate this problem with a frequency analysis, which confirms visually how serious it is:
http://audiotests.free.fr/tests/200...r_vs_vbrnew.gif
On the other hand, the --vbr-new routine produced a clearly worse result with the harpsichord sample (BeautySlept.wav). It’s not a surprise, I must say: I noticed it two years ago [it must have been with lame 3.92 or 3.93]. But apart from this issue (very specific, but unfortunately very annoying for me), I didn’t find any other situation in which --vbr-new really suffered compared to the default mode.
• -V 4 --vbr-new is not obviously better than lame 3.97 alpha 5 ABR 160 (final bitrates were the same). It’s a bit problematic: shouldn’t we expect a real difference between ABR and VBR? Is VBR clearly better than ABR/CBR? And in what situations? Are those VBR/ABR similarities something structural (e.g. we can’t expect a real quality margin between the two modes from well-tuned MP3 encoders) or something purely accidental (e.g. a lack of tuning of the current VBR mode compared to well-tuned ABR settings)? We have some elements of an answer: we saw first that ABR outperformed VBR at 96 kbps, and then that the existing differences at 160 kbps are really minor, at least with common music (the situation could differ with killer samples). On the other hand, I found -V 5 --athaa-sensitivity with lame 3.96.1 really better than lame 3.90.3 in the recent past. The elements I’ve gathered do not all agree. Therefore, I think we should try to find a real answer to this fundamental question in the future (and temporarily forget our current beliefs).



STATISTICS:

If anyone wants to play with the statistics tool, just copy and paste the following table:
CODE
90ABR   97ABR   97VB4   97VB4n
4.0     3.0     4.3     4.0
4.5     5.0     4.0     5.0
3.5     4.0     4.0     2.5
4.5     4.5     4.0     3.7
2.9     3.2     3.4     4.0
3.7     4.0     4.7     4.3
4.0     4.5     3.5     4.0
5.0     4.5     3.5     5.0
3.0     4.0     4.9     4.3
4.8     4.8     5.0     5.0
4.2     4.2     3.5     3.2
3.0     3.5     4.0     4.5
3.7     3.0     2.0     4.5
4.0     3.5     1.8     3.0
2.0     3.0     2.3     3.5
3.8     3.5     4.5     5.0
4.0     4.5     4.5     4.5
4.0     3.5     2.0     4.7
2.8     3.2     2.5     1.8
3.0     3.5     2.5     4.3



ANOVA Analysis:
My results don’t allow any differentiation.

TUKEY PARAMETRIC Analysis [-s 0.1]:

CODE
FRIEDMAN version 1.24 (Jan 17, 2002) http://ff123.net/
Tukey HSD analysis

Number of listeners: 20
Critical significance:  0.10
Tukey's HSD:   0.480

Means:

97VB4n   97ABR    90ABR    97VB4    
 4.04     3.84     3.72     3.54  

-------------------------- Difference Matrix --------------------------

        97ABR    90ABR    97VB4    
97VB4n     0.195    0.320    0.495*
97ABR               0.125    0.300  
90ABR                        0.175  
-----------------------------------------------------------------------

97VB4n is better than 97VB4
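
The difference matrix above can be reproduced from the pasted ratings with a short Python sketch: compute each column mean, then flag every pair whose difference exceeds the HSD threshold. The threshold itself (0.480, for -s 0.1) is taken from the tool output rather than recomputed, since that would need studentized-range quantiles; the function name `tukey_pairs` is invented for this example.

```python
from itertools import combinations

HSD = 0.480  # printed by friedman.exe for -s 0.1; not recomputed here

# Columns of the 160 kbps statistics table, in sample order.
RATINGS = {
    "90ABR":  [4.0, 4.5, 3.5, 4.5, 2.9, 3.7, 4.0, 5.0, 3.0, 4.8,
               4.2, 3.0, 3.7, 4.0, 2.0, 3.8, 4.0, 4.0, 2.8, 3.0],
    "97ABR":  [3.0, 5.0, 4.0, 4.5, 3.2, 4.0, 4.5, 4.5, 4.0, 4.8,
               4.2, 3.5, 3.0, 3.5, 3.0, 3.5, 4.5, 3.5, 3.2, 3.5],
    "97VB4":  [4.3, 4.0, 4.0, 4.0, 3.4, 4.7, 3.5, 3.5, 4.9, 5.0,
               3.5, 4.0, 2.0, 1.8, 2.3, 4.5, 4.5, 2.0, 2.5, 2.5],
    "97VB4n": [4.0, 5.0, 2.5, 3.7, 4.0, 4.3, 4.0, 5.0, 4.3, 5.0,
               3.2, 4.5, 4.5, 3.0, 3.5, 5.0, 4.5, 4.7, 1.8, 4.3],
}


def tukey_pairs(ratings, hsd):
    """Return (better, worse, difference) for every pair of codecs whose
    mean ratings differ by more than the given HSD threshold."""
    means = {name: sum(values) / len(values) for name, values in ratings.items()}
    ranked = sorted(means, key=means.get, reverse=True)  # best mean first
    significant = []
    for better, worse in combinations(ranked, 2):
        diff = means[better] - means[worse]
        if diff > hsd:
            significant.append((better, worse, round(diff, 3)))
    return significant


if __name__ == "__main__":
    for better, worse, diff in tukey_pairs(RATINGS, HSD):
        print(f"{better} is better than {worse} (difference {diff})")
```

Only the 97VB4n / 97VB4 pair clears the threshold, which matches the single starred entry in the matrix.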



Bitrate table:

CODE
              3.90.3  3.97     3.97    3.97
              ABR160  ABR160   -V4   -V4-new
ATrain          155     154     166     171
BachS1007       163     160     134     124
BeautySlept     157     156     142     148
Blackwater      158     157     168     155
castanets2      150     152     143     154
dogies          158     158     178     173
FloorEssence    164     171     198     188
fossiles        156     156     188     171
SinceAlways     161     161     190     173
Layla           159     163     193     177
LifeShatters    161     159     177     147
LisztBMinor     155     153     145     155
macabre         156     154     187     166
MidnightVoyage  156     157     189     171
Orion II (2.1)  155     160     176     159
rawhide         157     156     159     155
thear1          157     156     182     157
TheSource       165     162     136     139
Waiting         155     154     191     167
wayitis         155     155     190     174
MEAN        157.65   157.7   171.6   161.2


This post has been edited by guruboolez: Dec 29 2005, 22:07
guruboolez
post Jan 13 2005, 08:13
Post #14








DVD RIP test: AC3 transcoding at ~96 and ~128 kbps

Samples:
I wanted to see how a comparison would turn out with DVD Video transcoding. DVD Video is very different from CD: on one hand we have variable-quality AC3 (which introduces a lot of quantization noise, and sometimes a strong lowpass), a 48000 Hz sampling rate, and always highly dynamic soundtracks including spoken and ambient parts; on the other hand, CD is 44100 Hz, original PCM quality (with infinitesimal quantization noise) and most often dynamically limited (thanks to the loudness race).
For this test, I had to build all the samples myself. There are only six samples, so the conclusions can’t be anything more than leads for further investigation. I always used AC3 as the source (no DTS or PCM). I selected native stereo AC3 encodings when possible; for one sample, I had to downmix to stereo myself. Decoding, downmixing and transcoding were performed directly with foobar2000. The samples are:
• Jean-Pierre Jeunet - Alien 4 Resurrection: Jean-Pierre Jeunet presents the DVD edition in English with a pronounced French accent. Native stereo AC3 encoding at 192 kbps.
• Rowan Atkinson - Blackadder IV (“Captain Cook”): English speech with audience laughter. Native stereo AC3 encoding at 384 kbps.
• King Hu - Come Drink With Me (L’Hirondelle d’Or): quiet music with water. Mono (two channels) AC3 encoding at 192 kbps.
• Gérard Corbiau - Farinelli: a morning with horses and birds… then a woman's voice at dinner, in French. Native stereo AC3 encoding at 224 kbps.
• Quentin Tarantino - Pulp Fiction: Ezekiel and gunshots… 448 kbps multichannel AC3 encoding, downmixed to stereo.
• Akira Kurosawa - Ran: a hunt, and dramatic music (flute & percussion). Native stereo AC3 encoding at 448 kbps (!).



PART I: 96 kbps encodings



Encoders and settings:
• lame 3.90.3 | John33 compile | --alt-preset 96
• lame 3.96.1 | John33 compile | --alt-preset 96
• lame 3.97.a5 | John33 compile | --alt-preset 96 -X 10,10
• lame 3.97.a5 | John33 compile | -V 7 --vbr-new
• lame 3.97.a5 | John33 compile | -V 8 --vbr-new

Hardware and software configuration:
…same as before

NOTE ABOUT ENCODINGS:

I explained before (see the 96 kbps encoding test) the reason for keeping two VBR settings in the test. This time, I used the --vbr-new engine, which apparently performs better than the default mode, especially on low-volume signals (and soundtracks are mainly made of low-volume parts).


RESULTS

CODE
                  3.90.3   3.96.1  3.97a5  3.97a5  3.97a5
                  ABR 96   ABR 96  ABR 96  VBnew7  VBnew8              

Alien4              4.7      3.0     4.2     1.5     1.0
Blackadder          3.0      2.3     3.5     2.0     1.5
Come Drink With Me  5.0      4.0     4.3     4.3     2.5
Farinelli           2.5      1.3     2.3     2.0     1.3
Pulp Fiction        3.0      1.7     2.7     1.7     1.0
Ran                 4.0      3.0     4.0     2.0     1.0
       MEANS      3.70     2.55    3.50    2.25    1.38

click for log files


COMMENTS:

• lame 3.90.3 is slightly better on average than 3.97.a5 (treat this statement with caution: the friedman.exe analysis can’t confirm it). The latest alpha had slight problems with ringing (which also appeared in the previous test with the same setting on CD material). The difference is not dramatic, but I’d use 3.90.3 to maximise quality at this setting (or better: extend the test with more samples).
• lame 3.96.1 is bad, but MUCH BETTER here than in the previous 96 kbps test.
• VBR encodings are once again unreliable at this low bitrate. Using the alternative VBR engine does not solve all audible problems: ringing first, and many other artefacts. -V 8 is pathetic (despite its high bitrate!); -V 7 is better, but still inferior to ABR despite a higher bitrate.


STATISTICS:

If anyone wants to play with a statistics tool, just copy and paste the following table:
CODE
3.90.3  3.96.1  3.97ABR 3.97Vn7 3.97Vn8
4.7     3.0     4.2     1.5     1.0
3.0     2.3     3.5     2.0     1.5
5.0     4.0     4.3     4.3     2.5
2.5     1.3     2.3     2.0     1.3
3.0     1.7     2.7     1.7     1.0
4.0     3.0     4.0     2.0     1.0



ANOVA Analysis:
CODE
3.90.3 is better than 3.96.1, 3.97Vn7, 3.97Vn8
3.97ABR is better than 3.96.1, 3.97Vn7, 3.97Vn8
3.96.1 is better than 3.97Vn8
3.97Vn7 is better than 3.97Vn8


TUKEY PARAMETRIC Analysis [-s 0.1]:

CODE
3.90.3 is better than 3.96.1, 3.97Vn7, 3.97Vn8
3.97ABR is better than 3.96.1, 3.97Vn7, 3.97Vn8
3.96.1 is better than 3.97Vn8
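
For readers without friedman.exe, the core of a Friedman-type rank analysis can be reproduced in a few lines. This is a rough sketch and not ff123's implementation: it ranks the five encoders within each sample (tied scores share the average rank), sums the ranks per encoder and computes the basic Friedman statistic without tie correction. The function names are illustrative.

```python
LABELS = ["3.90.3", "3.96.1", "3.97ABR", "3.97Vn7", "3.97Vn8"]
SCORES = [  # one row per sample, same order as the table above
    [4.7, 3.0, 4.2, 1.5, 1.0],
    [3.0, 2.3, 3.5, 2.0, 1.5],
    [5.0, 4.0, 4.3, 4.3, 2.5],
    [2.5, 1.3, 2.3, 2.0, 1.3],
    [3.0, 1.7, 2.7, 1.7, 1.0],
    [4.0, 3.0, 4.0, 2.0, 1.0],
]


def average_ranks(row):
    """Rank one sample's scores (1 = lowest); tied scores share the mean rank."""
    order = sorted(range(len(row)), key=lambda i: row[i])
    ranks = [0.0] * len(row)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and row[order[end + 1]] == row[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of the 1-based positions pos..end
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def friedman(scores):
    """Return (rank sum per column, Friedman chi-square without tie correction)."""
    n, k = len(scores), len(scores[0])
    sums = [0.0] * k
    for row in scores:
        for j, rank in enumerate(average_ranks(row)):
            sums[j] += rank
    chi2 = 12.0 / (n * k * (k + 1)) * sum(s * s for s in sums) - 3.0 * n * (k + 1)
    return sums, chi2


if __name__ == "__main__":
    sums, chi2 = friedman(SCORES)
    for label, s in zip(LABELS, sums):
        print(f"{label:8s} rank sum = {s:4.1f}")
    print(f"Friedman chi-square (4 d.o.f.) = {chi2:.2f}")
```

On this table the rank sums come out as 28.5, 15, 25, 15 and 6.5, and the statistic is about 20.6, well above the 5% chi-square critical value of 9.49 for 4 degrees of freedom. With only six samples the chi-square approximation is crude, so this is an indication rather than a verdict.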



Bitrate table:

CODE
             3.90.3    3.96.1   3.97     3.97       3.97
             ABR 96    ABR 96   ABR 96   -Vn 7      -Vn 8
Alien4          95       97       92       133       128
Blackadder      95       97       97       100       92
Come Drink…     98       102      98       110       106
Farinelli       99       101      99       105       104
Pulp Fiction    96       100      98       107       109
Ran             91       92       93       82        68
MEANS        95.7     98.2     96.2     106.2     101.2





PART II: 128 kbps encodings



Encoders and settings:
• lame 3.90.3 | John33 compile | --alt-preset 128
• lame 3.96.1 | John33 compile | --alt-preset 128
• lame 3.97.a5 | John33 compile | --alt-preset 128 -X 10,10
• lame 3.97.a5 | John33 compile | -V 5
• lame 3.97.a5 | John33 compile | -V 5 --vbr-new

Hardware and software configuration:
…still the same

NOTE ABOUT ENCODINGS:

This time, I compared -V 5 and -V 5 --vbr-new: the bitrates are totally different, and it’s a good opportunity to see whether the --vbr-new engine is also better than the default VBR mode at ~128 kbps.
Important note: this time, --vbr-new doesn’t lead to a lower bitrate but to a much higher one (138 vs 102 kbps on average). The differences can be striking. Best example: with the 100% spoken sample (Alien 4), the -V 5 encoding is 90 kbps and the -V 5 --vbr-new encoding is 171 kbps. [The very end of the sample was encoded at 320 kbps, which is probably excessive for near-silence…].



RESULTS

CODE
                  3.90.3  3.96.1  3.97a5  3.97a5  3.97a5
                  ABR128  ABR128  ABR128  VBR 5  VBRnew5              

Alien4              4.7     4.3     4.7     3.5     4.0
Blackadder          3.0     2.5     3.3     3.5     4.0
Come Drink With Me  5.0     4.5     4.5     3.5     4.0
Farinelli           3.7     3.0     3.5     4.0     4.3
Pulp Fiction        2.7     2.5     3.5     1.5     3.0
Ran                 4.0     2.0     3.0     2.5     3.5
       MEANS      3.95    3.13    3.75    3.08    3.80

click for log files



COMMENTS:

• lame 3.90.3 is slightly better on average than 3.97.a5 (again, the friedman.exe analysis can’t confirm it). That’s an important change, because with CD material at the same setting, lame 3.90.3 sounded slightly worse. But 6 samples are probably not enough to be sure. 3.97 alpha 5 still has ringing issues (slight, but present); I repeat that 3.90.3 is not entirely free of ringing either.
• lame 3.96.1 is not as terrible with AC3@48000 as with PCM@44100, but it still can’t be recommended.
• VBR -V 5 is again inferior to -V 5 --vbr-new, on all samples! More extensive tests are needed to confirm it.
• VBR -V 5 --vbr-new and ABR 128 are tied. The difference is really marginal (but the bitrate is about 10 kbps higher with VBR). Again, we should question the theoretical superiority of VBR over ABR, and its usefulness, especially bearing in mind the bloated bitrate that occurs with (apparently) innocent samples. It could be problematic with some movies.
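
The "on all samples" observation can be given a rough number with a sign test: if the two VBR modes were really equivalent, each sample would favour either one with probability 1/2. The sketch below uses the ratings from the results table; the sign test is an illustrative choice here, not something friedman.exe reports.

```python
from math import comb

SAMPLES = ["Alien4", "Blackadder", "Come Drink With Me",
           "Farinelli", "Pulp Fiction", "Ran"]
V5 = [3.5, 3.5, 3.5, 4.0, 1.5, 2.5]       # -V 5
V5_NEW = [4.0, 4.0, 4.0, 4.3, 3.0, 3.5]   # -V 5 --vbr-new


def sign_test(a, b):
    """Two-sided sign test on paired ratings; tied pairs are dropped.

    Returns (pairs where b is rated higher, pairs where a is rated higher, p-value).
    """
    b_wins = sum(1 for x, y in zip(a, b) if y > x)
    a_wins = sum(1 for x, y in zip(a, b) if x > y)
    n = a_wins + b_wins
    if n == 0:
        return b_wins, a_wins, 1.0
    extreme = max(a_wins, b_wins)
    tail = sum(comb(n, k) for k in range(extreme, n + 1)) / 2 ** n
    return b_wins, a_wins, min(1.0, 2 * tail)


if __name__ == "__main__":
    new_wins, old_wins, p = sign_test(V5, V5_NEW)
    print(f"--vbr-new better on {new_wins} samples, worse on {old_wins}, p = {p:.3f}")
```

Six wins out of six give a two-sided p of about 0.031, which is suggestive but rests on one listener and six samples, so the call for more extensive tests still stands.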


STATISTICS:

If anyone wants to play with a statistics tool, just copy and paste the following table:
CODE
3.90.3   3.96.1   3.97ABR  3.97VB5  3.97Vn5
4.7      4.3      4.7      3.5      4.0
3.0      2.5      3.3      3.5      4.0
5.0      4.5      4.5      3.5      4.0
3.7      3.0      3.5      4.0      4.3
2.7      2.5      3.5      1.5      3.0
4.0      2.0      3.0      2.5      3.5



ANOVA Analysis:
CODE
3.90.3 is better than 3.96.1, 3.97VB5
3.97Vn5 is better than 3.96.1, 3.97VB5
3.97ABR is better than 3.97VB5


TUKEY PARAMETRIC Analysis [-s 0.1]:
no reliable conclusion

Bitrate table:

CODE
               3.90.3    3.96.1    3.97      3.97a5    3.97a5
               ABR 128   ABR 128   ABR 128   -V 5    -V 5 --vbr-new
Alien4           124       123       121       90        171
Blackadder       126       127       126       126       133
Come Drink…      128       133       130       73        147
Farinelli        132       134       132       103       129
Pulp Fiction     129       132       130       103       133
Ran              122       123       122       121       119
MEANS         126.8     128.7     126.8     102.7     138.7
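
The bitrate gap between the two VBR engines is easier to judge per sample than from the averages alone. A small sketch using the figures from the table above (variable and function names are illustrative):

```python
SAMPLES = ["Alien4", "Blackadder", "Come Drink…", "Farinelli", "Pulp Fiction", "Ran"]
V5 = [90, 126, 73, 103, 103, 121]        # -V 5 (default engine), kbps
V5_NEW = [171, 133, 147, 129, 133, 119]  # -V 5 --vbr-new, kbps


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def compare(old, new):
    """Return (difference in kbps, ratio new/old) for each sample."""
    return [(n - o, n / o) for o, n in zip(old, new)]


if __name__ == "__main__":
    for name, (diff, ratio) in zip(SAMPLES, compare(V5, V5_NEW)):
        print(f"{name:13s} {diff:+4d} kbps  x{ratio:.2f}")
    print(f"means: {mean(V5):.1f} vs {mean(V5_NEW):.1f} kbps")
```

Alien4 and Come Drink… account for most of the gap (+81 and +74 kbps), while Ran is actually 2 kbps lower with --vbr-new.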




PART III: 160 kbps encodings

I’m K.O. Use the original AC3 instead ;)


EDIT: all 6 samples are available HERE (limited availability).

This post has been edited by guruboolez: Dec 29 2005, 22:08
Gabriel
post Jan 13 2005, 09:36
Post #15


LAME developer


Group: Developer
Posts: 2950
Joined: 1-October 01
From: Nanterre, France
Member No.: 138



Thank you very much for those extensive results.
guruboolez
post Jan 13 2005, 10:54
Post #16





Group: Members (Donating)
Posts: 3474
Joined: 7-November 01
From: Strasbourg (France)
Member No.: 420



I forgot this one, from Dev0, at 128 kbps:

QUOTE
ABC/HR Version 1.1 beta 2, 18 June 2004
Testname:

1R = D:\lame test\Dev0\giveuptheghost-sincealways.sample18sec [3.96.1 - ABR - 128].wav
2R = D:\lame test\Dev0\giveuptheghost-sincealways.sample18sec [3.96.1 - CBR - 128].wav
3L = D:\lame test\Dev0\giveuptheghost-sincealways.sample18sec [3.90.3 - CBR - 128].wav
4L = D:\lame test\Dev0\giveuptheghost-sincealways.sample18sec [3.90.3 - ABR - 128].wav
5R = D:\lame test\Dev0\giveuptheghost-sincealways.sample18sec [3.97a5 - ABR - 128 - XX10].wav
6L = D:\lame test\Dev0\giveuptheghost-sincealways.sample18sec [3.97a5 - CBR - 128 - XX10].wav

---------------------------------------
General Comments:
notation is linked to the performance of solo guitar (introduction, from 1.0 to ~5.0)
---------------------------------------
1R File: D:\lame test\Dev0\giveuptheghost-sincealways.sample18sec [3.96.1 - ABR - 128].wav
1R Rating: 1.0
1R Comment:
---------------------------------------
2R File: D:\lame test\Dev0\giveuptheghost-sincealways.sample18sec [3.96.1 - CBR - 128].wav
2R Rating: 1.0
2R Comment:
---------------------------------------
3L File: D:\lame test\Dev0\giveuptheghost-sincealways.sample18sec [3.90.3 - CBR - 128].wav
3L Rating: 2.0
3L Comment:
---------------------------------------
4L File: D:\lame test\Dev0\giveuptheghost-sincealways.sample18sec [3.90.3 - ABR - 128].wav
4L Rating: 2.0
4L Comment:
---------------------------------------
5R File: D:\lame test\Dev0\giveuptheghost-sincealways.sample18sec [3.97a5 - ABR - 128 - XX10].wav
5R Rating: 3.5
5R Comment:
---------------------------------------
6L File: D:\lame test\Dev0\giveuptheghost-sincealways.sample18sec [3.97a5 - CBR - 128 - XX10].wav
6L Rating: 3.5
6L Comment:
---------------------------------------
ABX Results:
D:\lame test\Dev0\giveuptheghost-sincealways.sample18sec [3.96.1 - ABR - 128].wav vs D:\lame test\Dev0\giveuptheghost-sincealways.sample18sec [3.90.3 - CBR - 128].wav
    8 out of 8, pval = 0.004
D:\lame test\Dev0\giveuptheghost-sincealways.sample18sec [3.90.3 - CBR - 128].wav vs D:\lame test\Dev0\giveuptheghost-sincealways.sample18sec [3.97a5 - ABR - 128 - XX10].wav
    8 out of 8, pval = 0.004


No real difference between ABR/CBR for the same encoder.
3.96.1 < 3.90.3 < 3.97a5
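
The pval figures in the ABX log are consistent with a one-sided binomial test: the probability of getting at least that many correct answers by pure guessing. A minimal sketch (the function name is illustrative, not taken from ABC/HR):

```python
from math import comb


def abx_pvalue(correct, trials):
    """Probability of at least `correct` right answers in `trials` when guessing (p = 0.5)."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials


if __name__ == "__main__":
    print(f"{abx_pvalue(8, 8):.3f}")  # 8 out of 8, as in the log above
```

For 8 out of 8 this is 1/256, which rounds to the 0.004 shown in the log.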

This post has been edited by guruboolez: Jan 13 2005, 10:54
amano
post Jan 13 2005, 17:58
Post #17





Group: Members
Posts: 483
Joined: 1-December 02
Member No.: 3949



Wow. guruboolez, I have to thank you for your professional efforts.
Gabriel
post Jan 13 2005, 19:07
Post #18


LAME developer


Group: Developer
Posts: 2950
Joined: 1-October 01
From: Nanterre, France
Member No.: 138



Right now I do not need additional feedback in the 96-165 kbps range.
All those results are very informative to me, and I will adjust parameters according to them.
Gabriel
post Jan 13 2005, 20:18
Post #19


LAME developer


Group: Developer
Posts: 2950
Joined: 1-October 01
From: Nanterre, France
Member No.: 138



A new alpha will be available soon:
*vbr is unchanged
*cbr/abr are now using the X 10 mode (remapped to X9), ath levels changed.

I think that this version should reduce dropouts introduced by 3.96 in cbr/abr.
Lev
post Jan 14 2005, 14:31
Post #20





Group: Members
Posts: 524
Joined: 7-November 02
From: Gloucester, UK
Member No.: 3716



I think I speak on behalf of everyone (pompous thing that I am) when I say I am *really* happy reading this thread. Huge thanks to Guru for testing, and a huge thanks to Gabriel for continuing development. Thanks :)


--------------------
http://www.megalev.co.uk
guruboolez
post Jan 15 2005, 11:10
Post #21





Group: Members (Donating)
Posts: 3474
Joined: 7-November 01
From: Strasbourg (France)
Member No.: 420



To finish with this alpha:
Yesterday I tested the --athlower setting with different values (from -5 to -15). With LisztBminor.wav, I noticed the biggest improvement with --athlower 10 (compared to --athlower 9 and lower values). --athlower 15 led to a slight additional improvement (one artifact was removed).
I also tested some other samples. --athlower 10 is apparently a good way to reduce ringing (though not to remove it entirely). I should point out, however, that I had to increase the listening volume to hear it (mp3GAIN could also reveal some problems that are inaudible under 'normal' conditions; I have often experienced that with my portable player).

If anyone is interested in testing, I have three other samples that might be interesting:
ftp://ftp2.foobar2000.net/foobar/ATH_LAME.ZIP
mithrandir
post Jan 15 2005, 16:57
Post #22





Group: Members
Posts: 669
Joined: 15-January 02
From: SE Pennsylvania
Member No.: 1032



--vbr-new has another problem with controlling bitrate bloat. Guruboolez identified that vbr-new does often use 320kbps frames at the ends of an encoded track (during near silence). However, I have noticed that it also does this on tracks with those notorious "hidden songs", when they'll stick an extra song at the end of the final track with several minutes of silence between the two (in the same WAV).

I encoded a track by Duncan Sheik called "Nichiren" and during the silence vbr-new was using 192 and 224kbps frames. Of course this "silence" is not digital silence but "analog silence". At -85dB, it's probably tape hiss but quiet enough to use 32kbps frames, I'd say.
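
To check how quiet such a passage really is, one can measure its RMS level relative to full scale. The sketch below assumes 16-bit PCM samples already loaded as a list of integers; the synthetic signals are stand-ins for illustration only, not the track mentioned above.

```python
import math

FULL_SCALE = 32768  # peak magnitude of 16-bit PCM


def rms_dbfs(samples):
    """RMS level of 16-bit samples in dB relative to full scale.

    Digital silence (all zeros) returns -inf.
    """
    if not samples:
        raise ValueError("no samples")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")
    return 10 * math.log10(mean_square / FULL_SCALE ** 2)


if __name__ == "__main__":
    hiss = [2, -2] * 24000          # stand-in for hiss: a square wave only 2 LSB high
    print(f"{rms_dbfs(hiss):.1f} dBFS")
    print(rms_dbfs([0] * 48000))    # true digital silence
```

A signal only 2 LSB high already sits at about -84.3 dBFS, so "analog silence" around -85 dB is still a non-zero signal that the encoder has to decide how to spend bits on, unlike true digital silence.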