A little VBR ~88 kbps ABX-test, One sample, Vorbis 1.5, LAME V8, WMA 25
Alex B
post Nov 28 2005, 19:43
Post #1





Group: Members
Posts: 1303
Joined: 14-September 05
From: Helsinki, Finland
Member No.: 24472



Some background information

A couple of weeks ago we had a minor debate about the quality of modern lossy encoders on a Finnish AV forum. My plan was to give the forum users an opportunity to test their beliefs and hearing by providing a few test samples and instructions on how to use the foo_abx tool.

I prepared five samples of different genres from my collection. I thought the selected samples would be a bit above average in complexity - nothing like killer samples, but not too easy for the encoders. I encoded the samples using three different quality levels and two encoders: Vorbis aoTuV b4.5 (-q 1.5, -q 4.25 and -q 6.25) and LAME 3.97b1 (-V8, -V5 and -V2; all --vbr-new). The idea was to explain something like this:
1. the lowest quality is useful e.g. with portables, but not actually "hifi" quality
2. the middle quality is very good and finding differences is not easy
3. the highest quality is transparent or almost transparent.

My plan didn't work out. After ABX testing the lowest quality samples I realized that my samples were going to be way too easy for the encoders at the higher quality levels. For example, one of the local audio gurus who accepts only lossless files tried to ABX one of the samples. He could ABX MP3 -V8 and didn't like it, but the Vorbis -q 1.5 sample made him almost angry because he couldn't ABX it. He used his high-end speakers instead of headphones, but this tells something about the Vorbis quality anyway. There was no sense in continuing the test with the higher quality samples.

I had no plans to publish any results here, but because of the ongoing debate about the 128 kbps test here is a nice example:


The test sample

hot_tequilla_brown.flac (genre: ~electronic/pop/funk (?), 21 s, 2.63 MB)

- This sample produces 91-96 kbps (my overall test target was ~88 kbps for this quality level).

The tested lossy files are available in this package: lossy_samples.zip (767 kB)


My ABX results


LAME 3.97 beta1 -V8 --vbr-new ~94 kbps
CODE
foo_abx v1.2 report
foobar2000 v0.8.3
2005/11/13 00:01:32

File A: file://E:\test\Monkey's Audio 3.99 High\hot_tequilla_brown.ape
File B: file://E:\test\LAME 3.97 beta 1 -V8 --vbr-new\hot_tequilla_brown.mp3

00:01:35 : Test started.
00:02:04 : 01/01 50.0%
00:02:09 : 02/02 25.0%
00:02:17 : 03/03 12.5%
00:02:29 : 04/04 6.3%
00:02:35 : 05/05 3.1%
00:02:41 : 06/06 1.6%
00:02:51 : 07/07 0.8%
00:03:04 : 08/08 0.4%
00:03:08 : Test finished.

----------
Total: 8/8 (0.4%)

LAME was easy to ABX because of the obvious lowpass. I think -V8 is too low a setting for anything that contains high frequencies. (Though in general my high frequency hearing is not excellent. I can't ABX a lowpass over 16 kHz.)
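
For reference, the percentage printed next to each score in these foo_abx logs is consistent with the one-sided binomial probability of getting at least that many trials right by pure guessing. The sketch below is only an illustration of that calculation, not the plugin's actual code; the function name `abx_p_value` is made up for this example.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """Chance of scoring at least `correct` out of `trials` by coin-flip guessing."""
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    # Count every outcome with `correct` or more hits, out of 2**trials outcomes.
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


# Final scores taken from the three foo_abx reports in this post.
for correct, trials in [(8, 8), (10, 11), (17, 20)]:
    print(f"{correct}/{trials}: {abx_p_value(correct, trials):.1%}")
```

This prints 0.4%, 0.6% and 0.1%, the same totals as in the reports.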

LAME at -V5 was much better with this sample. It sounded fine in casual listening, but I didn't ABX it.


Vorbis aoTuV beta 4.5 -q 1.5 ~96 kbps
CODE
foo_abx v1.2 report
foobar2000 v0.8.3
2005/11/13 00:27:18

File A: file://E:\test\Monkey's Audio 3.99 High\hot_tequilla_brown.ape
File B: file://E:\test\Vorbis aoTuV beta 4.5 -q1,5\hot_tequilla_brown.ogg

00:27:21 : Test started.
00:32:10 : 01/01 50.0%
00:32:50 : 02/02 25.0%
00:33:14 : 02/03 50.0%
00:33:27 : 03/04 31.3%
00:34:26 : 04/05 18.8%
00:35:00 : 05/06 10.9%
00:35:12 : 06/07 6.3%
00:35:25 : 07/08 3.5%
00:37:55 : 08/09 2.0%
00:38:09 : 09/10 1.1%
00:38:29 : 10/11 0.6%
00:38:48 : Test finished.

----------
Total: 10/11 (0.6%)

Very difficult to ABX. It took me 5 minutes to find a passage where I could possibly hear a difference. Vorbis was almost transparent with this sample. In casual listening I couldn't hear any problems.


Windows Media Audio

Today I tested WMA standard with the same sample. I had never properly tried WMA at VBR quality 25.

WMA 9.1 Standard VBR25 ~91 kbps
CODE
foo_abx v1.2 report
foobar2000 v0.8.3
2005/11/28 17:38:04

File A: file://E:\test\Monkey's Audio 3.99 High\hot_tequilla_brown.ape
File B: file://E:\test\WMA9.1 STD VBR25\hot_tequilla_brown.wma

17:38:06 : Test started.
17:39:10 : 01/01 50.0%
17:39:42 : 02/02 25.0%
17:40:24 : 03/03 12.5%
17:40:42 : 04/04 6.3%
17:41:14 : 05/05 3.1%
17:42:03 : 06/06 1.6%
17:42:44 : 07/07 0.8%
17:43:14 : 07/08 3.5%
17:44:11 : 08/09 2.0%
17:47:26 : 08/10 5.5%
17:47:44 : 08/11 11.3%
17:49:04 : 09/12 7.3%
17:49:46 : 10/13 4.6%
17:50:37 : 11/14 2.9%
17:53:24 : 12/15 1.8%
17:54:03 : 13/16 1.1%
17:54:34 : 14/17 0.6%
17:54:54 : 15/18 0.4%
17:58:55 : 16/19 0.2%
17:59:53 : 17/20 0.1%
18:00:01 : Test finished.

----------
Total: 17/20 (0.1%)

This was not easy, but I could hear the difference at a certain passage, mostly because of the slight lowpass and perhaps a slightly narrower stereo width. But after the first seven tries my ears got tired and I had difficulty ABXing. I wanted to be sure and continued through 20 trials. It took over 20 minutes. In general I couldn't hear any obvious problems. (I didn't expect VBR25 to be this good. Am I becoming deaf?) Since I tested Vorbis two weeks ago I cannot directly compare these two codecs.


The test gear used: Terratec DMX 6fire 24/96 soundcard, Harman/Kardon AVI 200 MKII amp, KOSS HV/1A headphones.

Edit: typo
Edit 2: changed the lossless sample to FLAC format
Edit 3: added the lossy samples


EDIT 4:
I removed the samples to make room. PM me if you would like to try them.


This post has been edited by Alex B: Feb 26 2006, 20:14


--------------------
http://listening-tests.freetzi.com
user
post Nov 30 2005, 12:31
Post #2





Group: Members
Posts: 873
Joined: 12-October 01
From: the great wide open
Member No.: 277



nice hint,
I second it, and so repeat one of my suggestions for the multiformat test:

Lower the average target bitrate for music of various genres from 128k down to 10x or even 9x kbit/s. Testers will still have hard work, especially considering the number of samples and formats.

This idea is quite logical: encoders have made progress over the years, so the magic 128k area for "CD quality" is lower these days, especially for Joe Average.

Thinking further: when testing at these bitrates suitable for portables, we should collect comparable test results not only for our HA formats, that is Ogg aoTuV, AAC (in those 2 variants), MP3 LAME and MPC (tested in the past, though it should also be tested against the modern encoders), but also for WMA and maybe WMA Pro. WMA Pro has lower priority because it doesn't have much hardware support yet and is Windows only anyway, or is there Mac or Linux support?
Polls can support decisions, but I don't think polling makes sense for selecting contenders for listening tests. As we are under no time pressure, we should set up tests so that there is logic in them. And we should think more in long-term terms by now, as we can already see, a little, the end of lossy format development (in terms of squeezing even more quality out of a given stereo bitrate). So tests should include something that lets results for modern encoders be compared with older tests (of previous encoders of the same format); otherwise it is even more difficult to say anything about the development of a format.

This post has been edited by user: Nov 30 2005, 12:47


--------------------
www.High-Quality.ch.vu -- High Quality Audio Archiving Tutorials
ErikS
post Dec 1 2005, 10:02
Post #3





Group: Members
Posts: 757
Joined: 8-October 01
Member No.: 247



OK, finally! It's pretty hard to find any ABX comparator for mac... But once I got a perl-script running, I'm very confident that I can hear a difference also with the Vorbis file (8/8). It's pretty good though, and I'd say "perceptible but not annoying"...

CODE

a Playing file A...
b Playing file B...
x ********Choosing/Playing an X...
B Vote for B logged... currently 1/1
x ********Choosing/Playing an X...
A Vote for A logged... currently 2/2
x ********Choosing/Playing an X...
b Playing file B...
a Playing file A...
b Playing file B...
x Playing the X again...
A Vote for A logged... currently 3/3
x ********Choosing/Playing an X...
A Vote for A logged... currently 4/4
x ********Choosing/Playing an X...
B Vote for B logged... currently 5/5
x ********Choosing/Playing an X...
B Vote for B logged... currently 6/6
x ********Choosing/Playing an X...
B Vote for B logged... currently 7/7
x ********Choosing/Playing an X...
B Vote for B logged... currently 8/8
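
For anyone else without an ABX comparator on their platform, the core of such a script is small. The following is a minimal Python sketch of the trial loop only, written as an assumption about how a comparator like this could work; it is not the Perl script used above. Playback is left out, the two "samples" are stand-in objects, and the names `run_abx` and `perfect_listener` are hypothetical.

```python
import random


def run_abx(trials, identify, sample_a, sample_b, rng=None):
    """Run `trials` ABX rounds and return (correct, log_lines).

    `identify` stands in for the listener: it receives the hidden sample X
    and must answer "A" or "B".
    """
    rng = rng or random.Random()
    samples = {"A": sample_a, "B": sample_b}
    correct = 0
    log = []
    for done in range(1, trials + 1):
        hidden = rng.choice("AB")  # X is A or B with equal probability
        vote = identify(samples[hidden])
        if vote not in samples:
            raise ValueError(f"Unknown vote {vote!r}")
        if vote == hidden:
            correct += 1
        log.append(f"Vote for {vote} logged... currently {correct}/{done}")
    return correct, log


# Stand-in "audio": two distinguishable objects instead of real files.
original, encoded = object(), object()


def perfect_listener(x):
    # A listener who always recognises the original.
    return "A" if x is original else "B"


score, lines = run_abx(8, perfect_listener, original, encoded, random.Random(1))
print("\n".join(lines))
print(f"All done! ABX results: {score}/8")
```

A real tool would replace `identify` with keyboard input and audio playback of A, B and X on request.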
ErikS
post Dec 1 2005, 10:27
Post #4





Group: Members
Posts: 757
Joined: 8-October 01
Member No.: 247



I came here from the 128 kbit pre-test discussion, so it was of some interest to test it at -q4 also. This level is very close to the original, and I'd rate it very close to "imperceptible". ABX: 14/16

CODE

x ********Choosing/Playing an X...
A Vote for A logged... currently 1/1
x ********Choosing/Playing an X...
x Playing the X again...
A Vote for A logged... currently 2/2
x ********Choosing/Playing an X...
B Vote for B logged... currently 3/3
x ********Choosing/Playing an X...
B Vote for B logged... currently 4/4
x ********Choosing/Playing an X...
A Vote for A logged... currently 5/5
x ********Choosing/Playing an X...
A Vote for A logged... currently 6/6
x ********Choosing/Playing an X...
x Playing the X again...
b Playing file B...
XUnkown key 'X'.
B Vote for B logged... currently 7/7
x ********Choosing/Playing an X...
A Vote for A logged... currently 8/8
x ********Choosing/Playing an X...
B Vote for B logged... currently 9/9
x ********Choosing/Playing an X...
B Vote for B logged... currently 10/10
x ********Choosing/Playing an X...
A Vote for A logged... currently 11/11
x ********Choosing/Playing an X...
A Vote for A logged... currently 11/12
x ********Choosing/Playing an X...
b Playing file B...
x Playing the X again...
A Vote for A logged... currently 12/13
x ********Choosing/Playing an X...
A Vote for A logged... currently 13/14
x ********Choosing/Playing an X...
x Playing the X again...
B Vote for B logged... currently 14/15
x ********Choosing/Playing an X...
A Vote for A logged... currently 14/16
All done! ABX results: 14/16
Defsac
post Dec 1 2005, 12:41
Post #5





Group: Members
Posts: 347
Joined: 17-May 05
Member No.: 22107



QUOTE (Alex B @ Nov 29 2005, 04:43 AM)
But after the first seven tries my ears got tired and I had difficulty ABXing. I wanted to be sure and continued through 20 trials.
Just a note, ABX trials are only statistically valid if you either say "I will do x trials" before the ABX test and stick to it or hide the results until the end of the test.
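
A rough way to see why: if a listener who is purely guessing may watch the running score and stop as soon as it looks significant, the chance of a "significant" result can only go up compared with a test of fixed length. The Python sketch below computes both chances exactly. The 5% threshold and the function names are assumptions chosen for illustration, not anything taken from foo_abx.

```python
from math import comb


def p_value(correct, trials):
    """Chance of at least `correct` hits in `trials` coin-flip guesses."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials


def false_positive_fixed(trials, alpha=0.05):
    """Chance that a guesser finishes exactly `trials` rounds with p < alpha."""
    hits = sum(comb(trials, c) for c in range(trials + 1)
               if p_value(c, trials) < alpha)
    return hits / 2 ** trials


def false_positive_peeking(max_trials, alpha=0.05):
    """Chance that a guesser sees p < alpha at ANY point within `max_trials`."""
    alive = {0: 1.0}  # score -> probability of being here without having stopped
    stopped = 0.0
    for n in range(1, max_trials + 1):
        nxt = {}
        for correct, prob in alive.items():
            for gained in (0, 1):  # wrong or right guess, each with chance 1/2
                c = correct + gained
                nxt[c] = nxt.get(c, 0.0) + prob * 0.5
        alive = {}
        for c, prob in nxt.items():
            if p_value(c, n) < alpha:
                stopped += prob  # the listener stops here and reports success
            else:
                alive[c] = prob
    return stopped


print(f"fixed 20 trials:   {false_positive_fixed(20):.3f}")
print(f"stop when pleased: {false_positive_peeking(20):.3f}")
```

Deciding the number of trials beforehand, or hiding the score until the end, keeps the result in the first case.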
Gecko
post Dec 1 2005, 13:51
Post #6





Group: Members
Posts: 938
Joined: 15-December 01
From: Germany
Member No.: 662



I'm still amazed that there is such a huge difference between the average joe and people with some artifact training. My normal reaction would be: 1.5? That's horrible quality! No need to ABX something like that.

So I went ahead and tested a random sample of my own with Lancer 20051121, which includes the current aotuv tunings at -q 1.5. It sounded much better than expected (with speakers at least).

ABX with headphones was easy due to the overall "mushyness" and lack of highs: 8/8.
The sample in question was an excerpt from "Jake Walton - Seven Gurdies".

ABXing the hot_tequilla_brown sample was a little more difficult: 8/8. I focused on the lowpass mostly. But amazing quality nevertheless. I don't think I would have been able to do it with speakers. I also expected pre-echo to be much worse.

I guess this is a testament to aoyumi's fine work and a reminder why we need to back up our claims, which we otherwise would take for granted, with blind tests.
Alex B
post Dec 1 2005, 14:23
Post #7





Group: Members
Posts: 1303
Joined: 14-September 05
From: Helsinki, Finland
Member No.: 24472



QUOTE (Defsac @ Dec 1 2005, 01:41 PM)
Just a note, ABX trials are only statistically valid if you either say "I will do x trials" before the ABX test and stick to it or hide the results until the end of the test.
*


You are right. It would have been better to stop the test at 10 and start a new test after a rest break, to see whether the possibly changed physical condition and practice make a difference.

In my defense I'd like to add that I wasn't in the best shape for the test. I wanted to include WMA in my report, but I was tired after a long workday and I found it difficult to concentrate. Too bad for science that hearing is not an absolute thing that stays the same from one moment to the next. In this case it changed during the test. At first I was sure about the difference and the results were similar. Then I lost my concentration and started guessing. After a break and a cup of coffee (illegal doping) I could hear the difference again.

Perhaps the test was meaningful just for myself. I think I proved (to myself) that I can hear the difference and the difference is small enough to be difficult to hear if I am not in the best possible shape.

Edit: typo

This post has been edited by Alex B: Dec 3 2005, 19:05


--------------------
http://listening-tests.freetzi.com
shadowking
post Dec 1 2005, 14:26
Post #8





Group: Members
Posts: 1523
Joined: 31-January 04
Member No.: 11664



Vorbis does sound really good at lower bitrates, but the HF boost / coarse sound is still there - Q4 is fixed, but at Q2 it's definitely there most of the time.


--------------------
Wavpack -b450s0.7
Alex B
post Dec 1 2005, 14:34
Post #9





Group: Members
Posts: 1303
Joined: 14-September 05
From: Helsinki, Finland
Member No.: 24472



QUOTE (Gecko @ Dec 1 2005, 02:51 PM)
I'm still amazed that there is such a huge difference between the average joe and people with some artifact training. My normal reaction would be: 1.5? That's horrible quality! No need to ABX something like that.

So I went ahead and tested a random sample of my own with Lancer 20051121, which includes the current aotuv tunings at -q 1.5. It sounded much better than expected (with speakers at least).

ABX with headphones was easy due to the overall "mushyness" and lack of highs: 8/8.
The sample in question was an excerpt from "Jake Walton - Seven Gurdies".

ABXing the hot_tequilla_brown sample was a little more difficult: 8/8. I focused on the lowpass mostly. But amazing quality nevertheless. I don't think I would have been able to do it with speakers. I also expected pre-echo to be much worse.

I guess this is a testament to aoyumi's fine work and a reminder why we need to back up our claims, which we otherwise would take for granted, with blind tests.
*


Thanks.

Would you mind trying the WMA sample?

I guess I should not have mentioned the lowpass. When you start listening specifically for it you kind of lose the complete picture. A slight lowpass is often masked by other elements and difficult to notice.


--------------------
http://listening-tests.freetzi.com
Alex B
post Dec 1 2005, 14:44
Post #10





Group: Members
Posts: 1303
Joined: 14-September 05
From: Helsinki, Finland
Member No.: 24472



QUOTE (shadowking @ Dec 1 2005, 03:26 PM)
Vorbis does sound really good at lower bitrates, but the HF boost / coarse sound is still there - Q4 is fixed, but at Q2 it's definitely there most of the time.
*

Now it's my turn to be picky. Have you ABXed Vorbis aoTuV b4.5 or 4.51 at -q 2?


--------------------
http://listening-tests.freetzi.com
ErikS
post Dec 1 2005, 14:49
Post #11





Group: Members
Posts: 757
Joined: 8-October 01
Member No.: 247



QUOTE (Gecko @ Dec 1 2005, 02:51 PM)
I'm still amazed that there is such a huge difference between the average joe and people with some artifact training. My normal reaction would be: 1.5? That's horrible quality! No need to ABX something like that.

[...]

I guess this is a testament to aoyumi's fine work and a reminder why we need to back up our claims, which we otherwise would take for granted, with blind tests.
*


Exactly my thoughts too, before and after. :)
Gecko
post Dec 1 2005, 15:03
Post #12





Group: Members
Posts: 938
Joined: 15-December 01
From: Germany
Member No.: 662



QUOTE (Alex B @ Dec 1 2005, 03:34 PM)
Thanks.

Would you mind trying the WMA sample?

I guess I should not have mentioned the lowpass. When you start listening specifically for it you kind of lose the complete picture. A slight lowpass is often masked by other elements and difficult to notice.
*

I didn't read all your comments about the individual codecs. But you are right, if the original were already lowpassed, things would get a lot more difficult.

WMA sounds just like I would expect such a low bitrate encoding to sound (in contrast to the good performance of Vorbis). Strong lowpass, and what remained of the highs was metallic and squishy. Fastest ABX ever: 11 seconds to get 8/8. :)

Mp3 was even worse: lower lowpass than wma with some pre-echo. 8/8

On the abc/hr scale I would rank ogg around 4 (perceptible, but not annoying). Wma and mp3 would get something < 2 (Annoying), with mp3 being a little lower.
Alex B
post Dec 1 2005, 15:11
Post #13





Group: Members
Posts: 1303
Joined: 14-September 05
From: Helsinki, Finland
Member No.: 24472



Gecko,

What equipment did you use?

I am afraid that my HW chain may produce distortion in headphone listening.


--------------------
http://listening-tests.freetzi.com
shadowking
post Dec 1 2005, 15:24
Post #14





Group: Members
Posts: 1523
Joined: 31-January 04
Member No.: 11664



QUOTE (Alex B @ Dec 1 2005, 05:44 AM)
QUOTE (shadowking @ Dec 1 2005, 03:26 PM)
Vorbis does sound really good at lower bitrates, but the HF boost / coarse sound is still there - Q4 is fixed, but at Q2 it's definitely there most of the time.
*

Now it's my turn to be picky. Have you ABXed Vorbis aoTuV b4.5 or 4.51 at -q 2?
*



I did with Aotuv 4. Haven't tried 4.5


--------------------
Wavpack -b450s0.7
Gecko
post Dec 1 2005, 15:25
Post #15





Group: Members
Posts: 938
Joined: 15-December 01
From: Germany
Member No.: 662



QUOTE (Alex B @ Dec 1 2005, 04:11 PM)
Gecko,

What equipment did you use?

I am afraid that my HW chain may produce distortion in headphone listening.
*

Foobar -> Terratec Aureon Sky (with Prodigy drivers) -> Beyerdynamic DT 880
Alex B
post Dec 1 2005, 15:36
Post #16





Group: Members
Posts: 1303
Joined: 14-September 05
From: Helsinki, Finland
Member No.: 24472



I guess I am in better shape now.

This time I tried to find out if Vorbis is different from WMA. Before starting the test I decided to do 20 tries.

Vorbis -q 1.5 vs WMA VBR25
CODE
foo_abx v1.2 report
foobar2000 v0.8.3
2005/12/01 16:22:24

File A: file://E:\test\hot_tequilla_brown.ogg
File B: file://E:\test\hot_tequilla_brown.wma

16:22:25 : Test started.
16:23:02 : 01/01  50.0%
16:23:18 : 02/02  25.0%
16:23:27 : 03/03  12.5%
16:23:35 : 04/04  6.3%
16:23:42 : 05/05  3.1%
16:23:50 : 06/06  1.6%
16:24:00 : 07/07  0.8%
16:24:14 : 08/08  0.4%
16:24:20 : 09/09  0.2%
16:24:26 : 10/10  0.1%
16:24:30 : 11/11  0.0%
16:24:35 : 12/12  0.0%
16:24:41 : 13/13  0.0%
16:24:48 : 14/14  0.0%
16:24:58 : 15/15  0.0%
16:25:03 : 16/16  0.0%
16:25:11 : 17/17  0.0%
16:25:16 : 18/18  0.0%
16:25:21 : 19/19  0.0%
16:25:28 : 20/20  0.0%
16:25:29 : Test finished.

----------
Total: 20/20 (0.0%)


No problem ABXing this. Vorbis was clearly better. I can confirm Gecko's findings.

Also, I didn't hear any "headphone distortion" that would have masked artifacts. I think my gear is fine after all. The problem I had with ABXing WMA was caused by other factors.


Edit: codebox

This post has been edited by Alex B: Dec 3 2005, 18:54


--------------------
http://listening-tests.freetzi.com
Alex B
post Dec 1 2005, 15:55
Post #17





Group: Members
Posts: 1303
Joined: 14-September 05
From: Helsinki, Finland
Member No.: 24472



I uploaded the classical sample I used in my test here :
http://www.hydrogenaudio.org/forums/index....ndpost&p=346739

Any comments about it?


--------------------
http://listening-tests.freetzi.com
Shade[ST]
post Dec 1 2005, 16:02
Post #18





Group: Members
Posts: 1189
Joined: 19-May 05
From: Montreal, Canada
Member No.: 22144



Incredible!

I never thought I'd be unable to hear the difference at such a low bitrate; I'm thinking about reencoding my whole collection to ogg now -- maybe at -q 4 or so. Are there any problem samples I could check with, to see if I hear a difference at that type of bitrate?

It was easy to tell for mp3 and wma, though. 19/20 and 12/12

This post has been edited by Shade[ST]: Dec 1 2005, 16:04
IgorC
post Jan 3 2006, 19:42
Post #19





Group: Members
Posts: 1560
Joined: 3-January 05
From: ARG/RUS
Member No.: 18803



It's an amazing sample. Even iTunes v6.0.1.3 at 128 kbit/s VBR 136 kbit/s (approx. 155 kbit/s real bitrate for this sample) wasn't transparent. Well, maybe it's transparent. But I could hear artefacts and noise on human voices in both channels during the ABX test.

CODE
ABC/HR Version 1.1 beta 2, 18 June 2004
Testname: itunes vbr 136   vs original .      Tequila

1R = C:\TEST48\15_tequila\itunes 128vbr 136.wav


1 of   1, p = 0.500
 2 of   2, p = 0.250
 3 of   3, p = 0.125
 4 of   4, p = 0.063
 4 of   5, p = 0.188
 5 of   6, p = 0.109
 6 of   7, p = 0.063
 7 of   8, p = 0.035
 8 of   9, p = 0.020
 9 of  10, p = 0.011
10 of  11, p = 0.006
11 of  12, p = 0.003
11 of  13, p = 0.011
12 of  14, p = 0.006
13 of  15, p = 0.004
14 of  16, p = 0.002
15 of  17, p = 0.001
16 of  18, p < 0.001
FINISHED

---------------------------------------
General Comments:

---------------------------------------
ABX Results:
Original vs C:\TEST48\15_tequila\itunes 128vbr 136.wav
   16 out of 18, pval < 0.001


This post has been edited by IgorC: Jan 3 2006, 19:49
