Listening test using 2013-03-09 build
RobertM
post Mar 9 2013, 10:49
Post #1





Group: Members
Posts: 12
Joined: 17-February 13
Member No.: 106691



I completed a listening test against Opus files encoded with the latest build (as of 2013-03-09). This time I've been more thorough: ABX test results from foobar2000 are attached along with the Opus-encoded files. I also took azaqiel's advice and updated the version reported by the encoder, to prevent any confusion.

"Sample 01" from the page below was used for the test. May repeat the test later with other difficult samples.
http://people.xiph.org/~greg/opus/ha2011/


Summary:

Results were very much as expected. Opus quality has definitely improved over time and gets closer to transparency with higher bitrate.

1. 64kb/s from the above page (old Opus version) vs 64kb/s from the newest Opus version

There was a noticeable improvement in quality with the new Opus version.

2. 64kb/s vs original

It was fairly easy to tell the difference, but still quite good quality

3. 96 kb/s vs original

Could still tell the difference, but the artifacts were noticeably less severe than in the 64kb/s file.

4. 128 kb/s vs original

Still can hear a very subtle artifact introduced by the codec (which appears on the note between 2.155 seconds and 2.423 seconds) but had to strain to hear it.

5. 256 kb/s vs original

Very close to transparent. I managed to tell the difference sometimes by listening very hard for the artifact. However, my ability to tell the two apart was far from perfect.

6. 500 kb/s vs original

This was transparent to me.
Attached File(s)
Attached File  sample01_RM.txt ( 3.73K ) Number of downloads: 169
Attached File  Test1.zip ( 1.17MB ) Number of downloads: 178
 
zerowalker
post Mar 10 2013, 22:44
Post #2





Group: Members
Posts: 268
Joined: 6-August 11
Member No.: 92828



Isn't that pretty bad, to not be able to reach transparency at 256kbps?
Or is this some kind of super killer sound we are talking about?

Because I think Vorbis and AAC can pretty much reach transparency at 196-256 kbps most of the time, though I am not some kind of master at this.

This post has been edited by db1989: Mar 10 2013, 22:51
Reason for edit: deleting pointless full quote
db1989
post Mar 10 2013, 22:55
Post #3





Group: Super Moderator
Posts: 5275
Joined: 23-June 06
Member No.: 32180



Yes, it was a sample that is known to be difficult to encode, not just everyday music, as you would know if you had followed the link and read the description.

Another thing you would know if that were true is that Opus was the highest rated codec in the test overall.

Neither of these things requires being “some kind of master”, just the simplest kind of research before posting.
saratoga
post Mar 10 2013, 22:56
Post #4





Group: Members
Posts: 5147
Joined: 2-September 02
Member No.: 3264



QUOTE (zerowalker @ Mar 10 2013, 16:44) *
Isn't that pretty bad, to not be able to reach transparency at 256kbps?
Or is this some kind of super killer sound we are talking about?


It's one of the lowest-scoring samples in that test, so evidently it's a very difficult sample for current Opus encoders.
zerowalker
post Mar 10 2013, 23:00
Post #5





Group: Members
Posts: 268
Joined: 6-August 11
Member No.: 92828



QUOTE (db1989 @ Mar 10 2013, 22:55) *
Yes, it was a sample that is known to be difficult to encode, not just everyday music, as you would know if you had followed the link and read the description.

Another thing you would know if that were true is that Opus was the highest rated codec in the test overall.

Neither of these things requires being “some kind of master”, just the simplest kind of research before posting.


Ah, well, that explains it :)

What I meant by "master" was more that I myself can't distinguish artifacts easily. I can feel that 128kbps MP3 is much "weaker" than 196+, but I can't really point to the moment where I hear an artifact compared with another codec, unless it's very obvious.

But yeah, my bad for not following the link. The bitrates and results got the better of me, and I was a bit disappointed at first. Sorry for that.
IgorC
post Mar 10 2013, 23:53
Post #6





Group: Members
Posts: 1580
Joined: 3-January 05
From: ARG/RUS
Member No.: 18803



RobertM,

Let me comment on two things. First, one sample isn't representative enough to conclude whether there was an improvement. The quality/quantity trade-off starts to work out from about 10 samples. Second, this particular sample, like all the others, was quickly adopted by the developers for tuning Opus almost 2 years ago, so it's not surprising that the latest Opus 1.1a did better on it.

Anyway it's a nice start.

P.S. It's more useful to perform tests on two samples with 7/7 each than on one sample with 14/14. The probability of getting 7 out of 7 trials correct by guessing is already less than 1%. Personally, I test 20 samples or so with 5/5 trials (about 3.1%) when I'm not sure about perceived differences.
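For anyone who wants to check these figures, here is a minimal Python sketch (not part of the original post) of the one-sided binomial probability that an ABX score at least this good comes from pure guessing. It assumes each trial is an independent 50/50 guess; the function name `abx_p_value` is made up for illustration. Under that assumption, 7/7 works out to 1/128 and 5/5 to 1/32.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """Probability of getting at least `correct` of `trials` right by guessing.

    Assumes every trial is an independent coin flip (p = 0.5), which is the
    usual null hypothesis for an ABX test.
    """
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    # Sum the binomial tail: outcomes with `correct` or more right answers.
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


if __name__ == "__main__":
    # A few example scores, including an imperfect one.
    for correct, trials in [(5, 5), (7, 7), (14, 14), (12, 16)]:
        print(f"{correct}/{trials}: p = {abx_p_value(correct, trials):.4%}")
```

The same function also covers imperfect scores such as 12/16, which is the more common situation when a difference is subtle.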

This post has been edited by IgorC: Mar 11 2013, 00:05
jmvalin
post Mar 11 2013, 00:05
Post #7


Xiph.org Speex developer


Group: Developer
Posts: 485
Joined: 21-August 02
Member No.: 3134



QUOTE (saratoga @ Mar 10 2013, 16:56) *
It's one of the lowest-scoring samples in that test, so evidently it's a very difficult sample for current Opus encoders.


Well, it was the lowest Opus score for 1.0. The new version has significant improvements on that sample.
eahm
post Mar 11 2013, 00:26
Post #8





Group: Members
Posts: 1167
Joined: 11-February 12
Member No.: 97076



Is there a compiled Windows build of 2013-03-09?
wswartzendruber
post Mar 11 2013, 02:51
Post #9





Group: Members
Posts: 106
Joined: 11-December 06
Member No.: 38563



Is there a place that houses updated builds of the alpha branch for Win32? I'm interested in testing these on a certain demented project of mine.

This post has been edited by wswartzendruber: Mar 11 2013, 02:51
RobertM
post Mar 11 2013, 08:30
Post #10





Group: Members
Posts: 12
Joined: 17-February 13
Member No.: 106691



QUOTE (IgorC @ Mar 11 2013, 09:53) *
RobertM,

Let me comment on two things. First, one sample isn't representative enough to conclude whether there was an improvement. The quality/quantity trade-off starts to work out from about 10 samples. Second, this particular sample, like all the others, was quickly adopted by the developers for tuning Opus almost 2 years ago, so it's not surprising that the latest Opus 1.1a did better on it.

Anyway it's a nice start.

P.S. It's more useful to perform tests on two samples with 7/7 each than on one sample with 14/14. The probability of getting 7 out of 7 trials correct by guessing is already less than 1%. Personally, I test 20 samples or so with 5/5 trials (about 3.1%) when I'm not sure about perceived differences.


I agree, and hope to test more samples as I get the time, but it does show that Sample 01 (which was one of the hardest samples for Opus to encode back then) has been improved by the latest work on the encoder. It also shows that it is virtually transparent (to my ears) at 256 kb/s. If you need to listen as carefully as I did and still can't tell the difference all the time, then for practical purposes it's as good as the uncompressed version.

I've also shared the compiled Windows binaries with one other member, but I'm not sure if it's OK to post them in a public thread. I can't see anything in the TOS against it, but can an admin confirm whether a link to the binaries is fine to post here?
RobertM
post Mar 11 2013, 10:05
Post #11





Group: Members
Posts: 12
Joined: 17-February 13
Member No.: 106691



In an effort to be "fair" to the Opus encoder, I've chosen a sample which Opus was quite good at but the other codecs had trouble with - "Sample 16".
http://people.xiph.org/~greg/opus/ha2011/

Samples from the new encoder and ABX results attached.

Summary:

These results surprised me - I wasn't able to detect any improvement due to the new encoder, but originally I thought the sample was transparent at 64kb/s. After listening many times, I was able to detect a slight difference on the first guitar chord at some bitrates.

1. 48kb/s vs original

A small amount of distortion on the guitar notes at this bitrate, but still good quality

2. 64kb/s from the above page (old opus version) vs original

It took me a long time to be able to differentiate these two but when I spotted the tiny difference in the first guitar chord, I was able to repeatedly identify it.

3. 64kb/s vs original

As above, was able to hear a slight difference

4. 64kb/s from the above page (old opus version) vs 64kb/s from the newest Opus version

Was unable to differentiate these two, indicating no major difference between the new encoder and old encoder for this sample.

5. 96kb/s vs original

This was transparent to me. The ABX results swing slightly towards a small difference, but I think it was due to chance.

This post has been edited by RobertM: Mar 11 2013, 10:06
Attached File(s)
Attached File  sample16_RM.txt ( 3.13K ) Number of downloads: 114
Attached File  Test2.zip ( 207.98K ) Number of downloads: 107
 
kabal4e
post Mar 12 2013, 02:54
Post #12





Group: Members
Posts: 8
Joined: 10-March 13
From: Waikato, NZ
Member No.: 107144



Thanks to RobertM I have an opus-tools build from 2013.03.09.
I mistakenly believed it had the variable frame size from the opus_exp branch built in. Unfortunately, it didn't, but after some ABX-ing I realised I couldn't distinguish the latest general and experimental builds anyway.
However, a while ago, maybe not in the Opus section of HA, a sweep sample was tested, and Opus performed very badly. I was hoping to see some improvement, but there wasn't any. Please listen to the attached samples and judge for yourself.

This post has been edited by kabal4e: Mar 12 2013, 03:35
Attached File(s)
Attached File  sweep_16bit.flac ( 365.71K ) Number of downloads: 115
Attached File  sweep_16bit.opus ( 99.05K ) Number of downloads: 108
 
jmvalin
post Mar 12 2013, 03:17
Post #13


Xiph.org Speex developer


Group: Developer
Posts: 485
Joined: 21-August 02
Member No.: 3134



QUOTE (kabal4e @ Mar 11 2013, 20:54) *
Thanks to RobertM I have an opus-tools build from 2013.03.09.
I mistakenly believed it had the variable frame size from the opus_exp branch built in. Unfortunately, it didn't, but after some ABX-ing I realised I couldn't distinguish the latest general and experimental builds anyway.
However, a while ago, maybe not in the Opus section of HA, a sweep sample was tested, and Opus performed very badly. I was hoping to see some improvement, but there wasn't any. Please listen to the attached samples and judge for yourself.


Wow! As much as I think sine sweep tests are stupid for codecs, there's no excuse for the behaviour you're seeing on this file with 1.1-alpha and later. That sine sweep is actually hitting a corner case in the bandwidth detection code of the encoder (see commit 7509fdb8). Thankfully, it shouldn't be too hard to fix. It's quite spectacular, but not that big a deal overall because fortunately it's highly unlikely to occur on real music.
kabal4e
post Mar 12 2013, 03:34
Post #14





Group: Members
Posts: 8
Joined: 10-March 13
From: Waikato, NZ
Member No.: 107144



QUOTE (jmvalin @ Mar 12 2013, 15:17) *
As much as I think sine sweep tests are stupid for codecs

However, Vorbis, Apple AAC and Nero AAC performed well with this. Vorbis ended up with the lowest bitrate of all, given the same target bitrate.
But when a sweep is hidden in real music, such as glitch hop or dubstep, Opus performs really well. So I've got no complaints for real music samples. I could attach a few samples if people are interested.

This post has been edited by kabal4e: Mar 12 2013, 03:37
jmvalin
post Mar 12 2013, 03:45
Post #15


Xiph.org Speex developer


Group: Developer
Posts: 485
Joined: 21-August 02
Member No.: 3134



QUOTE (kabal4e @ Mar 11 2013, 21:34) *
However, Vorbis, Apple AAC and Nero AAC performed well with this. Vorbis ended up with the lowest bitrate of all, given the same target bitrate.


Sure, one of the things the Opus format does to gain efficiency is to assume that it's encoding signals with a wide spectrum. This assumption saves bits on the vast majority of files and wastes bits on synthetic tests like this. So I've no problem with being less efficient in terms of bitrate. Of course, the problem here is that it doesn't even encode properly, and that's something that needs fixing.

QUOTE (kabal4e @ Mar 11 2013, 21:34) *
But when a sweep is hidden in real music, such as glitch hop or dubstep, Opus performs really well. So I've got no complaints for real music samples. I could attach a few samples if people are interested.


Sure, I understand exactly what's happening and it's really a corner case. It requires no spectral content above the sine, and I think even a downward sine sweep would actually have worked fine.
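For readers who want to build a comparable synthetic test file, here is a rough Python sketch that writes an upward linear sine sweep to a 16-bit mono WAV. The parameters of the attached sweep_16bit.flac are not stated in the thread, so the start and end frequencies, duration, sample rate, and the name `write_sine_sweep` are all assumptions chosen for illustration.

```python
import math
import struct
import wave


def write_sine_sweep(path, f_start=20.0, f_end=20000.0, seconds=10.0,
                     rate=48000, amplitude=0.5):
    """Write a mono 16-bit WAV holding a linear sweep from f_start to f_end.

    All defaults are illustrative guesses, not the settings of the sample
    discussed in the thread. Returns the number of frames written.
    """
    n = int(seconds * rate)
    k = (f_end - f_start) / seconds  # sweep rate in Hz per second
    frames = bytearray()
    for i in range(n):
        t = i / rate
        # Phase is the integral of the instantaneous frequency f_start + k*t.
        phase = 2.0 * math.pi * (f_start * t + 0.5 * k * t * t)
        frames += struct.pack("<h", round(amplitude * 32767 * math.sin(phase)))
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)  # 16-bit samples
        w.setframerate(rate)
        w.writeframes(bytes(frames))
    return n


if __name__ == "__main__":
    write_sine_sweep("sweep_up.wav")
    # Swapping the two frequencies gives a downward sweep for comparison.
    write_sine_sweep("sweep_down.wav", f_start=20000.0, f_end=20.0)
```

A pure sweep like this has no spectral content above the tone at any instant, which is the condition described above; generating both directions makes it easy to compare the upward and downward cases.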
jmvalin
post Mar 12 2013, 18:45
Post #16


Xiph.org Speex developer


Group: Developer
Posts: 485
Joined: 21-August 02
Member No.: 3134



QUOTE (jmvalin @ Mar 11 2013, 21:17) *
Wow! As much as I think sine sweep tests are stupid for codecs, there's no excuse for the behaviour you're seeing on this file with 1.1-alpha and later. That sine sweep is actually hitting a corner case in the bandwidth detection code of the encoder (see commit 7509fdb8). Thankfully, it shouldn't be too hard to fix. It's quite spectacular, but not that big a deal overall because fortunately it's highly unlikely to occur on real music.


The problem is now fixed in git. Here's the fix for those who are curious. With the change, the sweep doesn't have dropouts anymore. It still uses a higher bit-rate than necessary, but I'm not really concerned with that.
RobertM
post Mar 12 2013, 20:14
Post #17





Group: Members
Posts: 12
Joined: 17-February 13
Member No.: 106691



QUOTE (jmvalin @ Mar 13 2013, 03:45) *
QUOTE (jmvalin @ Mar 11 2013, 21:17) *
Wow! As much as I think sine sweep tests are stupid for codecs, there's no excuse for the behaviour you're seeing on this file with 1.1-alpha and later. That sine sweep is actually hitting a corner case in the bandwidth detection code of the encoder (see commit 7509fdb8). Thankfully, it shouldn't be too hard to fix. It's quite spectacular, but not that big a deal overall because fortunately it's highly unlikely to occur on real music.


The problem is now fixed in git. Here's the fix for those who are curious. With the change, the sweep doesn't have dropouts anymore. It still uses a higher bit-rate than necessary, but I'm not really concerned with that.


That's excellent - can confirm that the sine sweep is good now. Thanks, jmvalin :)

I'll do a repeat of the listening tests soon to see if anything has changed in the music samples.
jmvalin
post Mar 12 2013, 20:44
Post #18


Xiph.org Speex developer


Group: Developer
Posts: 485
Joined: 21-August 02
Member No.: 3134



QUOTE (RobertM @ Mar 12 2013, 15:14) *
I'll do a repeat of the listening tests soon to see if anything has changed in the music samples.


Feel free to do that, but I highly doubt this impacted any music samples. In general, what's useful would be to check if there's any regression between 1.0.x and the current master.
kabal4e
post Mar 12 2013, 23:39
Post #19





Group: Members
Posts: 8
Joined: 10-March 13
From: Waikato, NZ
Member No.: 107144



QUOTE (jmvalin @ Mar 13 2013, 08:44) *
I highly doubt this impacted any music samples.

Did the testing and couldn't find any impact.
Foobar's bit compare tool shows only 25-50% of samples to be different, which is an amazing result. Usually, I get 99.9999%. (please, note I understand that this has nothing to do with human hearing)

QUOTE (jmvalin @ Mar 13 2013, 08:44) *
In general, what's useful would be to check if there's any regression between 1.0.x and the current master.

Personally, I couldn't find any regressions between 1.0.2 and 1.1a. For me 1.1a sounds better. If I had more time I could do some ABX-ing, but not today.
db1989
post Mar 13 2013, 00:13
Post #20





Group: Super Moderator
Posts: 5275
Joined: 23-June 06
Member No.: 32180



QUOTE (kabal4e @ Mar 12 2013, 22:39) *
Foobar's bit compare tool shows only 25-50% of samples to be different, which is an amazing result. Usually, I get 99.9999%. (please, note I understand that this has nothing to do with human hearing)
Audible or not, this is almost totally useless as a way to evaluate a lossy codec, even were it not the case that phase-shifting, etc. will completely confound naïve bit-comparisons.

QUOTE
For me 1.1a sounds better. If I had more time I could do some ABX-ing, but not today.
Please wait until you’ve ABXd it to make claims, in that case.
jmvalin
post Mar 13 2013, 03:04
Post #21


Xiph.org Speex developer


Group: Developer
Posts: 485
Joined: 21-August 02
Member No.: 3134



QUOTE (db1989 @ Mar 12 2013, 19:13) *
QUOTE (kabal4e @ Mar 12 2013, 22:39) *
Foobar's bit compare tool shows only 25-50% of samples to be different, which is an amazing result. Usually, I get 99.9999%. (please, note I understand that this has nothing to do with human hearing)
Audible or not, this is almost totally useless as a way to evaluate a lossy codec, even were it not the case that phase-shifting, etc. will completely confound naïve bit-comparisons.


Well, bit comparisons are very useful. If two clips are bit-identical, they have the same quality (no matter what your ABX test says), which saves a lot of time. Also, for many changes, just having a single bit change means you screwed up something.
db1989
post Mar 13 2013, 09:35
Post #22





Group: Super Moderator
Posts: 5275
Joined: 23-June 06
Member No.: 32180



QUOTE (jmvalin @ Mar 13 2013, 02:04) *
If two clips are bit-identical, they have the same quality
But we’re talking about a lossy codec.

QUOTE
Also, for many changes, just having a single bit change means you screwed up something.
I presume this means it’s useful during the process of development. But again, the post was addressed to an end-user. Bit-comparing lossy streams to their uncompressed source can be confounded in so many ways and is not likely to be informative even if they’re controlled for.
bawjaws
post Mar 14 2013, 17:54
Post #23





Group: Members
Posts: 174
Joined: 10-December 02
Member No.: 4043



QUOTE (db1989 @ Mar 13 2013, 00:35) *
QUOTE (jmvalin @ Mar 13 2013, 02:04) *
If two clips are bit-identical, they have the same quality
But we’re talking about a lossy codec.

QUOTE
Also, for many changes, just having a single bit change means you screwed up something.
I presume this means it’s useful during the process of development. But again, the post was addressed to an end-user. Bit-comparing lossy streams to their uncompressed source can be confounded in so many ways and is not likely to be informative even if they’re controlled for.


"Bit-identical" and "not bit-identical" both seem to give useful info for various purposes, but only "bit-identical" gives you info on comparative quality.

This post has been edited by bawjaws: Mar 14 2013, 17:55
db1989
post Mar 14 2013, 18:49
Post #24





Group: Super Moderator
Posts: 5275
Joined: 23-June 06
Member No.: 32180



Please explain how a bit-comparison provides any information other than ‘this file is different from that file’, as already noted by jmvalin above, which is very basic and limited in its utility. Please then elaborate on how the information from a bit-comparison can indicate relative quality between streams.

Can anyone provide a justification for discussing bit-comparison in reference to a lossy codec (other than ‘this≠that’), for example an explanation of why it isn’t even less useful than difference signals, which we already tend to advise against? If not, this is all just clutter in the thread, and I’m inclined to remove it.
jmvalin
post Mar 14 2013, 21:38
Post #25


Xiph.org Speex developer


Group: Developer
Posts: 485
Joined: 21-August 02
Member No.: 3134



QUOTE (db1989 @ Mar 14 2013, 13:49) *
Can anyone provide a justification for discussing bit-comparison in reference to a lossy codec (other than ‘this≠that’), for example an explanation of why it isn’t even less useful than difference signals, which we already tend to advise against? If not, this is all just clutter in the thread, and I’m inclined to remove it.


The information contained in A!=B is that something actually changed. What you compare is not original to coded, but codedA to codedB. It tells you whether whatever you changed actually had *any* impact on the result. For example, in some circumstances, adding a certain option to opusenc will produce *exactly* the same output as without the option. Before you waste an hour trying to ABX, you can quickly see that the decoded files are identical. The opposite is also true. If you have two different builds of the same code that produce non-identical results (even if they sound the same), it's often worth at least investigating (it's sometimes just different rounding, but sometimes not). This is why bit comparisons are useful. They're a sanity check. I've made the error myself before: asking people to tell me which of two files sounded best when in fact they were bit-identical.
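As an illustration of that sanity check, here is a minimal Python sketch that compares two decoded files frame by frame. It assumes both encodes have already been decoded to plain PCM WAV (for example with opusdec) and uses only the standard `wave` module; the function name `compare_decoded_wavs` is hypothetical. It says nothing about audible quality, only whether the two decodes differ at all.

```python
import wave


def compare_decoded_wavs(path_a, path_b):
    """Return (identical, differing_frames, total_frames) for two PCM WAVs.

    This is a codedA-vs-codedB check, not a quality measure: it only tells
    you whether a change to the encoder or its options altered the output.
    """
    with wave.open(path_a, "rb") as a, wave.open(path_b, "rb") as b:
        fmt_a = (a.getnchannels(), a.getsampwidth(), a.getframerate())
        fmt_b = (b.getnchannels(), b.getsampwidth(), b.getframerate())
        if fmt_a != fmt_b:
            raise ValueError(f"format mismatch: {fmt_a} vs {fmt_b}")
        frame_size = a.getnchannels() * a.getsampwidth()
        data_a = a.readframes(a.getnframes())
        data_b = b.readframes(b.getnframes())
    common = min(len(data_a), len(data_b)) // frame_size
    differing = sum(
        data_a[i * frame_size:(i + 1) * frame_size]
        != data_b[i * frame_size:(i + 1) * frame_size]
        for i in range(common)
    )
    # Frames present in only one of the files count as differences too.
    longest = max(len(data_a), len(data_b)) // frame_size
    differing += longest - common
    return differing == 0, differing, longest
```

If the first element of the result is true, the two decodes are identical and there is nothing to ABX; if it is false, the count only shows that something changed, not whether the change is audible.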