Personal Listening Test of Opus, Celt, AAC at 75-100kbps, ABC/HR blind test, 1 Listener
Kamedo2
post Nov 17 2012, 09:25
Post #1





Group: Members
Posts: 220
Joined: 16-November 12
From: Kyoto, Japan
Member No.: 104567



Abstract:
A blind comparison of the new (2012/09) opusenc (tfsel5), the old celtenc 0.11.2, and Apple AAC-LC in tvbr and cvbr modes.
This is an English version of my original post in Japanese. http://d.hatena.ne.jp/kamedo2/20121116/1353099244#seemore

Encoders:
libopus 0.9.11-146-gdc4f83b-exp_analysis
https://people.xiph.org/~greg/opus-tools_exp_dc4f83be.zip
celt-0.11.2-win32
https://people.xiph.org/~greg/celt-0.11.2-win32.zip
qaac 1.40

Settings:
opusenc --bitrate 66 input.wav output.wav
celtenc input.48k.raw --bitrate 75 --comp 10 output.wav
qaac --cvbr 72 -o output.m4a input.wav
qaac --tvbr 27 -o output.m4a input.wav
opusenc --bitrate 90 input.wav output.wav
celtenc input.48k.raw --bitrate 100 --comp 10 output.wav
qaac --cvbr 96 -o output.m4a input.wav
qaac --tvbr 45 -o output.m4a input.wav

Samples:
20 sound samples of various genres, from easy to moderately critical.
http://zak.s206.xrea.com/bitratetest/main.htm
To download, follow the link above; the files are the 3rd-6th links in the 2nd paragraph (40_30sec - Run up).

Hardware:
Sony PSP-3000 with RP-HT560 (1st run) and RP-HJE150 (2nd run); I took the average of the two results.

Results:




Conclusions & Observations:
I could not detect a significant improvement in the new September 1st build of Opus over the old 2011 celtenc.
This is possibly because the new Opus inflates bitrate more than it improves quality, even though the sample set contains easy samples.
At 75 kbps, Opus/CELT are markedly better than AAC. At 100 kbps, there is no big difference between the codecs.

Raw data:
40 listening logs, plus encoder and decoder logs
http://zak.s206.xrea.com/bitratetest/log_o...kbps100kbps.zip
CODE
% Opus, AAC 75kbps, 100kbps ABC/HR Score
% This format is compatible with my graphmaker, as well as ff123's FRIEDMAN.
opus_75k    celt_75k    cvbr_75k    tvbr_75k    opus100k    celt100k    cvbr100k    tvbr100k
%features 6 75kbps 75kbps 75kbps 75kbps 100kbps 100kbps 100kbps 100kbps
%features 7 OPUS OPUS AAC-LC AAC-LC OPUS OPUS AAC-LC AAC-LC
3.050    3.100    2.500    2.750    3.500    3.750    3.700    3.800    
3.750    2.950    2.700    2.750    4.050    3.800    4.000    3.950    
2.800    2.550    3.000    3.000    3.600    3.250    4.050    3.900    
2.700    3.150    2.350    2.300    3.350    3.800    3.600    3.700    
4.000    3.400    2.850    2.850    4.350    3.900    3.550    3.550    
2.600    2.550    2.800    2.800    3.350    3.150    3.950    3.900    
3.400    3.950    3.000    3.200    3.850    4.500    3.700    3.800    
3.450    3.500    2.900    2.800    3.850    4.050    4.050    4.150    
2.950    2.700    3.550    3.450    3.250    3.450    4.000    3.850    
3.100    3.400    2.750    2.600    3.800    3.850    4.150    4.000    
3.350    3.100    2.600    2.600    3.750    3.400    3.450    3.500    
3.750    3.350    2.800    2.950    4.050    3.750    3.800    3.850    
3.550    3.300    2.600    2.650    4.250    3.950    3.750    3.600    
3.100    3.350    2.750    2.550    3.650    3.700    3.850    3.800    
3.400    3.450    2.900    2.900    3.650    3.950    3.750    3.900    
3.250    3.300    2.750    2.800    3.650    3.850    3.950    3.750    
3.600    3.800    3.300    3.300    3.550    4.000    3.650    3.700    
3.700    3.350    3.300    3.300    3.900    3.650    4.100    4.000    
3.100    3.600    3.150    3.000    3.700    3.800    4.100    3.850    
3.650    4.050    3.000    2.900    4.050    4.250    3.750    3.550

It's not strange that some scores fall on a 0.050 granularity, because I tested each track twice and averaged the results.
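For anyone re-checking the numbers, the per-condition means of the table above can be recomputed with a short script (a sketch in Python; the data is pasted verbatim from the CODE block, and the column names follow its header row):

```python
# Mean ABC/HR score per condition, from the raw scores posted above.
DATA = """
3.050 3.100 2.500 2.750 3.500 3.750 3.700 3.800
3.750 2.950 2.700 2.750 4.050 3.800 4.000 3.950
2.800 2.550 3.000 3.000 3.600 3.250 4.050 3.900
2.700 3.150 2.350 2.300 3.350 3.800 3.600 3.700
4.000 3.400 2.850 2.850 4.350 3.900 3.550 3.550
2.600 2.550 2.800 2.800 3.350 3.150 3.950 3.900
3.400 3.950 3.000 3.200 3.850 4.500 3.700 3.800
3.450 3.500 2.900 2.800 3.850 4.050 4.050 4.150
2.950 2.700 3.550 3.450 3.250 3.450 4.000 3.850
3.100 3.400 2.750 2.600 3.800 3.850 4.150 4.000
3.350 3.100 2.600 2.600 3.750 3.400 3.450 3.500
3.750 3.350 2.800 2.950 4.050 3.750 3.800 3.850
3.550 3.300 2.600 2.650 4.250 3.950 3.750 3.600
3.100 3.350 2.750 2.550 3.650 3.700 3.850 3.800
3.400 3.450 2.900 2.900 3.650 3.950 3.750 3.900
3.250 3.300 2.750 2.800 3.650 3.850 3.950 3.750
3.600 3.800 3.300 3.300 3.550 4.000 3.650 3.700
3.700 3.350 3.300 3.300 3.900 3.650 4.100 4.000
3.100 3.600 3.150 3.000 3.700 3.800 4.100 3.850
3.650 4.050 3.000 2.900 4.050 4.250 3.750 3.550
"""

COLS = ["opus_75k", "celt_75k", "cvbr_75k", "tvbr_75k",
        "opus100k", "celt100k", "cvbr100k", "tvbr100k"]

rows = [[float(x) for x in line.split()] for line in DATA.strip().splitlines()]
means = {c: sum(r[i] for r in rows) / len(rows) for i, c in enumerate(COLS)}

for c in COLS:
    print(f"{c}: {means[c]:.4f}")  # e.g. opus_75k: 3.3125
```

On this data the 75 kbps Opus/CELT means come out clearly above the 75 kbps AAC means, matching the conclusion above.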
C.R.Helmrich
post Nov 17 2012, 10:28
Post #2





Group: Developer
Posts: 687
Joined: 6-December 08
From: Erlangen Germany
Member No.: 64012



Thanks for this interesting test, Kamedo, and welcome to HA!

So is the "libopus 0.9.11-146-gdc4f83b-exp_analysis (https://people.xiph.org/~greg/opus-tools_exp_dc4f83be.zip)" equal to the current official Opus release, or something older?

Chris


--------------------
If I don't reply to your reply, it means I agree with you.
Kamedo2
post Nov 17 2012, 10:40
Post #3








QUOTE (C.R.Helmrich @ Nov 17 2012, 18:28) *
Thanks for this interesting test, Kamedo, and welcome to HA!

So is the "libopus 0.9.11-146-gdc4f83b-exp_analysis (https://people.xiph.org/~greg/opus-tools_exp_dc4f83be.zip)" equal to the current official Opus release, or something older?

Chris

I started the tests on 2012-09-01, and at that time this was the newest experimental branch, called exp_analysis7, which was merged into the main branch on 2012-10-09 by Jean-Marc Valin.
So it's not the current official Opus release. I think I spent too much time testing.
http://git.xiph.org/?p=opus.git;a=commit;h...9dedf21b3fa6ecb
Dynamic
post Nov 17 2012, 10:48
Post #4





Group: Members
Posts: 821
Joined: 17-September 06
Member No.: 35307



Thank you for your time and dedication, Kamedo2 :D

Like the other tests you've done, it's a very thorough job and provides useful comparative data points in the comparison of the leading encoders at 100kbps, much in the way that other individual good listeners' serious multi-sample tests (e.g. guruboolez) have provided useful information in the past.

At 75 kbps it's debatable and possibly subjective whether iTunes/QuickTime LC-AAC or HE-AAC is the better AAC mode, as this is near the point (from blind comparisons I've seen) where the two AAC modes converge in quality scores. It also extends somewhat above 64 kbps the range where ABX logs have shown Opus to beat a leading AAC encoder by a likely significant margin (albeit that the 64 kbps multiformat test had ruled out LC-AAC before the main test, as HE-AAC beat it at 64 kbps).

It's also reassuring that the modifications to the final rather advanced CELT versions that allowed it to be combined with a modified SILK in the hybrid Opus codec do not appear to have inflicted any quality penalty at the bitrates and samples you've tested.

Thanks again!
IgorC
post Nov 17 2012, 11:04
Post #5





Group: Members
Posts: 1576
Joined: 3-January 05
From: ARG/RUS
Member No.: 18803



Kamedo2, thank you for all your tests.
Glad to see you on HydrogenAudio.

The way the test was presented reminds me of someone else's. :P
Kamedo2
post Nov 17 2012, 18:44
Post #6








The samples I used


The ABX criterion was 12/15 (p = 0.02). Among the 320 ABX tests I did, I failed to pass only once: mybloodrusts, CELT 100 kbps (11/15).
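The 12/15 criterion can be checked with a one-sided binomial test (a sketch; it assumes each ABX trial is a 50/50 guess under the null hypothesis):

```python
from math import comb

def abx_p_value(correct, trials):
    """One-sided p-value: chance of getting >= `correct` answers right by guessing."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

print(round(abx_p_value(12, 15), 4))  # 0.0176 -- passes the p = 0.02 criterion
print(round(abx_p_value(11, 15), 4))  # 0.0592 -- which is why 11/15 is a fail
```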
Anakunda
post Nov 17 2012, 23:45
Post #7





Group: Members
Posts: 460
Joined: 24-November 08
Member No.: 63072



QUOTE (Kamedo2 @ Nov 17 2012, 09:25) *
A blind comparison of the new (2012/09) opusenc (tfsel5), the old celtenc 0.11.2, and Apple AAC-LC in tvbr and cvbr modes.


Is Apple AAC really better at 96k than Opus? I was confident Opus ruled all the low-bitrate ranges. :unsure:
And I wonder why opusenc with ea7 was used when there is an official build (libopus 1.0.1)?
C.R.Helmrich
post Nov 18 2012, 00:54
Post #8








QUOTE (Anakunda @ Nov 18 2012, 00:45) *
Is Apple AAC really better at 96k than Opus? I was confident Opus ruled all the low-bitrate ranges. :unsure:
And I wonder why opusenc with ea7 was used when there is an official build (libopus 1.0.1)?

Your second question was answered in Kamedo's reply to my question. And who says Apple AAC is better than Opus at 96 kbps? The confidence intervals overlap heavily, and Kamedo's statistical analysis says there is no significant difference.

And: in today's codec development, we consider 96 kbps "high bitrate" :)

Chris


Kamedo2
post Nov 18 2012, 05:47
Post #9








QUOTE (Anakunda @ Nov 18 2012, 07:45) *
Is Apple AAC really better at 96k than Opus? I was confident Opus ruled all the low-bitrate ranges. :unsure:

On a very large set of samples, it is possible that Opus is slightly better than Apple AAC at 96k.
Opus may look poorer than you imagined on a graph of actual bitrate (x-axis) vs average score (y-axis), because
the bitrate bloat of Opus is so big (nominal 90k, actual 102k) that its points shift rightward on the graph.
This is worrying behavior, and my set of 20 samples contains plenty of non-critical samples.

QUOTE (Anakunda @ Nov 18 2012, 07:45) *
And I wonder why opusenc with ea7 was used when there is an official build (libopus 1.0.1)?

libopus 1.0.1 was released two weeks after I had started the test, which took me 2.5 months.
The newer version may beat AAC by a clear margin if its tweaks include quality improvements or tighter actual bitrates.
Dynamic
post Nov 18 2012, 09:20
Post #10








QUOTE
x-axis=actual bitrate


That was one query I did have about your method, Kamedo2 - mainly because I don't speak more than about 5 words of Japanese to find out for myself.

As you have been so thorough, I expect we can assume that you tested the encoders on a larger collection (maybe 10 or many more CDs) of normal music to determine the target settings to achieve the bitrate target, and that you plotted THAT collection-wide bitrate (for the large sample size) on the horizontal axis of your graphs, ignoring the actual bitrate used for the test samples?

The bitrate on a number of short samples (with more problem samples than usual) may properly be much higher than the average of a wide collection and is not relevant in the context of fair comparison. (That's the beauty of well tuned VBR, and to some extent modern constrained VBR as used by the 'CBR' modes - using much higher bitrate when the sound requires it and less when it doesn't without inflating the average over a whole collection).

There's also a chance that Opus produces lower bitrates for short samples as it doesn't use large codebooks compared to AAC, Vorbis etc., so using the sample-wide bitrate would give an over-emphasised disadvantage to AAC at lower bitrates on shorter-than-normal samples, again requiring the bitrate of a wider collection of representative CD releases, presumably encoded in the popular file-per-track mode without any pre-applied Replay Gain, as that's most representative of real use of lossy encoders for playback.
Kamedo2
post Nov 18 2012, 10:23
Post #11








QUOTE (Dynamic @ Nov 18 2012, 17:20) *
QUOTE
x-axis=actual bitrate


That was one query I did have about your method, Kamedo2 - mainly because I don't speak more than about 5 words of Japanese to find out for myself.

As you have been so thorough, I expect we can assume that you tested the encoders on a larger collection (maybe 10 or many more CDs) of normal music to determine the target settings to achieve the bitrate target, and that you plotted THAT collection-wide bitrate (for the large sample size) on the horizontal axis of your graphs, ignoring the actual bitrate used for the test samples?


The horizontal axis is always the average actual bitrate, including headers and footers, of each sample.
I calculated filesize*8/(sample_num/44100) for each sample, typically 20-30 sec long.
Then I took the average of the 20 bitrates in bps. Larger collections were NEVER used.
Encoded files are typically around 200 KB. Opus should benefit from smaller codebooks in this case.
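The formula above can be written out as a small helper (a sketch; the 44100 Hz sample rate matches the formula, but the file size and duration below are made-up round numbers for illustration, not measured values):

```python
def actual_bitrate_bps(filesize_bytes, num_samples, sample_rate=44100):
    """Bitrate including headers and footers: filesize*8 / (sample_num / 44100)."""
    duration_s = num_samples / sample_rate
    return filesize_bytes * 8 / duration_s

# A hypothetical 225,000-byte file holding 24 s of 44.1 kHz audio:
print(actual_bitrate_bps(225_000, 24 * 44100) / 1000)  # 75.0 kbps
```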

QUOTE
larger collection (maybe 10 or many more CDs) of normal music

My set of samples is fairly normal. At least it is not an array of super-critical killer samples, although the clips are short.
The last 5 of the 20 samples I used (Reunion Blues ~ Run up):
http://zak.s206.xrea.com/bitratetest/bitra...st_wav25-29.zip

CODE
list of audio file size in KB(1000B)
286 377 187 220 231 151 153 278 104 232 252 204 182 246 187 251 267 255 155 170
283 278 182 231 175 165 191 264 100 237 260 192 188 265 190 282 289 256 186 189
303 280 202 233 190 163 191 266 108 233 273 199 188 263 184 313 278 256 187 189
319 216 208 224 175 163 191 253 108 209 268 205 163 240 154 341 233 242 169 158
389 509 252 298 313 204 202 375 139 311 342 276 244 334 238 341 344 344 210 229
379 373 242 309 234 220 255 353 133 316 346 257 250 354 253 376 386 343 249 252
415 370 265 310 252 222 252 355 143 312 372 264 248 354 249 411 366 341 250 245
425 287 275 310 239 222 262 347 143 282 372 282 231 340 212 455 305 330 241 222

list of audio file bitrate including headers and footers in kbps(1000bps)
  76 101  78  72  92  69  60  79  80  74  69  81  73  69  75  67  72  74  63  68
  75  74  76  76  70  76  75  75  77  76  72  76  75  75  76  75  78  75  75  76
  81  75  84  76  76  75  75  75  84  74  75  79  75  74  74  84  75  75  75  75
  85  58  87  73  70  75  75  72  84  67  74  81  65  68  62  91  63  71  68  63
104 136 105  97 125  93  79 106 107  99  94 110  98  94  96  91  93 100  84  92
101 100 101 101  94 101 100 100 103 101  95 102 100 100 102 100 104 100 100 101
111  99 111 101 101 102  99 100 111  99 102 105  99 100 100 110  99  99 101  98
113  77 115 101  96 102 103  98 111  90 102 112  92  96  86 121  83  96  97  89
Dynamic
post Nov 18 2012, 20:07
Post #12








Thank you for the clarification.

It seems that duration is typically 20-30 seconds, though a few samples are around 10 seconds, so each is typically 10-20% of the length of a typical song. I think the advantage to Opus of not transmitting codebooks comes mainly in files of a few seconds' duration, as it saves a few kilobytes compared to formats like Vorbis, so hopefully any bitrate penalty to AAC can be neglected, as it's probably less than 1% on average.

I guess only a few samples like Tom's Diner and 41_30sec have been considered problem samples in the past by my recollection, so their effect on overall bitrate is hopefully minor. My guess is that the test method is essentially sound, and any minor differences that might arise between the test samples and typical real-world usage are probably masked by the statistical error bars.
IgorC
post Nov 18 2012, 20:12
Post #13








Kamedo2,

I'm not here to criticize your test; quite the contrary, I agree with your results.
I know how it feels when somebody starts criticizing a test's results because the codec he ships in his products didn't come first, no matter how good your intentions were.

So, thank you very much for your labor. :)

Though I have a somewhat different philosophy of testing VBR encoders, closely related to the one used in HA's public tests.
Simply put, an encoder is expected to bloat bitrate on difficult samples; that is fine as long as the VBR encoder respects the average bitrate over a large number of albums.
But I still agree with you that it's reasonable to expect the overall bitrate across all tested samples to match as well.
Like all listening tests, yours uses samples whose difficulty is higher than average. Not necessarily extreme killer samples, but somewhat harder to encode. So it evaluates only the HIGH part of VBR; it doesn't test the simple passages where the encoder goes really LOW. Yes, the experimental Opus bloats bitrate on hard parts (HIGH), but there was no evaluation of the LOW part (very easy samples). More likely, Opus's VBR mode is more "true" than Apple's true VBR :lol:
I've encoded some albums with the experimental Opus build and CELT 0.11.2. Yes, there can be a 2-3 kbps difference overall, but shifting bitrates by ~10 kbps is a bit too much.

But then again it's my point of view.

I think that ideally two encoders should be tested at a bitrate where both end up with the same total file size over a large number of albums, and then the test items should be chosen randomly so that they match the same rate between the encoders too.

This post has been edited by IgorC: Nov 18 2012, 20:50
Kamedo2
post Nov 19 2012, 02:17
Post #14








QUOTE (IgorC @ Nov 19 2012, 04:12) *
Like all listening tests, yours uses samples whose difficulty is higher than average. Not necessarily extreme killer samples, but somewhat harder to encode. So it evaluates only the HIGH part of VBR; it doesn't test the simple passages where the encoder goes really LOW. Yes, the experimental Opus bloats bitrate on hard parts (HIGH), but there was no evaluation of the LOW part (very easy samples).

If we were to include LOW samples, we can predict what happens to CBR and VBR codecs.
For CBR like CELT, because there are plenty of bits available on easy material, the average quality goes up: an upward shift on the x=bitrate, y=quality plot.
For VBR like Opus, because the goal is to retain the same quality, the average bitrate goes down while quality stays the same: a leftward shift on the plot.
On a bitrate-vs-quality graph, an upward shift and a leftward shift effectively mean the same thing: better results.

QUOTE (IgorC @ Nov 19 2012, 04:12) *
I've encoded some albums with the experimental Opus build and CELT 0.11.2. Yes, there could be difference in 2-3 kbps overall but shifting bitrates ~10 kbps is a bit too much.

Nice to hear that the bitrate increase is rather minor.
The problem is that the "some albums" are not redistributable and thus not reproducible. Maybe we should have an objective method to measure "bitrate bloat".
In VBR development, people are likely to make tweaks that increase bitrate, and a rightward shift on the plot is a bad thing. If a tweak increases bitrate only on the critical parts, the average bitrate bloat is minimized, meaning very little rightward shift. If a tweak increases bitrate generically, the rightward shift is big, and the tweak is useless. Generic bitrate increase plagues LAME, and I don't want Opus to go the same way.

As for my samples being short: a Vorbis codebook, for example, is several kilobytes. Compared to 4-5 minute samples, Opus gains at best about 2% from the files being small.
That can be a problem when the intended application is music storage; for ads, news videos, game sounds, or Wikimedia, testing at 20 seconds is about right.
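The ~2% figure can be sanity-checked numerically (a sketch; the 4 KB codebook size is an assumed round number for illustration, since actual Vorbis codebook sizes vary):

```python
def header_overhead(header_bytes, file_bytes):
    """Fraction of a file consumed by a fixed per-file header such as a Vorbis codebook."""
    return header_bytes / file_bytes

# Assumed ~4 KB codebook:
print(header_overhead(4_000, 200_000))    # ~2% of a ~200 KB, 20-30 s clip
print(header_overhead(4_000, 2_000_000))  # ~0.2% of a ~2 MB full-length track
```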
jmvalin
post Nov 19 2012, 20:03
Post #15


Xiph.org Speex developer


Group: Developer
Posts: 481
Joined: 21-August 02
Member No.: 3134



Hi Kamedo2, thanks for the test. From what I see, I suspect that the main "issue" is the rate. Based on the commit ID you're quoting for Opus, it was a version where the VBR is supposed to be properly "calibrated". That is, on a large collection of samples, the average output rate matches the target rate. Unlike the original CELT VBR code (which is what the Opus RFC has), which always tried to hit the same average rate, the new VBR code is quite aggressive: it will increase the rate significantly on hard samples and reduce it on easier samples. It would seem that your test has a significant fraction of samples that Opus considers hard, which is why you're getting 100 kb/s when asking for 90 kb/s. Keep in mind that the samples Opus considers hard are different from the ones that are considered hard for MP3, Vorbis or AAC.
lvqcl
post Nov 19 2012, 20:32
Post #16





Group: Developer
Posts: 3383
Joined: 2-December 07
Member No.: 49183



I took my Opus compile (libopus v1.0.1-140-gc55f4d8 from git) and encoded several albums with it (pop/rock/electronic etc).
X axis: the value of the --bitrate option. Y axis: real bitrate. One album = one line on the graph.

Kamedo2
post Nov 19 2012, 23:19
Post #17








QUOTE (jmvalin @ Nov 20 2012, 04:03) *
it was a version where the VBR is supposed to be properly "calibrated". That is, on a large collection of samples, the average output rate matches the target rate.

Thank you. Nice to hear that the Opus developers have a sensible way to avoid tweaks that increase bitrate on virtually everything and get perceived as 'improvements'.

QUOTE (lvqcl @ Nov 20 2012, 04:32) *
I took my Opus compile (libopus v1.0.1-140-gc55f4d8 from git) and encoded several albums with it (pop/rock/electronic etc).
X axis: the value of the --bitrate option. Y axis: real bitrate. One album = one line on the graph.

I'm pleased to see from your graph that, in the real-life application of music storage, Opus always creates files very close to the user's desired bitrate.

According to the bitrate table I posted, only 2 samples, mybloodrusts and Dimmu Borgir, are what Opus considered 'easy' (the value of the --bitrate option > real bitrate). I'm reluctant to include more easy samples because they add ABXing time and statistical noise, but I should have included more of them.
Dynamic
post Nov 20 2012, 03:03
Post #18








Thanks again to everyone in this thread. I'm certainly learning a lot. That graph of actual versus requested bitrate is useful information I wasn't aware of.

The low latency and fixed codebook requirements of Opus/CELT do mean it has different hard samples compared to high latency codecs like AAC and Vorbis and requires different bitrates to encode them well.

The beauty of Kamedo2's visual graphing approach of bitrate versus quality rating is that we can adjust the horizontal position of the rated scores if we find the true representative bitrate of the settings used over a wider collection without having to retest the samples.

Moving Opus at 75 kbps leftwards to its 66 kbps target bitrate, it's still clearly a winner over AAC-LC at anything remotely plausible for the AAC setting, even 60 kbps upwards, so the Opus victory there is solid. We don't know where CELT would end up, but likely no further to the right than its current position, possibly as low as about 66 kbps also.

At 100 kbps area, both AAC and Opus might need a little shift to the left (Opus to 90 kbps, AAC not sure, but probably less shift to the left) and are likely to remain statistically tied as they are now.

If we get more accurate bitrates, the graph can be replotted, the statistical results will surely be the same and the scatter plots in the first figure will represent actual bitrate variation over the sample corpus. A short horizontal line indicating the wide collection average at each setting would be useful to indicate which samples each encoder considers about average or a little bit harder to encode than average. (In any future test where the tested encoder settings are matched closely over a wider collection of CDs, the 'average bitrate line' for each encoder at the same target rate will be pretty close)

This post has been edited by Dynamic: Nov 20 2012, 03:09
jmvalin
post Nov 20 2012, 05:13
Post #19





QUOTE (Dynamic @ Nov 19 2012, 21:03) *
We don't know where CELT would end up, but likely no further to the right than its current position, possibly as low as about 66 kbps also.


CELT would normally stay where it is because it never had unconstrained VBR; it was always trying to hit the requested average bitrate for every file. Unconstrained VBR is a new feature in Opus and is still only implemented in the development branch (git master). So normally the CELT curves would remain in place, while the Opus curves would move to 66 kb/s and 90 kb/s.
jmvalin
post Nov 20 2012, 18:39
Post #20





QUOTE (Kamedo2 @ Nov 19 2012, 17:19) *
According to the bitrate table I posted, only 2 samples, mybloodrusts and Dimmu Borgir, are what Opus considered 'easy' (the value of the --bitrate option > real bitrate). I'm reluctant to include more easy samples because they add ABXing time and statistical noise, but I should have included more of them.


Well, you could also just "calibrate" on a large set that includes a bit of everything, which I believe is what most tests do. I had a closer look at the bitrates and indeed it seems like your test really focuses on hard samples -- probably more than any other test I've seen so far. Are the audio samples available so I can check why Opus is finding them hard?
C.R.Helmrich
post Nov 20 2012, 23:01
Post #21








QUOTE (Kamedo2 @ Nov 17 2012, 10:25) *
Samples:
20 sound samples of various genres, from easy to moderately critical.
http://zak.s206.xrea.com/bitratetest/main.htm
To download, follow the link above; the files are the 3rd-6th links in the 2nd paragraph (40_30sec - Run up).


Edit: I put these samples into Fraunhofer's latest AAC encoder, and at VBR 3 (target ~97 kbps), they average 107 kbps. Here as well, only 2 items get <= 97 kbps (samples 11ff and 24td). So it seems that what Opus/Celt finds difficult to code is not that different from what an AAC encoder might find difficult, after all (except maybe for very tonal items, yes).

Chris

This post has been edited by C.R.Helmrich: Nov 20 2012, 23:33


Kamedo2
post Nov 21 2012, 02:08
Post #22








QUOTE (jmvalin @ Nov 20 2012, 13:13) *
CELT would normally stay where it is because it never had unconstrained VBR; it was always trying to hit the requested average bitrate for every file. Unconstrained VBR is a new feature in Opus and is still only implemented in the development branch (git master). So normally the CELT curves would remain in place, while the Opus curves would move to 66 kb/s and 90 kb/s.

I assume the point of bitrate 'calibration' is to make the test close to real-world applications, where easy and difficult samples appear roughly 1:1, as opposed to 2:18 here.
In the real world, where easy samples appear more frequently than in this test, CBR codecs like CELT will not 'normally stay'. With plenty of easy samples there are fewer perceptible artifacts, and thus better average quality (imagine MP3 CBR 128 kbps), so the curve moves upward. It will NOT remain in place.
In other words, in typical music-storage applications, VBR like Opus gets better compression, while CBR gets better quality.


QUOTE (C.R.Helmrich @ Nov 21 2012, 07:01) *
Edit: I put these samples into Fraunhofer's latest AAC encoder, and at VBR 3 (target ~97 kbps), they average 107 kbps. Here as well, only 2 items get <= 97 kbps (samples 11ff and 24td). So it seems that what Opus/Celt finds difficult to code is not that different from what an AAC encoder might find difficult, after all (except maybe for very tonal items, yes).

Chris

Interesting stuff. 11ff = finalfantasy is the sample Opus considered the hardest, but for AAC it's easy. I guess that's because the sound of the harpsichord is tonal. After spending 101 and 136 kbps on the harpsichord, Opus did a very good job preserving the detail (3.75 and 4.05, respectively). And somehow, 24td = Tom's Diner is easy for AAC (but difficult for Opus).
jmvalin
post Nov 21 2012, 03:15
Post #23





QUOTE (Kamedo2 @ Nov 20 2012, 20:08) *
I assume the point of bitrate 'calibration' is to make the test close to real-world applications, where easy and difficult samples appear roughly 1:1, as opposed to 2:18 here.
In the real world, where easy samples appear more frequently than in this test, CBR codecs like CELT will not 'normally stay'. With plenty of easy samples there are fewer perceptible artifacts, and thus better average quality (imagine MP3 CBR 128 kbps), so the curve moves upward. It will NOT remain in place.
In other words, in typical music-storage applications, VBR like Opus gets better compression, while CBR gets better quality.


Actually, in the "real world", I would say that difficult samples represent less than 10% of all samples. Also, by "stay where it is", all I meant was that the x-axis position would remain the same regardless of the samples, because CELT gives the same average for every sample (unlike the new Opus code). As for "better compression vs better quality", it's really all the same thing, because you trade one for the other. The main thing we've actually been trying to do with VBR is to keep the quality constant by adjusting the bitrate. In a perfect world (unachievable), all samples would get exactly the same quality and only the bitrate would vary.


QUOTE (Kamedo2 @ Nov 20 2012, 20:08) *
Interesting stuff. 11ff = finalfantasy is the sample Opus considered the hardest, but for AAC it's easy. I guess that's because the sound of the harpsichord is tonal. After spending 101 and 136 kbps on the harpsichord, Opus did a very good job preserving the detail (3.75 and 4.05, respectively). And somehow, 24td = Tom's Diner is easy for AAC (but difficult for Opus).


Of all the instruments, the harpsichord is the most difficult for Opus to code and the one that requires the highest bitrate. So the fact that you included two harpsichord samples (although harpsichord is typically rare) partly explains why Opus is using a rate much higher than its usual rate. As for why tonal samples are harder for Opus than for AAC, the main reason is the focus on interactive applications: Opus is designed for lower delay than AAC-LC, which forced the use of a shorter window (especially a shorter overlap).
Kamedo2
post Nov 21 2012, 03:26
Post #24








Bitrate vs Score plot of the 20 samples used.

Opus

These two outlier samples at the top right are finalfantasy and FloorEssence: excellent quality, but a lot of bits used.

Apple AAC-LC tvbr

Kamedo2
post Nov 21 2012, 05:37
Post #25








QUOTE (jmvalin @ Nov 21 2012, 11:15) *
Actually, in the "real world", I would say that difficult samples represent less than 10% of all samples. Also, by "stay where it is", all I meant was that the x-axis position would remain the same regardless of the samples, because CELT gives the same average for every sample (unlike the new Opus code). As for "better compression vs better quality", it's really all the same thing, because you trade one for the other. The main thing we've actually been trying to do with VBR is to keep the quality constant by adjusting the bitrate. In a perfect world (unachievable), all samples would get exactly the same quality and only the bitrate would vary.

By 'hard samples' I meant samples whose real bitrate exceeds the value of the --bitrate option. In the real world, if we randomly chose 100 samples, roughly 50 would come out below the --bitrate value and 50 above. Yes, 1:1. As for "better compression vs better quality": those who want to calibrate the x-axis to better represent the real world should also calibrate the y-axis. Calibrating the y-axis means listening tests that include many easy samples, and it's understandable why people don't want to do that; but on an average-bitrate vs average-score graph, calibrating only one axis breaks the beautiful property of the graph once it includes CBR codecs like CELT, and makes it less fair.

By far the simplest solution I can think of is to add more easy samples, defined as samples that come out smaller than expected from the value of the --bitrate option.

QUOTE (jmvalin @ Nov 21 2012, 11:15) *
Of all the instruments, the harpsichord is the most difficult for Opus to code and the one that requires the highest bitrate. So the fact that you included two harpsichord samples (although harpsichord is typically rare) partly explains why Opus is using a rate much higher than its usual rate.

Sorry for my misleading reply. Of the 20 samples I used, only finalfantasy is a harpsichord sample. I can understand why critical samples like finalfantasy, FloorEssence, or VelvetRealm take so much space.
