Codec performance comparison, ARM9E results from rockbox
saratoga
post Jul 11 2010, 00:03
Post #1





Group: Members
Posts: 4968
Joined: 2-September 02
Member No.: 3264



I recently did some performance comparisons of various codecs in Rockbox on ARM9E (specifically a Sansa Clipv2). For this test I clocked the CPU and memory at 40MHz, to mimic the typical settings an MP3 player's CPU would run at in low power mode with the screen off. The results are pretty interesting.

CODE

128k (give or take for AC3) lossy files:

a52_stereo_192.ac3    17.29MHz
mpc_128.mpc           18.01MHz
wma_128.wma           19.26MHz
vorbis_128.ogg        20.16MHz
nero_128.m4a          24.26MHz
lame_128.mp3          26.35MHz




Vorbis uses Tremor, MP3 uses libmad, MPC uses Peter's fixed-point decoder, WMA uses my fixed-point port of ffmpeg's decoder, Nero (AAC) uses libfaad combined with a lot of code from ffmpeg, and A52 uses liba52.
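To put the table in perspective, here is a minimal sketch that turns each figure into a CPU load at the 40MHz test clock. It assumes the MHz values are the clock speed needed to decode in real time and that load scales linearly with clock; the names `REQUIRED_MHZ` and `cpu_load_percent` are made up for this example.

```python
# Clock needed for real-time decoding, in MHz, copied from the table above.
REQUIRED_MHZ = {
    "a52_stereo_192.ac3": 17.29,
    "mpc_128.mpc": 18.01,
    "wma_128.wma": 19.26,
    "vorbis_128.ogg": 20.16,
    "nero_128.m4a": 24.26,
    "lame_128.mp3": 26.35,
}

TEST_CLOCK_MHZ = 40.0  # CPU clock used for this test


def cpu_load_percent(required_mhz, clock_mhz=TEST_CLOCK_MHZ):
    """Share of the clock spent decoding (assumes load scales linearly with MHz)."""
    return 100.0 * required_mhz / clock_mhz


for name, mhz in REQUIRED_MHZ.items():
    print(f"{name:<20} {cpu_load_percent(mhz):5.1f}% busy at {TEST_CLOCK_MHZ:.0f}MHz")
```

Whatever is left over is time the CPU can spend idle or on other work such as the UI.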

Some caveats:

AAC isn't as well optimized as the others. It uses the same transform and windowing code as Vorbis, but libfaad is kind of a mess.

Vorbis and WMA are pretty well optimized.

MP3 is extremely well optimized, and probably faster than a lot of commercially shipping decoders, judging by the much better battery life in Rockbox than in the retail firmware when decoding MP3. All the pure MDCT codecs use the same MDCT, which is pretty close to the performance ARM Ltd claims for their optimized MDCT. MP3 uses its own MDCT, which is written in assembly and within a few instructions of the best known algorithm.
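For anyone who hasn't seen the transform being discussed, here is a naive textbook MDCT/IMDCT pair with a sine window and 50% overlap-add. It is only a sketch of the math: it runs in O(N^2) floating point and has nothing to do with the optimized fixed-point code in Rockbox. All function names and the block size are chosen for illustration.

```python
import math


def mdct(block):
    """Naive MDCT: 2N time samples in, N coefficients out."""
    n_coefs = len(block) // 2
    return [
        sum(
            block[n] * math.cos(math.pi / n_coefs * (n + 0.5 + n_coefs / 2) * (k + 0.5))
            for n in range(2 * n_coefs)
        )
        for k in range(n_coefs)
    ]


def imdct(coefs):
    """Naive inverse MDCT: N coefficients in, 2N aliased time samples out.

    The 2/N scale is the usual choice when a window is applied on both
    the analysis and synthesis side.
    """
    n_coefs = len(coefs)
    return [
        (2.0 / n_coefs)
        * sum(
            coefs[k] * math.cos(math.pi / n_coefs * (n + 0.5 + n_coefs / 2) * (k + 0.5))
            for k in range(n_coefs)
        )
        for n in range(2 * n_coefs)
    ]


def sine_window(length):
    """Sine window; its squares at n and n + length/2 sum to 1."""
    return [math.sin(math.pi * (n + 0.5) / length) for n in range(length)]


def roundtrip(signal, n_coefs):
    """Window, transform, invert, window again and overlap-add.

    len(signal) must be a multiple of n_coefs. The signal is padded with
    n_coefs zeros on each side so every sample is covered by two blocks.
    """
    window = sine_window(2 * n_coefs)
    padded = [0.0] * n_coefs + list(signal) + [0.0] * n_coefs
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * n_coefs + 1, n_coefs):
        block = [padded[start + i] * window[i] for i in range(2 * n_coefs)]
        rec = imdct(mdct(block))
        for i in range(2 * n_coefs):
            out[start + i] += rec[i] * window[i]
    return out[n_coefs:-n_coefs]


signal = [math.sin(0.3 * i) for i in range(32)]
restored = roundtrip(signal, n_coefs=8)
print(max(abs(a - b) for a, b in zip(signal, restored)))  # should be tiny
```

The aliasing in each block cancels against its neighbours during overlap-add, which is why a single block on its own does not reconstruct the input.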

All low power, low accuracy, etc. modes are disabled. All codecs decode at full resolution, typically > 16 bit.

Some thoughts:

A52 is really fast because it's really simple. It has two transform block sizes, both of which are shorter than in most codecs. In stereo mode it's basically nothing but Huffman decoding, inverse quantization, and then the MDCT.

AAC is really complicated and the spec isn't freely available, which makes optimizing it difficult.

MP3 is really slow compared to pure MDCT codecs or pure subband codecs in spite of intense optimization. This is because it has to do both an MDCT (like Vorbis, AAC, WMA, etc.) and a synthesis filterbank (like MPC). Neglecting the synthesis filterbank, it performs similarly to the other small-block MDCT codec, A52. The extra work leads to decreased battery life when using the format.

WMA uses the most complicated MDCT scheme of the codecs, but is otherwise extremely simple at mid to high bitrates (low bitrate coding is more complicated). Consequently, it's very fast and very easy to optimize. In terms of quality per decode time, it and Vorbis are probably towards the top.

People say Vorbis is slow, but actually it's one of the fastest formats tested. It's fairly complicated, but reasonably easy to optimize and in general efficiently designed. It does have a very annoying habit of using an unpredictable amount of memory though. Some files can use up to 500KB of RAM, but most modern ones only need < 250KB. Annoying.

I didn't post ATRAC3 results because I don't have an encoder, but it's also really, really slow because it's a hybrid format like MP3 with a lot of really weird stuff added on top that no other format uses.

ffmpeg codecs are typically the fastest and best written.



CODE
Lossless Formats

flac_5.flac           7.07MHz
flac_8.flac           7.67MHz
wv_fastx3.wv          24.11MHz
wv_normx4.wv          28.69MHz
true_audio.tta        36.62MHz
ape_c1000.ape         40.66MHz
wv_high.wv            45.07MHz
ape_c2000.ape         57.69MHz
ape_c3000.ape         86.34MHz
ape_c4000.ape         221.24MHz


Some caveats:

FLAC uses ffmpeg's decoder, APE uses Rockbox's own decoder (which has since been ported to ffmpeg), and WavPack and TTA use the official decoders.

Lossless formats are more boring to me, so I can't say as much about them.

All the formats are really well optimized, though FLAC and APE are probably the best optimized.


Some thoughts:

FLAC is amazingly efficient. It's probably the fastest compressed format in remotely widespread use. All formats seem to compress about the same, so if you want to use lossless on batteries, you should probably look at FLAC first.

APE is really, really slow. I didn't benchmark c5000 at low clock speed since I didn't have all night, but it needs around 900MHz. I would not use APE on a portable device.

WavPack is probably fast enough that you wouldn't notice much difference in battery life versus FLAC. Most devices can't lower the clock below 20MHz anyway.
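A rough sketch of why the clock floor matters: if the device can't go below 20MHz, FLAC's 7MHz requirement buys nothing beyond that floor, so the practical gap to WavPack shrinks. The 20MHz floor is the figure from this post, the function name is made up, and this ignores any savings from idling at a fixed clock.

```python
MIN_CLOCK_MHZ = 20.0  # assumed lowest clock the device can reach (figure from the post)


def effective_clock(required_mhz, floor_mhz=MIN_CLOCK_MHZ):
    """Clock the device would actually run at: never below the floor."""
    return max(required_mhz, floor_mhz)


flac = 7.07      # flac_5.flac from the table above
wavpack = 24.11  # wv_fastx3.wv from the table above

raw_gap = wavpack - flac
real_gap = effective_clock(wavpack) - effective_clock(flac)

print(f"gap on paper:        {raw_gap:.2f}MHz")
print(f"gap with 20MHz floor: {real_gap:.2f}MHz")
```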


More results if anyone is interested:

http://www.rockbox.org/wiki/CodecPerforman...0MHz_PCLK_40MHz
edwardar
post Jul 11 2010, 09:48
Post #2





Group: Members
Posts: 98
Joined: 8-July 04
Member No.: 15139



Wow, that's blown all my preconceptions out of the water!!

I've been using Lame V5 on my clip+ (rockboxed), because I thought it maximised battery life. Now I'm ready to chuck that in and move to ogg. I much prefer the artifacts introduced by ogg at low bitrate than those created by mp3.

I'll stay away from wma, and it looks like mpc wasn't optimised for low bitrates.

C.R.Helmrich
post Jul 11 2010, 10:39
Post #3





Group: Developer
Posts: 688
Joined: 6-December 08
From: Erlangen Germany
Member No.: 64012



Thanks a lot for this interesting comparison, saratoga! One question:

QUOTE (saratoga @ Jul 11 2010, 01:03) *
WMA uses the most complicated MDCT scheme of the codecs, ...

Can you elaborate, please? What does WMA do MDCT-wise that the other codecs don't?

Chris


--------------------
If I don't reply to your reply, it means I agree with you.
saratoga
post Jul 11 2010, 18:06
Post #4








QUOTE (C.R.Helmrich @ Jul 11 2010, 05:39) *
Thanks a lot for this interesting comparison, saratoga! One question:

QUOTE (saratoga @ Jul 11 2010, 01:03) *
WMA uses the most complicated MDCT scheme of the codecs, ...

Can you elaborate, please? What does WMA do MDCT-wise that the other codecs don't?



Most codecs use 2 MDCT sizes: one for frames that need good time resolution, and one for frames that need better frequency resolution. For example, AAC-LC uses windows of 256 and 2048 samples (128 and 1024 frequency-domain coefficients). It also puts limits on the order they can be used in. WMA, however, has 5 transform sizes (and WMA Pro, which is very similar, uses 6), arranged in powers of 2 increasing to 2048. It also doesn't put strict limits on the order they can be in, presumably to maximize efficiency by picking the optimal time/frequency tradeoff for every few hundred to few thousand samples. No other codec that I've seen does this, although different AAC profiles use different size transforms, and Vorbis allows any 2 window sizes to be chosen by the encoder.
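As a small illustration of what "5 sizes in powers of 2 increasing to 2048" works out to, the list can be generated like this. The function name and defaults are only for this example and are not taken from any decoder.

```python
def transform_sizes(largest=2048, count=5):
    """Power-of-two transform sizes ending at `largest`, smallest first."""
    return [largest >> shift for shift in range(count - 1, -1, -1)]


sizes = transform_sizes()
print(sizes)  # [128, 256, 512, 1024, 2048]

# Each step up doubles the frequency resolution and halves the time resolution.
for small, big in zip(sizes, sizes[1:]):
    assert big == 2 * small
```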

I'm not sure why only MS does this. There's no computational cost in the decoder, and their encoders do take advantage of the extra window sizes very frequently in ordinary files. It's possible it just doesn't help that much in practice, or that they've patented it.

edit: can't spell

This post has been edited by saratoga: Jul 11 2010, 18:09
AshenTech
post Aug 5 2010, 01:57
Post #5





Group: Members
Posts: 78
Joined: 11-November 08
Member No.: 62144



Would a lower bitrate make Vorbis even faster to decode?

Asking because most of my audio books are at q0, and my music is between q2 and q4 (a few are higher due to very noticeable artifacts, but most are in the 2-4 range, with 2-3 being more common).

Northpack
post Aug 5 2010, 07:23
Post #6





Group: Members
Posts: 455
Joined: 16-December 01
Member No.: 664



QUOTE (edwardar @ Jul 11 2010, 08:48) *
Wow, that's blown all my preconceptions out of the water!!

I've been using Lame V5 on my clip+ (rockboxed), because I thought it maximised battery life. Now I'm ready to chuck that in and move to ogg. I much prefer the artifacts introduced by ogg at low bitrate than those created by mp3.

Correct me if I'm wrong, but AFAIK Rockbox will run in boosted mode (240MHz) on the Clip+ anyway, because changing clock speed introduced all sorts of crashes. So for now it shouldn't make any difference which codec you use...
edwardar
post Aug 5 2010, 12:42
Post #7








QUOTE (Northpack @ Aug 5 2010, 07:23) *
Correct me if I'm wrong, but AFAIK Rockbox will run in boosted mode (240MHz) on the Clip+ anyway, because changing clock speed introduced all sorts of crashes. So for now it shouldn't make any difference which codec you use...

I did read about these problems, but I assumed they had been fixed because people are now getting 15+ hours out of their clip+ (more than the original firmware). However, I never read that this actually had been fixed... I'll look into it.

Either way, knowing that ogg will not use more power than mp3 is enough for me to switch to ogg, considering the better quality at low bitrates.
saratoga
post Aug 5 2010, 18:40
Post #8








Updated with new results. Mohamed Tarek has been busy working on WMA Pro and also found time to spot a nasty little oversight in my WMA code that's been slowing it down 3MHz or so all these years. WMA Pro is also now nearly as fast as the other codecs, but much room for improvement remains. Andree Buschmann has also been busy with WMA Pro and AAC.

QUOTE
I did read about these problems, but I assumed they had been fixed because people are now getting 15+ hours out of their clip+ (more than the original firmware). However, I never read that this actually had been fixed... I'll look into it.


It's still disabled on some Sansas, but I think there are still power savings since the CPU can idle more on faster codecs.

This post has been edited by saratoga: Aug 5 2010, 18:47
saratoga
post Aug 5 2010, 18:44
Post #9








Hmm, seems I can't edit the first post, so here's the updated table:

CODE
128k (give or take for AC3) lossy files:

wma_128.wma           16.37MHz
a52_stereo_192.ac3    17.29MHz
mpc_128.mpc           18.01MHz
vorbis_128.ogg        20.16MHz
wmapro_141k.wma       22.13MHz
nero_128.m4a          22.35MHz
lame_128.mp3          26.35MHz
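
For a quick check of what changed, the two entries that differ from the first table can be subtracted directly. This is just arithmetic on the posted numbers; the dictionary and function names are made up for the example.

```python
# Figures in MHz from the first table and from the updated table.
BEFORE = {"wma_128.wma": 19.26, "nero_128.m4a": 24.26}
AFTER = {"wma_128.wma": 16.37, "nero_128.m4a": 22.35}


def savings(before, after):
    """MHz saved per file between the two runs, rounded to 2 decimals."""
    return {name: round(before[name] - after[name], 2) for name in before}


for name, saved in savings(BEFORE, AFTER).items():
    print(f"{name:<16} {saved:.2f}MHz faster")
```

The WMA difference lines up with the "3MHz or so" mentioned in the previous post.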
AshenTech
post Aug 7 2010, 20:21
Post #10








Could we get some Speex numbers? I'm just wondering if Speex is faster or slower than Vorbis :)