Best Quality for Lossy Encoding of Audiobooks?
encoder
post Nov 27 2012, 15:26
Post #1





Group: Members
Posts: 51
Joined: 27-November 12
Member No.: 104797



Obviously I don't want FLAC, but I also don't want it to be (much) distinguishable from the original. Storage space is not really an issue nowadays. What do you use? MP3 or OGG file format? Which settings? Which software? I have fre:ac. I would prefer an easy-to-use Windows solution, no command lines.

By the way, what is the best audio grabber program nowadays? Is it still EAC? I'm really out of this game lately. :)

Oh, and how to create separate audio tracks for gapless playback on any basic player? I just see your Wiki... There used to be a command line solution to cut the audio files at the block's ends so every simple player can play 'em back gaplessly. It worked for MP3. Does it work for OGG?

Thanks!

This post has been edited by encoder: Nov 27 2012, 15:33
eahm
post Nov 27 2012, 15:45
Post #2





Group: Members
Posts: 1056
Joined: 11-February 12
Member No.: 97076



You can hear a difference if you convert at 64 kbps, even with AAC. I like 128 kbps AAC audiobooks, but fre:ac uses FAAC, which is outdated; use foobar2000 + qaac to create them. I assume OGG (aoTuV) at 128 kbps is also great. MP3 I don't really know; try to ABX 128-160+ kbps and let us know.


Dynamic
post Nov 27 2012, 18:06
Post #3





Group: Members
Posts: 803
Joined: 17-September 06
Member No.: 35307



QUOTE (encoder @ Nov 27 2012, 14:26) *
By the way what is the best audio grabber program nowadays? Is it still EAC? I'm really out of this game lately. :)

Oh, and how to create separate audio tracks for gapless playback on any basic player? I just see your Wiki... There used to be a command line solution to cut the audio files at the block's ends so every simple player can play 'em back gaplessly. It worked for MP3. Does it work for OGG?


If by 'grabber' you mean CD ripping, EAC is still great, CUETools' CUERipper is equally good and perhaps easier to set up (that's what I use mostly, usually in Burst mode when the disc is found in CTDB and modest ripping errors can be corrected without re-ripping). Also excellent is dBpowerAmp. Its PerfectMeta could be a great time saver on a large ripping process. Foobar2000's ripper is pretty good too though I'm not as familiar with the techniques used. I've used them all from time to time. (CUETools and dBpowerAmp Converter can also handle a lot of file format conversion tasks, as can foobar2000).

Ogg Vorbis is intrinsically gapless (as is its new successor, Opus). Only a few external players support gapless MP3 via Lame's Accurate Length tags. The old --nogap solution had its problems and is deprecated in favour of the Lame Accurate Length information.

Most of the very basic players close the stream and re-open thus failing to preserve gapless playback regardless of how you encode. However, some excellent low cost players (such as Sandisk Clip) support Rockbox, which offers gapless playback.
DonP
post Nov 28 2012, 03:35
Post #4





Group: Members (Donating)
Posts: 1471
Joined: 11-February 03
From: Vermont
Member No.: 4955



Most of the books on my portable I have in speex. With many running 6 hours I'll give up a little transparency to save space on a flash player. Even if I want "good" sound I won't go over q0 with vorbis. If they are done as "radio drama" (music, stereo staging, sound effects) rather than just some guy reading the book out loud I might go more.
jensend
post Nov 28 2012, 05:34
Post #5





Group: Members
Posts: 143
Joined: 21-May 05
Member No.: 22191



As far as the quality-bitrate tradeoff goes, Opus (homepage, FAQ) is easily the best codec available for audiobook use. Already at 24kbps it's quite close to the original. (That's 97 hours of audiobooks per GB of storage, vs 1 hour 40 minutes for CD or 18 hours for the 128kbps AAC eahm was suggesting.)
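Those storage figures can be checked with a little arithmetic. This is only a rough sketch: it assumes "1 GB" means 2^30 bytes, a constant bitrate, and no container overhead, so it comes out a little above the 97 hours quoted. The function name is made up for illustration.

```python
GIB_BITS = 8 * 2**30  # assumption: "1 GB" here means 2^30 bytes


def hours_per_gib(bitrate_kbps: float) -> float:
    """Hours of audio that fit in 1 GiB at a constant bitrate (kbit/s)."""
    seconds = GIB_BITS / (bitrate_kbps * 1000)
    return seconds / 3600


for label, kbps in [("CD audio (1411.2 kbps)", 1411.2),
                    ("AAC 128 kbps", 128),
                    ("Opus 24 kbps", 24)]:
    print(f"{label}: {hours_per_gib(kbps):.1f} hours")
```

Under those assumptions this gives about 1.7 hours for CD audio, 18.6 hours for 128 kbps and 99.4 hours for 24 kbps.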

Codecs without any speech coding technology, like MP3, AAC, or Vorbis, tend to require about double the bitrate to get comparable results.

Almost all other codecs with modern speech coders are patented, and getting an encoder requires paying a royalty. On top of that, those codecs will have trouble on any non-speech content (music, effects) in audiobooks. Opus is the first publicly available codec to combine state-of-the-art speech and general audio coding technologies, and of course it's royalty-free.

The only drawback is that since it's a brand new format - just standardized in September - software and devices are just starting to add support. Software-side, Opus playback is supported by VLC, Foobar2000, and Firefox. Device-side, Rockbox has added support in their development builds. More applications and devices will be adding support by the end of the year.

The command line isn't as scary as you think. If the command line really is a problem you can use Foobar2000 to encode to Opus instead.
encoder
post Nov 28 2012, 13:53
Post #6





Group: Members
Posts: 51
Joined: 27-November 12
Member No.: 104797



Thanks for the info! This Opus thing looks interesting. It's 1.01, is it already the best (for music as well)? Will my 1st gen. nonRockboxed Sansa clip play it if I somehow "make it" to an OGG? I just didn't bother to Rockbox it. Default is king, ain't it?

Most important: what bitrate to use for Opus for audiobook (and music)? Let's say I am used to 256-320k MP3s.

As for ripping CDs: I have all the time in the world and I prefer the bit accurate method.

Will the gapless playback work on Opus as well? Where can I read more about this newer method of gapless? Google didn't help.
marc2003
post Nov 28 2012, 14:53
Post #7





Group: Members
Posts: 4443
Joined: 27-January 05
From: England
Member No.: 19379



QUOTE
Default is king, ain't it?


no. :D

the sansa clip firmware won't play opus files and it can't do gapless either regardless of format.

rockbox does perfect gapless and plays opus files. there's simply no reason not to rockbox your player. i own both a clip and clip+ and couldn't live without rockbox - mainly for the gapless support. i don't know how anyone can put up with a player that doesn't do it.
DonP
post Nov 28 2012, 16:04
Post #8





Group: Members (Donating)
Posts: 1471
Joined: 11-February 03
From: Vermont
Member No.: 4955



QUOTE (jensend @ Nov 28 2012, 00:34) *
As far as the quality-bitrate tradeoff goes, Opus (homepage, FAQ) is easily the best codec available for audiobook use. Already at 24kbps it's quite close to the original. (That's 97 hours of audiobooks per GB of storage, vs 1 hour 40 minutes for CD or 18 hours for the 128kbps AAC eahm was suggesting.)

.....
The command line isn't as scary as you think. If the command line really is a problem you can use Foobar2000 to encode to Opus instead.


OK I downloaded the stuff (Opus, new foobar, dev rockbox). Convert in foobar is set with a command line (--bitrate 10 %s %d)
Can it use standard input?

I converted a spoken word CD and it sounds pretty good at 10 kb/s (5 MB for the whole thing). There's just a little bit of music in the intro and outro. It's not great on that, but at least it doesn't make me cringe like some speech-specific coders. So this is my new format for speech. I'll try some music too, but there my concern is more about getting it to work on multiple players and figuring out my transparency point.

Seren
post Nov 28 2012, 16:26
Post #9





Group: Members
Posts: 52
Joined: 1-November 12
Member No.: 104244



10 kb/s for music? :o
I wonder when this becomes the norm... probably when we have 100-petabyte HDDs for $50, but oh well...
By the way, if you're going that low, you might want to see how it sounds in mono; 16 kb/s mono didn't hurt my ears, whereas 16 kb/s stereo would.
But if you're used to such high-bitrate MP3s, why not just start out at 64 kb/s, which is a tad overkill for voice with Opus but sounds pretty decent for music as well.
eahm
post Nov 28 2012, 16:37
Post #10





Group: Members
Posts: 1056
Joined: 11-February 12
Member No.: 97076



10kb/s so 1.25KB/s?
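As a quick sanity check of that conversion, and of the "5 MB for the whole thing" figure, here is a small sketch. It assumes 1 kb = 1000 bits and 1 MB = 10^6 bytes, and both function names are invented for the example.

```python
def kbps_to_kilobytes_per_second(kbps: float) -> float:
    """Convert kilobits per second to kilobytes per second (8 bits per byte)."""
    return kbps / 8


def playing_time_minutes(size_mb: float, kbps: float) -> float:
    """Playing time of a file of size_mb megabytes at an average of kbps kbit/s."""
    bits = size_mb * 1_000_000 * 8
    return bits / (kbps * 1000) / 60


print(kbps_to_kilobytes_per_second(10))        # kB/s at 10 kb/s
print(round(playing_time_minutes(5, 10), 1))   # minutes in a 5 MB file at 10 kb/s
```

So 10 kb/s is indeed 1.25 kB/s, and a 5 MB file at that rate holds a little over an hour of audio, which is plausible for a single spoken word CD.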

For my previous post, of course you don't have to go 128kbps with AAC, I don't listen to audiobooks that often, 128kbps are the ones I found (from Audible etc.) that sound more like they should (good tone, good voice, good microphone?). 64kbps AAC is probably more than fine, I don't like 64kbps MP3 though.

This post has been edited by eahm: Nov 28 2012, 16:39


jensend
post Nov 28 2012, 18:23
Post #11





Group: Members
Posts: 143
Joined: 21-May 05
Member No.: 22191



encoder: Yes, Opus is also somewhat better than the competition (AAC/Vorbis) for music, though not by as large a margin.

Being "used to 256-320k MP3s" doesn't give us enough information to tell you where your optimal point on the bitrate vs quality curve is. If you've been using a modern high-quality MP3 encoder like recent versions of LAME, did a whole lot of blind listening tests, and decided you really needed to encode MP3s at >256kbps even when you're using a space-constrained portable player like the original Clip, that would mean you're more sensitive to coding artifacts than just about anyone on the planet (or that there are only one dozen CDs you will ever want to listen to). The normal recommendation for MP3 stereo music using LAME is -V2 (~190kbps) for very sensitive listening without storage constraints and -V4 (~165 kbps) or lower for portable use.

For Opus, which is considerably better than MP3, I'd recommend you start by encoding one audiobook at 24kbps and some stereo music at 96kbps, do a little listening test (possibly using Foobar2000's ABX tool), and use that information to make a decision on how to encode your whole collection. Depending on your tastes you might want to go as high as 32kbps for audiobooks and 128kbps for music, but I don't think you'll want to go above that for a portable player.

DonP, I think you'll find that increasing the bitrate from 10kbps to 12kbps, which gives mediumband (6kHz) rather than narrowband (4kHz) audio bandwidth, will make your audiobooks sound considerably better without much of a bitrate change. Since most of the energy in sibilants (s-sounds etc) is around 4-6kHz, and since there's energy in that range even for vowels, the quality difference between narrowband and mediumband speech is quite large, probably just as large as the quality difference between mediumband and the superwideband you get at 22kbps.
Dynamic
post Nov 28 2012, 19:13
Post #12





Group: Members
Posts: 803
Joined: 17-September 06
Member No.: 35307



I agree that at its speech-codec settings (when it uses SILK Linear Prediction mode), Opus is far less awful for music than most speech codecs (certainly most of the CELP and GSM types).

The Opus scalable bitrate demo is a good first take on what it sounds like with music with plenty of sparkly percussion and shows where it typically switches bandwidth and stereo mode as it sweeps from 8kbps to 64 kbps. My suggestion for great quality speech and decent music would be 32kbps, so I agree with jensend.
DonP
post Nov 28 2012, 19:16
Post #13





Group: Members (Donating)
Posts: 1471
Joined: 11-February 03
From: Vermont
Member No.: 4955



QUOTE (jensend @ Nov 28 2012, 13:23) *
encoder: Yes, Opus is also somewhat better than the competition (AAC/Vorbis) for music, though not by as large a margin.

DonP, I think you'll find that increasing the bitrate from 10kbps to 12kbps, which gives mediumband (6kHz) rather than narrowband (4kHz) audio bandwidth, will make your audiobooks sound considerably better without much of a bitrate change. Since most of the energy in sibilants (s-sounds etc) is around 4-6kHz, and since there's energy in that range even for vowels, the quality difference between narrowband and mediumband speech is quite large, probably just as large as the quality difference between mediumband and the superwideband you get at 22kbps.


I'll give that a shot. 10 really sounds ok to me though.. probably just low subjective requirements for plain speech.

I encoded some acoustic guitar music at 64kb/s and it just sounded wrong.. a little mushy. ABX was 10/10 and I could be pretty sure which was which after listening to only one of X OR Y and not bothering with A and B. I could also ABX that track 100% with vorbis q=0 (64kb), but with considerably more effort. 150 kb/s Opus is so far transparent to me.
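For anyone wondering how convincing a 10/10 ABX score is, the chance of reaching a given score by pure guessing follows a binomial distribution. A minimal sketch, assuming each trial is an independent 50/50 guess; the function name is just for illustration.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """Probability of getting at least `correct` answers right in `trials`
    ABX trials when guessing at random (p = 0.5 per trial)."""
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2**trials


print(abx_p_value(10, 10))  # 10 out of 10
print(abx_p_value(8, 10))   # 8 or more out of 10
```

10/10 corresponds to 1 in 1024 (about 0.1%), while 8/10 or better would happen by chance roughly 5.5% of the time.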

IgorC
post Nov 28 2012, 19:27
Post #14





Group: Members
Posts: 1553
Joined: 3-January 05
From: ARG/RUS
Member No.: 18803



QUOTE (DonP @ Nov 28 2012, 15:16) *
I encoded some acoustic guitar music at 64kb/s and it just sounded wrong.. a little mushy.

1.0.1 is actually restricted VBR. Tonality (guitar as well) is an issue for 1.0.1
Maybe you will want to try the latest experimental branch, which is pretty good (especially for tonality) at this point.

This post has been edited by IgorC: Nov 28 2012, 19:29
eahm
post Nov 28 2012, 19:36
Post #15





Group: Members
Posts: 1056
Joined: 11-February 12
Member No.: 97076



Just tested AAC Apple True VBR Q18 (~75kbps) (the average of the full audiobook was 43kbps) and I couldn't distinguish it from the original; you can go much lower than expected with speech.

This post has been edited by eahm: Nov 28 2012, 20:04


jensend
post Nov 28 2012, 19:50
Post #16





Group: Members
Posts: 143
Joined: 21-May 05
Member No.: 22191



A brief bit about Opus's ability to deal with mixed content: if you have an audiobook with quite a bit of music content, then for the time being, to get the full benefit of Opus's ability to code both speech and music, you need to be using an encoder newer than the one currently offered at opus-codec.org (for instance, this one) and you need to be encoding at 30kbps or higher.

Here's why, in case you're interested in the details. Music-oriented lossy codecs use the MDCT. To enable them to encode quality speech at lower bitrates than MDCT codecs can, speech coders use some variant of linear prediction. (I'll abbreviate that as LP- remember it has nothing to do with vinyl).

Opus has three modes: an LP mode (with bandwidths of either 4, 6, or 8kHz), a MDCT mode (with 4, 8, 12, or 20kHz bandwidth) and a hybrid mode (12 or 20kHz bandwidth). In hybrid mode, the lower frequencies with most of the speech energy (up to 8kHz) are done with LP while the higher frequencies are done with MDCT. Higher bandwidths, as well as using the MDCT mode, need more bits to code well.

The version currently on the website chooses the mode and bandwidth based on the bitrate and allows you to influence that choice a little by using a command line switch to tell it to expect either speech or music. It'll use just one mode and bandwidth for the entire file.

Newer versions remove that switch, instead detecting the type of content automatically, and will switch modes and bandwidths (seamlessly, of course) in the middle of a file if the content changes. At 20kbps and below, only LP modes are used. For 20-30kbps it'll use hybrid modes. For 30-42kbps it will use the MDCT mode for music and the hybrid mode for speech, switching back and forth based on the content. Above 42kbps there's no longer any benefit to using hybrid mode for speech so it'll just use MDCT all the time.
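To make those thresholds easier to see at a glance, here they are as a small lookup. This is only a restatement of the description above, not how libopus actually decides: the real encoder's logic is more involved, the handling of the exact boundary values (20, 30, 42) is my assumption, and `likely_mode` is a made-up name.

```python
# Audio bandwidths (kHz) available in each mode, as listed above.
MODE_BANDWIDTHS_KHZ = {
    "LP": (4, 6, 8),
    "MDCT": (4, 8, 12, 20),
    "hybrid": (12, 20),
}


def likely_mode(bitrate_kbps: float, content: str) -> str:
    """Mode the newer encoders would roughly pick, per the description above.

    content is "speech" or "music"; boundary handling is an assumption.
    """
    if content not in ("speech", "music"):
        raise ValueError("content must be 'speech' or 'music'")
    if bitrate_kbps <= 20:
        return "LP"          # only LP modes at 20 kbps and below
    if bitrate_kbps < 30:
        return "hybrid"      # 20-30 kbps: hybrid
    if bitrate_kbps <= 42:
        # 30-42 kbps: switches with the content
        return "MDCT" if content == "music" else "hybrid"
    return "MDCT"            # above 42 kbps: MDCT all the time


for kbps in (12, 24, 32, 64):
    print(kbps, likely_mode(kbps, "speech"), likely_mode(kbps, "music"))
```

The 30-42 kbps row is the one that matters for audiobooks with music: it is the only range where speech and music end up in different modes.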
DonP
post Nov 28 2012, 20:32
Post #17





Group: Members (Donating)
Posts: 1471
Joined: 11-February 03
From: Vermont
Member No.: 4955



QUOTE (IgorC @ Nov 28 2012, 14:27) *
1.0.1 is actually restricted VBR. Tonality (guitar as well) is an issue for 1.0.1
Maybe you will want to try the latest experimental branch, which is pretty good (especially for tonality) at this point.


Is that "opus-tools_exp_tfself.zip"? That certainly loosens the reins on the VBR. Total size for that file is about 25% bigger for the same settings, and foobar shows a lot more change in bitrate from frame to frame.
DonP
post Nov 30 2012, 19:21
Post #18





Group: Members (Donating)
Posts: 1471
Joined: 11-February 03
From: Vermont
Member No.: 4955



For speech, 12 kb/s Opus is working fine on the portable player (Sansa e200 with Rockbox), but music, not so good. 64 kb/s plays, but the controls and display become sluggish (e.g. the new track doesn't display for 20 seconds or so after the first one ends). 128 kb/s hangs up the controls completely; I have to do a hard power-off (hold the power button for a while) to get out of it. I hope there's more efficiency to come in the Rockbox decoder. I do accept that it is part of a development build, not a "stable release".


This post has been edited by DonP: Nov 30 2012, 19:27
jensend
post Dec 1 2012, 01:31
Post #19





Group: Members
Posts: 143
Joined: 21-May 05
Member No.: 22191



QUOTE (DonP @ Nov 30 2012, 11:21) *
I hope there's more efficiency to come in the Rockbox decoder. I do accept that it is part of a development build, not a "stable release".

There's plenty more efficiency to be had in the decoder. The initial Rockbox Opus work basically just got everything working without doing any optimization. That was sufficient to get better than realtime playback on a lot of the most popular devices, including Sandisk's AS3525v2-based players: the revised Fuze, the revised version of the original Clip (i.e. the ones with 2.xx.xx firmware versions, which were most of the units sold), the Clip+, and the Clip Zip. A good beginning, but just the beginning.

The revised c200 and e200 as well as the original Fuze and Clip use the original AS3525 system-on-a-chip. The CPU difference is small, but the orig. AS3525 has only 1/4 as much RAM, and initially it ran into stack space issues with MDCT Opus modes. A lot of work has been done since then to reduce the stack space required for Opus, and with those improvements these players should be fine.

The original e200 uses a rather different chipset, the pp5024, which looks like it's a fair bit slower but has plenty of RAM. Back at the beginning of October n1s, who's one of the guys working on Rockbox Opus optimizations, said on irc that he was getting faster-than-realtime 64kbps decoding on that chip, but not quite fast enough to leave sufficient CPU for the user interface. That sounds like what you were experiencing. The bit of optimization that's been done since then should be plenty enough to make 64kbps playback smooth and responsive, but 128kbps and up will need more work.

There's tons more optimization work that can be done*, and much of what has been done in the past 6 weeks hasn't made it to Rockbox's mainline development builds yet. If you'd like to know more, or if you'd like to see whether you can be of help with testing, ask around on the rockbox irc channel, forums, or mailing list. (Quite often the people best equipped to answer your question aren't in IRC at the moment, so it may take a while to get an answer, but when they are around it's probably the most convenient method for communicating with them.)

*As one example, getting other transform codecs to work as well as they do in Rockbox required a good bit of device-specific FFT/MDCT optimization, but the code they wrote for that only supports power-of-two sizes, and Opus uses non-power-of-2 FFTs. So right now whenever you ask Rockbox to decode hybrid or MDCT mode Opus, it's using generic code from the mainline libopus for the transform. The libopus code is a good algorithm, but for these kinds of things the difference between good generic code and well-tuned device specific code can be huge.
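To illustrate the power-of-two point: a 20 ms Opus frame at 48 kHz is 960 samples, which has factors of 3 and 5 in it, whereas the block sizes of the older transform codecs are pure powers of two. The sketch below only shows that arithmetic. The exact FFT sizes libopus uses internally are not stated in this thread, so 960 is used purely as an illustrative frame length, and the helper names are mine.

```python
def is_power_of_two(n: int) -> bool:
    """True if n is 1, 2, 4, 8, ... (exactly one bit set)."""
    return n > 0 and n & (n - 1) == 0


def prime_factors(n: int) -> list:
    """Prime factorisation by trial division, smallest factors first."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors


opus_frame = 48_000 * 20 // 1000   # samples in a 20 ms frame at 48 kHz
for size in (1024, opus_frame):
    print(size, is_power_of_two(size), prime_factors(size))
```

A radix-2 FFT routine can handle 1024 but not 960, which is why the existing hand-tuned Rockbox code can't simply be reused.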
eahm
post Dec 5 2012, 00:07
Post #20





Group: Members
Posts: 1056
Joined: 11-February 12
Member No.: 97076



QUOTE (eahm @ Nov 28 2012, 11:36) *
Just tested AAC Apple True VBR Q18 (~75kbps) (the average of the full audiobook was 43kbps) and I couldn't distinguish it from the original; you can go much lower than expected with speech.

My test on one audiobook:

FLAC: 2.21GB

AAC-LC True VBR (qaac/Apple) -V18 (~51kbps): 240MB

HE-AAC (qaac/Apple) -v32 --he (~33kbps): 151MB

HE-AAC (fhgaacenc/Fraunhofer) --vbr 1 (~31kbps): 145MB

Opus (0.1.5 from opus-codec.org) --vbr 32 (~34kbps): 152MB

MP3 (LAME 3.99.5) -V 7 (~77kbps): 351MB
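A quick way to check that these numbers hang together is to work out the playing time each size/bitrate pair implies; they should all agree, since it's the same book. A rough sketch, assuming "MB" means 2^20 bytes and ignoring container overhead and the rounding in the reported averages; the labels and the function name are only for illustration.

```python
# (label, size in MB, reported average bitrate in kbps), from the list above
RESULTS = [
    ("AAC-LC (qaac -V18)", 240, 51),
    ("HE-AAC (qaac)", 151, 33),
    ("HE-AAC (fhgaacenc)", 145, 31),
    ("Opus", 152, 34),
    ("MP3 (LAME -V 7)", 351, 77),
]


def implied_hours(size_mb: float, kbps: float) -> float:
    """Playing time implied by a file size and an average bitrate."""
    bits = size_mb * 2**20 * 8   # assumption: 1 MB = 2^20 bytes
    return bits / (kbps * 1000) / 3600


for label, size_mb, kbps in RESULTS:
    print(f"{label}: {implied_hours(size_mb, kbps):.1f} h")
```

Every row lands between roughly 10.4 and 11 hours, so the sizes and bitrates are consistent with one audiobook of about that length.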

This post has been edited by eahm: Dec 5 2012, 00:27


IgorC
post Dec 5 2012, 03:04
Post #21





Group: Members
Posts: 1553
Joined: 3-January 05
From: ARG/RUS
Member No.: 18803



QUOTE (eahm @ Dec 4 2012, 20:07) *
Opus (0.1.5 from opus-codec.org) --vbr 32 (~34kbps): 152MB


Post #16

QUOTE (jensend @ Nov 28 2012, 15:50) *
A brief bit about Opus's ability to deal with mixed content: if you have an audiobook with quite a bit of music content, then for the time being, to get the full benefit of Opus's ability to code both speech and music, you need to be using an encoder newer than the one currently offered at opus-codec.org (for instance, this one) and you need to be encoding at 30kbps or higher.

Or this one

This post has been edited by IgorC: Dec 5 2012, 03:05
eahm
post Dec 5 2012, 06:39
Post #22





Group: Members
Posts: 1056
Joined: 11-February 12
Member No.: 97076



Thanks IgorC.

AAC-LC True VBR (qaac/Apple) -V9 (~45kbps): 211MB

AAC-LC True VBR (qaac/Apple) -V0 (~39kbps): 183MB

Opus (opusenc.exe from opus_tools_2012_11_15_sse.zip + DLLs from opusfile-0.2-win32.zip from opus-codec.org) --vbr 32 (~34kbps): 152MB

Opus (opus_v1.0.1_154_g07418d9.zip) --vbr 32 (~34kbps): 152MB

MP3 (LAME 3.99.5) -V 8 (~69kbps): 313MB

MP3 (LAME 3.99.5) -V 9 (~52kbps): 234MB


Not that everyone cares about every single codec but... I just like to test when I have some free time. I would use one of the two HE-AAC, cars with AAC capability will play them.

This post has been edited by eahm: Dec 5 2012, 07:26


IgorC
post Dec 5 2012, 15:01
Post #23





Group: Members
Posts: 1553
Joined: 3-January 05
From: ARG/RUS
Member No.: 18803



It is recommended to use ABR instead of VBR for encoding with LAME at 100 kbps and lower.
http://wiki.hydrogenaudio.org/index.php?title=LAME
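In command-line terms that means `--abr <kbps>` rather than `-V <n>` for low targets. Here is a small sketch of how one might build the LAME argument list from a script. `--abr` and `-V` are real LAME options, but the helper itself, its name, and the hard 100 kbps cutoff are just my way of encoding the rule of thumb above. It only builds the list and does not run LAME.

```python
ABR_LIMIT_KBPS = 100  # rule of thumb from the wiki: ABR at 100 kbps and lower


def lame_args(src, dst, *, abr_kbps=None, vbr_quality=None):
    """Build a LAME command line; give exactly one of abr_kbps or vbr_quality."""
    if (abr_kbps is None) == (vbr_quality is None):
        raise ValueError("give exactly one of abr_kbps or vbr_quality")
    if abr_kbps is not None:
        if abr_kbps > ABR_LIMIT_KBPS:
            raise ValueError("above ~100 kbps the usual advice is VBR (-V)")
        return ["lame", "--abr", str(abr_kbps), src, dst]
    return ["lame", "-V", str(vbr_quality), src, dst]


print(lame_args("book.wav", "book.mp3", abr_kbps=64))
print(lame_args("album.wav", "album.mp3", vbr_quality=2))
```

The resulting list can be passed to `subprocess.run` if `lame` is on the PATH.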
eahm
post Dec 5 2012, 18:29
Post #24





Group: Members
Posts: 1056
Joined: 11-February 12
Member No.: 97076



I don't test or keep track of MP3 too much, I'd actually like it to disappear and be replaced by a newer codec like AAC. Thanks for the link though, I've read it once but I didn't remember about the lower bitrate setting.

MP3 (LAME 3.99.5) --abr 96 (~97kbps): 442MB

MP3 (LAME 3.99.5) --abr 64 (~63kbps): 285MB

This post has been edited by eahm: Dec 5 2012, 18:31

