caudec: a multiprocess audio converter for Linux and OS X, Leverages multi-core CPUs with lots of RAM
skamp
post Feb 15 2012, 19:40
Post #1

Group: Developer
Posts: 1413
Joined: 4-May 04
From: France
Member No.: 13875

caudec is a command-line utility for GNU/Linux and OS X that transcodes (converts) audio files from one format (codec) to another. It leverages multi-core CPUs with lots of RAM by using a ramdisk and running multiple processes concurrently (one per file and per codec). It is Free Software, licensed under the GNU General Public License (version 3). The APEv2 tagger bundled with versions 1.7.1 and later is licensed under the Mozilla Public License, version 2.

  • Supported input codecs: WAV, AIFF, CAF, FLAC, WavPack, Monkey's Audio, TAK, Apple Lossless.
  • Supported output codecs: WAV, AIFF, CAF, FLAC, Flake, WavPack, Monkey's Audio, TAK, Apple Lossless, lossyWAV, LAME, Ogg Vorbis, Nero AAC, qaac, Musepack, Opus.
  • Support for high quality resampling and downmixing / upmixing to stereo, with SoX.
  • Optimized I/O: input files are copied onto a tmpfs mount sequentially, so as to get the best performance out of the underlying medium (e.g. a hard drive). Transcoding however is done concurrently. Example: file 1 gets copied. When that's done, transcoding of file 1 starts. Meanwhile, file 2 gets copied, etc… Very little time is lost reading the files.
  • Transcoding to several different codecs at once is possible. In that case, decoding of input files is done only once.
  • Multiple instances of caudec can be run concurrently while sharing resources.
  • Metadata is preserved (as much as possible) from one codec to another.
  • Multiprocess Replaygain scanner (except for Opus and Musepack).
  • Uses existing, popular command line encoders/decoders.
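The copy-then-encode pipelining described in the I/O bullet can be sketched in bash. This is a toy illustration of the scheduling idea only, not caudec's actual code: `encode` is a stand-in (gzip) for a real encoder, and a temporary directory stands in for the tmpfs mount.

```shell
#!/bin/bash
# Toy sketch: copy inputs to a scratch dir one at a time (sequential reads),
# while encoding already-copied files concurrently in the background.
set -e

SCRATCH=$(mktemp -d)   # stands in for a tmpfs mount
MAXJOBS=4              # typically one process per core

encode() {             # placeholder for flac/oggenc/etc.
    gzip -f "$1"
}

mkdir -p src
for i in 1 2 3; do printf 'pcm data %s\n' "$i" > "src/track$i.wav"; done

for f in src/*.wav; do
    cp "$f" "$SCRATCH/"                    # sequential: one copy at a time
    while [ "$(jobs -rp | wc -l)" -ge "$MAXJOBS" ]; do
        wait -n                            # a slot freed up
    done
    encode "$SCRATCH/$(basename "$f")" &   # concurrent: encode in background
done
wait                                       # let the last encoders finish
ls "$SCRATCH"
```

The key point is that `cp` runs in the foreground, so the source medium only ever serves one sequential read at a time, while any number of CPU-bound encoders run behind it.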


Tested under Arch Linux and OS X. Download here. Please use the bug tracker to report any bugs. Feedback is most welcome!

This post has been edited by skamp: Oct 21 2013, 18:31


--------------------
See my profile for measurements, tools and recommendations.
skamp
post Feb 15 2012, 22:45
Post #2

I just released version 1.1.0, which adds support for Musepack.


Dario
post Feb 16 2012, 00:34
Post #3

Group: Members
Posts: 158
Joined: 20-September 11
Member No.: 93842

Excuse my ignorance, but does TAK actually work under Linux?

This post has been edited by Dario: Feb 16 2012, 00:34
skamp
post Feb 16 2012, 00:38
Post #4

The encoder/decoder (Takc.exe) works with Wine. Linux users can use it for archiving, while transcoding to some other codec (e.g. lossy) for listening purposes. caudec supports TAK encoding and decoding if the user has installed both Wine and TAK.

This post has been edited by skamp: Feb 16 2012, 00:38


skamp
post Feb 17 2012, 08:21
Post #5

It just occurred to me that I left out one of caudec's main selling points: it's fast. It sounds obvious to me, but maybe it isn't so much. I was never a sales person. It might also not be obvious that it works best on somewhat large sets of files (e.g. a whole album with one or two CDs, one file per track).

Encoding ABBA's 2CD The Definitive Collection (148 minutes, 37 tracks) from WAV to FLAC --best, with one process, on a Core i7 @ 2.2 GHz: 46x real time.
Same as above, with 8 processes: 186x

Just for kicks, FLAC -5 (default setting) with 8 processes encodes at 569x, TAK -p2 at 743x.


skamp
post Feb 23 2012, 01:35
Post #6

I just released version 1.3.0 of caudec, which:
  • adds support for WavPack lossy
  • adds support for resampling of stereo files
  • corrects a bug that increased disk space usage on tmpfs
  • improves prediction of required disk space on tmpfs
  • adds support for a CAUDECDIR environment variable for setting the temporary dir to your liking

Upgrading is highly recommended, if only for the bug fix. Please report any issues using the issue tracker.
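For example, with a tmpfs already mounted at /mnt/ramdisk (a hypothetical path), the new variable would be set like this before running caudec:

```shell
# Point caudec's temporary directory at an existing ramdisk (example path).
export CAUDECDIR=/mnt/ramdisk
echo "CAUDECDIR is $CAUDECDIR"
```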

This post has been edited by skamp: Feb 23 2012, 01:36


Takla
post Jun 2 2012, 00:31
Post #7

Group: Members
Posts: 169
Joined: 14-November 09
Member No.: 74931

Hi skamp. I tried your caudec script and it is definitely very fast. I tested it by transcoding an album of FLACs to Ogg (-q 7), and it shaved maybe 40% off the time taken by oggenc, by ffmpeg>wav>oggenc, or by straight ffmpeg -i $file -acodec libvorbis. As far as I can tell, all the speed benefit comes from parallel processing (I checked this by processing a single file and finding that in this case caudec is in fact slower than oggenc or a more typical bash script). So I'm wondering: what is the point of creating the tmpfs and doing so much copying? Is it just to facilitate dropping files in and out of a queue? I can't see the need for a memory-consuming structure on machines with large amounts of RAM, because transcoding is almost all CPU. I like your script's speed, but I wonder if the same thing couldn't be achieved more simply by using job control to run parallel encoder processes in bash, or maybe I missed something important?
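The job-control approach described above can be sketched like this. It's a generic illustration, not Takla's actual script: the real decode/encode pipeline (something like `flac -dcs "$f" | oggenc -q 7 -o out.ogg -`) is replaced by a gzip stand-in so the sketch is self-contained and runnable.

```shell
#!/bin/bash
# Parallel transcoding with plain bash job control: one background encoder
# per file, capped at (number of cores + 1) concurrent jobs. No ramdisk.
set -e
NPROC=$(( $(nproc) + 1 ))

transcode() {   # stand-in for a real decode | encode pipeline
    gzip -c "$1" > "${1%.flac}.ogg"
}

mkdir -p album
for i in 01 02 03 04; do echo "track $i" > "album/$i.flac"; done

for f in album/*.flac; do
    # Block until a job slot is free, then launch the next encoder.
    while [ "$(jobs -rp | wc -l)" -ge "$NPROC" ]; do wait -n; done
    transcode "$f" &
done
wait   # let the remaining encoders finish
ls album
```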
Canar
post Jun 2 2012, 00:40
Post #8

Group: Super Moderator
Posts: 3348
Joined: 26-July 02
From: princegeorge.ca
Member No.: 2796

While encoding is a parallel task, reading from a drive is intrinsically sequential. You can't double read speed by reading 2 files at once. In fact, you're likely to harm read speed. By queuing disk operations and running encoding purely in RAM, caudec cuts out the parallel read bottlenecks and runs the process as fast as possible.


--------------------
You cannot ABX the rustling of jimmies.
No mouse? No problem.
Takla
post Jun 2 2012, 08:08
Post #9

I can see the logic, but disk read speeds are very high these days. How can there be a bottleneck when reading 6 or 8 or 10 lossless files of maybe 20MB to 50MB each, which are going to take a few seconds to decode and encode anyway? Surely that doesn't present any kind of challenge to modern hardware?

I'd noticed that oggenc on XP was significantly faster than a gcc compiled oggenc binary in Linux so I was keen to try to make up the difference and Skamp's script prompted me to go back to my bash scripts and add some parallel processing. My scripts are simpler stuff: essentially decode+dump metadata function, encoder function, metadata writer function. By letting the core functions of the script run in parallel/background processes (number of cores +1) I can achieve about the same improvements, for example the directory I transcoded earlier, flacs to oggs:

my original bash script:
real 3m3.301s
user 3m8.952s
sys 0m3.496s

caudec:
real 1m47.993s
user 3m11.467s
sys 0m4.126s

my bash script with some parallel processes/backgrounding:
real 1m52.904s
user 3m10.826s
sys 0m3.877s

But I only have a 4-year-old dual-core AMD Athlon64 desktop, a 5-year-old Core Duo (32-bit only) and a similar vintage Core 2 Duo... no experience of an i7 here, so I can't personally scale my tests up to 4 cores and 8 threads. Has anyone with modern hardware (quad core, multi-GB RAM, SATA III etc.) actually measured the difference, and if so, is it substantial? At the moment I can see skamp's caudec page, which compares single-thread processing (and, I assume, a conventional read from HDD) with parallel processing from tmpfs. Obviously the parallelism makes a huge difference and perhaps that accounts for all or almost all of the difference, so what is missing is some data showing that the tmpfs is solving a problem or adding a benefit.

edited for typos.

This post has been edited by Takla: Jun 2 2012, 08:10
skamp
post Jun 2 2012, 10:49
Post #10

What Canar said. Hard drives don't like concurrent access, and you actually lose read speed (more than proportionally) as you increase the number of concurrent accesses. My laptop hard drive tops out at maybe 70 MB/s on a single access, but it's not like it gives me 17.5 MB/s per file when I'm accessing 4 files at once, it gives me less than that. Same thing with my USB3 HDD where my backup resides. I tested it a while ago so I don't have the exact figures anymore, but my observation was that single-access, sequential reading was needed.

I have a quad-core i7 with 8 threads and 8 GiB of RAM, so my objective was to get the highest transcoding speeds possible while leveraging the gear at my disposal. Copying input files to a tmpfs sequentially while transcoding them concurrently proved to be the most efficient way. The speed gains range from slight to significant, depending on the gear, the configuration (number of processes, etc…) and the set of files you're transcoding. E.g. reading 8 files at once can slow my hard drive to a crawl.
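For anyone wanting to reproduce the setup by hand, a tmpfs mount like the one caudec uses is created on Linux as follows (requires root; the path and size here are examples, and the size should exceed the decoded size of the batch):

```shell
# Create a 2 GiB RAM-backed scratch filesystem at /mnt/ramdisk.
sudo mkdir -p /mnt/ramdisk
sudo mount -t tmpfs -o size=2G tmpfs /mnt/ramdisk
```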

This post has been edited by skamp: Jun 2 2012, 10:53


lvqcl
post Jun 2 2012, 10:57
Post #11

Group: Developer
Posts: 3341
Joined: 2-December 07
Member No.: 49183

somewhat related: http://www.hydrogenaudio.org/forums/index....showtopic=94783
skamp
post Jun 2 2012, 11:22
Post #12

I dug up an old version (before 1.0) that didn't copy input files to a tmpfs. Here are the results when transcoding FLACs from my hard drive to Ogg Vorbis, with 8 processes, on a 2 CD album with 37 files (same external encoders):
  • old version: 71.41 seconds (15.0 MB/s) (124.3x)
  • latest caudec: 58.71 seconds (18.2 MB/s) (151.1x)

That's roughly a 21% speed increase. Maybe not quite as dramatic as one could hope, but substantial nonetheless.
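The figure checks out against the timings above:

```shell
# 71.41 s / 58.71 s: the tmpfs version finishes in ~82% of the old time,
# i.e. throughput is about 21.6% higher.
awk 'BEGIN { printf "speedup: +%.1f%%\n", (71.41/58.71 - 1) * 100 }'
```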

Obviously I dropped filesystem caches before each run.
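For reference, dropping the Linux page cache (so that each benchmark run really reads from the disk rather than from RAM) is done as root:

```shell
# Flush dirty pages to disk, then drop the page cache, dentries and inodes.
sync
echo 3 | sudo tee /proc/sys/vm/drop_caches
```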

This post has been edited by skamp: Jun 2 2012, 11:29


Takla
post Jun 2 2012, 11:48
Post #13

Thanks for the info. If I ever get an i7 I'll be keen to transcode this way. I've been trying out different numbers of parallel processes, and I've found that on my Athlon 64 I get maximum transcode speed by allowing 5 parallel processes instead of 3; this now performs at least as quickly as the tmpfs method (the time difference is <1%), though it's all snail-paced compared to your i7 figures: where you get 124x, I get 26x (all on the same disk).

This post has been edited by Takla: Jun 2 2012, 11:49
skamp
post Jun 2 2012, 12:06
Post #14

I'm guessing your hard drive is less of a bottleneck with your configuration (CPU speed, number of concurrent reads on the HDD) than with mine.

Incidentally, the tmpfs method provides no speed gain when I'm transcoding FLACs located on my SSD. In that case, the storage medium is no longer the bottleneck. Unfortunately, my SSD is nowhere near large enough to hold my entire FLAC library, so I still have to deal with my slowish HDD.


Takla
post Jun 2 2012, 12:08
Post #15

I got my Core 2 Duo 1.6 GHz running 64-bit Debian Stable headless with 512MB RAM to hit the heady heights of 33x. It's a champagne moment. Tomorrow I buy the (parallel) stripes, body kit and chrome exhaust.
skamp
post Jun 2 2012, 12:41
Post #16

QUOTE (Takla @ Jun 2 2012, 09:08) *
I'd noticed that oggenc on XP was significantly faster than a gcc compiled oggenc binary in Linux so I was keen to try to make up the difference


That's the reason I added support for Windows binaries with Wine. There are instructions on how to install and use those with caudec.
lvqcl's Ogg Vorbis AoTuV ICC build might be of interest to you.


Takla
post Jun 2 2012, 13:20
Post #17

I saw the info on Wine and Windows binaries in your docs/examples, and it struck a chord because I'd previously noticed a big discrepancy between the speed of oggenc in XP (with foobar2000 as frontend) and oggenc in Debian 32-bit. But as I don't make a habit of watching the text scroll by, I can live with my newly parallelised scripts doing 26x or 33x (finally quicker than AoTuV in XP on my hardware). I'll stick with native binaries so I can run the same scripts across different free OSes and architectures and not have to care whether Wine is installed/working/worth the effort.
Takla
post Jun 4 2012, 00:04
Post #18

btw I booted my XP install to see what foobar2000 and oggenc were doing, and discovered that the apparent gulf in encoder performance between oggenc in Debian and oggenc in XP was simply due to foobar2000 running two oggenc processes in parallel (the XP version of oggenc being aoTuVb6.03 from rarewares). Once both cores are maxed out, oggenc performs a little faster in Debian 32-bit than in XP SP3 32-bit, though the difference is very slight (<1%, and probably has more to do with OS services than the binary; if you measured it with a button-press stopwatch you'd never know there was any difference). Anyway, if I again happen upon an application that apparently performs hugely better or worse on a different OS, I'll take a closer look before assuming something is either very wrong or inexplicably excellent.

This post has been edited by Takla: Jun 4 2012, 00:06
skamp
post Jun 4 2012, 11:48
Post #19

QUOTE (skamp @ Jun 2 2012, 12:22) *
That's roughly a 21% speed increase. Maybe not quite as dramatic as one could hope, but substantial nonetheless.


The benefit gets more obvious as CPU time decreases (the HDD becomes more of a bottleneck). Here's a case where the difference becomes "dramatic": encoding WAVs to FLAC (-q 5, FLAC's default compression level).
  • old version: 70.63 seconds (22.2 MB/s) (125.7x)
  • latest caudec: 38.33 seconds (40.9 MB/s) (231.4x)

That's an 84% speed increase. YMMV of course.


punkrockdude
post Jun 4 2012, 20:14
Post #20

Group: Members
Posts: 244
Joined: 21-February 05
Member No.: 20022

I am glad more Linux stuff is being done since I use Linux on my laptop and I learn new things all the time. Regards.
skamp
post Jun 5 2012, 12:49
Post #21

I was curious, so I implemented a switch for disabling the preloading of input files to RAM, for cases where the underlying medium is a fast SSD, ramdisk or whatever. I ran a few tests with light to intensive CPU tasks, and the speed gains were negligible. Since inappropriate / uneducated use of that switch could easily cause terrible performance, I've decided to revert the change and not include it in a future release (not until everyone has terabyte SSDs, at least).

This post has been edited by skamp: Jun 5 2012, 12:56


skamp
post Jun 27 2012, 09:11
Post #22

I released version 1.4.0, with many changes (pretty much as many commits as all of the other versions combined):
  • now runs on Mac OS X (tested on Lion)
  • smart handling of concurrent instances
  • better detection of ramdisks
  • don't abort if no ramdisk is available
  • support for e/m TAK compression parameters
  • removed reckless option to disable checking of available space
  • fixed long standing bugs in the installation script
  • fixed regression with empty APEv2 tags
  • better handling of ALAC metadata
  • changed handling of user interruption (Ctrl+C), removed pgrep dependency
  • lots of minor fixes

Upgrading is strongly recommended. Please use the tracker to report any bugs.


skamp
post Jul 10 2012, 12:22
Post #23

Latest version (1.4.3) brings support for Opus and ALAC encoding, among other improvements and fixes. See changes.


Manlord
post Jul 10 2012, 22:08
Post #24

Group: Members
Posts: 25
Joined: 2-April 10
Member No.: 79529

Excellent, thank you skamp. Going to test it on Debian 6.
skamp
post Jul 30 2012, 11:16
Post #25

I released caudec 1.5.0. Changes:

  • Replaygain scanner (except for Opus and Musepack)
  • preservation of embedded artwork from FLAC and ALAC, to FLAC, ALAC, AAC and MP3
  • new -C switch disables metadata preservation
  • report both read and write speeds
  • better estimation of ramdisk space requirements with APE input files
  • various fixes


Thanks to Garf for his help on the RG scanner.

This post has been edited by skamp: Jul 30 2012, 11:22

