LAME resampling/lower sample rate questions
jensend
post Mar 3 2012, 06:44
Post #1





Group: Members
Posts: 143
Joined: 21-May 05
Member No.: 22191



I have a couple hundred hours of recorded speech (recorded in mono, 16-bit, mostly 48kHz but some 44.1kHz) I need to make available on the Internet, and as much as I'd like to use Opus for this, it's probably going to have to be MP3 for compatibility reasons. A bit of off-the-cuff testing shows that somewhere around 40-48 kbps seems to be fine for my purposes, but I'd like to be somewhat more confident about the choices I'm making before I start encoding all this. Thought I'd use the opportunity to educate myself a little while I'm at it too.

I know LAME automatically resamples the input when targeting low bitrates. I'd like to know exactly how it determines the output sample rate. I'm not fantastic with C (reading others' source takes me an inordinate amount of time), and I didn't find the relevant code after a little grepping through the LAME sources. Could somebody point me to where in the sources this decision is made?

Also, how are the thresholds for switching to lower sample rates tuned? Might I be better off resampling to a lower rate than LAME would normally choose, since speech has so much less energy / useful information at high frequencies than music?

One thing that I did come across when looking for how the output rate is set was the following line from lame.c:
CODE
cfg->mode_gr = cfg->samplerate_out <= 24000 ? 1 : 2; /* Number of granules per frame */

I don't know much about the details of the MP3 format, but my initial guess is that using only one granule per frame increases the overhead from headers but allows for better accuracy in seeking etc. Since a granule at 24kHz is only 24ms, this doesn't strike me as a very good guess. Could someone enlighten me about the reason for this switch? What other thresholds/decision points in either bitrate or sample rate are interesting or might be worth being informed about?
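
For reference, here is the arithmetic behind that 24ms figure as a minimal sketch. It assumes 576 samples per granule per channel, which is what the Layer III format specifies; the helper names are made up for illustration and do not come from the LAME sources. As far as I understand the format, MPEG-1 Layer III frames (32/44.1/48kHz) carry two granules while the MPEG-2/2.5 low-sample-rate frames (24kHz and below) carry one, which would make the switch a property of the format rather than a tuning choice.
CODE
```c
/* Samples per granule per channel in MPEG Layer III. */
#define SAMPLES_PER_GRANULE 576

/* Mirrors the quoted line from lame.c: one granule per frame at or
 * below 24 kHz, two above. (Hypothetical helper, not a LAME function.) */
int granules_per_frame(int samplerate_out)
{
    return samplerate_out <= 24000 ? 1 : 2;
}

/* Duration of one frame in microseconds, truncated toward zero.
 * (Hypothetical helper, not a LAME function.) */
long frame_duration_us(int samplerate_out)
{
    long samples = (long)granules_per_frame(samplerate_out) * SAMPLES_PER_GRANULE;
    return samples * 1000000L / samplerate_out;
}
```

With these definitions a frame lasts 24ms at both 24kHz (one granule) and 48kHz (two granules), and 36ms at 16kHz, so halving the granule count keeps frame duration in the same range as the sample rate drops.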
jensend
post Mar 4 2012, 00:58
Post #2








Well, that simply means we need to find out how the lowpass frequency gets set. For ABR that turns out to be fairly simple: the optimum_bandwidth function is called, giving a lowpass frequency which depends only on the target bitrate; the result is then multiplied by 1.5 for mono, giving us the following table of lowpass frequencies and resampling rates:
CODE
bitrate >= (kbps)   lowpass freq (Hz)   sampling rate (Hz)
60                  16500               48000
52                  15000               32000
44                  11250               32000
36                  10500               24000
28                   8250               22050
20                   5850               16000
12                   5550               16000
0                    3000                8000

This doesn't seem particularly carefully tuned. I see no reason why just multiplying the stereo lowpass frequencies by 1.5 should work all the way across this range of bitrates, and this completely skips the 44.1kHz, 12kHz, and 11.025kHz sampling rates.
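
To make the table easier to experiment with, here is a minimal sketch that transcribes it into a lookup. It is only a restatement of the numbers above, not LAME's actual code path; the struct and function names are hypothetical, and the values are the mono figures as I derived them (stereo lowpass times 1.5).
CODE
```c
#include <stddef.h>

/* One row of the mono ABR table above (hypothetical type). */
struct abr_mono_row {
    int min_kbps;       /* row applies when target bitrate >= this */
    int lowpass_hz;     /* mono lowpass frequency from the table */
    int samplerate_hz;  /* resulting output sampling rate */
};

/* Rows in descending order of min_kbps, copied from the table. */
static const struct abr_mono_row abr_mono_table[] = {
    {60, 16500, 48000},
    {52, 15000, 32000},
    {44, 11250, 32000},
    {36, 10500, 24000},
    {28,  8250, 22050},
    {20,  5850, 16000},
    {12,  5550, 16000},
    { 0,  3000,  8000},
};

/* Return the first row whose threshold the bitrate meets,
 * or NULL for a negative bitrate. (Hypothetical helper.) */
const struct abr_mono_row *abr_mono_lookup(int kbps)
{
    size_t n = sizeof abr_mono_table / sizeof abr_mono_table[0];
    for (size_t i = 0; i < n; i++) {
        if (kbps >= abr_mono_table[i].min_kbps)
            return &abr_mono_table[i];
    }
    return NULL;
}
```

For the 40-48 kbps range I'm interested in, this gives an 11250Hz lowpass at 32kHz for 44 kbps and above, and a 10500Hz lowpass at 24kHz below that, so the output sampling rate changes partway through the range.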

I had thought that I'd be learning more about what makes sense from LAME's carefully tuned defaults. While I imagine the stereo ABR defaults have been carefully tuned, it may not be at all difficult to improve on the above for mono, and it would be simple to cobble together a patch implementing such improvements.