GXLame - Low-bitrate MP3 encoder.

This is a snapshot of my work on "GXLame." GXLame is an MP3 encoder based on LAME v3.98.4 and v3.99b0 that has been heavily optimized for high-quality, low-bitrate VBR encoding. It is similar in concept to other popular encoders at these bitrates, such as some AAC codecs and Vorbis mods, and targets bitrates down to 56kbps. This codec does not rely on aggressive lowpassing or resampling to achieve these low bitrates, and the quality aims to be acceptable at much lower bitrates than have come to be expected of the standard. Here's a rough idea of what to expect:

On a continuous scale from V 0 (lowest bitrate) to V 100 (highest quality), with V20 as the default:
V100: 256kbps
V90: 224kbps
V80: 185kbps
V70: 162kbps
V60: 146kbps
V50: 128kbps
V40: 112kbps
V30: 96kbps
V20: 85kbps
V10: 74kbps
V0: 64kbps (actually 56kbps @ 32 kHz; auto resampling to 32 kHz takes place at V5 and below)

The range 0-35 is where the most tuning took place. The codec accepts input from stdin and can be used in foobar2000 (and many other audio rippers/managers/converters) by following any of the guides for LAME, but with a different commandline. For instance, one can easily import CD audio into foobar2000 and convert the tracks with the simple commandline: GXLame-t5.3 -S - %d
For greatly increased encoding speed, add "-f". To target a different quality level, add "-Vx" ('x' here means a number like 30, which would produce average bitrates somewhat close to 96kbps according to the above table).
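For picking a -V value between the listed steps, here is a rough Python sketch that interpolates the table above. The linear interpolation is purely an assumption for illustration; nothing in this thread says the encoder's real quality-to-bitrate curve is linear between table entries, and `estimate_kbps` is a made-up helper name.

```python
from bisect import bisect_right

# Advertised average bitrates (kbps) per quality level, copied from the table above.
# Note: the table says V0 is really about 56 kbps at 32 kHz; 64 is kept as listed.
ADVERTISED_KBPS = {
    0: 64, 10: 74, 20: 85, 30: 96, 40: 112, 50: 128,
    60: 146, 70: 162, 80: 185, 90: 224, 100: 256,
}


def estimate_kbps(v: float) -> float:
    """Estimate the average bitrate for quality level v (0..100).

    Assumption: straight-line interpolation between neighbouring table rows.
    """
    if not 0 <= v <= 100:
        raise ValueError("V must be between 0 and 100")
    levels = sorted(ADVERTISED_KBPS)
    i = bisect_right(levels, v)
    if i == len(levels):  # v is exactly the top of the scale
        return float(ADVERTISED_KBPS[levels[-1]])
    lo, hi = levels[i - 1], levels[i]
    frac = (v - lo) / (hi - lo)
    return ADVERTISED_KBPS[lo] + frac * (ADVERTISED_KBPS[hi] - ADVERTISED_KBPS[lo])
```

Treat the result as a ballpark only; actual averages depend heavily on the music.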
GXLame 32bits version GXLame-t5 (9 Aug 2011)

This version contains debugging options.

usage: GXLame [options] <infile> [outfile]

    <infile> and/or <outfile> can be "-", which means stdin/stdout.

RECOMMENDED:
    GXLame input.wav output.mp3

OPTIONS:
    -b bitrate      (Not recommended) set the bitrate, default 85 kbps
    -h              highest quality, but slower (not recommended).
    -f              fast mode, slightly lower quality (but still very good)
    -V n            quality setting for VBR.  default n = 20 (near 85 kbps)
                    100 = highest quality, biggest files. 0 = smallest files
    --preset type   type must be "medium", "standard", "extreme", "insane",
                    or a value for an average desired bitrate and depending
                    on the value specified, appropriate quality settings will
                    be used.
                    "--preset help" gives more info on these

    --priority type  sets the process priority
                     0,1 = Low priority
                     2   = normal priority
                     3,4 = High priority

    --longhelp      full list of options

    --license       print License information

This is an early test release. Although some great progress is being made, it is not completely tuned, stable, or optimized. Then again, codecs never are and probably never will be. I want to gather user feedback, so use this puppy to compress whatever audio you will. Please note that lossy transcoding is an especially bad idea with GXLame. It relies so heavily on the psymodel and noise shaping that any artifacts present in the original--even inaudible ones in a transparent encode--may rebound here with a great vengeance. Particular culprits are transient smearing and additional high frequency distortion. If you must transcode, at least resample to a different frequency first (for instance, add "--resample 48" to the commandline when re-compressing/transcoding standard CD audio).

Grab a look at the changelog and older versions in the uploads thread here. Be sure to provide your opinions, discussions, impressions, test results, and whatever other witty banter you might deem applicable in this thread!

Right now, go forth and test it on your music, soundtracks, speech tracks (for speech, I recommend GXLame -V0 -mm --resample 16) -- I'm looking for tests for any regressions that might have been introduced in t5.3 since t5.2.
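If you script your test encodes, a small helper like the following may save some typing. It is only a sketch: it assembles an argument list from flags that are quoted in this post (-S, -f, -V, -mm, --resample), the function name `gxlame_args` is invented, and the executable name is an assumption you would need to change to match the binary you downloaded.

```python
def gxlame_args(infile, outfile, quality=None, fast=False,
                resample_khz=None, mono=False, silent=False,
                exe="GXLame"):
    """Build an argv list for a GXLame run.

    Only flags that appear in this thread are used. `exe` is an assumed
    binary name; "-" for infile/outfile means stdin/stdout as in the usage text.
    """
    args = [exe]
    if silent:
        args.append("-S")          # as in the foobar2000 commandline above
    if fast:
        args.append("-f")          # fast mode, slightly lower quality
    if quality is not None:
        if not 0 <= quality <= 100:
            raise ValueError("quality must be between 0 and 100")
        args.append(f"-V{quality}")
    if mono:
        args.append("-mm")         # as in the speech recommendation
    if resample_khz is not None:
        args += ["--resample", f"{resample_khz:g}"]
    args += [infile, outfile]
    return args
```

For the speech recommendation, `gxlame_args("speech.wav", "speech.mp3", quality=0, mono=True, resample_khz=16)` yields the flags `-V0 -mm --resample 16` followed by the two file names; the list can be handed to `subprocess.run` as-is.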
Copy Restriction, Annulment, & Protection = C.R.A.P. -Supacon

Reply #1
Fascinating. How easily backported are your changes into the main LAME source? Is this a fork? How do your own tests rank the two encoders?

Reply #2
Quote
Fascinating. How easily backported are your changes into the main LAME source? Is this a fork? How do your own tests rank the two encoders?


I started knowing little about LAME's VBR code only a few days ago, but I've worked very rapidly since then, and the results should clearly speak for themselves. It is a fork with some pretty significant differences, but the changes probably wouldn't be terribly difficult to backport at all.

As to how I'd rank the two against each other, I'd rank GXLame well above LAME at these quality modes/bitrates, but it's actually very hard to say anything about comparable commandlines. You see, stock LAME, even with lowpass set to some arbitrarily high value like 16 (just as long as band 21 is discarded, so -Y is your friend) and/or resampled to 32 kHz, simply can't approximate, for instance, the bitrates of GXLame's V10 in VBR mode. The closest LAME comes (ironically, LAME -V 9.9 with lowpass 16 resampled to 32 kHz) produces significantly worse results in my testing.

But the results of one guy's tests (okay, me and a few others) on a certain subset of music don't say enough on their own.

If you'd like to compare this to LAME, it's best to figure out the closest commandline in 3.98 and go from there. I think you'll be very pleasantly surprised, however. 
Copy Restriction, Annulment, & Protection = C.R.A.P. -Supacon

Reply #3
Quote
LAME…stands for "Lame Ain't an MP3 Encoder" (I'd say it should be called "LIME" because it is).

Not no more it doesn't, and not originally it wasn't!

Quote
LAME originally stood for LAME Ain't an Mp3 Encoder. LAME started life as a GPL'd patch against the dist10 ISO demonstration source, and thus was incapable of producing an mp3 stream or even being compiled by itself. But in May 2000, the last remnants of the ISO source code were replaced, and now LAME is the source code for a fully LGPL'd MP3 encoder

Reply #4
Quote
Not no more it doesn't, and not originally it wasn't!

Quote
LAME originally stood for LAME Ain't an Mp3 Encoder. LAME started life as a GPL'd patch against the dist10 ISO demonstration source, and thus was incapable of producing an mp3 stream or even being compiled by itself. But in May 2000, the last remnants of the ISO source code were replaced, and now LAME is the source code for a fully LGPL'd MP3 encoder


Haha, you sure know your codec history! Well done. So the name is just a remnant from when it was a patch.

Anyway, I found a bug that causes a lot of the wonderful "underwater" artifact (and smearing distortion) in certain samples. I will re-upload soon with a quickfix. Yeah, sort of glad nobody jumped on it yet--things like this could be a huge turn-off.
Copy Restriction, Annulment, & Protection = C.R.A.P. -Supacon

Reply #5
Quick impression: Less warbling and distortion than Lame, but more noise.
With heavily compressed music it's surprisingly good, but in case of eg. Third World Man by Steely Dan it almost sounds like someone's playing a bicycle pump along to the music. 

Reply #6
I don't mean to be rude, but...

Do you have knowledge of psychoacoustics and of MPEG Layer III as a format?
Are your changes motivated by knowledge instead of by tweaking and listening?

I am not against your work, and I haven't even checked the results of this first test version, but if the answer to those questions is "no", I would advise you to slow down.

The problem with lossy codecs is that they are not just an algorithm. They are the result of knowledge and empirical studies that are not trivial and which are sometimes interrelated.


The founder of this site can be proud of taking LAME back at 3.89 and doing such changes in the code that led to the --alt-presets (which nowadays are still alive in the -V settings). Your goals are similar to the ones he had then (best quality/ratio at high bitrate vs best quality/ratio at low bitrate).

As such it's something that would be nice to happen, but only achievable with the proper knowledge.



And one more thing:

From your other thread about getting information, GXLame seems to simply have switched from vbr-new to vbr-old. Can you comment on this, and on whether your comparison of GXLame -V10 being better than LAME -V9.9 used vbr-new or vbr-old?

Concretely, I am interested in how much of a difference there is right now on truly comparable settings.

Reply #7
I have to disagree [JAZ]. The results speak for themselves. The goal of an MP3 encoder is to produce good-sounding MP3s. You need little (to no!) comprehension of the underlying mechanisms, necessarily, to be able to discern which of two files you prefer, after you've identified that you are indeed hearing a difference.

You could write a genetic algorithm, for example, to tune MP3 without understanding much of what's going on.

In short: knowledge is not a pre-requisite, knowledge is simply beneficial.

Reply #8

Quote
Are your changes motivated by knowledge instead of by tweaking and listening?

Both, to varying degrees. 

Quote
I am not against your work, and I haven't even checked the results of this first test version

That's fine, thanks for being honest. If you do find the time, I'd certainly appreciate you giving it a shot, though. Thanks!

Quote
The founder of this site can be proud of taking LAME back at 3.89...

Speaking of knowledge, everyone here seems to know their codec history! I'm really proud. I know, I know, geeky of me to say...but still.

Quote
From your other thread about getting information, GXLame seems to simply have switched from vbr-new to vbr-old. Can you comment on this, and on whether your comparison of GXLame -V10 being better than LAME -V9.9 used vbr-new or vbr-old? Concretely, I am interested in how much of a difference there is right now on truly comparable settings.

Certainly! In that thread, I was just trying to get basic information about what I thought to be a bug with the VBR-old code, which I decided was the most solid base for GXLame's VBR. Once I identified and fixed this issue, I proceeded to make my changes. For this reason, GXLame doesn't touch VBR-new, and little of the work applies to it. Also, the VBR scale in GXLame is from 0 (low quality) to 100 (large, high quality). As for my comparison of quality, you'll have to see for yourself. Sometimes it's hard to know what a program or codec can do without trying it.  The point I was trying to make about comparable settings is that there really aren't any. V9.9 in LAME with the strange commandline I suggested (including vbr-old) comes close in bitrate to GXLame's V10, though.

Quote
Quick impression: Less warbling and distortion than Lame, but more noise.
With heavily compressed music it's surprisingly good, but in case of eg. Third World Man by Steely Dan it almost sounds like someone's playing a bicycle pump along to the music. 

Haha, glad the overall impression is good! Yes, there is a bit of added noise. You might consider that a leaf from lossyWAV and AoTuV's book-- at heavy settings there's a bit of added noise, but with the hope that the difference isn't "too annoying." That said, bicycle pumps aren't exactly the most innocuous sources of noise.    Could you upload a 30-second sample of the clip in question?  Thanks for testing!!

Quote
You could write a genetic algorithm, for example, to tune MP3 without understanding much of what's going on.

What an interesting idea! Acovea genetically evaluates compiler flags; a speed test of the result would determine the next round of flags. I wonder if it isn't possible to "genetically" run loads against LAME's psymodel until you get the "best" possible combination of internal settings values... The problem here is, unlike speed tests which are relatively unequivocal, there is no infallible "audio quality" model to run these different tunings against, unless you have a group of 100 dedicated audio testers who'd be willing to test 5 times a day for years to come. The same happens to be true of video encoding, as x264's default "best looking" options have long since deviated away from what simply produces the highest PSNR or SSIM values (these are the mathematical "video quality metrics"). Quality, especially at low bitrates where even individual ABX tests fall apart (where every sample is distinguishable from the original), is a much more intangible construct to tune for genetically. Saying "these artifacts are more/less annoying than these" is of course an existing model, but still...you need people's opinions on "annoyance levels." Haha, sorry for getting off topic and raising the "annoyance level" of this thread.
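To make the genetic-tuning idea concrete, here is a toy Python sketch of the search loop. Everything in it is hypothetical: the "parameters" are just numbers between 0 and 1 rather than real LAME psymodel settings, and `placeholder_score` stands in for the one thing that does not exist, a trustworthy audio quality score. Swap in real listening-test results and the loop stays the same, which is exactly where the practical problem described above sits.

```python
import random


def evolve(fitness, n_params, pop_size=20, generations=30, seed=0):
    """Tiny genetic algorithm: elitism, uniform crossover, Gaussian mutation.

    Returns the best parameter vector found and the best score per generation.
    """
    rng = random.Random(seed)
    pop = [[rng.uniform(0.0, 1.0) for _ in range(n_params)]
           for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        parents = pop[: pop_size // 2]      # keep the better half as parents
        children = [pop[0]]                 # elitism: the best survives unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            # Small Gaussian mutation, clamped to the allowed range.
            child = [min(1.0, max(0.0, g + rng.gauss(0.0, 0.05))) for g in child]
            children.append(child)
        pop = children
    return max(pop, key=fitness), history


# Placeholder objective: closeness to an arbitrary target vector.
# A real tuner would need human ratings (or a perceptual model) here.
TARGET = [0.3, 0.7, 0.5]


def placeholder_score(params):
    return -sum((p - t) ** 2 for p, t in zip(params, TARGET))
```

Because the best individual is always carried over, the per-generation best score never gets worse; whether it converges on anything that sounds good depends entirely on the scoring function.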

Thanks to all who've done tests so far! Remember to grab the latest binary (with bugfix). Tests I'm currently looking for:
1. Bitrate. Do the bitrates in the table (first post) match the bitrates GXLame is providing you on [larger] sets of your music?
2. What kind of artifact should I tune against? "Artifacts" here means perceived ringing, warbling, added noise, etc. So in other words, which of these artifacts stands out most to you in low-bitrate GXLame MP3s? Thanks again!
Copy Restriction, Annulment, & Protection = C.R.A.P. -Supacon


Reply #10
The biggest problem will be low-volume material, solo/jazz/trip-hop vocals, solo instrumental intros, etc. Expect ringing and lots of ugly distortion made worse by VBR.

Reply #11
Ranges from 0-35 are the most interesting to me.

Why don't you focus your work exclusively on lower bitrates?
I'm sure transparency-focused users are happy with the way lame is now...
And why not explore bitrates lower than 64Kbps? Even if just for fun / technical merit...

Anyway, whatever happens, it is indeed beautiful to see that the MP3 format can apparently be still "stretched-forward"...

Reply #12
@Stereotype: in GXLame, 0..35 is the low bitrate range. The scale is inverted in comparison to LAME, (i.e. it is like in other codecs: nero, vorbis...)


@Canar: If I take something made by someone else that is not really well documented, and I don't know the limitations and tradeoffs involved, the work of improving it can easily create a scenario where changes either make a positive difference on the items tried while making a negative one on other items, or make no difference on the items tried while making a negative one on other items.

By definition of tuning, the case of making worse the items tried is discarded, and the possibility of making better something not tried just by chance is small.

And just as The sheep said, a genetic algorithm can only be as good as the psychoacoustic model it uses to decide what's good and what's bad, which sort of defeats the idea.


Again, I respect the intention and procedure. I was just trying to avoid a situation where we would need different LAME versions for different targets and/or files. This is especially accentuated by the fact that GXLame uses --vbr-old, while the development on LAME in the last two or three years has been in --vbr-new, which is the default mode nowadays.

If it ever gets merged, -V6~V9 could use --vbr-old while the rest use --vbr-new, but this will depend on how much the changes affect the rest of the encoder.


If this is the next Aotuv, it would be great. Does it have the chance to be?

Reply #13
very interesting project. Hopefully it is backward compatible with most hardware/software mp3 decoders...
thank you

First question: Could you add a special switch to disable automatic downsampling (e.g. resampling to 32 kHz for V0), or something like --resample X in the original LAME?
🇺🇦 Glory to Ukraine!

Reply #14
Quote from: Steve Forte Rio
Hopefully it is backward compatible with most hardware/software mp3 decoders...

First question: Could you add a special switch to disable automatic downsampling (e.g. resampling to 32 kHz for V0), or something like --resample X in the original LAME?

The files should be compatible with all compliant MP3 players...
As for resample, there is a --resample X option. Usage is like in LAME or Venc. X is in kHz (so 44.1 would be good).

Let me know of any issues, particularly if the bitrate map is correct and any annoying artifacts you might hear. Join the main discussion here, too.

Thanks for testing!
Copy Restriction, Annulment, & Protection = C.R.A.P. -Supacon

Reply #15
Quote
The biggest problem will be low-volume material, solo/jazz/trip-hop vocals, solo instrumental intros, etc. Expect ringing and lots of ugly distortion made worse by VBR.


Is the artifact that most bothers you the ringing on these types of tracks? Thanks for the report.

Quote
Pumping noise, particularly noticeable in the instrumental intro.

Speaking of instrumental intro, I tried your clip, gaekwad2. I'm a bit confused now as to exactly what you mean about the pumping noise you describe. The most apparent artifacts I detect are exemplified from the range 10s-12s in the sample. Just to verify we're referring to the same thing, I hear a metallic, high-pitched squeak accompanying the percussion (I believe this is typically referred to as "ringing artifacts" or perhaps in this case, chirping). Try the original test1 on this clip (the one before the so-called "bugfix"). Does this reduce the issue a bit for you?

Also, what quality level did you test at (10, 20...)? Thanks!
Copy Restriction, Annulment, & Protection = C.R.A.P. -Supacon

Reply #16
Quote
Speaking of instrumental intro, I tried your clip, gaekwad2. I'm a bit confused now as to exactly what you mean about the pumping noise you describe. The most apparent artifacts I detect are exemplified from the range 10s-12s in the sample. Just to verify we're referring to the same thing, I hear a metallic, high-pitched squeak accompanying the percussion (I believe this is typically referred to as "ringing artifacts" or perhaps in this case, chirping). Try the original test1 on this clip (the one before the so-called "bugfix"). Does this reduce the issue a bit for you?

Also, what quality level did you test at (10, 20...)? Thanks!

I mean an audible noise floor which rises and falls along with the music. It's most obvious at quality levels below V30 and when the guitar is playing on the left, since the noise appears in both channels. It's still audible at 40; at 50 I no longer notice it, at least at normal listening level (judging by the whole track, which has a ReplayGain of ~0dB).

The original version doesn't help here, it just adds that underwater effect to both the music and the noise.

Reply #17
Cool project, I need to do further testing before I can say anything about quality for my ears...
I changed my mind about the last post (#6) how can I delete it?....

Reply #18
In response to the samples I received so far!

Quote
GXLame-t2 released!

Changelog:
- Fixed a nasty bug with channel mapping under V30
- Tweaked the dynamic noise floor
- Reduced ringing
- Significant tuning
- Raised lowpass

I've tried to address the samples and artifacts people reported most. The result is (hopefully) much better quality across the board. Let me know if there are any regressions.

(I still need that bitrate test, people!  )


Have at it. Perceptual quality (should be) improved on many samples.
Copy Restriction, Annulment, & Protection = C.R.A.P. -Supacon

Reply #19
Nice! I'm tempted to try hacking in support for GXLame when I get around to re-adding MP3 support to my Vorbis streaming component.

Reply #20
Quote
Nice! I'm tempted to try hacking in support for GXLame when I get around to re-adding MP3 support to my Vorbis streaming component.


Excellent! Perhaps I should get to work on the ABR mode--or try to improve capped VBR. Otherwise, it would be awkward to stream even an 85kbps file if it had some bursts of 224+kbps. Then again, vorbis suffers from this theoretical streaming problem unless it is similarly constrained, but I've yet to encounter any real-world problems in streaming true VBR vorbis with adequate decoder-side buffer. So yeah, go for it!
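To put a rough number on the streaming worry, here is a back-of-the-envelope Python sketch. It assumes a very coarse model: the stream is described as one average bitrate per second of audio, the channel delivers data at a constant rate from the moment the connection opens, and the client can buffer ahead without limit. The 96 kbps channel and the shape of the burst are invented for illustration; only the 85 and 224 kbps figures come from the post above.

```python
from itertools import accumulate


def min_prebuffer_kbit(bitrates_kbps, channel_kbps):
    """Smallest prebuffer (in kbit) that avoids a decoder underrun.

    bitrates_kbps: one entry per second of audio (average kbps in that second).
    channel_kbps:  constant delivery rate of the connection.
    The answer is the worst-case gap between data consumed by playback and
    data delivered by the channel over the same playback time.
    """
    deficit = 0.0
    for t, consumed in enumerate(accumulate(bitrates_kbps), start=1):
        deficit = max(deficit, consumed - channel_kbps * t)
    return deficit


# Hypothetical stream: mostly 85 kbps with a three-second burst at 224 kbps.
stream = [85] * 10 + [224] * 3 + [85] * 10
```

With this made-up stream, `min_prebuffer_kbit(stream, 96)` comes out at 274 kbit, a bit under three seconds of download at 96 kbps, which fits the experience that true VBR streams fine given an adequate decoder-side buffer. A real client would work per frame rather than per second.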

@all: Work is still underway. I may release a t3 (test 3) soon with the improvements. I'm focusing on the underwater effect and general distortions this time around, and will issue the release once even a small (but concrete and universal) quality improvement is made. Then I will try to project some of my work into the framework of the vbr-new mode (or, failing that, make a hybrid vbr or other simpler forms of speed increase/compromise). Finally, I will try to work a bit on the ABR quality (maybe). It's interesting how LAME has such different tunings for each of the modes of operation, but I certainly see the rationale behind it.

Any reports yet about the bitrate matching? (i.e. Is V10 really 74kbps for your music?) Make sure to mention the genre and other features of the test sample(s) if you choose to make a report. Thanks!
Copy Restriction, Annulment, & Protection = C.R.A.P. -Supacon

Reply #21
The noise is greatly reduced in t2. Now the biggest difference when comparing GXLame -V 10 to Lame -V 9.9 --resample 32 --lowpass 16 is that GXLame preserves a lot more high frequencies at the price of a generally dirtier sound (ringing, yes, but not the metallic WMA-standard kind). It reminds me of Vorbis before aoTuV in a way.

Bitrates:
I used 46 tracks that are representative of my cd collection (mixed genres) with lossless codecs, so mp3 bitrates may be a bit off.
V 0:    59
V 10:   77
V 20:   87
V 30:  100
V 40:  115
V 50:  130
V 60:  153
V 70:  169
V 80:  187
V 90:  212
V 100: 230
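For anyone else reporting bitrates, a quick way to line measurements up against the table in the first post is sketched below in Python. Both sets of numbers are copied from this thread; the helper name `deviations` is made up for the example.

```python
# Advertised averages from the first post (kbps).
ADVERTISED = {0: 64, 10: 74, 20: 85, 30: 96, 40: 112, 50: 128,
              60: 146, 70: 162, 80: 185, 90: 224, 100: 256}

# Averages measured on the 46-track set reported above (kbps).
MEASURED = {0: 59, 10: 77, 20: 87, 30: 100, 40: 115, 50: 130,
            60: 153, 70: 169, 80: 187, 90: 212, 100: 230}


def deviations(advertised, measured):
    """Return {V: (difference in kbps, difference in percent of advertised)}."""
    return {
        v: (measured[v] - advertised[v],
            100.0 * (measured[v] - advertised[v]) / advertised[v])
        for v in sorted(advertised) if v in measured
    }


for v, (diff, pct) in deviations(ADVERTISED, MEASURED).items():
    print(f"V{v:<3} advertised {ADVERTISED[v]:>3}  measured {MEASURED[v]:>3}  "
          f"{diff:+d} kbps ({pct:+.1f}%)")
```

On these figures the low and middle settings land within a few kbps of the table, while V90 and V100 come in 12 and 26 kbps under it. Keep in mind the first post lists V0 as really being about 56kbps at 32 kHz, so the 64 in the table is not the fairest baseline for that row.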

Reply #22
Quote
The noise is greatly reduced in t2. Now the biggest difference when comparing GXLame -V 10 to Lame -V 9.9 --resample 32 --lowpass 16 is that GXLame preserves a lot more high frequencies at the price of a generally dirtier sound (ringing, yes, but not the metallic WMA-standard kind). It reminds me of Vorbis before aoTuV in a way.

(emphasis mine)
I'll admit to having performed no listening tests, but I noticed this behavior discussed earlier on, and was curious.
Why are (expensive to encode) higher frequencies being preserved in a low-bitrate optimized MP3 encoder?
Creature of habit.

Reply #23
I do intend to do a bit of listening testing on this when I get back home sometime next week. I haven't yet due to work.

Reply #24
Quote
Why are (expensive to encode) higher frequencies being preserved in a low-bitrate optimized MP3 encoder?


--resample 32 --lowpass 16 should ring a bell. It does not try to preserve the whole bandwidth, but a lowpass of less than 12 kHz can be more annoying than an artifact at -30 dB.

That's what he is trying to do, redefine the "annoyability" of artifacts.