Near-lossless / lossy FLAC, An idea & MATLAB implementation
2Bdecided
post Jun 12 2007, 20:31
Post #1


ReplayGain developer


Group: Developer
Posts: 5364
Joined: 5-November 01
From: Yorkshire, UK
Member No.: 409



This is an (unoriginal) idea / work in progress. I make no claims for it, but it might be interesting or useful for someone. It is not competitive with wavpack lossy. It is not "finished" either! As far as I know, it is 100% compatible with existing recent lossless FLAC implementations.


The idea is simple: lossless codecs use a lot of bits coding the difference between their prediction and the actual signal. The more complex (hence, unpredictable) the signal, the more bits this takes up. However, the more complex the signal, the more "noise like" it often is. It seems silly spending all these bits carefully coding noise / randomness.

So, why not find the noise floor, and dump everything below it?

This isn't about psychoacoustics. What you can or can't hear doesn't come into it. Instead, you perform a spectrum analysis of the signal, note what the lowest spectrum level is, and throw away everything below it. (If this seems a little harsh, you can throw in an offset to this calculation, e.g. -6dB to make it more careful, or +6dB to make it more aggressive!).
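To make the "find the lowest spectrum level" step concrete, here is a minimal Python sketch. It is not the attached MATLAB script: the Hann window, the naive DFT, the tiny block length used in the test, and the rule that maps the lowest bin power to a number of removable bits (compare it with the power that white rounding noise would put in one bin) are all assumptions made for illustration. The names `min_bin_power`, `bits_to_remove` and `offset_db` are hypothetical.

```python
import cmath
import math


def min_bin_power(block):
    """Return (lowest DFT bin power, window energy) for a Hann-windowed block.

    Uses a naive O(n^2) DFT so the sketch needs only the standard library.
    """
    n = len(block)
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]
    x = [s * w for s, w in zip(block, window)]
    powers = []
    for k in range(1, n // 2):  # skip the DC and Nyquist bins
        acc = sum(x[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
        powers.append(abs(acc) ** 2)
    return min(powers), sum(w * w for w in window)


def bits_to_remove(block, offset_db=0.0, max_bits=15):
    """Assumed rule: drop LSBs while the rounding noise stays under the floor.

    Rounding to a step of 2**bits adds roughly white noise of variance
    step**2 / 12; its expected power in one windowed DFT bin is that
    variance times the window energy. offset_db > 0 is more aggressive,
    offset_db < 0 is more careful.
    """
    floor_power, window_energy = min_bin_power(block)
    allowed = floor_power * 10 ** (offset_db / 10)
    bits = 0
    while bits < max_bits:
        step = 2 ** (bits + 1)
        noise_power = (step * step / 12) * window_energy
        if noise_power > allowed:
            break
        bits += 1
    return bits
```

With this rule a block of digital silence or a clean, bin-centred tone has no measurable floor and loses nothing, while a noisy block loses several bits; a real implementation would use a proper FFT and longer analysis blocks.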


How is this applied to FLAC? FLAC has a nice feature called "wasted_bits". If it finds that all bits below a certain bit are consistently zero, it simply stores "the bottom 3 bits are all zeros" and then spends no more effort encoding them. It checks this once per frame. In FLAC, frames can be variable length, but current encoders use a fixed 4096 sample length.

This means if you have a 24-bit file, but it only contains 16-bit audio data (i.e. the bottom 8 bits are zero throughout) then FLAC encodes it just as efficiently as a 16-bit file. The only overhead is a few bits every 4096 samples saying "wasted_bits=8".

It also means that if, say, you have a normal 16-bit CD and you find that the noise floor during a certain 4096 samples never falls below the 12th bit, you can set bits 13-16 to zero, then feed the result to FLAC, and it will automatically use a lower bitrate for that frame than if you fed it all 16 bits.
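As a rough illustration of the two halves of this trick, the Python sketch below zeroes the low bits of a block of 16-bit samples and then counts how many low bits are zero in every sample, which is the property a FLAC encoder can exploit. Rounding to the nearest multiple (rather than plain truncation) and the clipping rule are assumptions, and `zero_low_bits` and `wasted_bits` are hypothetical helper names, not part of any FLAC API.

```python
def zero_low_bits(samples, bits):
    """Round each 16-bit sample to the nearest multiple of 2**bits."""
    if bits == 0:
        return list(samples)
    step = 1 << bits
    highest = (32767 >> bits) << bits  # largest representable multiple of step
    out = []
    for s in samples:
        q = ((s + step // 2) >> bits) << bits  # round to nearest multiple
        out.append(max(-32768, min(highest, q)))  # keep within 16-bit range
    return out


def wasted_bits(samples):
    """Count the low-order bits that are zero in every sample of the block."""
    combined = 0
    for s in samples:
        combined |= s  # a bit is set here if it is set in any sample
    if combined == 0:
        return 0  # all-zero block: nothing meaningful to report
    count = 0
    while (combined & 1) == 0:
        combined >>= 1
        count += 1
    return count
```

For example, `wasted_bits(zero_low_bits(block, 4))` is at least 4 for any block that is not pure silence, so the encoder has four fewer bits per sample to code in that frame.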

Hence "lossy FLAC" is a wav pre-processor for regular lossless FLAC. The interim stage is a "lossy" wav file with 0s in some least significant bits. The final output is a 100% compliant FLAC, which faithfully reproduces this "lossy" wav file. The lossy stage is therefore the pre-processor, and the processed "lossy" wav file, when encoded to FLAC, results in a lower bitrate than the original wav file when encoded to FLAC.


Potentially the quality is very near to what you started with, and more than good enough for many applications. In most places where mp3 doesn't work, I believe that lossy FLAC will.


On music which FLAC already compresses very well, lossy FLAC gives little advantage. Often it does exactly nothing (full 16 bits preserved), or nearly nothing (the last bit or two dropped occasionally). On music which causes the FLAC bitrate to go comparatively high, lossy FLAC usually brings a significant gain. I've seen bitrates fall by 20%-50%. Still, it's not low bitrate encoding, and it's pure VBR.


Problem samples? I don't know - I'm hoping some HA regulars can lend their ears and detective skills here. Standard lossy codec problem samples are probably not that relevant. Wavpack lossy problem samples are more relevant, but lossy FLAC does seem to spot some of these and either quantises less aggressively or not at all (i.e. encoding is pure lossless).


So what can people download? Well, sadly, I'm not a C programmer. I'm attaching a MATLAB script that works as a lossy FLAC pre-processor. You run a .wav file through this, and then encode it to FLAC as normal.

If you haven't got MATLAB, but have an idea for a useful sample to test, upload it to HA (maximum 30 seconds; shorter=better because MATLAB is slow and the code isn't optimised at all!) and I'll upload a lossy FLAC version when I get a chance.


I'll post more about the algorithm later.

Cheers,
David.

P.S. the attachment should be "lossyFLAC.m" but HA won't allow me to upload .m, so I've changed it to .txt.
Attached File(s)
Attached File  lossyFLAC.txt ( 8.14K ) Number of downloads: 1374
 
2Bdecided
post Jun 13 2007, 12:41
Post #2





Thanks. If you know of anything which wavpack and dualstream can't handle at 350k, that might be an interesting test.


The lossy FLAC bitrate will never be competitive in this incarnation for two reasons:

1) it's just a preprocessor, so it has to work within the limits of the host format (in this case, FLAC).
2) it doesn't use any noise shaping.

It would be interesting to see the lossy FLAC method of setting the noise floor integrated into something like wavpack lossy, with or without wavpack lossy's noise shaping.


btw, the most interesting problem sample I found was "short block test 2". Lossy FLAC does absolutely nothing to it, judging the noise floor to be at or below the 16th bit. Hence it gets encoded losslessly at 137kbps.

Cheers,
David.
goodnews
post Jun 13 2007, 12:46
Post #3





Group: Banned
Posts: 232
Joined: 20-January 06
Member No.: 27228



I am opposed to calling any lossy implementation of FLAC still FLAC. FLAC has positioned itself as a "Free Lossless Audio Codec" (its name) and changing the name or what it means now would be detrimental and confusing to users, I believe. FLAC has also stood for LOSSLESS -- that's why so many people use and like it (no loss of audio quality/data).

Please call your variant LFLAC or something other than FLAC please to avoid confusion with older decoders/hardware implementations which likely won't support any lossy variant. Thanks!

This post has been edited by goodnews: Jun 13 2007, 12:46
2Bdecided
post Jun 13 2007, 13:17
Post #4





QUOTE (goodnews @ Jun 13 2007, 12:46) *
Please call your variant LFLAC or something other than FLAC please to avoid confusion with older decoders/hardware implementations which likely won't support any lossy variant. Thanks!


As currently implemented, it is a pre-process to a standard FLAC encoder.

As such, it is 100% compatible with all FLAC compliant decoders, requires no change to the format, and the final file will be a standard .flac file from a standard FLAC encoder.

Given your concerns, this should scare you far more than the name (which can be anything - well, anything sensible).


However, I've already addressed this concern earlier in the thread: if users can't be trusted to tag (or not to untag) lossy FLAC files properly, the only way to recognise them is from something at the FLAC frame level (the "wasted_bits" data already tells you what is happening), or by spotting rows of 0s in the LSBs of the decoded audio data.
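The second check, spotting rows of 0s in the LSBs of decoded audio, could look something like the Python sketch below. It is only a heuristic under stated assumptions: the 4096-sample block size, the 50% threshold, and the names `trailing_zero_bits` and `looks_preprocessed` are all hypothetical, and a file whose low bits are zero for other reasons (for example 16-bit audio padded to 24 bits) would also be flagged.

```python
def trailing_zero_bits(block):
    """Low-order bits that are zero in every sample, or None for silence."""
    combined = 0
    for s in block:
        combined |= s
    if combined == 0:
        return None  # digital silence says nothing either way
    # combined & -combined isolates the lowest set bit
    return (combined & -combined).bit_length() - 1


def looks_preprocessed(samples, block_size=4096, min_fraction=0.5):
    """Heuristic: True if enough non-silent blocks have zeroed LSBs."""
    counts = []
    for start in range(0, len(samples), block_size):
        tz = trailing_zero_bits(samples[start:start + block_size])
        if tz is not None:
            counts.append(tz)
    if not counts:
        return False
    flagged = sum(1 for c in counts if c > 0)
    return flagged / len(counts) >= min_fraction
```

The blocks do not have to line up exactly with the frames used by the pre-processor: a misaligned block simply reports the smaller of the two neighbouring counts.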


If an incompatible "LFLAC" format can do the job better (i.e. more efficiently; same performance in fewer bits) than standard FLAC with the lossy FLAC pre-processor, then it'll probably be created, and you'll have nothing to worry about.

However, the beauty of lossy FLAC (as a pre-processor) is that it's compatible with all the FLAC implementations out there. Unless "LFLAC" brings big advantages, making an intentionally incompatible "LFLAC" format just to hold lossy FLAC data won't stop the problem you envisage: On day 1, nothing will play it back, but it could easily be transcoded losslessly (i.e. maintaining the same losses!) into standard FLAC, maintaining the bitrate advantage and playing back correctly on everything that supports FLAC. So if I or someone else were to force a different lossy FLAC / LFLAC format onto the world, people would transcode it to standard FLAC to get it to play on various devices. Then there would be exactly the same "lossy FLAC" files that I've provided above.


Look at it this way: at least with lossy FLAC there's an easy way to check that it's lossy. However, if someone transcodes a traditional high bitrate lossy file without a lowpass to FLAC and gives it to you, the only way of knowing is by listening.


It's ironic that you're facing this problem because FLAC is open source. If it was closed source, I'd never have been able to do this.

Don't panic though. This is still at the "proof of concept" stage. It might not work. If it does work, I'm sure someone will implement it properly, and they might not base that implementation on FLAC at all.

Cheers,
David.

This post has been edited by 2Bdecided: Jun 13 2007, 13:20
goodnews
post Jun 13 2007, 13:51
Post #5








David,

I understand more about what you are attempting, but I still don't like FLAC being "forked" like this. Not that you can't do it legally (i.e. open source). But Josh has said before that FLAC hasn't been "forked" in all the years that it has been out, and I believe that "forking" it now would damage FLAC's reputation unless the name and file extension were changed.

When I see a FLAC file, I know it's lossless. FLAC is synonymous with lossless. Changing it to a "forked" lossy version, where a FLAC file could now be lossless or could be lossy, would confuse many people and IMO detract from the name and reputation among users that FLAC has built up all these years.

I suggest you use a different extension, .LFL or .LFLAC instead of .FLAC, to avoid any chance of confusion. Look at how Apple uses .M4A for lossy and now Apple Lossless. You just can't always easily tell in 3rd party apps if you are playing a lossy or lossless file (other than perhaps by the file size). Many apps will choke on an Apple Lossless .M4A file as they think it is an MPEG-4 (AAC audio) file.

I'd hate to see the FLAC name and file extension "bastardized" to mean "it could be lossless or it could be lossy, your guess?" My vote is for FLAC to remain FLAC (LOSSLESS) and please choose some other file extension for a lossy implementation of FLAC, if you so desire to make one.