Near-lossless / lossy FLAC, An idea & MATLAB implementation
2Bdecided
post Jun 12 2007, 20:31
Post #1


ReplayGain developer


Group: Developer
Posts: 5089
Joined: 5-November 01
From: Yorkshire, UK
Member No.: 409



This is an (unoriginal) idea / work in progress. I make no claims for it, but it might be interesting or useful for someone. It is not competitive with wavpack lossy. It is not "finished" either! As far as I know, it is 100% compatible with existing recent lossless FLAC implementations.


The idea is simple: lossless codecs use a lot of bits coding the difference between their prediction and the actual signal. The more complex (hence, unpredictable) the signal, the more bits this takes up. However, the more complex the signal, the more "noise like" it often is. It seems silly spending all these bits carefully coding noise / randomness.

So, why not find the noise floor, and dump everything below it?

This isn't about psychoacoustics. What you can or can't hear doesn't come into it. Instead, you perform a spectrum analysis of the signal, note what the lowest spectrum level is, and throw away everything below it. (If this seems a little harsh, you can throw in an offset to this calculation, e.g. -6dB to make it more careful, or +6dB to make it more aggressive!).


How is this applied to FLAC? FLAC has a nice feature called "wasted_bits". If it finds all bits below a certain bit are consistently zero, it simply stores: "the bottom 3 bits are all zeros" and then takes no more effort in encoding them. It checks this once per frame. In FLAC, frames can be variable length, but current encoders use a fixed 4096 sample length.

This means if you have a 24-bit file, but it only contains 16-bit audio data (i.e. the bottom 8 bits are zero throughout) then FLAC encodes it just as efficiently as a 16-bit file. The only overhead is a few bits every 4096 samples saying "wasted_bits=8".
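To make the "wasted_bits" idea concrete, here is a minimal Python sketch that counts how many low-order bits are zero in every sample of a block. It is only an illustration, not FLAC's own code: the function name `wasted_bits` and the choice to report 0 for an all-zero block are assumptions.

```python
def wasted_bits(samples):
    """Count the low-order bits that are zero in every sample of a block."""
    combined = 0
    for s in samples:
        combined |= s  # any bit set in any sample ends up set here
    if combined == 0:
        return 0  # assumption: an all-zero block reports nothing wasted
    count = 0
    while combined & 1 == 0:
        combined >>= 1
        count += 1
    return count

# 16-bit audio stored in a 24-bit container: every sample is a multiple of 256
print(wasted_bits([256, -512, 768]))
```

For the 24-bit file holding 16-bit audio described above, every sample is a multiple of 256, so the sketch reports 8.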

It also means that if, say, you have a normal 16bit CD and you find the noise floor during a certain 4096 samples never falls below the 12th bit, you can set bits 13-16 to zero, then feed the result to FLAC, and it will automatically use a lower bitrate for that frame than if you fed it all 16 bits.
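The bit-dropping step itself could look something like the following Python sketch. It rounds each sample to the nearest multiple of 2^bits rather than simply masking the low bits, and it clamps so that rounding up near full scale cannot overflow. The name `round_to_bits`, the rounding rule and the clamping are all assumptions made for illustration, not a description of the attached MATLAB script.

```python
def round_to_bits(samples, bits_to_remove, sample_bits=16):
    """Round samples so their lowest `bits_to_remove` bits become zero."""
    if bits_to_remove <= 0:
        return list(samples)
    step = 1 << bits_to_remove
    half = step >> 1
    lowest = -(1 << (sample_bits - 1))
    highest = (1 << (sample_bits - 1)) - 1
    max_multiple = (highest // step) * step  # largest multiple of step that fits
    out = []
    for s in samples:
        q = ((s + half) // step) * step      # nearest multiple of step
        out.append(min(max(q, lowest), max_multiple))
    return out

print(round_to_bits([5, -5, 32767, 7], 3))  # [8, -8, 32760, 8]
```

Feeding the result to a normal FLAC encoder is then enough: every sample in the block is a multiple of 8, so the encoder can mark three bits as wasted.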

Hence "lossy FLAC" is a wav pre-processor for regular lossless FLAC. The interim stage is a "lossy" wav file with 0s in some least significant bits. The final output is a 100% compliant FLAC, which faithfully reproduces this "lossy" wav file. The lossy stage is therefore the pre-processor, and the processed "lossy" wav file, when encoded to FLAC, results in a lower bitrate than the original wav file when encoded to FLAC.


Potentially the quality is very near to what you started with, and more than good enough for many applications. In most places where mp3 doesn't work, I believe that lossy FLAC will.


On music which FLAC already compresses very well, lossy FLAC gives little advantage. Often it does exactly nothing (full 16 bits preserved), or nearly nothing (the last bit or two dropped occasionally). On music which causes the FLAC bitrate to go comparatively high, lossy FLAC usually brings a significant gain. I've seen bitrates fall by 20%-50%. Still, it's not low bitrate encoding, and it's pure VBR.


Problem samples? I don't know - I'm hoping some HA regulars can lend their ears and detective skills here. Standard lossy codec problem samples are probably not that relevant. Wavpack lossy problem samples are more relevant, but lossy FLAC does seem to spot some of these and either quantises less aggressively or not at all (i.e. encoding is pure lossless).


So what can people download? Well, sadly, I'm not a C programmer. I'm attaching a MATLAB script that works as a lossy FLAC pre-processor. You run a .wav file through this, and then encode it to FLAC as normal.

If you haven't got MATLAB, but have an idea for a useful sample to test, upload it to HA (maximum 30 seconds; shorter=better because MATLAB is slow and the code isn't optimised at all!) and I'll upload a lossy FLAC version when I get a chance.


I'll post more about the algorithm later.

Cheers,
David.

P.S. the attachment should be "lossyFLAC.m" but HA won't allow me to upload .m, so I've changed it to .txt.
Attached File(s)
Attached File  lossyFLAC.txt ( 8.14K ) Number of downloads: 1321
 
Replies
2Bdecided
post Jun 12 2007, 20:41
Post #2





For those who don't want to read MATLAB code but want to know what's happening...

The algorithm is quite simple. Pick two FFT sizes - one long one, useful for catching tonal signals, one short one, useful for catching transients. Find out where the quantisation noise due to truncating at each bit will fall in these sized FFTs. Store this data in a look-up table.
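One way to build such a look-up table is analytically, as in the Python sketch below. It assumes the rounding error for a step of 2^b is uniform white noise with power step²/12, and that a windowed FFT shows white noise at an average bin power of (noise power) × (sum of squared window values). The Hann window, the function names and the analytic approach are assumptions; the table could equally be measured by quantising test noise.

```python
import math

def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]

def noise_floor_table(fft_size, max_bits=16):
    """table[b] = expected FFT bin level (dB) of noise from rounding to 2**b."""
    window_energy = sum(w * w for w in hann(fft_size))
    table = []
    for b in range(max_bits + 1):
        step = 2.0 ** b
        noise_power = step * step / 12.0  # uniform quantisation noise
        table.append(10 * math.log10(noise_power * window_energy))
    return table

for size in (1024, 64):  # hypothetical long and short FFT sizes
    table = noise_floor_table(size)
    print(size, [round(level, 1) for level in table[:4]])
```

Each extra bit removed raises the entry by about 6.02 dB, which is the usual "6 dB per bit" rule.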

Now go through the audio file. For each 4096-sample block, look at the long and short FFTs across that block separately, and find the lowest value in each, look up the implied number of wasted bits for each, and then use the lowest value of wasted bits to round the audio in that block to that many bits.
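The per-block decision described above can be sketched in Python as follows. This is a simplified illustration and not a port of the MATLAB script: it leaves out any spreading and frequency-range limiting, and the FFT sizes (1024 and 64), the Hann window, the half-window hop and all function names are assumptions. A small pure-Python FFT is included only to keep the example self-contained.

```python
import cmath
import math
import random

def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def min_level_db(block, fft_size):
    """Lowest bin power (dB) over half-overlapping windowed FFTs of the block."""
    window = hann(fft_size)
    lowest = float("inf")
    for start in range(0, len(block) - fft_size + 1, fft_size // 2):
        frame = [block[start + i] * window[i] for i in range(fft_size)]
        spectrum = fft(frame)
        for k in range(1, fft_size // 2):  # skip DC and Nyquist
            power = abs(spectrum[k]) ** 2
            lowest = min(lowest, 10 * math.log10(max(power, 1e-12)))
    return lowest

def bits_for_level(level_db, fft_size, max_bits=15):
    """Largest bit count whose rounding noise stays at or below level_db."""
    energy = sum(w * w for w in hann(fft_size))
    bits = 0
    for b in range(1, max_bits + 1):
        noise_db = 10 * math.log10((2.0 ** b) ** 2 / 12.0 * energy)
        if noise_db <= level_db:
            bits = b
    return bits

def choose_bits(block, offset_db=0.0, fft_sizes=(1024, 64)):
    """Bits to remove for one block: the smaller answer of the two FFT sizes."""
    return min(bits_for_level(min_level_db(block, n) + offset_db, n)
               for n in fft_sizes)

rng = random.Random(0)
noisy_block = [rng.randint(-8000, 8000) for _ in range(4096)]
silent_block = [0] * 4096
print(choose_bits(noisy_block), choose_bits(silent_block))
```

A positive `offset_db` raises the measured floor and so removes more bits; a negative one is more careful. A block of digital silence always comes out as 0 bits removed in this sketch.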

Job done.

However, there are some "bodges" in there.

Firstly, a frequency range is specified. FFT bins outside this frequency range won't be checked. Otherwise, a sharp 20kHz low pass filter in the original would force "wasted_bits" to zero simply to maintain a -96dB noise floor above 20kHz.
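Restricting the check to a frequency range just means picking which FFT bins to look at. A minimal Python sketch, with a made-up function name and a purely hypothetical 20 Hz to 16 kHz range (the actual limits used in the script are not stated here):

```python
import math

def bins_in_range(fft_size, sample_rate, low_hz, high_hz):
    """Indices of FFT bins whose centre frequency lies in [low_hz, high_hz]."""
    bin_width = sample_rate / fft_size
    first = math.ceil(low_hz / bin_width)
    last = min(math.floor(high_hz / bin_width), fft_size // 2)
    return range(first, last + 1)

checked = bins_in_range(1024, 44100, 20, 16000)
print(checked[0], checked[-1])  # 1 371
```

Bins above the upper limit, where a sharp low-pass filter leaves almost nothing, are then never considered when looking for the lowest value.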

Secondly, the FFTs are "spread" before finding the lowest value. This isn't some clever psychoacoustic ear/masking spreading function - it's just a simple average. The reason is quite simple: in almost any windowed FFT, you'll get some bins into which almost no energy falls. This really isn't significant, but if we didn't ignore these bins, they'd force us to keep all the bits all the time. As it is, I've averaged over 4 bins using a rectangular spreading function before finding the lowest. If this gives you cause for concern, this should allay your fears: there are still enough low bins that 8-bit dither, pasted into a 16-bit file, is still encoded with 10-bit accuracy! In other words, when encoding pure noise, there's still a 2-bit "safety margin". Whether this works for all signals is one reason I'd like people to listen.
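The spreading step amounts to a running average across neighbouring bins before taking the minimum. A small Python sketch follows; averaging linear power rather than magnitude or dB is an assumption, as is the function name.

```python
def spread(power, width=4):
    """Rectangular running average over `width` adjacent FFT bins."""
    return [sum(power[i:i + width]) / width
            for i in range(len(power) - width + 1)]

bins = [1.0, 1.0, 0.0, 1.0, 1.0, 1.0]  # one bin happens to be empty
print(min(bins), min(spread(bins)))    # 0.0 0.75
```

The single empty bin no longer drags the minimum down to zero, so it cannot force every bit to be kept.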

Thirdly, it's trivial to shift the thresholds, so I've put that feature in, though set it to 0 by default.


There are issues which remain to be solved.

It seems to work OK with clipped files, which is a surprise, because a positive clipped integer sample (e.g. 16bit audio) is all ones, hence wasted_bits=0. I need to look into this. Converting to 24-bits and dropping the audio by 6dB would be a solution (already implemented) if this was a problem.

There is no checking of the mid or side channels yet. Ideally, the algorithm should check mid and side in the same way as left and right, and pick the global noise floor. One caveat is that any channel which is digital silence (or "near" digital silence - there's a can of worms) needs to be ignored.
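A possible shape for the mid/side check, as a Python sketch only: derive mid and side the way FLAC does (mid is the sum shifted right by one, side is the difference), then drop any channel that is pure digital silence before analysing the rest. The function name is made up, and "near" digital silence is deliberately not handled.

```python
def channels_to_check(left, right):
    """Return the channels worth analysing, skipping digitally silent ones."""
    mid = [(l + r) >> 1 for l, r in zip(left, right)]
    side = [l - r for l, r in zip(left, right)]
    candidates = {"left": left, "right": right, "mid": mid, "side": side}
    # any() is False for a channel that is zero throughout
    return {name: ch for name, ch in candidates.items() if any(ch)}

mono_as_stereo = channels_to_check([100, -200, 300], [100, -200, 300])
print(sorted(mono_as_stereo))  # ['left', 'mid', 'right']
```

With identical left and right channels the side channel is all zeros and is ignored, so it cannot drag the global noise floor down to nothing.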

You can run many many generations with lossy FLAC before problems arise. I've gone to 50 generations with trivial processing and dithering at each generation. The quantisation noise was 1-2 bits higher in the 50th generation than in the first lossy FLAC generation. If this is a problem (I couldn't hear it) I assume you could set a -12dB noise threshold offset to solve it, though the efficiency would decrease dramatically.

Finally, this will lead to FLAC files that look like they're lossless (because FLAC is normally lossless) but are in fact lossy. Never fear! A simple utility (someone else can write one) to check the value of "wasted_bits" will soon tell you what you have. Real FLAC files almost never have non-zero "wasted_bits". Lossy FLAC files will have loads.
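A proper checker would read "wasted_bits" from the FLAC stream itself. As a rough stand-in, the Python sketch below works on already-decoded integer samples and reports, block by block, how many low bits are zero throughout. The 4096-sample block size matches the frame length mentioned earlier, and the function names are assumptions.

```python
def trailing_zero_bits(value):
    """Number of zero bits below the lowest set bit (0 for value == 0)."""
    return (value & -value).bit_length() - 1 if value else 0

def block_wasted_bits(samples, block_size=4096):
    """Per-block count of low-order bits that are zero in every sample."""
    result = []
    for start in range(0, len(samples), block_size):
        combined = 0
        for s in samples[start:start + block_size]:
            combined |= s
        result.append(trailing_zero_bits(combined))
    return result

print(block_wasted_bits([8, -16, 24, 1, 2, 3], block_size=3))  # [3, 0]
```

A file where most blocks report a non-zero count has very likely been through this kind of pre-processing; ordinary CD rips should show zeros almost everywhere.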


To answer the obvious question about bitrates, here is a screen grab from foobar2k showing the bitrates of some wavpack lossy problem samples in lossy FLAC.
Attached Image



Here is an unrelated file containing a random mixture of music samples from a recent listening test.
This is the waveform (top view) and lossy FLAC quantisation noise (bottom view)
Attached Image


This is a graph of the number of bits removed (i.e. the quantisation / rounding level) in each FLAC frame/block:
Attached Image


Obviously the quantisation noise and number of bits removed are correlated (perfectly).

Cheers,
David.

This post has been edited by 2Bdecided: Jun 12 2007, 20:53
2Bdecided
post Jun 12 2007, 20:52
Post #3





Here are some examples - only a couple, because I'm on dial up.

The originals are elsewhere on HA - do a search if you want to grab them to compare.


Penultimate comment: I have no plans for a lossy+correction=lossless version. It would be possible to do it crudely with two FLAC files (one lossy, one residual) and adding them; or smartly by integrating this within FLAC itself. Not my job. Not sure it's worth it.

Finally (for now), as discussed in a recent thread, if you can't hear above 16kHz, then you can often reduce FLAC bitrates by resampling to 32kHz. Combining that with lossy FLAC pre-processing brings the bitrate down still further. I'm almost tempted to use it.


Let the hunt for problem samples begin!

Cheers,
David.
Attached File(s)
Attached File  07_Furious_lossy.flac ( 286.78K ) Number of downloads: 688
Attached File  09_SeriousTrouble_lossy.flac ( 218.13K ) Number of downloads: 506
 


