Topic: Near-lossless / lossy FLAC

Near-lossless / lossy FLAC

Reply #100
I think "Developers" can upload files anywhere. "Members" can only upload in the uploads forum. I guess you can PM a moderator to become a "Developer".

Thanks for the info.

Of course you can discuss your pre-processor here, though if we're both going to get people to ABX, it might get a bit confusing. I guess it depends if you expect them to merge, or not.

I am not an open source zealot, but for a useful discussion, you'll have to share pretty much exactly what it's doing - otherwise there's little hope of finding relevant problem samples without doing an exhaustive test.

I understand.

Currently I only want to see whether my approach is useful at all. It's quite possible that your method is superior; if so, it makes no sense to put more effort into mine. It's really very, very simple, and it had to be simple in 1997, when my Pentium 133 provided very limited processing power...

What it does: It performs a very simple estimation of the expected residuals (what is likely to remain after sending the signal through the linear predictor?). Then it calculates log2 of the mean of the residuals, subtracts the quality threshold (expressed in bits) and uses the difference as the shift value. That's all.

It's some kind of CBR compression with some variation introduced by the estimation error of the residuals.
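
A minimal sketch of the scheme described above, for readers who want to experiment. This is not TAK's code: the first-order difference used as a stand-in for the linear predictor, the function names and the per-block handling are all assumptions made for illustration. Only the "log2 of the mean residual, minus a threshold in bits, used as a shift" step comes from the post.

```python
import math

def estimate_residuals(samples):
    """Crude residual estimate: first-order difference.
    (Assumed stand-in for the codec's real linear predictor.)"""
    return [abs(b - a) for a, b in zip(samples, samples[1:])]

def shift_for_block(samples, quality_bits):
    """Number of low bits to drop for one block of integer samples."""
    residuals = estimate_residuals(samples)
    mean_res = sum(residuals) / len(residuals) if residuals else 0.0
    if mean_res < 1.0:
        return 0  # residuals already tiny: keep every bit
    shift = math.floor(math.log2(mean_res)) - quality_bits
    return max(shift, 0)

def drop_bits(samples, shift):
    """Zero the lowest `shift` bits by truncation (arithmetic shift)."""
    return [(s >> shift) << shift for s in samples]
```

With a fixed `quality_bits`, roughly the same number of residual bits survives in every block, which is why the result behaves like CBR apart from the estimation error.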

Probably you are right: it's better to open a new thread to avoid confusion. But it would be nice to compare the results to check whether my approach makes any sense.

If so, then I can think about an implementation as part of a preprocessor or integrated into TAK. Currently I am just curious and maybe a bit lazy: actually I should be working on the far less interesting tasks for the next TAK lossless release...

  Thomas

Near-lossless / lossy FLAC

Reply #101
I have just opened a new thread to discuss my preprocessor implementation.

  Thomas

 

Near-lossless / lossy FLAC

Reply #102
... Can you try these:
http://64.41.69.21/technical/reference/keys_1644ds.wav...

Just tried keys with wavPack lossy @ 350 kbps fast mode: terrible. triangle-2 from that page is very ugly too.
I was afraid, after having heard furious, that there may be real-life sound that makes wavPack lossy look pretty bad.
Setting for my DAP (see signature) btw is excellent in comparison to plain wavPack usage - 32 kHz sampling frequency and s0.4 is a bit of an anti-killer setting. I did a lot of listening tests with it this weekend and I'm very content.

Anyway this shows that a good quality control would be very much welcome for wavPack lossy. Usually a pretty moderate bitrate yields excellent results, but it's not always the case. Maybe this thread encourages David Bryant to go along this way.
lame3995o -Q1.7 --lowpass 17

Near-lossless / lossy FLAC

Reply #103
Wow! They're killer samples for this algorithm, and FLAC itself. I think they're still transparent (can you try ABX please?) but look at the bitrates (all FLAC)...

keys_1644ds:
lossless: 1078kbps (ratio=0.764)
lossy: 829kbps (ratio=0.587)

....

To me keys_1644ds_lossy.flac is transparent too.
I wouldn't care much about bitrate bloat of such samples. A robust extremely high quality is what counts, as well as average bitrate of different genres.
lame3995o -Q1.7 --lowpass 17

Near-lossless / lossy FLAC

Reply #104
I agree with halb27. A humongous lossy file in those cases is actually what a good VBR is supposed to do, to maintain the sound quality.

In any case, here's a sample I'd like you to try, 2Bdecided.

Very "Easy" Sample

This is a typical example of the kind of obnoxious music that I think compresses best with things such as WavPack lossy. It requires 1200+ kbps to be mathematically lossless, but even at 200 kbps and below it still sounds transparent to me. Maybe shadowking or others can try to ABX it at WavPack/Optimfrog lowest quality settings, I think it's transparent.

The reason to test this sample would be to test how dynamic the range of 2Bdecided's VBR algorithm is. Will it choose a very low bitrate, and if it does, will it still be transparent? By the way, this sample isn't clipped; I made sure. Upon extraction, EAC reported the song's normalization as 98.8% amplitude.

I plan to install foobar2k next weekend and I can try to ABX some of the samples, sorry I didn't have enough time this past weekend. I figure I need to dedicate about a day to play with foobar first, and another day to do actual listening tests. But just to let people know, never from the start did I consider myself to have terrific hearing. I think I probably only have fairly good hearing, and it isn't as good as it used to be either, I'm in my late 20's. And even when I was in my teens, I think I had perfect undamaged hearing but when my friends and I did "single" blind-tests back in the early days of mp3, one of my friends easily defeated me in being able to differentiate certain things.

Near-lossless / lossy FLAC

Reply #105
Simple tonal classical music should be tested. Corelli and bruhns samples from Guruboolez come to mind.


... Can you try these:
http://64.41.69.21/technical/reference/keys_1644ds.wav...

Just tried keys with wavPack lossy @ 350 kbps fast mode: terrible. triangle-2 from that page is very ugly too.
I was afraid, after having heard furious, that there may be real-life sound that makes wavPack lossy look pretty bad.
Setting for my DAP (see signature) btw is excellent in comparison to plain wavPack usage - 32 kHz sampling frequency and s0.4 is a bit of an anti-killer setting. I did a lot of listening tests with it this weekend and I'm very content.

Anyway this shows that a good quality control would be very much welcome for wavPack lossy. Usually a pretty moderate bitrate yields excellent results, but it's not always the case. Maybe this thread encourages David Bryant to go along this way.


True. These cases will never happen on CD though. They are a good reference for tuning. The thing is that the preprocessor is averaging 450~550k on VBR. If I match that bitrate with WavPack ABR, all results are also transparent. It would be very impressive to have a WavPack preset averaging 320k that can adapt to these cases. So far this isn't the case, not even with dualstream, but it's close, and I don't know if it's possible without noise shaping (VBR + shaping). The WavPack 4.x encoder should handle these fine.

Near-lossless / lossy FLAC

Reply #106
... These cases will never happen on CD though. ...

Why not? That's exactly what I am afraid of.
I am not so much afraid of killer samples originating from a certain kind of electronic music (because that's not 'my' music) - but an isolated loud triangle or similar might well reach my musical horizon.
Sure, things have to be put into perspective. We can't expect a full-frequency encoding with 350 kbps fast mode to be good or acceptable at everything. 'My' 350 kbps fast mode 32 kHz resampling result isn't transparent either, though it is acceptable.
lame3995o -Q1.7 --lowpass 17

Near-lossless / lossy FLAC

Reply #107
It's nothing but test signal stuff, if you read the pcabx page and all its warnings of possible equipment and hearing damage.

I agree though that some CD content can be similar, but the effect will be a much more subtle noise. The fast and even normal modes were never designed for robust compression and quality. I would go for 350k high modes and even -ans. I never liked too many hacks. Actually I think at these middle bitrates -ans will be an advantage, as you will get gains in these situations and you have enough masking to not hear the -ans 'working'.

Near-lossless / lossy FLAC

Reply #108
I agree with halb27. A humongous lossy file in those cases is actually what a good VBR is supposed to do, to maintain the sound quality.

In any case, here's a sample I'd like you to try, 2Bdecided.

Very "Easy" Sample

This is a typical example of the kind of obnoxious music that I think compresses best with things such as WavPack lossy. It requires 1200+ kbps to be mathematically lossless, but even at 200 kbps and below it still sounds transparent to me. Maybe shadowking or others can try to ABX it at WavPack/Optimfrog lowest quality settings, I think it's transparent.

The reason to test this sample would be to test how dynamic the range of 2Bdecided's VBR algorithm is. Will it choose a very low bitrate, and if it does, will it still be transparent? By the way, this sample isn't clipped; I made sure. Upon extraction, EAC reported the song's normalization as 98.8% amplitude.
This one is very interesting!

Guess how many "bits of resolution per sample" lossy FLAC thinks this needs?

On average, 5-6, with some moments only needing 4.

In other words, lossy FLAC is sometimes dropping 12 of the original 16 bits!

This means, despite the original sample not clipping, the resulting file often clips. For this reason, I've done two versions - one at full amplitude (which clips) and one at 50% amplitude 24-bit resolution (though most of these bits are subsequently dumped: typically 17-18!).

If you want to ABX the reduced amplitude version, you'll have to enable ReplayGain, or use the "lossless half amplitude" version I've included below as a reference.

This is the one that clips:
[attachment=3366:attachment]444kbps

This is the one that's 6dB quieter, and doesn't clip
[attachment=3367:attachment]342kbps

This is a lossless 6dB quieter file for ABX comparison with the above
[attachment=3368:attachment]1269kbps


Given the aggressive processing this has received by lossy FLAC, I would really appreciate it if people could try to ABX.


Quote
I plan to install foobar2k next weekend and I can try to ABX some of the samples, sorry I didn't have enough time this past weekend. I figure I need to dedicate about a day to play with foobar first, and another day to do actual listening tests. But just to let people know, never from the start did I consider myself to have terrific hearing. I think I probably only have fairly good hearing, and it isn't as good as it used to be either, I'm in my late 20's. And even when I was in my teens, I think I had perfect undamaged hearing but when my friends and I did "single" blind-tests back in the early days of mp3, one of my friends easily defeated me in being able to differentiate certain things.
Well, you can try. Foobar2k ABX is easy enough to use - I don't think you'll need a day! More like one minute...

Cheers,
David.

Near-lossless / lossy FLAC

Reply #109
To my untrained ear, 342kbps sounds *very* nice indeed - using fb2k & earbuds on my laptop - not a semi-pro ABX, just comparable to how I would actually listen to it.

However, I can't play it on my iPAQ because GSPFlac.dll does not seem to handle this type of FLAC. Converted to WAV and GSPlayer still fails - hmmmmm....... something's wrong with my hardware. Nope, nothing wrong with hardware - just a 16bit limitation - it falls over with >16bit samples. Played all the other comparator samples in 69/79 and was pleased not to notice any degradation (caveat: on earbuds on an iPAQ). Very pleased.
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

Near-lossless / lossy FLAC

Reply #110
...This means, despite the original sample not clipping, the resulting file often clips. ...

I must have a misconception about your preprocessor.
I thought you're just zeroing a certain amount of least significant bits of each sample according to what your machinery thinks it can safely do so. But then clipping couldn't occur with your preprocessor, and shouldn't occur with FLAC of course.
What's wrong with my imagination?
lame3995o -Q1.7 --lowpass 17

Near-lossless / lossy FLAC

Reply #111
I zero by rounding, not truncation.

Maybe I'll try truncation instead.

The problem is that it will introduce a DC bias which will accumulate if you go through many generations of processing. However, it should solve the clipping problem.
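
The DC bias mentioned above can be checked numerically. The sketch below is an illustration only, not the pre-processor's code: the function names and the choice of 6 dropped bits are assumptions. It measures the average error of one pass of truncation versus one pass of rounding.

```python
def truncate(x, bits):
    """Zero the lowest `bits` bits, always moving towards minus infinity."""
    return (x >> bits) << bits

def round_nearest(x, bits):
    """Zero the lowest `bits` bits, rounding to the nearest step (ties up).
    No clamping is done, so the result can exceed the input range."""
    half = 1 << (bits - 1)
    return ((x + half) >> bits) << bits

def mean_error(quantise, bits, lo=-1024, hi=1024):
    """Average of (output - input) over every integer in [lo, hi)."""
    return sum(quantise(x, bits) - x for x in range(lo, hi)) / (hi - lo)

# One pass with 6 bits dropped (step size 64):
truncation_bias = mean_error(truncate, 6)      # about half a step downwards
rounding_bias = mean_error(round_nearest, 6)   # close to zero
```

Truncation shifts the average by roughly half a quantisation step per pass, which is the offset that can build up when a signal is reprocessed with other steps in between. Rounding leaves almost no net offset.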

Cheers,
David.

Near-lossless / lossy FLAC

Reply #112
Okay, trying to understand the maths - you have a 16 bit signed number between -32768 and +32767, so if (not clear on the nomenclature here) rounding / truncating was to 10 significant bits (-512 to 511), you might get:

0101101011101000 > 0101101011000000

or am I missing the point? Or, do you merely create a bitstream of signed 10bit elements?
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

Near-lossless / lossy FLAC

Reply #113
Here is the version with truncation (round down, if you like) instead of standard rounding...

[attachment=3369:attachment]This is 16bits

I don't think this file will sound any different, but the errors are all one way (down), which means the DC level has been shifted by up to 6.5% in places. EDIT: I can imagine a file where the DC jump would be audible, so I don't like this method, but see what you think.


Re Clipping: If you are going to round or truncate samples, some of the results will hit digital full scale (0dB FS). This isn't strictly "clipping", even though many audio editors will report it as such. Clipping is strictly where the (analogue) signal should be larger than 0dB FS, but was clipped at 0dB FS because it couldn't go any higher in the digital domain. In comparison, 0dB FS is exactly where lossy FLAC wanted to put those samples: no higher, no lower. So there's never clipping, but there can be digital full scale samples which weren't at digital full scale before.

This isn't a problem. The problem is that digital waveforms have positive and negative peaks (obviously!). The limits, i.e. negative and positive digital full scale, are not equal because we have an even number of values available (2 to the power n values, where n is the bitdepth), but must include zero in the middle, which leaves an odd number of values to split between positive and negative - hence we cannot have the same number of positive and negative values available. e.g. for 16-bit audio, 0 is zero(!), negative full scale is -32768, positive full scale is +32767.

With rounding, you could head towards positive digital full scale. This would be +32768, which you can't have. That's the problem. In this case, lossy FLAC was hitting +32767, which is binary all ones, i.e. no wasted bits. In contrast, truncation always works: you'll never hit +32768 because you're always rounding down, and the largest number you started with was only +32767. Since you can have -32768, truncation (rounding down) is always OK.
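
The full-scale argument above can be reproduced in a few lines. This is a sketch under assumptions: the helper names are made up, 6 dropped bits is an arbitrary choice, and the clamp is only a guess at how a rounded value ends up at +32767.

```python
INT16_MIN, INT16_MAX = -32768, 32767

def round_bits(x, bits):
    """Round to the nearest multiple of 2**bits (ties up), without clamping."""
    half = 1 << (bits - 1)
    return ((x + half) >> bits) << bits

def truncate_bits(x, bits):
    """Round down to a multiple of 2**bits."""
    return (x >> bits) << bits

def fits_int16(x):
    return INT16_MIN <= x <= INT16_MAX

def clamp16(x):
    """Force a value back into the 16-bit range (assumed fallback)."""
    return max(INT16_MIN, min(INT16_MAX, x))

top = round_bits(32767, 6)        # 32768: one past positive full scale
safe = truncate_bits(32767, 6)    # 32704: still representable
bottom = round_bits(-32768, 6)    # -32768: negative full scale is fine
```

Clamping the rounded value gives +32767, whose low bits are all ones, so the bits that were meant to be zeroed are not zeroed for that sample.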

Hope this makes sense.

Cheers,
David.

Near-lossless / lossy FLAC

Reply #114
To my untrained ear, 342kbps sounds *very* nice indeed - using fb2k & earbuds on my laptop - not a semi-pro ABX, just comparable to how I would actually listen to it.

However, I can't play it on my iPAQ because GSPFlac.dll does not seem to handle this type of FLAC. Converted to WAV and GSPlayer still fails - hmmmmm....... something's wrong with my hardware. Nope, nothing wrong with hardware - just a 16bit limitation - it falls over with >16bit samples. Played all the other comparator samples in 69/79 and was pleased not to notice any degradation (caveat: on earbuds on an iPAQ). Very pleased.


You can have that file in 16-bits.

You're not losing anything in this case, since lossy FLAC has already dumped most of the bits!

If lossy FLAC wanted to keep all 16-bits, then staying at 16-bits but reducing the amplitude will effectively raise the noise floor by 6dB, which is why I don't do it by default. However, given the 16-bit hardware limitation and the other issues with clipping, I think this should certainly be an option, as with the file I've attached here.
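
The 6dB figure follows from losing one bit. The sketch below (hypothetical helper names, not taken from the pre-processor) contrasts halving inside 16 bits with moving to a 24-bit container at 50% amplitude.

```python
import math

def halve_16bit(x):
    """Half amplitude, staying at 16 bits: the lowest bit is lost."""
    return x >> 1

def to_24bit_half_amplitude(x):
    """16-bit sample -> 24-bit container at 50% amplitude.
    A full-scale conversion would be x * 256; half amplitude is x * 128."""
    return x * 128

def back_to_16bit(y):
    """Undo to_24bit_half_amplitude exactly."""
    return y // 128

# Halving the signal while the quantisation step stays the same costs
# one bit of resolution, i.e. 20*log10(2) dB relative to the signal.
noise_floor_rise_db = 20 * math.log10(2)   # about 6.02 dB
```

In the 24-bit case every original 16-bit value survives the round trip, so nothing is lost by the attenuation itself. In the 16-bit case neighbouring values such as 2 and 3 collapse onto the same output.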

Cheers,
David.

Near-lossless / lossy FLAC

Reply #115
Truncating can be bad.

I have a killer test sample where truncating is ABXable.

It's also an interesting sample for checking lossy FLAC's block boundaries. Despite visible glitches in this killer test sample, I don't think they are audible.


This is the original file:
[attachment=3371:attachment]

This is the truncated version:
[attachment=3372:attachment]I can ABX this

This is the rounded (hence clipped) version:
[attachment=3373:attachment]I can't ABX this, but maybe someone can?

This is how it should be done (24 bits, half amplitude):
[attachment=3374:attachment]


This is a half amplitude 16-bit version for Nick:
[attachment=3377:attachment]


There must be a similar test case which can be bad for rounding. I'm going to give it more thought.

Cheers,
David.

Near-lossless / lossy FLAC

Reply #116
What about rounding towards zero? Should prevent clipping and avoid a systematic DC offset.
lame3995o -Q1.7 --lowpass 17

Near-lossless / lossy FLAC

Reply #117
That would put a kink in the transfer function, a bit like a bad class B amp. This would introduce harmonic (and aliased inharmonic) distortion that may or may not be audible. It would also reduce the peak to peak and RMS amplitude, which could, in extreme cases, be perceived as a slight reduction in loudness.
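
The "kink" can be seen by counting how many input values land on each output level. This is an illustrative sketch: the function name and the choice of 6 dropped bits are assumptions.

```python
def round_toward_zero(x, bits):
    """Zero the lowest `bits` bits, moving towards zero for both signs."""
    step = 1 << bits
    sign = 1 if x >= 0 else -1
    return (abs(x) // step) * step * sign

def inputs_per_level(level, bits, lo=-200, hi=200):
    """How many integers in [lo, hi) are mapped onto `level`."""
    return sum(1 for x in range(lo, hi) if round_toward_zero(x, bits) == level)

zero_bin = inputs_per_level(0, 6)    # inputs -63..63 all collapse to 0
next_bin = inputs_per_level(64, 6)   # inputs 64..127 map to 64
```

The zero level swallows almost twice as many inputs as any other level. That is the dead zone around the zero crossing that resembles crossover distortion in a class B amplifier. The output magnitude also never exceeds the input magnitude, which is why clipping cannot occur but the level drops slightly.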

There's no perfect solution to this, so I guess it's a case of finding the least imperfect. The problem is, which one is least imperfect probably depends on the application, and it would be nice to avoid that complexity.


btw, the issue in my previous post is an example of noise modulation. This is an effect of having no dither, or the wrong dither. I didn't expect to find a real world sample where this would be audible, but to solve my own killer sample I might have to add a dither option.
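
For reference, a sketch of what a dither option could look like. Nothing here is taken from the pre-processor: the function name, the `kind` values and the scaling are assumptions. Random noise is added before rounding so that the quantisation error depends less on the signal.

```python
import math
import random

def quantise_with_dither(x, bits, kind="triangular", rng=random):
    """Round x to a multiple of 2**bits after adding dither noise.

    kind: "none", "rectangular" (uniform, 1 step wide) or
          "triangular" (sum of two uniforms, 2 steps wide).
    rng:  anything with a uniform(a, b) method, e.g. random.Random(seed).
    """
    step = 1 << bits
    if kind == "none":
        d = 0.0
    elif kind == "rectangular":
        d = rng.uniform(-0.5, 0.5) * step
    elif kind == "triangular":
        d = (rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)) * step
    else:
        raise ValueError("unknown dither kind: " + kind)
    return int(math.floor((x + d) / step + 0.5)) * step
```

Without dither the same input always gives the same error, so the noise follows the signal. With triangular dither the output varies from call to call, and the error stays within 1.5 steps of the input.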

Cheers,
David.

Near-lossless / lossy FLAC

Reply #118
Truncating can be bad.

I second that!

My lossy approach is currently truncating. I tried the problem sample (for TAK-Lossy) "keys_1644ds" (reported by halb27) with rounding and it sounded considerably better. With truncating I would have to reduce the wasted sample bits count by about 0.7 (on average) to achieve a similar quality.

Near-lossless / lossy FLAC

Reply #119
I zero by rounding, not truncation.

Maybe I'll try truncation instead.

The problem is that it will introduce a DC bias which will accumulate if you go through many generations of processing. However, it should solve the clipping problem.

Cheers,
David.

I am only curious. How does the MATLAB rounding-function work? Round to even?

Near-lossless / lossy FLAC

Reply #120
I am only curious. How does the MATLAB rounding-function work? Round to even?
By default, MATLAB works in double precision floats, and stores audio data over the range +/-1. It's only converted back to integers when writing to .wav. To round, I'm multiplying by 2^n, using round(), and then dividing by 2^n. The round function itself just rounds like you would at school.
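
The same multiply, round, divide step can be sketched in Python. This is an approximation of the procedure described, not the actual MATLAB script: the helper names are made up, and `n` is taken to be the exponent mentioned in the post. MATLAB's round() sends ties away from zero, whereas Python's built-in round() sends ties to the nearest even value, so the school-style rule is written out explicitly.

```python
import math

def round_half_away(v):
    """Round to the nearest integer, ties away from zero (school rounding)."""
    sign = 1 if v >= 0 else -1
    return sign * math.floor(abs(v) + 0.5)

def requantise(x, n):
    """x is a float sample in the +/-1 range.
    Multiply by 2**n, round, divide by 2**n."""
    scale = 2 ** n
    return round_half_away(x * scale) / scale

# A sample just below positive full scale rounds up to exactly 1.0,
# which is the float equivalent of +32768 in 16-bit terms.
near_full_scale = requantise(32767 / 32768, 9)
```

This also shows where the positive overflow comes from: the value 1.0 has no 16-bit integer counterpart when the file is written back to .wav.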

Gotta go - serious lightning storm here!

Near-lossless / lossy FLAC

Reply #121
So it looks like real rounding is necessary.
For a general procedure that avoids clipping, this seems to mean always working in 24 bit and shifting the input down by 1 bit. As this reduces the final bitrate according to your sample, a higher precision seems to be necessary in order to arrive at the same final resolution.
Or is my understanding wrong?
lame3995o -Q1.7 --lowpass 17

Near-lossless / lossy FLAC

Reply #122
...
This is the one that's 6dB quieter, and doesn't clip
[attachment=3367:attachment]342kbps

This is a lossless 6dB quieter file for ABX comparison with the above
[attachment=3368:attachment]1269kbps
...
Given the aggressive processing this has received by lossy FLAC, I would really appreciate it if people could try to ABX.

Can't abx annoyingloudsong, neither the half amplitude version nor the "clipped" version.
lame3995o -Q1.7 --lowpass 17

Near-lossless / lossy FLAC

Reply #123
Becoming more concerned about the capabilities of the iPAQ - it plays most FLAC files I've tried with pretty good reproduction, however......

I tried the 4 noise test samples but got weird harmonics running throughout each and all quite different - more worryingly, harmonics were not the same on consecutive repeats of the same file. Played them on my laptop (2GHz T2500, 1GB) and couldn't tell them apart.

Like the way the discussion is going...... Oh, just a thought - instead of bitshifting right by 1, why not divide by Root-2 then round? Would this help the clipping?

Ditto inability to ABX annoyingloudsong (Laptop / iPAQ).
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

Near-lossless / lossy FLAC

Reply #124
Thanks for all the testing and input halb27, Nick.C, shadowking, TBeck, SebG, smok3, and for the file Porcupine. Special thanks to Josh Coalson and David Bryant.


From the testing so far, I'm thinking the default behaviour for a pre-processor should be this:

fix clipping = If file clips extensively, attenuate by 31/32 and retain at least 5 bits*
output bitdepth = input bitdepth
dither = none
threshold shift = 0dB
bit reduction method = round
frame_size = dynamic* or 1024 (lossless target format dependent)
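
The post does not say why 31/32 goes together with "at least 5 bits". One reading that fits the arithmetic is that attenuating by 31/32 leaves just enough headroom for rounding never to reach +32768, as long as 5 or more bits are kept. The sketch below checks that reading. The function names are made up, and the floor-based attenuation is an assumption.

```python
def round_to_kept_bits(x, kept_bits, depth=16):
    """Round a sample to `kept_bits` of resolution (ties up), no clamping."""
    drop = depth - kept_bits
    if drop == 0:
        return x
    half = 1 << (drop - 1)
    return ((x + half) >> drop) << drop

def attenuate(x, num=31, den=32):
    """Scale by num/den using integer arithmetic (rounds down)."""
    return (x * num) // den

peak = attenuate(32767)                        # 31743
# Keeping 5 bits means a step of 2048; 31743 rounds down to 15 * 2048.
worst_case = round_to_kept_bits(peak, 5)       # 30720, inside the 16-bit range
# Without attenuation the same rounding overflows:
unattenuated = round_to_kept_bits(32767, 5)    # 32768
```

With only 4 bits kept, the attenuated peak would still round up to 32768, which matches the "at least 5 bits" condition.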


For advanced users, other options should be available:

fix clipping: do nothing / default* / default+dither* / For 16-bit files: change to 24-bits and 50% amplitude / For 24-bit files: 50% amplitude with dither
dither: default / rectangular* / triangular* / noise shaped* (maybe!)
threshold shift = anything
frame_size = default* / fixed: 1024 / 2048 / 4096 / 8192 etc (lossless codec and sample rate dependent)

* = not tried / implemented yet.


If this was integrated properly into a lossless codec, there could be three changes/improvements:

1. I like SebG's suggestions of subtractive dither.
2. The clipping handling should be transparent to the user, efficient, and free from confusing options; e.g. internally it could be 50% amplitude 24-bit, but it should be bounced back to full amplitude 16-bit by the decoder (if the original was 16-bit).
3. The dynamic frame size should be tightly integrated / optimised with the codec.

EDIT: either implementation (pre-processor or integrated) could have a hybrid mode added, so you have a lossless correction file too. Obviously in the pre-processor version, you'd need a post-processor to stitch them back together, so this isn't very useful for listening unless support becomes widespread, but it works for "archiving", whatever people mean by that.


I'll implement some of the pre-processor behaviour described above when I get chance.

In the meantime, I'm still looking for problem samples.

Cheers,
David.