Topic: Near-lossless / lossy FLAC

Near-lossless / lossy FLAC

Reply #150
ShadowKing, I take it that those samples are lossless FLAC?
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)


Reply #152
ShadowKing's samples.
Code: [Select]
Sample                                         WAV   FLAC   PP10
================================================================
10 - Dungeon - The Birth- The Trauma Begins           919    453
A02_metamorphose                                      846    507
aps_Killer_sample                                     929    484
Moon_short                                            834    550
velvet                                                957    516
================================================================
Average (kbps)                                1411    897    502
% of WAV                                      100%  63.6%  35.6%
% of FLAC                                            100%  56.0%
================================================================
=======================================================
No artifacts noticeable.

Reply #153
Playing about with the code, I've added a "choose_bits_to_remove" parameter - which is used as follows:

Code: [Select]
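        % choose_bits_to_remove==0: most conservative choice, the smallest value in the table
        if (choose_bits_to_remove==0),
            bits_to_remove(block_number)=min(min(bits_to_remove_table));
        else
            % otherwise: floor of the overall mean, plus (choose_bits_to_remove-1) extra bits
            bits_to_remove(block_number)=floor(mean(mean(bits_to_remove_table)))+(choose_bits_to_remove-1);
        end;
        % cap the removal at bs-minimum_bits_to_keep
        bits_to_remove(block_number)=min(bits_to_remove(block_number),bs-minimum_bits_to_keep);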
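
For anyone following along outside Octave, here is a minimal Python sketch of the selection rule above. It assumes bits_to_remove_table is a rectangular table of per-analysis, per-channel values and that bs is the sample word length in bits (16 here). The function name and the defaults are illustrative only and are not part of the script.

```python
import math


def choose_bits(table, choose_bits_to_remove, bs=16, minimum_bits_to_keep=5):
    """Pick the number of bits to remove for one block.

    table: list of rows (hypothetical layout: one row per analysis,
    one column per channel) of bits-to-remove estimates.
    """
    values = [v for row in table for v in row]
    if choose_bits_to_remove == 0:
        # Most conservative: the smallest estimate in the table.
        bits = min(values)
    else:
        # For a rectangular table, mean(mean(table)) is just the overall mean.
        overall_mean = sum(values) / len(values)
        bits = math.floor(overall_mean) + (choose_bits_to_remove - 1)
    # Cap the removal at bs - minimum_bits_to_keep.
    return min(bits, bs - minimum_bits_to_keep)


if __name__ == "__main__":
    example = [[2, 3], [4, 6]]
    for setting in range(5):
        print(setting, choose_bits(example, setting))
```

The cap in the last line is what minimum_bits_to_keep (MBTK) controls: raising it only matters for blocks where the first step asks for a lot of bits.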


To my ears (combined with minimum_bits_to_keep=5) the transparency threshold is at a choose_bits_to_remove (BTR) setting of about 3 or 4. Setting minimum_bits_to_keep (MBTK) to 6 improves BTR=4. The bitrate reduction is fairly significant:

Code: [Select]
Samples: 10 - Dungeon - The Birth- The Trauma Begins, 41_30sec, A02_metamorphose, 
annoyingloudsong, aps_Killer_sample, Atem_lied, ATrain, birds,
E50_PERIOD_ORCHESTRAL_E_trombone_strings, eig, glass_short, jump_long, Moon_short,
rach_original, rawhide, S13_KEYBOARD_Harpsichord_C, S30_OTHERS_Accordion_A,
S34_OTHERS_GlassHarmonica_A, S35_OTHERS_Maracas_A, S53_WIND_Saxophone_A, thewayitis,
VELVET

|=====|=========================|
| WAV | 53,763,880 (1411.2kbps) |
|FLAC | 29,767,971 ( 781.2kbps) |
|=====|=========================|=========================|=========================|
|     |         MBTK=5          |         MBTK=6          |         MBTK=7          |
|=====|=========================|=========================|=========================|
|BTR0 | 17,209,767 ( 451.7kbps) | 17,209,767 ( 451.7kbps) | 17,256,277 ( 452.9kbps) |
|BTR1 | 16,052,243 ( 421.3kbps) | 16,052,243 ( 421.3kbps) | 16,110,776 ( 422.9kbps) |
|BTR2 | 13,259,455 ( 348.0kbps) | 13,313,411 ( 349.4kbps) | 13,530,611 ( 355.2kbps) |
|BTR3 | 10,814,615 ( 283.9kbps) | 11,025,396 ( 289.4kbps) | 11,369,979 ( 298.4kbps) |
|BTR4 |  8,959,432 ( 235.1kbps) |  9,288,634 ( 243.9kbps) |  9,732,593 ( 255.5kbps) |
|=====|=========================|=========================|=========================|
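
As a rough cross-check on the table, the kbps figures appear to follow from scaling the 1411.2kbps WAV rate by the ratio of file sizes. That is an assumption about how the numbers were produced, not something stated in the post. This Python sketch lands within a few tenths of a kbps of most entries.

```python
WAV_BYTES = 53_763_880   # total WAV size from the table
WAV_KBPS = 1411.2        # 16-bit, 44.1kHz, stereo PCM


def kbps(encoded_bytes, wav_bytes=WAV_BYTES, wav_kbps=WAV_KBPS):
    """Estimate average bitrate by scaling the WAV bitrate by the size ratio."""
    return encoded_bytes / wav_bytes * wav_kbps


if __name__ == "__main__":
    sizes = {
        "BTR0 / MBTK=5": 17_209_767,
        "BTR2 / MBTK=5": 13_259_455,
        "BTR2 / MBTK=6": 13_313_411,
        "BTR3 / MBTK=7": 11_369_979,
    }
    for label, size in sizes.items():
        print(f"{label}: {kbps(size):.1f}kbps")
```

By this calculation the 13,313,411-byte BTR2 / MBTK=6 result works out to roughly 349kbps, in line with its neighbours in the same row.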

Reply #154
You should start to pick up some hiss below 300kbps. Sometimes turning up the volume reveals it; otherwise these encodes are artifact free.

Dungeon - added hiss on the baby crying
Velvet - noise moving around the beats (doom-chik-doom-chik)
Atem_lied - hissing on the phone-ringing part
41_30sec - cymbals sound 'dusty'
metamorphose - hiss on the HF bits
Moon_short - slight hiss

Reply #155
Which BTR were you using? MBTK=7 (or maybe 8?) may help. My "testing" is on earbuds at moderate volume - suitable for an office environment at lunch. It also replicates my most likely playback environment.

Reply #156
Nick,

My gut feeling (and I haven't tried it yet) is that this will introduce audible problems.

Near the start of this thread, halb27 ABXed some samples with 6dB and 12dB more noise than default. From the bitrates, it looks like you're pushing it even further than that.


I've been working to solve the problem sample I managed to manufacture. It's fixed now with rectangular or triangular dither, which I've finally implemented properly. I still think it's a waste of time for most content, but it's nice to have the option.

I'll upload when I get the chance.

Cheers,
David.

Reply #157
Good afternoon David,

In a way, I'm looking for an acceptable bitrate / quality balance - my DAP of choice plays FLAC, and this method of bitrate reduction feels "cleaner" than moving to a full-blown lossy codec. Your original concept has proven itself - how far it can be pushed whilst maintaining "acceptable" quality is another matter. I see this as analogous to the LAME -V0 .. -V9 options.

Looking forward to the revised source to chew on.....

Reply #158
If you want to force the bitrate lower, you can do any or all of the following (with predictable results)...


* Resample to 32kHz
-  (removes frequencies above 16kHz)

* Reduce the bitdepth (e.g. 14-bits, 12-bits) within the 16-bit file
-  (introduces fixed noise)
      (either pre-process, or force "bits_to_remove" to always be above a certain number)

* ReplayGain (or just reduce the volume) before encoding
-  (makes it quieter!)

* Use a positive noise_threshold_shift
-  (introduces variable noise)


Part of what you've done is similar to just reducing the bitdepth, but might be less predictable.
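
Of the options above, the fixed bit-depth reduction is the easiest to show in code. A minimal Python sketch, assuming signed 16-bit integer samples and plain rounding with no dither (the function name is made up for illustration and is not taken from the script):

```python
def reduce_bit_depth(samples, keep_bits, container_bits=16):
    """Round integer samples to keep_bits of precision inside a container_bits word.

    The low (container_bits - keep_bits) bits of every output sample are zero.
    No dither is applied in this sketch.
    """
    drop = container_bits - keep_bits
    if drop <= 0:
        return list(samples)
    step = 1 << drop
    lowest = -(1 << (container_bits - 1))
    highest = (1 << (container_bits - 1)) - 1
    highest_multiple = highest - (highest % step)  # largest representable multiple of step
    out = []
    for s in samples:
        q = ((s + step // 2) // step) * step       # round to the nearest multiple of step
        out.append(max(lowest, min(q, highest_multiple)))
    return out


if __name__ == "__main__":
    print(reduce_bit_depth([5, 6, -5, 32767, 0], keep_bits=14))
```

The variable approach in the script differs from this in that the number of dropped bits changes from block to block instead of being fixed for the whole file.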

I'll post some numbers in a moment...

Reply #159
I grabbed all the files from the Atem_lied to thewayitis test set.

Regular flac: 728kbps
Lossy flac: 524kbps
Lossy flac nts+6dB: 457kbps

Regular flac RG: 756kbps (! didn't help, because most of these files are quiet!)
Regular flac RG 32k: 592kbps
Lossy flac RG 32k: 441kbps
Lossy flac RG 32k nts+6dB: 386kbps
Lossy flac RG 32k nts+12dB: 328kbps
Lossy flac RG 32k nts+24dB: 230kbps

I also tried annoyingloudsong:

Regular flac: 1252kbps
Lossy flac: 411kbps

Regular flac RG 32k: 828kbps
Lossy flac RG 32k nts+6dB: 266kbps
Lossy flac RG 32k nts+12dB: 211kbps
Lossy flac RG 32k nts+24dB: 133kbps


I ran all these tests with triangular dither. With the caveat that the block switching might not be debugged, I've attached my latest script.


Resampling to 32kHz is normally transparent for me, but won't be for people who can hear above 16kHz.

nts+24dB sounds awful - like an FM radio with a very weak signal
nts+12dB sounds OK. The hiss is audible if you listen carefully. It's probably OK for you Nick.
nts+6dB sounds good. It's probably ABXable, but I didn't try.

Cheers,
David.


Reply #161
Looking at the analysis times (1.5ms and 20ms) and the corresponding FFT_Length for each, I was wondering why the times aren't chosen so that they correspond exactly to a power-of-two FFT_Length, with no rounding of the exponent needed?

Using time=10^(log10(2)*bits-log10(fs)), i.e. time=2^bits/fs, yields (for fs=44100)

time (bits=6, fft_length=64) = approx. 1.451ms;
time (bits=10, fft_length=1024) = approx. 23.220ms;

and for the extra analysis:

time (bits=8, fft_length=256) = approx. 5.805ms;

Could there be a benefit in tuning the analysis time exactly to the fft_length?
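
The figures quoted in this post can be reproduced with a few lines of Python. The sketch assumes fs=44100, which is what the quoted times imply; the helper names are illustrative.

```python
import math


def analysis_time_ms(bits, fs=44100):
    """Duration in milliseconds of an FFT window of 2**bits samples at rate fs."""
    return (2 ** bits) / fs * 1000.0


def analysis_time_ms_log_form(bits, fs=44100):
    """Same quantity written the way the post does: 10^(log10(2)*bits - log10(fs))."""
    return 10 ** (math.log10(2) * bits - math.log10(fs)) * 1000.0


if __name__ == "__main__":
    for bits in (6, 8, 10):
        print(bits, 2 ** bits, round(analysis_time_ms(bits), 3))
```

Both forms give the same value up to floating-point error, since 10^(log10(2)*bits) is just 2^bits.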

Reply #162
Hi Nick,

I kind of picked the times off the top of my head. They seemed like good times.

As you've seen, they're converted into numbers of samples the way they are, so you get something close to those times that's a power of 2, irrespective of sampling frequency. It could be neater (it's "closest" on a log scale, which may or may not be ideal), but I can't see any advantage to picking exact times.

There can't be any times that will convert to exact powers of 2 for 32kHz, 44.1kHz and 48kHz sampling.

If you want to avoid the log calculation, use a look-up table, either to approximate the calculation or to specify sample counts directly for common sample rates. However, I think there are other log calculations later in the code that you can't avoid.
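
To make the "closest on a log scale" idea concrete, here is a hedged Python sketch. It assumes the conversion amounts to rounding log2(time * fs) to the nearest integer, which matches the description here but may not be exactly how the script does it. The dictionary at the end stands in for the look-up table mentioned above.

```python
import math


def fft_length(time_s, fs):
    """Power of two nearest to time_s * fs, measured on a log2 scale."""
    return 2 ** round(math.log2(time_s * fs))


ANALYSIS_TIMES = (0.0015, 0.020)  # seconds: the 1.5ms and 20ms analyses

# Precomputed table for common sample rates, so no log is needed at run time.
FFT_LENGTHS = {
    fs: tuple(fft_length(t, fs) for t in ANALYSIS_TIMES)
    for fs in (32000, 44100, 48000)
}

if __name__ == "__main__":
    for fs, lengths in FFT_LENGTHS.items():
        print(fs, lengths)
```

Under that assumption the short analysis maps to the same length at all three rates, while the long one comes out one power of two smaller at 32kHz than at 44.1kHz and 48kHz.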

Cheers,
David.


btw, do the 32kHz sampled files play OK on your portable?

 

Reply #163
Oops - didn't reply to the samples - NTS6 and NTS12 play fine, NTS24 is full of hiss - probably to be expected due to the noise added.

Been playing with the number of analyses and fft_lengths:

5 analyses (4,6,8,10,12 bits) and following BTR variant (btr_type=4)

Code: [Select]
        % summary statistics over every analysis and channel for this codec block
        btr_sum  = sum(sum(bits_to_remove_table));
        btr_min  = min(min(bits_to_remove_table));
        btr_max  = max(max(bits_to_remove_table));
        btr_size = number_of_analyses * channels;
        
        if (btr_type==0),
            bits_to_remove(codec_block_number)=btr_min;
        else
            % mean with the single highest and lowest values dropped, plus (btr_type-1)/2 bits, rounded down
            bits_to_remove(codec_block_number)=max(0,floor((btr_sum-btr_min-btr_max)/(btr_size-2)+(btr_type-1)/2));
        end;

        % keep at least minimum_bits_to_keep bits
        bits_to_remove(codec_block_number)=bs-max((bs-bits_to_remove(codec_block_number)),minimum_bits_to_keep);


This gave me *really* nice sounding results (got a pair of Sennheiser canal phones for my iPAQ) at 272kbps for the sample set used previously.

Reply #164
So let me see if I've got this right...

You're doing FFTs of sizes 2 to the power 4, 6, 8, 10, and 12.

You were taking the mean bits-to-remove across the block, but now you're adding them together, subtracting the highest and lowest values, dividing by something which isn't quite the number of values, and also dropping an extra 1-2 bits.

I'll have to give it a listen. It can't be magic (or, I would think, universally transparent!), but maybe it hides the worst noise where it's least obvious.

For a laugh, tell me how long it takes to run your five-analysis version in Octave.

Cheers,
David.

Reply #165
Basically I'm calculating the mean of all the values (disregarding the highest and lowest), then adding 1.5 bits and finally rounding down.

e.g. bits_to_remove_table=[2,3,4,5,6],[3,4,5,6,6]: sum=44, so (44-2-6)/(10-2) = 36/8 = 4.5; add 1.5 = 6!
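
The same worked example as a small Python sketch of the btr_type rule. It assumes bs=16 and minimum_bits_to_keep=5 as defaults and a table with more than two entries; the function name is illustrative.

```python
import math


def trimmed_bits_to_remove(table, btr_type, bs=16, minimum_bits_to_keep=5):
    """Trimmed-mean variant: drop one highest and one lowest value, then offset.

    table: list of rows of per-analysis, per-channel bits-to-remove values.
    """
    values = [v for row in table for v in row]
    if btr_type == 0:
        bits = min(values)
    else:
        trimmed_mean = (sum(values) - min(values) - max(values)) / (len(values) - 2)
        # btr_type=4 adds (4-1)/2 = 1.5 bits before rounding down.
        bits = max(0, math.floor(trimmed_mean + (btr_type - 1) / 2))
    # Keep at least minimum_bits_to_keep bits.
    return bs - max(bs - bits, minimum_bits_to_keep)


if __name__ == "__main__":
    table = [[2, 3, 4, 5, 6], [3, 4, 5, 6, 6]]
    for btr_type in range(6):
        print(btr_type, trimmed_bits_to_remove(table, btr_type))
```

Only a single maximum and a single minimum are discarded, even when several entries share that value, which is why the divisor is the table size minus 2.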

Oh, analysis takes a very long time..........

but...... tried 5,7,9 & 11 with btr_type=4 (i.e. add 1.5 bits) and get 292kbps, but with less analysis time.

Reply #166
I see. I'm unsure as to why it's not (btr_size-2*channels).

I suspect you'll get more noise (possibly audible) for highly tonal and highly transient signals.

All else being equal, forcing an extra bit to remove is the same as using a +6dB noise threshold shift (except when bits to remove would have been zero with the former).
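
The equivalence comes from each removed bit doubling the quantisation step, which raises the noise amplitude by 20*log10(2) dB. A quick Python check (the helper name is illustrative):

```python
import math


def noise_shift_db(extra_bits):
    """Rise in quantisation noise level, in dB, from removing extra_bits more bits."""
    return 20 * math.log10(2 ** extra_bits)


if __name__ == "__main__":
    for bits in (1, 1.5, 2, 4):
        print(bits, round(noise_shift_db(bits), 2))
```

So the nts+6dB, nts+12dB and nts+24dB settings correspond to roughly 1, 2 and 4 extra bits removed.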

It should be fine for what you want it for.

Cheers,
David.

Reply #167
2*channels would remove 4 values. I only want to remove the highest and the lowest analysis value (i.e. 2), and take the mean of the rest.

To be perfectly frank, I'm trying lots of permutations and seeing how the results pan out. I have two loops set up, number_of_analyses=2:5 and btr_type=0:5, on top of the existing loop through the 21 samples. Output files are named .AxBy.wav, where x=number of analyses and y=btr_type. Leave simmering for quite a while and you get some results to listen to.

Oh, I had to modify wavread and wavwrite to read / write integer values, and modify your script to do the same, as 3 copies of the audio data were causing my machine to run out of memory.......

Love the concept - like the fact that I can get good quality at 300 - 350kbps on the sample set.

Reply #168
Glad you're having fun with it Nick. For myself, I'd feel more comfortable with mp3 at those bitrates, but I could be convinced.

You mentioned modifying wavread and wavwrite. It sounds like a good idea. I don't have to be so careful with 4GB of RAM, but hopefully eventually I (or someone) will implement disk buffering so it doesn't matter.

It's great that you're playing with it and finding useful ways to get good quality at lower bitrates, but there is a hard ceiling with this approach. I don't want to sound negative, but you're adding flat noise, and experience suggests this becomes audible for problem samples at ~300-400kbps, and audible for many things much below this.


For the future, I'm wondering how well psychoacoustic based noise shaping would work with this. Not instead of what's there already, but as an optional alternative. You could obviously throw away more bits, but the peak level would increase (dramatically in some cases) and you must hit a point where FLAC (or whatever) finds it harder to compress.

Bryant has mentioned this before, as has SebG...

http://www.hydrogenaudio.org/forums/index....showtopic=11623

It's more complicated than what's in there at present. I might try it just for the fun(!) of it, but I'm off on holiday so it won't be for a while.

Cheers,
David.

Reply #169
If it was easy, anyone could do it......

I'm a totally unskilled amateur in audio processing - but having immense fun. Have a good holiday!

[edit]
Had a rethink on the forcing extra bits to be removed and reverted back to the simplistic mean(mean(bits_to_remove_table)) alternative - but still using 4 analyses,  fft_length =2^(5,7,9,11).

Had a look at the triangular dither and found a Gaussian variant on Wikipedia and at this link: http://www.musicdsp.org/showone.php?id=121. Currently using (sum of 8 separate Rand(block_size,channels) - 4)/8.

Planning to do a lot of conversion for DAP use - now to come up with a method of preserving tags.......
[/edit]
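
For comparison, here is a minimal Python sketch of the three dither shapes discussed in this thread: rectangular, triangular, and the "sum of 8 uniforms" Gaussian-like variant from the edit above. It generates one value at a time in units of one LSB; the scaling relative to the actual quantisation step and the function names are assumptions made for illustration.

```python
import random


def rectangular(rng):
    """Uniform dither in [-0.5, 0.5)."""
    return rng.random() - 0.5


def triangular(rng):
    """Sum of two uniforms, giving a triangular distribution over [-1, 1)."""
    return rng.random() + rng.random() - 1.0


def gaussian_like(rng, n=8):
    """(sum of n uniforms - n/2) / n; for n=8 this is (sum - 4) / 8."""
    return (sum(rng.random() for _ in range(n)) - n / 2) / n


if __name__ == "__main__":
    rng = random.Random(1)
    for name, fn in (("rectangular", rectangular),
                     ("triangular", triangular),
                     ("gaussian_like", gaussian_like)):
        values = [fn(rng) for _ in range(10000)]
        print(name, min(values), max(values))
```

Note that the averaged variant stays within [-0.5, 0.5) but is concentrated near zero, so its noise power is lower than that of the triangular version.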

Reply #170
Back to using 2 analyses (6 & 10 bit fft_length), using gaussian dither with 32 repeats and your fix_clipped=2 - forcing bits to be removed again - the dither *seems* to mask the extra bit loss.

Anyway, I can't seem to upload the results (no webspace of my own), so I can't submit for constructive criticism.

Removing up to 2 extra bits (1/3 bit at a time) over the mean, I can reduce 63.1MiB of WAV to between 20.3MiB and 11.7MiB of .ss.flac (lossless flac = 36.1MiB) for the 25 files in my sample set.

Reply #171
You can upload in the uploads forum here.

("Developers" can upload in normal threads - don't ask me, I found it by accident)

On its own, "forcing some bits to be removed always" just raises the noise floor a little. Most people are more than happy with 14-bits (~FM BBC Radio 3 on very good equipment kind of quality). It's a good strategy to have as an option - I'll certainly merge it into my code when I get back.

btw, even just rectangular dither solved the problem sample I created, but if you're forcing just audible noise, your dither choice would be subjective. EDIT: followed the dither link. Dubious information. IIRC Gaussian isn't proven to remove all harmonic distortion or noise modulation, where triangular is perfect in both regards. Rectangular is only perfect in the former - it can leave noise modulation (though we're adding some of that anyway!).

(Nice holiday so far, but the weather will probably be poor tomorrow).

Cheers,
David.

Reply #172
Glad the holiday's going well!

I managed to upload some files; see:
http://www.hydrogenaudio.org/forums/index....showtopic=56129

- if there are any other samples anyone would wish to be processed, let me know. Samples uploaded for information basically - the bitrate is dramatically reduced in most cases.

Basically, I'm playing with dither now - the Gaussian variant was easy to implement, so worth a try at least.

I've processed a few albums now and they typically reduce to about 1/3rd of the lossless FLAC size post processing. So, from a magpie's perspective I can fit 3 times as many on my DAP!

Reply #173
I'm not certain that lossyFLAC will never work at the bitrates you seem to want, but in the meantime, have you heard of mp3?

Cheers,
David.

Reply #174
Mp3 - fuzzat den?  Yes, I could use Lame or aoTuV as one way of doing this, but I'm having much fun playing around with the script and it's costing me nothing.........

On reflection, I'll probably fall back to your original script after learning for myself why I wouldn't want to remove any more bits.

Still playing with dither and another possible variant on conditional fix_clipped.