Near-lossless / lossy FLAC, An idea & MATLAB implementation
Nick.C
post Jun 28 2007, 14:27
Post #151


lossyWAV Developer


Group: Developer
Posts: 1815
Joined: 11-April 07
From: Wherever here is
Member No.: 42400



ShadowKing, I take it that those samples are LossLess FLAC?


--------------------
lossyWAV -q X -a 4 --feedback 4| FLAC -8 ~= 320kbps
shadowking
post Jun 28 2007, 14:31
Post #152





Group: Members
Posts: 1530
Joined: 31-January 04
Member No.: 11664



QUOTE (Nick.C @ Jun 28 2007, 23:27) *
ShadowKing, I take it that those samples are LossLess FLAC?


Yes.


--------------------
Wavpack -b450s0.7
Nick.C
post Jun 28 2007, 16:23
Post #153



ShadowKing's samples.
CODE
Bitrate (kbps)                         WAV  FLAC  PP10
=======================================================
10 - Dungeon - The Birth- The Trauma Begins  919   453
A02_metamorphose                             846   507
aps_Killer_sample                            929   484
Moon_short                                   834   550
velvet                                       957   516
=======================================================
Average                               1411   897   502
                                      100%  63.6% 35.6%
                                             100% 56.0%
=======================================================
No artifacts noticeable.

This post has been edited by Nick.C: Jun 28 2007, 16:24


Nick.C
post Jul 3 2007, 13:09
Post #154



Playing about with the code, I've added a "choose_bits_to_remove" parameter - which is used as follows:

CODE
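        % choose_bits_to_remove = 0: take the smallest value in the table (most conservative).
        % choose_bits_to_remove >= 1: take the rounded-down mean plus (choose_bits_to_remove-1) extra bits.
        if (choose_bits_to_remove==0),
            bits_to_remove(block_number)=min(min(bits_to_remove_table));
        else
            bits_to_remove(block_number)=floor(mean(mean(bits_to_remove_table)))+(choose_bits_to_remove-1);
        end;
        % Cap the result so that at least minimum_bits_to_keep of the bs bits remain.
        bits_to_remove(block_number)=min(bits_to_remove(block_number),bs-minimum_bits_to_keep);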
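
For readers without MATLAB, the rule above can be sketched in Python. This is a loose transliteration, not part of the script: the function name `pick_bits_to_remove` and the list-of-rows table layout are made up for illustration, and `bs` is assumed to be the sample bit depth (16 for CD audio).

```python
import math

def pick_bits_to_remove(table, choose_bits_to_remove, bs=16, minimum_bits_to_keep=5):
    """table: rows of per-analysis, per-channel bits-to-remove values (hypothetical layout)."""
    values = [v for row in table for v in row]
    if choose_bits_to_remove == 0:
        # Most conservative: smallest value anywhere in the table.
        bits = min(values)
    else:
        # Rounded-down mean of all values, plus (choose_bits_to_remove - 1) extra bits.
        bits = math.floor(sum(values) / len(values)) + (choose_bits_to_remove - 1)
    # Assumption: bs is the bit depth, so this keeps at least minimum_bits_to_keep bits.
    return min(bits, bs - minimum_bits_to_keep)
```

With the table [[2, 3], [4, 6]] the mean is 3.75, so a setting of 1 removes 3 bits and each further step removes one more, until the minimum_bits_to_keep cap takes over.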


To my ears (combined with minimum_bits_to_keep=5), the transparency threshold is at a choose_bits_to_remove (BTR) setting of about 3 or 4. Setting minimum_bits_to_keep (MBTK) to 6 improves the sound at BTR=4. The bitrate reduction is fairly significant:

CODE
Samples: 10 - Dungeon - The Birth- The Trauma Begins, 41_30sec, A02_metamorphose,
annoyingloudsong, aps_Killer_sample, Atem_lied, ATrain, birds,
E50_PERIOD_ORCHESTRAL_E_trombone_strings, eig, glass_short, jump_long, Moon_short,
rach_original, rawhide, S13_KEYBOARD_Harpsichord_C, S30_OTHERS_Accordion_A,
S34_OTHERS_GlassHarmonica_A, S35_OTHERS_Maracas_A, S53_WIND_Saxophone_A, thewayitis,
VELVET

|=====|=========================|
| WAV | 53,763,880 (1411.2kbps) |
|FLAC | 29,767,971 ( 781.2kbps) |
|=====|=========================|========================|========================|
|     |        MBTK=5           |        MBTK=6          |        MBTK=7          |
|=====|=========================|========================|========================|
|BTR0 | 17,209,767 ( 451.7kbps) | 17,209,767 ( 451.7kbps)| 17,256,277 ( 452.9kbps)|
|BTR1 | 16,052,243 ( 421.3kbps) | 16,052,243 ( 421.3kbps)| 16,110,776 ( 422.9kbps)|
|BTR2 | 13,259,455 ( 348.0kbps) | 13,313,411 ( 349.4kbps)| 13,530,611 ( 355.2kbps)|
|BTR3 | 10,814,615 ( 283.9kbps) | 11,025,396 ( 289.4kbps)| 11,369,979 ( 298.4kbps)|
|BTR4 |  8,959,432 ( 235.1kbps) |  9,288,634 ( 243.9kbps)|  9,732,593 ( 255.5kbps)|
|=====|=========================|========================|========================|


This post has been edited by Nick.C: Jul 3 2007, 14:30


shadowking
post Jul 3 2007, 14:07
Post #155






You should start to pick up some hiss below 300kbps. Sometimes turning up the volume reveals it; otherwise these encoders are artifact free.

Dungeon - added hiss on the baby crying
Velvet - noise moving around beats (doom-chik-doom-chik)
Atemlied - hissing on the phone ringing part
41 secs - cymbals 'dusty'
metamorphose - hiss on the HF bits
moon short - slight hiss

This post has been edited by shadowking: Jul 3 2007, 14:09


Nick.C
post Jul 3 2007, 14:11
Post #156



Which BTR were you using? MBTK=7 (or maybe 8?) may help. My "testing" is on earbuds at moderate volume - suitable for an office environment at lunch. It also replicates my most likely playback environment.


2Bdecided
post Jul 3 2007, 14:15
Post #157


ReplayGain developer


Group: Developer
Posts: 5364
Joined: 5-November 01
From: Yorkshire, UK
Member No.: 409



Nick,

My gut feeling (and I haven't tried it yet) is that this will introduce audible problems.

Near the start of this thread, halb27 ABXed some samples with 6dB and 12dB more noise than default. From the bitrates, it looks like you're pushing it even further than that.


I've been working to solve the problem sample I managed to manufacture. It's fixed now with rectangular or triangular dither, which I've finally implemented properly. I still think it's a waste of time for most content, but it's nice to have the option.

I'll upload when I get the chance.

Cheers,
David.
Nick.C
post Jul 3 2007, 14:22
Post #158



Good afternoon David,

In a way, I'm looking for an acceptable balance between bitrate and quality - my DAP of choice plays FLAC, and this method of bitrate reduction feels "cleaner" than moving to a full-blown lossy codec. Your original concept has proven itself - how far it can be pushed whilst maintaining "acceptable" quality is another matter. I see this as analogous to the LAME -V0 .. -V9 options.

Looking forward to the revised source to chew on... :)


2Bdecided
post Jul 4 2007, 10:16
Post #159



If you want to force the bitrate lower, you can do any or all of the following (with predictable results)...


* Resample to 32kHz
- (removes frequencies above 16kHz)

* Reduce the bitdepth (e.g. 14-bits, 12-bits) within the 16-bit file
- (introduces fixed noise)
(either pre-process, or force "bits_to_remove" to always be above a certain number)

* ReplayGain (or just reduce the volume) before encoding
- (makes it quieter!)

* Use a positive noise_threshold_shift
- (introduces variable noise)


Part of what you've done is similar to just reducing the bitdepth, but might be less predictable.
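
The fixed-bitdepth option in the list above amounts to rounding every sample to a multiple of 2^n. A minimal Python sketch, assuming 16-bit signed samples held in a plain list and no dither (the function name `reduce_bit_depth` is hypothetical, and the real script works block by block with a varying n):

```python
def reduce_bit_depth(samples, bits_to_remove):
    """Round 16-bit samples to multiples of 2**bits_to_remove, staying inside the int16 range."""
    step = 1 << bits_to_remove
    highest = 32768 - step  # largest multiple of step that still fits in 16 bits
    out = []
    for s in samples:
        # Floor division rounds consistently for negative values too.
        q = ((s + step // 2) // step) * step
        out.append(max(-32768, min(highest, q)))
    return out
```

Calling `reduce_bit_depth(samples, 2)` gives the "14-bits within a 16-bit file" case: the two lowest bits of every sample end up zero, which is what lets the lossless encoder save space.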

I'll post some numbers in a moment...

This post has been edited by 2Bdecided: Jul 4 2007, 10:39
2Bdecided
post Jul 4 2007, 12:00
Post #160



I grabbed all the files from the Atem_lied to thewayitis test set.

Regular flac: 728kbps
Lossy flac: 524kbps
Lossy flac nts+6dB: 457kbps

Regular flac RG: 756kbps (! didn't help, because most of these files are quiet!)
Regular flac RG 32k: 592kbps
Lossy flac RG 32k: 441kbps
Lossy flac RG 32k nts+6dB: 386kbps
Lossy flac RG 32k nts+12dB: 328kbps
Lossy flac RG 32k nts+24dB: 230kbps

I also tried annoyingloudsong:

Regular flac: 1252 kbps
Lossy flac: 411kbps

Regular flac RG 32kHz: 828kbps
Lossy flac RG 32kHz nts+6dB: 266kbps
Lossy flac RG 32kHz nts+12dB: 211kbps
Lossy flac RG 32kHz nts+24dB: 133kbps


I ran all these tests with triangular dither. With the caveat that the block switching might not be debugged, I've attached my latest script.


Resampling to 32kHz is normally transparent for me, but won't be for people who can hear above 16kHz.

nts+24dB sounds awful - like an FM radio with a very weak signal
nts+12dB sounds OK. The hiss is audible if you listen carefully. It's probably OK for you Nick.
nts+6dB sounds good. It's probably ABXable, but I didn't try.

Cheers,
David.
Attached File(s)
Attached File  lossyFLAC6.txt ( 16.23K ) Number of downloads: 208
 
2Bdecided
post Jul 4 2007, 13:35
Post #161



Examples for Nick.

Not transparent.
Attached File(s)
Attached File  annoyingloudsong_32k_nts6.lossy.flac ( 520.44K ) Number of downloads: 212
Attached File  annoyingloudsong_32k_nts12.lossy.flac ( 411.19K ) Number of downloads: 184
Attached File  annoyingloudsong_32k_nts24.lossy.flac ( 260.42K ) Number of downloads: 148
 
Nick.C
post Jul 6 2007, 11:10
Post #162



Looking at the analysis times (1.5ms and 20ms) and the corresponding FFT_Length for each, I was wondering why the times are not chosen so that the power of two needs no rounding when determining FFT_Length?

Using time=10^(log10(2)*bits-log10(fs)), i.e. time=2^bits/fs, yields (at fs=44.1kHz):

time (bits=6, fft_length=64) = approx. 1.451ms;
time (bits=10, fft_length=1024) = approx. 23.219ms;

and for the extra analysis:

time (bits=8, fft_length=256) = approx. 5.805ms;

Could there be a benefit in tuning the analysis time exactly to the fft_length?
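
The figures quoted here can be checked with a few lines of Python. The expression 10^(log10(2)*bits-log10(fs)) is just 2^bits/fs, and the sketch assumes fs=44100 because that is the rate that reproduces the numbers; `analysis_time_ms` is a name made up for this example.

```python
def analysis_time_ms(bits, fs=44100):
    """Duration in milliseconds of an FFT window of 2**bits samples at sample rate fs."""
    return 1000.0 * (2 ** bits) / fs

for bits in (6, 8, 10):
    print(bits, 2 ** bits, analysis_time_ms(bits))
```

The window lengths printed are 64, 256 and 1024 samples.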


2Bdecided
post Jul 6 2007, 12:14
Post #163



Hi Nick,

I kind of picked the times off the top of my head. They seemed like good times.

As you've seen, they're converted into numbers of samples the way they are, so you get something close to those times that's a power of 2, irrespective of sampling frequency. It could be neater (it's "closest" on a log scale, which may or may not be ideal), but I can't see any advantage to picking exact times.

There can't be any times that will convert to exact powers of 2 for 32kHz, 44.1kHz and 48kHz sampling.

If you want to avoid the log calculation, use a look-up table, either to approximate the calculation or to specify sample values directly for common sample rates. However, I think there are other log calculations later in the code that you can't avoid.
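
As a rough illustration of both ideas, here is a Python sketch that picks the power of two closest on a log scale to a target time, then freezes the results into a look-up table. The exact rounding used in the script is not shown in this thread, so `round(log2(...))` is an assumption based on the description above, and the names `fft_bits_for_time` and `FFT_BITS` are hypothetical. The target times of 1.5ms and 20ms are the ones mentioned earlier.

```python
import math

def fft_bits_for_time(time_s, fs):
    """Exponent of the power of two closest (on a log scale) to time_s * fs samples."""
    return round(math.log2(time_s * fs))

def fft_length_for_time(time_s, fs):
    return 2 ** fft_bits_for_time(time_s, fs)

# Look-up table for common sample rates: (short analysis bits, long analysis bits).
FFT_BITS = {
    fs: (fft_bits_for_time(0.0015, fs), fft_bits_for_time(0.020, fs))
    for fs in (32000, 44100, 48000)
}
```

Once the table is built, the per-file work is a dictionary lookup rather than a logarithm, though as noted that only removes one of several log calculations.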

Cheers,
David.


btw, do the 32kHz sampled files play OK on your portable?
Nick.C
post Jul 6 2007, 12:45
Post #164



Oops - didn't reply to the samples - NTS6 and NTS12 play fine, NTS24 is full of hiss - probably to be expected due to the noise added.

Been playing with the number of analyses and fft_lengths:

5 analyses (4,6,8,10,12 bits) and the following BTR variant (btr_type=4)

CODE
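        % Sum, minimum and maximum over every analysis and channel in this block.
        btr_sum  = sum(sum(bits_to_remove_table));
        btr_min  = min(min(bits_to_remove_table));
        btr_max  = max(max(bits_to_remove_table));
        btr_size = number_of_analyses * channels;
        
        if (btr_type==0),
            % Most conservative choice: the smallest value in the table.
            bits_to_remove(codec_block_number)=btr_min;
        else
            % Mean with the single highest and lowest values dropped, plus (btr_type-1)/2 bits, rounded down.
            bits_to_remove(codec_block_number)=max(0,floor((btr_sum-btr_min-btr_max)/(btr_size-2)+(btr_type-1)/2));
        end;

        % Keep at least minimum_bits_to_keep of the bs bits.
        bits_to_remove(codec_block_number)=bs-max((bs-bits_to_remove(codec_block_number)),minimum_bits_to_keep);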


This gave me *really* nice sounding results (got a pair of Sennheiser canal phones for my iPAQ) at 272kbps for the sample set used previously.


2Bdecided
post Jul 6 2007, 14:42
Post #165



So let me see if I've got this right...

You're doing FFTs of sizes 2 to the power 4, 6, 8, 10, and 12.

You were taking the mean bits-to-remove across the block, but now you're adding them together, subtracting the highest and lowest values, dividing by something which isn't quite the number of values, and also dropping an extra 1-2 bits.

I'll have to give it a listen. It can't be magic (or, I would think, universally transparent!), but maybe it hides the worst noise where it's least obvious.

For a laugh, tell me how long it takes to run your five-analysis version in Octave ;)

Cheers,
David.
Nick.C
post Jul 6 2007, 14:58
Post #166



Basically I'm calculating the mean of all the values (disregarding the highest & lowest) then adding 1.5 bits and finally rounding down.

i.e. bits_to_remove_table=[2,3,4,5,6],[3,4,5,6,6] >> (44-2-6)/(10-2) = 36/8 = 4.5 add 1.5 = 6!
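
The same worked example, as a small Python sketch of the trimmed-mean rule. It follows the posted MATLAB line by line but is only an illustration: the function name `trimmed_btr` is made up, the table is a list of rows, `bs` is assumed to be the 16-bit sample depth, and the table needs more than two values for the division to work.

```python
import math

def trimmed_btr(table, btr_type, bs=16, minimum_bits_to_keep=5):
    """Mean of the table without its single highest and lowest value, plus (btr_type-1)/2 bits."""
    values = [v for row in table for v in row]
    if btr_type == 0:
        bits = min(values)
    else:
        trimmed_mean = (sum(values) - min(values) - max(values)) / (len(values) - 2)
        bits = max(0, math.floor(trimmed_mean + (btr_type - 1) / 2))
    # Same clamp as the script: never keep fewer than minimum_bits_to_keep bits.
    return bs - max(bs - bits, minimum_bits_to_keep)

table = [[2, 3, 4, 5, 6], [3, 4, 5, 6, 6]]
print(trimmed_btr(table, 4))  # (44-2-6)/(10-2) = 4.5, plus 1.5, rounded down
```

Only one 2 and one 6 are dropped, so the second 6 in the table still counts towards the mean.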

Oh, analysis takes a very long time..........

but...... tried 5,7,9 & 11 with btr_type=4 (i.e. add 1.5 bits) and get 292kbps, but with less analysis time.

This post has been edited by Nick.C: Jul 6 2007, 15:01


2Bdecided
post Jul 6 2007, 15:39
Post #167



I see. I'm unsure as to why it's not (btr_size-2*channels).

I suspect you'll get more noise (possibly audible) for highly tonal and highly transient signals.

All else being equal, forcing an extra bit to remove is the same as using a +6dB noise threshold shift (except when bits to remove would have been zero with the former).
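
The 6dB figure follows from the rounding step doubling with each bit removed, which doubles the amplitude of the added noise: 20*log10(2) is about 6.02dB. A quick Python check (the function name is just for illustration):

```python
import math

def noise_increase_db(extra_bits):
    """Rise in rounding-noise level, in dB, when extra_bits more bits are removed."""
    return 20 * math.log10(2 ** extra_bits)

for extra_bits in (1, 2, 4):
    print(extra_bits, round(noise_increase_db(extra_bits), 2))
```

By the same arithmetic, shifts of +12dB and +24dB correspond to roughly 2 and 4 extra bits.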

It should be fine for what you want it for.

Cheers,
David.

This post has been edited by 2Bdecided: Jul 6 2007, 15:41
Nick.C
post Jul 6 2007, 20:59
Post #168



2*channels would remove 4 values. I only want to remove the highest and the lowest analysis value (i.e. 2), and take the mean of the rest.

To be perfectly frank, I'm trying lots of permutations and seeing how the results pan out - I have two loops set up so that it loops through number_of_analyses=2:5 and btr_type=0:5 and it already loops through the 21 samples in the format .AxBy.wav where x=number of analyses and y=btr_type - leave simmering for quite a while and you get some results to listen to.

Oh, I had to modify wavread and wavwrite to read / write integer values, and modify your script to do the same, as 3 copies of the audio data were causing my machine to run out of memory...

Love the concept - like the fact that I can get good quality at 300 - 350kbps on the sample set.


2Bdecided
post Jul 6 2007, 23:30
Post #169



Glad you're having fun with it Nick. For myself, I'd feel more comfortable with mp3 at those bitrates, but I could be convinced.

You mentioned modifying wavread and wavwrite. It sounds like a good idea. I don't have to be so careful with 4GB of RAM, but hopefully eventually I (or someone) will implement disk buffering so it doesn't matter.

It's great that you're playing with it and finding useful ways to get good quality at lower bitrates, but there is a hard ceiling with this approach. I don't want to sound negative, but you're adding flat noise, and experience suggests this becomes audible for problem samples at ~300-400kbps, and audible for many things much below this.


For the future, I'm wondering how well psychoacoustic based noise shaping would work with this. Not instead of what's there already, but as an optional alternative. You could obviously throw away more bits, but the peak level would increase (dramatically in some cases) and you must hit a point where FLAC (or whatever) finds it harder to compress.

Bryant has mentioned this before, as has SebG...

http://www.hydrogenaudio.org/forums/index....showtopic=11623

It's more complicated than what's in there at present. I might try it just for the fun(!) of it, but I'm off on holiday so it won't be for a while.

Cheers,
David.
Nick.C
post Jul 7 2007, 11:41
Post #170



If it was easy, anyone could do it... ;)

I'm a totally unskilled amateur in audio processing - but having immense fun. Have a good holiday!

[edit]
Had a rethink on the forcing extra bits to be removed and reverted back to the simplistic mean(mean(bits_to_remove_table)) alternative - but still using 4 analyses, fft_length =2^(5,7,9,11).

Had a look at the triangular dither and found a Gaussian variant on Wikipedia and at this link: http://www.musicdsp.org/showone.php?id=121. Currently using (sum of 8 separate Rand(block_size,channels)-4)/8.

Planning to do a lot of conversion for DAP use - now to come up with a method of preserving tags.......
[/edit]
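
For comparison, the three dither shapes mentioned in this thread can be sketched in Python with the standard library. This is an illustration only: the function names are hypothetical, values are taken to be in units of one quantisation step (an assumption about how the script scales them), and `summed_uniform_dither` mirrors the "(sum of 8 Rand - 4)/8" expression above.

```python
import random

def rectangular_dither(rng):
    """Uniform (rectangular) value in [-0.5, 0.5)."""
    return rng.random() - 0.5

def triangular_dither(rng):
    """Difference of two uniform values: triangular distribution in (-1, 1)."""
    return rng.random() - rng.random()

def summed_uniform_dither(rng, n=8):
    """(sum of n uniform values - n/2) / n: roughly Gaussian, confined to [-0.5, 0.5]."""
    return (sum(rng.random() for _ in range(n)) - n / 2) / n

rng = random.Random(0)  # fixed seed so runs are repeatable
samples = [summed_uniform_dither(rng) for _ in range(10000)]
mean = sum(samples) / len(samples)
variance = sum((s - mean) ** 2 for s in samples) / len(samples)
```

Because the sum is divided by n, the standard deviation is 1/sqrt(12*n), about 0.10 for n=8, so this variant is noticeably quieter than rectangular dither (about 0.29) unless it is rescaled.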

This post has been edited by Nick.C: Jul 9 2007, 16:57


Nick.C
post Jul 11 2007, 14:20
Post #171



Back to using 2 analyses (6 & 10 bit fft_length), using gaussian dither with 32 repeats and your fix_clipped=2 - forcing bits to be removed again - the dither *seems* to mask the extra bit loss.

Anyway, I can't seem to upload the results (no webspace of my own), so I can't submit for constructive criticism.

Removing up to 2 extra bits (1/3 bit at a time) over the mean, I can reduce 63.1MiB of WAV to between 20.3MiB and 11.7MiB of .ss.flac (lossless FLAC = 36.1MiB) for the 25 files in my sample set.

This post has been edited by Nick.C: Jul 11 2007, 14:20


2Bdecided
post Jul 11 2007, 22:00
Post #172



You can upload in the uploads forum here.

("Developers" can upload in normal threads - don't ask me, I found it by accident)

On its own, "forcing some bits to be removed always" just raises the noise floor a little. Most people are more than happy with 14-bits (~FM BBC Radio 3 on very good equipment kind of quality). It's a good strategy to have as an option - I'll certainly merge it into my code when I get back.

btw, even just rectangular dither solved the problem sample I created, but if you're forcing just audible noise, your dither choice would be subjective. EDIT: followed the dither link. Dubious information. IIRC Gaussian isn't proven to remove all harmonic distortion or noise modulation, where triangular is perfect in both regards. Rectangular is only perfect in the former - it can leave noise modulation (though we're adding some of that anyway!).

(Nice holiday so far, but the weather will probably be poor tomorrow).

Cheers,
David.

This post has been edited by 2Bdecided: Jul 11 2007, 22:05
Nick.C
post Jul 12 2007, 08:12
Post #173



Glad the holiday's going well!

I managed to upload some files; see:
http://www.hydrogenaudio.org/forums/index....showtopic=56129

- if there are any other samples anyone would wish to be processed, let me know. Samples uploaded for information basically - the bitrate is dramatically reduced in most cases.

Basically, I'm playing with dither now - the Gaussian variant was easy to implement, so it's worth a try at least.

I've processed a few albums now and they typically reduce to about 1/3rd of the lossless FLAC size post processing. So, from a magpie's perspective I can fit 3 times as many on my DAP!


2Bdecided
post Jul 16 2007, 13:11
Post #174



I'm not certain that lossyFLAC will never work at the bitrates you seem to want, but in the meantime, have you heard of mp3? ;)

Cheers,
David.
Nick.C
post Jul 16 2007, 13:52
Post #175



Mp3 - fuzzat den? ;) Yes, I could use LAME or aoTuV as one way of doing this, but I'm having much fun playing around with the script and it's costing me nothing...

On reflection, I'll probably fall back to your original script after learning for myself why I wouldn't want to remove any more bits.

Still playing with dither and another possible variant on conditional fix_clipped.


