Topic: lossyWAV 1.3.0 Development Thread (Read 195156 times)

lossyWAV 1.3.0 Development Thread

Following the release of lossyWAV 1.2.0, I feel it is (yet again) time to kick off development of the next minor release.

Items currently on the list for inclusion in 1.3.0:
  • [complete] Implementation of SG's adaptive noise shaping method;
  • [v1.4.0] Identification and adaptation of a psy-model to use in conjunction with the new noise shaping method;
  • [deleted] Enable use of libfftw3f-3.dll for single precision FFT analyses;
  • [complete] Introduce option to output a spectral "plot" of the processed audio data.
If you have any ideas, requests, suggestions, code optimisations, etc, please post them here.

Link to hydrogenaudio wiki lossyWAV article.

Suggested foobar2000 converter setup:

lossyFLAC:
Code:
Encoder: c:\windows\system32\cmd.exe
Extension: lossy.flac
Parameters: /d /c c:\"program files"\bin\lossywav - --quality standard --silent --stdout|c:\"program files"\bin\flac - -b 512 -5 -f -o%d --ignore-chunk-sizes
Format is: lossless or hybrid
Highest BPS mode supported: 24
lossyTAK:
Code:
Encoder: c:\windows\system32\cmd.exe
Extension: lossy.tak
Parameters: /d /c c:\"program files"\bin\lossywav - --quality standard --silent --stdout|c:\"program files"\bin\takc -e -p2m -fsl512 -ihs - %d
Format is: lossless or hybrid
Highest BPS mode supported: 24
lossyWV:
Code:
Encoder: c:\windows\system32\cmd.exe
Extension: lossy.wv
Parameters: /d /c c:\"program files"\bin\lossywav - --quality standard --silent --stdout|c:\"program files"\bin\wavpack -hm --blocksize=512 --merge-blocks -i - %d
Format is: lossless or hybrid
Highest BPS mode supported: 24
Enclose the element of the path containing spaces within double quotation marks ("), e.g. C:\"Program Files"\directory_where_executable_is\executable_name. This is a Windows limitation.
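The quoting rule above can be sketched in a few lines of Python. This is only an illustration and not part of lossyWAV or foobar2000: the helper names `quote_path_elements` and `lossyflac_parameters` are made up for this example, and the flags are copied from the lossyFLAC setup above.

```python
def quote_path_elements(path: str) -> str:
    """Wrap each backslash-separated path element that contains a space in double quotes."""
    parts = path.split("\\")
    return "\\".join(f'"{p}"' if " " in p else p for p in parts)


def lossyflac_parameters(lossywav: str, flac: str, quality: str = "standard") -> str:
    """Build the foobar2000 'Parameters' string for the lossyFLAC setup (hypothetical helper)."""
    return (
        f"/d /c {quote_path_elements(lossywav)} - --quality {quality} --silent --stdout"
        f"|{quote_path_elements(flac)} - -b 512 -5 -f -o%d --ignore-chunk-sizes"
    )


# Reproduces the lossyFLAC parameter line from the setup above.
print(lossyflac_parameters(r"c:\program files\bin\lossywav", r"c:\program files\bin\flac"))
```

Only the path elements that contain a space are quoted, so the drive letter and the executable name are left as they are.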

Change log 1.3.0: 06/08/2011
Introduction of fixed noise shaping using curve based on SebastianG's noise shaping IIR filter previously used in 1.2.0. Parameter -s, --shaping [n] takes optional parameter (0<=n<=1) or is set automatically depending on quality setting;
Code improvements.
Bug hunting.

Change log beta 1.2.3p RC11: 14/06/2011
(Minor) modifications to --adaptive parameter;
Introduction of -W, --width <n> parameter (80<=n<=255) to allow user to select width of certain output options;
Introduction of -H, --histogram parameter to display 64-bin histogram of sample values;
Code improvements.
Bug hunting.

Change log beta 1.2.3o RC10: 16/05/2011
(Minor) modifications to --adaptive parameter.
Removal of fast / lower precision sqrt and log functions - seems to improve adaptive shaping "accuracy".
Bug hunting.
Previous beta versions / release candidates left up to allow side-by-side comparison; however, this is functionally very similar to beta 1.2.3n RC9.

Change log beta 1.2.3n RC9: 13/05/2011
Modifications to --adaptive parameter. Now allows user to disable default gain curve (use nogain) and default frequency warping (using nowarp) after the --adaptive parameter. Filter order still user selectable. Parameter takes multiple sub-parameters.
Bug hunting.

Change log beta 1.2.3m RC8: 11/05/2011
Modifications to adaptive noise shaping method. Now only uses "current" FFT results rather than using historical data as well.
Bug hunting.

Change log beta 1.2.3l RC7: 09/05/2011
(Minor) modifications to adaptive noise shaping method.
Bug hunting.

Change log beta 1.2.3k RC6: 06/05/2011
(Minor) modifications to adaptive noise shaping method.
Bug hunting.
Slight change to the --spread parameter (related to output of codec-block processing and subsequent spreading function(s)) - now indicates whether the static or dynamic maximum-bits-to-remove limits kicked in during processing (i.e. all FFTs for a particular codec block indicated more bits could be removed but bits-removed was limited to static and dynamic limits).

Change log beta 1.2.3j: 04/05/2011
Modifications to adaptive noise shaping method;
Change to --static parameter; maximum bits-to-keep now limited to bits-per-sample - 4.

Change log beta 1.2.3i: 22/04/2011
Modifications to adaptive noise shaping method;
DC offset now removed from audio data (and bit-removed data and correction data, when enabled) prior to analysis - seems to have improved "Furious" sample problem.
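As a rough illustration of what removing the DC offset before analysis means, the sketch below subtracts the mean from a block of samples. It assumes the offset is estimated per block as the arithmetic mean; the change log does not say how lossyWAV estimates it, and `remove_dc` is a made-up name.

```python
def remove_dc(block):
    """Return the block with its mean (the DC offset) subtracted from every sample."""
    mean = sum(block) / len(block)
    return [x - mean for x in block]


# A block that sits 100 above zero is centred on zero for the analysis.
print(remove_dc([99, 100, 101]))
```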

Change log beta 1.2.3h: 16/04/2011
Modifications to adaptive noise shaping method: higher sample-rates now treated differently - hopefully now avoiding the suspected phase related distortion encountered with 384kHz samples.

Change log beta 1.2.3g: 12/04/2011
Modifications to adaptive noise shaping method: --warp parameter removed due to complexity of selecting correct value. The value is now frequency dependent to allow for consistency in the portion of the warped spectrum associated with the frequency range of interest (up to 16kHz by default);
Modifications made to handling of sample-rates other than 44.1kHz. I realised that as all the testing has pretty much been carried out at that rate then there may be issues with handling other sample-rates. There were. I think that they're now fixed.

Change log beta 1.2.3f: 24/03/2011
Modifications to adaptive noise shaping method;
Addition of the --static parameter to allow the user to increase the number of (static) minimum-bits-to-keep in the range 7<=n<=16, default=6.

Change log beta 1.2.3e: 16/03/2011
Modifications to adaptive noise shaping method.

Change log beta 1.2.3d RC5: 28/02/2011
Change to parameter limits for -A, --adaptive: now 64<=n<=256;
Change to order of lower quality presets:
  • C, economic, 0.0;
  • P, portable, -2.5;
  • X, extraportable, -5.0.
Code tidy up and optimisation.
This is release candidate #5 for lossyWAV 1.3.0.

Change log beta 1.2.3c RC4: 22/02/2011
Parameter -i, --impulse removed. The shortest FFT (32 samples for 44.1/48kHz input) is now enabled by default for all quality settings, previously for int(q)>0. Can be disabled using --analyses 2 (default number of analyses is now 3).
A change to quality preset selection. Parameter -q, --quality will now accept a numeric value in the range -5 to 10 as before but also a short and long preset name as follows:
  • I, insane, 10.0;
  • E, extreme, 7.5;
  • H, high, 5.0;
  • S, standard, 2.5 [default];
  • P, portable, 0.0;
  • N, intermediate, -2.5;
  • X, extraportable, -5.0.
Code tidy up and optimisation.
This is release candidate #4 for lossyWAV 1.3.0.
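The preset table for this release candidate can be written as a small lookup, sketched below. The function name `parse_quality` is hypothetical, and exact-case matching of preset names is an assumption; only the names, values and the -5 to 10 range come from the change log entry above.

```python
# Preset names and values as listed for beta 1.2.3c RC4.
PRESETS_RC4 = {
    "I": 10.0, "insane": 10.0,
    "E": 7.5, "extreme": 7.5,
    "H": 5.0, "high": 5.0,
    "S": 2.5, "standard": 2.5,  # default
    "P": 0.0, "portable": 0.0,
    "N": -2.5, "intermediate": -2.5,
    "X": -5.0, "extraportable": -5.0,
}


def parse_quality(value: str) -> float:
    """Map a -q / --quality argument (preset name or number) to a numeric quality."""
    if value in PRESETS_RC4:
        return PRESETS_RC4[value]
    try:
        quality = float(value)
    except ValueError:
        raise ValueError(f"unknown quality preset: {value!r}") from None
    if not -5.0 <= quality <= 10.0:
        raise ValueError(f"quality out of range -5 to 10: {value!r}")
    return quality


print(parse_quality("standard"), parse_quality("X"), parse_quality("7.5"))
```

Note that RC5 (further up the change log) renames and reorders the lower presets, so the table would need adjusting for later builds.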

Change log beta 1.2.3b RC3: 16/02/2011
--classic and --altpreset quality systems removed;
Fixed noise shaping removed;
Adaptive noise shaping now enabled by default. Use --adaptive OFF to disable. Parameter will still take numeric input to set number of taps for FIR filter used.
Remapping of quality presets as follows:
  • -I, --insane = -q 10.0;
  • -E, --extreme = -q 7.5;
  • -S, --standard = -q 5.0 [default];
  • -R, --reasonable = -q 2.5;
  • -P, --portable = -q 0.0;
  • -e, --economic = -q -2.5;
  • -s, --superportable = -q -5.0.
Code tidy up and optimisation.
This is release candidate #3 for lossyWAV 1.3.0.

Change log beta 1.2.3a RC2: 11/02/2011
(Very) minor changes to the adaptive noise shaping algorithm. Bug-fix to avoid divide-by-zero in Levinson-Durbin recursion.
Replacement of default internal settings, ranging from -5 to 10. Existing defaults now accessible using --classic parameter.
This is release candidate #2 for lossyWAV 1.3.0.
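For readers unfamiliar with the recursion mentioned above, here is a textbook Levinson-Durbin sketch in Python with a guard against the divide-by-zero case (for example, a silent block whose autocorrelation is all zeros). It is not lossyWAV's code; the function name and the choice to stop early when the prediction error reaches zero are assumptions.

```python
def levinson_durbin(r, order):
    """Return (a, err): prediction-error filter a[0..order] with a[0] == 1,
    and the final prediction error, from autocorrelation values r[0..order]."""
    a = [1.0] + [0.0] * order
    err = float(r[0])
    for m in range(1, order + 1):
        if err <= 0.0:
            # Guard: nothing left to predict (e.g. digital silence), so stop
            # here instead of dividing by zero.
            break
        acc = r[m] + sum(a[k] * r[m - k] for k in range(1, m))
        reflection = -acc / err
        prev = a[:]
        for k in range(1, m):
            a[k] = prev[k] + reflection * prev[m - k]
        a[m] = reflection
        err *= 1.0 - reflection * reflection
    return a, err


# Autocorrelation of a first-order process: only a[1] should be non-zero.
print(levinson_durbin([1.0, 0.5, 0.25, 0.125], 3))
# Silent input: returns the trivial filter without raising ZeroDivisionError.
print(levinson_durbin([0.0, 0.0, 0.0], 2))
```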

Change log beta 1.2.2z RC1: 10/01/2011
Minor changes to the adaptive noise shaping algorithm;
This is release candidate #1 for lossyWAV 1.3.0.

Change log beta 1.2.2y: 08/01/2011
Minor changes to the adaptive noise shaping algorithm, now a bit faster.

Change log beta 1.2.2x: 05/01/2011
Minor changes to the adaptive noise shaping algorithm - now takes into account long FFT analysis results for previous codec-block as well as present codec-block; Changes made to curve which modifies shape of desired filter output per codec-block-channel.

Change log beta 1.2.2w: 02/12/2010
Minor changes to the adaptive noise shaping algorithm; default FIR filter size=64;
Bug found in processing of multi-channel audio - also fixed in version 1.2.0 as maintenance release 1.2.0b (see Validated News);
Code tidy up.

Change log beta 1.2.2t: 11/11/2010
Minor changes to the adaptive noise shaping algorithm (max FIR filter size=512);
-X, --sortspread parameter removed;
-F, --fftw parameter removed. libfftw3-3.dll (double precision) used automatically if found;
Code tidy up.

Change log beta 1.2.2s: 08/08/2010
Further changes to the adaptive noise shaping algorithm;

Change log beta 1.2.2r: 26/07/2010
Bug-fixes and minor changes to the adaptive noise shaping algorithm;
Compatibility with libfftw3f-3.dll removed (single precision) to significantly simplify code;
Removal of added dither in bit-removal algorithm when adaptive noise shaping active.

Change log beta 1.2.2q: 29/06/2010
Further changes to the adaptive noise shaping algorithm;
Change to bit-removal algorithm when adaptive noise shaping active - dither introduced.

Change log beta 1.2.2p: 21/06/2010
Further changes to the adaptive noise shaping algorithm.

Change log beta 1.2.2n: 21/06/2010
Further changes to the adaptive noise shaping algorithm. Modification made to the high frequency portion of the desired shape.

Change log beta 1.2.2m: 17/06/2010
Further changes to the adaptive noise shaping algorithm. Now uses the results of both the long and short FFT analyses when determining the desired shape of filter.

Change log beta 1.2.2k: 11/06/2010
Desired shape of filter now changes more gradually per codec-block rather than totally;
Fix to calculation of frequency distribution to allow more accurate comparison of input and output.

Change log beta 1.2.2j: 04/06/2010
Temporary fix to stop spikes in output - attempt #1.

Change log beta 1.2.2i: 03/06/2010
Bug fix in FFT routine selection process.

Change log beta 1.2.2h: 03/06/2010
Bug fix in adaptive noise shaping method.

Change log beta 1.2.2g: 01/06/2010
Modifications made to adaptive noise shaping method.

Change log beta 1.2.2f: 31/05/2010
Modifications made to adaptive noise shaping method. Change in filter behaviour at higher frequencies.

Change log beta 1.2.2e: 31/05/2010
Modifications made to adaptive noise shaping method. Better filter resolution achieved (I think). --adaptive parameter now takes an integer value as the order of the FIR filter. Valid in the range 16<=n<=128, default=32.

Change log beta 1.2.2d: 29/05/2010
Modifications made to adaptive noise shaping method. Better resolution (I think) of the filter - interim beta - still work to do on this.

Change log beta 1.2.2c: 28/05/2010
Modifications made to adaptive noise shaping method. Attempt #1 to fix the low-frequency error.

Change log beta 1.2.2b: 27/05/2010
Modifications made to adaptive noise shaping method. Attempt #1 to fix clicking error.

Change log beta 1.2.2a: 26/05/2010
Further progress made with SG's adaptive noise shaping method. The --adaptive parameter now takes a parameter (in the range -1<n<1) to allow the warping factor to be changed. Error found and fixed in remove_bits routine (for adaptive-shaping).

Change log beta 1.2.1a: 25/05/2010
Further progress made with SG's adaptive noise shaping method. Extremely simplistic psy-model in place. Enable by adding --adaptive to the command line.
Post-analysis of the bit-removed audio data can be performed using the --postanalyse parameter in conjunction with the --freqdist parameter.

Change log pre beta 1.2.1a: 17/05/2010
Code optimisations;
Major part of the implementation of SG's adaptive noise shaping;
Modification of fftw_interface unit and FFT handling routines to allow the use of libfftw3f-3.dll as well as libfftw3-3.dll (-F, --fftw <single/double/off>);
Implementation of the spectral plot (--freqdist) for input and optionally output audio data.
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

Reply #1
I'm going to enjoy this thread!

Cheers,
David.

Reply #2
I've been waiting for this thread and finally here it is.

No ideas or suggestions at the moment though, but some will most likely appear when the discussion kicks off.

Reply #3
Quote
  • Implementation of SG's adaptive noise shaping method;
  • Identification and adaptation of a psy-model to use in conjunction with the new noise shaping method;

Do I understand correctly what we're aiming for?
The noise shaping is so we can add more noise but still hide it, this could give better compression. But with noise shaping, the used codec will have worse compression as a result, which will (partly) eliminate this advantage.
It will be interesting to see how much room there is for applying noise shaping before it becomes counterproductive.
In theory, there is no difference between theory and practice. In practice there is.

Reply #4
The lossless codec is especially less efficient if added noise is in the very high frequency range.
With an adaptive noise shaping there is hope that overall efficiency will improve.

@Nick: Nice to see you still improving lossyWAV. Thanks a lot.
lame3995o -Q1.7 --lowpass 17

Reply #5
Quote
Nice to see you still improving lossyWAV. Thanks a lot.
Same here.

I like the fact that lossyWAV exists as an alternative, whether I use it or not. The idea behind it is great and makes it worthwhile.

Quote
Implementation of SG's adaptive noise shaping method

... even if I don't get what it means. This has been on the TODO list for so long that I'm curious to hear what it sounds like!

Reply #6
Right, I've got the guts of SG's adaptive noise shaping (ANS) algorithm working, up to a point. I think that I have the initial state of the FIR filter working but am not sure how to use it - the existing static noise shaping (SNS) algorithm uses an IIR filter.

If anyone with experience of using FIR filters were to offer some hints, that would be very much appreciated.
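One common way to use an FIR filter for noise shaping is error feedback: the quantisation error of past samples is filtered and added to the next input before it is rounded. The Python sketch below shows that general technique only; it is not the lossyWAV implementation, the function name is invented, and it ignores clipping at the limits of the sample range.

```python
def shape_and_quantise(samples, bits_to_remove, fir):
    """Round samples to multiples of 2**bits_to_remove with FIR error feedback.

    fir holds h[1..N] of the noise transfer function
    H(z) = 1 + sum(h[k] * z**-k), so the added noise is e[n] + sum(h[k] * e[n-k]).
    """
    step = 1 << bits_to_remove
    history = [0.0] * len(fir)  # past errors: e[n-1], e[n-2], ...
    out = []
    for x in samples:
        # Add the filtered past errors to the input before rounding.
        v = x + sum(h * e for h, e in zip(fir, history))
        y = step * round(v / step)
        error = y - v
        if history:
            history = [error] + history[:-1]
        out.append(y)
    return out


samples = [0, 3, 5, 100, -7, 12, 13, 14]
print(shape_and_quantise(samples, 2, []))      # plain rounding, no shaping
print(shape_and_quantise(samples, 2, [-1.0]))  # first-order shaping, H(z) = 1 - z**-1
```

With `fir=[-1.0]` the added noise is `e[n] - e[n-1]`, which pushes it towards high frequencies; an adaptive scheme would instead recompute the coefficients per block so that the noise follows the signal's spectrum.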

I've got the optional use of single / double precision FFTW DLLs in place, as well as the spectral plot. After a bit more work on ANS, I will release a beta version and source.
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

Reply #7
I don't know how much you know and how much you don't know, plus I myself have absolutely no idea what FIR filters are, so I'm not sure whether anything I come up with will be of any use, but I'm going to try anyway.

This seems to have some implementation notes, although they're for C and assembly so I don't know if it'll be of any use.

This is also about implementing it in C. (I've gotten that from here.)

This seems to have a bunch of details about FIR filters.

Reply #8
Following some very constructive help from SG, lossyWAV beta 1.2.1a is attached to post #1 in this thread.
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

Reply #9
Quote
Do I understand correctly what we're aiming for?
The noise shaping is so we can add more noise but still hide it, this could give better compression. But with noise shaping, the used codec will have worse compression as a result, which will (partly) eliminate this advantage.

"hiding the noise" and "noise partly eliminates this advantage" sort of contradict each other. If it's properly hidden ("behind the original signal") it will NOT affect the quantized signal's predictability in any significant way. And that's the goal I'd be aiming for.
(1) apply psychoacoustic model to determine tolerable noise levels
(2) restrict local SNRs not to go below a certain threshold
(3) derive noise shaping filter and wasted_bits from the results

#1 makes sure that the noise is below the masking thresholds
#2 makes sure that the noise doesn't hurt predictability.
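A very rough Python sketch of steps (1) to (3), covering only the bit count and not the filter design. Everything here is an assumption made for illustration: equal-width bands, power values in units of one LSB squared, an ideal minimum-phase shaping filter (whose gain averages to 0 dB on a log scale, so the unshaped quantiser noise equals the geometric mean of the allowed noise), and uniform quantisation noise of step²/12.

```python
import math


def allowed_noise(signal_power, masking_threshold, min_snr):
    """Per-band noise limit: below the masking threshold (#1) and at least
    min_snr (a power ratio) below the signal (#2)."""
    return [min(m, s / min_snr) for s, m in zip(signal_power, masking_threshold)]


def wasted_bits(noise_limit):
    """Whole bits that can be removed if ideally shaped noise just fits under noise_limit."""
    if not noise_limit or min(noise_limit) <= 0.0:
        return 0
    # Log-average (geometric mean) of the allowed noise across bands.
    log_mean = sum(math.log(n) for n in noise_limit) / len(noise_limit)
    variance = math.exp(log_mean)
    step = math.sqrt(12.0 * variance)  # uniform quantiser: variance = step**2 / 12
    return max(0, math.floor(math.log2(step)))


limit = allowed_noise([60.0, 600.0], [6.0, 6.0], min_snr=10.0)
print(limit, wasted_bits(limit))
```

In the first band the SNR restriction and the masking threshold happen to coincide; in the second the masking threshold is the tighter limit.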

Cheers!
SG

Reply #10
SebastianG,


If I understand correctly, dither with noise shaping tends to be used as the last step of the audio processing chain. How much worse would it be if the audio has already been dithered earlier (as with lossyWAV with ANS, or a CD master) and is then dithered with noise shaping a second time at the end of the playback side?

Regards,
Enig123

Reply #11
lossyWAV beta 1.2.2a attached to post #1 in this thread.

They say that a picture is worth a thousand words - the following images indicate that the implementation of SG's adaptive noise shaping method in beta 1.2.2a is pretty much putting the noise where I want it to be, i.e. behind the signal.

[edit] Superseded - images removed. [/edit]
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

Reply #12
Excellent job, Nick. :) Looking forward to the 1.3.0!
Infrasonic Quartet + Sennheiser HD650 + Microlab Solo 2 mk3. 

Reply #13
Quote
If I understand correctly, dither with noise shaping tends to be used as the last step of the audio processing chain.

I don't know. It certainly is a possibility to "work" in 24 bit and do a word length reduction to 16 bit with a dithering noise shaping quantizer so you can create an audio CD for example.

Quote
How much worse would it be if the audio has already been dithered earlier (as with lossyWAV with ANS, or a CD master) and is then dithered with noise shaping a second time at the end of the playback side?

I don't understand. Why would you be dithering at the "end of the playback side"?

As for lossyWAV: I think dithering is neither appropriate nor necessary. The probability of the signal being "self-dithering" is very high if a noise shaper is used that "hides the noise behind the signal".

It's possible that "dithering" means something different to you.

Reply #14
Average bitrate for my usual test set of various pop music:

lossyWAV_beta_1.2.2a -P --altpreset --adaptive: 368 kbps (compared to 379 kbps without --adaptive)
lossyWAV_beta_1.2.2a -Z --altpreset --adaptive: 314 kbps
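For scale, the relative saving from --adaptive at -P --altpreset can be worked out from the two figures above (a trivial sketch; the function name is made up, and the -Z figure is left out because no non-adaptive -Z bitrate is given):

```python
def saving_percent(before_kbps, after_kbps):
    """Relative bitrate reduction in percent."""
    return 100.0 * (before_kbps - after_kbps) / before_kbps


# -P --altpreset: 379 kbps without --adaptive, 368 kbps with it.
print(round(saving_percent(379, 368), 1))  # 2.9
```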

This is pretty interesting.

I'm going to do some listening tests within the next few days, focusing on -Z --altpreset --adaptive.
lame3995o -Q1.7 --lowpass 17

Reply #15
Quote
I don't understand. Why would you be dithering at the "end of the playback side"?

As for lossyWAV: I think dithering is neither appropriate nor necessary. The probability of the signal being "self-dithering" is very high if a noise shaper is used that "hides the noise behind the signal".

It's possible that "dithering" means something different to you.


For example, when playing a "dithered" lossyFLAC file in foobar2000 with output dithering to 16-bit enabled (which is possible with volume adjustment, because foobar2000 uses floating point for internal processing), the audio is dithered twice. I wonder whether that would have a negative effect on quality.


Reply #17
I just did some listening tests.

Problem samples for the adaptive noise shaping are:
- Bibilolo
- Furious.

Low-frequency artefacts of some kind are added: annoying with -Z --adaptive, and much better but still easily ABXable with -Z --altpreset --adaptive.

Maybe shaped noise is a problem when concentrated at low frequencies.
lame3995o -Q1.7 --lowpass 17

Reply #18
Thanks for that - I'll get to work on it and see what comes out....
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

Reply #19
There are clicks (waveform discontinuities) in the added noise at some block boundaries, e.g. Furious.wav --adaptive -Z, look at the correction file at samples 166400, 166912, 171008, 171520 etc.

I can't hear it, but I doubt it should be happening.
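A quick check of the sample positions listed above, assuming a codec-block length of 512 samples (the block size used in the converter setup in the first post), shows that each one falls exactly on a block boundary:

```python
BLOCK_SIZE = 512  # assumed codec-block length in samples
CLICK_SAMPLES = [166400, 166912, 171008, 171520]

for sample in CLICK_SAMPLES:
    block, offset = divmod(sample, BLOCK_SIZE)
    print(f"sample {sample}: block {block}, offset {offset}")
```

The positions are the starts of blocks 325, 326, 334 and 335, i.e. two pairs of consecutive boundaries.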

Cheers,
David.

Reply #20
Quote
Maybe shaped noise is a problem when concentrated at low frequencies.

The next item on the list was a (hopefully simple) psy-model; it may already be needed.
@Nick: Thanks for the spectrum pictures, they made it much clearer to me what the idea of ANS is.
In theory, there is no difference between theory and practice. In practice there is.

Reply #21
Thanks for the feedback on "clicking", David - I will see what I messed up and revert.
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

Reply #22
I can confirm the added noise can be described as kind of clicking.
lame3995o -Q1.7 --lowpass 17

Reply #23
lossyWAV beta 1.2.2b attached to post #1 in this thread.
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

Reply #24
Quote
Modification of fftw_interface unit and FFT handling routines to allow the use of libfftw3f-3.dll as well as libfftw3-3.dll (-F, --fftw <single/double/off>);

By the way, what exactly is the difference between using libfftw3f-3.dll and libfftw3-3.dll? Does lossyWAV only use one or both if they're both present?