Topic: lossyWAV 1.1.0 Development Thread. (Read 181994 times)

lossyWAV 1.1.0 Development Thread.

Following the release of lossyWAV 1.0.0b, I feel it is time to kick off development of the next minor release.

Items currently on the list for inclusion in 1.x.0:

1.1.0: STDIN input;
1.1.0: STDOUT output;
1.1.0: Channel independent bit removal;
1.1.0: Reversion to same bits-to-remove for all channels;
1.1.0: Noise shaping;
1.2.0: Checking of S (=L-R) channel for matrix surround content;
If you have any ideas, suggestions, code optimisations, etc, please post them here.
[!--sizeo:1--][span style=\"font-size:8pt;line-height:100%\"][!--/sizeo--]
Code: [Select]
lossyWAV 1.1.0b, Copyright (C) 2007,2008 Nick Currie. Copyleft.

This program is free software: you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free Software
Foundation, either version 3 of the License, or (at your option) any later
version.

This program is distributed in the hope that it will be useful,but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.  See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with
this program.  If not, see <http://www.gnu.org/licenses/>.

Process Description:

lossyWAV adds white noise to the processed output. The amount of added noise is
based on analysis of the signal levels in the frequency range 20Hz to 16kHz.

If signals above the upper limiting frequency are at an even lower level, they
can be swamped by the added noise. This is usually inaudible, but the behaviour
can be changed by specifying a higher --limit (in the range 16kHz to 20kHz).

For many audio signals, there is little content at very high frequencies, and
forcing lossyWAV to keep the added noise level lower than the content at these
frequencies can increase the bitrate dramatically for no perceptible benefit.

Usage   : lossyWAV <input wav file> <options>

Example : lossyWAV musicfile.wav

Quality Options:

-I, --insane        highest quality output, suitable for transcoding;
-E, --extreme       high quality output, also suitable for transcoding;
-S, --standard      default quality output, considered to be transparent;
-P, --portable      good quality output for DAP use. Not considered to be fully
                    transparent, but considered fit for its intended purpose.

Standard Options:

-c, --check         check if WAV file has already been processed; default=off.
                    errorlevel=16 if already processed, 0 if not.
-C, --correction    write correction file for processed WAV file; default=off.
-f, --force         forcibly over-write output file if it exists; default=off.
-h, --help          display help.
-L, --longhelp      display extended help.
-M, --merge         merge existing lossy.wav and lwcdf.wav files.
-o, --outdir <t>    destination directory for the output file(s).
-v, --version       display the lossyWAV version number.

Advanced Options:

-                   if filename="-" then WAV input is taken from STDIN.
    --blockdist     show distribution of lowest significant bit of input
                    codec-blocks and bit-removed codec-blocks.
-D, --dither <n>    enable variable PDF dither of output; default=off;
                    0 = rectangular; 1 = triangular; 0.5 = half way between.
-l, --limit <n>     set upper frequency limit to be used in analyses to n Hz;
                    (16000<=n<=20000), default = 16000.
    --linkchannels  Revert to original single bits-to-remove value for all
                    channels rather than channel dependent bits-to-remove.
-q, --quality <n>   quality preset (10=highest quality, 0=lowest bitrate;
                    default = --standard = 5; --insane = 10; --extreme = 7.5;
                    --portable = 2.5)
    --sampledist    show distribution of lowest significant bit of input
                    samples and bit-removed samples.
    --scale <n>     scaling factor from WaveGain, etc; (0.0<n<=8.0),default=1.0
-s, --shaping <n>   enable fixed noise shaping; (0.00<=n<=1.00); default=q/10;
                    0.00 = off, 1.00 = 100% effectiveness, 0.50 = 50%, etc.
    --stdinname <t> pseudo filename to use when input from STDIN.
    --stdout        write processed WAV output to STDOUT.
-w, --writetolog    create (or append to) lossyWAV.log in the output directory.

System Options:

-B, --below         set process priority to below normal.
-d, --detail        enable detailed bits-to-remove information output mode
    --low           set process priority to low.
-n, --nowarnings    suppress lossyWAV warnings.
-Q, --quiet         significantly reduce screen output.
    --silent        no screen output.

Special thanks:

David Robinson      for the publication of his lossyFLAC method, guidance, and
                    the motivation to implement the method as lossyWAV.
Horst Albrecht      for ABX testing, valuable support in tuning the internal
                    presets, constructive criticism and all the feedback.
Sebastian Gesemann  for the noise shaping coefficients and help in using them
                    in the lossyWAV noise shaping implementation.
Don Cross           for the Complex-FFT algorithm used.
[/size]
Link to the hydrogenaudio wiki article

Suggested foobar2000 converter setup:

lossyFLAC:
Code: [Select]
Encoder: c:\windows\system32\cmd.exe
Extension: lossy.flac
Parameters: /d /c c:\"program files"\bin\lossywav - --standard --silent --stdout|c:\"program files"\bin\flac - -b 512 -5 -f -o%d
Format is: lossless or hybrid
Highest BPS mode supported: 24
lossyTAK:
Code: [Select]
Encoder: c:\windows\system32\cmd.exe
Extension: lossy.tak
Parameters: /d /c c:\"program files"\bin\lossywav - --standard --silent --stdout|c:\"program files"\bin\takc -e -p2m -fsl512 -ihs - %d
Format is: lossless or hybrid
Highest BPS mode supported: 24
lossyWV:
Code: [Select]
Encoder: c:\windows\system32\cmd.exe
Extension: lossy.wv
Parameters: /d /c c:\"program files"\bin\lossywav - --standard --silent --stdout|c:\"program files"\bin\wavpack -hm --blocksize=512 --merge-blocks -i - %d
Format is: lossless or hybrid
Highest BPS mode supported: 24
There is a known problem within foobar2000 (although more likely to do with cmd.exe itself) when running an executable within the cmd.exe command line from a path which includes spaces. The suggested fix for this is to enclose the element of the path which contains spaces within double quotation marks ("), e.g. c:\"program files"\directory_where_executable_is\executable_name

Change log 1.1.0c: 30/04/2009
Exactly as 1.1.0b except that the WINE incompatibility issue has been fixed.

Executable here.
Source here.

[!--sizeo:1--][span style=\"font-size:8pt;line-height:100%\"][!--/sizeo--]Change log 1.1.0b: 03/08/08
FFT lengths will now increase for higher bitrate audio, i.e. 88.2/96kHz, 176.4/192kHz and 352.8/384kHz;
improved logfile output and --detail output;
reference threshold constants for rectangular dither and triangular dither have been calculated so added noise should be the same for dither off and any dither level between 0 and 1 - the number of bits-to-remove will however reduce with "increasing" dither.

Change log 1.1.0: 12/07/08
Certain advanced parameters removed for final release.

Change log 1.0.1x RC4: 12/07/08
Final release candidate prior to release of 1.1.0

Change log 1.0.1w RC3: 02/07/08
Code tidied up a bit more (yet again....);
--wine parameter modified to stop the program using Windows API function calls when using piped input (should hopefully stop crashing under Wine).

Change log 1.0.1v RC2: 30/06/08
Code tidied up a bit more (again....);
--wine parameter implemented to stop the program using the GetLastError Windows API call when using piped input (should stop crashing under Wine).

Change log 1.0.1u RC1: 20/06/08
Code tidied up a bit more;
--bitdist parameter introduced to allow user to "examine" the distribution of lowest set bit on a codec-block by codec-block basis, channels treated separately.

Change log beta 1.0.1t: 11/06/08
Revision to STDIN handling - bug found where last codec-block read from foobar2000 using STDIN input was not being written to the output file.

Change log beta 1.0.1s: 09/06/08
Revision to STDIN handling. Now (fingers crossed) should work successfully inside Foobar2000;
Code and help tidied up;
Dither function fixed and augmented. Taking on board a statement by SG with respect to using a dither function somewhere between rectangular (rand - 0.5) and triangular (rand-0.5)+(rand-0.5), i.e. (rand-0.5)+s*(rand-0.5) {0<=s<=1}. s=0 = rectangular dither; s=1 = triangular dither. -D, --dither now requires a supplementary <n> in the range 0<=n<=1.

Change log beta 1.0.1r: 03/06/08
Implementation of fast square root function using lookup tables for fxtract(ed) exponent and mantissa of input value;
--scale parameter corrected to accept values in the range 0<n<=8.

Change log beta 1.0.1q: 30/05/08
Codec-block overflow bug (when codec-block-size=4096) corrected;

Change log beta 1.0.1p: 29/05/08
Quality synonym automatic noise shaping bug corrected;

Change log beta 1.0.1o: 29/05/08
Spreading function spread-zones and spreading-function string modified to allow finer control of high frequency zones;
Code "recovered" from 1.0.1e after a minor hardware failure

Change log beta 1.0.1n: 26/05/08
Implementation of -H, --highskew <n> parameter. Functionally identical to the internal skewing applied to the FFT results (-36dB @ 20Hz to 0dB at 3.45kHz) except applied from 3.45kHz upwards. Valid in the range 0 to 36 (0=default=no high skew applied).

Change log beta 1.0.1m: 25/05/08
reintroduction of max-inter-block-change implementation limits increase in bits-to-remove between codec-blocks to 1 bit.

Change log beta 1.0.1k: 23/05/08
static maximum_bits_to_remove limitation re-applied in serial with dynamic maximum_bits_to_remove limitation;
Automatic noise shaping now applied using a shaping-factor of quality-level / 10.

Change log beta 1.0.1j: 23/05/08
-q <n> quality selection moved to advanced settings;
-E, --excessive changed to --extreme; -I, --insane added, equivalent to -q 10;
--lowpass changed to -l, --limit in keeping with discussion;
Process Description text added to --longhelp.

Change log beta 1.0.1i: 23/05/08
-q <n> quality selection moved to advanced settings;
-E, --excessive; -N, --normal; -P, --portable quality "names" introduced following discussion in the development thread. These equate to -q 7.5; -q 5.0 and -q 2.5 respectively.

Change log beta 1.0.1h: 20/05/08
minimum bits to keep values changed for -q 0 and -q 1 to 2.333 and 2.667 respectively.

Change log beta 1.0.1g: 22/05/08
Reference_threshold > threshold_index > bits_to_remove calculation refined;
spreading function string modified;
minimum bits to keep values changed for -q 0 and -q 1;
--writetolog (-w) parameter implemented to write minimal output to "lossyWAV.log". Appends to existing file if already exists;
--lowpass <n> parameter re-implemented to allow users to set upper frequency limit of the range that lossyWAV uses in its analyses (16000<=n<=24000).

Change log beta 1.0.1f: 20/05/08
Filenaming logic "improved" when STDIN and STDOUT used together.

Change log beta 1.0.1e: 19/05/08
STDIN / STDOUT mode tidied up. Use the following as a flossy.bat file for foobar conversion:
Code: [Select]
@echo off
z:\bin\lossyWAV %1 --low --nowarnings --quiet %3 %4 %5 %6 %7 %8 %9 --stdout|z:\bin\flac - -5 -f -b 512 -o%2
Unfortunately, due to the nature of piped input to FLAC, the lossyWAV 'fact' chunk is lost. This means no record is kept within the file that it has been processed with lossyWAV (however, the lower the quality setting of the processing, the more likely the bitrate will be an obvious indicator that the file has indeed been processed with lossyWAV);
Minor error found and amended in revised remove_bits procedure, no minimum_bits_to_keep value was being applied, although this has little impact at -q >= 2;
New parameter --linkchannels implemented to revert to old remove_bits method whereby all channels share the same bits_to_remove. Implementing this, I found an error in the original which was forcing more bits to be lost to clipping prevention than should have been (i.e. output was more conservative).

Change log beta 1.0.1d: 18/05/08
STDIN / STDOUT mode modified again (use '-' as a filename to enable STDIN input, --stdout to enable STDOUT output).
Console output has been redirected to 'con', rather than STDOUT.

Change log beta 1.0.1c: 16/05/08
STDIN / STDOUT mode modified again (use '-' as a filename to enable STDIN input).

Change log beta 1.0.1b: 15/05/08
Channel independent bit-removal implemented;
STDIN / STDOUT mode modified - still very much a work in progress.

Change log beta 1.0.1: 14/05/08
STDIN / STDOUT mode commenced.[/size]
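Two of the mechanisms in the change log above lend themselves to a short sketch: the variable PDF dither from 1.0.1s, (rand-0.5)+s*(rand-0.5), and the 1.0.1m limit of 1 bit on the increase in bits-to-remove between codec-blocks. The Python below is only an illustration under assumptions: the function names, the use of Python's random module and the choice to leave decreases unrestricted are hypothetical and are not taken from the lossyWAV source.

```python
import random


def variable_pdf_dither(s, rng=random):
    """Return one dither value, in units of one quantisation step.

    s = 0 gives rectangular dither (a single uniform term);
    s = 1 gives triangular dither (sum of two uniform terms);
    values in between blend the two, as in the 1.0.1s change log entry.
    """
    if not 0.0 <= s <= 1.0:
        raise ValueError("s must be in the range 0 <= s <= 1")
    return (rng.random() - 0.5) + s * (rng.random() - 0.5)


def limit_inter_block_increase(bits_per_block, max_increase=1):
    """Limit the rise in bits-to-remove from one codec-block to the next.

    Assumption: only increases are limited; a drop of any size is allowed.
    """
    limited = []
    for bits in bits_per_block:
        if limited and bits > limited[-1] + max_increase:
            bits = limited[-1] + max_increase
        limited.append(bits)
    return limited
```

For example, limit_inter_block_increase([0, 3, 3, 1, 4]) gives [0, 1, 2, 1, 2]: each rise is capped at 1 bit, while the fall from 2 to 1 passes through unchanged.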
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

Reply #1
I've been playing with STDIN / STDOUT. Setting input-file-name and output-file-name, using --silent and the following command line in a DOS box:
Code: [Select]
for %a in (..\_swav16\*.wav) do lossywav - -q 0 -S 0 --silent <"%a" |flac - -b 512 -5 --sign signed --bps 16 --sample-rate 44100 --channels 2 --endian little -o"%~na.lossy.flac" -f
I get a set of lossyFLAC files. However this method does not allow retention of the 'fact' chunk as the --keep-foreign-metadata FLAC parameter is incompatible with the - parameter to indicate STDIN input to FLAC.

Speed is better due to much less HDD access.
Code: [Select]
c:\data_nic\bin\lossywav - -q 0 -S 0 --silent |c:\data_nic\bin\flac - -b 512 -5 --sign signed --bps 16 --sample-rate 44100 --channels 2 --endian little -o"%d" -f

....does not work in foobar2000. Does anyone have any ideas?

[edit] lossywav beta 1.0.1 removed as being obsolete. [/edit]

Reply #2
I'm currently toying around with "frequency-warped all-pole lattice filters". I think they are the perfect fit for your case once I get them to work as noise shaping filters. These are the kinds of filters Edler et al used for their "new paradigm codec" (better known as Fraunhofer's "ultra low delay codec"). I'm confident that it's possible to turn these filters into noise shaping filters as required in the lossyWAV case.

If you're interested in this we should talk about what collaboration might look like.

Buzz words explained:

frequency warping = A technique that can be used in filter design to achieve nonuniform frequency resolution. In the context of lossy coding and masking thresholds this technique helps find filters that match the masking threshold well.

all-pole filter = A kind of digital filter. The transfer function's numerator is constant (i.e. the filter has no zeros, only poles). These are often used in speech codecs as synthesis filters but they also seem appropriate for matching masking thresholds (see Edler et al).

lattice filter = A special implementation that allows easy filter interpolation.


Cheers,
SG

Reply #3
I would be delighted to use your proposed noise shaping method. ygpm.

Reply #6
I've had another look at the FLAC format specification and it appears that the wasted bits parameter is channel dependent rather than block dependent.

This raises the interesting possibility of removing different numbers of bits for each channel.... This will require quite a bit of rework in the remove_bits procedure, however I think that it will be worth it in the end as it can only increase the number of bits removed.
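To make the idea concrete, here is a minimal Python sketch of removing a different number of bits from each channel and then counting the low-order zero bits ("wasted bits") that a lossless encoder could exploit per channel. The function names, the round-to-nearest rule and the sample values are assumptions for illustration only; clipping prevention is ignored and this is not the actual remove_bits procedure.

```python
def remove_bits(samples, bits_to_remove):
    """Round each integer sample to the nearest multiple of 2**bits_to_remove."""
    if bits_to_remove <= 0:
        return list(samples)
    half = 1 << (bits_to_remove - 1)
    # Python's >> floors for negative numbers, so adding half rounds to nearest.
    return [((s + half) >> bits_to_remove) << bits_to_remove for s in samples]


def wasted_bits(samples):
    """Count the low-order zero bits shared by every sample in a block."""
    combined = 0
    for s in samples:
        combined |= s
    if combined == 0:
        return 0  # all-zero block: nothing to count (assumed convention)
    count = 0
    while combined & 1 == 0:
        combined >>= 1
        count += 1
    return count


# Hypothetical block: the left channel tolerates 4 bits, the right only 1.
left = remove_bits([1001, -2003, 3005], 4)   # [1008, -2000, 3008]
right = remove_bits([13, -7, 5], 1)          # [14, -6, 6]
per_channel = (wasted_bits(left), wasted_bits(right))  # (4, 1)
```

If the same value had to be used for both channels, the left channel in this example would be held down to the 1 bit that the right channel can spare.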

Reply #7
Also, the quantization/dithering part could be done jointly on the channels using a generalized metric in the spirit of M/S coding. You usually don't want the quantization noise's coherence (comparing left versus right) to differ greatly from the coherence between the left and right of your original signal, I suppose.

edit: This is probably overkill at the moment.

Reply #8
....but, the bit-removal related noise takes into account the RMS value of each channel with respect to maximum bits-to-remove and also each channel is treated separately for FFT analysis purposes. I have a working beta now and below are the resultant bitrates for my 53 problem sample set:
Code: [Select]
+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+
|Version| FLAC  | -q 10 | -q 9  | -q 8  | -q 7  | -q 6  | -q 5  | -q 4  | -q 3  | -q 2  | -q 1  | -q 0  |
+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+
|1.0.0b |784kbps|637kbps|607kbps|577kbps|545kbps|513kbps|480kbps|449kbps|427kbps|390kbps|349kbps|306kbps|
+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+
|1.0.1b |784kbps|654kbps|626kbps|596kbps|565kbps|534kbps|501kbps|470kbps|447kbps|408kbps|366kbps|329kbps|
+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+


[edit] lossywav beta 1.0.1b removed as being obsolete. [/edit]

Reply #9
... I have a working beta now and below are the resultant bitrates for my 53 problem sample set: ...


Sorry, what has changed? I don't understand it. In particular, I expected the bitrate to go down.
lame3995o -Q1.7 --lowpass 17

Reply #10
Hehehe.... You spotted the mistake, I transposed the bitrates. I'll amend now.
Code: [Select]
+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+
|Version| FLAC  | -q 10 | -q 9  | -q 8  | -q 7  | -q 6  | -q 5  | -q 4  | -q 3  | -q 2  | -q 1  | -q 0  |
+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+
|1.0.0b |784kbps|654kbps|626kbps|596kbps|565kbps|534kbps|501kbps|470kbps|447kbps|408kbps|366kbps|329kbps|
+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+
|1.0.1b |784kbps|637kbps|607kbps|577kbps|545kbps|513kbps|480kbps|449kbps|427kbps|390kbps|349kbps|306kbps|
+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+

Reply #11
... I transposed the bitrates. ...

Very interesting results. I tried my regular track set using -q 4, -q 6.5, and -q 1.5.
With -q 6.5 and -q 4 my results are close to yours: a saving of nearly 20 kbps on average.
With -q 1.5 it's less than that however: a saving of a bit less than 10 kbps.

It's a welcome decrease in bitrate.
I did a short listening test at -q 1.5 for badvilbel, triangle and Under the Boardwalk, and quality is fine to me.

I call it another step forward.

Reply #12
I think that it is. I transcoded my Mike Oldfield collection (38 discs as single files, 33.5 hours) FLAC: 797kbps; lossyFLAC -q 0 1.0.1b: 264kbps (232kbps to 290kbps album range) and still palatable to the ears.

[edit] I'd be very interested if anyone with some poly-channel WAV files could use 1.0.0b to process them and encode to FLAC then do the same with 1.0.1b. I feel that there may be a marked difference in the resultant bitrates.

The separation of the channels in terms of calculating the bits to remove and removing the bits has two effects: firstly, each separate channel RMS value is used (rather than the minimum of all channels) and bits-to-remove determined from that channel's FFT analyses; secondly, when removing the bits, if too many clips occur in one channel, only that channel's bits-to-remove is reduced until an acceptable number of clips is achieved (not all channels). [/edit]
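The second effect described above (reducing only the clipping channel's bits-to-remove) can be sketched as follows. This is a hedged illustration, not the lossyWAV code: the function names, the 16-bit sample range, the round-to-nearest rule and the default of zero permitted clips are all assumptions, and the wanted bits-to-remove values are taken as given rather than derived from FFT analyses.

```python
def count_clips(samples, bits_to_remove, sample_bits=16):
    """Count samples whose rounded value leaves the signed integer range."""
    if bits_to_remove <= 0:
        return 0
    hi = (1 << (sample_bits - 1)) - 1
    lo = -(1 << (sample_bits - 1))
    half = 1 << (bits_to_remove - 1)
    clips = 0
    for s in samples:
        rounded = ((s + half) >> bits_to_remove) << bits_to_remove
        if rounded > hi or rounded < lo:
            clips += 1
    return clips


def settle_bits(channels, wanted_bits, max_clips=0, link_channels=False):
    """Return the bits-to-remove finally used for each channel.

    Unlinked: each channel is reduced on its own until its clips are acceptable.
    Linked: every channel shares the lowest value, so one channel drags all down.
    """
    if link_channels:
        wanted_bits = [min(wanted_bits)] * len(channels)
    settled = []
    for samples, bits in zip(channels, wanted_bits):
        while bits > 0 and count_clips(samples, bits) > max_clips:
            bits -= 1
        settled.append(bits)
    if link_channels:
        settled = [min(settled)] * len(channels)
    return settled


# Hypothetical block: the left channel has a full-scale sample, the right is quiet.
left = [32767, 100, -50]
right = [10, -20, 30]
independent = settle_bits([left, right], [6, 8])                   # [0, 8]
linked = settle_bits([left, right], [6, 8], link_channels=True)    # [0, 0]
```

In this made-up block the full-scale sample in the left channel forces its bits-to-remove to 0, but with independent channels the right channel still keeps its 8 bits.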

Reply #13
This is brilliant. If you look at the early MATLAB code, you'll know I had plans to check for near-silent channels and take them out of the calculation - now that Nick has spotted that wasted_bits is channel dependent, there's no need, and you have this great increase in efficiency. For signals where most of the content is in one channel, this makes a huge difference.

I wonder what other lossless codecs do? If any do wasted_bits per block, not per channel, then you're adding more noise than you can get any benefit from, which will reduce the lossless encoding efficiency slightly. Not a big deal, but it would make sense to have the option to turn it off. Leave it on by default though.


While you're playing in this area, it might be worth adding an option (it should not be there by default) to check the stereo difference channel S=(L-R) and run the analysis on this too, allowing it to drag the bits_to_remove down on the L and R channels if the value for S is lower than that for L or R. This will prevent unexpected things turning up in the surround channel on matrix surround systems (e.g. Dolby ProLogic). Add an optional offset too - e.g. bits_to_remove in L and R should never be more than x above the bits_to_remove calculated for S. Obviously calculate and compare in the threshold domain, not the bits_to_remove domain, because that's too coarse. I just explained it this way for simplicity.
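A minimal sketch of the S-channel check in the simplified bits_to_remove form used in the paragraph above. As noted there, a real implementation would compare in the threshold domain; the function names and the example numbers here are purely illustrative assumptions.

```python
def difference_channel(left, right):
    """Build the stereo difference channel S = L - R, sample by sample."""
    return [l - r for l, r in zip(left, right)]


def apply_s_limit(bits_left, bits_right, bits_s, offset=0):
    """Cap L and R bits-to-remove at the value found for S plus an offset."""
    ceiling = bits_s + offset
    return min(bits_left, ceiling), min(bits_right, ceiling)


# Hypothetical values: analysis of S allows only 2 bits, L and R allow 6 and 5.
limited = apply_s_limit(6, 5, 2, offset=1)    # (3, 3)
unchanged = apply_s_limit(2, 1, 4)            # (2, 1): S is not the limiting case
```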


For changes like these, I think it's probably worth putting out another stable release before going for the radical change of noise shaping. There's little chance of "breaking" anything at this point, whereas tuning of the next stage could take a long time. Might as well give the benefit of the improvement to the masses!

Cheers,
David.

Reply #14
Nick, beta v1.0.1b is the first release that doesn't start in my win98. "Program has performed an illegal action bla bla"  Unknown error in unknown module and all zeros. So something has changed.

Reply #15
It took me a bit by surprise how much the bitrate came down (on some tracks, not all). I will work out how to implement the parameter required to revert to the old method for codecs which cannot treat each channel separately for the purpose of wasted-bits.

I'll get this working properly and try to firm up the STDIN / STDOUT mechanisms before going further. Properly processing the S channel may have to wait until 1.2.0.

I would agree that the resultant bitrate reduction associated with this finding is important enough to warrant a 1.1.0 release earlier than expected and defer noise shaping to 1.2.0.

Nick, beta v1.0.1b is the first release that doesn't start in my win98. "Program has performed an illegal action bla bla"  Unknown error in unknown module and all zeros. So something has changed.
How are you running beta 1.0.1b? If using the STDIN / STDOUT option, it may still be a bit flaky: at 1.0.1b I was assuming both STDIN input and STDOUT output when using the '-' parameter. Beta 1.0.1c changes this a bit by allowing '-' to indicate STDIN input in isolation, and I am working on a '--nochunksin <bps> <channels> <rate>' parameter to allow "proper" STDIN input from FLAC, etc. The corresponding '--stdout' parameter is also in place and I am working on the '--nochunksout' parameter to disable any WAV information other than a stream of samples going to STDOUT. I will post beta 1.0.1c soon.

Reply #16
How are you running beta 1.0.1b?

Like any other first starts of new releases I started it from the command line without any parameters/options.  Normally one gets 'use proper syntax / type -help'.  This time it failed.

Reply #17
While you're playing in this area, it might be worth adding an option (it should not be there by default) to check the stereo difference channel S=(L-R) and run the analysis on this too [..]
David.

This would be very complicated with multi channel (>2) files and I somehow doubt the usefulness.

This will prevent unexpected things turning up in the surround channel on matrix surround systems (e.g. Dolby ProLogic).

I think this is new territory that has not (extensively) been tested with the current version as well. What should a matrix decoder do with white noise?
In theory, there is no difference between theory and practice. In practice there is.

Reply #18
You would not do it with "multi-channel" files - it is only relevant to stereo files.

I'm not sure what you mean by "What should a matrix decoder do with white noise?". It will decode it the same way it would any other signal, though if it's in the rear channels then some noise reduction might kick in, and if it's an active decoder (like Dolby Pro Logic) the "steering" might attenuate quieter sounds.


It is new territory.

Cheers,
David.

Reply #19
I've attached an example. a..._MS_done.flac is a critical sample for this issue, that you might choose to encode with lossyWAV.

If you take the .lossy version, decode the result through a matrix surround decoder, you will get noise in the rear channel. In the file a..._MS_done.lossy.MS.flac I've put the "centre" channel into the left channel, and the "rear" channel into the right channel, so you can hear the result easily. Note that it would be very difficult to hear in a real surround sound system, unless you put your ear right up to the rear speakers (some people do though!).

Even this critical sample isn't bad, and I can't imagine how it could ever get any worse than this, but it would be useful to have a switch to check the S channel to keep it as "clean" as the other (real!) "channels" - if not now, please add it to the list of features for the future! When the switch is activated, you should probably check M=L+R in the same way.
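The effect can be illustrated with a very simplified passive matrix, taking the "centre" as (L+R)/2 and the "rear" as (L-R)/2. This is an assumption for illustration only (real decoders add scaling, phase shifts and steering), and the sample values are made up: a signal that is identical in both channels has a silent rear channel, but once each channel has been rounded with its own bits-to-remove the rounding errors no longer cancel.

```python
def passive_matrix(left, right):
    """Derive simplified centre (sum) and rear (difference) signals."""
    centre = [(l + r) / 2 for l, r in zip(left, right)]
    rear = [(l - r) / 2 for l, r in zip(left, right)]
    return centre, rear


# Hypothetical original: identical content in both channels, so S is silent.
original_left = [1001, -2003, 3005]
original_right = [1001, -2003, 3005]
_, rear_before = passive_matrix(original_left, original_right)  # all zeros

# The same samples rounded to multiples of 16 (left) and of 4 (right).
processed_left = [1008, -2000, 3008]
processed_right = [1000, -2004, 3004]
_, rear_after = passive_matrix(processed_left, processed_right)  # [4.0, 2.0, 2.0]
```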

Cheers,
David.


Reply #20
Looking for a very high quality substitute for lossless archiving I ended up with v1.0.1b -q 7.0 --shaping 1.0.
Yields a bitrate of 528 kbps on average with my regular track set, which is 34 kbps more than without --shaping. But when listening to the correction file, the noise is so much less audible with noise shaping that it's worth spending this extra bitrate.
The bitrate difference is higher for lower quality settings, as I noticed before: with v1.0.1b -q 5.5, using --shaping 1.0 or not makes a difference of 46 kbps with my regular track set.
lame3995o -Q1.7 --lowpass 17


Reply #21
Looking for a very high quality substitute for lossless archiving I ended up with v1.0.1b -q 7.0 --shaping 1.0.
Yields a bitrate of 528 kbps on average with my regular track set, which is 34 kbps more than without --shaping. But when listening to the correction file, the noise is so much less audible with noise shaping that it's worth spending this extra bitrate.
The bitrate difference is higher for lower quality settings, as I noticed before: with v1.0.1b -q 5.5, using --shaping 1.0 or not makes a difference of 46 kbps with my regular track set.
I take it from that that you are content with the bit-removal process being channel dependent rather than the lowest of all channel bits-to-remove? Having listened to the output of the revised bit-removal, I am content with the results, and also with the improved efficiency when losslessly encoded.

I am still working on the STDIN and STDOUT processes. At present lossyWAV beta 1.0.1c can pipe raw audio to FLAC and have it correctly encoded (using lossywav wavfilename.wav --stdout | flac - -5 -b 512 --bps 16 --channels 2 --sample-rate 44100 --sign signed --endian little -f -o"wavfilename.lossy.flac"). It can also take input through STDIN (i.e. lossywav - <wavfilename.wav), in which case it writes "lossyWAV.lossy.wav".

I am having difficulty piping FLAC --stdout output or foobar2000 converter output into lossywav - I cannot find any documentation which details the transfer format for foobar2000. Edit: Using "flac -d wavfilename.flac --stdout|lossywav - -q 0" I got a lossywav processed file lossywav.lossy.wav - when encoded with FLAC it seems to have worked. However, a double pipe will not (yet, if ever) work.

lossyWAV beta 1.0.1c attached to post #1 in this thread.

NB: STDIN input (filename='-') only works if the --nochunksin parameter is NOT specified. At present using both in combination will cause the program to crash. This release was made specifically to see if collector's crash issue has been resolved....
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)


Reply #22
I take it from that that you are content with the bit-removal process being channel dependent rather than the lowest of all channel bits-to-remove? ....
Yes, absolutely. Honestly, I had always assumed that each channel was processed independently. I was rather surprised to learn that bits-to-remove was identical for all channels.

I am having difficulty piping FLAC --stdout output or foobar2000 converter output into lossywav ....
Maybe the WavPack documentation for the -i option helps:

-i = ignore length in wav header (no pipe output allowed)

Some programs that pipe data to encoders do not always give the correct length in the wav headers that they provide (foobar's clienc and CDex are examples). In these cases use this option to force WavPack to ignore the header and accept the actual length. Because WavPack must seek to the beginning of the file to write the correct length, this option cannot be used with piped output.


As you have to use a lossless codec afterwards it looks like we can have only piping in the input or piping in the output of LossyWAV.
Guess you're done: you use a temp wav file as lossyWAV input and piping to FLAC, and you don't benefit from having a piped input to lossyWAV and a temp wav file interface to the lossless codec.
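
To make the header-length problem concrete, here is a rough Python sketch of what "ignore length in wav header" could look like on the reading side. It is not taken from WavPack or lossyWAV; `read_wav_stream` and its `ignore_length` parameter are hypothetical names, and the zero length in the demo header is only a stand-in for whatever incorrect value a piping program might write.

```python
import io
import struct


def read_wav_stream(stream, ignore_length=True):
    """Parse a RIFF/WAVE byte stream and return (format_info, pcm_bytes).

    With ignore_length=True the size stored in the 'data' chunk header is
    not trusted: everything up to end-of-stream is treated as PCM (so any
    chunks following the audio would be swallowed too).
    """
    head = stream.read(12)
    if len(head) < 12:
        raise ValueError("stream too short for a RIFF header")
    riff, _riff_size, wave = struct.unpack("<4sI4s", head)
    if riff != b"RIFF" or wave != b"WAVE":
        raise ValueError("not a RIFF/WAVE stream")

    info = None
    while True:
        chunk_head = stream.read(8)
        if len(chunk_head) < 8:
            raise ValueError("no data chunk found")
        chunk_id, size = struct.unpack("<4sI", chunk_head)
        if chunk_id == b"data":
            break
        body = stream.read(size + (size & 1))  # chunks are padded to even size
        if chunk_id == b"fmt ":
            if len(body) < 16:
                raise ValueError("fmt chunk too short")
            tag, channels, rate, _byte_rate, _align, bits = struct.unpack(
                "<HHIIHH", body[:16])
            info = {"format_tag": tag, "channels": channels,
                    "sample_rate": rate, "bits_per_sample": bits}
    if info is None:
        raise ValueError("fmt chunk must precede the data chunk")

    pcm = stream.read() if ignore_length else stream.read(size)
    return info, pcm


# Demo: a header that claims 0 data bytes, followed by 8 bytes of PCM.
fmt_body = struct.pack("<HHIIHH", 1, 2, 44100, 176400, 4, 16)
demo = (b"RIFF" + struct.pack("<I", 0) + b"WAVE"
        + b"fmt " + struct.pack("<I", 16) + fmt_body
        + b"data" + struct.pack("<I", 0) + bytes(8))
info, pcm = read_wav_stream(io.BytesIO(demo))
print(info["channels"], info["sample_rate"], len(pcm))  # prints: 2 44100 8
```

With a real pipe the stream would be `sys.stdin.buffer` rather than a `BytesIO`. Reading only needs forward access, which is why ignoring the length works for piped input; the trouble described in the WavPack note is on the writing side, where the correct length is only known at the end and the writer has to seek back to fill it in.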
lame3995o -Q1.7 --lowpass 17


Reply #23
Yes, absolutely. Honestly, I had always assumed that each channel was processed independently. I was rather surprised to learn that bits-to-remove was identical for all channels.
Up until 1.0.0, the processing was carried out separately for each channel and then the minimum value was used; however, the maximum bits-to-remove was dependent on the average RMS over all channels. At 1.0.0, the minimum RMS of all channels was used.
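
The difference between using one bits-to-remove value for all channels and letting each channel keep its own can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the RMS-based cap on bits-to-remove is left out entirely, and rounding to the nearest multiple of 2^bits is an assumed stand-in for whatever lossyWAV actually does when it removes bits.

```python
def linked_bits_to_remove(per_channel_bits):
    """Same value for every channel: the minimum found across channels."""
    lowest = min(per_channel_bits)
    return [lowest] * len(per_channel_bits)


def independent_bits_to_remove(per_channel_bits):
    """Each channel keeps the value its own analysis produced."""
    return list(per_channel_bits)


def remove_bits(sample, bits):
    """Round an integer sample to the nearest multiple of 2**bits.

    Assumed rounding rule (round half up), for illustration only.
    """
    if bits == 0:
        return sample
    half = 1 << (bits - 1)
    return ((sample + half) >> bits) << bits


# Hypothetical analysis result for one block: left allows 5 bits, right 2.
analysed = [5, 2]
print(linked_bits_to_remove(analysed))       # prints: [2, 2]
print(independent_bits_to_remove(analysed))  # prints: [5, 2]
print(remove_bits(100, 3))                   # prints: 104
```

In the linked case the channel that could tolerate more noise is held back by the other one, which is where the efficiency gain of channel-independent removal comes from.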

Maybe the WavPack documentation for the -i option helps:

-i = ignore length in wav header (no pipe output allowed)

Some programs that pipe data to encoders do not always give the correct length in the wav headers that they provide (foobar's clienc and CDex are examples). In these cases use this option to force WavPack to ignore the header and accept the actual length. Because WavPack must seek to the beginning of the file to write the correct length, this option cannot be used with piped output.


As you have to use a lossless codec afterwards it looks like we can have only piping in the input or piping in the output of LossyWAV.
Guess you're done: you use a temp wav file as lossyWAV input and piping to FLAC, and you don't benefit from having a piped input to lossyWAV and a temp wav file interface to the lossless codec.
I would *really* like to implement piped input and output in foobar as it is the major processing bottleneck now.
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)