TransPCM—use Float16/24 to reduce bit-depth, also promotes compression, 2013-05-30: beta 0.1.3a / formerly HalfPrecision / ref.: IEEE 754-2008
Nick.C
post Sep 12 2011, 22:03
Post #1


lossyWAV Developer


Group: Developer
Posts: 1785
Joined: 11-April 07
From: Wherever here is
Member No.: 42400



When I first became aware of the Float16 type, I found it interesting as a potential lossy storage type.

Attached is a processed version of the sample in this thread which has been decoded back to PCM. The processing is as follows (at this stage as there is no player that I am aware of):
Input file Sample [PCM] > FLOAT16 > [PCM] > Output File.

The conversion from integer to Float16 reduces the number of significant bits of the 24-bit sample to (a maximum of) 11. There is therefore an amount of added noise due to truncation. The added noise has been noise shaped using SG's adaptive noise shaping method as used in lossyWAV.

Original Sample

Lossy Float16 Processed Sample [edit] removed due to TransPCM executable / foobar2000 Float16 playback availability, 29th May 2012. [/edit]

If the 24-bit coded sample is encoded with FLAC (wFormatTag left as 0x0001 as FLAC does not handle Floating Point samples and wBitsPerSample is left as 24) there is a saving of in excess of 20% in filesize.

I don't see any practical advantage using this storage type for native 16-bit PCM.

Changelog:

TransPCM beta 0.1.3a 30/05/2013
  • Bug fix in Float24 handling (thanks Bryant!).
  • Bug fix in WAVE_FORMAT_EXTENSIBLE chunk handling (thanks again Bryant!).
TransPCM beta 0.1.2 02/06/2012
  • Modification to adaptive noise shaping for 44.1/48kHz input;
  • Amendment to tracer fingerprinting.


This post has been edited by Nick.C: May 30 2013, 07:57
Attached File(s)
Attached File  TransPCM_beta_0.1.3a.zip ( 79.66K ) Number of downloads: 73
 


--------------------
lossyWAV -q X -a 4 --feedback 4| FLAC -8 ~= 320kbps
AndyH-ha
post Sep 12 2011, 22:11
Post #2





Group: Members
Posts: 2205
Joined: 31-August 05
Member No.: 24222



This differs from simply reducing the bit depth?
Nick.C
post Sep 12 2011, 22:19
Post #3





Yes - in that values up to c. +/- 2^31 can still be stored (with an appropriate scaling factor). The granularity changes as sample absolute values get bigger. In this way, the original dynamic range can be preserved more effectively.

ABS(Input Sample) = 0 to 2047 : Output difference = 0;
ABS(Input Sample) = 2048 to 4095 : Output difference = 0 to -1;
ABS(Input Sample) = 4096 to 8191 : Output difference = 0 to -3;
ABS(Input Sample) = 8192 to 16383 : Output difference = 0 to -7;
ABS(Input Sample) = 16384 to 32767 : Output difference = 0 to -15;
ABS(Input Sample) = 32768 to 65535 : Output difference = 0 to -31;
etc.
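
The ranges listed above follow from keeping only the top 11 significant bits of each sample's magnitude and discarding the rest. A minimal Python sketch, assuming plain truncation toward zero with no noise shaping (the helper names `truncate_to_bits` and `worst_difference` are hypothetical and not part of any tool in this thread):

```python
def truncate_to_bits(sample: int, bits: int = 11) -> int:
    """Keep the top `bits` significant bits of |sample| and zero the rest."""
    magnitude = abs(sample)
    shift = max(0, magnitude.bit_length() - bits)
    kept = (magnitude >> shift) << shift
    return -kept if sample < 0 else kept


def worst_difference(lo: int, hi: int) -> int:
    """Most negative (output - input) over the inclusive range lo..hi."""
    return min(truncate_to_bits(n) - n for n in range(lo, hi + 1))


# Reproduce the rows of the list: each doubling of the range loses one more bit.
for lo, hi in [(0, 2047), (2048, 4095), (4096, 8191),
               (8192, 16383), (16384, 32767), (32768, 65535)]:
    print(f"ABS(input) = {lo} to {hi}: difference = 0 to {worst_difference(lo, hi)}")
```

Passing a smaller `bits` value gives a rough feel for what fewer mantissa bits would cost, although the real encoder also applies noise shaping, which this sketch ignores.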


saratoga
post Sep 12 2011, 22:32
Post #4





Group: Members
Posts: 4853
Joined: 2-September 02
Member No.: 3264



Isn't this pretty much what ADPCM does? Except I guess without the differential encoding.
C.R.Helmrich
post Sep 12 2011, 23:05
Post #5





Group: Developer
Posts: 686
Joined: 6-December 08
From: Erlangen Germany
Member No.: 64012



You mean a-Law/µ-Law?

Makes me wonder how many bits suffice to code the range -32768...32767. IOW, let the section "ABS(Input Sample) = 16384 to 32767" be the outermost one. Is a float13 enough?

Chris

This post has been edited by C.R.Helmrich: Sep 12 2011, 23:09


--------------------
If I don't reply to your reply, it means I agree with you.
bryant
post Sep 13 2011, 06:13
Post #6


WavPack Developer


Group: Developer (Donating)
Posts: 1290
Joined: 3-January 02
From: San Francisco CA
Member No.: 900



QUOTE (Nick.C @ Sep 12 2011, 14:03) *
Attached is a processed version of the sample in this thread which has been decoded back to PCM. The processing is as follows (at this stage as there is no player that I am aware of):
Input file Sample [PCM] > FLOAT16 > [PCM] > Output File.

What you are describing is very similar to WavPack lossy. The difference is that you also want to add a decorrelation step before truncating to float16 to reduce the magnitude of the samples. This could be as simple as using the delta between samples, but it would be even better to have an adaptive filter to better handle some difficult cases.
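
The simplest form of the decorrelation step mentioned here (coding the delta between samples) can be sketched as follows. This is generic closed-loop delta coding written for illustration only; it is not WavPack's algorithm, and `keep_top_bits` and `delta_encode` are hypothetical names. The encoder predicts each sample from the previous *decoded* value, so truncation errors do not accumulate:

```python
def keep_top_bits(value: int, bits: int = 11) -> int:
    """Truncate |value| to its top `bits` significant bits (Float16-like)."""
    magnitude = abs(value)
    shift = max(0, magnitude.bit_length() - bits)
    kept = (magnitude >> shift) << shift
    return -kept if value < 0 else kept


def delta_encode(samples, bits: int = 11):
    """Return (residuals, decoded) for a closed-loop delta coder."""
    previous = 0                      # decoder state, mirrored in the encoder
    residuals, decoded = [], []
    for sample in samples:
        residual = keep_top_bits(sample - previous, bits)
        previous += residual          # what the decoder will reconstruct
        residuals.append(residual)
        decoded.append(previous)
    return residuals, decoded


samples = [100000, 100010, 100025, 99990]
residuals, decoded = delta_encode(samples)
print("direct truncation:", [keep_top_bits(s) for s in samples])
print("delta residuals:  ", residuals)
print("delta decoded:    ", decoded)
```

Because the residuals are much smaller than the samples, most of them fit in 11 bits without loss, which is the benefit the decorrelation step is meant to provide.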
Nick.C
post Sep 15 2011, 21:09
Post #7





The plan was to simply set wFormatTag= 0x0003 and wBitsPerSample =16. Any compliant player could play the WAV file without any decoding.
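
As an illustration of that plan, here is a minimal Python sketch that builds such a file in memory. The function name `float16_wav_bytes` and its defaults are assumptions made for this example. It writes only the basic 16-byte `fmt ` chunk; a stricter writer would probably also add a `cbSize` field and a `fact` chunk, which the WAVE specification expects for non-PCM formats, and nothing in this thread establishes which players require them.

```python
import struct


def float16_wav_bytes(samples, sample_rate=44100, channels=1):
    """Build a minimal RIFF WAVE image holding Float16 samples."""
    data = b"".join(struct.pack("<e", s) for s in samples)
    block_align = channels * 2       # 2 bytes per Float16 sample
    fmt = struct.pack(
        "<HHIIHH",
        0x0003,                      # wFormatTag: IEEE floating point
        channels,                    # nChannels
        sample_rate,                 # nSamplesPerSec
        sample_rate * block_align,   # nAvgBytesPerSec
        block_align,                 # nBlockAlign
        16,                          # wBitsPerSample
    )
    chunks = (b"fmt " + struct.pack("<I", len(fmt)) + fmt
              + b"data" + struct.pack("<I", len(data)) + data)
    # The RIFF size field counts everything after itself, including "WAVE".
    return b"RIFF" + struct.pack("<I", 4 + len(chunks)) + b"WAVE" + chunks


wav = float16_wav_bytes([0.0, 1024.0, -1024.0, 65504.0])
print(len(wav), wav[:4], wav[8:12])
```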


knutinh
post Sep 16 2011, 13:49
Post #8





Group: Members
Posts: 569
Joined: 1-November 06
Member No.: 37047



float16 could be very efficient on GPU hardware if that is relevant to you.

-k
benski
post Sep 16 2011, 14:03
Post #9


Winamp Developer


Group: Developer
Posts: 670
Joined: 17-July 05
From: Brooklyn, NY
Member No.: 23375



QUOTE (Nick.C @ Sep 15 2011, 16:09) *
The plan was to simply set wFormatTag= 0x0003 and wBitsPerSample =16. Any compliant player could play the WAV file without any decoding.


From experience, I doubt many media players will play that.
Nick.C
post Sep 16 2011, 19:26
Post #10





I understand that currently players are rather unlikely to accept a 16-bit floating point WAV file. However, the RIFF standard effectively allows the use of Float16 even if it wasn't a standard at the time that the RIFF standard was set.

Decoding Float16 values is trivial - I have written a basic decoder (no NaN, Infinity handling) to both Float64 and (scaled) Integer output in Delphi and separately in x86/x87 assembly - not too many lines of code per sample.


Nick.C
post Sep 22 2011, 13:05
Post #11





Below is Delphi / IA32 & x87 code used to encode Double to Float16 and decode Float16 to Double (actually extended [80-bit] in this case). Hopefully someone will find it useful.

Compatibility at a player level would be a great starting point (i.e. simply playing a RIFF WAVE with wFormatTag=0x0003 and wBitsPerSample=0x0010).

CODE
//==============================================================================
//
// Float16 encode / decode functions;
//
// 2011 Nick Currie
//
//==============================================================================

Type
  Float16 = Word;

  SingleRec = Packed Record
    Case Integer of
      1:(Single : Single);
      2:(Integer : Integer);
      3:(Words : Array[0..1] of Word);
      4:(BytePx : Byte; WordP1 : Word;);
      5:(Bytes : Array[0..3] of Byte);
  End;

  DoubleRec = Packed Record
    Case Integer of
      1:(Double : Double);
      2:(Int64 : Int64);
      3:(Integers : Array[0..1] of Integer);
      4:(Words : Array[0..3] of Word);
      5:(BytePx : Byte; WordP1,WordP3,WordP5 : Word;);
      6:(Bytes : Array[0..7] of Byte);
  End;

  ExtendedRec = Packed Record
    Case Integer of
      1:(Extended : Extended);
      2:(Int64 : Int64);
      3:(Integers : Array[0..1] of Integer);
      4:(Words : Array[0..4] of Word);
      5:(Bytes : Array[0..9] of Byte);
  End;

Var
  // Lookup table: two[n] must hold 2^n before F16_to_F80 is called.
  Powersof : Packed Record
    two : array[-1024..1023] of Double;
  End;

{$IFNDEF USEASM}
//============================================================================
Function F80_to_F16(F80_Value : Extended) : Float16;
//============================================================================
Var
  Exponent : Integer;
  Sign     : Word;
Begin
//============================================================================
  // Words[4] holds the sign bit and the 15-bit biased exponent.
  Sign:=ExtendedRec(F80_Value).Words[4] and $8000;

  If (F80_Value=0) then
    F80_to_F16:=Sign
  else
  Begin
    Exponent:=(ExtendedRec(F80_Value).Words[4] and $7FFF)-$3FFF;
    ExtendedRec(F80_Value).Words[4]:=ExtendedRec(F80_Value).Words[4] or $3FFF;

    if Exponent>15 then
      // Too large for Float16: return signed infinity.
      F80_to_F16:=Sign or $7C00
    else
    If (Exponent>-15) then
      // Normalised: drop the explicit integer bit, keep the top 10 fraction bits.
      F80_to_F16:=(Sign or ((Exponent+15) shl 10) or ((ExtendedRec(F80_Value).Words[3] and $7FFF) shr 5))
    else
    if Exponent>-25 then
      // Denormal: keep the integer bit and shift it down into the mantissa.
      F80_to_F16:=(Sign or (ExtendedRec(F80_Value).Words[3] shr (-Exponent-9)))
    else
      // Too small for Float16: return signed zero.
      F80_to_F16:=Sign;
  End;
//============================================================================
End;
//============================================================================
{$ELSE}
//============================================================================
Function F80_to_F16(F80_Value : Extended) : Float16; Assembler;
//============================================================================
asm
push ecx
push edx

xor eax,eax
xor edx,edx

mov cx,word ptr F80_Value+8
mov dh,ch
and dh,$80
and cx,$7FFF
jz @@EX

sub cx,$3FFF

cmp cx,-$19
jle @@EX

mov ah,$7C
cmp cx,$10
jge @@EX

@@01: mov ax,word ptr F80_Value+6

cmp cx,-$0F
jg @@02

neg cl
sub cl,$09
shr eax,cl
jmp @@EX

@@02: add cl,$0F
shr ax,$05
shl cl,$02
and ax,$03FF
or ah,cl

@@EX: or ax,dx
pop edx
pop ecx
End;
//============================================================================
{$ENDIF}

{$IFNDEF USEASM}
//============================================================================
Function F64_to_F16(F64_Value : Double) : Float16;
//============================================================================
Var
  Exponent : Integer;
  Sign     : Word;
Begin
//============================================================================
  DoubleRec(F64_Value).Double:=F64_Value;
  // Words[3] holds the sign bit, the 11-bit biased exponent and 4 fraction bits.
  Sign:=DoubleRec(F64_Value).Words[3] and $8000;

  If (F64_Value=0) then
    F64_to_F16:=Sign
  else
  Begin
    Exponent:=(((DoubleRec(F64_Value).Words[3] and $7FF0) shr 4)-$3FF);
    DoubleRec(F64_Value).Words[3]:=((DoubleRec(F64_Value).Words[3] and $000F) or $3FF0);

    if Exponent>15 then
      // Too large for Float16: return signed infinity.
      F64_to_F16:=Sign or $7C00
    else
    If (Exponent>-15) then
      // Normalised: bits 2..11 of WordP5 are the top 10 fraction bits.
      F64_to_F16:=(Sign or ((Exponent+15) shl 10) or ((DoubleRec(F64_Value).WordP5 shr 2) and $3FF))
    else
    if Exponent>-25 then
      // Denormal: restore the implicit leading 1 ($1000) and shift it down.
      F64_to_F16:=(Sign or (((DoubleRec(F64_Value).WordP5 or $1000) and $1FFE) shr (-Exponent-12)))
    else
      // Too small for Float16: return signed zero.
      F64_to_F16:=Sign;
  End;
//============================================================================
End;
//============================================================================
{$ELSE}
//============================================================================
Function F64_to_F16(F64_Value : Double) : Float16; Assembler;
//============================================================================
asm
push ecx
push edx

xor edx,edx
xor eax,eax

mov cx,word ptr F64_Value+6
mov dh,ch
and dh,$80
and cx,$7FF0
jz @@EX

shr cx,4
sub cx,$3FF

cmp cx,-$19
jle @@EX

mov ah,$7C
cmp cx,$10
jge @@EX

@@01: mov ax,word ptr F64_Value+5
shr ax,2
and ax,$03FF

cmp cx,-$0F
jg @@02

or ah,$04
neg cl
and ax,$07FF
sub cl,14
shr eax,cl
jmp @@EX

@@02: add cl,15
shl cl,2
or ah,cl

@@EX: or ax,dx
pop edx
pop ecx
End;
//============================================================================
{$ENDIF}


{$IFNDEF USEASM}
//============================================================================
Function F16_to_F80(F16_Value : Float16) : Extended;
//============================================================================
Begin
//============================================================================
  // No NaN / Infinity handling: exponent field $1F is decoded like any other.
  // The table index is (exponent field - 25): -15 for the bias, -10 for the mantissa.
  if F16_Value and $8000<>0 then
  Begin
    if ((F16_Value and $7FFF)=0) then
      F16_to_F80:=-0
    else
    if ((F16_Value and $7C00)=0) then
      // Denormal: mantissa * 2^-24.
      F16_to_F80:=-(F16_Value and $3FF)*powersof.two[-24]
    else
      // Normalised: restore the implicit leading 1 ($400).
      F16_to_F80:=-((F16_Value and $3FF) or $400)*powersof.two[((F16_Value shr 10) and $1F)-25]
  End
  else
  Begin
    if F16_Value=0 then
      F16_to_F80:=0
    else
    if ((F16_Value and $7C00)=0) then
      F16_to_F80:=(F16_Value and $3FF)*powersof.two[-24]
    else
      F16_to_F80:=((F16_Value and $3FF) or $400)*powersof.two[((F16_Value shr 10) and $1F)-25]
  End;
//============================================================================
End;
//============================================================================
{$ELSE}
//============================================================================
Function F16_to_F80(F16_Value : Float16) : Extended;
//============================================================================
asm
push ecx
push edx

movzx eax,F16_Value
mov dl,ah
and ax,$7FFF
jnz @@01

fldz
jmp @@EX

@@01: fld1
test dl,$80
jz @@02

fchs

@@02: mov ecx,eax
and ax,$3FF
shr cx,10
mov edx,1000
push eax
and cl,$1F
jz @@03

or word ptr [esp],$400
mov dx,999
add dx,cx

@@03: fimul dword ptr [esp]
fmul qword ptr powersof.two[edx*double_size]

@@04: pop eax

@@EX: pop edx
pop ecx
//============================================================================
End;
//============================================================================
{$ENDIF}
[edit] Code update. [/edit]
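
For readers without a Delphi compiler, the same bit manipulation can be sketched in Python. This is an illustrative port, not Nick.C's code: the function names are hypothetical, the encoder truncates rather than rounds (as the Delphi listing does), and the decoder likewise has no NaN or Infinity handling.

```python
import struct


def f64_to_f16_trunc(value: float) -> int:
    """Encode a double as a Float16 bit pattern, truncating extra mantissa bits."""
    bits = struct.unpack("<Q", struct.pack("<d", value))[0]
    sign = (bits >> 48) & 0x8000
    exponent = ((bits >> 52) & 0x7FF) - 1023     # unbiased double exponent
    fraction = (bits >> 42) & 0x3FF              # top 10 fraction bits
    if value == 0:
        return sign
    if exponent > 15:                            # too large: signed infinity
        return sign | 0x7C00
    if exponent > -15:                           # normalised Float16
        return sign | ((exponent + 15) << 10) | fraction
    if exponent > -25:                           # denormal Float16
        return sign | ((fraction | 0x400) >> (-14 - exponent))
    return sign                                  # too small: signed zero


def f16_to_f64(pattern: int) -> float:
    """Decode a Float16 bit pattern (exponent field 0x1F is not special-cased)."""
    sign = -1.0 if pattern & 0x8000 else 1.0
    exponent = (pattern >> 10) & 0x1F
    mantissa = pattern & 0x3FF
    if exponent == 0:                            # zero or denormal
        return sign * mantissa * 2.0 ** -24
    return sign * (mantissa | 0x400) * 2.0 ** (exponent - 25)


encoded = f64_to_f16_trunc(4095.0)
print(hex(encoded), f16_to_f64(encoded))
```

Python's `struct` module supports the `e` (binary16) format, which makes it a convenient independent check on the decoder for every finite bit pattern.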

This post has been edited by Nick.C: Oct 27 2011, 07:26


Nick.C
post Oct 20 2011, 12:53
Post #12





HalfPrecision beta 0.0.2cd attached. [bugfix] [superseded].

HalfPrecision is a tool to convert PCM in WAV format to Float16 in WAV format. Noise is added during the conversion process. By default this noise is adaptively shaped, however fixed noise shaping is also available along with the option to disable all noise shaping.

The -P, --precision <n> parameter selects the number of mantissa bits to use (8<=n<=11). Reducing the number of mantissa bits increases added noise.

By default, processed audio is output in Float16 format. The --decoded parameter converts the processed audio from Float16 back to PCM at the original sample bit depth (which may add a further small amount of noise) so that the audio can be played in existing players.

NB: the Float16 audio is normalised in the range ±65504.0 rather than the standard ±1.0 (32bit and 64bit floating point audio). This allows a greater dynamic range of sample values (±65504 to ±6.1035E-05 [normalised]; ±5.9605E-08 [denormal]) to be stored. Returning samples to expected ±1.0 range is achieved by simple multiplication by 2^-16.
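
The limits quoted above can be checked directly from the Float16 bit patterns. A small Python sketch, relying on the `e` (binary16) format of the standard `struct` module; `half_from_bits` and `to_unit_range` are hypothetical helper names:

```python
import struct


def half_from_bits(pattern: int) -> float:
    """Interpret a 16-bit pattern as an IEEE 754 binary16 value."""
    return struct.unpack("<e", struct.pack("<H", pattern))[0]


largest = half_from_bits(0x7BFF)            # largest finite value
smallest_normal = half_from_bits(0x0400)    # smallest normalised value
smallest_denormal = half_from_bits(0x0001)  # smallest denormal value


def to_unit_range(sample: float) -> float:
    """Scale a sample from the +/-65504 convention back towards +/-1.0."""
    return sample * 2.0 ** -16


print(largest, smallest_normal, smallest_denormal)
print(f"{largest / smallest_normal:.4E}", f"{largest / smallest_denormal:.4E}")
print(to_unit_range(largest))
```

Note that full scale maps to 65504 / 65536, slightly below 1.0, because 65504 is the largest finite Float16 value.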

Bug-reports, comments and (constructive) criticism are welcomed.

This post has been edited by Nick.C: Oct 24 2011, 18:26


lvqcl
post Oct 22 2011, 10:54
Post #13





Group: Developer
Posts: 3328
Joined: 2-December 07
Member No.: 49183



foobar2000 1.1.9 beta1: "Implemented experimental support for 16-bit floating-point WAV files."
Nick.C
post Oct 22 2011, 14:24
Post #14





:D


Nick.C
post Oct 24 2011, 18:23
Post #15





HalfPrecision beta 0.0.2e attached [superseded]. Fix to allow encoding to lossy.wav (and optional correction file) using foobar2000. "--stdinname %d" must be used in the command line parameters.

It occurs to me that a free-to-use (and platform agnostic) analog to HDCD could be created using Float16 encoded PCM written to a conventional CD. The trick would be to tell the player (presumably software) in some way that the audio is encoded in Float16. I'm guessing some sort of periodic LSB encoding (minimal noise), maybe 32 to 64 LSBs per (codec-block x channels)?

[edit] beta 0.0.2e superseded [/edit]

This post has been edited by Nick.C: Nov 1 2011, 13:49


pdq
post Oct 24 2011, 19:29
Post #16





Group: Members
Posts: 3372
Joined: 1-September 05
From: SE Pennsylvania
Member No.: 24233



I once had an idea for how to encode audio data. Take the square root of each data point, keeping the original sign. Decoding consists of squaring the encoded value, again keeping the sign. The result is that you lose about one bit of resolution on full-scale values, but you almost double the number of bits on small values.

For example, take 24 bit signed integer data, square root it and keep 12 bits. The values decode back to 24 bits with 15 bits of resolution on full-scale values.
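
The scheme described above can be sketched in a few lines of Python. The names `sqrt_encode` and `sqrt_decode` are hypothetical, rounding to the nearest square is an assumption (the post does not say how the root is quantised), and no attempt is made here to verify the bit counts quoted above.

```python
import math


def sqrt_encode(sample: int) -> int:
    """Signed square root, choosing the root whose square is nearest."""
    magnitude = abs(sample)
    root = math.isqrt(magnitude)              # floor of the square root
    if magnitude - root * root > root:        # nearer to (root + 1) ** 2
        root += 1
    return -root if sample < 0 else root


def sqrt_decode(code: int) -> int:
    """Square the stored value, keeping its sign."""
    return -(code * code) if code < 0 else code * code


for sample in (3, -9, 1000, 8388607):
    code = sqrt_encode(sample)
    print(sample, code, sqrt_decode(code))
```

With this rounding the reconstruction error never exceeds the magnitude of the stored code, so it grows with signal level in much the same spirit as the Float16 step size.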
Nick.C
post Oct 24 2011, 21:02
Post #17





That would be one way of doing it. However sample size range would be ±32767² to ±1², i.e. 1.0737E+09:1. Sample range for Float16, as stated above, is ±65504 to ±6.1035E-05 [normalised]; ±5.9605E-08 [denormal], i.e. 1.0732E+09:1 [normalised]; 1.0990E+12:1 [denormal].

One idea for a signature / watermark to indicate Float16 encoding would be to use the bit pattern of 'Float16' with a bit inserted at either end and between the bits from each character, i.e. $231B0DE613A0C46C. This, when evaluated as a double precision float, equates to 1.41990141750237441E-0139. This pattern could be inserted, one bit per 7 samples, with a recurrence of, say, 512 samples, irrespective of number of channels (data span = 441 samples).
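
The construction of that 64-bit pattern can be reproduced in Python. The sketch below assumes each inserted bit is a 0 and each character contributes its 8-bit ASCII code, which is the reading that yields the hex value quoted above; `float16_signature` is a hypothetical name.

```python
import struct


def float16_signature(text: str = "Float16") -> int:
    """Pack the characters with a 0 bit before, between and after them."""
    value = 0
    for ch in text:
        value = (value << 9) | ord(ch)   # one inserted 0 bit + 8 character bits
    return value << 1                    # trailing inserted 0 bit


signature = float16_signature()
# Reinterpret the 64 bits as a big-endian double precision float.
as_double = struct.unpack(">d", signature.to_bytes(8, "big"))[0]
# One signature bit every 7 samples: offsets of the 64 bits within a block.
bit_positions = [7 * i for i in range(64)]

print(f"${signature:016X}", as_double, bit_positions[-1])
```

The last bit lands 63 x 7 = 441 samples after the first, matching the data span given in the post and fitting inside the suggested 512-sample recurrence.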

This post has been edited by Nick.C: Oct 24 2011, 21:08


Nick.C
post Nov 1 2011, 13:53
Post #18





HalfPrecision beta 0.0.2f attached [edit] superseded [/edit].

Changelog:
  • Now outputs "halfp" and "hpcdf" files, i.e. halfp.wav and hpcdf.wav.
  • Fixes to encoding to ensure proper coding when precision limited to fewer than 10 mantissa bits.
  • Tweaks to adaptive noise shaping coefficients.


This post has been edited by Nick.C: Nov 10 2011, 13:42


Nick.C
post Nov 10 2011, 13:41
Post #19





HalfPrecision beta 0.0.2g attached. [edit] Superseded. [/edit]

Changelog:
  • Output type can now be selected from 16/32 bit float; 16/24/32 bit integer;
  • Valid input types now include 8*/16/32 bit float; 8/16/24/32 bit integer;
The so-called 8 bit floating point type turned up when I was searching for details of the Float16 type some time ago. The 8 bits comprise: sign / 4-bit exponent / 3-bit mantissa; exponent bias is -2; minimum non-zero = ±1; maximum = ±122880. Precision is basic (i.e. 0 to 7 are denormal (integer, exponent=0), then from 8 upwards, 1.000 to 1.875 x 2^(exponent+2)). I'm still evaluating this type to see if it merits further work.
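
A decoder for this layout can be sketched in Python as below. The bit order (sign in the top bit, then exponent, then mantissa) and the function name `float8_to_int` are assumptions. Treating exponent 15 as reserved is also an assumption, inferred only from the stated maximum of ±122880, which is 1.875 x 2^16 and therefore corresponds to exponent 14.

```python
def float8_to_int(code: int) -> int:
    """Decode the 8-bit float: sign / 4-bit exponent / 3-bit mantissa."""
    sign = -1 if code & 0x80 else 1
    exponent = (code >> 3) & 0x0F
    mantissa = code & 0x07
    if exponent == 0:
        # Denormal range: the plain integers 0..7.
        return sign * mantissa
    if exponent == 15:
        # ASSUMPTION: reserved (e.g. Infinity / NaN), not decoded here.
        raise ValueError("exponent 15 is treated as reserved in this sketch")
    # (1 + mantissa/8) * 2**(exponent + 2), kept in integer arithmetic.
    return sign * ((8 + mantissa) << (exponent - 1))


for code in (0x01, 0x07, 0x08, 0x0F, 0x10, 0x77, 0xF7):
    print(f"{code:#04x} -> {float8_to_int(code)}")
```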

Output to 8-bit unsigned integer is not yet enabled - I'm still trying to iron out some issues with clipping and adaptive noise shaping.

This post has been edited by Nick.C: Nov 14 2011, 21:23


Didjeridoo
post Nov 11 2011, 11:36
Post #20





Group: Members
Posts: 16
Joined: 8-June 10
Member No.: 81308



Nick.C, thank you for your innovation.
What remains is to add FLOAT16 support to FLAC, WavPack, TAK...
P.S. lossyWAV does not support FLOAT16 correctly.

This post has been edited by Didjeridoo: Nov 11 2011, 11:59


--------------------
MPC --quality 10 --tmn 20 --nmt 20 - %d || WV -miqhnb5x3 - %d
Nick.C
post Nov 11 2011, 23:05
Post #21





@Didjeridoo: Thanks - I am also looking forward to compatibility with a lossless format. The codebase for HalfPrecision is that of lossyWAV, so it is not inconceivable that an updated version of lossyWAV will appear at some point with the input capabilities of HalfPrecision. Output will still only be integer PCM though.

As an aside, and taking on board an early comment by Chris, I am also working on implementing read/write capability for µ-Law and A-Law samples.


Nick.C
post Nov 14 2011, 21:22
Post #22





HalfPrecision beta 0.0.2h attached. [edit] Superseded [/edit]

Changelog:
  • Now reads / writes CCITT 8-bit µLaw and ALaw;
  • Now writes 8-bit unsigned integer.
  • Work on so-called 8-bit float terminated after µLaw and ALaw implementation.


This post has been edited by Nick.C: Nov 15 2011, 20:31


Didjeridoo
post Nov 15 2011, 10:32
Post #23








Nick.C, I am watching all the moves and developments.
This is not the end; FLOAT24 is also needed.
P.S. foobar2000 not read / write CCITT 8-bit µLaw and ALaw.


lvqcl
post Nov 15 2011, 15:53
Post #24








QUOTE (Didjeridoo @ Nov 15 2011, 13:32) *
P.S. foobar2000 not read / write CCITT 8-bit µLaw and ALaw.

It does read them.
Ljubo44
post Nov 15 2011, 19:13
Post #25





Group: Members
Posts: 33
Joined: 16-January 11
Member No.: 87368



Do you have a sample command line for foobar2000?