IPB

Welcome Guest ( Log In | Register )

2 Pages V   1 2 >  
Reply to this topicStart new topic
Lossy-based lossless coding, Stupid idea
Efenstor
post May 15 2004, 13:34
Post #1





Group: Members
Posts: 9
Joined: 25-August 03
Member No.: 8520



I've got a rather stupid idea: did anybody try to use a wavelet/DCT/FFT-based compression to make a "point of reference" for lossless compression?

I mean, you compress high frequencies using MP3-like but NOT psychoacoustic coding, then decode the frame, find differences with original uncompressed frame and compress those differences using some strong compression together with that lossy frame.

I think in any case such compression would be much more effective than La/Monkey's Audio/Optimfrog.

Or maybe those encoders are using it already?
Go to the top of the page
+Quote Post
JohnV
post May 15 2004, 14:45
Post #2





Group: Developer
Posts: 2797
Joined: 22-September 01
Member No.: 6



Wavpack hybrid compression can do this.
There's not any gain in efficiency though.


--------------------
Juha Laaksonheimo
Go to the top of the page
+Quote Post
Efenstor
post May 15 2004, 17:52
Post #3





Group: Members
Posts: 9
Joined: 25-August 03
Member No.: 8520



No gain? It's impossible! For wave data frequency representation is much more efficient than byte-by-byte. Probably they used psychoacoustic-based approach in their hybrid compression but it should be not human-ear- but compression-effectiveness-oriented! For example: compress the range of mostly low and mid-range frequencies, they can be compressed using byte-by-byte methods well enough, use DCT or wavelets mostly for high frequencies.

It must give some gain anyway! I compared files compressed using MP3 128 kbps with original using WaveLab's comparison function: differences are rather subtle (at least for 16-bit data).

And what if to use the multi-stage approach: differences are also compressed using wavelets, then unpacked and compared with the original differences, then those differences are compressed using Huffman method (with delta-compression) or/and maybe with some dictionary-based method and etc.
Go to the top of the page
+Quote Post
robUx4
post May 15 2004, 18:17
Post #4


Matroska Developer


Group: Developer (Donating)
Posts: 410
Joined: 14-March 02
From: Paris
Member No.: 1519



I think this should be investigated more. Imagine HE-AAC with the difference losslessly encoded. I think it's possible to gain efficiency compared to usual lossless codec.


--------------------
http://www.matroska.org/ : the best vapourware / http://robux4.blogspot.com/
Go to the top of the page
+Quote Post
Garf
post May 15 2004, 18:27
Post #5


Server Admin


Group: Admin
Posts: 4886
Joined: 24-September 01
Member No.: 13



QUOTE (Efenstor @ May 15 2004, 06:52 PM)
No gain? It's impossible! For wave data frequency representation is much more efficient than byte-by-byte.

Not if it has to decode back identically to the 'byte-by-byte' version.
Go to the top of the page
+Quote Post
Garf
post May 15 2004, 18:29
Post #6


Server Admin


Group: Admin
Posts: 4886
Joined: 24-September 01
Member No.: 13



QUOTE (Efenstor @ May 15 2004, 06:52 PM)
It must give some gain anyway! I compared files compressed using MP3 128 kbps with original using WaveLab's comparison function: differences are rather subtle (at least for 16-bit data).

Try to losslessy encode the difference and look at that bitrate. It'll be surprisingly high. Now add the mp3's bitrate.

Compare to plain lossless.
Go to the top of the page
+Quote Post
Efenstor
post May 15 2004, 18:38
Post #7





Group: Members
Posts: 9
Joined: 25-August 03
Member No.: 8520



It is becuase psychoacoustic-based lossy compression has been used! It should be optimized to simplify compression of the difference data! It should be not psychoacoustic, it's compressionacoustic.

The major idea of good compression is to predict what byte will be next and use it as the point of reference to reduce data which corrects the predicted value. I think that "bad quality" audio can be used as a point of reference very efficiently.

Anyway audio compression is not my profession, I'm a desktop programmer (mostly image processing and file utilities), I think someone who made some lossless codec would experiment a bit with this approach.

Interesting, does any lossless codec use a dictionary-based compression (i.e. like WinRAR or WinAce)?

This post has been edited by Efenstor: May 15 2004, 18:47
Go to the top of the page
+Quote Post
Garf
post May 15 2004, 18:49
Post #8


Server Admin


Group: Admin
Posts: 4886
Joined: 24-September 01
Member No.: 13



QUOTE (Efenstor @ May 15 2004, 07:38 PM)
I think that "bad quality" audio can be used as a point of reference very efficiently.

Why would it be?
Go to the top of the page
+Quote Post
Efenstor
post May 15 2004, 18:55
Post #9





Group: Members
Posts: 9
Joined: 25-August 03
Member No.: 8520



Because it reduces the range of difference and, if it's compressionacoustic, makes the difference more smooth and thus easily compressed.

Sure, noise is unpredictable, but is there so much loud white noise?

This post has been edited by Efenstor: May 15 2004, 19:00
Go to the top of the page
+Quote Post
Efenstor
post May 15 2004, 18:57
Post #10





Group: Members
Posts: 9
Joined: 25-August 03
Member No.: 8520



I dunno, maybe it was the stupid idea indeed. But noone tried. Maybe there something is yet.

This post has been edited by Efenstor: May 15 2004, 19:06
Go to the top of the page
+Quote Post
SebastianG
post May 16 2004, 15:22
Post #11





Group: Developer
Posts: 1318
Joined: 20-March 04
From: Göttingen (DE)
Member No.: 12875



QUOTE (Efenstor @ May 15 2004, 09:57 AM)
I dunno, maybe it was the stupid idea indeed. But noone tried. Maybe there something is yet.


Check out the LTAC thread

bye,
Sebi
Go to the top of the page
+Quote Post
Efenstor
post May 16 2004, 18:23
Post #12





Group: Members
Posts: 9
Joined: 25-August 03
Member No.: 8520



Mostly what I meditated on. Though clearly my thought was to research compressionacoustics. After all, all sound is a bunch of sines, it may be really possible to find out which frequenices give minimum differences when extracted, re-created and compared with the original wave. At least in the most cases.

This post has been edited by Efenstor: May 16 2004, 18:24
Go to the top of the page
+Quote Post
Ivan Dimkovic
post May 16 2004, 18:43
Post #13


Nero MPEG4 developer


Group: Developer
Posts: 1466
Joined: 22-September 01
Member No.: 8



There is a devil called "entropy of the source" - whatever you do with the source, it is mathematically impossible to loselessly code signal in a such way that it require less bits than it's own entropy. Discussion about this is more like inventing a perpetuum mobile or tri-secting the angle, better forget about that smile.gif

Hybrid loseless codec implemented perfectly will be a percent or few percents less efficient than plain losless due to signalling overhead.
Go to the top of the page
+Quote Post
Pinky's brain
post May 17 2004, 04:15
Post #14





Group: Members
Posts: 13
Joined: 7-January 04
Member No.: 11037



Besides, if you use an invertible transform why not simply losslessly code the coefficients?

AFAICS the best performing lossless audio coders use LMS IIR prediction, and indeed multistage compression is almost always inefficient.

See :
http://edocs.tu-berlin.de/diss/2003/kim_jonghwa.pdf (easy to read, nothing too complex here)
http://www.eurecom.fr/~mary/publications.html (uses multistage compression, but math is too dense for me to get my head around)

Lossless image coders are moving to using bayesian techniques using multiple predictors, using the history of prediction errors to determine the "weighting" of each predictor (the simplest way is to do straight weighting of the predictions or choose the prediction with the heighest weight and then use traditional entropy coding assuming a laplacian or GG distribution ... but in theory you could compute a compound probability distribution for the actual coefficient and use that for coding, instead of using differential coding).

If I (as a complete layman I should add) wanted to improve lossless audio coding I would go the same way. Use multiple LMS trained IIR predictors with different adaption rates, and maybe a library of pretrained predictors, and use the history of prediction errors of each to weight them.

This post has been edited by Pinky's brain: May 17 2004, 04:36
Go to the top of the page
+Quote Post
cabbagerat
post May 17 2004, 11:36
Post #15





Group: Members
Posts: 1018
Joined: 27-September 03
From: Cape Town
Member No.: 9042



I decided to test this out, so I wrote a shell script that:
  • Encodes a wave with LAME
  • Subtracts the resultant mp3 from the original
  • Flac compresses the resulting difference file
  • tar.bz2 compresses the two files (MP3 and FLAC difference) togther
This process generally gets within 20% larger than the FLAC of the original file. Some samples do REALLY badly (like BigYellow from the current 128bit test - the compressed version is bigger than the original) and I haven't yet found a sample that does better than straight FLAC with this process. Interestingly, compression rates vary wildly with the individual sample and LAME settings (some samples get best compression with LAME --preset insane, others with --preset 64). While this is hardly a scientifically interesting test, it was realtively interesting to write and test.

Ivan Dimkovic is right about information entropy. Lossless audio encoders are a very interesting case here - as they can losslessly compress a sound file into a smaller number of bits than it's information entropy predicts. While this sounds impossible, it's not as some of the information contained in the waveform is actually contained in the decoding algorithm. Thus lossless audio compressors will do very badly with the vast majority of waveforms, but very well with ones that represent sensical audio data.

A good place to start if you don't know anything about the subject is here:
http://en.wikipedia.org/wiki/Information_theory
http://en.wikipedia.org/wiki/Information_entropy
http://en.wikipedia.org/wiki/Kolmogorov_complexity


--------------------
Simulate your radar: http://www.brooker.co.za/fers/
Go to the top of the page
+Quote Post
robUx4
post May 17 2004, 11:58
Post #16


Matroska Developer


Group: Developer (Donating)
Posts: 410
Joined: 14-March 02
From: Paris
Member No.: 1519



Mmm, I don't think that you can compress below the entropy. Or I don't know how you compute it. But given the nature of sound, you can predict that the next sample will be somehow like the previous one. That's this difference that needs to be encoded... Of course with pure electronic music things are getting harder since the signal/sound doesn't necessarily have this "sound" nature...

The problem of this approach (compress the difference) is that it's like encoding pure noise. All the "sound" nature of the signal is removed and nothing remains but noise... But when the lossy codecs get closer and closer to the source with a small bitrate, this noise can be compressed to less bits (imagine if the difference is never bigger than 8 bits in a 16 bits source, you already get a 50% compression for free). So IMO this should be investigated a bit further. Even with MP3 if you chose an MP3 at 320kbps and at 64 kbps, you have more bitrate for the lossy codec. That'sd why I think that HE-AAC could be a good candidate in this case. The result is not so far from the original, compared to the bitrate.

Anyone willing to do a test with MP3 and/or Vorbis + FLAC and/or ZIP at different bitrates ?

Or is there any command-line tool to compute the difference of 2 WAV files ? (maybe I should code one)...

This post has been edited by robUx4: May 17 2004, 12:01


--------------------
http://www.matroska.org/ : the best vapourware / http://robux4.blogspot.com/
Go to the top of the page
+Quote Post
NumLOCK
post May 17 2004, 12:22
Post #17


Neutrino G-RSA developer


Group: Developer
Posts: 852
Joined: 8-May 02
From: Geneva
Member No.: 2002



QUOTE (cabbagerat @ May 17 2004, 11:36 AM)
Ivan Dimkovic is right about information entropy.

Indeed. smile.gif

QUOTE
Lossless audio encoders are a very interesting case here - as they can losslessly compress a sound file into a smaller number of bits than it's information entropy predicts.


Wrong ! The reason why you can usually compress a sound file, is because PCM coding needs more bits than what the actual entropy of the data would require.

Therefore when you code the data in a more clever way (ie: remove redundant data), you will approach that theoretical limit (which is a hard limit, determined by the real entropy value).

QUOTE
While this sounds impossible, it's not as some of the information contained in the waveform is actually contained in the decoding algorithm.

Wrong.. this is a misconception.
Even with a 10GB algorithm which compresses a 1MB sound file, the worst-case compression (with a clever algorithm) would be 1MB + 1 bit, and the best-case compression (with the same algorithm) *could* be 1 bit.

A 1-bit compressed file contains just the information: "YES, it is the exact data which is known to the algorithm". Unfortunately it can only be done with one set of input data. And it will expand all other possibilities by 1 bit (1st bit, which will say "NO, this it something else").

When you take more space for the algorithm you can make it more clever, but still, all data-dependant information will be in the compressed file. Otherwise you cannot unpack the files.

QUOTE
Thus lossless audio compressors will do very badly with the vast majority of waveforms, but very well with ones that represent sensical audio data.

Yes, because:
- sensical audio data is redundant (ie: contains correlation)
- and: the algorithm is made to take advantage these correlations.

Unfortunately: you cannot make a compression algorithm which is good on any (ie: random) data, so you will expand random data by at least 1 bit, no matter how complex and clever the algorithm is dry.gif

Edit: By the way, if you could make an algorithm which reduces any data by just 1 bit, you could also use it several times, and therefore compress anything down to zero bits laugh.gif

This post has been edited by NumLOCK: May 17 2004, 12:24


--------------------
Try Leeloo Chat at http://leeloo.webhop.net
Go to the top of the page
+Quote Post
cabbagerat
post May 17 2004, 18:34
Post #18





Group: Members
Posts: 1018
Joined: 27-September 03
From: Cape Town
Member No.: 9042



QUOTE
By the way, if you could make an algorithm which reduces any data by just 1 bit, you could also use it several times, and therefore compress anything down to zero bits

Which would be very cool. I am really looking forward to the new LAME --preset onebit. smile.gif
QUOTE
Wrong ! The reason why you can usually compress a sound file, is because PCM coding needs more bits than what the actual entropy of the data would require.

I didn't know that and find it very interesting. Do you have any references? I am not doubting you, I would just like to know more.
QUOTE
A 1-bit compressed file contains just the information: "YES, it is the exact data which is known to the algorithm". Unfortunately it can only be done with one set of input data. And it will expand all other possibilities by 1 bit (1st bit, which will say "NO, this it something else").
When you take more space for the algorithm you can make it more clever, but still, all data-dependant information will be in the compressed file. Otherwise you cannot unpack the files.

I agree completely. However, your 1 bit files contains, obviously, precisely one bit of information (in the Shannon sense). However the original contained more - which shows that the compressor decreased the information entropy of the original file. This information can't, by definition, dissapear - it's stored in the algorithm itself. There is as much (or more) information in (Algorithm+Compressed File) as in (Original File). If there wasn't then there couldn't be a 1->1 mapping between compressed files and uncompressed files.

Suppose I copy all my MP3s (and OGGs, MPCs, etc) into a big database indexed by their MD5 sum. Then I replace all my MP3 files with 128 bit binary files containing this sum. I have "compressed" my MP3 collection to a couple of hundred kilobytes. However that doesn't mean that they would be any good without the database (part of the algorithm), which runs to tens of gigabytes.
QUOTE
The result is not so far from the original, compared to the bitrate.

The result isn't very far, percuptually, from the original - but that doesn't mean that there isn't still a huge amount of data in the difference file. Some of the frequencies in the original will not reduce substantially in amplitude if the encoded file were subtracted from the original.


--------------------
Simulate your radar: http://www.brooker.co.za/fers/
Go to the top of the page
+Quote Post
robUx4
post May 17 2004, 20:20
Post #19


Matroska Developer


Group: Developer (Donating)
Posts: 410
Joined: 14-March 02
From: Paris
Member No.: 1519



QUOTE (cabbagerat @ May 17 2004, 06:34 PM)
QUOTE
Wrong ! The reason why you can usually compress a sound file, is because PCM coding needs more bits than what the actual entropy of the data would require.

I didn't know that and find it very interesting. Do you have any references? I am not doubting you, I would just like to know more.

The entropy of digital silence is close to nothing (1 bit ?). And it can be encoded at 24 bits 192kHz.


--------------------
http://www.matroska.org/ : the best vapourware / http://robux4.blogspot.com/
Go to the top of the page
+Quote Post
robUx4
post May 17 2004, 20:22
Post #20


Matroska Developer


Group: Developer (Donating)
Posts: 410
Joined: 14-March 02
From: Paris
Member No.: 1519



QUOTE (cabbagerat @ May 17 2004, 06:34 PM)
QUOTE
The result is not so far from the original, compared to the bitrate.

The result isn't very far, percuptually, from the original - but that doesn't mean that there isn't still a huge amount of data in the difference file. Some of the frequencies in the original will not reduce substantially in amplitude if the encoded file were subtracted from the original.

I know but IMO it would be nice to make a few tests to ensure this is a complete wrong way.


--------------------
http://www.matroska.org/ : the best vapourware / http://robux4.blogspot.com/
Go to the top of the page
+Quote Post
cabbagerat
post May 17 2004, 20:43
Post #21





Group: Members
Posts: 1018
Joined: 27-September 03
From: Cape Town
Member No.: 9042



QUOTE
I know but IMO it would be nice to make a few tests to ensure this is a complete wrong way.

Fair enough. I have done some fiddling myself, but with exams coming up I don't have time to do any conclusive experiments. It would be cool if somebody did them though.

If anybody is interested, they can get my script and sources from here: http://users.smuts.uct.ac.za/~mbrooker/lossyless.tar.gz
Nothing that most here couldn't put together in half an hour, though.


--------------------
Simulate your radar: http://www.brooker.co.za/fers/
Go to the top of the page
+Quote Post
robUx4
post May 17 2004, 21:21
Post #22


Matroska Developer


Group: Developer (Donating)
Posts: 410
Joined: 14-March 02
From: Paris
Member No.: 1519



How do you make the WAV substraction ? That's all I'm missing (I've started coding such a utility called pcm-)...

edit: I see you do a basic difference on raw data. I was thinking about doing it with WAV (with my own WAV classes). But it LAME, OGG and FLAC can handle raw audio, that's fine.

This post has been edited by robUx4: May 17 2004, 21:24


--------------------
http://www.matroska.org/ : the best vapourware / http://robux4.blogspot.com/
Go to the top of the page
+Quote Post
Pamel
post May 17 2004, 22:07
Post #23





Group: Members
Posts: 147
Joined: 15-June 03
Member No.: 7199



QUOTE (cabbagerat @ May 17 2004, 12:34 PM)
I agree completely. However, your 1 bit files contains, obviously, precisely one bit of information (in the Shannon sense). However the original contained more - which shows that the compressor decreased the information entropy of the original file. This information can't, by definition, dissapear - it's stored in the algorithm itself. There is as much (or more) information in (Algorithm+Compressed File) as in (Original File). If there wasn't then there couldn't be a 1->1 mapping between compressed files and uncompressed files.

I'm happy to see that someone finally brought up the (Algorithm+Compressed File) problem. There is a theory of compression that goes something like this:

QUOTE
It is impossible to make a compression algorithm that can compress random data where the resulting compressed data plus the size of the algorithm is smaller than the original data.
Given this, how is anything compressed losslessly? Well, most data isn't very random. Images and sounds have distinct patterns that allow them to be compressed by programs that depend on those patterns. If you try to compress data that does not contain those patterns, your resulting file plus the size of the algorithm will always be larger than the original file.

Now, how does that apply to the posters question? Well, if you compress data (audio data in this case) in a lossy manner, then take the difference of the lossy data from the original, the resulting file will be lacking those patterns that are needed for good compression. In fact, the data contained in the diff will be nearly random. So, bad compression.

More information about this is available from the Comp.Compression FAQ, or you can ask directly at the Comp.Compression usenet group.

IIRC, there was a prize being offered at one time to anyone that could break this 'Law of Compression'.
Go to the top of the page
+Quote Post
Pamel
post May 17 2004, 22:41
Post #24





Group: Members
Posts: 147
Joined: 15-June 03
Member No.: 7199



Found it. Here is a Wikipedia reference to the challenge. And here is the quote from the FAQ:
QUOTE
Steve Tate <srt@cs.unt.edu> suggests a good challenge for programs
that are claimed to compress any data by a significant amount:

    Here's a wager for you: First, send me the DEcompression algorithm.  Then I
    will send you a file of whatever size you want, but at least 100k.  If you
    can send me back a compressed version that is even 20% shorter (80k if the
    input is 100k) I'll send you $100.  Of course, the file must be able to be
    decompressed with the program you previously sent me, and must match
    exactly my original file.  Now what are you going to provide
    when... er... if you can't demonstrate your compression in such a way?

So far no one has accepted this challenge (for good reasons).

Mike Goldman <whig@by.net> makes another offer:

    I will attach a prize of $5,000 to anyone who successfully meets this
    challenge.  First, the contestant will tell me HOW LONG of a data file to
    generate.  Second, I will generate the data file, and send it to the
    contestant.  Last, the contestant will send me a decompressor and a
    compressed file, which will together total in size less than the original
    data file, and which will be able to restore the compressed file to the
    original state.

    With this offer, you can tune your algorithm to my data.  You tell me the
    parameters of size in advance.  All I get to do is arrange the bits within
    my file according to the dictates of my whim.  As a processing fee, I will
    require an advance deposit of $100 from any contestant.  This deposit is
    100% refundable if you meet the challenge.


Now, I am not saying that what Efenstor says is impossible, just not with the tools available. You would need to use an algorithm that produced a file that was very close to the original. This means that any codecs that use any type of "psychoacoustic-based approach" would need to be skipped since the data removed by this approach would create more "random" data for the diff. With an accurate enough file, you could create a diff that was mostly zeros. Then to compress that, you wouldn't want to use a lossless audio codec as those are tuned to audio needs. You would need to use an algorithm tuned to the data produced by these diffs that was mostly zeros.

This MIGHT be an efficient way to do it, but testing would be needed to see for sure, and that is a lot of work to really do it right.
Go to the top of the page
+Quote Post
NumLOCK
post May 17 2004, 22:43
Post #25


Neutrino G-RSA developer


Group: Developer
Posts: 852
Joined: 8-May 02
From: Geneva
Member No.: 2002



QUOTE (cabbagerat @ May 17 2004, 06:34 PM)
I agree completely. However, your 1 bit files contains, obviously, precisely one bit of information (in the Shannon sense). However the original contained more - which shows that the compressor decreased the information entropy of the original file. This information can't, by definition, dissapear - it's stored in the algorithm itself. There is as much (or more) information in (Algorithm+Compressed File) as in (Original File). If there wasn't then there couldn't be a 1->1 mapping between compressed files and uncompressed files.

Suppose I copy all my MP3s (and OGGs, MPCs, etc) into a big database indexed by their MD5 sum. Then I replace all my MP3 files with 128 bit binary files containing this sum. I have "compressed" my MP3 collection to a couple of hundred kilobytes. However that doesn't mean that they would be any good without the database (part of the algorithm), which runs to tens of gigabytes.

You're right, when we consider specific files (ie: 1000 predefined MP3's), the size of the algorithm should be taken into account.

In the more general sense, ie. when you need to compress any given MP3, you can make a database of all possible MP3's and then, during compression, just store the index to the received file.

Unfortunately if you really have all possible MP3's in the database, then the index size will be equal to the file size -- thus a MD5sum will not be enough to differentiate them wink.gif

About the entropy: In fact the entropy of a file whose contents can be completely predicted, is zero.

Note: A possibility to compress a MP3 further, would be to build a table of all possible MP3's of a given length (which should produce a fairly big table whistling.gif ). Ok, now there's a 1:1 relationship between indices and elements. Then, eliminate all awful-sounding possibilities from the table. Now you gain a few bits on the indices, which means further lossless compression of MP3's cool.gif

This would mean, obviously, exponentially large memory and processing requirements. In fact, it would already become infeasible with a 16-byte MP3 -- which doesn't even cover the full header biggrin.gif

This post has been edited by NumLOCK: May 17 2004, 22:48


--------------------
Try Leeloo Chat at http://leeloo.webhop.net
Go to the top of the page
+Quote Post

2 Pages V   1 2 >
Reply to this topicStart new topic
1 User(s) are reading this topic (1 Guests and 0 Anonymous Users)
0 Members:

 



RSS Lo-Fi Version Time is now: 31st October 2014 - 22:42