Lossy-based lossless coding, Stupid idea
SirGrey
post May 17 2004, 22:52
Post #26





Group: Members
Posts: 241
Joined: 8-February 04
Member No.: 11863



QUOTE
<Wrong ! The reason why you can usually compress a sound file, is because PCM coding needs more bits than what the actual entropy of the data would require. >
I didn't know that and find it very interesting. Do you have any references? I am not doubting you, I would just like to know more.

Read about Huffman codes (or coding).
The idea that *real* data takes up more space than it needs to, because its values are correlated, is very interesting to understand :D
The idea is that in any *real* alphabet used to represent data, some elements occur more frequently than others. For example, the symbol "A" occurs more often than most other letters in English text, so spending 8 bits on every "A" wastes bits. A frequent symbol should get a short code of only a few bits, while rare symbols get longer ones.
And the same applies to 16Bit audio sample representation and so on...
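To make the Huffman idea concrete, here is a minimal sketch (not taken from any codec; the function names are made up for illustration). It builds Huffman code lengths from symbol counts and compares the average code length with the fixed 8 bits per character:

```python
import heapq
from collections import Counter

def huffman_code_lengths(data):
    """Return {symbol: code length in bits} for a Huffman code built from data."""
    freq = Counter(data)
    if len(freq) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(freq)): 1}
    # Heap entries: (count, unique tie-breaker, symbols in this subtree).
    heap = [(count, i, [sym]) for i, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    lengths = dict.fromkeys(freq, 0)
    tie = len(heap)
    while len(heap) > 1:
        c1, _, s1 = heapq.heappop(heap)
        c2, _, s2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        for sym in s1 + s2:
            lengths[sym] += 1
        heapq.heappush(heap, (c1 + c2, tie, s1 + s2))
        tie += 1
    return lengths

def average_bits(data):
    """Average number of bits per symbol when data is Huffman coded."""
    freq = Counter(data)
    lengths = huffman_code_lengths(data)
    return sum(freq[s] * lengths[s] for s in freq) / len(data)

if __name__ == "__main__":
    text = "it is a waste of bits to code every letter of an english text with 8 bits"
    print("fixed code: 8 bits/symbol, Huffman: %.2f bits/symbol" % average_bits(text))
```

Note that this only exploits how often each symbol occurs. Correlation between neighbouring values (which is what matters most for audio) needs a predictor in front of the entropy coder.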
cabbagerat
post May 18 2004, 08:42
Post #27





Group: Members
Posts: 1018
Joined: 27-September 03
From: Cape Town
Member No.: 9042



QUOTE
QUOTE
It is impossible to make a compression algorithm that can compress random data where the resulting compressed data plus the size of the algorithm is smaller than the original data.

Given this, how is anything compressed losslessly? Well, most data isn't very random. Images and sounds have distinct patterns that allow them to be compressed by programs that depend on those patterns. If you try to compress data that does not contain those patterns, your resulting file plus the size of the algorithm will always be larger than the original file.

Maybe I should rephrase that. How about: "It is impossible to make a compression algorithm that can compress data such that the information entropy contained in the algorithm plus the compressed file is less than the information entropy contained in the original."
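A quick way to see the "random data does not compress" point in practice is to feed a general-purpose compressor both noise and strongly patterned data. This is only an illustration using Python's zlib module with made-up test data, not a proof of the statement above:

```python
import random
import zlib

def compressed_ratio(data):
    """Compressed size divided by original size (values near 1.0 mean no gain)."""
    return len(zlib.compress(data, 9)) / len(data)

# Seeded pseudo-random bytes stand in for "random data" (an assumption:
# they are not truly random, but zlib cannot find any structure in them).
rng = random.Random(0)
noise = bytes(rng.getrandbits(8) for _ in range(100_000))

# A short repeating pattern stands in for highly redundant data.
patterned = bytes(i % 16 for i in range(100_000))

if __name__ == "__main__":
    print("noise:     %.4f" % compressed_ratio(noise))
    print("patterned: %.4f" % compressed_ratio(patterned))
```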
QUOTE
Then, eliminate all awful-sounding possibilities from the table.

Can't do that - it would eliminate half of all current releases ;)
QUOTE
Read about Huffman codes (or coding).
The idea that any *real* data use more space because of correlation of values is very interesting to understand

Fair enough - I think I misunderstood what numLOCK was saying. What I heard was:
"The PCM representation of an arbitrary waveform is not the most compact representation that preserves all the original data", or "All waveforms can be expressed more efficiently in a way other than PCM".
QUOTE
How do you do the WAV subtraction? That's all I'm missing (I've started coding such a utility called pcm-)...

edit: I see you do a basic difference on raw data. I was thinking about doing it with WAV (with my own WAV classes). But if LAME, OGG and FLAC can handle raw audio, that's fine.

I use SoX to convert to RAW, subtract them, then use SoX to convert back to wav. Mostly this is because I was too lazy to write a proper program that understands WAV. Using libsndfile this should be trivial, however.
The relevant command lines (for 16-bit, 44100 Hz stereo), first WAV to raw, then raw back to WAV:
sox in.wav -r 44100 -c 2 -w -s out.raw
sox -r 44100 -c 2 -w -s in.raw out.wav
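The subtraction step itself is not shown above. A minimal sketch of how it could be done on two raw files is given below; it assumes little-endian signed 16-bit samples, and the function names are hypothetical. The difference is wrapped to 16 bits so that it still fits in a 16-bit sample and adding it back restores the original exactly:

```python
import struct

def _unpack16(raw):
    """Decode little-endian signed 16-bit samples (assumed raw layout)."""
    if len(raw) % 2:
        raise ValueError("raw data must contain whole 16-bit samples")
    return [s[0] for s in struct.iter_unpack("<h", raw)]

def _pack16(samples):
    return struct.pack("<%dh" % len(samples), *samples)

def _wrap16(v):
    """Wrap an integer into the signed 16-bit range (two's complement)."""
    return (v + 32768) % 65536 - 32768

def subtract_pcm(original, decoded):
    """Residual = original - decoded, sample by sample, wrapped to 16 bits."""
    a, b = _unpack16(original), _unpack16(decoded)
    if len(a) != len(b):
        raise ValueError("streams differ in length (encoder delay or padding?)")
    return _pack16([_wrap16(x - y) for x, y in zip(a, b)])

def add_pcm(decoded, residual):
    """Inverse of subtract_pcm: decoded + residual gives back the original."""
    a, b = _unpack16(decoded), _unpack16(residual)
    if len(a) != len(b):
        raise ValueError("streams differ in length")
    return _pack16([_wrap16(x + y) for x, y in zip(a, b)])
```

The length check matters in practice: if the lossy decoder adds or drops samples, the two streams no longer line up and the residual becomes useless.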


--------------------
Simulate your radar: http://www.brooker.co.za/fers/
robUx4
post May 18 2004, 12:06
Post #28


Matroska Developer


Group: Developer (Donating)
Posts: 410
Joined: 14-March 02
From: Paris
Member No.: 1519



I ran some tests here (without an audio editor to check some things). Apparently the Vorbis@128+FLAC combination can give good results on some files (and poor results on others). Vorbis@64+FLAC gave worse results than the 128 one.

Bzip2 can't compress any of the residual noise I've produced with different combinations.

MP3 has this encoder delay problem that seems to make sample accurate files impossible. (I also had problems with the --nogap option).

I may try the same thing with AAC using FAAC/FAAD. Hopefully there is no such encoder delay problem...

But at least I found one file in which I gained 14% compared to the same file compressed with FLAC. So this could work practically with more tuning.


--------------------
http://www.matroska.org/ : the best vapourware / http://robux4.blogspot.com/
robUx4
post May 18 2004, 13:53
Post #29


Matroska Developer


Group: Developer (Donating)
Posts: 410
Joined: 14-March 02
From: Paris
Member No.: 1519



OK, it seems to be useless with AAC also. So I'll just drop the idea from now on...


SebastianG
post May 18 2004, 16:09
Post #30





Group: Developer
Posts: 1318
Joined: 20-March 04
From: Göttingen (DE)
Member No.: 12875



QUOTE (robUx4 @ May 18 2004, 03:06 AM)
Bzip2 can't compress any of the residual noise I've produced with different combinations.


Bzip2 makes use of the Burrows-Wheeler transform. This is a kind of permutation which usually (i.e. for text) leads to long runs of identical symbols, because there is a strong inter-symbol relationship. Since the residual is a very noisy digital signal, this transform is more or less useless here.

Most - if not all - general purpose compressors don't perform very well in this case.
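To illustrate what the transform does, here is a naive sketch (sorting all rotations, which is far from how bzip2 actually implements it). The helper count_runs is made up for this example and simply counts runs of identical symbols:

```python
def bwt(text):
    """Naive Burrows-Wheeler transform: sort all rotations, keep the last column.

    text should end with a unique sentinel character such as "$".
    """
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(rotation[-1] for rotation in rotations)

def count_runs(s):
    """Number of maximal runs of identical symbols in s."""
    return sum(1 for i, ch in enumerate(s) if i == 0 or ch != s[i - 1])

if __name__ == "__main__":
    word = "banana$"
    print(bwt(word), count_runs(word), count_runs(bwt(word)))
```

On text-like input the transformed string has fewer, longer runs than the input. On a noise-like residual there is no repeated context for the sort to group, so the run structure does not improve.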

I think the best approach would be to use Vorbis at -q6 and to code the difference via LPC + Rice coding. The cool thing about Vorbis in this case is that the already-coded floor curve tells you something about the time/frequency energy distribution within the difference signal. This information could be used to calculate an LPC whitening filter and a good guess for the optimal Rice coding parameter k. So the usual FLAC overhead (LPC filter and Rice coding parameters) can be minimized.
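For readers who have not met Rice coding, a small sketch follows. It writes bits as a string of "0"/"1" characters for readability, and estimate_k is only a rough heuristic based on the mean magnitude (my assumption, not how FLAC or any Vorbis-derived scheme would pick k):

```python
def zigzag(v):
    """Map signed to unsigned: 0, -1, 1, -2, 2, ... -> 0, 1, 2, 3, 4, ..."""
    return 2 * v if v >= 0 else -2 * v - 1

def unzigzag(u):
    return u // 2 if u % 2 == 0 else -(u + 1) // 2

def rice_encode(values, k):
    """Rice code: quotient in unary (ones, then a zero), remainder in k bits."""
    out = []
    for v in values:
        u = zigzag(v)
        out.append("1" * (u >> k) + "0")
        if k:
            out.append(format(u & ((1 << k) - 1), "0{}b".format(k)))
    return "".join(out)

def rice_decode(bits, k, count):
    values, pos = [], 0
    for _ in range(count):
        q = 0
        while bits[pos] == "1":
            q += 1
            pos += 1
        pos += 1  # skip the terminating zero of the unary part
        r = int(bits[pos:pos + k], 2) if k else 0
        pos += k
        values.append(unzigzag((q << k) | r))
    return values

def estimate_k(residual):
    """Rough guess for k from the mean mapped magnitude (heuristic assumption)."""
    mean = sum(zigzag(v) for v in residual) / max(1, len(residual))
    return max(0, int(mean).bit_length() - 1)
```

The point made above is that k normally has to be stored per block. If the decoder can derive the same k from data it already has (here, the Vorbis floor curve), that side information does not need to be transmitted.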

The bad thing is that all this has to be done in a 100% deterministic way in order to be lossless.
If you don't want to rely on one particular floating-point implementation (which is certainly a good idea!) then you have to do everything using integer arithmetic (a tough job, that is!).

Sure, it's possible somehow, but IMHO not worth the effort.

QUOTE (robUx4 @ May 18 2004, 03:06 AM)
But at least I found one file in which I gained 14% compared to the same file compressed with FLAC. So this could work practically with more tuning.


That is a surprise. I suppose this is a very rare case.

bye,
Sebastian

This post has been edited by SebastianG: May 18 2004, 16:13
mcbevin
post May 18 2004, 17:17
Post #31


La developer


Group: Developer
Posts: 47
Joined: 20-February 02
Member No.: 1364



QUOTE
I made some test here (without an audio editor to check some things).


Firstly, in my experience it's _very_ important to make completely sure that whatever you're doing is reversible bit for bit. Otherwise you can be 99% sure, whenever you find yourself with some big improvement in compression, that you've just made a mistake somewhere.


QUOTE
But at least I found one file in which I gained 14% compared to the same file compressed with FLAC. So this could work practically with more tuning.


That's _very_ interesting, even if it's not a common case. But first, it would be very good to have a test setup in which you do the compression, then the decompression, and then compare the two files bit for bit, to be sure you're doing something that can work.
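The comparison step of such a test setup can be very small. A sketch using Python's filecmp module is shown below; the function name and the idea of comparing the decoded raw audio (rather than WAV files with possibly different headers) are my assumptions:

```python
import filecmp

def is_bit_identical(path_a, path_b):
    """True only if both files have exactly the same bytes.

    shallow=False forces a byte-by-byte comparison instead of trusting
    file size and modification time.
    """
    return filecmp.cmp(path_a, path_b, shallow=False)
```

Anything other than True here means the scheme is not lossless, no matter how good the compressed size looks.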

I take it that when you say this file gained 14%, you mean the total size of the .ogg(?) file plus the FLAC-compressed residual(?) was 14% smaller than the result of FLAC compressing the original WAV? If so, it would definitely be worth looking into further, especially if the music file used wasn't too unusual. You could also try some of the better-compressing lossless audio codecs to see if they can be improved as well.
cabbagerat
post May 18 2004, 18:17
Post #32





Group: Members
Posts: 1018
Joined: 27-September 03
From: Cape Town
Member No.: 9042



My own tests reveal that the OGG+FLAC combination can get very close to, or in rare cases a few percent smaller than, plain FLAC on a minority of samples. On about 80% of samples it is between 10 and 50% larger.
QUOTE
Bzip2 makes use of the Burrows-Wheeler transform. This is a kind of permutation which usually (i.e. for text) leads to long runs of identical symbols, because there is a strong inter-symbol relationship. Since the residual is a very noisy digital signal, this transform is more or less useless here.

That's interesting. Most of the reason I used bzip2 is that I type "tar cjf" almost by reflex, rather than anything else.
QUOTE
Sure, it's possible somehow, but IMHO not worth the effort.

I would tend to agree. Even if it does manage to do a few percent better than FLAC (which looks doubtful), it is much, much slower, and implementing seeking and the rest would be a real pain. It was an interesting idea though.


robUx4
post May 18 2004, 21:02
Post #33


Matroska Developer


Group: Developer (Donating)
Posts: 410
Joined: 14-March 02
From: Paris
Member No.: 1519



I agree on all this.

If you want to test the 14% one yourself, it's LFO "Blown", which can be found on my website. And yes, it gives 14% better compression than FLAC alone. But the other files I tried came out much bigger. So IMO it's not really worth spending more time on this.


mcbevin
post May 19 2004, 14:00
Post #34


La developer


Group: Developer
Posts: 47
Joined: 20-February 02
Member No.: 1364



Well, the importance of this is not that one should try to create a lossless coder by slapping ogg and flac together. However if, even on only a few songs, ogg+flac beats flac alone, especially by a margin as large as 14%, then that shows there is potentially a huge amount of room for improvement.

However, it's already known that FLAC has room for improvement - just look at comparisons with the other encoders. If you could repeat the tests with La or OptimFROG (I am admittedly the La developer, but the reason I would like to see the results with La is that La also generally has the best compression) and there was an improvement in some cases, then that would be important. The next step would then _not_ be to slap ogg+la/ofr together, but rather to determine what it was that ogg was doing that was improving things, to code something doing something similar but modified to suit lossless compression, modify the lossless compressor to be more suited to the new signal, put the two together in the filter pipeline, play around a bit, and then see if that couldn't give significantly better results.

As you can imagine, that is a difficult and lengthy process, which is why it would only be worth undertaking if some more definitive results were available first.

Now, I would take this 'Blown' file and construct a test with it, except that the file on your website is an MP3, and converting MP3->WAV and then doing the test would be rather meaningless (I trust this isn't how you performed it?).
robUx4
post May 19 2004, 15:53
Post #35


Matroska Developer


Group: Developer (Donating)
Posts: 410
Joined: 14-March 02
From: Paris
Member No.: 1519



QUOTE (mcbevin @ May 19 2004, 02:00 PM)
Now I would take this 'blown' file and construct a test with it except the file on your website is an mp3, and converting mp3->wav and then doing the test would be rather meaningless (i trust this isn't how you've performed it?).

This is how I did it. The decoded file is still audio, even if it no longer has all the entropy of the original, and I don't think the difference matters much in this case (what if you want to encode a poor recording with this codec?). Otherwise you can buy the CD "Sheath", which is on Warp.

The only interesting point of this kind of hybrid codec would be to have both a lossy part and a lossless backup. That means when you want to put the audio file on a portable device or stream it, you don't have to re-encode it; you just use the lossy part, which has rather good quality. That would be convenient in the future (large hard disks, and portable devices with little CPU power).


mcbevin
post May 19 2004, 17:52
Post #36


La developer


Group: Developer
Posts: 47
Joined: 20-February 02
Member No.: 1364



QUOTE
This is how I did it. Actually the converted file is audio anyway, not even with the maximum entropy it originally had, but the difference is not so important in this case (what if you want to encode a poor recording with this codec ?). Otherwise you can buy the CD "Sheath" which is on Warp.


Deary me. It almost goes without saying that if you lossy encode something, decode it, lossy encode it again and decode it again, you might be able to compress the final result better with a 'lossy + lossless on the residual' approach than with a pure lossless approach.

I.e., it's quite plausible that some lossy codecs, though I don't know which if any, have a property whereby (taking ogg as an example) after performing the conversions wav->ogg1->wav1->ogg2->wav2, ogg1==ogg2 bit-identically and thus wav1==wav2. The residual would then be a bunch of zeroes, which would of course be easy to compress, and the combination of the ogg plus the close to 0-byte residual would be more efficient than losslessly compressing wav1. Or, if this is not the case, it's at least very possible that wav1 and wav2 are much more similar than wav and wav1, even if you use different lossy codecs for the two stages.
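The effect can be shown with a toy "codec" that simply quantizes samples to a coarse grid. This is purely an illustration with made-up numbers; nothing here claims that Vorbis or any real codec is idempotent in this way:

```python
def toy_lossy(samples, step=64):
    """Toy lossy 'codec': round every sample to the nearest multiple of step."""
    return [step * ((s + step // 2) // step) for s in samples]

def residual(reference, approximation):
    return [r - a for r, a in zip(reference, approximation)]

wav = [1000, -523, 77, 12345, -20000, 31]
wav1 = toy_lossy(wav)    # first lossy generation
wav2 = toy_lossy(wav1)   # second generation, encoded from the decoded audio

first_residual = residual(wav, wav1)     # what a hybrid coder must store for wav
second_residual = residual(wav1, wav2)   # what it must store for wav1

if __name__ == "__main__":
    print(first_residual)
    print(second_residual)
```

Because the second pass finds its input already on the quantization grid, the second residual is all zeroes. A test that starts from previously lossy-coded audio can therefore flatter the hybrid approach.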


QUOTE
The only interresting point in this kind of hybrid codec would be to have both a lossy part and a lossless backup. That means when you want to put the audio file in a portable device or stream it, you don't have to reencode it but just use the lossy part with a rather good quality. That would be convenient in the future (large HD and portable devices with few CPU power).


That could be interesting except that:
1. two lossless compressors already do this quite well.
2. using a lossy compressed file as the base and then encoding the residual is generally a horrible basis for lossless compression if you're looking for good compression.

From my perspective the interesting thing would be if some _techniques_ from lossy compression could be incorporated into a lossless codec. In general I'm dubious of the idea as the needs of lossy and lossless compression are so different, but I try to keep an open mind to all possibilities.

This post has been edited by mcbevin: May 19 2004, 17:55
SebastianG
post May 19 2004, 18:00
Post #37





Group: Developer
Posts: 1318
Joined: 20-March 04
From: Göttingen (DE)
Member No.: 12875



QUOTE (mcbevin @ May 19 2004, 05:00 AM)
The next step would then not be to slap ogg+la/ofr together, but rather to determine what it was that ogg was doing that was improving things, to code something doing something similar but modified to suit lossless compression, modify the lossless compressor to be more suited to the new signal, put the two together in the filter pipeline, play around a bit, and then see if that couldn't give significantly better results.

I guess frequency-adaptive channel decorrelation, which Vorbis does using adaptive vector codebooks, is one thing that helps reduce the file size of ogg+FLAC in comparison to pure FLAC.
FLAC's channel decorrelation is rather poor compared to Vorbis'.

Here's an advanced idea of how one could try to decorrelate a stereo signal.

1) choose a channel CH1 out of {L,R} to be coded first, with CH2 being the other channel, which comes second.
2) code channel CH1 "the usual way" (ie LPC-filter + residual)
3) calculate a "good decorrelation-filter impulse response"
4) code this filter somehow
5) calculate the decorrelation residual by CH2' = CH2 - filter(CH1)
6) code channel CH2' the usual way
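Steps 5 and 6 only work losslessly if the decoder can compute exactly the same filter(CH1) as the encoder. A minimal sketch of that part, using integer fixed-point taps so the result does not depend on floating-point behaviour, could look like this (the tap values, SHIFT and function names are made up for illustration; how the filter is designed and coded, steps 3 and 4, is left out):

```python
SHIFT = 8  # taps are fixed-point integers with 8 fractional bits (assumption)

def predict(ch1, taps, n):
    """Integer FIR prediction of CH2[n] from current and past CH1 samples."""
    acc = 0
    for i, tap in enumerate(taps):
        if n - i >= 0:
            acc += tap * ch1[n - i]
    return acc >> SHIFT  # floor division by 2**SHIFT, identical on every machine

def decorrelate(ch1, ch2, taps):
    """Encoder side: CH2' = CH2 - filter(CH1)."""
    return [ch2[n] - predict(ch1, taps, n) for n in range(len(ch2))]

def recorrelate(ch1, ch2_residual, taps):
    """Decoder side: CH2 = CH2' + filter(CH1), with CH1 already decoded."""
    return [ch2_residual[n] + predict(ch1, taps, n)
            for n in range(len(ch2_residual))]
```

Since CH1 is decoded first and the prediction uses only integers, the decoder reproduces the prediction exactly and CH2 comes back bit for bit.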

This decorrelation-filter should minimize the energy of [CH2 - filter(CH1)]
Its impulse response could be modeled as a weighted sum of bandpass filters. These weights could then be coded compactly by something like delta&huffman-coding.

To exploit phase correlations we could assign 2 bandpassfilters for the same frequency band with a phase shift of 90°. This way we would be able to predict CH2 well even if there are phase differences.

The actual weights could be calculated the following way:
1) calculate FFT on both channels
2) divide spectrum into smaller subbands
3) calculate subband energies and (complex) cross-correlation factors
4) calculate weights by weight[subband]:=crosscorr[subband]*energy[CH2][subband]/energy[CH1][subband]
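As a sanity check on step 4, here is a sketch of the per-subband least-squares weight, i.e. the complex w that minimizes the energy of CH2 - w*CH1 within each band. It uses a naive DFT to stay self-contained, and the band layout and names are my assumptions. Under this definition the weight is the (unnormalized) cross-spectrum divided by the CH1 energy; if crosscorr in step 4 is meant as a normalized correlation coefficient, that corresponds to crosscorr*sqrt(energy[CH2]/energy[CH1]) rather than the plain energy ratio:

```python
import cmath
import math

def dft(x):
    """Naive O(n^2) discrete Fourier transform (fine for a small demo)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def subband_weights(ch1, ch2, n_bands):
    """Complex weight per subband minimizing the energy of CH2 - w * CH1."""
    X1, X2 = dft(ch1), dft(ch2)
    half = len(X1) // 2 + 1  # bins for non-negative frequencies only
    edges = [round(i * half / n_bands) for i in range(n_bands + 1)]
    weights = []
    for lo, hi in zip(edges, edges[1:]):
        energy1 = sum(abs(X1[k]) ** 2 for k in range(lo, hi))
        cross = sum(X2[k] * X1[k].conjugate() for k in range(lo, hi))
        weights.append(cross / energy1 if energy1 > 0 else 0j)
    return weights
```

The real and imaginary parts of each weight play the role of the two bandpass filters 90° apart: the real part scales the in-phase component, and the imaginary part scales the phase-shifted one.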

Any comments / suggestions?

bye,
SebastianG
Efenstor
post May 19 2004, 20:16
Post #38





Group: Members
Posts: 9
Joined: 25-August 03
Member No.: 8520



As I said before, none of the existing codecs fits the need, especially those which extract the noise component and regenerate it when decoding (e.g. Vorbis).

I compressed a wave file using Musepack (which codec you pick doesn't really matter), decoded it, compared it with the original, generated the alpha-file and listened to it. As one would expect, it consisted exclusively of noise, the kind of noise the human ear supposedly cannot hear.

For lossy-based lossless coding the codec should behave the other way round: it should code pure sines VERY roughly but encode noise as finely as possible. It probably wouldn't be an MP3-like compression at all. It should split the wave into noise and sines and spend 9 parts out of 10 on the noise and 1 on the sines.

In other words, it requires much more in-depth research. It is impossible to prove or disprove the idea using existing psychoacoustic-based codecs.
robUx4
post May 19 2004, 22:58
Post #39


Matroska Developer


Group: Developer (Donating)
Posts: 410
Joined: 14-March 02
From: Paris
Member No.: 1519



Yes, as Pamel said, a lossy codec in this case would probably have no need for a psychoacoustic model.

And about the source, I'm sorry, but the decoded file is just music that has almost the same entropy as the original. So whether a codec will produce almost the same result again is not important. That's what you would expect from a codec anyway! And actually that would have been the case with MP3 but not Vorbis. And the opposite happens...


