7z beats other codecs on 24bit 48khz sample, Split from: "Which is the best lossless codec?"
Nystagmus
post Jan 2 2014, 23:01
Post #26





Group: Members
Posts: 27
Joined: 13-October 13
Member No.: 110926



I'm not going to claim which technique(s) I think are better, but it's worth mentioning that the type of music (its spectrum, the amount and size of repetition, and its timbre) profoundly affects how well audio compresses. But I do like 7zip as an archiver, and it's pretty darned cool that the Foobar2000 media player can play the contents of 7z archives (at least with the foo addon).

Life is pretty good these days for music and media files. I recently switched to BandiZip because it has 7zip support but can do a few things that 7zip can't. BandiZip almost out-7zipped 7-zip itself! But I still use 7zip because it can do a few things that BandiZip can't. Occasionally I try out PeaZip for the same reasons. But 7zip is pretty awesome for being able to open up MHTs, XPIs, XPSs, DMGs (macOS), and setup/installer programs of some types. I really love bypassing installers and just extracting the contents. Sometimes that really saves tons of headaches with annoying installers. At first I couldn't get 7zip to run on Windows 7, but after auto-elevating UAC I got it to work fine again. Luckily, 7zip is still being developed so it's all good. I don't use ZIP for compression anymore and RAR is still a bit proprietary-like in some ways, but I use what I just mentioned to open em.

For the record, I also enjoy using FLAC and WavPack. They each have some awesome advantages for certain situations. And I love how they are still supported and hardware support is growing through stuff like RockBox and Linux distros.

This post has been edited by Nystagmus: Jan 2 2014, 23:02
Porcus
post Jan 3 2014, 00:57
Post #27





Group: Members
Posts: 1842
Joined: 30-November 06
Member No.: 38207



Since the thread is already bumped:

QUOTE (Thundik81 @ Aug 18 2013, 10:54) *


Three "general" compressors beating TAK on size, and Stuffit being on par at encoding speed? I am impressed, even though they are not seekable.


(A couple of the comments made me think of a particular peak value which disproportionally many of my rips do have - namely, .999969, occurring on 13 percent (compare with about 30 for the value 1). Could the reason for this value be a known algorithm that could potentially be exploited?)


--------------------
One day in the Year of the Fox came a time remembered well
bryant
post Jan 3 2014, 03:13
Post #28


WavPack Developer


Group: Developer (Donating)
Posts: 1290
Joined: 3-January 02
From: San Francisco CA
Member No.: 900



QUOTE (Porcus @ Jan 2 2014, 15:57) *
(A couple of the comments made me think of a particular peak value which disproportionally many of my rips do have - namely, .999969, occurring on 13 percent (compare with about 30 for the value 1). Could the reason for this value be a known algorithm that could potentially be exploited?)

I don't think there's any possibility of exploitation there. The minimum and maximum allowable 16-bit PCM values are -32768 and +32767, so if you simply call 32768 full scale (1.0), then 32767 becomes 0.999969. I can imagine all kinds of normalization algorithms that would leave the sample values clipped to +/-32767 after converting from a higher resolution master. Unless all the even values are missing, that doesn't give you much to work with.
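
For reference, the arithmetic behind that peak value can be checked in a couple of lines of Python. This is only an illustration: the names are made up, and it assumes the ripping tool reports the peak as the largest sample divided by 32768.

```python
# Peak of a 16-bit track whose largest sample is +32767,
# assuming peaks are reported relative to 32768 = 1.0.
FULL_SCALE = 32768      # magnitude of the most negative 16-bit value
MAX_POSITIVE = 32767    # largest positive 16-bit value

peak = MAX_POSITIVE / FULL_SCALE
print(f"{peak:.6f}")    # 0.999969
```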

As for the "general" compressors doing better than TAK, my guess is that they have some audio specialized processing in there that is getting invoked. It's just not very likely that any general purpose algorithm will do well with stereo 16-bit PCM audio by accident (even with a huge dictionary). Interestingly, that's not the case for DSD audio; I recently was experimenting with compressing that and it took a few days before I could beat bzip2 (and it still beats me on some samples)!

Porcus
post Jan 3 2014, 09:19
Post #29








Ah, 32767 ... almost facepalming I didn't think of that.


QUOTE (bryant @ Jan 3 2014, 03:13) *
As for the "general" compressors doing better than TAK, my guess is that they have some audio specialized processing in there that is getting invoked.


Well certainly (like WinZip), but I am still surprised that they bother to take it to the extent that they beat everything faster than WavPack. People like yourself and Thomas have spent quite some effort on the audio part, and I wonder if any Stuffit customer would be disappointed if Stuffit simply grabbed some reasonable codec that is already out there under a permissive license ... (then OTOH, for marketing purposes, maybe they do not want anyone pointing out that they charge money for something that compresses .wavs to exactly the size of a .wv plus the file header difference? Of course the similarity to TAK in speed/compression performance could easily fuel other speculations ;-) ).



QUOTE (bryant @ Jan 3 2014, 03:13) *
Interestingly, that's not the case for DSD audio; I recently was experimenting with compressing that and it took a few days before I could beat bzip2 (and it still beats me on some samples)!

Starting from LPCM-optimized WavPack, I presume?

BTW, out of curiosity I took the "most FLACable" and the "least FLACable" CD rips in my collection and compared WavPack, FLAC and TAK on them: http://www.hydrogenaudio.org/forums/index....mp;#entry800823
My hunch is that you don't listen much to Piaf and Thomas doesn't listen much to Merzbow ;-)

This post has been edited by Porcus: Jan 3 2014, 09:20


ktf
post Jan 3 2014, 09:38
Post #30





Group: Members
Posts: 339
Joined: 22-March 09
From: The Netherlands
Member No.: 68263



I used XZ (whose compression algorithm is very similar to 7-zip's) in the first revision of my lossless audio codec comparison; the results are in the raw data, the CSV file. There's only one album where XZ beats any codec, and that's on mono material with lots of silence. Pretty much everywhere else, XZ doesn't even come close. I guess the sample mentioned was something special, for example quantized in some unusual way.

This post has been edited by ktf: Jan 3 2014, 09:38


--------------------
Music: sounds arranged such that they construct feelings.
Mangix
post Jan 3 2014, 11:14
Post #31





Group: Members
Posts: 587
Joined: 26-February 06
Member No.: 28077



QUOTE (Porcus @ Jan 2 2014, 15:57) *
Since the thread is already bumped:

QUOTE (Thundik81 @ Aug 18 2013, 10:54) *


Three "general" compressors beating TAK on size, and Stuffit being on par at encoding speed? I am impressed, even though they are not seekable.


(A couple of the comments made me think of a particular peak value which disproportionally many of my rips do have - namely, .999969, occurring on 13 percent (compare with about 30 for the value 1). Could the reason for this value be a known algorithm that could potentially be exploited?)

Those general compressors have specialized models that deal with .wav files, typically some context-mixing algorithm with filters. They also operate on large blocksizes, so seeking has to be done in a similar manner to MP3 (decoding the whole file). They're also quite slow and unsuitable for real-time playback. WinRK and NanoZip are, anyway.

FLAC usually has a blocksize of 4096, if I'm not mistaken. TAK, I think, goes up to 16384. Smaller blocksizes give up some compression in exchange for seekability.

QUOTE
There's only one album where XZ beats any codec, and that's on mono material with lots of silence.
LZ77 does very well when you have repeating sequences. Silence falls into that category. Although xz does use a "delta" filter when applied to audio data. It probably allows the LZ77 model to find the patterns.

This post has been edited by Mangix: Jan 3 2014, 11:19
Porcus
post Jan 3 2014, 16:49
Post #32








As a sidenote, after someone here on HA pointed out to me that FLAC could be used as a general purpose compressor by --force-raw-format, I did for fun try it on a few files. Not very competitive.


Mangix
post Jan 4 2014, 00:22
Post #33








FLAC would probably do better on general files if the Rice coding were replaced with arithmetic coding or FSE (Finite State Entropy). In the case of the latter, decode speed should actually improve.
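
For readers unfamiliar with the entropy coder being discussed, here is a minimal sketch of Rice coding in Python. It is an illustration only: the function names are made up, the unary bit order and zigzag mapping follow one common textbook convention rather than FLAC's actual bitstream layout, and the bits are kept as a text string for readability.

```python
def zigzag(value: int) -> int:
    """Map signed residuals to non-negative ints: 0,-1,1,-2,2 -> 0,1,2,3,4."""
    return (value << 1) if value >= 0 else ((-value << 1) - 1)


def rice_encode(value: int, k: int) -> str:
    """Rice-code one signed value with parameter k, returned as a bit string."""
    u = zigzag(value)
    quotient, remainder = u >> k, u & ((1 << k) - 1)
    unary = "1" * quotient + "0"    # quotient in unary, terminated by a 0
    binary = format(remainder, "b").zfill(k) if k else ""
    return unary + binary


# Small residuals cost few bits; large ones get expensive quickly.
for residual in (0, -3, 5, 100):
    code = rice_encode(residual, k=2)
    print(residual, len(code), "bits")
```

With k=2, a residual of 100 already takes 53 bits. Real encoders choose k per block to fit the residual statistics, so the sketch only shows how the code length grows with residual size for a fixed k.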

This post has been edited by Mangix: Jan 4 2014, 00:22
bryant
post Jan 4 2014, 20:36
Post #34





QUOTE (Porcus @ Jan 3 2014, 07:49) *
As a sidenote, after someone here on HA pointed out to me that FLAC could be used as a general purpose compressor by --force-raw-format, I did for fun try it on a few files. Not very competitive.

WavPack can do this too with "--raw-pcm" and yes, they're generally not too competitive. Switching to 8-bit mono sometimes helps. I think the only practical value of this is finding bugs in the code.
bryant
post Jan 4 2014, 20:40
Post #35





QUOTE (Porcus @ Jan 3 2014, 00:19) *
QUOTE (bryant @ Jan 3 2014, 03:13) *
Interestingly, that's not the case for DSD audio; I recently was experimenting with compressing that and it took a few days before I could beat bzip2 (and it still beats me on some samples)!

Starting from LPCM-optimized WavPack, I presume?

No, actually starting from scratch and using arithmetic coding. And one of the methods actually was a decent general purpose compressor that beat WinZip on one huge PDF that I had! :)
thebombzen
post Jan 9 2014, 23:06
Post #36





Group: Members
Posts: 2
Joined: 9-January 14
Member No.: 113916



QUOTE (Mangix @ Jan 3 2014, 05:14) *
QUOTE (ktf @ Jan 3 2014, 09:38) *

There's only one album where XZ beats any codec, and that's on mono material with lots of silence.

LZ77 does very well when you have repeating sequences. Silence falls into that category. Although xz does use a "delta" filter when applied to audio data. It probably allows the LZ77 model to find the patterns.


XZ can do particularly well if you give it a custom filter chain. I get pretty good ratios that way (though still not as good as flac -8). Specifically, I use
CODE
xz -vvk --delta=dist=4 --delta=dist=4 --lzma2=dict=128MiB,lc=0,lp=2,pb=2,mode=normal,nice=273,mf=bt4,depth=1024 Audio_file.wav

but I change the dictionary size to the smallest value of the form 2^n or 2^n + 2^(n-1) that's larger than the file I'm compressing, because anything larger is unnecessary and those are the values that XZ supports.
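
The dictionary-size rule above can be sketched in Python as follows. The function name is hypothetical, starting the search at 4 KiB is an assumption about xz's lower limit, and the sketch accepts a dictionary exactly equal to the file size, since that already covers the whole file.

```python
def smallest_xz_dict_size(file_size: int) -> int:
    """Smallest 2^n or 2^n + 2^(n-1) that is at least file_size bytes."""
    n = 12  # assumed lower bound: 2^12 bytes = 4 KiB
    while True:
        for candidate in (1 << n, (1 << n) + (1 << (n - 1))):
            if candidate >= file_size:
                return candidate
        n += 1


MIB = 1 << 20
for size in (100 * MIB, 130 * MIB, 200 * MIB):
    print(size // MIB, "MiB file ->", smallest_xz_dict_size(size) // MIB, "MiB dict")
```

A 100 MiB file gets the 128 MiB dictionary used in the command above, while a 130 MiB file would need 192 MiB.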

Note that with delta you can specify the distance, which is extremely useful because each sample frame is 4 bytes long (16-bit stereo; adjust for other formats), so the corresponding byte of the previous frame is 4 bytes away. Using delta twice improves the ratio further on every audio file I've tried, but I don't entirely know why. For some reason, three delta filters consistently perform worse than two, even though two consistently perform better than one. Someone else will have to explain this one to me.
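
The same kind of filter chain can be tried from Python's standard `lzma` module, which makes it easy to compare zero, one and two delta passes. This is a rough sketch, not a benchmark: the helper names are made up, the input is a synthetic random walk standing in for 16-bit stereo PCM, and the dictionary is set to 1 MiB only to keep memory use low for such a small input.

```python
import lzma
import random
import struct


def synthetic_pcm(frames: int, seed: int = 0) -> bytes:
    """16-bit little-endian stereo PCM from two slow random walks (not real audio)."""
    rng = random.Random(seed)
    left = right = 0
    out = bytearray()
    for _ in range(frames):
        left = max(-32768, min(32767, left + rng.randint(-3, 3)))
        right = max(-32768, min(32767, right + rng.randint(-3, 3)))
        out += struct.pack("<hh", left, right)
    return bytes(out)


def xz_with_deltas(data: bytes, n_deltas: int, frame_bytes: int = 4) -> bytes:
    """Compress with n_deltas byte-wise delta filters followed by LZMA2."""
    chain = [{"id": lzma.FILTER_DELTA, "dist": frame_bytes} for _ in range(n_deltas)]
    chain.append({
        "id": lzma.FILTER_LZMA2,
        "preset": 6,
        "dict_size": 1 << 20,   # 1 MiB is plenty for this small test input
        "lc": 0, "lp": 2, "pb": 2,
    })
    return lzma.compress(data, format=lzma.FORMAT_XZ, filters=chain)


pcm = synthetic_pcm(20000)
for n in (0, 1, 2):
    blob = xz_with_deltas(pcm, n)
    assert lzma.decompress(blob) == pcm   # the filter chain is lossless
    print(f"{n} delta filter(s): {len(blob)} bytes from {len(pcm)}")
```

How one pass compares with two on this synthetic signal says nothing about real music; swap in the sample data of an actual .wav file to reproduce the comparison described above.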

Also note the values I'm using for lc, lp, and pb. (The other non-dict values are just max settings.) It's easier to explain if I quote the XZ manpages:
QUOTE (man xz)
lc=lc Specify the number of literal context bits. The minimum is 0 and the maximum is 4; the default is 3. In addition, the sum of lc and lp must not exceed 4.

All bytes that cannot be encoded as matches are encoded as literals. That is, literals are simply 8-bit bytes that are encoded one at a time.

The literal coding makes an assumption that the highest lc bits of the previous uncompressed byte correlate with the next byte. E.g. in typical English text, an upper-case letter is often followed by a lower-case letter, and a lower-case letter is usually followed by another lower-case letter. In the US-ASCII character set, the highest three bits are 010 for upper-case letters and 011 for lower-case letters. When lc is at least 3, the literal coding can take advantage of this property in the uncompressed data.

The default value (3) is usually good. If you want maximum compression, test lc=4. Sometimes it helps a little, and sometimes it makes compression worse. If it makes it worse, test e.g. lc=2 too.

lp=lp Specify the number of literal position bits. The minimum is 0 and the maximum is 4; the default is 0.

Lp affects what kind of alignment in the uncompressed data is assumed when encoding literals. See pb below for more information about alignment.

pb=pb Specify the number of position bits. The minimum is 0 and the maximum is 4; the default is 2.

Pb affects what kind of alignment in the uncompressed data is assumed in general. The default means four-byte alignment (2^pb=2^2=4), which is often a good choice when there's no better guess.

When the alignment is known, setting pb accordingly may reduce the file size a little. E.g. with text files having one-byte alignment (US-ASCII, ISO-8859-*, UTF-8), setting pb=0 can improve compression slightly. For UTF-16 text, pb=1 is a good choice. If the alignment is an odd number like 3 bytes, pb=0 might be the best choice.

Even though the assumed alignment can be adjusted with pb and lp, LZMA1 and LZMA2 still slightly favor 16-byte alignment. It might be worth taking into account when designing file formats that are likely to be often compressed with LZMA1 or LZMA2.


By using lp=2 and pb=2, I set LZMA2 to assume 4-byte alignment for everything, which is exactly what I want for 16-bit stereo samples. I'd change both to 1 for 16-bit mono, 24-bit stereo, and 8-bit stereo. (A 24-bit stereo frame is 6 bytes, which contains only one factor of two, whereas a 16-bit stereo frame is 4 bytes, which contains two; that's why the value is lower for 24-bit.) The choice of lc=0, lc=1, or lc=2 is not critical: I tried all three on several samples and got nearly identical results (within .001 of each other in ratio), sometimes better and sometimes worse depending on the sample.
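
The rule of thumb for lp and pb can be written down as a small Python helper that counts the factors of two in the frame size. The function name is hypothetical and the cap of 4 comes from the man page limits quoted above; remember that lc + lp must not exceed 4.

```python
def alignment_bits(bits_per_sample: int, channels: int) -> int:
    """Count the factors of two in the frame size, capped at xz's maximum of 4."""
    frame_bytes = (bits_per_sample // 8) * channels
    bits = 0
    while frame_bytes % 2 == 0 and bits < 4:
        frame_bytes //= 2
        bits += 1
    return bits


print(alignment_bits(16, 2))  # 16-bit stereo, 4-byte frames -> 2
print(alignment_bits(16, 1))  # 16-bit mono, 2-byte frames   -> 1
print(alignment_bits(24, 2))  # 24-bit stereo, 6-byte frames -> 1
print(alignment_bits(8, 2))   # 8-bit stereo, 2-byte frames  -> 1
```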

So there you have it, your guide on how to better compress your wav files with XZ. Final answer: Use FLAC or any other audio-oriented program. FLAC compresses better and also has much faster compression and decompression; FLAC decompressed around 6x faster than XZ and compressed around 20x faster.