lossyWAV 1.1.0 released., Added noise WAV bit reduction method.
halb27
post Aug 17 2008, 07:20
Post #101





Group: Members
Posts: 2424
Joined: 9-October 05
From: Dormagen, Germany
Member No.: 25015



Nick.C can say more, but as far as I understand it, the different kind of noise shaping can't be expected soon.
Moreover, it's not even clear at the moment whether it will be a better alternative to the current kind of noise shaping, at least when targeting 'lossless quality in a practical sense' (using '--standard').
The most promising use of this hopefully more advanced noise shaping will be at the low-bitrate end of the quality range. Hopefully we will get the very good quality of '--portable' at a significantly lower bitrate than the 370 kbps it needs on average at the moment.

So IMO waiting for the new noise shaping is necessary only when targeting low bitrate settings.

BTW I have to update my signature. Though I no longer keep a lossless version of my music, I do not use '--extreme' any more (with the exception of very precious tracks). '--standard' is fine even for this purpose.
Moreover, I no longer care about 'bitrate bloat' on rare occasions (solo instruments) compared to lossless wavPack. Relative to my total collection it's negligible, and I prefer having all my music in FLAC and mp3 format (thinking a bit of a future DAP).

This post has been edited by halb27: Aug 17 2008, 10:23


--------------------
lame3100m -V1 --insane-factor 0.75
BlAcKnOiSe
post Aug 17 2008, 10:23
Post #102





Group: Members
Posts: 14
Joined: 8-August 08
Member No.: 56869



With all due respect, what's the point of lossyWAV+WavPack encoding?... Why not use WavPack in lossy mode?... As I understand it, it uses similar compression... and maybe the encoding is a little less hassle... Or does lossyWAV have some compression/size benefit over a lossy WavPack file?...
lvqcl
post Aug 17 2008, 10:56
Post #103





Group: Developer
Posts: 3326
Joined: 2-December 07
Member No.: 49183



QUOTE (BlAcKnOiSe @ Aug 17 2008, 13:23) *
With all due respect, what's the point of lossyWAV+WavPack encoding?... Why not use WavPack in lossy mode?... As I understand it, it uses similar compression... and maybe the encoding is a little less hassle... Or does lossyWAV have some compression/size benefit over a lossy WavPack file?...

You can transcode lossyWAV+WavPack files to FLAC/WMALossless/TAK/etc and get exactly the same quality and nearly the same bitrate as original WV files. WavPack lossy mode doesn't have this feature: you cannot transcode it to any other format without loss in quality or compression.
halb27
post Aug 17 2008, 13:19
Post #104





Group: Members
Posts: 2424
Joined: 9-October 05
From: Dormagen, Germany
Member No.: 25015



QUOTE (BlAcKnOiSe @ Aug 17 2008, 11:23) *
With all due respect, what's the point of lossyWAV+WavPack encoding?...

I don't know whether this is addressed to me. In case it is (you wrote it after my post):
I used lossless wavPack (without lossyWAV preprocessing) in all those cases where this yielded smaller files than lossyWAV | FLAC.
It is extremely rare that this happens, but with solo instrument music it can. The main reason is that FLAC doesn't work well with this kind of music (considering just lossless encoding, FLAC yields files up to 10% or more larger than wavPack or TAK for solo instrument tracks and similar music).
At the moment I can use wavPack on my DAP, but I don't want to rely on that for the future. wavPack is a great piece of software, but unfortunately not well supported on DAPs.
That's why I concentrate on FLAC and mp3, and no longer care about the very few files where the lossyWAV | FLAC procedure yields larger files than a lossless wavPack encoding.

Other than that, lossyWAV | wavPack is really not very attractive. With best compatibility for players in mind, lossyWAV | FLAC is more attractive. With best overall efficiency in mind, lossyWAV | TAK is more attractive. lossyWAV | wavPack is the least efficient combination, even after David Bryant's improvements to this situation. And wavPack enthusiasts have no reason to switch from wavPack lossy to another very high quality lossy procedure. wavPack lossy is great as well (and is almost certainly the better solution when targeting a bitrate below 300 kbps).

lossyWAV | lossless codec, on the other hand, has the great feature that the lossless codec can be replaced whenever another codec becomes more attractive. This makes it future-proof, and the replacement is a lossless process. And of course quality is also superb when targeting ~350+ kbps.

This post has been edited by halb27: Aug 17 2008, 13:36


--------------------
lame3100m -V1 --insane-factor 0.75
BlAcKnOiSe
post Aug 17 2008, 14:01
Post #105





Group: Members
Posts: 14
Joined: 8-August 08
Member No.: 56869



QUOTE
I don't know whether this is addressed to me.

No... :) It was a coincidence that you talked about WavPack. But you and lvqcl pointed out that future lossless transcoding is the reason to consider the lossyWAV+WavPack combo... ;)
So other than that it's better to stay with WavPack's native lossy format, or use lossyWAV + "more popular lossless codec"...
Nick.C
post Aug 17 2008, 21:13
Post #106


lossyWAV Developer


Group: Developer
Posts: 1785
Joined: 11-April 07
From: Wherever here is
Member No.: 42400



QUOTE (sauvage78 @ Aug 17 2008, 06:49) *
Quote: Nick.C
"My intention is to understand and implement SebastianG's new noise shaping method, but for that I will also have to introduce / find a PSY model of some kind."

Do you have any idea how long this will take? 3 months, 6 months or a year?
I don't really know - I have been thinking about it for a while, and I believe that one of the nice things about lossyWAV is that its psy model (if it can be called that) is extremely simplistic - basically take more account of FFT results at the lower end of the frequency spectrum and disregard any results above 16 kHz.
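
As a rough illustration of how simple such an analysis can be, the sketch below takes the magnitudes of one FFT block, ignores every bin above 16 kHz and returns the lowest remaining level. The function name, the toy spectrum and the choice of "lowest level" as the result are assumptions made for illustration; this is not lossyWAV's actual code, and it leaves out the extra weight given to low-frequency results.

```python
import math

def min_level_below_cutoff(magnitudes, sample_rate, cutoff_hz=16000.0):
    """Return the lowest bin level in dB among bins at or below cutoff_hz.

    magnitudes: linear magnitudes for bins 0..N/2 of an N-point FFT
    (hypothetical input; the FFT itself is not computed here).
    """
    n_bins = len(magnitudes)
    # The last of the N/2 + 1 bins sits at the Nyquist frequency.
    bin_width = (sample_rate / 2.0) / (n_bins - 1)
    kept = [m for k, m in enumerate(magnitudes) if k * bin_width <= cutoff_hz]
    # Clamp to a tiny value so digital silence does not hit log10(0).
    return min(20.0 * math.log10(max(m, 1e-12)) for m in kept)

# Toy spectrum: 9 bins spanning 0..22050 Hz at a 44.1 kHz sample rate.
spectrum = [1.0, 0.5, 0.25, 0.1, 0.05, 0.02, 0.01, 0.001, 0.0001]
level = min_level_below_cutoff(spectrum, 44100)
```

With this toy input the three quietest bins lie above 16 kHz, so they do not pull the result down.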
QUOTE (sauvage78 @ Aug 17 2008, 06:49) *
Is SebastianG giving you any accelerated private lessons ?
Nope.
QUOTE (sauvage78 @ Aug 17 2008, 06:49) *
It's not that I want to hurry you ;) or be rude in any way :'( , but I care a lot about lossywav ... it's already my favorite lossy codec ... & I plan to convert terabytes of lossless to Lossy|Tak -P|-p2e ... (without lossless backup) so I care a lot about this new noise shaping method if it can save me some kbps (& also about the new special Tak setting for lossywav :P )
The new noise shaping method may not actually reduce bitrate (although I would hope that it would). However, until a psy model is identified as being compatible with the rest of the NS mechanics then the development cannot progress past the implementation of SebastianG's method sans psy model.
QUOTE (sauvage78 @ Aug 17 2008, 06:49) *
without a 1.2.0 development thread I am asking myself everyday:
1: is it a TODO thing that is already actively worked on in the shadows. :)
2: is it a TODO thing that is just an idea. :'(
3: are you on vacation with wife & kids. 8)
so I'd rather simply ask ...
1: Not yet;
2: Not really;
3: I was for a week, and we all had a nice time.
QUOTE (sauvage78 @ Aug 17 2008, 06:49) *
as I have been disappointed by vaporware features from Christopher 'Monty' Montgomery in the past ... I am very suspicious about open source developers' claims ... so excuse me if I sound rude, I just want to test your determination to get lossywav to the max of its possible efficiency. (I don't want the "new noise shaping method" to become the "bitrate peeling" of lossywav)

I know you just released V1.1.0 a month ago & that I shouldn't already be longing for more ... but now that you are a developer you'll have to learn that end-users are relentless vampires !!! LOL
lossyWAV has absorbed most of my "me" time over the last 13 months or so. It is still being worked on, but only speedups and code optimisations at present. I would expect to start the first attempt at implementing SG's new method "soon" but I am not prepared (as this is a hobby project) to set a definitive date for completion.

I am very pleased that you think enough of the project to be seeking "more" from it, and also appreciative of the ABX testing that you have carried out during the tuning phase of 1.1.0.


--------------------
lossyWAV -q X -a 4 --feedback 4| FLAC -8 ~= 320kbps
sauvage78
post Aug 18 2008, 05:34
Post #107





Group: Members
Posts: 677
Joined: 4-May 08
Member No.: 53282



halb27 & Nick.C
Thanks for the good answers, they helped me see the future of lossywav more clearly.

Nick.C,

1: Did you ever think of having a separate site for lossywav? Other than the wiki, I mean. Noobs like to have a home site for their bookmarks. For sharing links, it's nicer to point a noob to an "official" site than to a wiki or a forum. A forum can disappear & a wiki is too anonymous IMHO. Maybe a simple one like wavpack's, hosted by rarewares?

2: Did you ever think of asking for a logo like other codecs did on HA in the past? Sure, it will not improve the audio quality, but it will help identify the codec & I could put it in my avatar as free publicity ;)
Some guys on the forum are pretty skilled, the tak logo turned out very nice!

3: Did you ever think of putting lossywav under the Xiph umbrella? (I am not sure it is a good idea at all LOL but Xiph doesn't have any hybrid codec) & I mean, from the start you seemed more attached to flac than to any other lossless codec ... :ph34r: Xiph is maybe a communist place to be :ph34r: but maybe it would attract some Linux users to lossywav & maybe would incite Josh to rack his brain to find a dedicated flac setting for lossywav like Tom is doing for Tak ;)

quote: Nick C
"and also appreciative of the ABX testing that you have carried out during the tuning phase of 1.1.0"
well I must have spent less than 4 hours in total on that sample, & only -q2 was very time greedy if I recall correctly, all credit goes to Mardel for finding it ;) ... but I would redo it if a sample threatened my -q2.5 transparency ;)
... to be fully honest I tested for myself first :P , this sample came out just when I needed to see what lossywav had in its stomach. Nowadays, from time to time, when I suspect a part of one of my favorite songs would be problematic for lossywav I encode it at -q1 & see what happens ... I haven't found any killer sample yet ;)

Edit:
Also for me -p2e should be recommended over -p2m for lossytak, the m switch is too time greedy for the very small space gain. It's not a bad switch, as it brings you the best compression for a target decoding speed, but it shouldn't be recommended as default IMHO even if Tom said so. (In fact he didn't say so if I recall correctly: he said there was almost no space gain for lossytak past -p2m, which is true if you consider only space gain & decoding speed, but false if you add encoding speed as a consideration, & you concluded that -p2m was recommended ... which is a misinterpretation IMHO ... there is a nice speed gain between -p2e & -p2m even if there is almost no space gain) The m switch is nice in only two cases IMHO: 1) best size for a target decoding speed for a DAP, where -p2m may be useful as it brings you the absolute max compression within the -p2 decoding speed hardware requirement, & 2) the absolute max compression for Tak, aka -p5m. The average lossytak user doesn't fall into these cases.

This post has been edited by sauvage78: Aug 18 2008, 07:36


--------------------
CDImage+CUE
Secure [Low/C2/AR(2)]
Flac -4
2Bdecided
post Aug 18 2008, 11:00
Post #108


ReplayGain developer


Group: Developer
Posts: 5059
Joined: 5-November 01
From: Yorkshire, UK
Member No.: 409



lossyWAV with a good psymodel and arbitrary noise shaping could be as "good" as a typical lossy codec - it would avoid some of the restrictions of some lossy codecs, but would have plenty of its own. Whether it can ever compete in efficiency terms depends in part on the design of the lossless codec it's partnered with.

It would be a fascinating project, but the quality would depend completely on the psychoacoustic model. If you borrowed the psychoacoustic model from Musepack, and used arbitrary noise shaping to push the noise to match that added by musepack, this would be a good start. It's not "optimal" (you're imposing some musepack restrictions on lossyWAV), but it's easier than building a psychoacoustic model from scratch.

I'd join in, if I had the time. However, you have to ask the question: who would use it, and why? If you want a good psychoacoustic codec, there are plenty to choose from. lossyWAV + psymodel is unlikely to beat the best of them in quality, or any of them in bitrate.

I guess it could be higher quality and more compatible in some situations?

Cheers,
David.
Hancoque
post Aug 18 2008, 11:31
Post #109





Group: Members
Posts: 291
Joined: 27-January 04
From: Germany
Member No.: 11530



I think that the advantage of lossyWAV is that there is *no* psymodel involved, so that the compression stays robust and can be used for post-processing without revealing hidden artefacts. I also think that bitrates around 400 kbps are small enough for this approach and that other codecs should be used if (much) lower bitrates are needed. It would be re-inventing the wheel if lossyWAV incorporated psychoacoustic methods to achieve (near-)transparent results below 300 kbps. We already have MP3, Vorbis and many more like that. In my opinion one should focus on the original goal: Introduce a negligible loss while maintaining the benefits of lossless compression.

This post has been edited by Hancoque: Aug 18 2008, 11:34
sauvage78
post Aug 18 2008, 11:47
Post #110





Group: Members
Posts: 677
Joined: 4-May 08
Member No.: 53282



well I am not a technical guy, but as far as I understand it the psymodel will be used to tell how the added noise must be shaped in order to be inaudible ... it will not be a psymodel used to affect the music, only the added noise, so it is not "re-inventing the wheel" ... I may be completely wrong in how I understand things, but I think Nick is not stupid enough to transform lossywav into ... yet another DCT codec ...

psymodel+DCT is evil, psymodel+noise shaping is good ... I wouldn't even have asked for this enhancement if I knew it would be a regression ... I would be damn stupid too ... hum well maybe I am after all ;)

This post has been edited by sauvage78: Aug 18 2008, 12:01


--------------------
CDImage+CUE
Secure [Low/C2/AR(2)]
Flac -4
2Bdecided
post Aug 18 2008, 12:55
Post #111


ReplayGain developer


Group: Developer
Posts: 5059
Joined: 5-November 01
From: Yorkshire, UK
Member No.: 409



...but that's exactly what psymodels do in all codecs - judge where noise can be added, such that it will (or should!) be inaudible.

DCT (e.g. mp3, AAC), or not (e.g. mp2, Musepack) - that only affects how accurately you can place the noise in the time and/or frequency domains.

The lossyWAV difference signal (i.e. the mathematical difference between the original and the coded version, calculated by simple waveform subtraction) sounds so different because the noise is currently white, or with a fixed shape. If you aggressively shape the noise based on a psy model, it'll sound fairly similar to mp3, AAC or whatever.
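
For anyone who wants to listen to such a difference signal, the subtraction itself is trivial. A minimal sketch, assuming both versions are already decoded to equal-length, sample-aligned lists of integer PCM samples; the function name and sample values are made up for illustration:

```python
def difference_signal(original, processed):
    """Sample-by-sample difference between two aligned PCM signals."""
    if len(original) != len(processed):
        raise ValueError("signals must have the same length")
    return [o - p for o, p in zip(original, processed)]

# Toy data: the processed version differs only in the last two samples.
original = [1000, -2000, 1500, 37]
processed = [1000, -2000, 1504, 32]
diff = difference_signal(original, processed)
```

The alignment matters: a codec that adds delay or padding would need its output shifted before subtracting, otherwise the "difference" is mostly the music itself.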

It's true you could avoid DCT artefacts specifically, but AAC does this pretty well with temporal noise shaping, and Musepack does this entirely since it's not a transform codec.

I still think it's interesting to try this - for one thing, it creates a new type of audio codec - but potential problems will be similar to some of those already faced by other psychoacoustic based codecs. It won't be a magic faultless codec, like lossyWAV without a psy model might be.

Cheers,
David.
halb27
post Aug 18 2008, 13:59
Post #112





Group: Members
Posts: 2424
Joined: 9-October 05
From: Dormagen, Germany
Member No.: 25015



Sounds like planning to implement a rather sophisticated psy model.
I had thought that it's only about a very elementary psy model, something that's necessary for placing noise, say, in the HF region in one block, in the low frequency region in another, and keeping it flat in a third, or something like that. I guess wavPack lossy's dynamic noise shaping is of that kind, and I thought that it's about something similar.
I personally don't see much sense in struggling for < 250 kbps with the lossyWAV approach. We have a lot of good codecs here - as was said - which probably do a better job than wavPack lossy does with this approach.

Improving quality in the ~300 kbps region is the more promising approach IMO.

This post has been edited by halb27: Aug 18 2008, 13:59


--------------------
lame3100m -V1 --insane-factor 0.75
SebastianG
post Aug 18 2008, 15:37
Post #113





Group: Developer
Posts: 1317
Joined: 20-March 04
From: Göttingen (DE)
Member No.: 12875



QUOTE (Hancoque @ Aug 18 2008, 12:31) *
I think that the advantage of lossyWAV is that there is *no* psymodel involved,

First of all, how would that be an advantage? Second, what's your definition of "psymodel"? Doesn't lossyWAV's analysis stage qualify as "psymodel"?

QUOTE (Hancoque @ Aug 18 2008, 12:31) *
so that the compression stays robust and can be used for post-processing without revealing hidden artefacts.

I'm pretty sure that this is simply a matter of headroom. Wouldn't it be even more "robust" if the noise floor's shape matched the hearing threshold's shape -- assuming the same NMR (noise-to-mask ratio) on average?

QUOTE (2Bdecided @ Aug 18 2008, 13:55) *
...but that's exactly what psymodels do in all codecs - judge where noise can be added, such that it will (or should!) be inaudible.

Right, and lossyWav is trying the same thing by selecting "wasted_bits" on a per-block basis ("temporal noise shaping").

Obviously temporal noise shaping is a good idea. It hides noise in "places" (in time) where it's not easily recognisable. If it weren't a good idea, there wouldn't be a good reason to favour lossyWAV over plain static word length reduction. The same logic applies to spectral noise shaping. It'll hide noise in "places" (in frequency) where it's not easily recognisable. So, the lack of spectral noise shaping isn't really a feature, is it?
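
To make the per-block idea concrete, here is a minimal sketch of word length reduction on one block, assuming 16-bit integer samples. `reduce_block` and the example numbers are hypothetical and this is not lossyWAV's actual code; in particular, how many bits to remove in each block is exactly what the analysis stage has to decide, and here it is simply passed in. The point is only that after rounding, every sample in the block shares the same trailing zero bits, which a lossless codec such as FLAC can detect and avoid storing.

```python
def reduce_block(samples, bits_to_remove, bit_depth=16):
    """Round every sample to the nearest multiple of 2**bits_to_remove."""
    step = 1 << bits_to_remove
    lo = -(1 << (bit_depth - 1))
    hi = (1 << (bit_depth - 1)) - 1
    hi_q = (hi // step) * step  # largest multiple of step that still fits
    out = []
    for s in samples:
        q = ((s + step // 2) // step) * step  # round to nearest multiple
        out.append(min(max(q, lo), hi_q))     # clip to the valid range
    return out

def wasted_bits(samples):
    """Count the trailing zero bits shared by all samples in the block."""
    combined = 0
    for s in samples:
        combined |= s
    if combined == 0:
        return 0  # all-zero block: treated as "no wasted bits" in this sketch
    count = 0
    while combined & 1 == 0:
        combined >>= 1
        count += 1
    return count

block = [1003, -2001, 1500, 37]
reduced = reduce_block(block, 3)
```

Static word length reduction would use the same `bits_to_remove` for the whole file; choosing it block by block is what lets the added noise follow the music in time.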

QUOTE (2Bdecided @ Aug 18 2008, 13:55) *
I still think it's interesting to try this - for one thing, it creates a new type of audio codec but potential problems will be similar to some of those already faced by other psychoacoustic based codecs. It won't be a magic faultless codec, like lossyWAV without a psy model might be.

In what way is the current lossyWAV faultless? Are you sure this isn't entirely due to the "high bitrate headroom"?

Cheers,
Sebastian
2Bdecided
post Aug 18 2008, 16:34
Post #114


ReplayGain developer


Group: Developer
Posts: 5059
Joined: 5-November 01
From: Yorkshire, UK
Member No.: 409



QUOTE (SebastianG @ Aug 18 2008, 15:37) *
QUOTE (2Bdecided @ Aug 18 2008, 13:55) *

I still think it's interesting to try this - for one thing, it creates a new type of audio codec but potential problems will be similar to some of those already faced by other psychoacoustic based codecs. It won't be a magic faultless codec, like lossyWAV without a psy model might be.
In what way is the current lossyWAV faultless?
I didn't claim it was. I said "might be", in reference to what others may believe or hope (from experience so far).

I was discussing the hope that some people have, that a version with a psy model would be as "safe" as the current version. If it relied on the psy model completely, then I don't believe it would be any more "safe" than mp3/musepack/vorbis/etc. That was the point I was making.


QUOTE
Are you sure this isn't entirely due to the "high bitrate headroom"?
I'm fairly sure it's because it keeps the added noise below the signal (measured using a variety of temporal and spectral windows), and because the point where the added noise comes closest to the signal is usually at a high frequency where it would be less audible/objectionable anyway. The result is usually over coding at most frequencies, and a comparatively high bitrate.

There isn't much bitrate "headroom" though - maybe there is more in the ways Nick has intelligently developed it, but in the simple early versions a 6dB increase in noise made that noise a little audible, a 12dB increase made it very audible. In that sense, there was little headroom.
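
For scale: if the added noise behaves like plain quantization noise, whose amplitude doubles with each extra bit removed (an assumption, not something measured here), then 6dB and 12dB correspond to roughly one and two further bits. A quick check of that arithmetic, with a made-up function name:

```python
import math

def noise_increase_db(extra_bits_removed):
    """Noise level change in dB when the quantization step grows by 2**bits."""
    return 20.0 * math.log10(2.0 ** extra_bits_removed)

one_bit = noise_increase_db(1)
two_bits = noise_increase_db(2)
```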


Let me turn the question around: what would be involved in making the error signal from, say, vorbis, Musepack or AAC (I won't include mp3, because you'd need freeformat to have sufficient bitrate available), comparable to that from lossyWAV?

Cheers,
David.
SebastianG
post Aug 18 2008, 18:09
Post #115





Group: Developer
Posts: 1317
Joined: 20-March 04
From: Göttingen (DE)
Member No.: 12875



QUOTE (2Bdecided @ Aug 18 2008, 17:34) *
I didn't claim it was. I said "might be", in reference to what others may believe or hope (from experience so far).

Right. I missed that part.

QUOTE (2Bdecided @ Aug 18 2008, 17:34) *
[...] The result is usually over coding at most frequencies [...] There isn't much bitrate "headroom" though [...]

That's precisely what I mean. Quality varies. Why would you want to do that?

It should be clear that spectral noise shaping can prevent such large discrepancies (some parts are over-coded and some are just good enough to not sound bad). This issue came up a number of times by now and I still fail to understand the obsession of some people with a spectrally flat noise floor.

Regarding presence/lack of a psychoacoustic model: Excuse me but where do the 'wasted_bits' values come from? Certainly there's a unit in lossyWAV that does some analysis on how much spectrally flat noise is tolerable. It already does some spectral analysis ("different temporal and spectral windows"). It might be simplistic but it certainly qualifies as a psychoacoustic model.

QUOTE (2Bdecided @ Aug 18 2008, 17:34) *
Let me turn the question around: what would be involved in making the error signal from, say, vorbis, Musepack or AAC [...] comparable to that from lossyWAV?

You make it sound as if that's a good thing to try. I don't see where you're going with this. But in theory it should be possible -- at least in case of Vorbis. You probably have to design some new code books (i.e. for those parts that are heavily over-coded) ;-) Encoding the floor curve is a piece of cake since it's a boring straight line with an offset depending on 'wasted_bits'.

Cheers,
Sebastian
Hancoque
post Aug 18 2008, 22:23
Post #116





Group: Members
Posts: 291
Joined: 27-January 04
From: Germany
Member No.: 11530



QUOTE (SebastianG @ Aug 18 2008, 16:37) *
First of all, how would that be an advantage?

Whether it is an advantage or not depends on what you want to achieve. Personally, I want to use lossyWAV because it allows for heavy post-processing. So, for me it isn't only important that a resulting file sounds good "as is" but also that it doesn't show any issues after equalizing, pitch-shifting or HRTF processing. For me it's a compromise between "pure lossy" and lossless. And I thought that this is the whole purpose of lossyWAV: being lossy to lower the bitrate while staying close to lossless and maintaining all the advantages of lossless compression like post-processability or the assurance that your dog or invisible aliens won't hear nasty artefacts while you enjoy a seemingly good sound. ;)

QUOTE (SebastianG @ Aug 18 2008, 16:37) *
Second, what's your definition of "psymodel"? Doesn't lossyWAV's analysis stage qualify as "psymodel"?

Well, basically it is already psychoacoustic in a way as it utilizes the masking effect (keeping added noise below the noise floor). But I don't think that some added white noise is a big problem. White noise is neutral and some tape hiss was never an issue during the days of cassette recorders.

Some time ago I read about the different dithering techniques and it has been stated that (flat) triangular dither is the preferred choice if further processing is planned while coloured dither should only be used in the final step. I'm pretty sure that for many people encoding to lossyWAV will be the final processing stage but for me it isn't. It's just the last stage where something is saved as a file.

I'm writing all this because I'm concerned that lossyWAV might be optimized for noise shaping in a way that non-shaped quality suffers as it is no longer of importance to the developer and those that are involved in the optimization process. It would reassure me to know that the quality of --portable --shaping 0 will only increase (not so important) but never decrease (very important) in the future.

This post has been edited by Hancoque: Aug 18 2008, 22:41
Nick.C
post Aug 19 2008, 07:24
Post #117


lossyWAV Developer


Group: Developer
Posts: 1785
Joined: 11-April 07
From: Wherever here is
Member No.: 42400



QUOTE (Hancoque @ Aug 18 2008, 22:23) *
I'm writing all this because I'm concerned that lossyWAV might be optimized for noise shaping in a way that non-shaped quality suffers as it is no longer of importance to the developer and those that are involved in the optimization process. It would reassure me to know that the quality of --portable --shaping 0 will only increase (not so important) but never decrease (very important) in the future.
I have no intention at all of forcing the user to use noise shaping, neither do I want to reduce the perceived quality of any of the quality presets. I hope that that in some way reassures you.... :)

[edit] Taking this opportunity to seek advice from interested users, I am wondering as to the real need for the correction file? Reversion to lossless, although possible, is not a painless procedure (although when creating a correction file the user cannot use STDOUT processing) and as the file "addition" process works outside of a lossless codec the re-integration cannot be performed at the player (as WavPack so elegantly can, I believe). [/edit]

This post has been edited by Nick.C: Aug 19 2008, 07:45


--------------------
lossyWAV -q X -a 4 --feedback 4| FLAC -8 ~= 320kbps
smok3
post Aug 19 2008, 08:24
Post #118


A/V Moderator


Group: Moderator
Posts: 1726
Joined: 30-April 02
From: Slovenia
Member No.: 1922



correction files would be a waste of time imho, there are others that can do that (wavpack & optimfrog for example), so why bother.


--------------------
PANIC: CPU 1: Cache Error (unrecoverable - dcache data) Eframe = 0x90000000208cf3b8
NOTICE - cpu 0 didn't dump TLB, may be hung
2Bdecided
post Aug 19 2008, 10:34
Post #119


ReplayGain developer


Group: Developer
Posts: 5059
Joined: 5-November 01
From: Yorkshire, UK
Member No.: 409



Oh, I liked them :( OK, I admit I haven't used one yet(!), but I liked having the option.
SebastianG
post Aug 19 2008, 10:42
Post #120





Group: Developer
Posts: 1317
Joined: 20-March 04
From: Göttingen (DE)
Member No.: 12875



Hancoque, there's nothing in between "pure lossy" and lossless. You're aiming for a large headroom in quality. That's all. You didn't mention a single reason against spectral noise shaping.

QUOTE
Well, basically it is already psychoacoustic in a way as it utilizes the masking effect (keeping added noise below the noise floor).

I know what you mean by "noise floor" and this is probably part of a misunderstanding. How would you know the "noise floor" in order to keep additional quantization noise below it? You don't. And it doesn't really matter at all. Hearing thresholds matter. You want to be on the safe side w.r.t. further processing etc? Use a large headroom.

...like for example quantization noise that's right under min(S - 12dB, M - 25dB) within every critical band where S=signal, M=estimated masking threshold. The above formula guarantees a minimal SNR of 12 dB and a headroom of at least 15 dB considering a medium quality psychoacoustic model that's within +/- 10 dB of accuracy.
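
The rule can be written out directly. A minimal sketch, assuming S and M are already available as dB levels per critical band; the function name and the three example bands are made up, and how S and M would actually be estimated is left open:

```python
def noise_target_db(signal_db, mask_db, min_snr_db=12.0, mask_margin_db=25.0):
    """Upper limit for quantization noise in one critical band, in dB."""
    return min(signal_db - min_snr_db, mask_db - mask_margin_db)

# (S, M) pairs in dB for three hypothetical critical bands.
bands = [(60.0, 50.0), (40.0, 45.0), (70.0, 30.0)]
targets = [noise_target_db(s, m) for s, m in bands]

# If the model overestimates the true threshold by its full 10 dB error,
# the 25 dB margin still leaves 15 dB of headroom.
worst_case_headroom_db = 25.0 - 10.0
```

In the first and third band the masking term is the tighter limit; in the second band, where the estimated threshold lies above the signal, it is still the masking term that wins because of its larger margin.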

Cheers,
SG
2Bdecided
post Aug 19 2008, 11:03
Post #121


ReplayGain developer


Group: Developer
Posts: 5059
Joined: 5-November 01
From: Yorkshire, UK
Member No.: 409



QUOTE (SebastianG @ Aug 18 2008, 18:09) *
That's precisely what I mean. Quality varies. Why would you want to do that?
The same is true of LPCM, isn't it? The "overcoding" for loud signals is much greater than the "overcoding" for quiet signals.

Yet some people insist on using lossless coding of LPCM - not everyone looks at LPCM and thinks "why would you want to do that?" - not everyone feels the need to add noise to LPCM to make it "constant quality".

lossyWAV is a bit like LPCM, and a bit like psychoacoustic coding. It's a "half way house", and maybe not many people will want to use it. However, those who have a reason for avoiding full psychoacoustic coding (e.g. a desire to post-process), and also a reason for avoiding lossless (e.g. a desire not to throw 1000kbps+ at a lo-fi 2008 CD issue which would be largely transparent at 8-bit resolution!) may want to try it.


Remember (as if I need to tell you! ;) ) that conventional codecs often put "noise" above the "signal", since basic spectral masking theory says this is often OK, because it's inaudible. In contrast, lossyWAV tries to keep the added noise below the signal, always.

It should be obvious why this is useful for post-processing - if the noise is below the signal on every "useful" time and frequency scale, then it's near impossible to drag it above the signal, never mind to make it audible. Whereas the "inaudible" noise from conventional codecs, which could be above the signal, can "easily" be EQ'd (for example) to the level where it's no-longer masked.

I admit that concepts such as "noise below the signal" and "on every useful time and frequency scale" are not as simple and clear-cut as I imply when describing lossyWAV, but Nick really does seem to have hammered out most of the wrinkles in practice.

Cheers,
David.


QUOTE (SebastianG @ Aug 19 2008, 10:42) *
...like for example quantization noise that's right under min(S - 12dB, M - 25dB) within every critical band where S=signal, M=estimated masking threshold. The above formula guarantees a minimal SNR of 12 dB and a headroom of at least 15 dB considering a medium quality psychoacoustic model that's within +/- 10 dB of accuracy.
You could do that. You could introduce noise shaping and "real" psychoacoustics like that - or alternatively you could introduce noise shaping based on the minimum energy in each critical band (or fraction thereof). That's what I envisaged. Keep the noise well below the signal, even if the resulting noise level is lower than the masking threshold estimate.
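
A minimal sketch of that alternative, assuming a power spectrum for one block and a list of bin indices marking the critical band edges. The function name, the 12 dB margin and the toy numbers are assumptions for illustration only:

```python
import math

def band_noise_targets_db(bin_energies, band_edges, margin_db=12.0):
    """Per-band noise limit: lowest bin energy in the band minus a margin."""
    targets = []
    for start, stop in zip(band_edges[:-1], band_edges[1:]):
        lowest = min(bin_energies[start:stop])
        # Clamp to a tiny value so an empty bin does not hit log10(0).
        targets.append(10.0 * math.log10(max(lowest, 1e-12)) - margin_db)
    return targets

# Toy power spectrum with three "bands" of two bins each.
energies = [100.0, 10.0, 1000.0, 1.0, 0.1, 10.0]
edges = [0, 2, 4, 6]
targets = band_noise_targets_db(energies, edges)
```

Because each band follows its quietest bin, a single strong tone in a band does not license more noise there, which is the conservative behaviour described.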

Cheers,
David.
SebastianG
post Aug 19 2008, 11:58
Post #122





Group: Developer
Posts: 1317
Joined: 20-March 04
From: Göttingen (DE)
Member No.: 12875



QUOTE (2Bdecided @ Aug 19 2008, 12:03) *
The same is true of LPCM, isn't it? The "overcoding" for loud signals is much greater than the "overcoding" for quiet signals.

Yeah, and that's a particularly bad thing to do, isn't it? :D That's the whole point of lossy coding, isn't it? :D ;)

btw: AFAIR Bryant suggested using WavPack lossy instead of going from 24bit to 16bit LPCM to save space. This sounds like a sensible thing to do, doesn't it? It'll allow you to do some postprocessing (like dynamic range compression etc) whereas at 16 bit you're stuck with a certain noise floor that may get audible after some further manipulations...

QUOTE (2Bdecided @ Aug 19 2008, 12:03) *
Remember (as if I need to tell you! ;) ) that conventional codecs often put "noise" above the "signal", since basic spectral masking theory says this is often OK, because it's inaudible. In contrast, lossyWAV tries to keep the added noise below the signal, always.

This is without doubt desirable. I'm not suggesting anything else (in "min(S - 12dB, M - 25dB)" the numbers are chosen more or less arbitrarily).

There are a couple of reasons why this is a good idea in this context: you don't need to dither at all. It won't add much entropy which is good for encoders like FLAC. It won't increase energy anywhere by more than 0.26 dB in this case (SNR>=12 dB, measured over critical bands and not single frequency lines).
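To put numbers on this, here is a small sketch of the noise target and of the energy increase at a given SNR. It assumes the added noise is uncorrelated with the signal, so powers simply add; the function names are only illustrative, and the 12 dB / 25 dB defaults are the arbitrary values from the formula above.

```python
import math


def noise_target_db(signal_db, mask_db, snr_floor_db=12.0, mask_margin_db=25.0):
    """Noise level allowed in one critical band: min(S - 12 dB, M - 25 dB)."""
    return min(signal_db - snr_floor_db, mask_db - mask_margin_db)


def energy_increase_db(snr_db):
    """Rise in band energy (dB) when uncorrelated noise is added at snr_db."""
    return 10.0 * math.log10(1.0 + 10.0 ** (-snr_db / 10.0))


# Signal at -20 dB, estimated masking threshold at -30 dB:
# the masking term wins, min(-32, -55) = -55 dB.
target = noise_target_db(-20.0, -30.0)

# Worst case allowed by the formula: SNR of exactly 12 dB.
rise = energy_increase_db(12.0)
```

At SNR = 12 dB the rise comes out at a bit under 0.27 dB, in line with the figure quoted above, and it shrinks quickly as the SNR grows.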

Your other suggestion (the level of minimal energy in a critical band) would be a conservative version of masking with a narrow spreading function. I'd still prefer to just add headroom instead of messing with the spreading function. At least this is a tweaking question and a use case question. The typical use case is that we don't do a lot of post processing. For those who do, there can be enough headroom by reducing overall noise levels. You could make a switch for "conservative spreading function" with a comment like "allows heavy equalization but is not recommended for normal use".

Cheers,
SG

This post has been edited by SebastianG: Aug 19 2008, 14:15
carpman
post Aug 19 2008, 14:26
Post #123





Group: Developer
Posts: 1310
Joined: 27-June 07
Member No.: 44789



QUOTE (Nick.C @ Aug 19 2008, 07:24) *
Taking this opportunity to seek advice from interested users, I am wondering as to the real need for the correction file?

Don't use them with LossyWAV. If I really did want them I'd use WavPack Hybrid.

A suggestion for later down the line, once LossyWAV has had more exposure and more thorough testing. How about:

Changing --extreme to -q 6
Changing --insane to -q 7.5

I'd only recommend this on the proviso that --standard (-q 5) hasn't a single problem sample and is thus still considered transparent 100% of the time.

C.


--------------------
TAK -p4m :: LossyWAV -q 6 | TAK :: Lame 3.98 -V 2
2Bdecided
post Aug 19 2008, 15:40
Post #124


ReplayGain developer


Group: Developer
Posts: 5059
Joined: 5-November 01
From: Yorkshire, UK
Member No.: 409



SebastianG,

The nice thing about lossyWAV is that you can "easily" try all these approaches. There's no fixed filterbank, bitrate cap, DCT length(s) to accommodate or limit exactly where noise can be added vs not added.

You (currently) have a fixed block size, and you're stuck with that time resolution in terms of changing how many bits you remove. However, frequency domain-wise you can do anything you want - I guess the only restriction is having a stable realisable noise shaping filter which won't cause clipping. Time-domain wise and time/frequency wise you're quite free to shift noise around as you wish (but the total noise power can only jump by a minimum of 6dB with transitions every 512 samples).
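The 6dB figure is just the usual quantisation rule: each removed bit doubles the step size, which raises the noise by 20*log10(2). A quick sketch, assuming 44.1 kHz audio (the sample rate is my assumption, the 512-sample block is from the text above; function names are illustrative):

```python
import math


def noise_step_db(bits_removed):
    """Rise in quantisation noise (dB) after removing bits_removed LSBs."""
    return 20.0 * math.log10(2.0 ** bits_removed)


def transitions_per_second(sample_rate_hz=44100, block_size=512):
    """How often the number of removed bits can change."""
    return sample_rate_hz / block_size


one_bit = noise_step_db(1)            # about 6.02 dB
three_bits = noise_step_db(3)         # about 18.06 dB
rate = transitions_per_second()       # about 86 blocks per second
```

So the total noise power moves in steps of about 6 dB, and at 44.1 kHz it can change roughly 86 times per second, i.e. about every 11.6 ms.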

One caveat is that some of the things you could do would be quite pointless because they would dramatically reduce the efficiency of the subsequent lossless coding.

Cheers,
David.
sauvage78
post Aug 19 2008, 16:00
Post #125





Group: Members
Posts: 677
Joined: 4-May 08
Member No.: 53282



I don't use correction files, I have never used them & don't plan to ever use them.

carpman:
we already had a lot of discussions on lossywav presets, everybody has different opinions on it ...

Q2.5 --portable is a preset because problem samples were ABXable up to Q2 (but very hard at Q2), so Q2.5 has a small but nice 0.5 margin.

Q5 --standard is a preset as it is the original historic setting, where the added noise is always masked in theory

Q7.5 --extreme is a mirror of Q2.5 with a paranoid margin

Q10 --insane just pushes the logic of a 2.5 step per preset to the limit of the quality scale.

linking --extreme & --insane to Q6 & Q7.5 just because you consider it 100% transparent is purely arbitrary ... personally I consider Q2.5 100% transparent until someone proves the contrary by providing an ABXable problem sample. So far Q2.5 is the "real life" transparency point of lossywav & Q5 is the theoretical optimal setting for transparency of lossywav ... anything above Q5 is overkill, only maybe useful for people willing to use a transform codec later.

Q10 --insane is a useless preset IMHO, but if Nick wants it, amen ... I will not ask him to remove it (Edit: well I already did, but he didn't listen ;) )

you could randomly link --insane & --extreme to any setting above Q5 ... because it is overkill anyway ... using a 2.5 step is just a "little" less random ...

IMHO the rational choice for a lossywav setting is between Q2.5 and Q5 ... using Q2.5 you take the risk that a problem sample arises & proves that Q2.5 is not 100% transparent ... using Q5 the margin is already VERY big & you have the theory on your side so you may have the feeling of doing things the right way ... anything above Q5 is pure paranoia or a transcoding setting.

presets were made as a guide for noobs because lossywav is very different from any other lossy codec ... if you like Q6 then use the quality scale, as long as you know why you use Q6, it's ok ... obviously having 2 overkill presets is misleading for noobs as people used to the vorbis/nero quality scale think "the bigger the better" ... for lossywav that quickly becomes true only in theory & not in real life (ABX) ...

This post has been edited by sauvage78: Aug 19 2008, 17:01


--------------------
CDImage+CUE
Secure [Low/C2/AR(2)]
Flac -4