An idea of audio encode algorithm, based on maximum allowed volume of signals difference, WavPack hybrid mode test included
softrunner
post Mar 6 2013, 00:11
Post #1








Full topic title: "An idea of audio encode algorithm, based on maximum allowed volume of signals difference"

Recently I discovered for myself that the difference between the source and the encoded audio can be easily obtained by inverting the source audio and mixing it with the encoded one. Then an idea for an encoding algorithm came into my head: just try to keep the signal difference at (or below) a level defined by the user. Thus, the audio quality is simply measured by the volume of the difference between the signals, and this difference is nothing but the distortion produced by the encoder.
The whole algorithm looks like this:
1. Take the maximum allowed volume of the signal difference from the user.
2. Make a copy of the source audio and invert it.
3. Split both the source and the inverted audio into frames of the same size.
4. Encode the first frame of the source audio, mix the result with the first frame of the inverted audio, and calculate the volume of the resulting difference.
5. If the volume of the difference is higher than allowed by the user, add some bitrate and repeat from item 4.
6. If the volume of the difference is not higher than allowed by the user, add the first encoded frame to the final output.
7. Repeat items 4-6 with the second, third, etc. frames until the end of the source file.
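
The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the WavPack test described in this post: "volume" is taken to mean RMS level in dB relative to full scale, `toy_codec` is a hypothetical stand-in for an encode/decode round trip (it simply rounds samples to a given bit depth), and "add some bitrate" is modelled as adding one bit per sample. All function and parameter names are made up for the example.

```python
import math

def rms_db(samples):
    """RMS level in dB relative to full scale (1.0); -inf for digital silence."""
    mean_sq = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_sq) if mean_sq > 0 else float("-inf")

def toy_codec(frame, bits):
    """Hypothetical stand-in for encode + decode: rounds samples to `bits` bits."""
    scale = 2 ** (bits - 1)
    return [round(s * scale) / scale for s in frame]

def encode_bounded(source, frame_size, max_diff_db, codec=toy_codec,
                   min_bits=4, max_bits=24):
    """Per frame, raise the quality setting until the difference signal
    (decoded frame mixed with the inverted source) is at or below max_diff_db."""
    output, bits_per_frame = [], []
    for start in range(0, len(source), frame_size):
        frame = source[start:start + frame_size]
        inverted = [-s for s in frame]          # inverted copy of the source frame
        bits = min_bits
        while True:
            decoded = codec(frame, bits)
            # Mixing with the inverted source leaves only the codec's error.
            diff = [d + i for d, i in zip(decoded, inverted)]
            if rms_db(diff) <= max_diff_db or bits >= max_bits:
                break
            bits += 1                            # "add some bitrate" and retry
        output.extend(decoded)
        bits_per_frame.append(bits)
    return output, bits_per_frame

# One loud and one very quiet 440 Hz frame at an assumed 8 kHz sample rate.
loud = [0.5 * math.sin(2 * math.pi * 440 * n / 8000) for n in range(800)]
quiet = [0.001 * math.sin(2 * math.pi * 440 * n / 8000) for n in range(800)]
decoded, bits_used = encode_bounded(loud + quiet, 800, max_diff_db=-60.0)
print(bits_used)
```

With these toy settings the quiet frame stays at the lowest setting, because even discarding it entirely keeps the difference below the bound. A real encoder such as WavPack would take the place of `toy_codec` and step a bitrate rather than a bit depth.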

Of course, this algorithm is much slower than a direct encode, but it definitely should not be slower than video encoding (and people are ready to wait for many hours while their videos are being encoded).

I tried to reproduce this algorithm manually in a test using WavPack hybrid mode as the encoder (the source audio sample was split into 11 parts of 1 second each), and it showed that 23.4 % of space/bitrate could be saved. Another important thing is that the user is guaranteed not to get distortion at a volume level higher than he expects, so he can safely encode many files at once without looking at the content. The user is freed both from unnecessary waste of bitrate and from uncontrolled distortion.

The only thing needed is for some audio developers to get interested in this idea and implement it as a computer program.

The whole set of files of the WavPack test I've made is here.

This post has been edited by softrunner: Mar 6 2013, 00:20
softrunner
post Mar 22 2013, 03:16
Post #2








QUOTE (2Bdecided @ Mar 12 2013, 13:47) *
QUOTE (softrunner @ Mar 9 2013, 02:09) *
The point is in reducing file size without any audible loss of quality on all inputs possible with 100% guarantee.
I think that's called FLAC.

QUOTE (Dynamic @ Mar 12 2013, 15:42) *
Lossless is the only true guarantee.

The popularity of lossless is based mostly on the placebo effect. The file size of lossless does not match the real quality it has.
QUOTE
I think lossyWAV --maxclips 0 (i.e. lossyWAV standard) hasn't even shown non-transparency on extreme full-scale test signals, so lossyFLAC, lossyWV, lossyTAK etc are all viable.
In real music, the --maxclips 0 can be omitted with no reported problem samples. (The artificial test sample that generated the clipping problem was about 10 dB louder to the ear than today's maximally loud albums - or --maxclips 0 can be included at the expense of a minor bitrate increase).

The difference I hear in the samples I've posted is not about clipping. It is simply noise added by lossyWAV, which is sometimes too loud. Also, I do not see how --maxclips 0 changes the situation. The weakest point of lossyWAV is when there is quiet noise and some simple single-tone signal, which can be a single note of some musical instrument.
QUOTE
The first versions (or later versions with all noise shaping turned off using --shaping 0) make the bare minimum psychoacoustic assumption that one kind of white noise is indistinguishable from another, and keep the added noise below the minimum noise measured in the signal's audible spectrum without any kind of filtering or frequency-shaped noise. (This is the flavour of lossyWAV support included in CUETools and CUERipper, and I've had no problems using it as a source for transcoding into conventional lossy.)

Yes, I checked --shaping 0 and --shaping 1 as well, and it seems the added noise is not audible with them on the "standard" preset (it is audible on "economic"). So it seems that adaptive noise shaping, which lossyWAV uses by default, is not the best choice for quality presets "standard" and higher. With ANS, though, the noise is inaudible in some samples on "extraportable", where "shaping 0" and "shaping 1" are clearly audible.
QUOTE
The improvements to lossyWAV up to v1.3 (adaptive shaping of the added noise to match the signal spectrum) seem to have been very safe and conservative and seem to have actually hidden the noise better with more margin of safety.

I think so too, but sometimes ANS puts too much noise where there is not enough room for it, and it becomes clearly audible, so I think this is a possible direction for improving lossyWAV.
QUOTE (C.R.Helmrich @ Mar 13 2013, 00:46) *
Convert some CD audio to 8 bit/sample, that gives you the ~45 dB difference level you want. You'll find it's not enough for many music files, especially ones with long fade-outs.

If you do it without dithering, the noise will be at about -1.4 dB, and if you use dithering, yes, it will be at about -45 dB, but it will all be in the high frequencies, so using an equalizer will make it easily audible. What I'm talking about is not such a simple technique as converting to 8 bit. As Gecko wrote:
QUOTE
Anyway, I added that line in my original post to acknowledge the fact that the OP (as far as I have understood) isn't just trying to create ~8-bit audio, but rather imposing a bound on the maximum allowed error after Wavpack's psycho-acoustic lossy treatment.

But I have found that -45 dB is audible with WavPack, so a simple restriction on the maximum volume of the error signal will not be efficient enough. Anyway, what I want is some quality-oriented VBR mode for WavPack; then it will be clearer how good it is.
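
For reference, the level of plain 8-bit requantization noise mentioned above can be estimated directly. The following is a minimal sketch, not something posted in this thread: the 997 Hz test tone, the RMS measure relative to full scale, the flat TPDF dither (no noise shaping) and all function names are assumptions chosen for illustration. Theory gives an RMS error of about step/sqrt(12) without dither, which for 8 bits lands in the region of -50 dB relative to full scale; other measurement conventions (peak instead of RMS, relative to the signal instead of full scale) shift the figure by a few dB.

```python
import math
import random

def rms_db(samples):
    """RMS level in dB relative to full scale (1.0); -inf for digital silence."""
    mean_sq = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_sq) if mean_sq > 0 else float("-inf")

def requantize(samples, bits, dither=False, seed=0):
    """Round samples to `bits` bits, optionally adding flat TPDF dither first."""
    rng = random.Random(seed)
    step = 2.0 / (2 ** bits)  # full scale assumed to span [-1.0, 1.0)
    out = []
    for s in samples:
        # Difference of two uniform values: triangular PDF, +/- 1 step wide.
        noise = (rng.random() - rng.random()) * step if dither else 0.0
        out.append(round((s + noise) / step) * step)
    return out

def error_level_db(samples, bits, dither=False):
    """Level of (requantized - original), i.e. what a null test would leave."""
    quantized = requantize(samples, bits, dither)
    return rms_db([q - s for q, s in zip(quantized, samples)])

# One second of an assumed 997 Hz tone at 44.1 kHz, peak 0.9 of full scale.
tone = [0.9 * math.sin(2 * math.pi * 997 * n / 44100) for n in range(44100)]
plain_db = error_level_db(tone, 8)
dithered_db = error_level_db(tone, 8, dither=True)
print(f"8-bit error, no dither:   {plain_db:.1f} dBFS")
print(f"8-bit error, TPDF dither: {dithered_db:.1f} dBFS")
```

This only measures the overall error level; it says nothing about where in the spectrum the noise sits or how audible it is.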
QUOTE
Or 512-kbps AAC or Opus. I cannot think of any signal which would not be coded transparently at that bitrate. Because I don't know of any signal which is not transparent at e.g. Winamp AAC VBR 6 at half that bitrate on average. (Edit: I'm talking about stereo here of course).

It depends on what you call "transparent". Vorbis is audible at 619 kbps (check the "FighterBeatLoop" sample in the Uploads section), but it sounds good at much lower bitrates.
db1989
post Mar 22 2013, 11:51
Post #3








QUOTE (softrunner @ Mar 22 2013, 02:16) *
QUOTE
Or 512-kbps AAC or Opus. I cannot think of any signal which would not be coded transparently at that bitrate. Because I don't know of any signal which is not transparent at e.g. Winamp AAC VBR 6 at half that bitrate on average. (Edit: I'm talking about stereo here of course).
It depends on what you call "transparent". Vorbis is audible at 619 kbps (check the "FighterBeatLoop" sample in the Uploads section), but it sounds good at much lower bitrates.
Zoom:
QUOTE
It depends on what to call "transparent".
The irony is strong with this one. How do you define “transparent”, then? To me, it seems as though your ideal definition is transparency for everyone all the time. Setting aside how patently absurd that idea is since transparency specifically refers to specific combinations of listener and material, your pointing out how a codec that is usually transparent at much more sensible bitrates fails to be transparent at a very high bitrate with one particular sample does not support your argument: it’s actually undercutting it. There will always be exceptions to transparency, at least for certain people and certain signals, and none of your nice-sounding-in-novice-theory-but-baseless-in-practice ideas are likely to change that. At least develop a consistent narrative before you try to make everyone implement it at your behest.
2Bdecided
post Mar 22 2013, 14:57
Post #4


ReplayGain developer





QUOTE (db1989 @ Mar 22 2013, 10:51) *
QUOTE (softrunner @ Mar 22 2013, 02:16) *
It depends on what to call "transparent".
The irony is strong with this one. How do you define “transparent”, then? To me, it seems as though your ideal definition is transparency for everyone all the time. Setting aside how patently absurd that idea is...
Why is it absurd? The HA mantra is that CD quality audio is transparent WRT a stereo source. There are caveats (most importantly, gain riding may break it; lousy implementations may break it), but here at least, it's not a controversial statement.

If altering a single bit in a CD quality audio signal renders it non-transparent (i.e. that single bit change is audible) to some person under some reasonable listening circumstances, then miraculously CD quality really does define transparency, and nothing less counts. However, I suspect that's not the case. With a bounded set of use cases, you can probably create something that's more efficient than lossless coding of CD quality audio while remaining transparent to the stereo source. If you feed lossyWAV with 24-bits, for example, it'll cope with the gain riding that CD quality audio will not, while (I expect! ;) ) remaining transparent, and at a lower bitrate.

QUOTE
since transparency specifically refers to specific combinations of listener and material, your pointing out how a codec that is usually transparent at much more sensible bitrates fails to be transparent at a very high bitrate with one particular sample does not support your argument: it’s actually undercutting it. There will always be exceptions to transparency, at least for certain people and certain signals, and none of your nice-sounding-in-novice-theory-but-baseless-in-practice ideas are likely to change that.
It seems to me that the more "clever" you try to be in designing a codec, the lower the bitrate you can achieve for transparency "most" of the time and the fewer problem samples you will have, but the greater the percentage bitrate increase required to deal with the few problem samples you have left. I am talking anecdotally - I have no evidence or justification for this.



However, you are absolutely correct to call out softrunner on their use of "transparent", because until you define what you mean, the rest of the discussion is pointless. e.g. is it transparent even if you post-process with...
1) EQ. If so, how much?
2) DRC. If so, how much?
3) stereo processing. If so, what? Logic7? Vocal Cut? etc
4) phasing and flanging? Other DSP effects and production techniques?
5) adding an inverted copy of the original signal and amplifying the result by 100dB? ;)

Only lossless is transparent with post-process number 5 ;) I suspect 4 can be almost as tricky, and 3 is quite tricky. 1 and 2 can be accounted for and bounded.

Good luck softrunner.

Cheers,
David.


