
An idea of audio encode algorithm, based on maximum allowed volume of signals difference, WavPack hybrid mode test included
softrunner
post Mar 6 2013, 00:11
Post #1





Group: Members
Posts: 48
Joined: 19-July 12
Member No.: 101579



Full topic title: "An idea of audio encode algorithm, based on maximum allowed volume of signals difference"

I recently discovered that the difference between the source and the encoded audio can easily be obtained by inverting the source audio and mixing it with the encoded one. That gave me an idea for an encoding algorithm: simply keep the difference between the signals at or below a level defined by the user. The audio quality is then measured simply by the volume of the difference signal, and this difference is nothing but the distortion produced by the encoder.
The whole algorithm looks like this:
1. Take the maximum allowed volume of the signal difference from the user.
2. Make a copy of the source audio and invert it.
3. Split both the source and the inverted audio into frames of the same size.
4. Encode the first frame of the source audio, mix the result with the first frame of the inverted audio, and calculate the volume of the resulting difference.
5. If the volume of the difference is higher than the user allows, add some bitrate and repeat from item 4.
6. If the volume of the difference is not higher than the user allows, add the encoded frame to the final output.
7. Repeat items 4-6 with the second, third, etc. frames until the end of the source file.
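
The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: `toy_encode` is a hypothetical stand-in that rounds samples to a given number of bits (it is not WavPack or any real codec), "volume" is taken to be the peak absolute value of the difference, and the frame size and bit range are arbitrary.

```python
import math

def toy_encode(frame, bits):
    """Hypothetical stand-in for a lossy encoder: round samples in [-1, 1] to `bits` bits."""
    scale = 2 ** (bits - 1)
    return [round(s * scale) / scale for s in frame]

def peak_difference(source, encoded):
    """Mix the encoded frame with the inverted source and return the peak of the result."""
    return max((abs(e + (-s)) for s, e in zip(source, encoded)), default=0.0)

def encode_with_limit(samples, max_diff, frame_size=1000, min_bits=2, max_bits=24):
    """Per frame, raise the 'bitrate' (here: bit depth) until the difference is within max_diff."""
    output = []
    bits_used = []
    for start in range(0, len(samples), frame_size):
        frame = samples[start:start + frame_size]
        for bits in range(min_bits, max_bits + 1):
            encoded = toy_encode(frame, bits)
            if peak_difference(frame, encoded) <= max_diff:
                break  # this frame is within the allowed difference
        output.extend(encoded)
        bits_used.append(bits)
    return output, bits_used

# Made-up input: a quiet frame followed by a loud frame.
samples = ([0.01 * math.sin(i / 10) for i in range(1000)]
           + [0.9 * math.sin(i / 10) for i in range(1000)])
encoded, bits_used = encode_with_limit(samples, max_diff=0.001)
print(bits_used)  # bit depth chosen for each frame
```

With a real codec the inner loop would re-encode the frame at a higher bitrate instead of a higher bit depth, which is where the extra encoding time comes from.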

Of course, this algorithm is much slower than a direct encode, but it definitely should not be slower than video encoding (and people are prepared to wait many hours while their videos are encoded).

I tried to reproduce this algorithm manually in a test using WavPack hybrid mode as the encoder (the source audio sample was split into 11 parts of 1 second each), and it showed that 23.4% of the space/bitrate could be saved. Another important point is that the user is guaranteed not to get distortion at a volume level higher than he expects, so he can safely encode many files at once without looking at the content. The user is freed both from unnecessary waste of bitrate and from uncontrolled distortion.

All that is needed is for some audio developers to get interested in this idea and implement it as a computer program.

The whole set of files from the WavPack test I made is here.

This post has been edited by softrunner: Mar 6 2013, 00:20
Replies
softrunner
post Mar 9 2013, 03:09
Post #2





Group: Members
Posts: 48
Joined: 19-July 12
Member No.: 101579



QUOTE (2Bdecided @ Mar 7 2013, 20:14) *
You just need to reduce the bitdepth of the audio signal by an amount equivalent to the difference (=noise) you're willing to accept. You'll get 6dB more noise per extra bit dropped. Lower bitdepth = lower bitrate when losslessly encoded. So, use any audio editor that allows you to change the bitdepth, then use almost any lossless codec on the result = job done.

And I have to do all this manually for every lossless file I have? Actually, that's not what I want. And I think it will not be efficient enough.
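
The bit-depth suggestion quoted above can be illustrated with a short script. This is a rough sketch, not how any particular editor or codec does it: `reduce_bit_depth` and `rms_db` are hypothetical helpers, the test tone is an arbitrary 440 Hz sine, and no dither is applied. Each dropped bit should raise the noise by roughly 6 dB, so dropping 4 bits should raise it by about 24 dB.

```python
import math

def reduce_bit_depth(samples, bits):
    """Round samples in [-1, 1] to a `bits`-bit grid (no dither)."""
    scale = 2 ** (bits - 1)
    return [round(s * scale) / scale for s in samples]

def rms_db(values):
    """RMS level in dB relative to full scale (1.0)."""
    rms = math.sqrt(sum(v * v for v in values) / len(values))
    return 20 * math.log10(rms)

# One second of a 440 Hz tone at 44.1 kHz (made-up test signal).
samples = [0.8 * math.sin(2 * math.pi * 440 * n / 44100) for n in range(44100)]

noise_db = {}
for bits in (16, 12, 8):
    reduced = reduce_bit_depth(samples, bits)
    # The "difference" signal: reduced version mixed with the inverted source.
    noise = [r - s for r, s in zip(reduced, samples)]
    noise_db[bits] = rms_db(noise)
    print(bits, "bits: noise RMS", round(noise_db[bits], 1), "dBFS")
```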
QUOTE
For a smarter way of doing it, take a look at lossyWAV.

I know about lossyWAV. First, it does not accept the maximum allowed volume of the error signal as an input parameter; the volume of the distortion depends on the material being processed. Also, check this sample. On it lossyWAV gives distortion that is audible even at the "extreme" preset (235 kbps; in FLAC this sample uses 301 kbps), so we have to admit that its bit reduction and masking techniques do not work properly on all kinds of audio material. And for this sample WavPack gives a perfect result even at 96 kbps; its error signal is extremely quiet. So the only thing needed is to guide WavPack, to teach it where to use more bitrate and where to reduce it.

QUOTE (greynol @ Mar 7 2013, 21:05) *
Who can guarantee that this "maximum mathematical closeness" will work

It definitely will work. If you do not allow distortions of some volume, they will never appear. Very simple logic, which simply works. And I do not claim that this is the final destination. I accept that it may be possible to let the encoder be more aggressive in certain circumstances, but that is a matter of separate research for each encoder. First, a very simple approach should be implemented.
QUOTE
Also, please don't insult our intelligence by suggesting that we must try all possible input signals before rejecting the assertion that this idea will do better than already established practice built upon well established knowledge when you have not even offered any evidence supporting your concept.

I do not know exactly how it will work, but I want to try it, because already established practice does not work well enough. All we do is play blindly with bitrates, believing that we have some quality there. And when we find one more killer sample, we realize that it was just a belief.
QUOTE
If this isn't about audibility then I completely fail to see the point.

The point is to reduce file size without any audible loss of quality on all possible inputs, with a 100% guarantee. That means there will be no more killer samples at all. Every user will use his own level of allowed distortion, depending on the sensitivity of his ears, and he will know exactly what he gets.

QUOTE (Canar @ Mar 7 2013, 23:20) *
You can change the maximum allowed error level of audio simply by altering the number of bits allocated per sample in an uncompressed context. With an appropriate codec, you can use fractional numbers of bits-per-sample. Then you can compress it down losslessly for a further reduction in file size.

All this is far from real practice. I am not against the existing methods; on the contrary, I am for using them, but while looking at the result they give.

QUOTE (saratoga @ Mar 7 2013, 23:25) *
so I think he's interested in highly compressed audio, whereas lossyWAV is going to be about 2x that bitrate for good results.

No, the encoder can use as much bitrate as it needs for the maximum allowed signal difference. For substituting lossless I would accept a difference of approximately -45 dB or lower, if it were efficient enough.

QUOTE (Nessuno @ Mar 7 2013, 23:54) *
We use psychoacoustic models exactly because we haven't an exact mathematical description of the auditory system, otherwise lossy compression would be deterministic and "just pure calculation" (well, more or less... anyway still more complex than sums and subtractions).

Once more: these psychoacoustic models do not guarantee you anything. They give you only approximate results and sometimes fail.

QUOTE (C.R.Helmrich @ Mar 8 2013, 01:51) *
And it makes perfect sense: if you wouldn't consider the input level, a quiet signal would sound worse after your coding than a loud but otherwise identical signal.

First, if I understand you correctly: turn the volume control to maximum and you will hear the noise... but nobody listens to music at such a volume. Also, I made a test: I encoded one sample to WavPack at 192 kbps (the lowest possible), and the track peak of the difference file was 0.077026. Then I decreased the volume by 40 dB and encoded again at 192 kbps. Do you think the track peak of the difference file was about the same 0.077026? No, it was 0.000854. Encoders know about such tricks, so we are safe here.
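
The two peak values from this test can be put on a dB scale. A small sketch, where `to_db` is a hypothetical helper and the peaks are the figures reported above:

```python
import math

def to_db(ratio):
    """Convert an amplitude ratio to decibels."""
    return 20 * math.log10(ratio)

peak_before = 0.077026  # difference peak at the original level (from the post)
peak_after = 0.000854   # difference peak after lowering the input by 40 dB

change_db = to_db(peak_after / peak_before)
print(round(change_db, 1))  # about -39.1
```

By this arithmetic the peak of the difference fell by about 39 dB when the input was lowered by 40 dB, so in this test the error level roughly followed the input level.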

This post has been edited by softrunner: Mar 9 2013, 03:11
db1989
post Mar 9 2013, 11:53
Post #3





Group: Super Moderator
Posts: 5275
Joined: 23-June 06
Member No.: 32180



In support of Nessuno's conclusions, as well as the juicy number quoted by greynol, we have this:
QUOTE (softrunner @ Mar 9 2013, 02:09) *
I do not know exactly how it will work, but I want to try it, because already established practice does not work well enough. All we do is play blindly with bitrates, believing that we have some quality there. And when we find one more killer sample, we realize that it was just a belief.
Yeah. OK.

I don't feel like trying to respond methodically to your, erm, points. What I will say is that (1) a one-size-fits-all approach is not going to work, regardless of how nice and easy it might sound and how much you like it for that reason*, and (2) a uniform level of noise throughout one stream does not necessarily mean a uniform level of non-audibility of the same noise.
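
Point (2) can be illustrated crudely with numbers. The sketch below assumes two made-up passages (sine tones chosen only for illustration) and the same absolute error level in both. Signal-to-noise ratio is used as a very rough stand-in, since real audibility depends on masking and not on SNR alone.

```python
import math

def rms(values):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(v * v for v in values) / len(values))

def snr_db(signal, noise):
    """Signal-to-noise ratio in dB."""
    return 20 * math.log10(rms(signal) / rms(noise))

n = 4410  # 0.1 s at 44.1 kHz
# The same error signal is assumed for both passages (fixed absolute level).
noise = [0.001 * math.sin(2 * math.pi * 3000 * i / 44100) for i in range(n)]
loud = [0.5 * math.sin(2 * math.pi * 440 * i / 44100) for i in range(n)]
quiet = [0.005 * math.sin(2 * math.pi * 440 * i / 44100) for i in range(n)]

snr_loud = snr_db(loud, noise)
snr_quiet = snr_db(quiet, noise)
print(round(snr_loud, 1), round(snr_quiet, 1))
```

The same absolute noise leaves the quiet passage with a 40 dB lower signal-to-noise ratio than the loud one.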

Again, if you're wondering why this hasn't been done despite apparently being so simple, you need to consider the very real possibility that it hasn't been done because it's too simple.

* And this sentiment takes us back to your previous ideas about VBR encoding, wherein you were also effectively demanding that people create an encoder that can guarantee transparency to everyone at a single setting. That wasn't viable, either.


