An idea of audio encode algorithm, based on maximum allowed volume of signals difference, WavPack hybrid mode test included
softrunner
post Mar 6 2013, 00:11
Post #1






Full topic title: "An idea of audio encode algorithm, based on maximum allowed volume of signals difference"

I recently discovered that the difference between the source and the encoded audio can be obtained easily by inverting the source audio and mixing it with the encoded one. That gave me an idea for an encoding algorithm: just try to keep the signal difference at or below a level defined by the user. Audio quality is then measured simply by the volume of the difference between the signals, and this difference is nothing but the distortion produced by the encoder.
The whole algorithm looks like this:
1. Take the maximum allowed volume of the signal difference from the user.
2. Make a copy of the source audio and invert it.
3. Split both the source and the inverted audio into frames of the same size.
4. Encode the first frame of the source audio, mix the result with the first frame of the inverted audio, and calculate the volume of the resulting difference.
5. If the volume of the difference is higher than the user allows, add some bitrate and repeat from step 4.
6. If the volume of the difference is not higher than the user allows, add the encoded frame to the final output.
7. Repeat steps 4-6 with the second, third, etc. frames until the end of the source file.
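
The loop described in the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the WavPack test from this post: `toy_encode` is a hypothetical stand-in for a real lossy encoder (plain uniform quantization, with the bit depth playing the role of "bitrate"), the "volume" of the difference is taken as RMS level in dB relative to full scale, and samples are floats in the range -1.0 to 1.0.

```python
import math

def rms_db(samples):
    """RMS level in dB relative to full scale (1.0); -inf for pure silence."""
    mean_sq = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_sq) if mean_sq > 0 else float("-inf")

def toy_encode(frame, bits):
    """Hypothetical stand-in for a lossy encoder: uniform quantization to `bits` bits."""
    scale = 2 ** (bits - 1)
    return [round(s * scale) / scale for s in frame]

def encode_with_error_cap(samples, frame_size, max_diff_db, min_bits=2, max_bits=24):
    """Per frame, raise the bit depth until the residual is at or below max_diff_db."""
    output, bits_used = [], []
    for start in range(0, len(samples), frame_size):
        frame = samples[start:start + frame_size]
        inverted = [-s for s in frame]          # inverted copy of the source frame
        for bits in range(min_bits, max_bits + 1):
            decoded = toy_encode(frame, bits)
            # Mixing the decoded frame with the inverted source gives the difference.
            diff = [d + i for d, i in zip(decoded, inverted)]
            if rms_db(diff) <= max_diff_db:
                break                           # quiet enough: accept this frame
        # If the cap was never met, the max_bits result is kept as a fallback.
        output.extend(decoded)
        bits_used.append(bits)
    return output, bits_used

# Demo: a 440 Hz tone at 8 kHz, four frames of 200 samples, residual capped at -60 dB.
tone = [0.5 * math.sin(2 * math.pi * 440 * n / 8000) for n in range(800)]
encoded, bits_per_frame = encode_with_error_cap(tone, 200, -60.0)
print("bits chosen per frame:", bits_per_frame)
```

The linear search over bit depths mirrors the "add some bitrate and repeat" step literally; a real implementation would depend on how the chosen encoder exposes its bitrate or quality setting.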

Of course, this algorithm is much slower than a direct encode, but it should definitely not be slower than video encoding (and people are ready to wait many hours while their videos are being encoded).

I tried to reproduce this algorithm manually in a test using WavPack hybrid mode as the encoder (the source audio sample was split into 11 parts of 1 second each), and it showed that 23.4 % of space/bitrate could be saved. Another important point is that the user is guaranteed not to get distortion at a volume level higher than he expects, so he can safely encode many files simultaneously without looking at the content. The user is freed both from unnecessary waste of bitrate and from uncontrolled distortion.

The only thing needed is for some audio developers to get interested in this idea and implement it as a computer program.

The whole set of files of the WavPack test I've made is here.

This post has been edited by softrunner: Mar 6 2013, 00:20
C.R.Helmrich
post Mar 6 2013, 21:21
Post #2






QUOTE (softrunner @ Mar 6 2013, 00:11) *
just try to keep the signal difference at or below a level defined by the user. Audio quality is then measured simply by the volume of the difference between the signals, and this difference is nothing but the distortion produced by the encoder.

The only thing needed is for some audio developers to get interested in this idea and implement it as a computer program.

If I understand you correctly, audio developers already implemented this idea four decades ago. The simplest case: take some high-word-length audio (e.g. a CD rip) and convert it to, say, 8-bit PCM. Your difference signal will always be at the same level, depending on the target word length. Slightly more elaborate cases: A-law or µ-law. There, the maximum allowed volume of the difference signal is also known.
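
The word-length case can be checked numerically. The following is a minimal sketch, assuming float samples in the range -1.0 to 1.0 and plain rounding without dither; `quantize` and `peak_error` are illustrative helper names, not part of any codec. With those assumptions, the peak of the difference signal is bounded by half a quantization step, i.e. 2^-bits of full scale.

```python
import math

def quantize(sample, bits):
    """Round a float sample in [-1.0, 1.0) to the nearest level of a signed `bits`-bit grid."""
    scale = 2 ** (bits - 1)
    q = round(sample * scale)
    q = max(-scale, min(scale - 1, q))   # clamp to the representable integer range
    return q / scale

def peak_error(samples, bits):
    """Largest absolute difference between the source and its quantized version."""
    return max(abs(quantize(s, bits) - s) for s in samples)

# Test tone kept below full scale so the clamp above never engages.
tone = [0.9 * math.sin(2 * math.pi * 440 * n / 44100) for n in range(44100)]

for bits in (8, 16):
    bound = 2.0 ** -bits                 # half of one quantization step
    err = peak_error(tone, bits)
    print(f"{bits}-bit: peak error {err:.3e}, bound {bound:.3e}")
```

The bound depends only on the target word length, not on the input, which is the sense in which the maximum level of the difference signal is known in advance.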

Chris


--------------------
If I don't reply to your reply, it means I agree with you.
softrunner
post Mar 7 2013, 16:59
Post #3






QUOTE (saratoga @ Mar 6 2013, 03:21) *
The problem with this approach is that audibility has very little to do with the absolute volume due to masking.

But who can guarantee that this masking will work and that the difference will not be audible on all input signals? The whole idea is not about audibility; it is about using the minimum bitrate for maximum mathematical closeness of the output audio to the input audio. Just pure calculations, which seem to be the only guarantee here.

QUOTE (greynol @ Mar 6 2013, 05:22) *
However (and IIRC), WavPack Lossy does not use a psychoacoustic model, so this might loosely apply.

At least, we should give it a try...

QUOTE (DVDdoug @ Mar 7 2013, 00:01) *
Delay one sound by a few milliseconds. This will make no difference in the sound, but when you subtract you will get a huge difference file

If an encoder shifts the audio along the timeline, the idea of this topic is simply not applicable to it.
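
The delay point quoted above is easy to reproduce. The sketch below is only an illustration with assumed values (a 1 kHz sine, a 44.1 kHz sample rate, a delay of 22 samples, which is about half a millisecond); none of these numbers come from the thread. It subtracts the original from the delayed copy and compares the RMS level of the residual with that of the signal itself.

```python
import math

SAMPLE_RATE = 44100    # Hz, assumed for the example
FREQ = 1000.0          # Hz, assumed test tone
DELAY_SAMPLES = 22     # about 0.5 ms at 44.1 kHz

def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def sine(n_samples, offset=0):
    """Sine tone at FREQ, optionally delayed by `offset` samples."""
    return [0.5 * math.sin(2 * math.pi * FREQ * (n - offset) / SAMPLE_RATE)
            for n in range(n_samples)]

original = sine(SAMPLE_RATE)                       # one second of audio
delayed = sine(SAMPLE_RATE, offset=DELAY_SAMPLES)  # same tone, slightly late

# "Invert and mix": the residual is the delayed copy minus the original.
residual = [d - o for d, o in zip(delayed, original)]

ratio_db = 20 * math.log10(rms(residual) / rms(original))
print(f"residual level relative to the signal: {ratio_db:+.2f} dB")
```

With these particular values the delay is close to half a period of the tone, so the residual comes out louder than the signal itself even though the two tones sound the same.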

QUOTE (C.R.Helmrich @ Mar 7 2013, 00:21) *
If I understand you correctly, audio developers already implemented this idea four decades ago.

Well, I do not see any software that takes the maximum allowed error level of the audio as an input parameter.
greynol
post Mar 7 2013, 18:05
Post #4






LossyWAV is commonly discussed here and I lamented not including it shortly after posting.

QUOTE (softrunner @ Mar 7 2013, 07:59) *
But who can guarantee that this masking will work and that the difference will not be audible on all input signals? The whole idea is not about audibility; it is about using the minimum bitrate for maximum mathematical closeness of the output audio to the input audio. Just pure calculations, which seem to be the only guarantee here.

Who can guarantee that this "maximum mathematical closeness" will work, especially when it makes no attempt to consider how the human auditory system functions? Also, please don't insult our intelligence by suggesting that we must try all possible input signals before rejecting the assertion that this idea will do better than already established practice built upon well established knowledge when you have not even offered any evidence supporting your concept.

If this isn't about audibility then I completely fail to see the point. Audio quality is one of the primary determinants in gauging the performance of a lossy encoder. Other worthwhile determinants will focus on performance/ease of coding/decoding related issues. Perhaps someone can make a case as to why "maximum mathematical closeness" affects either of these groups or if it may fall into a new and equally important group.

This post has been edited by greynol: Mar 7 2013, 20:13


--------------------
Your eyes cannot hear.


