Properly downmixing 5.1 to stereo
Hancoque
post Jun 9 2007, 01:24
Post #1





Group: Members
Posts: 291
Joined: 27-January 04
From: Germany
Member No.: 11530



I'm currently working on a plugin to downmix 5.1 to stereo. First, I thought about what an ideal speaker setup looks like. I came up with this image (omitting the subwoofer):



What I conclude from this image is that the stereo separation of the rear channels is stronger than the stereo separation of the front channels. This means I have to mix the front channels differently into the stereo channels than the rear channels.

There are two extreme points that a speaker can have. It can be located at 0° (like the center channel). In that case the channel should go equally to the left and right channel. Or it can be located at ±90°, which means that 100% of the channel goes either to the left or to the right.

The front channels are positioned 30° from the 0° point, so the calculation would be as follows:
Front: 30° / 90° * 50 + 50 = 67%
So 67% of the channel goes to the same side, while the rest (33%) goes to the other side.

The calculation for the rear channels is similar:
Rear: 70° / 90° * 50 + 50 = 89%
So 89% of the channel goes to the same side, while the rest (11%) goes to the other side.

But then I noticed that this would be suitable for headphones but not for speakers. So I decided to set 70° as the maximum and not 90°:
Front: 30° / 70° * 50 + 50 = 71% (other side: 29%)
Rear: 70° / 70° * 50 + 50 = 100% (other side: 0%)

This way I have the widest possible stereo separation while maintaining the separation ratio between front and rear. But I still feel that it's just a compromise and not an ideal solution.
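
The linear rule used in the calculations above (angle / maximum angle * 50 + 50) can be written out as a small Python sketch. The function name `same_side_share` and its parameters are made up for illustration and are not part of any plugin.

```python
def same_side_share(angle_deg, max_angle_deg=90.0):
    """Percentage of a channel sent to its own side under the linear rule
    angle / max_angle * 50 + 50 used in this post (hypothetical helper)."""
    if not 0.0 <= angle_deg <= max_angle_deg:
        raise ValueError("angle must lie between 0 and max_angle_deg")
    return angle_deg / max_angle_deg * 50.0 + 50.0


# Front (30 degrees) and rear (70 degrees) speakers, first with 90 degrees
# and then with 70 degrees as the maximum angle.
for label, angle in (("front", 30.0), ("rear", 70.0)):
    for max_angle in (90.0, 70.0):
        same = same_side_share(angle, max_angle)
        print(f"{label} at {angle:.0f} deg, max {max_angle:.0f} deg: "
              f"{same:.0f}% same side, {100.0 - same:.0f}% other side")
```

The percentages it prints are the rounded values quoted above (67/33, 89/11, 71/29 and 100/0).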

Then something else came to my mind. I noticed that most applications don't mix the center channel 50%/50% into the stereo channels but 71%/71% (-3.01dB = square root of 2, divided by 2). So, aren't two speakers with half the amplitude as loud as one speaker? If I should indeed use 71% instead of 50% I wonder how I have to apply this to the other channels.
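
For reference, the relation between the linear factors and the decibel figures mentioned here can be checked with a few lines of Python. This is only the standard amplitude-to-dB conversion; the helper name `gain_to_db` is made up for illustration.

```python
import math


def gain_to_db(gain):
    """Convert a linear amplitude factor to decibels (20 * log10)."""
    return 20.0 * math.log10(gain)


print(f"sqrt(2)/2 -> {gain_to_db(math.sqrt(2) / 2):.2f} dB")
print(f"0.5       -> {gain_to_db(0.5):.2f} dB")
```

So a factor of about 0.7071 corresponds to roughly -3.01 dB, and a factor of 0.5 to roughly -6.02 dB.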

This post has been edited by Hancoque: Jun 9 2007, 01:26
SebastianG
post Jun 11 2007, 12:24
Post #2





Group: Developer
Posts: 1317
Joined: 20-March 04
From: Göttingen (DE)
Member No.: 12875



QUOTE (Hancoque @ Jun 9 2007, 02:24) *
But then I noticed that this would be suitable for headphones but not for speakers.

It really depends on what you're trying to achieve.

For a good headphone experience you could simulate virtual sound sources (your 5 channels) using a simple model for head-related transfer functions. Search the web for HRTF model and/or check out this paper -- it was one of the first more promising Google results I got. Of course, since the locations of your sound sources are fixed you could simply calculate all impulse responses in an offline process beforehand or just download HRIR recordings. This for example looks interesting.

If you want a downmix for your home stereo I don't think you can do much better (*) than simply mixing the surround channels to the front channels like this:
Lt = FL + s*SL + c*C
Rt = FR + s*SR + c*C
where s (=surround mix) is usually something between 0.5 and 1
and c (=center mix) is usually 0.7. This would be a "normal" stereo downmix.

If you want a "pro logic" downmix you could use something like this:
Lt = FL + s*(SL+SR) + c*C
Rt = FR - s*(SL+SR) + c*C // 180° phase shift for SL+SR
with s=0.5 and c=0.7.

HRTF = head-related transfer function
HRIR = head-related impulse response
FL/FR = front left/right
SL/SR = surround left/right
C = center
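
The two sets of formulas above could be sketched in Python roughly as follows, working on one sample per channel. The function names and the `c_mix` parameter name are made up for illustration, the defaults are simply the values suggested in this post, and the 180° phase shift is modelled as a plain sign inversion, exactly as in the formula.

```python
def normal_downmix(fl, fr, c, sl, sr, s=0.5, c_mix=0.7):
    """Lt = FL + s*SL + c*C and Rt = FR + s*SR + c*C for one sample."""
    lt = fl + s * sl + c_mix * c
    rt = fr + s * sr + c_mix * c
    return lt, rt


def matrix_style_downmix(fl, fr, c, sl, sr, s=0.5, c_mix=0.7):
    """Surround sum added to the left and subtracted from the right.

    The subtraction is the sign inversion (180 degree shift) from the
    formula above; this is a sketch, not a certified matrix encoder.
    """
    surround = s * (sl + sr)
    lt = fl + surround + c_mix * c
    rt = fr - surround + c_mix * c
    return lt, rt


# A center-only sample lands equally in both outputs.
print(normal_downmix(0.0, 0.0, 1.0, 0.0, 0.0))
# A surround-only sample lands with opposite signs in the two outputs.
print(matrix_style_downmix(0.0, 0.0, 0.0, 1.0, 0.0))
```

Neither function protects against clipping; the sums can exceed full scale when several channels are loud at once.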

Cheers!
SG

edit: (*) There's this "Dolby Virtual Surround". I have no idea what it does. But it might actually work.

This post has been edited by SebastianG: Jun 11 2007, 12:54
Hancoque
post Jun 11 2007, 17:24
Post #3





Group: Members
Posts: 291
Joined: 27-January 04
From: Germany
Member No.: 11530



QUOTE (SebastianG @ Jun 11 2007, 13:24) *
If you want a downmix for your home stereo I don't think you can do much better than simply mixing the surround channels to the front channels like this:
Lt = FL + s*SL + c*C
Rt = FR + s*SR + c*C
where s (=surround mix) is usually something between 0.5 and 1
and c (=center mix) is usually 0.7. This would be a "normal" stereo downmix.

I read an article about stereo panning laws and now I know why it's 0.7(071...) and not 0.5 for the center. That also told me that my previous calculations are all wrong, because I have to apply the same panning law to the other channels, too. I calculated the new panning coefficients using the -3.01dB panning law. But the result is quite bad, because it makes the front channels nearly mono with only 33.3% left panning (79% to the left, 53% to the right and vice versa). So, I guess trying to preserve the original stereo separation ratios between the front and the rear channels doesn't really work out the way I wished.
Woodinville
post Jun 11 2007, 21:15
Post #4





Group: Members
Posts: 1402
Joined: 9-January 05
From: JJ's office.
Member No.: 18957



It seems to me that there is no direct formulaic way to do this.

If you want to simply make things sound good, you can use Lt = L +Ls +C/sqrt(2) and the converse.

If you are worried about dialog audibility you want to up the relative gain of C somewhat.

There is no "right" answer, really.


--------------------
-----
J. D. (jj) Johnston
robaer
post Jun 11 2007, 21:39
Post #5





Group: Members
Posts: 7
Joined: 31-December 06
Member No.: 39148



How is the downmix done on a standalone DVD player? And is it possible to do something similar in software?


--------------------
EAC (Secure Mode) / LAME 3.97 (-V 2) / fb2k / M-Audio 24/96
Woodinville
post Jun 11 2007, 23:15
Post #6





Group: Members
Posts: 1402
Joined: 9-January 05
From: JJ's office.
Member No.: 18957



QUOTE (robaer @ Jun 11 2007, 13:39) *
How is the downmix done on a standalone DVD player? And is it possible to do something similar in software?

The audio stream in question probably has mixdown coefficients contained somewhere in it. The processing shouldn't be hard once you know them.


robaer
post Jun 11 2007, 23:57
Post #7





Group: Members
Posts: 7
Joined: 31-December 06
Member No.: 39148



QUOTE (Woodinville @ Jun 11 2007, 23:15) *
QUOTE (robaer @ Jun 11 2007, 13:39) *

How is the downmix done on a standalone DVD player? And is it possible to do something similar in software?

The audio stream in question probably has mixdown coefficients contained somewhere in it. The processing shouldn't be hard once you know them.

So it's done differently on each disc then? I've often wondered if it's downmixed with Dolby Pro Logic in mind or just plain stereo when using line out instead of S/PDIF. Any info on this could be useful when doing your own downmixes.


mcbear
post Jun 12 2007, 08:45
Post #8





Group: Members
Posts: 50
Joined: 12-April 06
Member No.: 29435



QUOTE (robaer @ Jun 12 2007, 00:57) *
QUOTE (Woodinville @ Jun 11 2007, 23:15) *

QUOTE (robaer @ Jun 11 2007, 13:39) *

How is the downmix done on a standalone DVD player? And is it possible to do something similar in software?

The audio stream in question probably has mixdown coefficients contained somewhere in it. The processing shouldn't be hard once you know them.

So it's done differently on each disc then? I've often wondered if it's downmixed with Dolby Pro Logic in mind or just plain stereo when using line out instead of S/PDIF. Any info on this could be useful when doing your own downmixes.

May I direct you to atsc.org? Have a look at the A/52B standard there, which describes AC-3 and E-AC-3. Chapter 7.8.1 gives a good idea of what Dolby thinks about downmixing, so this is the way it is implemented in any DVD player...
Hancoque
post Jun 12 2007, 14:22
Post #9





Group: Members
Posts: 291
Joined: 27-January 04
From: Germany
Member No.: 11530



According to the AC-3 specification on atsc.org this is the way to downmix 5.1 to stereo:
QUOTE
Lo = 1.0 * L + clev * C + slev * Ls ;
Ro = 1.0 * R + clev * C + slev * Rs ;
clev (center level) and slev (surround level) are provided by the AC-3 file.

That brings up a new question for me: Until now I relied on the correctness of the AC-3 and DTS decoding plugins available for foobar2000. But these plugins don't do any downmixing. So, are clev and slev also used for decoding a 5.1 signal without downmixing to stereo? Because if they weren't, there would be no way to use these values for downmixing afterwards.
mcbear
post Jun 13 2007, 15:24
Post #10





Group: Members
Posts: 50
Joined: 12-April 06
Member No.: 29435



QUOTE (Hancoque @ Jun 12 2007, 15:22) *
According to the AC-3 specification on atsc.org this is the way to downmix 5.1 to stereo:
QUOTE
Lo = 1.0 * L + clev * C + slev * Ls ;
Ro = 1.0 * R + clev * C + slev * Rs ;
clev (center level) and slev (surround level) are provided by the AC-3 file.

That brings up a new question for me: Until now I relied on the correctness of the AC-3 and DTS decoding plugins available for foobar2000. But these plugins don't do any downmixing. So, are clev and slev also used for decoding a 5.1 signal without downmixing to stereo? Because if they weren't, there would be no way to use these values for downmixing afterwards.

According to the spec, no. If the number of input channels equals the number of output channels, the signal is routed directly. In that case you'd either have to extract clev and slev from the decoder, or use the "worst case" downmix equation, which is also given somewhere in the doc, I think. Hope I understood your question correctly btw...
SebastianG
post Jun 13 2007, 16:04
Post #11





Group: Developer
Posts: 1317
Joined: 20-March 04
From: Göttingen (DE)
Member No.: 12875



The decoder should do the downmixing since it has access to the undecoded stream, which may contain downmixing hints (clev and slev) as Woodinville and mcbear already noted. If you can only get 5.1 data and downmixing is up to you, you should go with the most commonly used clev and slev values. (clev=sqrt(0.5), slev=1?)

Cheers!
SG
Hancoque
post Jun 14 2007, 01:03
Post #12





Group: Members
Posts: 291
Joined: 27-January 04
From: Germany
Member No.: 11530



So, clev and slev are only used for downmixing? In that case I would finally be able to do a proper downmix "by the book", at least if it's AC-3.

But what about DTS? I only found a rear channel attenuation setting in the encoder options (screenshot). What I don't know is if that's only a flag (that decoders must take into account) or if the rear channels are preprocessed so that it doesn't matter to the decoder anymore. And what are the "global" downmix rules for DTS if there are none embedded into each file?
mcbear
post Jun 14 2007, 08:53
Post #13





Group: Members
Posts: 50
Joined: 12-April 06
Member No.: 29435



QUOTE (Hancoque @ Jun 14 2007, 02:03) *
So, clev and slev are only used for downmixing? In that case I would finally be able to do a proper downmix "by the book", at least if it's AC-3.

Correct, according to the spec!

QUOTE (Hancoque @ Jun 14 2007, 02:03) *
But what about DTS? I only found a rear channel attenuation setting in the encoder options (screenshot). What I don't know is if that's only a flag (that decoders must take into account) or if the rear channels are preprocessed so that it doesn't matter to the decoder anymore. And what are the "global" downmix rules for DTS if there are none embedded into each file?

DTS isn't as open as Dolby about this, i.e. to my knowledge you won't find any detailed information like that in the ATSC documents. So posting any such information here would probably bring the poster some trouble :-)
Depending on what you want to realize, I would go with the general approach, i.e. assume you'll get 5.1 channels of PCM and implement a downmix that prevents overload under worst-case conditions. That may lead to some loss, but it will work.
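
A minimal sketch of such a worst-case downmix in Python, assuming the Lo/Ro formula quoted earlier in the thread and one sample per channel. The function name `safe_downmix` and the default values for clev and slev (both 0.7071) are assumptions for illustration, not values taken from any DTS document.

```python
def safe_downmix(l, r, c, ls, rs, clev=0.7071, slev=0.7071):
    """Lo/Ro downmix of one sample per channel with worst-case scaling.

    If every input stays within [-1, 1], the unscaled sum can reach
    1 + clev + slev, so dividing by that value keeps the output within
    [-1, 1] even when all channels hit full scale together.
    """
    scale = 1.0 / (1.0 + clev + slev)
    lo = scale * (l + clev * c + slev * ls)
    ro = scale * (r + clev * c + slev * rs)
    return lo, ro


# All channels at full scale: the outputs just reach full scale.
print(safe_downmix(1.0, 1.0, 1.0, 1.0, 1.0))
# Front left only: the price is a quieter result (scaled by about 0.414).
print(safe_downmix(1.0, 0.0, 0.0, 0.0, 0.0))
```

The "loss" is visible in the second call: ordinary material ends up noticeably quieter than it would with an unscaled downmix.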
Hancoque
post Jun 15 2007, 01:13
Post #14





Group: Members
Posts: 291
Joined: 27-January 04
From: Germany
Member No.: 11530



I've evaluated a DTS encoder and found out that the -3dB rear channel attenuation is in fact "hardcoded" into the stream. Therefore no additional attenuation of the rear channels should be necessary for a downmix.

I think it's sane to expect that the center channel is attenuated by 3 dB. So the DTS downmixing formula should be as follows:
CODE
Lo = 1.0 * L + 0.7071 * C + 1.0 * Ls;
Ro = 1.0 * R + 0.7071 * C + 1.0 * Rs;

I derive the downmix factor of 1.0 for the rear channels from the fact that the rear speakers have the same distance to the listener as the front speakers and thus should be equally loud (ignoring the facing of the earlobes). But on the other hand the default value for AC-3 is -3dB. That's why I'm still not absolutely sure what to use.

This post has been edited by Hancoque: Jun 15 2007, 02:16
SebastianG
post Jun 15 2007, 18:23
Post #15





Group: Developer
Posts: 1317
Joined: 20-March 04
From: Göttingen (DE)
Member No.: 12875



QUOTE (mcbear @ Jun 14 2007, 09:53) *
DTS isn't as open as Dolby about this, i.e. to my knowledge you won't find any detailed information like that in the ATSC documents. So posting any such information here would probably bring the poster some trouble :-)

You can get those docs legally without paying bucks. Unfortunately I don't remember the website. I registered somewhere and was allowed to download one document per day for free -- including the DTS specification.

Cheers!
SG
Hancoque
post Jun 15 2007, 19:24
Post #16





Group: Members
Posts: 291
Joined: 27-January 04
From: Germany
Member No.: 11530



Technical documentation for DTS is publicly available here.

Two passages caught my attention. Chapter 3.1.11 ("Stereo Down Mix") states that dynamic 2-channel downmixing coefficients can be embedded into the stream. Chapters 7.1.10 ("Embedded down mix flag") and 7.3.10 ("Stereo down mix coefficients") seem to specify that a bit further. It is strange however that I didn't find any options concerning this in the SurCode encoder and the screenshot of the official encoder. So maybe this was never used in any recording.

This post has been edited by Hancoque: Jun 15 2007, 19:43
SebastianG
post Jun 15 2007, 19:53
Post #17





Group: Developer
Posts: 1317
Joined: 20-March 04
From: Göttingen (DE)
Member No.: 12875



This is "only" the white paper, btw.
...looks quite comprehensive. But I'm positive I got the real spec for free.
IIRC it was from ETSI.org / ETSI, free standards download page.

Cheers!
SG

This post has been edited by SebastianG: Jun 15 2007, 20:00
mcbear
post Jun 19 2007, 11:26
Post #18





Group: Members
Posts: 50
Joined: 12-April 06
Member No.: 29435



QUOTE (Hancoque @ Jun 15 2007, 20:24) *
Technical documentation for DTS is publicly available here.

Two passages caught my attention. Chapter 3.1.11 ("Stereo Down Mix") states that dynamic 2-channel downmixing coefficients can be embedded into the stream. Chapters 7.1.10 ("Embedded down mix flag") and 7.3.10 ("Stereo down mix coefficients") seem to specify that a bit further. It is strange however that I didn't find any options concerning this in the SurCode encoder and the screenshot of the official encoder. So maybe this was never used in any recording.

Thanks for the link(s)... at least there is now something open to the public that can be referred to.
In any case, using the "fail safe" coefficients seems advisable, since obviously you can't rely on the downmix coefficients being embedded in the stream, and you'd need the means to extract/access them.
Hancoque
post May 2 2008, 15:28
Post #19





Group: Members
Posts: 291
Joined: 27-January 04
From: Germany
Member No.: 11530



Let me revive this thread. Looking for some embedded downmix coefficients isn't really the way to go for me as not all surround formats provide this information. AC-3 has it, but DTS doesn't (at least it's not available in the programs I know). DVD-Audio might provide it but only two of my seven discs actually have it. Then I thought again about the best general formula and came to this:
CODE
L = Lf + C/2 + Ls
R = Rf + C/2 + Rs

I know that this looks quite different from what is most often used. But let me explain how I came to these values.

I. Center Channel
I was especially concerned about the center level, as an attenuation of 3 dB rather than 6 dB seems to be the general rule of thumb. My goal was to achieve a phantom center channel that is as loud as the original dedicated center channel. I imagined: What if I split the speaker in half (halving the amplitude of each side, i.e. attenuating by 6 dB) and shoved one half next to the left speaker and the other half next to the right one? In my theory this would not change the overall volume because both halved amplitudes would combine to the full amplitude when they reach the listener (0.5 + 0.5 = 1).

But I still wasn't sure if two speakers (each with half the amplitude) really output the same energy as one speaker (with the full amplitude). So I conducted a test. I placed my two stereo speakers directly next to each other, placed a microphone in front of them, calibrated both sides so that the microphone picked up the same amplitude from left and right, and then played back three test samples while measuring the amplitude the microphone registered.

Test sample 1: A sound (sine wave of 500 Hz) of full amplitude on one side and the other side being silent (factor: 1.0).

Test sample 2: The same sound distributed to both sides with an attenuation of 3 dB (factor: 0.707).

Test sample 3: The same sound distributed to both sides with an attenuation of 6 dB (factor: 0.5).

The conclusion is that my theory is correct in that sample 3 reproduces the same amplitude as sample 1. Sample 2 has a higher amplitude (by about 3 dB).

So, I think that if one wants to downmix 5.1 to stereo and wants to keep the same volume for the center channel one has to use a center channel attenuation of 6 dB and not 3 dB.
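
The arithmetic behind the three test samples can be sketched in Python. The first helper models the measured situation, where both speakers play the identical signal and the amplitudes add in phase at the microphone. The second helper is not something measured here; it is included only as an assumption about the usual constant-power reasoning behind the 3 dB rule, where the two contributions are taken to add in power. Both function names are made up for illustration.

```python
import math


def coherent_sum_db(gain_per_speaker, speakers=2):
    """Level relative to one full-amplitude speaker when identical
    signals add in phase (amplitudes add)."""
    return 20.0 * math.log10(speakers * gain_per_speaker)


def power_sum_db(gain_per_speaker, speakers=2):
    """Level relative to one full-amplitude speaker when the
    contributions add in power instead (assumption, not measured)."""
    return 10.0 * math.log10(speakers * gain_per_speaker ** 2)


for name, gain in (("sample 2 (3 dB)", math.sqrt(0.5)), ("sample 3 (6 dB)", 0.5)):
    print(f"{name}: in-phase sum {coherent_sum_db(gain):+.2f} dB, "
          f"power sum {power_sum_db(gain):+.2f} dB")
```

With in-phase addition, the 6 dB split comes out level with the single speaker and the 3 dB split about 3 dB louder, which matches the measurement described above.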

II. Surround Channels
Now, why no attenuation of the surround channels? I think if you want to preserve the relative volume of the original channels you should not attenuate the surround channels. In a 5.1 setup all speakers (ignoring the LFE channel) are equidistant from the listener. So why should one channel that is at the same distance from the listener as another channel be attenuated while the other one isn't? And I don't think that the human head attenuates signals coming from behind by 3 to 6 dB compared to signals coming from the front (correct me if I'm wrong). So, attenuating the surround channels cannot be called "most accurate reproduction" but can only be seen as "adjusting it to one's tastes". Or am I wrong?