libebur128 - (yet another) EBU R 128 implementation
Raiden
post Jan 12 2011, 17:08
Post #1





Group: Developer
Posts: 224
Joined: 14-September 04
Member No.: 17002



Hi,
inspired by the wonderful work in this thread, I wrote my own implementation of the EBU R 128 standard.
It is written in plain ANSI C and designed as a library, so you can use it in your own code. It's licensed under the MIT license.

I've also implemented a simple scanning tool, which outputs something like this:
CODE
$ ./r128-sndfile -l -p both ~/music/bad\ loop\ -\ Luo/*.flac

-12.81 LUFS, LRA: 14.16 LU, sample peak: 0.89151001, true peak: 0.99826229, /home/jan/music/bad loop - Luo/bad loop - Luo - 01 Nio.flac
-11.15 LUFS, LRA: 8.26 LU, sample peak: 0.89163208, true peak: 0.99095666, /home/jan/music/bad loop - Luo/bad loop - Luo - 02 Eri Valeire.flac
-10.14 LUFS, LRA: 11.79 LU, sample peak: 0.89154053, true peak: 0.99171823, /home/jan/music/bad loop - Luo/bad loop - Luo - 03 Kauniit Ihmiset.flac
-11.31 LUFS, LRA: 11.75 LU, sample peak: 0.89157104, true peak: 0.92898595, /home/jan/music/bad loop - Luo/bad loop - Luo - 04 Mmin.flac
-26.13 LUFS, LRA: 14.87 LU, sample peak: 0.25204468, true peak: 0.25203928, /home/jan/music/bad loop - Luo/bad loop - Luo - 05 3b Or T.flac
-14.10 LUFS, LRA: 11.40 LU, sample peak: 0.89151001, true peak: 1.02603507, /home/jan/music/bad loop - Luo/bad loop - Luo - 06 Kannas Nsp.flac
--------------------------------------------------------------------------------
-11.75 LUFS, LRA: 13.34 LU, sample peak: 0.89163208, true peak: 1.02603507

There is also ReplayGain tagging, using a reference level of -18 LUFS to match RG's loudness:
CODE
r128-mpg123 -t album [FILE|DIRECTORY]...
r128-mpg123 -t track [FILE|DIRECTORY]...
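As a rough illustration of the -18 LUFS reference level, a ReplayGain-style gain can be taken as the difference between the reference and the measured loudness. This is only a sketch: the function name is hypothetical, the loudness values are copied from the scan output above, and it is not the tool's actual tagging code.

```python
REFERENCE_LUFS = -18.0  # reference level mentioned in the post


def replaygain_db(measured_lufs, reference_lufs=REFERENCE_LUFS):
    """Gain in dB that would bring the measured loudness to the reference."""
    return reference_lufs - measured_lufs


if __name__ == "__main__":
    # Album loudness from the scan output above: -11.75 LUFS
    print(f"album gain: {replaygain_db(-11.75):+.2f} dB")
    # Quietest track in the scan output above: -26.13 LUFS
    print(f"track gain: {replaygain_db(-26.13):+.2f} dB")
```

A loud album therefore gets a negative gain and a quiet track a positive one.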

Download current version here.

This post has been edited by Raiden: Feb 27 2011, 22:29
googlebot
post Jan 12 2011, 18:01
Post #2





Group: Members
Posts: 698
Joined: 6-March 10
Member No.: 78779



While I'm really a big proponent of modern coding principles, it's nice to see another example of fine, bare-bones ANSI C, which surely still has its place in the embedded market.

The code looks quite decent, great job!

Could you also publish your results for the R128 test samples?

This post has been edited by googlebot: Jan 12 2011, 18:01
googlebot
post Jan 12 2011, 18:58
Post #3

IMHO, you should both think about contributing your work directly to SoX.
mudlord
post Jan 12 2011, 20:14
Post #4





Group: Developer (Donating)
Posts: 813
Joined: 1-December 07
Member No.: 49165



Nice nice work.

QUOTE
IMHO, you should both think about contributing your work directly to SoX.


Why, because then it can go LGPL? >_>
Raiden
post Jan 12 2011, 20:14
Post #5

QUOTE (googlebot @ Jan 12 2011, 18:01) *
The code looks quite decent, great job!

Thanks! :)

QUOTE (googlebot @ Jan 12 2011, 18:01) *
Could you also publish your results for the R128 test samples?

CODE
seq-3341-1-16bit.wav: -23.0 LUFS
seq-3341-2-16bit.wav: -33.0 LUFS
seq-3341-3-16bit.wav: -23.0 LUFS
seq-3341-4-16bit.wav: -23.0 LUFS
seq-3341-5-16bit.wav: -22.9 LUFS
seq-3341-6-5channels-16bit.wav: -23.0 LUFS
seq-3341-6-6channels-WAVEEX-16bit.wav: -23.0 LUFS
seq-3341-7_seq-3342-5-24bit.wav: -23.0 LUFS
seq-3341-8_seq-3342-6-24bit.wav: -23.0 LUFS
googlebot
post Jan 12 2011, 20:28
Post #6

Great! Even gets the 6-channel case right in contrast to the current version of R128GAIN.
mudlord
post Jan 12 2011, 21:49
Post #7

Hmmm, shouldn't ebur128_write_frames be ebur128_read_frames, if it's the main sample reading function?

Just thought it would be a bit more descriptive.
Raiden
post Jan 12 2011, 21:52
Post #8

QUOTE (googlebot @ Jan 12 2011, 20:28) *
Great! Even gets the 6-channel case right in contrast to the current version of R128GAIN.

Yes, the channel map is not hard coded into the library, but can be set with the function ebur128_set_channel_map. The default is: left, right, center, unused, left surround, right surround. If the scanner finds a file that has five channels, it sets the correct channel map.
Also, if libsndfile finds a channel map in the file (some WAV files embed one), it uses that one.
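To see why the channel map matters, here is a minimal sketch of how per-channel energies are combined into one loudness value. The weights (1.0 for the front channels, 1.41 for the surrounds, nothing for the unused slot) and the -0.691 offset are taken from ITU-R BS.1770; the names and the dictionary layout are hypothetical and not the library's API, and the mean squares are assumed to come from the already K-weighted signal.

```python
import math

# Per-channel weights from ITU-R BS.1770: front channels 1.0,
# surround channels 1.41 (about +1.5 dB); "unused" (e.g. LFE) is skipped.
WEIGHTS = {"left": 1.0, "right": 1.0, "center": 1.0,
           "left_surround": 1.41, "right_surround": 1.41, "unused": 0.0}

# Default 6-channel order described in the post.
DEFAULT_MAP = ["left", "right", "center", "unused",
               "left_surround", "right_surround"]
# Hypothetical map for a 5-channel file without an LFE channel.
FIVE_CHANNEL_MAP = ["left", "right", "center",
                    "left_surround", "right_surround"]


def loudness_lufs(mean_squares, channel_map):
    """Loudness from per-channel mean squares of the K-weighted signal."""
    if len(mean_squares) != len(channel_map):
        raise ValueError("one mean-square value per channel is required")
    total = sum(WEIGHTS[name] * ms
                for name, ms in zip(channel_map, mean_squares))
    if total <= 0.0:
        return float("-inf")  # digital silence
    return -0.691 + 10.0 * math.log10(total)
```

Applying the first five entries of the default map to a 5-channel file would treat the left surround as unused and the right surround as left surround, which is why such a file needs its own map.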
Raiden
post Jan 12 2011, 22:12
Post #9

QUOTE (mudlord @ Jan 12 2011, 21:49) *
Hmmm, shouldn't ebur128_write_frames be ebur128_read_frames, if it's the main sample reading function?

Just thought it would be a bit more descriptive.

I guess you are right. I had thought of it as "write frames to the library", but "read" does indeed sound more intuitive. How do other libraries name those functions?
benski
post Jan 12 2011, 22:17
Post #10


Winamp Developer


Group: Developer
Posts: 670
Joined: 17-July 05
From: Brooklyn, NY
Member No.: 23375



Is there a good mapping between LUFS and ReplayGain values?

edit: never mind, found the discussion in the other thread.

This post has been edited by benski: Jan 12 2011, 23:00
googlebot
post Jan 12 2011, 23:07
Post #11

QUOTE (Raiden @ Jan 12 2011, 22:12) *
How do other libraries name those functions?


From a high-level point of view 'write' is more appropriate.

CODE
int ebur128_write_frames(ebur128_state* st, const double* src, size_t frames)


But you wrote it as low-level as it can get (not even array notation): 'I pass you the address of a double named src, read frames double values from memory starting from src'. So, in this case, I agree with mudlord.

Personally, I would use the term 'add'. It's compatible with both a high-level and a low-level point of view.

This post has been edited by googlebot: Jan 12 2011, 23:12
pbelkner
post Jan 13 2011, 09:38
Post #12





Group: Members
Posts: 412
Joined: 13-June 10
Member No.: 81467



First of all congratulations, Raiden, for the great solution!

QUOTE (googlebot @ Jan 12 2011, 19:58) *
IMHO, you should both think about contributing your work directly to SoX.

Maybe part of R128GAIN can become part of SoX because the main part of the R128 algorithm is already implemented as a SoX effect.

On the other hand, it is not obvious to me how calculating the album gain would fit into the SoX command-line syntax. Maybe someone has a good idea.
pbelkner
post Jan 13 2011, 09:45
Post #13

QUOTE (mudlord @ Jan 12 2011, 21:14) *
Why, because then it can go LGPL? >_>

As you've probably already noticed, most of R128GAIN has (for technical reasons) been organized as a library called "libr128.a" from the very first day. Unfortunately, the library's API is far from stable. The focus is on the command line tool anyway.

If things become stable some day, I will think about the licence again. Currently it's not the first priority.
pbelkner
post Jan 13 2011, 09:51
Post #14

QUOTE (Raiden @ Jan 12 2011, 22:52) *
QUOTE (googlebot @ Jan 12 2011, 20:28) *
Great! Even gets the 6-channel case right in contrast to the current version of R128GAIN.

Yes, the channel map is not hard coded into the library, but can be set with the function ebur128_set_channel_map. The default is: left, right, center, unused, left surround, right surround. If the scanner finds a file that has five channels, it sets the correct channel map.
Also, if libsndfile finds a channel map in the file (some WAV files embed one), it uses that one.

I suspect that's a consequence of being based on libsndfile. R128GAIN is based on SoX and potentially on FFmpeg. Up to now I couldn't figure out how to get the channel information from SoX and more importantly from FFmpeg. Maybe someone can help out?

The advantage of being based on FFmpeg is that virtually any existing format and codec can be processed.
googlebot
post Jan 13 2011, 11:32
Post #15

QUOTE (pbelkner @ Jan 13 2011, 09:38) *
On the other hand, it is not obvious to me how calculating the album gain would fit into the SoX command-line syntax. Maybe someone has a good idea.


I wouldn't care about the context of an album from the angle of SoX. SoX would just report the R128 loudness and peaks for a given range of audio data. If SoX really doesn't interpret channel mapping information, it probably won't work, or it would require too much structural change for a simple, additional effect.
Raiden
post Jan 14 2011, 10:31
Post #16

I've just uploaded version 0.1.2:
- fixed a rare bug where "ebur128_gated_loudness" returned NaN and not -inf.
- rename main sample read function to "ebur128_add_frames".
- add sample read functions for short, int and float in addition to double.
- add FFmpeg scanner (that one needs C99).
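The extra read functions for integer input presumably differ only in how the samples are scaled before analysis. A minimal sketch of that idea, assuming signed integer samples are mapped to the range [-1.0, 1.0) by dividing by the magnitude of the most negative value; both the function name and the scaling convention are assumptions for illustration, not taken from the library:

```python
def int_samples_to_float(samples, bits=16):
    """Scale signed integer PCM samples to floats in [-1.0, 1.0)."""
    full_scale = float(1 << (bits - 1))  # 32768 for 16-bit audio
    return [s / full_scale for s in samples]
```

With this convention the most negative 16-bit sample, -32768, maps to exactly -1.0.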

I won't distribute the FFmpeg DLLs because of patent issues. You'll need avcodec-52.dll, avcore-0.dll, avformat-52.dll and avutil-50.dll.
On Linux I recommend building it yourself. If there are any problems with the CMake build system, let me know...

Source (tar.gz)
Source (zip)
Win32 build (zip)

I can't seem to edit the first post. Help?
Raiden
post Jan 14 2011, 10:40
Post #17

QUOTE (pbelkner @ Jan 13 2011, 09:51) *
Up to now I couldn't figure out how to get the channel information from SoX and more importantly from FFmpeg. Maybe someone can help out?

I've played around with FFmpeg, and the channel map is stored in the int64_t field "channel_layout" of the codec context. Each bit corresponds to one channel. The mapping is defined in avcodec.h with macros like CH_FRONT_LEFT or CH_FRONT_RIGHT. The channels in the file will always appear in the same order as the corresponding bits in channel_layout, so the left channel will always come before the right channel and so on. If there is no channel map in the file, channel_layout will be set to 0.
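The bit-to-channel rule can be sketched as follows. The bit values are assumptions based on FFmpeg's channel layout macros of that era and should be checked against your own avcodec.h; the names and helper functions are hypothetical, and only a subset of the possible bits is listed.

```python
# Bit values as assumed from FFmpeg's channel layout macros
# (CH_FRONT_LEFT and friends); verify them against your avcodec.h.
CHANNEL_BITS = [
    (0x001, "front_left"),
    (0x002, "front_right"),
    (0x004, "front_center"),
    (0x008, "low_frequency"),
    (0x010, "back_left"),
    (0x020, "back_right"),
    (0x200, "side_left"),
    (0x400, "side_right"),
]


def channels_in_file_order(channel_layout):
    """List the channels present, in ascending bit order (= file order)."""
    return [name for bit, name in CHANNEL_BITS if channel_layout & bit]


def describe(channel_layout, channel_count):
    """Return channel names, or None if the file carries no channel map."""
    if channel_layout == 0:
        return None  # the caller has to guess from channel_count
    names = channels_in_file_order(channel_layout)
    if len(names) != channel_count:
        raise ValueError("layout uses bits this sketch does not know")
    return names
```

A layout of 0x3F would then describe a 6-channel file in the order front left, front right, center, LFE, back left, back right, while a layout of 0 leaves the guessing to the caller.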
pbelkner
post Jan 14 2011, 10:46
Post #18

QUOTE (Raiden @ Jan 14 2011, 11:40) *
QUOTE (pbelkner @ Jan 13 2011, 09:51) *
Up to now I couldn't figure out how to get the channel information from SoX and more importantly from FFmpeg. Maybe someone can help out?

I've played around with FFmpeg, and the channel map is stored in the int64_t field "channel_layout" of the codec context. Each bit corresponds to one channel. The mapping is defined in avcodec.h with macros like CH_FRONT_LEFT or CH_FRONT_RIGHT. The channels in the file will always appear in the same order as the corresponding bits in channel_layout, so the left channel will always come before the right channel and so on. If there is no channel map in the file, channel_layout will be set to 0.

Thanks a lot. I've already tried this with the 6-channel sample from the R128 test vectors; unfortunately, it's set to zero. Maybe I've done something wrong.
Raiden
post Jan 14 2011, 11:01
Post #19

QUOTE (pbelkner @ Jan 14 2011, 10:46) *
Thanks a lot. I've already tried this with the 6-channel sample from the R128 test vectors; unfortunately, it's set to zero.

You are right, none of the R128 test files has a channel map... I currently check manually for files that have no channel map but 5 channels.
Raiden
post Jan 16 2011, 00:51
Post #20

0.1.3 is up!
- Added tagging support. You need Python and Mutagen (a tagging library written in Python) for that. I've written a little script that writes ReplayGain style tags to OGG, MP3 and FLAC. Just run r128-ffmpeg with this script as argument, like this:
CODE
r128-ffmpeg -t rgtag.py FILENAME(S) ...
...and it will tag your files as an album. Currently only gain is supported; peak is always set to 1. The reference level is -18 LUFS (5 LU louder than the EBU R 128 standard says) to approximate RG's reference level.
- Improved accuracy of the filter coefficients.

Source (tar.gz)
Source (zip)
Win32 build (zip)

Please let me know if it works! Hint: To install Mutagen under Windows, run "C:\Python27\python.exe setup.py install" in the Mutagen dir.

This post has been edited by Raiden: Jan 16 2011, 00:51
C.R.Helmrich
post Jan 16 2011, 14:28
Post #21





Group: Developer
Posts: 688
Joined: 6-December 08
From: Erlangen Germany
Member No.: 64012



I gave this a try yesterday (r128-sndfile, version 0.1.2) and today (0.1.3) and looked at the code, and I can say you've done a beautiful job! It's especially nice to see that
  • you fully implemented my optimization proposed in the R128gain thread :)
  • you do not resample. The absence of any resampling makes your version run about 60 times faster than R128gain on my single-core Athlon. Still, the results for 44.1-kHz audio never differ by more than 0.1 dB from those obtained by R128gain, which resamples every input to 48 kHz.

If you want to have my opinion: I'm pretty sure the true peak values can be estimated quite accurately without resampling, so please don't consider adding resampling in the future.

Now two questions: I noticed you changed the algorithm for the calculation of the filter coefficients. Could you provide a reference where you got the new algorithm from? And if I read your code correctly, you store every block energy in your linked list, even the ones which are quieter than -70 LUFS (which you don't need later). I assume that's a design limitation (buffering, album gain handling, and stuff)?
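For readers unfamiliar with the gating being discussed, here is a minimal sketch of two-stage gating over a list of stored block energies. It assumes the energies are weighted mean squares, so that loudness is -0.691 + 10*log10(energy). The relative gate offset is a parameter because it differs between revisions: -10 LU is what later revisions of BS.1770 use, while the 2010 EBU documents used -8 LU, as far as I know. The function names are hypothetical and this is not the library's code.

```python
import math


def energy_to_lufs(energy):
    """Convert a block energy (weighted mean square) to LUFS."""
    if energy <= 0.0:
        return float("-inf")  # avoid log10(0)
    return -0.691 + 10.0 * math.log10(energy)


def gated_loudness(block_energies, relative_gate_lu=-10.0):
    """Two-stage gating over stored block energies.

    relative_gate_lu is an assumption, see the note above the code.
    """
    # Stage 1: absolute gate, drop blocks quieter than -70 LUFS.
    loud = [e for e in block_energies if energy_to_lufs(e) > -70.0]
    if not loud:
        return float("-inf")
    # Stage 2: relative gate, measured from the mean of the surviving blocks.
    threshold = energy_to_lufs(sum(loud) / len(loud)) + relative_gate_lu
    kept = [e for e in loud if energy_to_lufs(e) > threshold]
    if not kept:
        return float("-inf")
    return energy_to_lufs(sum(kept) / len(kept))
```

In this sketch the blocks below -70 LUFS never influence the result, which is the point of the question above: they could be discarded as soon as they are measured.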

By the way, on the 4 CDs I tested so far, versions 0.1.2 and 0.1.3 give exactly the same results.

Chris


--------------------
If I don't reply to your reply, it means I agree with you.
pbelkner
post Jan 16 2011, 15:07
Post #22

QUOTE (C.R.Helmrich @ Jan 16 2011, 15:28) *
If you want to have my opinion: I'm pretty sure the true peak values can be estimated quite accurately without resampling, so please don't consider adding resampling in the future.

QUOTE (pbelkner @ Jan 6 2011, 01:46) *
R-REC-BS.1770-1-200709-I!!PDF-E.pdf states regarding true peak determination:

QUOTE
  1. Attenuate: 12.04 dB attenuation
  2. 4× over-sampling
  3. Emphasis: Pre-emphasis shelving filter, zero at 14.1 kHz, pole at 20 kHz (optional)
  4. DC block (optional)
  5. Absolute: Absolute value
  6. Max: Highest value detection (optional, included if DC block is included).
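The core of the steps quoted above (over-sample, take the absolute value, keep the maximum) can be sketched in a few lines. This is an illustration only: it uses a generic Hann-windowed sinc interpolator rather than the filter from the recommendation, it leaves out the optional emphasis and DC-block stages, and it skips the 12.04 dB attenuation, which the recommendation describes as headroom for integer arithmetic. The function name and the parameters are hypothetical.

```python
import math


def true_peak_estimate(samples, factor=4, half_width=16):
    """Estimate the inter-sample peak with a Hann-windowed sinc interpolator."""
    # Start from the plain sample peak, so short inputs still give a result.
    peak = max((abs(s) for s in samples), default=0.0)
    n = len(samples)
    # Skip the edges, where the interpolation kernel would be truncated.
    for i in range(half_width - 1, n - half_width):
        for phase in range(1, factor):
            t = i + phase / factor  # position between samples i and i + 1
            acc = 0.0
            for k in range(i - half_width + 1, i + half_width + 1):
                x = t - k  # never zero, because t is not an integer
                window = 0.5 * (1.0 + math.cos(math.pi * x / half_width))
                acc += samples[k] * window * math.sin(math.pi * x) / (math.pi * x)
            peak = max(peak, abs(acc))
    return peak


if __name__ == "__main__":
    # A sine at a quarter of the sampling rate, sampled 45 degrees off
    # its maxima: every sample has magnitude of about 0.707.
    sine = [math.sin(math.pi / 4 + math.pi / 2 * n) for n in range(128)]
    print("sample peak:", max(abs(s) for s in sine))
    print("true peak estimate:", true_peak_estimate(sine))
```

The triple loop also makes the cost visible: every input sample triggers (factor - 1) * 2 * half_width multiply-adds, which is the kind of overhead the speed comparison in this thread is about.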
C.R.Helmrich
post Jan 16 2011, 16:13
Post #23

QUOTE (C.R.Helmrich @ Jan 16 2011, 15:28) *
If you want to have my opinion: I'm pretty sure the true peak values can be estimated quite accurately without resampling, so please don't consider adding resampling in the future.

Like I said, it's my opinion. I'm perfectly fine with trading some true peak accuracy for a 60x speed increase.

Chris
pbelkner
post Jan 16 2011, 16:46
Post #24

QUOTE (C.R.Helmrich @ Jan 16 2011, 17:13) *
I'm perfectly fine with trading some true peak accuracy for a 60x speed increase.

I agree. R128GAIN will probably offer a "--fast" switch which at least drops the up-sampling to 192 kHz, which is probably the biggest performance killer.

On the other hand, I'm also interested in Raiden's method for calculating the filter coefficients for various sample rates on the fly. I've searched the whole net several times and asked the question in the forum, but unfortunately there's no answer yet.

QUOTE (pbelkner @ Jan 6 2011, 01:46) *
R-REC-BS.1770-1-200709-I!!PDF-E.pdf states:

QUOTE
These filter coefficients are for a sampling rate of 48 kHz. Implementations at other sampling rates will require different coefficient values, which should be chosen to provide the same frequency response that the specified filter provides at 48 kHz. The values of these coefficients may need to be quantized due to the internal precision of the available hardware. Tests have shown that the performance of the algorithm is not sensitive to small variations in these coefficients.

It is not obvious to me how to quantize the given coefficients with respect to other sample frequencies, hence I decided to re-sample to 48 kHz.

Raiden
post Jan 16 2011, 17:47
Post #25

QUOTE (C.R.Helmrich @ Jan 16 2011, 14:28) *
I noticed you changed the algorithm for the calculation of the filter coefficients. Could you provide a reference where you got the new algorithm from?

QUOTE (pbelkner @ Jan 16 2011, 16:46) *
On the other hand, I'm also interested in Raiden's method for calculating the filter coefficients for various sample rates on the fly. I've searched the whole net several times and asked the question in the forum, but unfortunately there's no answer yet.

I've thought long about this. There are five filter coefficients for a normalized, second-order IIR filter like the ones in BS.1770-1. Such a filter depends on five design parameters: high-pass gain, band-pass gain, low-pass gain, Q factor, and angular frequency (K = tan(pi * f_c / f_s), where f_s is the sampling frequency and f_c is the frequency "where it happens").
If one knows how the filter was designed, one can solve the resulting system of equations to get the parameters.
In the original paper f_s was 48000, but once the parameters are known, one can recalculate the filter coefficients for any f_s.

Yesterday I had some fun with Maxima and LaTeX, and wrote a little paper that explains what I did.

This paper was very helpful. On page 3 there are formulas for b_0, b_1, b_2, a_1, a_2, obtained with bilinear transform.
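The procedure described in this post can be sketched numerically. The sketch assumes the usual bilinear-transform form of a normalized second-order section with high-pass, band-pass and low-pass gains (written vh, vb, vl here); whether this matches the linked paper term by term is an assumption. The 48 kHz values are the coefficients of the first (shelving) stage as given in BS.1770, and all function names are hypothetical.

```python
import cmath
import math

# First-stage (shelving) coefficients for 48 kHz as given in ITU-R BS.1770.
B_48K = (1.53512485958697, -2.69169618940638, 1.19839281085285)
A_48K = (1.0, -1.69065929318241, 0.73248077421585)


def extract_parameters(b, a, fs):
    """Solve the bilinear-transform equations for the design parameters."""
    a0 = 4.0 / (1.0 - a[1] + a[2])  # denominator gain before normalization
    k = math.sqrt((1.0 + a[1] + a[2]) / (1.0 - a[1] + a[2]))
    k_over_q = (1.0 - a[2]) * a0 / 2.0
    return {
        "fc": math.atan(k) * fs / math.pi,  # from K = tan(pi * fc / fs)
        "q": k / k_over_q,
        "vh": (b[0] - b[1] + b[2]) * a0 / 4.0,            # high-pass gain
        "vb": (b[0] - b[2]) * a0 / (2.0 * k_over_q),      # band-pass gain
        "vl": (b[0] + b[1] + b[2]) * a0 / (4.0 * k * k),  # low-pass gain
    }


def design(p, fs):
    """Recompute normalized biquad coefficients for sampling rate fs."""
    k = math.tan(math.pi * p["fc"] / fs)
    a0 = 1.0 + k / p["q"] + k * k
    b = ((p["vh"] + p["vb"] * k / p["q"] + p["vl"] * k * k) / a0,
         2.0 * (p["vl"] * k * k - p["vh"]) / a0,
         (p["vh"] - p["vb"] * k / p["q"] + p["vl"] * k * k) / a0)
    a = (1.0, 2.0 * (k * k - 1.0) / a0, (1.0 - k / p["q"] + k * k) / a0)
    return b, a


def gain_db(b, a, f, fs):
    """Magnitude response of the biquad at frequency f, in dB."""
    z = cmath.exp(-2j * math.pi * f / fs)  # z^-1 on the unit circle
    h = (b[0] + b[1] * z + b[2] * z * z) / (a[0] + a[1] * z + a[2] * z * z)
    return 20.0 * math.log10(abs(h))


if __name__ == "__main__":
    params = extract_parameters(B_48K, A_48K, 48000.0)
    print(params)
    print(design(params, 44100.0))
```

Designing at 48000 Hz again should reproduce the published coefficients, and comparing gain_db for the 44.1 kHz and 48 kHz versions at a few frequencies is a quick way to check that the response is preserved.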

This post has been edited by Raiden: Jan 16 2011, 17:54