AAC encoding from dff files
hlloyge
post Jan 31 2012, 10:51
Post #1





Group: Members
Posts: 695
Joined: 10-January 06
From: Zagreb
Member No.: 27018



Hello all.

I have a question. I have a tune in DFF (SACD rip) format, which I obtained from shady sources, but that is not relevant to this discussion.

What I want to know is how to encode it to AAC. I am using foobar2000 with the DFF decoder plugin loaded, and the song plays fine. However, the drum at the beginning peaks far above 0 dB; I guess it reaches about +6 dB. When I encode it to AAC, I use the SoX resampler plugin to convert it to 44100 Hz, and the result also peaks far above 0 dB, though a bit less, at about +3 or +4 dB I guess. I think this is because it is decoded, AFAIK, to 32-bit float, which can hold peaks above full scale. I don't hear any distortion in either the original or the encoded file, at least not on my desktop speakers, but I haven't done an ABX test.

The question is: is this OK? I know MP3 doesn't have a bit depth in the way a WAV file does, but I don't know about AAC. Can it handle input that high for encoding and decode it properly, without artifacts? Or would I have to use some sort of peak limiting before encoding, or just decode it to WAV, load it into Audacity, and normalize its peak to zero?

Thank you.
nu774
post Jan 31 2012, 13:59
Post #2





Group: Developer
Posts: 514
Joined: 22-November 10
From: Japan
Member No.: 85902



In theory it is OK (same as MP3). However, it might depend on the implementation.
At the very least, the encoder has to be able to accept float PCM. The same goes for the decoder.

I tried this before with fb2k + qaac, and it was fine. The input had a peak of 1.6 or so, and the peak was preserved.
Of course, you have to set the maximum bit depth to 32 in fb2k's CLI encoder settings.
If you want to check the decoded result, just convert to WAV with fb2k (also specify 32-bit here).
In my case, I took a quick look at the decoded result with some Python scripting like the following:
CODE
import struct
with open('foo.wav', 'rb') as f:
    wavdata = f.read()
wavdata = wavdata[44:]  # chop off the header; its size might be different in your case
wavdata = wavdata[:len(wavdata) // 4 * 4]  # drop trailing bytes that don't fill a whole float32
pcm = struct.unpack('<%df' % (len(wavdata) // 4), wavdata)  # parse as little-endian float32 sequence
pcm = [abs(x) for x in pcm]  # convert to absolute values
print('peak: %g, avg: %g' % (max(pcm), sum(pcm) / len(pcm)))

If you just want to see the peak, scanning with ReplayGain is probably enough.
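The fixed 44-byte offset in the script is only a guess about the header size. A slightly more careful sketch walks the RIFF chunks to find where the sample data starts. The function names `find_data_chunk` and `peak_of_float_wav` are made up for this illustration, and the code assumes the data chunk holds little-endian float32 samples, as a 32-bit float WAV export would.

```python
import struct

def find_data_chunk(raw):
    """Return (offset, size) of the 'data' chunk payload in a RIFF/WAVE byte string."""
    if raw[:4] != b'RIFF' or raw[8:12] != b'WAVE':
        raise ValueError('not a RIFF/WAVE file')
    pos = 12  # the first chunk header follows the 12-byte RIFF header
    while pos + 8 <= len(raw):
        chunk_id, size = struct.unpack('<4sI', raw[pos:pos + 8])
        if chunk_id == b'data':
            return pos + 8, size
        pos += 8 + size + (size & 1)  # chunks are padded to an even length
    raise ValueError('no data chunk found')

def peak_of_float_wav(raw):
    """Peak absolute value, assuming the data chunk holds little-endian float32 samples."""
    offset, size = find_data_chunk(raw)
    payload = raw[offset:offset + size]
    count = len(payload) // 4
    samples = struct.unpack('<%df' % count, payload[:count * 4])
    return max(abs(s) for s in samples)
```

With a file on disk, this would be called as `peak_of_float_wav(open('foo.wav', 'rb').read())`.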

This post has been edited by nu774: Jan 31 2012, 14:09
saratoga
post Jan 31 2012, 21:20
Post #3





Group: Members
Posts: 4853
Joined: 2-September 02
Member No.: 3264



Can foobar run a ReplayGain scan on the file? If so, the easiest and safest solution would be to scan it, then check the "prevent clipping" option when you convert it to AAC. This way the peak gets scaled down to 0 dB before conversion.
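The idea behind a "prevent clipping" option can be sketched as follows. This is only an illustration of the arithmetic, not foobar2000's actual implementation; the function name `gain_with_clip_prevention` and the sample values are made up for the example.

```python
def gain_with_clip_prevention(gain_db, peak):
    """Linear scale factor for a gain in dB, reduced if needed so that peak * scale <= 1.0."""
    scale = 10.0 ** (gain_db / 20.0)
    if peak > 0 and peak * scale > 1.0:
        scale = 1.0 / peak
    return scale

# Hypothetical float samples with one peak above full scale (1.0).
samples = [0.2, -1.6, 0.9]
peak = max(abs(s) for s in samples)

# No ReplayGain adjustment requested (0 dB), only clipping prevention.
scale = gain_with_clip_prevention(0.0, peak)
scaled = [s * scale for s in samples]
```

Because the whole signal is multiplied by one constant, the waveform shape is preserved; only the level changes.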

This post has been edited by db1989: Jan 31 2012, 22:33
Reason for edit: removing unnecessary full quote of first post
C.R.Helmrich
post Jan 31 2012, 22:49
Post #4





Group: Developer
Posts: 686
Joined: 6-December 08
From: Erlangen Germany
Member No.: 64012



I agree with nu774 and saratoga. Given an encoder supporting floating-point input, a few sporadic peaks slightly above 0 dBFS in the input are OK, I guess. But remember that not every decoder can output floating-point PCM. Many truncate (and clip) to 16- or 24-bit before you can apply any level adjustment, and with such decoders you risk audible clipping from encodings that exceed 0 dBFS.
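A small sketch of why the order of operations matters here. The conversion below (scale by 32767, round, clamp) is one common convention and an assumption of this example; real decoders may scale or round differently. The helper names are hypothetical.

```python
def float_to_int16(sample):
    """Convert a float sample (nominal range -1.0..1.0) to 16-bit, clamping out-of-range values."""
    value = int(round(sample * 32767.0))
    return max(-32768, min(32767, value))

def int16_to_float(value):
    return value / 32767.0

# Hypothetical decoded samples, two of them above full scale.
original = [0.5, 1.6, -1.6]

# A decoder that outputs 16-bit integers clips the peaks right away...
after_16bit = [int16_to_float(float_to_int16(s)) for s in original]
# ...so attenuating afterwards cannot restore them: 1.6 ends up as 0.5, not 0.8.
attenuated_late = [s * 0.5 for s in after_16bit]

# Attenuating in floating point before the integer conversion keeps the shape.
attenuated_early = [int16_to_float(float_to_int16(s * 0.5)) for s in original]
```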

Chris


--------------------
If I don't reply to your reply, it means I agree with you.
hlloyge
post Feb 1 2012, 10:52
Post #5





Group: Members
Posts: 695
Joined: 10-January 06
From: Zagreb
Member No.: 27018



Foobar can scan the file, but it can't save the results into tags because, AFAIK, DFF has no tagging support. I had thought about that; it would be the best solution. I also don't know how to apply RG values while encoding. I do know how to lower the playback level for files without RG info.
And although it plays fine in foobar2000 (AFAIK, foobar handles 32-bit PCM internally, so it will not clip), I will test this afternoon how it sounds on an iPod Touch.

Alternatively, I can transcode it to some lossless format that can handle 32-bit WAV input, tag it, and then encode that to AAC.

This post has been edited by hlloyge: Feb 1 2012, 11:05
nu774
post Feb 1 2012, 11:29
Post #6





Group: Developer
Posts: 514
Joined: 22-November 10
From: Japan
Member No.: 85902



Alternatively, as long as the encoder can take the input and encode it properly, you can adjust the global gain value of the resulting AAC file afterwards with aacgain (LC-AAC only, though).
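As a rough sketch of what such an adjustment involves: AAC scale factors change the level by a factor of 2^(1/4) per step, which is about 1.5 dB, so a gain change made this way is quantized to steps of that size. The helper names below are hypothetical, and the peak of 1.6 is the value mentioned earlier in the thread; this is arithmetic only and does not touch any file or call aacgain.

```python
import math

STEP_DB = 20.0 * math.log10(2.0 ** 0.25)  # about 1.505 dB per gain step

def peak_after_steps(peak, steps):
    """Linear peak after lowering the gain by the given number of steps."""
    return peak * 2.0 ** (-0.25 * steps)

def steps_to_avoid_clipping(peak):
    """Smallest number of downward gain steps that brings a linear peak to 1.0 or below."""
    steps = 0
    while peak_after_steps(peak, steps) > 1.0:
        steps += 1
    return steps

steps = steps_to_avoid_clipping(1.6)
new_peak = peak_after_steps(1.6, steps)
```

Since the change is stored in the bitstream's gain values rather than applied to decoded samples, it needs no re-encoding.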

benski
post Feb 2 2012, 01:28
Post #7


Winamp Developer


Group: Developer
Posts: 670
Joined: 17-July 05
From: Brooklyn, NY
Member No.: 23375



One thing I've always been curious about - and perhaps Chris could answer - does the PNS tool (Perceptual Noise Substitution) create any non-determinism in decoding? Or is the random number generator seeded from the bitstream somehow? I'm asking mainly from the point of view of peak value detection in ReplayGain (and compensation during playback) .

This post has been edited by benski: Feb 2 2012, 01:29
saratoga
post Feb 2 2012, 01:47
Post #8





Group: Members
Posts: 4853
Joined: 2-September 02
Member No.: 3264



QUOTE (benski @ Feb 1 2012, 19:28) *
One thing I've always been curious about - and perhaps Chris could answer - does the PNS tool (Perceptual Noise Substitution) create any non-determinism in decoding? Or is the random number generator seeded from the bitstream somehow? I'm asking mainly from the point of view of peak value detection in ReplayGain (and compensation during playback) .


I don't know what the spec says, but libfaad's pns_decode uses a random number generator which is always initialized to the same value.

WMA Standard does something similar.
benski
post Feb 2 2012, 02:27
Post #9


Winamp Developer


Group: Developer
Posts: 670
Joined: 17-July 05
From: Brooklyn, NY
Member No.: 23375



QUOTE (saratoga @ Feb 1 2012, 20:47) *
QUOTE (benski @ Feb 1 2012, 19:28) *
One thing I've always been curious about - and perhaps Chris could answer - does the PNS tool (Perceptual Noise Substitution) create any non-determinism in decoding? Or is the random number generator seeded from the bitstream somehow? I'm asking mainly from the point of view of peak value detection in ReplayGain (and compensation during playback) .


I don't know what the spec says, but libfaad's pns_decode uses a random number generator which is always initialized to the same value.

WMA Standard does something similar.


So playback from the start of the file would be OK, but "random" seeking could potentially cause peak values to differ?
saratoga
post Feb 2 2012, 02:34
Post #10





Group: Members
Posts: 4853
Joined: 2-September 02
Member No.: 3264



QUOTE (benski @ Feb 1 2012, 20:27) *
So playback from the start of the file would be OK, but "random" seeking could potentially cause peak values to differ?


Ha, I suppose it would.
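The seeking effect discussed above can be illustrated with a toy model. This is not libfaad's code: the seed, the frame length, and the function `decode_noise_frames` are invented for the example, and it assumes the generator is reset to the same state whenever decoding starts, wherever in the file that is.

```python
import random

SEED = 1        # hypothetical constant standing in for a decoder's fixed initial RNG state
FRAME_LEN = 8   # hypothetical number of noise samples per frame

def decode_noise_frames(first_frame, last_frame):
    """Toy 'decoder': reset the RNG when decoding starts, then draw one noise frame per frame."""
    rng = random.Random(SEED)
    frames = {}
    for index in range(first_frame, last_frame):
        frames[index] = [rng.uniform(-1.0, 1.0) for _ in range(FRAME_LEN)]
    return frames

# Decoding from the start is repeatable from run to run.
from_start = decode_noise_frames(0, 4)
from_start_again = decode_noise_frames(0, 4)

# After a seek to frame 2, that frame gets the noise frame 0 would otherwise have had.
after_seek = decode_noise_frames(2, 4)
```

In this model frame 2 contains different samples depending on where decoding began, so a peak measured during a full scan need not match the peak heard after a seek.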

This post has been edited by saratoga: Feb 2 2012, 02:35
hlloyge
post Feb 2 2012, 11:20
Post #11





Group: Members
Posts: 695
Joined: 10-January 06
From: Zagreb
Member No.: 27018



The iPod Touch 2nd gen plays the file fine. I can't hear clipping artifacts, so I guess the decoder handles it quite well... but it's only a mere 3 dB over.

Track gain : -6.61 dB
Track peak : 2.396162
Album gain : -6.24 dB
Album peak : 2.396162

It sounds quite loud, though.
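For reference, the scan values can be converted between linear peak and decibels as below. The post does not say which file the scan was run on, so this only shows the arithmetic for the numbers as reported; the function names are made up for the example.

```python
import math

def linear_to_db(value):
    """Level in dB relative to full scale (1.0) for a linear amplitude."""
    return 20.0 * math.log10(value)

def db_to_linear(db):
    """Linear amplitude factor for a level change in dB."""
    return 10.0 ** (db / 20.0)

track_peak = 2.396162   # "Track peak" as reported by the scan
track_gain_db = -6.61   # "Track gain" as reported by the scan

# The reported peak expressed in dB relative to full scale (roughly +7.6 dB).
peak_db = linear_to_db(track_peak)

# Peak after applying the track gain alone (still slightly above 1.0).
peak_after_gain = track_peak * db_to_linear(track_gain_db)
```

Taken at face value, the reported peak is well above +3 dB, and applying the track gain by itself would still leave it a little over full scale.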
C.R.Helmrich
post Feb 2 2012, 11:45
Post #12





Group: Developer
Posts: 686
Joined: 6-December 08
From: Erlangen Germany
Member No.: 64012



QUOTE (benski @ Feb 2 2012, 02:28) *
... does the PNS tool (Perceptual Noise Substitution) create any non-determinism in decoding? Or is the random number generator seeded from the bitstream somehow?

The standard doesn't say anything about the random number generator, neither what kind of generator, nor anything about seeding that generator. So yes, peak values might be a bit non-deterministic. But I'll ask our conformance-bit-stream experts how PNS is handled there. Update: yes, a colleague told me that, as alexander writes below, the PNS conformance tool compares energies, not samples.

Chris

This post has been edited by C.R.Helmrich: Feb 2 2012, 12:33


--------------------
If I don't reply to your reply, it means I agree with you.
.alexander.
post Feb 2 2012, 12:19
Post #13





Group: Members
Posts: 73
Joined: 14-December 06
Member No.: 38681



I haven't read the whole thread; I just want to say that the reference software contains a special tool for testing PNS conformance (conf_pns) that checks the energy in PNS subbands.
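Comparing energies instead of samples can be sketched like this. It is not the conf_pns tool: the function names, the 5% tolerance, and the band length are assumptions for illustration. Two decoders with different random generators produce different noise samples for the same band, yet both scale the noise to the same target energy.

```python
import random

def band_energy(samples):
    """Sum of squared samples in one band."""
    return sum(s * s for s in samples)

def energies_match(a, b, tolerance=0.05):
    """True if two bands have the same energy within a relative tolerance (hypothetical threshold)."""
    ea, eb = band_energy(a), band_energy(b)
    return abs(ea - eb) <= tolerance * max(ea, eb)

def noise_band(seed, length, target_energy):
    """Random noise scaled so that its total energy equals target_energy."""
    rng = random.Random(seed)
    raw = [rng.uniform(-1.0, 1.0) for _ in range(length)]
    scale = (target_energy / band_energy(raw)) ** 0.5
    return [s * scale for s in raw]

# Two hypothetical decoders filling the same band with differently seeded noise.
decoder_a = noise_band(seed=1, length=32, target_energy=4.0)
decoder_b = noise_band(seed=2, length=32, target_energy=4.0)
```

A sample-by-sample comparison of `decoder_a` and `decoder_b` would fail, while `energies_match(decoder_a, decoder_b)` passes. Individual sample peaks can still differ between the two outputs.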
