And you thought MPPenc has too many switches?

2002-06-28 19:55:23

I found this while browsing around on the net:

**** NBC V1.00   ISO/MPEG Audio NBC  Software Only Encoder  ***

|                                                                    |

| copyright Fraunhofer-IIS , Dolby, AT&T ,Lucent, Sony, NEC          |

|                        1994, 1995, 1996                            |

|                                                                    |

**********************************************************************



usage:    encfuchs <time-signal file> <bitstream file> -switches



switches:   -h display this help message

            -h <switch> displays more help for <switch> if available (*)

            -f <filename> read arguments from file



 -br   %lu      * bitrate all channels  : 16k < bitrate <320k    def.:128000

 -mod  %i       * mode: 0:st 1:js 2:dc 3:mo 4:ml 5:mr 6:ms 7:is 0x10:mc def:1:js

 -wsm  %i       * 0: Dolby window (def), 1: FhG window, 2: DOL

 -wss  %i         0: Dolby window, 1: FhG window (def), 2: DOL

 -crc           * enable MPEG1/2 crc check                def.: off

 -bm   %lu        bit reservoir max  default:  MPEG1-max    (MPEG2: 2047*8)

 -bsm  %lu        0: only ISO-bitrates  1: bitrate switching  2: free format 3: variable rate

 -zlv  %f       * 0 dB (max. input) level (def.: 0 dB == 1.0 !!! see more help !)

 -tfm  %d         nr of input time files (default: 1)

 -tfc  %d         specifies channels in time signal file (def: 2)

 -tfw  %d         time sample wordlenght in bytes = 2, 4 (def: 2)

 -bw   %lu        audio bandwidth in Hz    def.: internal table( bitrate )

 -bws  %lu        bandwidth limiting: def: no slope;  0=no slope, 1=logarithmic type; 2: butterworth

 -sr   %lu      * sampling rate  def.: 44100

 -tfs             swap input time

 -tds  %d       * downsampling factor for time signal input (def.: 1)

 -tdf  %d         alternativ downsampling filter 0: steep, 1: flat filter (def.: 0)

 -fb   %lu        begin encoding with bitstream frame xxx

 -fn   %lu        encode xxx bitstream frames

 -fcp  %li        print frame counter only if xxx&frame_counter;  def.: -1

 ----             ------ PREDICTION: -----------------------------------------

 -pre  %u       * 1: prediction activated, 0: prediction deactivated

 -prs  %i       * predictor reset <0: off, otherwise step size>

 -peb  %f%f%f%f   pe dependant bitres control: pe_min (0), pe_max (800), refill_ratio_min (0.0), refill_ratio_max (0.07)

 ----             ------ FRAME INPUT/OUTPUT: ---------------------------------

 -fi   %i%i%i%s   frame input <type> <wordlen(in bytes)> <swap> <filename>

 -fop  %i%i%s     psy2hyb output  <wordlen> <swap> <filename>

 -foh  %i%i%s     hyb2loo output  <wordlen> <swap> <filename>

 -fol  %i%i%s     loo2bmx output  <wordlen> <swap> <filename>

 ----             ------ INTERNAL DECODER: -----------------------------------

 -oti  %u%s       <mode>  1: quant. data  2: hyb-spec  3: psy-time-out  4: threshold sim <filename>

 -otm  %i         downsampling factor for output (if negative: upsampling

 ----             ------ IO DEBUG-OPTIONS: -----------------------------------

 ----             ------ PROGRAM CONTROL: ------------------------------------

 -trg  %lu%lu     start if value > <trigval>  after <trigdelay> multi-channel samples

 -w               wait after each frame

 -sto  %lu%i      stop at frame <xx> granule <xx>

 -eoe             don't exit on error but wait in endless loop

 ----             ------ ALGORITHM DEBUG-OPTIONS: ----------------------------

 ----             ------ LOOPS/BITMUX: ---------------------------------------

 -efs             enable full huffman search

 -dbo             disable bitstream output

 -mcb             enable smaller xmin bands

 -enc  %d%d%f%f   energy-correction enable  par1: max quant (def: 1)  par2: corr. threshold (def: 1.12) par3: corr-value (def LOG_CON/5)

 -cit  %ui        common iterations:  1: only ch 2: ch and gr, def: 1

 -sfe  %u         scfsi_mode:  0: disable  1: enable

 -rsb             recalc scale factor bits during loops process

 -nmc  %i         when scale factors are changed 0: all bad sfbs, 1: worst sfb 2: all GT lim. lin. mean nmr def: 0

 -itm  %i         loops iteration mode 0:normal 1:threshold iteration 2:bitres_bitcalc_in_loops def:0

 -eft  %i         restore best-result eval. function: 0: no restore  1: lim_lin_mean  2: lim_alpha_mean  3: worst nmr  4: lin_mean def.: 1

 -brp  %u%u       %u :  bits_outer1_repeat,  %u :  bits_outer2_repeat

 -mol  %u         maximum allowed outer loops (minimum is 1)

 -qal  %f         quantizer align factor (0.0 .. 1.0), fades to individual channel alignment

 -mvl  %u%u       %u max. allowed overall loops  %u bits_overall_repeat

 -dsb  %f%f       dynpart start bits = %f * average_bits + %f * more_bits

 -lsp  %%ii       p1: spread mode  p2: enable alias simulation

 ----             ------PSYCHOACOUSTIC: -------------------------------------

 -pbr  %lu        specify different bit rate for psych calculation

 -rpf  %s         read psych data from file %s

 -wpf  %s         write psych data to file %s

 -wpd  %s         write default psych data to file %s

 -wra  %s         write ratio... to file %s

 -rra  %s         read ratio... from file %s

 -dnm  %u       * 0: MDCT * ratio  1: f(FFT-level,MDCT*ratio)

 -rob             use old barc scale interpolation

 -bty  %u         only_long = 0 only_short = 2

 -mbf             set mixed_block_flag = 1

 -att  %s         debug string for ATT noiseless coding

 -dq   %f         maximum dB difference between mdct and fft

 -chm  %s         channel masc for NBC: .:not present, lrcLR possible channels

 -chse %s         channel element: .:not present, s,c,p possible elements

 -mcm  %i         MC mode: 0: only stereo, 1: MC active, 2: mc simulcast,3: mc 5chan,4: only mono def:3

 -fln  %i         number of freq. lines for lowpass band width cutoff

 -bmb  %i         more bits for short block without attack (default: -1, deactive)

 -plr             patch LR ratios in TNS module (default: not active)

 -mss  %f         M/S ratios devided with this factor -> the lower the more MS (default: 1)

 -plf  %s         resource file name for plotmtv %s

 -pst  %i         start plotmtv at frame %i

 -sms             use simple MS

 -smg  %f         simple MS gain (default: 1.5)

 -sta             MS: enable starving of side channel

 -tmo             MS: enable threshold modification if high MS gain

 -tns  %ui        tns mode:  0: off, no BS  1: off, with BS, 2: on  def: 2

 -tnb  %ui        tns blockswitching mode, def: all on

 -tna  %lu        tns start freq., def: 1.275 kHz

 -tne  %lu        tns stop freq., def: fs/2

 -tnd  %ui        tns filter direction, 0=up, 1=down, def: 1

 -tno  %ui        tns max. order, def: 20

 -tnr  %ui        tns coefficient resolution (-1, 3 or 4), def: 4

 -tnm  %ui        tns masked threshold correction, def: 901

 -tnp  %ui        tns info printout switch, def: 0

 -tnw  %ui        lpc window length (40, 50, 60)

(The list goes on and on. I'd better stop now...)

Reply #1 – 2002-06-28 20:51:58

My line is "-dbo -sfe 0 -mol 5 -enc 1 2 5 -bty 2 -sms -tnw 40 -mod 0 -pre 1 -zlv 400"!

And I know that it's the best!! :insane:

Reply #2 – 2002-06-28 20:57:01

NICE! Talk about a full-blown psychoacoustics algorithm; there isn't anything you can't do there! Now, is that for MPEG-1/2 AAC, AC3, or NBC streams? Look, you can even change the LPC window length and coefficient resolution! :eek: That's something else, let me tell you.

Reply #3 – 2002-06-28 21:09:46

Quote

Originally posted by HotshotGG
NICE! Talk about a full-blown psychoacoustics algorithm; there isn't anything you can't do there! Now, is that for MPEG-1/2 AAC, AC3, or NBC streams?

Well, it is AAC before it was called AAC.
It is the encoder used for the MPEG listening tests, when the audio format was called "MPEG2 audio NBC" (Not Backwards Compatible with MPEG audio Layers 1, 2, 3).

Quote

Look, you can even change the LPC window length and coefficient resolution! :eek: That's something else, let me tell you.

Yeah. I'd love to test it, but it only runs on RISC architectures, accepts only raw PCM streams, and is reported to be not 100% ISO compatible. (It was used for quality tests rather than compatibility tests.) I.e., it probably wouldn't play on FAAD.
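For anyone who does get hold of a machine that runs it, the raw PCM requirement is the easy part. Here is a minimal sketch in Python, assuming the input is an ordinary uncompressed PCM WAV file. The function name `wav_to_raw_pcm` and the file names are made up for illustration. It strips the header with the standard `wave` module and returns the channel count, sample width in bytes, and sample rate, which look like the values the listing's -tfc, -tfw and -sr switches expect. WAV samples are little-endian, so on a big-endian machine the -tfs swap switch might also be needed, but that part is a guess.

```python
import wave


def wav_to_raw_pcm(wav_path, raw_path):
    """Copy the sample data of a PCM WAV file into a headerless file.

    Returns (channels, sample_width_in_bytes, sample_rate) so the caller
    can tell the encoder how to interpret the raw stream.
    """
    with wave.open(wav_path, "rb") as wav:
        params = (wav.getnchannels(), wav.getsampwidth(), wav.getframerate())
        frames = wav.readframes(wav.getnframes())

    # The raw file is just the interleaved samples, nothing else.
    with open(raw_path, "wb") as raw:
        raw.write(frames)

    return params


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    channels, width, rate = wav_to_raw_pcm("input.wav", "input.raw")
    print(f"channels={channels} bytes_per_sample={width} sample_rate={rate}")
```

The returned tuple is only a convenience; the raw file itself carries no format information, so those numbers have to be passed on the command line.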

I would guess it would take some months to tweak all the switches to get the best results.

BTW, just as a curiosity: the encoder is called "encfuchs" after H. Fuchs, a researcher on prediction (used in AAC LTP and backward prediction).

Regards;

Roberto.

Reply #4 – 2002-06-28 21:26:34

Quote

I would guess it would take some months to tweak all the switches to get the best results.

Indeed.

Quote

BTW, just as a curiosity: the encoder is called "encfuchs" after H. Fuchs, a researcher on prediction (used in AAC LTP and backward prediction).

That's very interesting. It amazes me to think they could come up with something like Long Term Prediction and even Temporal Noise Shaping.

Reply #5 – 2002-06-29 08:20:47

That is the NBC encoder used in the official MPEG listening tests (the famous N2006 document).

I tried it (a long time ago, though) on SPARC, but it does not generate ISO 13818-7 bitstreams. Something is wrong in the CPE bitfield, and FAAD won't decode the streams. However, the SPARC NBC decoder from Nokia would decode them.

An interesting thought regarding these switches: when AAC was in the design phase, people from FhG/ATT/Dolby/... implemented all kinds of algorithms (as you can see from the command line) and then found the best combinations for each preset. That was a very long process. The new FhG AAC encoders were built on improved code from that encoder (NBC 1.0). This codec also has many wrong parameter settings (for the default presets), because it was built in 1996.

And yes, the codec already has the best combinations (adjusted according to listening tests), but it is damn slow.
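To give a feel for why finding good combinations takes so long, here is a rough sketch in Python that counts the settings for just seven of the NBC switches, using the value ranges printed in the help listing earlier in the thread. The selection of switches and the helper names `count_combinations` and `iter_command_lines` are made up for illustration, and nothing here says these particular switches were the ones tuned together.

```python
from itertools import product
from math import prod

# Value ranges copied from the encfuchs help listing; the choice of
# switches is arbitrary and only meant as an illustration.
SWITCH_VALUES = {
    "-wsm": [0, 1, 2],        # window shape
    "-tns": [0, 1, 2],        # tns mode
    "-tnd": [0, 1],           # tns filter direction
    "-tnr": [-1, 3, 4],       # tns coefficient resolution
    "-tnw": [40, 50, 60],     # lpc window length
    "-itm": [0, 1, 2],        # loops iteration mode
    "-eft": [0, 1, 2, 3, 4],  # restore best-result eval. function
}


def count_combinations(values):
    """Number of distinct settings in the full grid."""
    return prod(len(options) for options in values.values())


def iter_command_lines(values):
    """Yield one switch string per point of the grid."""
    names = list(values)
    for combo in product(*(values[name] for name in names)):
        yield " ".join(f"{name} {value}" for name, value in zip(names, combo))


if __name__ == "__main__":
    print(count_combinations(SWITCH_VALUES))  # 2430
    print(next(iter_command_lines(SWITCH_VALUES)))
```

Seven discrete switches already give 2430 command lines, each of which would need an encode and a listening test, and the float-valued switches are not even in the grid.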

AACEnc also has many switches, maybe more than that encoder. However, most of them are 'hidden' (they do not appear in the help listing) because I was afraid that some users might screw up the quality with some weird command line combinations.

Reply #6 – 2002-06-29 08:27:37

Ok, here is the list of the AACEnc command line switches (including hidden) for testing purposes. Note - many of those WON'T work, because they were disabled in the code. However, quantization and psychoacoustic switches mostly work.

*********** PsyTEL® MPEG-4 AAC Encoder V2.15 (build Jun 29 2002) **********
Copyright © 1999-2002 PsyTEL Research
Copyright © 1999-2002 Ivan Dimkovic

This program is protected by copyright law and international treaties.
Any reproduction or distribution of this program, or any portion of it
may result in severe civil and criminal penalties, and will be
prosecuted to the maximum extent possible under law.

For further info please visit http://www.psytel-research.co.yu
For licensing details please contact:

usage: AACENC -switches
switches: -h Print help
-br <x> Bit rate in kbits per second (dflt: 128)
-if <x> Input file name
-of <x> Output file name
-qual <x> Encoder quality level (1-9) (dflt: 9)
-production Production (slowest) CBR encoding
-altcbr Alternative CBR mode
-ihsc Improved Human Speech Coding
-is Use Intensity Stereo (debug mode!!!)
-disabletuning Use Intensity Stereo (debug mode!!!)
-pns Enable Perceptual Noise Substitution (PNS) Tool
-disable_ms Disable Joint Stereo Coding
-safems Safe M/S Switching
-nh Disable ADTS header (raw ISO 13818-7 AAC file)
-noshort Disable block switching (debug only!)
-no_temporal Disable temporal masking
-lc MPEG-2 AAC Low Complexity (LC) mode
-profile <x> AAC Profile:
0: LC, 1: Main, 2: LTP (dflt: 0)
-adif Use ADIF header
-vr Variable bit rate (VBR mode, good quality)
-vbrhi Total VBR mode (recommended, high quality)
-qvbr <x> Quality controlled VBR (quality: 0 - 100 %%) (dflt: -1)
-tape Preset: Tape VBR Mode
-radio Preset: Radio VBR Mode
-internet Preset: Internet VBR Mode
-streaming Preset: Streaming VBR Mode
-normal Preset: Normal VBR Mode (recommended)
-extreme Preset: Extreme VBR Mode
-archive Preset: Archive VBR Mode (best quality)
-ultra Preset: Ultra (transcoding) mode (highest bitrate)
-lfe Use LFE channel (only for 4 and 6 channel input)
-c <x> Cut-Off frequency in Hz (lowpass) (dflt: 0)
-ltq <x> Decrease Threshold in Quiet by n dB (dflt: 0)
-raise_smr <x> Increase Signal to Mask Ratio by n dB (dflt: 0)
-low_ath Use highest sensitivity hearing threshold
-no_ath Disable ATH
-no_tns Disable TNS coding
-artist <x> Artist Name
-title <x> Title Name
-album <x> Album
-year <x> Year
-use_tags Use Tags
-resample <x> Resample input to x Hz (dflt: 0)
-fb <x> Cut-Off frequency in Hz (lowpass) (dflt: 0)
- ------------- PSYCHOACOUSTIC OPTIONS ------------
-pft <x> Psych filterbank type - 0: Complex MDCT, 1: Complex FFT (dflt: 0)
-cht <x> Chaos measure type: - 0: Euclidic Distance, 1: Peak Filter, 2: Both (dflt: 0)
-bsw <x> Block switching mode: - 0: Automatic, 1: Only Long, 2: Only Short (dflt: 0)
-rpelev <x> Residual PE Level for short block switching (dflt: 0)
-lpe <x> LPC prediction error for short block switching (dflt: 0)
-grm <x> Short window grouping mode: 0: automatic, 1: 8 groups in one (dflt: 0)
-mcb <x> Upper frequency limit for minimum CB threshold adapt (dflt: 0)
-tss <x> Enable temporal threshold smoothing (CBR) (dflt: 0)
-nls <x> Enable non-linear spreading function (dflt: 0)
-dls <x> Short block temporal masking power coeff. (dflt: 0.1)
-dll <x> Long block temporal masking power coeff. (dflt: 0.1)
-tng <x> TNS Gain (switch criteria) (dflt: 0)
-psf <x> PNS start frequency (dflt: 0)
-ptt <x> PNS tonality switch threshold (PNS will be used if tonality is less than threshold) (dflt: 0)
-ptm <x> PNS chaos estimation mode: 0: minimum tonality of cb that form sfb, 1: mean tonality of cb that form sfb, 2: SFM method on spectrum belogning to sbf (dflt: 1)
-TMN <x> Tone Masking Noise for long blocks (dflt: 0)
-TMN_s <x> Tone Masking Noise for short blocks (dflt: 0)
-NMT <x> Noise Masking Tone for long blocks (dflt: 0)
-NMT_s <x> Noise Masking Tone for short blocks (dflt: 0)
-vtmn Variable TMN/NMT
-sfl <x> Spreading function low masking (dB/Bark): (dflt: 0)
-sfh <x> Spreading function high masking (dB/Bark): (dflt: 0)
-mbs <x> Bitres tuning: Maximum bit spend for long blocks (dflt: 0)
-mbs_s <x> Bitres tuning: Maximum bit spend for short blocks (dflt: 0)
-rst <x> Bitres tuning: Refill start (dflt: 0)
-rss <x> Bitres tuning: Refill stop (dflt: 0)
-mbg <x> Bitres tuning: Minimum refill bits for long blocks (dflt: 0)
-svc <x> Block switching constant (0-500) (dflt: 128.0)
- ------------- JOINT STEREO OPTIONS ------------
-jsm <x> M/S Switching Options (0: automatic, 1: all sfbs, 2: none) (dflt: 1)
-mst <x> M/S Switching Threshold (dflt: 5.0)
-smm <x> Simple M/S Mode (dflt: 1)
-mld <x> Enable BMLD Protection Ratios (dflt: 1)
-bmd <x> Perform M/S imaging (BMLD protection) if L and R energies differ less than n dB (dflt: 2.0)
-rlr <x> Reuse L/R Psychoacoustics for M/S Thresholds (dflt: 1)
-ess <x> Enable side channel starving (dflt: 1)
-isf <x> Intensity Stereo starting frequency (dflt: 1)
-ist <x> IS switch ratio (IS will be used if threshold difference is less than ist) (dflt: 1)
-sis <x> Simple IS mode (no IS analysisis in psychoacoustic module) (dflt: 1)
- ------------- QUANTIZER LOOPS OPTIONS ------------
-ity <x> Iteration type (0: threshold based, 1: bitrate_calc_in_loops (dflt: 1)
-mol <x> Maximum number of quantizer outer loops (dflt: 200)
-mno <x> Minimum number of quantizer outer loops (dflt: 1)
-itm <x> Distorted Sfb Amplification mode (1: Smart, 2: All, 3: Mean, 4: Worst) (dflt: 1)
-aal <x> Number of unsuccessful quant loops for ALL_DISTORTED mode (dflt: 8)
-mdl <x> Number of unsuccessful quant loops for MEAN_DISTORTED mode (dflt: 16)
-wdl <x> Number of unsuccessful quant loops for WORST_DISTORTED mode (dflt: 16)
-thm <x> Enable threshold correction (dflt: 1)
-ect <x> Enable consecutive threshold smoothing in the last outer loop (dflt: 1)
-pfn <x> Perform fine noise shaping after the last outer loop (dflt: 1)
-rtm <x> Real-time loop break mode (dflt: 1)
-edt <x> Enable debug trap for endless quantizer loops (dflt: 1)
-efh <x> Full Huffman search mode (0: last loop, 1: all loops, 3: never (dflt: 0)
-fht <x> Full Huffman search type (0: binary tree, 1: greedy-merge, 3: best (dflt: 0)

Reply #7 – 2002-06-29 09:20:34

haha..
hey that looked just like FLAC

Reply #8 – 2002-06-29 16:12:50

lol Ivan. You just opened Pandora's box.

Nice list of switches though. Are you sure that current preset switches are tweaked to the max?

You know, I'd almost recommend hiding that list..

Reply #9 – 2002-06-29 17:19:53

Quote

Originally posted by JohnV
You just opened the Pandora's box.

lol... Get ready for the l33t page long command lines that "just sound better!#@!!!"

Not that these switches can't be useful for testing, but that's really only the case if all the data from previous testing is taken into account and if people do lots of blind testing.... both of which are very unlikely to be the case.
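On the blind testing point: a common way to score an ABX run is to ask how likely the result would be from pure guessing. Here is a minimal sketch in Python, assuming independent trials and a 50% chance per trial when guessing. The function name `abx_p_value` is made up for illustration.

```python
from math import comb


def abx_p_value(correct, trials):
    """Probability of at least `correct` right answers out of `trials`
    when every answer is a coin flip (one-sided binomial tail)."""
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


if __name__ == "__main__":
    # 12 correct out of 16 trials
    print(round(abx_p_value(12, 16), 4))  # 0.0384
```

A small value means guessing is an unlikely explanation for the score; it says nothing about which command line is better unless the comparison itself was set up properly.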
