Speex Wide-band Mode

Hi,
Can someone tell me how Speex splits wideband and ultra-wideband speech into low and high band(s)?
I understand that it splits WB speech (16 kHz sampling) into two bands, i.e. 0-4 kHz (LOW) and 4-8 kHz (HIGH).

Can somebody tell me how it splits UWB speech (32 kHz sampling)?
Does it split UWB speech into three bands, i.e. 0-4 kHz (LOW), 4-8 kHz (HIGH) and 8-16 kHz (HIGH)?


please reply asap

bye

Reply #1
If what I wrote above is true, then please let me know how the LSFs are quantized.
It quantizes the LOW band LSFs with the lsp_quant_nb() function.
What about the two HIGH bands?
Does it quantize both high bands with the lsp_quant_high() function?
Or are there two separate functions for the WB and UWB parts?

Reply #2
answer to first post:

Two-channel critically sampled subband filterbank using a QMF (quadrature mirror filter).
Roughly speaking: you filter the signal twice in parallel, once with a low-pass filter and once with a high-pass filter, to get two signals that each contain one half of the spectrum. The more general form of the Nyquist theorem tells us we only need a sampling rate of at least twice the signal bandwidth (the band does not have to start at 0). So we can subsample both filtered versions by a factor of two: for every two original samples we get one subband sample in each of the two bands.

Of course it's neither possible nor desirable to implement a perfect "brickwall" filter, so the subsampling introduces aliasing in both subbands. If special care has been taken during the filter design, this aliasing cancels almost perfectly during the inverse operation.
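To make the filter-then-subsample idea concrete, here is a minimal sketch using the shortest possible QMF pair, the 2-tap Haar filters. This is an illustration only, not Speex's filterbank: Speex uses a much longer kernel, and the function names below are made up for the example. The 2-tap pair happens to reconstruct exactly; longer QMF kernels only do so approximately.

```python
import math

SCALE = 1.0 / math.sqrt(2.0)  # keeps the transform orthonormal

def qmf_analysis(x):
    """Filter with low-pass [s, s] and high-pass [s, -s], keeping every
    second output sample. x must have even length."""
    half = len(x) // 2
    low = [SCALE * (x[2 * n] + x[2 * n + 1]) for n in range(half)]
    high = [SCALE * (x[2 * n] - x[2 * n + 1]) for n in range(half)]
    return low, high

def qmf_synthesis(low, high):
    """Inverse operation: interleave the two subbands back into one signal."""
    x = []
    for l, h in zip(low, high):
        x.append(SCALE * (l + h))
        x.append(SCALE * (l - h))
    return x

# Hypothetical test signal: a slow and a fast sinusoid mixed together.
signal = [math.sin(0.3 * n) + 0.5 * math.sin(2.9 * n) for n in range(16)]
low, high = qmf_analysis(signal)
rebuilt = qmf_synthesis(low, high)
max_err = max(abs(a - b) for a, b in zip(signal, rebuilt))
```

Note that 16 input samples become 8 low-band plus 8 high-band samples, which is what "critically sampled" means: the total number of samples does not grow.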

This is done twice on the 32 kHz signal.
1 x 32 kHz signal ---split---> 2 x 16 kHz signals ---split lower band---> 2 x 8 kHz + 1 x 16 kHz signals (all figures are sampling rates)
(BTW: ATRAC3 does this three times)

BTW2: There are different classes of filters with different properties, but they're usually not so easy to design:
QMF -> near-orthogonal mapping, linear-phase filters, no perfect reconstruction possible (though the error can be made as small as you want at the cost of longer filter kernels)
orthogonal wavelets -> orthogonal mapping, non-linear-phase filters, perfect reconstruction, nice amplitude responses for short filters
bi-orthogonal wavelets -> NOT orthogonal, linear-phase filters, perfect reconstruction

(Apparently you can't have all the goodies at once)


Sebi

Reply #3
Quote
Hi,
Can someone tell me how Speex splits wideband and ultra-wideband speech into low and high band(s)?
I understand that it splits WB speech (16 kHz sampling) into two bands, i.e. 0-4 kHz (LOW) and 4-8 kHz (HIGH).

Can somebody tell me how it splits UWB speech (32 kHz sampling)?
Does it split UWB speech into three bands, i.e. 0-4 kHz (LOW), 4-8 kHz (HIGH) and 8-16 kHz (HIGH)?


please reply asap

bye


For ultra-wideband (0-16 kHz, 32 kHz sampling), I first split the band into wideband (0-8 kHz) and "very high band" (8-16 kHz). Then, the wideband itself is split into low (0-4 kHz) and high (4-8 kHz) band. So there's a total of 3 bands encoded separately. In practice the very high band (8-16 kHz) has only the rough shape of its spectrum encoded with 1.8 kbps.
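The two-stage split described above can be written down as simple band arithmetic. This is only a sketch of the band layout, with hypothetical function names; it does no filtering and is not taken from the Speex sources.

```python
def split_band(band):
    """Split a (low_hz, high_hz) band into its lower and upper halves."""
    lo, hi = band
    mid = (lo + hi) // 2
    return (lo, mid), (mid, hi)

def subband_rate(band):
    """Critical sampling rate for a band: twice its bandwidth."""
    lo, hi = band
    return 2 * (hi - lo)

def uwb_bands(sample_rate_hz=32000):
    """Band layout for ultra-wideband: split once, then split the lower half."""
    full = (0, sample_rate_hz // 2)
    wideband, very_high = split_band(full)   # 0-8 kHz and 8-16 kHz
    low, high = split_band(wideband)         # 0-4 kHz and 4-8 kHz
    return {"low": low, "high": high, "very_high": very_high}

bands = uwb_bands()
rates = {name: subband_rate(band) for name, band in bands.items()}
```

The three subband rates add up to the original 32 kHz (8 + 8 + 16), so the tree is still critically sampled.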

Reply #4
Basically I wanted to know how Speex splits WB/UWB speech, and you answered it well.

Now I have some more questions:
1) I want to experiment with the encoder's LPC order. I tried encoding and decoding audio (44.1 kHz, 16-bit) with Speex and the reconstructed waveform is quite good (mind you, I am interested in how close the waveform is to the original). So I want to change the order and see the results. Please tell me how I can change the LPC order and do some testing.

2) Please describe how I can change the frame size for Speex, and in which direction it should be changed to get a better waveform for CD-quality audio (I know that Speex is not optimized for audio).

3) As I understand it, Speex uses LPC order 8 for the high bands (for WB/UWB). Is that true? Why doesn't it use the same order, i.e. 10?

Reply #5
What makes one 'waveform' better than another compared to the original 'waveform'? You know, Speex makes use of an error weighting that's supposed to sound nice. How do you measure 'waveform closeness'?

Changing LPC orders also requires training/designing new codebooks and accounting for them in the encoder/decoder!

Why should Speex use order-10 LPC filters for the higher bands? The spectral envelope doesn't need to be that accurate (in terms of frequency resolution) at higher frequencies, and 8 seems like an okay choice, if not a bit too high; 6 may also work fine.
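For anyone who wants to experiment with LPC order outside the codec, a standalone analysis is easier than patching Speex and its codebooks. The sketch below is a textbook autocorrelation-method LPC with the Levinson-Durbin recursion, run on a synthetic second-order signal; it is not Speex's analysis code, and the test signal and helper names are assumptions made for the example.

```python
import random

def autocorr(x, max_lag):
    """Biased autocorrelation r[0..max_lag] of the sequence x."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x)))
            for k in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the LPC normal equations.
    Returns (a, err) where a[0] = 1 and the prediction residual is
    e[n] = x[n] + a[1]*x[n-1] + ... + a[order]*x[n-order];
    err is the residual energy."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= (1.0 - k * k)
    return a, err

# Hypothetical test signal: x[n] = 1.3 x[n-1] - 0.6 x[n-2] + white noise.
random.seed(0)
x = [0.0, 0.0]
for _ in range(4000):
    x.append(1.3 * x[-1] - 0.6 * x[-2] + random.gauss(0.0, 1.0))

r = autocorr(x, 10)
errors = {order: levinson_durbin(r, order)[1] for order in (2, 6, 8, 10)}
a2, _ = levinson_durbin(r, 2)   # should come out close to [1, -1.3, 0.6]
```

The residual energy can only go down as the order goes up, but for a signal whose envelope is already captured at a low order the extra coefficients buy almost nothing, which is the argument made above for using a lower order in the high band.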

Sebi

Reply #6
The higher order I was talking about is with respect to audio signals. You might not need it for speech, but I think it will be helpful for audio.

Regarding closeness of the waveform, what I mean is a smaller error/residual signal, which I then encode with entropy coding to get a losslessly compressed bitstream.


bbye

Reply #7
So, that's what you're trying to achieve? Let me tell you why this is a bad idea:

- You need to ensure that your Speex decoder behaves deterministically on every platform you want to support, which is really hard work. Different implementations of the Speex decoder probably produce slightly different outputs. The floating-point decoder gives you the best approximation to your original, I suppose, while the integer decoder works deterministically. If you don't account for that, the whole thing won't be lossless.

- Encoding a signal as the sum of two signals (quantized + error) is usually a bad idea. Liebchen (the main guy behind MPEG-4 ALS) tried that in his diploma thesis (LTAC). I wonder why he ditched that and worked on LPAC afterwards .... ;-)

- I really see no advantage of your idea over the standard FLAC/WavPack/MPEG-4 ALS/LPAC/... approach. These also use short-term decorrelation filters like Speex and code the prediction residual losslessly (except for WavPack hybrid mode).

- Regarding "waveform closeness", it's obvious that there's a trade-off between closeness and bitrate. But it's also worth thinking about the color of the quantization noise you are willing to allow. That's what I meant by weighting. The reason why Speex sounds good is that it uses a special error weighting. But it'll also make your error/residual signal correlated, which you should account for/exploit. Simply entropy coding the residual's samples as if they came from a memoryless source won't do the job.
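The "lossy layer plus residual" scheme under discussion, and the determinism problem from the first point, can be shown in a few lines. In this sketch fake_lossy_codec is a hypothetical stand-in (plain coarse quantization), not Speex; the only thing it shares with a real codec is that the encoder must know exactly what the decoder will output.

```python
def fake_lossy_codec(samples, step=64):
    """Stand-in for a lossy encode/decode round trip (NOT Speex):
    round each integer sample to the nearest multiple of step."""
    return [step * ((s + step // 2) // step) for s in samples]

def encode_lossless(samples):
    """Encoder side: run the lossy round trip, keep the integer difference."""
    decoded = fake_lossy_codec(samples)
    residual = [s - d for s, d in zip(samples, decoded)]
    return decoded, residual

def decode_lossless(decoded, residual):
    """Decoder side: lossy output plus residual gives back the original."""
    return [d + r for d, r in zip(decoded, residual)]

pcm = [0, 100, -250, 1234, -32768, 32767, 31, 32]
decoded, residual = encode_lossless(pcm)
restored = decode_lossless(decoded, residual)   # bit-exact

# A decoder on another platform that is off by one LSB in one sample:
drifted = [decoded[0] + 1] + decoded[1:]
broken = decode_lossless(drifted, residual)     # no longer lossless
```

A single differing sample in the lossy layer is enough to break bit-exactness, which is why the floating-point versus integer decoder question matters for this approach.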


Sebi

Reply #8
Quote
The higher order I was talking about is with respect to audio signals. You might not need it for speech, but I think it will be helpful for audio.


By the way:
Higher orders also come with the risk of singularities/instabilities due to finite-precision arithmetic. LPC analysis may be very badly conditioned for large orders. IMHO LPC analysis/synthesis should stay where it's currently used: narrowband speech (or temporal noise shaping, which also uses low-order filters).

Sebi
