CELT 0.9.1 is out!, Help wanted
jmvalin
post Dec 10 2010, 19:25
Post #26


Xiph.org Speex developer


Group: Developer
Posts: 475
Joined: 21-August 02
Member No.: 3134



QUOTE (IgorC @ Dec 11 2010, 03:01) *
How does CELT scale at higher bitrates (80-128 kbps)?
HE-AAC offers a good quality/size trade-off at 48-64 kbps but already has no advantage over LC-AAC at 80 kbps, while for Vorbis it is the other way around.


CELT was originally designed as a low-delay high-bitrate codec. It's only recently that I've been able to make it listenable around 64 kb/s. I've encoded the usual test file at 80 kb/s and 128 kb/s:

comp_celt80.wav
comp_celt128.wav

Let me know what you think of those. One of my goals is to scale to very high/transparent quality as well.
jmvalin
post Dec 10 2010, 19:34
Post #27


Xiph.org Speex developer


Group: Developer
Posts: 475
Joined: 21-August 02
Member No.: 3134



QUOTE (SebastianG @ Dec 11 2010, 00:10) *
I just skimmed through some parts of the source code and noticed in vq.c the "scrambling" (exprotation1 etc). It looks like this is roughly equivalent to an all-pass filter. Since you apply this on the spectral coefficients and due to the time/frequency duality this is equivalent to a time-dependent frequency shift within a frame. I know the original motivation for this processing (reducing metallic artefacts) and it seems to be doing what it's supposed to but it sure is an odd thing to do. On the downside you smear strong tonal components over a larger spectrum which kind of defeats the purpose of an MDCT in terms of energy compaction w.r.t. tonal components (MDCT as opposed to, say, a PQMF with fewer subbands). Maybe this is why the guitar sample doesn't work that well... The encoded coefficients correspond to some kind of chirps (due to the time-dependent frequency shift) and not a windowed cosine. Have you checked the impulse response of a single one surrounded by zeros in X followed by inverse exprotation + inverse MDCT? Might be interesting to see what it looks like...


Hi Sebastian,

Actually, the spreading in vq.c is not equivalent to an all-pass filter. It's a non-linear operation (when viewed from a time-domain signal) because it actually creates frequency content that wouldn't be there otherwise. The idea is to avoid "birdie artefacts", aka musical noise. One thing to note is that the amount of spreading depends on two things: 1) the bit-rate (less spreading as the bitrate goes up), and 2) a global per-frame parameter. It's possible with the current bit-stream to use no spreading at all. In fact, there's a (still a bit simple) function that decides how much spreading to apply based on how tonal the audio is. That being said, I think the importance of the spreading has probably gone down recently since we're no longer allowing codebooks of just one pulse. In any case, you can hear the difference and decide for yourself what the effect is:

with spreading
without spreading

What do you think?
SebastianG
post Dec 10 2010, 20:40
Post #28





Group: Developer
Posts: 1317
Joined: 20-March 04
From: Göttingen (DE)
Member No.: 12875



QUOTE (jmvalin @ Dec 10 2010, 19:34) *
QUOTE (SebastianG @ Dec 11 2010, 00:10) *

I just skimmed through some parts of the source code and noticed in vq.c the "scrambling" (exprotation1 etc). It looks like this is roughly equivalent to an all-pass filter. Since you apply this on the spectral coefficients and due to the time/frequency duality this is equivalent to a time-dependent frequency shift within a frame. [...]

Actually, the spreading in vq.c is not equivalent to an all-pass filter. It's a non-linear operation (when viewed from a time-domain signal) because it actually creates frequency content that wouldn't be there otherwise.

This is not a contradiction to what I tried to say, though. Let me try to reword it. exprotation1 takes a vector and some parameters and maps it to another one (in-place). What I'm saying is that this is a convolution (ignoring the boundaries). It's shift-invariant (frequency-invariant if you will) and it's linear after all. Consider the implications. Convolving in the frequency domain = multiplication in time. And this multiplication in time shifts frequencies. The filter's "group delay" in the frequency domain equals the frequency shift in the time domain. The filter likely has a varying group delay, so the frequency shift varies over time -- possibly very quickly.
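
To make the duality argument concrete, here is a small numerical check. It is only a sketch: it uses a plain DFT as a stand-in for the MDCT, the frame `x` and the kernel `H` are made-up values, and nothing in it is taken from vq.c. It shows that circularly convolving the spectrum with a kernel gives the same result as multiplying the time-domain frame, sample by sample, by the inverse transform of that kernel.

```python
import cmath

def dft(x, sign=-1):
    """Naive O(n^2) DFT; sign=+1 gives the unnormalised inverse."""
    n = len(x)
    return [sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    n = len(X)
    return [v / n for v in dft(X, sign=+1)]

def circular_convolve(a, b):
    """Circular convolution: out[k] = sum_m a[m] * b[(k - m) mod n]."""
    n = len(a)
    return [sum(a[m] * b[(k - m) % n] for m in range(n)) for k in range(n)]

# Hypothetical data: one short time-domain frame and a short kernel that is
# applied across frequency bins (the role the rotation plays in the argument).
x = [0.5, -1.0, 2.0, 0.25, -0.75, 1.5, 0.0, -0.5]
H = [0.6, 0.3, 0.0, 0.0, 0.0, 0.0, 0.0, 0.1]

# Path 1: transform, filter along the frequency axis, transform back.
y_via_frequency = idft(circular_convolve(H, dft(x)))

# Path 2: multiply the frame by the (scaled) inverse transform of the kernel.
h = idft(H)
y_via_time = [len(x) * h[t] * x[t] for t in range(len(x))]

max_error = max(abs(a - b) for a, b in zip(y_via_frequency, y_via_time))
print("largest difference between the two paths:", max_error)
```

If `H` is a pure one-bin shift, the multiplier in path 2 is a complex exponential, i.e. a plain frequency shift. A kernel with a non-trivial "group delay" along the frequency axis turns that into a shift that varies over the frame.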

QUOTE (jmvalin @ Dec 10 2010, 19:34) *
The idea is to avoid "birdie artefacts", aka musical noise. One thing to note is that the amount of spreading depends on two things: 1) the bit-rate (less spreading as the bitrate goes up), and 2) a global per-frame parameter.
[...]
What do you think?

I get the idea. If I remember correctly, we talked about this 1-2 years ago. But I didn't know exactly what kind of rotations you apply until now. I was just surprised to see you implemented a mapping that is linear and shift-invariant. Hence, the comment. But I get the idea. I would have probably tried pseudo-random rotations first (within small groups of coefficients, like 32 or so) because I consider this to be the equivalent of dithering in a gain/shape coding approach. As such, it should be well-suited for encoding noisy parts without "birdie artefacts" and without introducing extra energy (which is hard to avoid with "normal" dithering). But this shift-invariant linear mapping seems to be doing its job as well. :)
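
For illustration, the pseudo-random variant could look roughly like the sketch below. All names, the group size of 32, the number of rotations and the seed are assumptions made for the example, and this is not what CELT does. Each step is a plane rotation between two coefficients of the same group, so the total energy is untouched, and a decoder that knows the seed can undo the rotations in reverse order.

```python
import math
import random

def rotation_plan(length, group_size, rotations_per_group, seed):
    """Pseudo-random list of (i, j, cos, sin) plane rotations, confined to groups."""
    rng = random.Random(seed)
    plan = []
    for start in range(0, length, group_size):
        end = min(start + group_size, length)
        if end - start < 2:
            continue
        for _ in range(rotations_per_group):
            i, j = rng.sample(range(start, end), 2)   # two distinct bins in the group
            theta = rng.uniform(0.0, 2.0 * math.pi)
            plan.append((i, j, math.cos(theta), math.sin(theta)))
    return plan

def apply_plan(coeffs, plan, inverse=False):
    """Apply the rotations in order, or undo them (reverse order, negated sine)."""
    out = list(coeffs)
    steps = reversed(plan) if inverse else plan
    for i, j, c, s in steps:
        if inverse:
            s = -s
        a, b = out[i], out[j]
        out[i] = c * a - s * b
        out[j] = s * a + c * b
    return out

def energy(v):
    return sum(t * t for t in v)

# Hypothetical band: a single pulse in the first group of 32 coefficients.
band = [0.0] * 64
band[5] = 1.0

plan = rotation_plan(len(band), group_size=32, rotations_per_group=64, seed=2010)
scrambled = apply_plan(band, plan)
restored = apply_plan(scrambled, plan, inverse=True)

print("energy before/after:", energy(band), energy(scrambled))
```

Because the rotations never cross a group boundary, the second group (all zeros here) stays empty, which is the sense in which no extra energy is introduced.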

This post has been edited by SebastianG: Dec 10 2010, 20:46
jmvalin
post Dec 10 2010, 20:48
Post #29


Xiph.org Speex developer


Group: Developer
Posts: 475
Joined: 21-August 02
Member No.: 3134



QUOTE (SebastianG @ Dec 11 2010, 05:40) *
I get the idea. If I remember correctly, we talked about this 1-2 years ago. But I didn't know exactly what kind of rotations you apply until now. I was just surprised to see you implemented a mapping that is linear and shift-invariant. Hence, the comment. But I get the idea. I would have probably tried pseudo-random rotations first (within small groups of coefficients, like 32 or so) because I consider this to be the equivalent of dithering in a gain/shape coding approach. As such, it should be well-suited to encode noisy parts without "birdie artefacts" and without introducing extra energy (which is hard to avoid with "normal" dithering). But this shift-invariant linear mapping seems to be doing its job as well. :)


Actually, the "random rotation" approach was the first thing I tried and it didn't work well, at least not unless you use a very large number of rotations (leading to high complexity). What I'm using now is rotations by angles that are close to 90 degrees. In fact, the less spreading I want, the closer I am to 90 degrees (rather than 0 degrees). It's not intuitive, but that's what I found to work best so far.
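
A rough way to see why angles near 90 degrees mean little spreading is sketched below. The sweep structure (one forward and one backward pass of 2-D rotations between neighbouring coefficients) is an assumption for this illustration and is not copied from vq.c; the function names and the test vector are made up. At exactly 90 degrees each step only swaps two neighbours with a sign flip, so a lone pulse stays a lone pulse. As the angle moves away from 90 degrees, part of the pulse leaks into other bins.

```python
import math

def rotate_sweeps(x, theta_degrees, stride=1):
    """Forward then backward sweep of 2-D rotations over neighbouring coefficients."""
    c = math.cos(math.radians(theta_degrees))
    s = math.sin(math.radians(theta_degrees))
    y = list(x)
    n = len(y)
    forward = range(n - stride)
    backward = range(n - 2 * stride - 1, -1, -1)
    for i in list(forward) + list(backward):
        a, b = y[i], y[i + stride]
        y[i] = c * a - s * b             # each step is an orthogonal 2x2 rotation,
        y[i + stride] = c * b + s * a    # so the energy of the vector is preserved
    return y

def peak_energy_fraction(y):
    """Share of the total energy that sits in the single largest coefficient."""
    total = sum(v * v for v in y)
    return max(v * v for v in y) / total

# Hypothetical input: one pulse in an 8-coefficient band.
pulse = [0.0] * 8
pulse[4] = 1.0

for angle in (90, 85, 80):
    spread = rotate_sweeps(pulse, angle)
    print(angle, "degrees -> peak energy fraction", round(peak_energy_fraction(spread), 3))
```

A peak fraction close to 1 means almost no spreading. The further the angle is from 90 degrees, the smaller the fraction becomes in this toy setup.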
jmvalin
post Dec 10 2010, 23:36
Post #30


Xiph.org Speex developer


Group: Developer
Posts: 475
Joined: 21-August 02
Member No.: 3134



OK, so I've worked a bit on improving some of the samples like the Spanish guitar and the blocks. The problem with the blocks is that I can't actually hear the artefact, but given your description I still think I know what it is. Can you give me your opinion on:

fileZ.wav
fileAA.wav
fileAD.wav (added)

Let me know if it fixes the problem for you without introducing other artefacts. Oh, and if you want to save download time, I also have the flac version (just change the extension).

This post has been edited by jmvalin: Dec 11 2010, 07:55
IgorC
post Dec 13 2010, 22:01
Post #31





Group: Members
Posts: 1540
Joined: 3-January 05
From: ARG/RUS
Member No.: 18803



Jean-Marc,

What do you plan for testing CELT? If you aren't in a hurry, I will have some spare time after the 21st of December and would be glad to do many more blind tests.
Are you also considering a larger test that would involve more people and/or more samples, or releasing some sort of preview version of the codec?
I guess that if binaries become available later, more people will probably look into it and report.

Thank you.
jmvalin
post Dec 13 2010, 22:25
Post #32


Xiph.org Speex developer


Group: Developer
Posts: 475
Joined: 21-August 02
Member No.: 3134



QUOTE (IgorC @ Dec 13 2010, 16:01) *
What do you plan for testing CELT? If you aren't in a hurry, I will have some spare time after the 21st of December and would be glad to do many more blind tests.
Are you also considering a larger test that would involve more people and/or more samples, or releasing some sort of preview version of the codec?
I guess that if binaries become available later, more people will probably look into it and report.


The plan right now is to do a "tentative freeze" (i.e. frozen unless something bad comes up) of the bit-stream in early January and then do more formal testing with many listeners. Until then, I'm tuning as much as I can based on informal listening by people like you. Just so you know, your help so far has been very useful and has helped me improve many aspects of CELT. You can just compare the latest sample to fileB.wav (last released version) to see just how much progress was made.
IgorC
post Dec 24 2010, 00:34
Post #33





Group: Members
Posts: 1540
Joined: 3-January 05
From: ARG/RUS
Member No.: 18803



Jean-Marc,

I see that version 0.10.0 has been released. Let me know if you want to resume listening tests.
I now have a new pair of Sennheiser HD650 alongside my previous, lovely HD447 (I'm planning to get a decent amp in January). I'm still adapting to the new headphones, but it could be interesting to see the results.

I have read Monty's article about CELT: http://people.xiph.org/~xiphmont/demo/celt/demo.html
Glad to see the high performance. :)

This post has been edited by IgorC: Dec 24 2010, 01:25
NullC
post Dec 24 2010, 02:02
Post #34





Group: Developer
Posts: 200
Joined: 8-July 03
Member No.: 7653



QUOTE (IgorC @ Dec 23 2010, 16:34) *
Jean-Marc,

I see that version 0.10.0 has been released. Let me know if you want to resume listening tests.
I now have a new pair of Sennheiser HD650 alongside my previous, lovely HD447 (I'm planning to get a decent amp in January). I'm still adapting to the new headphones, but it could be interesting to see the results.

I have read Monty's article about CELT: http://people.xiph.org/~xiphmont/demo/celt/demo.html
Glad to see the high performance. :)



It would be very helpful if you could listen to
http://jmvalin.ca/misc_stuff/old_layout.flac
http://jmvalin.ca/misc_stuff/new_layout2.flac
http://jmvalin.ca/misc_stuff/new_layout5.flac

We're working on using a slightly different band structure in order to scale better to very high bitrates, but we're concerned about harming the quality at low rates.

This post has been edited by NullC: Dec 24 2010, 02:02
IgorC
post Dec 24 2010, 19:35
Post #35





Group: Members
Posts: 1540
Joined: 3-January 05
From: ARG/RUS
Member No.: 18803



The differences are small.
The results with HD650



ABC/HR logs+comments h*tp://www.mediafire.com/?o2x0saix4ocxbdt

I find that the guitar samples are boring me, as their artifacts are more or less the same and my listening profile doesn't really match acoustic guitar. However, there were killer acoustic guitar samples that could be good to try: http://www.hydrogenaudio.org/forums/index....ost&id=5462. Can you replace those guitar samples with them or with something else?
You can ask me about how drums, percussion, rock/metal drive guitars (rhythm and solo), and some classical instruments such as the viol/violin family and trumpets sound.

Some previously tested samples:
Linchpin http://www.hydrogenaudio.org/forums/index....st&p=682220
Girl http://www.hydrogenaudio.org/forums/index....st&p=683378
Fatboy?
http://ff123.net/samples/Waiting.flac
Creuza http://www.hydrogenaudio.org/forums/index....ost&id=2069

This post has been edited by IgorC: Dec 24 2010, 20:21
C.R.Helmrich
post Dec 25 2010, 21:23
Post #36





Group: Developer
Posts: 686
Joined: 6-December 08
From: Erlangen Germany
Member No.: 64012



QUOTE (IgorC @ Dec 24 2010, 20:35) *
Fatboy?

http://www.hydrogenaudio.org/forums/index....showtopic=19682

Chris


--------------------
If I don't reply to your reply, it means I agree with you.
IgorC
post Dec 26 2010, 04:32
Post #37





Group: Members
Posts: 1540
Joined: 3-January 05
From: ARG/RUS
Member No.: 18803



Hi, Chris.

Actually, the question was about the possibility of testing the sample.
Anyway, thank you for pointing it out.

IgorC
post Dec 26 2010, 09:39
Post #38





Group: Members
Posts: 1540
Joined: 3-January 05
From: ARG/RUS
Member No.: 18803



I think it may be worth redoing the last test, or doing another one with other samples. What do you think?
I've experimented with the new headphones. Yes, this wasn't the time for experiments, but I found a new position (angle and distance) for the headphones that gives a balanced and more natural sound (to my taste). Previously the headphones were positioned a little bit tight and high frequencies were somewhat attenuated.
The old results shouldn't be wrong, but they may differ. In fact, it can be useful to run a new test with the new position, because listeners wear headphones or place speakers in different ways.
This time I will perform full ABX (not just ABC/HR) between old vs new 2, new 2 vs new 5, and new 5 vs old layouts. The results will be here on Tuesday.

This post has been edited by IgorC: Dec 26 2010, 09:52
IgorC
post Dec 29 2010, 23:26
Post #39





Group: Members
Posts: 1540
Joined: 3-January 05
From: ARG/RUS
Member No.: 18803



I've tried several times with the HD650 and HD447. The samples aren't transparent, but there is no statistical difference between them.
If you need me to test something more, let me know.
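
For readers wondering what "no statistical difference" means for an ABX run: the usual check is a one-sided binomial test against pure guessing. The sketch below shows that arithmetic. The trial counts are hypothetical examples and are not taken from the logs in this thread.

```python
from math import comb

def abx_p_value(correct, trials):
    """Probability of getting at least `correct` answers right out of `trials`
    by guessing alone (each trial is a fair coin flip)."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

# Hypothetical sessions: (correct answers, total trials).
for correct, trials in [(12, 16), (9, 16), (8, 16)]:
    print(f"{correct}/{trials}: p = {abx_p_value(correct, trials):.4f}")
```

A small p-value (0.05 is a commonly used threshold) suggests the listener really heard a difference. A score near half the trials gives a large p-value, which is what "no statistical difference" refers to.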
IgorC
post Dec 30 2010, 20:55
Post #40





Group: Members
Posts: 1540
Joined: 3-January 05
From: ARG/RUS
Member No.: 18803



Apparently CELT scales well to higher bitrates.

CELT 0.10.0 vs Aotuv at 96 kbps:
http://downloads.xiph.org/audio/demo/celt1...-0.10.0-96.flac
headphones HD650


ABX, ABC/HR + comments h*tp://www.mediafire.com/?tuc27jkhahjwesy

This post has been edited by IgorC: Dec 30 2010, 21:15
jmvalin
post Dec 31 2010, 05:34
Post #41


Xiph.org Speex developer


Group: Developer
Posts: 475
Joined: 21-August 02
Member No.: 3134



QUOTE (IgorC @ Dec 30 2010, 14:55) *
Apparently CELT scales well to higher bitrates.

CELT 0.10.0 vs Aotuv at 96 kbps:
http://downloads.xiph.org/audio/demo/celt1...-0.10.0-96.flac
headphones HD650


ABX, ABC/HR + comments h*tp://www.mediafire.com/?tuc27jkhahjwesy


Thanks very much for these results. Testing higher rates is something I wanted to do for a while. What you're reporting is very good news. If you have time, it'd be interesting to also compare with the appropriate AAC profile (not sure which profile is best at that bitrate).

Also, do you have a way to actually build CELT from the source code? If so, it would be interesting if you can play around with it and see on what kind of files we should be trying to improve quality.
IgorC
post Dec 31 2010, 06:00
Post #42





Group: Members
Posts: 1540
Joined: 3-January 05
From: ARG/RUS
Member No.: 18803



QUOTE (jmvalin @ Dec 31 2010, 01:34) *
If you have time, it'd be interesting to also compare with the appropriate AAC profile (not sure which profile is best at that bitrate).

LC-AAC is the profile most often recommended at 96 kbps by developers and listeners. In my previous tests I always preferred the Apple LC-AAC encoder: http://www.hydrogenaudio.org/forums/index....&pid=657232
http://www.hydrogenaudio.org/forums/index....c=66949&hl=
But the Apple encoder now has a (reported) bug at 80-96 kbps that affects quality.
I will try to find an older version that is free of the bug and perform the test. iTunes 9.0.0.70 did not have this bug.

QUOTE (jmvalin @ Dec 31 2010, 01:34) *
Also, do you have a way to actually build CELT from the source code? If so, it would be interesting if you can play around with it and see on what kind of files we should be trying to improve quality.

The only code I compile is for microcontrollers (fortunately that will change next year). If anybody can compile it, I will play with it.


P.S. I found iTunes 9.0.0.70. I will post the results later.

This post has been edited by IgorC: Dec 31 2010, 06:14
IgorC
post Dec 31 2010, 08:43
Post #43





Group: Members
Posts: 1540
Joined: 3-January 05
From: ARG/RUS
Member No.: 18803



96 kbps:
LC-AAC info: iTunes 9.0.0.70, 96 kbps, VBR, 44100 Hz (48000 Hz isn't available at this bitrate).


ABX logs h*tp://www.mediafire.com/?uwu1lqc7qd4vn7l

The Apple encoder produced a 100 kbps file, but that's OK since it's a VBR test. Secondly, iTunes AAC has somewhat inferior settings, while qtaacenc has access to higher-quality options: --high and --tvbr (True VBR). But the qtaacenc encoder doesn't work with the older version that I downloaded to avoid the bug.

True VBR would produce the same quality at a slightly lower bitrate (2-3 kbps), so it's a fair comparison.

This post has been edited by IgorC: Dec 31 2010, 08:53
IgorC
post Dec 31 2010, 18:44
Post #44





Group: Members
Posts: 1540
Joined: 3-January 05
From: ARG/RUS
Member No.: 18803



Maybe CELT should compete not only with LD-AAC but also with its enhanced extension (ELD-AAC). ELD-AAC has low-delay SBR and a recently improved parametric stereo derived from the MPEG Surround standard. This makes ELD-AAC highly competitive at low bitrates.

In my opinion it would be great to see CELT on par with HE-AAC at low bitrates and with LC-AAC at high bitrates. CELT is already comparable with these codecs.

Plus, the new audio coding standard USAC is nearly complete. It brings improved tools like eSBR, MPEG Surround, a new entropy coder and efficient speech coding. So it's logical to think that all those tools will be optimized for inclusion in a new revision of ELD.

Anyway, I haven't seen any publicly available ELD encoder. That's where CELT comes in.

This post has been edited by IgorC: Dec 31 2010, 18:49
rt87
post Jan 1 2011, 01:59
Post #45





Group: Members
Posts: 89
Joined: 28-October 03
Member No.: 9505



I wonder if someone can do 48kbps tests comparing with aotuv 5.7 and HE-AAC?


--------------------
Sorry for my English.
IgorC
post Jan 1 2011, 14:32
Post #46





Group: Members
Posts: 1540
Joined: 3-January 05
From: ARG/RUS
Member No.: 18803



You can get an idea from the results of the 64 kbps test (see post #22).
NullC
post Jan 26 2011, 02:09
Post #47





Group: Developer
Posts: 200
Joined: 8-July 03
Member No.: 7653




Anyone following along with the CELT technical development may be interested in this update I provided to the IETF codec working group today:

http://www.ietf.org/mail-archive/web/codec...t/msg02109.html

jmvalin
post Jan 28 2011, 00:36
Post #48


Xiph.org Speex developer


Group: Developer
Posts: 475
Joined: 21-August 02
Member No.: 3134



OK, so we've been a bit quiet lately but working hard towards freezing the bit-stream. Before we do that, I'd be interested in some feedback on the latest version. Can you compare the following two files:
IgorC
post Jan 28 2011, 14:48
Post #49





Group: Members
Posts: 1540
Joined: 3-January 05
From: ARG/RUS
Member No.: 18803



The artifacts in both files are very similar in nature. No matter how many times I tried, and with whatever technique (ABX or ABXY), I couldn't catch any difference. Both encoded files, however, clearly show distortion compared to the source (comp.wav).
jmvalin
post Jan 29 2011, 06:24
Post #50


Xiph.org Speex developer


Group: Developer
Posts: 475
Joined: 21-August 02
Member No.: 3134



QUOTE (IgorC @ Jan 28 2011, 08:48) *
The artifacts in both files are very similar in nature. No matter how many times I tried, and with whatever technique (ABX or ABXY), I couldn't catch any difference. Both encoded files, however, clearly show distortion compared to the source (comp.wav).


Thanks. That's actually good news. I mainly wanted to make sure that we didn't actually break anything.
