opus experimental versions as part of opus-tools, and other questions, just curious
raygrote
post Feb 3 2013, 01:32
Post #1





Group: Members
Posts: 48
Joined: 4-April 08
Member No.: 52552



Hi all,
I've been playing with Opus for a bit and really like it, but I do have some signals that trip it up as high as 128 kbps, and even higher sometimes. I even created a very simple sample which tripped it up at 160 and above, but I will need to do more testing to make sure the codec was to blame. If that is indeed the problem, I will post the samples here along with ABX results so hopefully it can be improved. In fact, I am probably violating a rule by saying I can hear these without posting evidence, but again, I have to do further testing to make sure this is legitimate.
But first I was going to run these few samples past the experimental builds which seem to try to address these issues. But I don't know where to obtain them. I once found one, but I can't remember the version or where I got it. I would be looking for the opusenc utility like the one in the command-line opus-tools. I really hope these are being made somewhat regularly, because I have no clue how to work with sources.
Also, if these builds really do improve quality, and I think they will, will they begin folding back into the main release like AoTuV did with the official Vorbis years ago?
And another interesting thought. Could the research that went into developing Opus be used to refine a codec like Vorbis which is only for storage? Maybe some new coding strategies based on Celt but something which works well in high delay situations. I know Opus is great for storage as is, and was intended to be so, but if low delay codecs are at so much of a disadvantage, then it would make sense to wonder what something like Opus could do without that restriction. It's probably impractical at least for now but it's an interesting thought.
Thanks for your answers and have a good day!
jmvalin
post Feb 3 2013, 09:00
Post #2


Xiph.org Speex developer


Group: Developer
Posts: 479
Joined: 21-August 02
Member No.: 3134



The version you want to try is 1.1-alpha, which was released in December. That version includes the improvements you're referring to.
raygrote
post Feb 3 2013, 15:28
Post #3








Hi,
This may be a stupid question, but how would I use that version? I know nothing about compiling opusenc or anything else for that matter.
The command line Opus-tools only goes up to libopus 1.0.2 in opusenc.
If I get alpha 1.1 to work and I still have problem samples I will put them in a new topic.
Thanks.
Dynamic
post Feb 3 2013, 19:27
Post #4





Group: Members
Posts: 805
Joined: 17-September 06
Member No.: 35307



QUOTE (raygrote @ Feb 3 2013, 14:28) *
This may be a stupid question, but how would I use that version? I know nothing about compiling opusenc or anything else for that matter.
The command line Opus-tools only goes up to libopus 1.0.2 in opusenc.
If I get alpha 1.1 to work and I still have problem samples I will put them in a new topic.


This post links to compiled alpha version and testing is welcomed:
http://www.hydrogenaudio.org/forums/index....st&p=817976
IgorC
post Feb 3 2013, 19:40
Post #5





Group: Members
Posts: 1556
Joined: 3-January 05
From: ARG/RUS
Member No.: 18803



https://ftp.mozilla.org/pub/mozilla.org/opu...alpha-win32.zip

This post has been edited by IgorC: Feb 3 2013, 19:41
raygrote
post Feb 3 2013, 21:56
Post #6








Hi,
Thanks for these links. The one in the opus testing topic is working just fine. I'll go reporting there.
jmvalin
post Feb 3 2013, 23:20
Post #7





QUOTE (raygrote @ Feb 3 2013, 15:56) *
Hi,
Thanks for these links. The one in the opus testing topic is working just fine. I'll go reporting there.


Please use the link IgorC provided. I'm not sure how the other one was built.
raygrote
post Feb 4 2013, 22:52
Post #8








Okay, will do. It seems that the alpha works better for the tonal samples, and uses a more unconstrained VBR? On the samples I tested which had problems with 1.0.2, the 1.1 alpha did much better, with a substantial increase in bitrate. The test I did was targeted at 64 kbps, and 1.0.2 gave a bitrate of 65, while 1.1 gave one of around 86, I think. If I forced CBR on both versions, 1.0.2 had more distortion in the low end; 1.1 fixed that but introduced a bit more in the mids and highs, which I find more agreeable, since it still beats the griddy sound of SBR in my opinion. I have yet to find a sample that really trips the alpha at a reasonable bitrate like the previous version did, but I will be sure to let you know if I do.
jmvalin
post Feb 5 2013, 00:16
Post #9





QUOTE (raygrote @ Feb 4 2013, 16:52) *
Okay, will do. It seems that the alpha works better for the tonal samples, and uses a more unconstrained VBR? On the samples I tested which had problems with 1.0.2, the 1.1 alpha did much better, with a substantial increase in bitrate. The test I did was targeted at 64 kbps, and 1.0.2 gave a bitrate of 65, while 1.1 gave one of around 86, I think. If I forced CBR on both versions, 1.0.2 had more distortion in the low end; 1.1 fixed that but introduced a bit more in the mids and highs, which I find more agreeable, since it still beats the griddy sound of SBR in my opinion. I have yet to find a sample that really trips the alpha at a reasonable bitrate like the previous version did, but I will be sure to let you know if I do.


Keep in mind that on average, 1.1 uses the same bit-rate as 1.0.2. On hard samples it'll use a higher rate, while on easier samples, it'll go below 64 kb/s. Also, if you don't want these large variations, you can always enable "constrained VBR", which is about equivalent to what MPEG codecs call CBR (they use a bit reservoir to pretend they're CBR).
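As a rough sketch of the bit-reservoir mechanism jmvalin describes: a nominally CBR MPEG-style stream lets hard frames borrow bits that easier frames saved up, so instantaneous frame sizes vary while the long-term rate stays capped. All numbers and the function below are illustrative, not libopus code.

```python
def reservoir_cbr(frame_demands, frame_budget, reservoir_cap):
    """Return bits actually spent per frame under a toy bit reservoir.

    frame_demands: bits each frame *wants* (hard frames want more)
    frame_budget:  nominal bits per frame (the advertised CBR rate)
    reservoir_cap: maximum number of unspent bits that may accumulate
    """
    reservoir = 0
    spent = []
    for demand in frame_demands:
        available = frame_budget + reservoir   # budget plus saved-up bits
        use = min(demand, available)           # hard frames draw down the reservoir
        reservoir = min(reservoir_cap, available - use)
        spent.append(use)
    return spent

demands = [100, 100, 400, 100, 100]            # one "hard" frame in the middle
spent = reservoir_cbr(demands, frame_budget=160, reservoir_cap=200)
print(spent)                                   # [100, 100, 280, 100, 100]
print(sum(spent) <= 160 * len(demands))        # True: never exceeds the nominal rate
```

The hard frame gets 280 bits instead of 160, yet the total stays within the advertised rate, which is why such streams are "CBR" only on average.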
IgorC
post Feb 10 2013, 01:59
Post #10








Wrong topic. Please delete my message.

This post has been edited by IgorC: Feb 10 2013, 02:00
Gainless
post May 7 2013, 17:25
Post #11





Group: Members
Posts: 169
Joined: 28-October 11
Member No.: 94764



QUOTE (raygrote @ Feb 3 2013, 02:32) *
And another interesting thought. Could the research that went into developing Opus be used to refine a codec like Vorbis which is only for storage? Maybe some new coding strategies based on Celt but something which works well in high delay situations. I know Opus is great for storage as is, and was intended to be so, but if low delay codecs are at so much of a disadvantage, then it would make sense to wonder what something like Opus could do without that restriction. It's probably impractical at least for now but it's an interesting thought.

I think that would be interesting to know, too. Would Opus actually be more efficient with the option of bigger frame sizes, or is it just a trade-off between handling pre-echo and tonality? Jmvalin, you said that the specialisation in small frame sizes runs too deep to simply carry the techniques over to something like a theoretical successor of Vorbis. Does that mean there's not much room for a new high-delay codec to improve on Vorbis with the current technology?

This post has been edited by Gainless: May 7 2013, 18:04
NullC
post May 7 2013, 18:50
Post #12





Group: Developer
Posts: 200
Joined: 8-July 03
Member No.: 7653



QUOTE (Gainless @ May 7 2013, 09:25) *
I think that would be interesting to know, too. Would Opus actually be more efficient with the option of bigger frame sizes, or is it just a trade-off between handling pre-echo and tonality? Jmvalin, you said that the specialisation in small frame sizes runs too deep to simply carry the techniques over to something like a theoretical successor of Vorbis. Does that mean there's not much room for a new high-delay codec to improve on Vorbis with the current technology?
I believe there is a fair amount of room for improvement in the delay insensitive space— but it requires more and different techniques, potentially a larger computational budget, a larger memory budget, etc. In general a larger frame size means you can afford to send more model parameters, more signaling decisions— but then you need to do something useful with those parameters.

Monty had made some promising-looking progress with sinusoidal modeling but hasn't been working on it lately. JM invented a new kind of predicted pyramid vector quantization as part of our work on Daala which might someday find its way back into an audio codec (the Daala work may also produce a number of other tools that might be useful for audio in the future, e.g. we now have a new two-dimensional exponentially weighted moving average that looks like it works better than the inter energy predictor we use in Opus's MDCT mode).
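For readers curious about the pyramid vector quantization NullC mentions: CELT-style PVQ codes a band's shape as an integer vector whose absolute values sum to a fixed pulse count K, found by a greedy pulse search. A simplified floating-point sketch (illustrative only; real CELT works in fixed point and adds spreading, band splitting, etc.):

```python
import numpy as np

def pvq_quantize(x, K):
    """Quantize the direction of x onto the pyramid sum(|y_i|) == K."""
    x = np.asarray(x, dtype=float)
    ax = np.abs(x)
    s = ax.sum()
    if s == 0:                                  # degenerate input: put all
        y = np.zeros(len(x), dtype=int)         # pulses in the first slot
        y[0] = K
        return y
    # coarse projection onto the pyramid, rounded down
    y = np.floor(ax * (K / s)).astype(int)
    # greedily place remaining pulses where they best improve the match
    while y.sum() < K:
        xy = (ax * y).sum()
        yy = (y * y).sum()
        # normalized-correlation gain from adding one pulse in each dim
        scores = (xy + ax) ** 2 / (yy + 2 * y + 1)
        y[int(np.argmax(scores))] += 1
    return np.sign(x).astype(int) * y           # restore the signs

x = np.array([0.6, 0.8])
y = pvq_quantize(x, K=5)
print(y)    # [2 3]: five pulses approximating the direction of x
```

The decoder only needs to enumerate vectors on the pyramid, which is why the codebook needs no training or storage.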
Dynamic
post May 7 2013, 19:44
Post #13








The quality versus latency tradeoff at 64 kbps for 20ms frames seems to indicate that 10ms frames require about 10% higher bitrate, 5ms frames need about 32.5% higher bitrate and 2.5ms frames need 75% higher bitrate for same PEAQ score. This limited data set makes it look as though we're approaching an asymptote of marginal gains and the bitrate reduction is going to be pretty small for typical signals if we move to 40ms. However, it's plausible that the long-frame efficiency would be greater for highly tonal signals or signals with multiple tones at the same time. Is the advantage worth it given the typical nature of music?

I expect some of the people like Jean-Marc and NullC can add experience to confirm or contradict this.

A lot of the big gains over Vorbis come in other areas (e.g. the way it encodes the explicit band energy is more efficient and produces better results than the noise normalization version discovered late in Vorbis development).
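A toy illustration of the explicit band-energy coding mentioned above: split the spectrum into bands, record each band's energy, and normalize the band to a unit-norm "shape" that gets coded separately. The band edges here are arbitrary and a real codec quantizes both parts; this only shows the decomposition itself is lossless:

```python
import numpy as np

def split_energy_shape(spectrum, band_edges):
    """Split a spectrum into per-band (energy, unit-norm shape) pairs."""
    out = []
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        band = spectrum[lo:hi]
        energy = np.sqrt((band ** 2).sum())       # L2 energy of the band
        shape = band / energy if energy > 0 else band
        out.append((energy, shape))
    return out

def reconstruct(pairs):
    """Invert the split: scale each shape back by its energy."""
    return np.concatenate([e * s for e, s in pairs])

spec = np.array([3.0, 4.0, 0.0, 5.0])
pairs = split_energy_shape(spec, [0, 2, 4])       # two hypothetical bands
print(round(pairs[0][0], 3))                      # 5.0: energy of [3, 4]
print(np.allclose(reconstruct(pairs), spec))      # True: decomposition is exact
```

Coding the energy envelope explicitly and well is what keeps the per-band spectral detail from collapsing at low rates.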

Right now, there's scope for an offline mode in Opus, where music detection when not live streaming can be modified, for example, to check more carefully the few seconds before the existing detector activates, perhaps with a longer FFT to detect lower frequency tones that would get missed by the short lookahead of the current, live mode. Similarly with bandwidth detection, there's scope to make it more consistent offline, with greater lookahead and the option to look backwards. This could offer great improvements for mixed material at lowish bitrates (around 24-48 kbps), such as audiobooks and podcasts with incidental music. It's also feasible to set two different bitrates, such as 24 kbps mono SILK for speech and 64 kbps VBR stereo CELT when music is detected, to give highly bitrate-efficient, high-quality podcasts, with commensurate savings in server bandwidth costs for the low-budget, often amateur podcast creators who have to pay the likes of libsyn for their many gigabytes of monthly use and beg for donations from listeners. Some of this could be built into an audio editor/DAW too (where certain tracks could be labelled as speech or music), allowing the person compiling the podcast to signal the required mode when rendering, but automated and accurate music detection would let people use the editor/DAW of their choice and just send the result to the encoder without worrying.
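The two-bitrate podcast idea is easy to put numbers on. With hypothetical figures (50 minutes of speech at 24 kb/s, 10 minutes of music at 64 kb/s, versus 64 kb/s throughout), the saving comes to over half the bandwidth:

```python
def mixed_size_kbit(speech_s, music_s, speech_kbps=24, music_kbps=64):
    """Total stream size in kilobits for a speech/music segmented encode."""
    return speech_s * speech_kbps + music_s * music_kbps

speech, music = 50 * 60, 10 * 60          # 50 min talk, 10 min music, in seconds
mixed = mixed_size_kbit(speech, music)    # segmented encode
flat = (speech + music) * 64              # 64 kb/s throughout
print(round(1 - mixed / flat, 3))         # 0.521: ~52% of bandwidth saved
```

The exact saving obviously depends on the speech/music split of the material, but for talk-heavy podcasts it is large.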

There are other ideas for a few years down the line that are being modelled. Some of the Xiph Ghost ideas, like separating tonal from transient parts of the signal and encoding them separately in their most efficient manner, require a lot of processing from today's CPUs, but might be viable in a few years if they can be made to work well enough, and might be a better use of developers' time, as NullC said (Monty's sinusoidal modelling).

I'm sure it's the sort of thing that the Opus collaborators consider from time to time. I think they do an admirable job of ensuring that Opus came to fruition first of all and that it continues to be tuned for optimum performance, so I'd be reluctant to get them to shift their priorities, which seem very well placed to me. It has a real chance of making a wider impact than Vorbis because it also addresses the unmet needs for interactive uses of all kinds and covers the bulk of the useful internet-packetized bitrate range about as well as or better than any other codec to date. It can gain a foothold in numerous potential killer applications and bring compatibility to many devices through that.

This post has been edited by Dynamic: May 7 2013, 19:52
jmvalin
post May 8 2013, 07:34
Post #14





QUOTE (Dynamic @ May 7 2013, 14:44) *
The quality versus latency tradeoff at 64 kbps for 20ms frames seems to indicate that 10ms frames require about 10% higher bitrate, 5ms frames need about 32.5% higher bitrate and 2.5ms frames need 75% higher bitrate for same PEAQ score. This limited data set makes it look as though we're approaching an asymptote of marginal gains and the bitrate reduction is going to be pretty small for typical signals if we move to 40ms. However, it's plausible that the long-frame efficiency would be greater for highly tonal signals or signals with multiple tones at the same time. Is the advantage worth it given the typical nature of music?

I expect some of the people like Jean-Marc and NullC can add experience to confirm or contradict this.


Using 40 ms CELT frames would not be useful except at rates below 48 kb/s -- they just add too many temporal artefacts (Vorbis still uses them only below 40 kb/s or so). What we *could* have done to improve transient performance is simply add support for full-overlap windows on 20 ms frames. The problem is that it would have made the code more complicated and would have been bad for latency (remember that Opus is designed for interactive applications).
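For reference, a full-overlap MDCT window must satisfy the Princen-Bradley power-complementarity condition, w[n]^2 + w[n+N]^2 = 1 over the overlapping halves, so that overlap-add reconstruction is lossless. A quick numerical check with the sine window, one standard choice satisfying it (window length is illustrative):

```python
import math

def sine_window(two_n):
    """Sine MDCT window of total length two_n (two overlapping halves of N)."""
    return [math.sin(math.pi / two_n * (n + 0.5)) for n in range(two_n)]

N = 256
w = sine_window(2 * N)
# Princen-Bradley: squares of the two overlapping halves must sum to 1
ok = all(abs(w[n] ** 2 + w[n + N] ** 2 - 1.0) < 1e-12 for n in range(N))
print(ok)  # True: perfect-reconstruction condition holds
```

CELT instead uses low-overlap windows (flat in the middle, short tapers at the edges) precisely to keep the algorithmic delay down, at some cost in frequency resolution.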

QUOTE (Dynamic @ May 7 2013, 14:44) *
Right now, there's scope for an offline mode in Opus, where music detection when not live streaming can be modified, for example, to check more carefully the few seconds before the existing detector activates, perhaps with a longer FFT to detect lower frequency tones that would get missed by the short lookahead of the current, live mode.


This is already implemented in git, though opus-tools does not support it out-of-the-box yet (it's an experimental branch).

QUOTE (Dynamic @ May 7 2013, 14:44) *
There are other ideas for a few years down the line that are being modelled. Some of the Xiph Ghost ideas, like separating tonal from transient parts of the signal and encoding them separately in their most efficient manner, require a lot of processing from today's CPUs, but might be viable in a few years if they can be made to work well enough, and might be a better use of developers time, as NullC said (Monty's sinusoidal modelling).


The problems with Ghost go far beyond "not enough CPU". Separating tones from noise and transients is *really* hard -- sometimes you're not even sure what a tone really means! And you have to do a really good job for such an algorithm to work at all. That's very different from an algorithm like CELT, where you can make an incredibly dumb encoder that still gives pretty good quality (and easily outperforms MP3).
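A naive peak-picking experiment illustrates the difficulty: declaring the loudest FFT bins "tonal" and the residual "noise" works on a bin-aligned sine, but falls apart as soon as the tone sits between bins (never mind vibrato or closely spaced partials). This is purely illustrative and is not the Ghost algorithm:

```python
import numpy as np

def naive_tone_split(x, n_peaks=1):
    """Crude tonal/residual split: keep the n_peaks loudest FFT bins."""
    X = np.fft.rfft(x)
    mag = np.abs(X)
    peaks = np.argsort(mag)[-n_peaks:]      # loudest bins, declared "tonal"
    tonal = np.zeros_like(X)
    tonal[peaks] = X[peaks]
    tone = np.fft.irfft(tonal, n=len(x))
    return tone, x - tone                   # (tonal part, residual "noise")

n = 1024
t = np.arange(n)
clean = np.sin(2 * np.pi * 64 * t / n)      # sine exactly on bin 64
tone, resid = naive_tone_split(clean)
print(np.abs(resid).max() < 1e-9)           # True: bin-aligned case is near-perfect

off = np.sin(2 * np.pi * 64.5 * t / n)      # same sine, half a bin off
tone2, resid2 = naive_tone_split(off)
print(np.abs(resid2).max() > 0.1)           # True: energy smears; the split fails
```

Real signals are almost always the second case, which is part of why a tone/noise/transient decomposition has to be very good before it helps at all.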
Gainless
post May 8 2013, 11:02
Post #15








QUOTE (jmvalin @ May 8 2013, 08:34) *
QUOTE (Dynamic @ May 7 2013, 14:44) *
Right now, there's scope for an offline mode in Opus, where music detection when not live streaming can be modified, for example, to check more carefully the few seconds before the existing detector activates, perhaps with a longer FFT to detect lower frequency tones that would get missed by the short lookahead of the current, live mode.


This is already implemented in git, though opus-tools does not support it out-of-the-box yet (it's an experimental branch).

Is it already implemented in the recently posted "new test build"?
Thanks for all the answers on the frame sizes, btw. I didn't really get the point about lower frequencies being detected when using more lookahead, though, or what it's supposed to do when there's no full overlap between the frames anyway.
Can someone explain (or give a link)?

This post has been edited by Gainless: May 8 2013, 11:02
Dynamic
post May 9 2013, 11:38
Post #16








As nobody more authoritative has replied... No, I don't think it's implemented in the test builds we've seen here.

For limited public testing of experimental builds, such as here, releases are a little less frequent than changes to the git (from which the devs may compile frequently) once a few major changes have been incorporated and are ready to be tested on a range of real world samples and different listeners.

In the recent experimental version thread someone reported that music detection activated only on the third note of a sample, and this was because the first two tones were within the lowest frequency bin - a result of Opus's short transform windows, which follow from its short frames (20 ms), where low frequencies can look a lot like a DC signal. If you take a longer transform window (lasting more than 20 ms), the frequency width of each bin is reduced commensurately, and that's one way to detect the onset of music consisting of low frequencies but little high end. There are other options, though, and I suspect the devs have worked on picking the optimal choice for this application and putting those fixes into the git repository, ready for the next release.
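The bin-width arithmetic behind this is simple: a DFT over an N-sample window at sample rate fs has bins spaced fs/N apart, so a 20 ms window at 48 kHz gives 50 Hz bins and low bass notes all land in the first bin or two. (The window lengths here are illustrative, not the actual detector's.)

```python
def bin_width_hz(window_ms, sample_rate=48000):
    """DFT bin spacing in Hz for an analysis window of window_ms."""
    n = int(sample_rate * window_ms / 1000)   # samples in the window
    return sample_rate / n

print(bin_width_hz(20))   # 50.0 Hz: a 40 Hz bass note sits in the lowest bin
print(bin_width_hz(80))   # 12.5 Hz: a 4x longer window resolves it
```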

Any method I can think of to pick up this corner case pretty much demands more look-ahead, so this fix could only be made in offline mode, not interactive. Any other fixes to problems found in that experimental version will also be addressed in git if possible before a future experimental version is put up incorporating those fixes and asking for further serious testing.
jmvalin
post May 9 2013, 16:47
Post #17





QUOTE (Dynamic @ May 9 2013, 06:38) *
As nobody more authoritative has replied... No, I don't think it's implemented in the test builds we've seen here.

For limited public testing of experimental builds, such as here, releases are a little less frequent than changes to the git (from which the devs may compile frequently) once a few major changes have been incorporated and are ready to be tested on a range of real world samples and different listeners.


Well, some features (like the one where we use extra look-ahead) actually require changes to the tools and not just libopus, so they tend to take longer to expose. As for builds, we generally update them when the changes we make have an impact on what the HA people care about (i.e. mostly stereo music above 8 kb/s). Keep in mind that none of us actually runs Windows!

QUOTE (Dynamic @ May 9 2013, 06:38) *
In the recent experimental version thread someone reported that music detection activated only on the third note of a sample and this was because the first two tones were within the lowest frequency bin - a result of Opus's short transform windows thanks to its short frames (20ms), where low frequencies can look a lot like a DC signal. If you take a longer transform window (lasting more than 20ms), the frequency width of each bin is reduced commensurately and that's one way to detect the onset of music consisting of low frequencies but little high-end, though there are other options and I suspect the devs have worked on picking the optimal choice for this application and putting those fixes into the git repository ready for the next release.

Any method I can think of to pick up this corner case pretty much demands more look-ahead, so this fix could only be made in offline mode, not interactive. Any other fixes to problems found in that experimental version will also be addressed in git if possible before a future experimental version is put up incorporating those fixes and asking for further serious testing.


I think you got confused a bit. The problem with the harpsichord sample is not the speech/music detector, but the tonality detector. Its analysis window is independent of the Opus frame size. But no matter what the analysis window is, there are going to be signals that are too low and increasing the size increases the size of the library and the complexity. So we're trying to find a good balance here. It may take a while though. Right now, the main focus is on 1.1.
Dynamic
post May 9 2013, 16:56
Post #18








Yes, you're right, I did get mixed up about which detector it was. Thanks for putting me straight.
