Topic: opus experimental versions as part of opus-tools, and other questions (Read 12795 times)

opus experimental versions as part of opus-tools, and other questions

Hi all,
I've been playing with Opus for a bit and really like it, but I do have some signals that trip it up at bitrates as high as 128 kbps, and sometimes even higher. I even created a very simple sample which tripped it up at 160 kbps and above, but I will need to do more testing to make sure the codec was to blame. If that is indeed the problem, I will post the samples here along with ABX results so that hopefully the encoder can be improved. In fact I am probably violating a rule by saying I can hear these without posting evidence, but again I have to do further testing to make sure this is legitimate.
But first I was going to run these few samples past the experimental builds, which seem to try to address these issues. I don't know where to obtain them, though. I once found one, but I can't remember the version or where I got it. I am looking for the opusenc utility, like the one in the command line opus-tools. I really hope these builds are being made somewhat regularly, because I have no clue how to work with sources.
Also, if these builds really do improve quality, and I think they will, will they be folded back into the main release, like aoTuV was with the official Vorbis years ago?
And another interesting thought: could the research that went into developing Opus be used to refine a codec like Vorbis, which is only for storage? Maybe some new coding strategies based on CELT, but something which works well in high delay situations. I know Opus is great for storage as is, and was intended to be so, but if low delay codecs are at so much of a disadvantage, then it makes sense to wonder what something like Opus could do without that restriction. It's probably impractical, at least for now, but it's an interesting thought.
Thanks for your answers and have a good day!

Reply #1
The version you want to try is 1.1-alpha, which was released in December. That version includes the improvements you're referring to.

Reply #2
Hi,
This may be a stupid question, but how would I use that version? I know nothing about compiling opusenc or anything else for that matter.
The command line Opus-tools only goes up to libopus 1.0.2 in opusenc.
If I get alpha 1.1 to work and I still have problem samples I will put them in a new topic.
Thanks.

Reply #3
This post links to a compiled alpha version; testing is welcome:
http://www.hydrogenaudio.org/forums/index....st&p=817976
Dynamic – the artist formerly known as DickD


Reply #5
Hi,
Thanks for these links. The one in the opus testing topic is working just fine. I'll go reporting there.

Reply #6
Please use the link IgorC provided. I'm not sure how the other one was built.

Reply #7
Okay, will do. It seems that the alpha works better for the tonal samples; is it using a more unconstrained VBR? On the samples I tested which had problems with 1.0.2, the alpha 1.1 did much better, with a substantial increase in bitrate. The test I did was targeted at 64 kbps, and 1.0.2 gave a bitrate of 65, while 1.1 gave one of around 86, I think. If I forced CBR on both versions, 1.0.2 had more distortion in the low end; 1.1 fixed that but introduced a bit more in the mids and highs, which I find more agreeable, since it still beats the gritty sound of SBR in my opinion. I have yet to find a sample that really trips up the alpha at a reasonable bitrate like the previous version did, but I will be sure to let you know if I do.

Reply #8
Keep in mind that on average, 1.1 uses the same bit-rate as 1.0.2. On hard samples it'll use a higher rate, while on easier samples, it'll go below 64 kb/s. Also, if you don't want these large variations, you can always enable "constrained VBR", which is about equivalent to what MPEG codecs call CBR (they use a bit reservoir to pretend they're CBR).
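
For anyone who wants to compare the three rate-control modes on the same sample, here is a minimal Python sketch that only builds the opusenc command lines. The `--bitrate`, `--vbr`, `--cvbr` and `--hard-cbr` switches are the ones listed by `opusenc --help`; the helper name `opusenc_command` and the file names are made up for illustration, and actually running the commands assumes opusenc is on your PATH.

```python
# Rate-control switches as listed by `opusenc --help`.
MODE_FLAGS = {
    "vbr": "--vbr",       # unconstrained VBR (the opusenc default)
    "cvbr": "--cvbr",     # constrained VBR
    "cbr": "--hard-cbr",  # hard CBR
}

def opusenc_command(infile, outfile, kbps=64, mode="vbr"):
    """Build (but do not run) an opusenc command line as an argument list."""
    if mode not in MODE_FLAGS:
        raise ValueError(f"unknown rate-control mode: {mode}")
    return ["opusenc", "--bitrate", str(kbps), MODE_FLAGS[mode], infile, outfile]

if __name__ == "__main__":
    # Hypothetical file names; one output file per mode.
    for mode in MODE_FLAGS:
        cmd = opusenc_command("sample.wav", f"sample_{mode}.opus", 64, mode)
        print(" ".join(cmd))
        # To actually encode, pass cmd to subprocess.run(cmd, check=True).
```

Encoding one problem sample three ways like this makes it easy to see how far the unconstrained VBR file drifts from the 64 kb/s target compared with the other two.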

Reply #9
Wrong topic. I will ask for this message to be deleted.

Reply #10
And another interesting thought: could the research that went into developing Opus be used to refine a codec like Vorbis, which is only for storage? Maybe some new coding strategies based on CELT, but something which works well in high delay situations. I know Opus is great for storage as is, and was intended to be so, but if low delay codecs are at so much of a disadvantage, then it makes sense to wonder what something like Opus could do without that restriction. It's probably impractical, at least for now, but it's an interesting thought.

I think that would be interesting to know, too. Would Opus actually be more efficient with the option of bigger frame sizes, or would it just be a trade-off between the handling of pre-echoes and tonality? Jmvalin, you said that the specialisation for small frame sizes runs too deep to simply transfer the techniques to something like a theoretical successor to Vorbis. Does that mean that there's not much room for a new high-delay codec to improve on Vorbis with recent technology?

Reply #11
I believe there is a fair amount of room for improvement in the delay-insensitive space, but it requires more and different techniques, potentially a larger computational budget, a larger memory budget, etc. In general a larger frame size means you can afford to send more model parameters and more signaling decisions, but then you need to do something useful with those parameters.

Monty had made some promising-looking progress with sinusoidal modeling but hasn't been working on it lately. JM invented a new kind of predicted pyramid vector quantization as part of our work on Daala, which might someday find its way back into an audio codec. The Daala work may also produce a number of other tools that might be useful for audio in the future; e.g. we now have a new two-dimensional exponentially weighted moving average that looks like it works better than the inter energy predictor we use in Opus MDCT mode.

Reply #12
The quality versus latency tradeoff, taking 20 ms frames at 64 kbps as the reference, seems to indicate that 10 ms frames require about a 10% higher bitrate, 5 ms frames about 32.5% higher and 2.5 ms frames about 75% higher for the same PEAQ score. This limited data set makes it look as though we're approaching an asymptote of marginal gains, and the bitrate reduction is going to be pretty small for typical signals if we move to 40 ms. However, it's plausible that the long-frame efficiency would be greater for highly tonal signals or signals with multiple tones at the same time. Is the advantage worth it given the typical nature of music?
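
To put those percentages into absolute numbers, here is a small Python sketch. The overhead figures are simply the ones quoted above, with 20 ms frames at 64 kbps as the reference; the variable and function names are only illustrative.

```python
# Extra bitrate needed for the same PEAQ score, relative to 20 ms frames
# at 64 kbps (percentages as quoted in the paragraph above).
REFERENCE_KBPS = 64.0
OVERHEAD_PERCENT = {20.0: 0.0, 10.0: 10.0, 5.0: 32.5, 2.5: 75.0}

def equivalent_bitrate(frame_ms):
    """Bitrate in kbps giving the same PEAQ score as 64 kbps at 20 ms."""
    return REFERENCE_KBPS * (1.0 + OVERHEAD_PERCENT[frame_ms] / 100.0)

# Print from the longest frame size to the shortest.
for frame_ms in sorted(OVERHEAD_PERCENT, reverse=True):
    print(f"{frame_ms:>4} ms frames: {equivalent_bitrate(frame_ms):.1f} kbps")
```

That works out to 64.0, 70.4, 84.8 and 112.0 kbps. Read in the other direction, doubling the frame size saves about 24% going from 2.5 ms to 5 ms, about 17% from 5 ms to 10 ms, and about 9% from 10 ms to 20 ms, which is the shrinking gain behind the asymptote remark.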

I expect some of the people like Jean-Marc and NullC can add experience to confirm or contradict this.

A lot of the big gains over Vorbis come in other areas (e.g. the way it encodes the explicit band energy is more efficient and produces better results than the noise normalization version discovered late in Vorbis development).

Right now, there's scope for an offline mode in Opus, where music detection, when not live streaming, can be modified. For example, it could check more carefully the few seconds before the existing detector activates, perhaps with a longer FFT to detect lower frequency tones that would get missed by the short lookahead of the current live mode. Similarly with bandwidth detection, there's scope to make it more consistent offline, with greater lookahead and the option to look backwards.

This could offer great improvements for mixed material at lowish bitrates (around 24-48 kbps), such as audiobooks and podcasts with incidental music. It's also feasible to set two different bitrates, such as 24 kbps mono SILK for speech and 64 kbps VBR stereo CELT when music is detected, to give highly bitrate-efficient, high quality podcasts. That would bring commensurate savings in server bandwidth costs for the low-budget, often amateur podcast creators who have to pay the likes of libsyn for their many gigabytes of monthly use and beg for donations from listeners.

Some of this could be built into an audio editor/DAW too, where certain tracks could be labelled as speech or music so that the rendering can signal the required mode. But automated and accurate music detection would allow people to use the editor/DAW of their choice and just send the result to the encoder without worrying.
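
As a rough illustration of the savings from switching between 24 kbps for speech and 64 kbps for music, here is a Python sketch of the arithmetic. The two bitrates come from the post above; the episode length and the share of running time that is music are hypothetical numbers chosen only for the example, and container overhead is ignored.

```python
def average_kbps(music_fraction, speech_kbps=24.0, music_kbps=64.0):
    """Time-weighted average bitrate for a speech/music mix."""
    if not 0.0 <= music_fraction <= 1.0:
        raise ValueError("music_fraction must be between 0 and 1")
    return (1.0 - music_fraction) * speech_kbps + music_fraction * music_kbps

def size_megabytes(kbps, minutes):
    """Payload size in MB (10^6 bytes), ignoring container overhead."""
    return kbps * 1000.0 * minutes * 60.0 / 8.0 / 1e6

minutes = 60.0         # hypothetical episode length
music_fraction = 0.10  # hypothetical: 10% of the running time is music

mixed = average_kbps(music_fraction)
print(f"average bitrate: {mixed:.1f} kbps")
print(f"switched modes:  {size_megabytes(mixed, minutes):.1f} MB")
print(f"64 kbps overall: {size_megabytes(64.0, minutes):.1f} MB")
```

With these made-up inputs the average comes to 28 kbps, or 12.6 MB per hour-long episode against 28.8 MB if the whole file were encoded at 64 kbps.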

There are other ideas for a few years down the line that are being modelled. Some of the Xiph Ghost ideas, like separating tonal from transient parts of the signal and encoding each in its most efficient manner, require a lot of processing from today's CPUs, but might be viable in a few years if they can be made to work well enough, and might be a better use of developers' time, as NullC said (Monty's sinusoidal modelling).

I'm sure it's the sort of thing that the Opus collaborators consider from time to time. I think they have done an admirable job of ensuring that Opus came to fruition first of all and that it continues to be tuned for optimum performance, so I'd be reluctant to get them to shift their priorities, which seem very well placed to me. It has a real chance of making a wider impact than Vorbis because it also addresses the unmet needs of interactive uses of all kinds and covers the bulk of the useful internet-packetized bitrate range about as well as or better than any other codec to date. It can gain a foothold in numerous potential killer applications and bring compatibility to many devices through that.
Dynamic – the artist formerly known as DickD

Reply #13
The quality versus latency tradeoff, taking 20 ms frames at 64 kbps as the reference, seems to indicate that 10 ms frames require about a 10% higher bitrate, 5 ms frames about 32.5% higher and 2.5 ms frames about 75% higher for the same PEAQ score. This limited data set makes it look as though we're approaching an asymptote of marginal gains, and the bitrate reduction is going to be pretty small for typical signals if we move to 40 ms. However, it's plausible that the long-frame efficiency would be greater for highly tonal signals or signals with multiple tones at the same time. Is the advantage worth it given the typical nature of music?

I expect some of the people like Jean-Marc and NullC can add experience to confirm or contradict this.


Using 40 ms CELT frames would not be useful except at rates below 48 kb/s -- these just add too many temporal artefacts (Vorbis will use them only below 40 kb/s or so). What we *could* have done to improve transient performance is simply add support for full-overlap windows on 20 ms frames. The problem is that it would have made the code more complicated and would have been bad for latency (remember that Opus is designed for interactive applications).

Right now, there's scope for an offline mode in Opus, where music detection when not live streaming can be modified, for example, to check more carefully the few seconds before the existing detector activates, perhaps with a longer FFT to detect lower frequency tones that would get missed by the short lookahead of the current, live mode.


This is already implemented in git, though opus-tools does not support it out-of-the-box yet (it's an experimental branch).

There are other ideas for a few years down the line that are being modelled. Some of the Xiph Ghost ideas, like separating tonal from transient parts of the signal and encoding each in its most efficient manner, require a lot of processing from today's CPUs, but might be viable in a few years if they can be made to work well enough, and might be a better use of developers' time, as NullC said (Monty's sinusoidal modelling).


The problems with Ghost go far beyond "not enough CPU". Separating tones from noise and transients is *really* hard -- sometimes you're not even sure what a tone really means! And you have to do a really good job for such an algorithm to work at all. That's very different from an algorithm like CELT, where you can make an incredibly dumb encoder that still gives pretty good quality (and easily outperforms MP3).

Reply #14
Right now, there's scope for an offline mode in Opus, where music detection when not live streaming can be modified, for example, to check more carefully the few seconds before the existing detector activates, perhaps with a longer FFT to detect lower frequency tones that would get missed by the short lookahead of the current, live mode.


This is already implemented in git, though opus-tools does not support it out-of-the-box yet (it's an experimental branch).

Is it already implemented in the recently posted "new test build"?
Thanks for all the answers on the frame sizes, btw. I didn't really get the point about lower frequencies being detected with more lookahead, though, or what it's supposed to do when there's no full overlap between the frames anyway.
Can someone explain (or give a link)?

Reply #15
As nobody more authoritative has replied... No, I don't think it's implemented in the test builds we've seen here.

Experimental builds for limited public testing, such as the ones here, are released a little less often than changes are made to git (from which the devs may compile frequently). A build is put up once a few major changes have been incorporated and are ready to be tested on a range of real-world samples by different listeners.

In the recent experimental version thread, someone reported that music detection activated only on the third note of a sample. This was because the first two tones fell within the lowest frequency bin, a result of Opus's short transform windows thanks to its short frames (20 ms), where low frequencies can look a lot like a DC signal. If you take a longer transform window (lasting more than 20 ms), the frequency width of each bin is reduced commensurately, and that's one way to detect the onset of music consisting of low frequencies but little high end. There are other options, though, and I suspect the devs have worked on picking the optimal choice for this application and putting those fixes into the git repository ready for the next release.
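
The relation between window length and bin width can be made concrete with a short Python sketch. This is only the generic DFT property (bin spacing is the reciprocal of the window duration), not a description of how libopus's analysis actually works; the 48 kHz sample rate and the 40 ms and 80 ms window lengths are assumptions picked for illustration.

```python
def bin_width_hz(window_ms):
    """Spacing between DFT bins for a window of the given duration."""
    return 1000.0 / window_ms

def samples_in_window(window_ms, sample_rate_hz=48000):
    """Number of samples covered by the window (48 kHz assumed)."""
    return int(round(sample_rate_hz * window_ms / 1000.0))

# 20 ms matches the frame length mentioned above; 40 and 80 ms are
# hypothetical longer windows for comparison.
for window_ms in (20, 40, 80):
    print(f"{window_ms:>3} ms window: {samples_in_window(window_ms):>5} samples, "
          f"bins {bin_width_hz(window_ms):.1f} Hz apart")
```

With a 20 ms window the bins are 50 Hz apart, so two low notes below roughly 50 Hz are hard to tell apart from each other or from DC; an 80 ms window narrows the spacing to 12.5 Hz at the cost of needing four times as much signal.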

Any method I can think of to pick up this corner case pretty much demands more look-ahead, so this fix could only be made in offline mode, not interactive. Any other fixes to problems found in that experimental version will also be addressed in git if possible before a future experimental version is put up incorporating those fixes and asking for further serious testing.
Dynamic – the artist formerly known as DickD

Reply #16
As nobody more authoritative has replied... No, I don't think it's implemented in the test builds we've seen here.

Experimental builds for limited public testing, such as the ones here, are released a little less often than changes are made to git (from which the devs may compile frequently). A build is put up once a few major changes have been incorporated and are ready to be tested on a range of real-world samples by different listeners.


Well, some features (like the one where we use extra look-ahead) actually require changes to the tools and not just libopus, so they tend to take longer to expose. As for builds, we generally update them when the changes we make have an impact on what the HA people care about (i.e. mostly stereo music above 8 kb/s). Keep in mind that none of us actually runs Windows!

In the recent experimental version thread, someone reported that music detection activated only on the third note of a sample. This was because the first two tones fell within the lowest frequency bin, a result of Opus's short transform windows thanks to its short frames (20 ms), where low frequencies can look a lot like a DC signal. If you take a longer transform window (lasting more than 20 ms), the frequency width of each bin is reduced commensurately, and that's one way to detect the onset of music consisting of low frequencies but little high end. There are other options, though, and I suspect the devs have worked on picking the optimal choice for this application and putting those fixes into the git repository ready for the next release.

Any method I can think of to pick up this corner case pretty much demands more look-ahead, so this fix could only be made in offline mode, not interactive. Any other fixes to problems found in that experimental version will also be addressed in git if possible before a future experimental version is put up incorporating those fixes and asking for further serious testing.


I think you got a bit confused. The problem with the harpsichord sample is not the speech/music detector but the tonality detector, whose analysis window is independent of the Opus frame size. But no matter what the analysis window is, there are going to be signals that are too low, and increasing the window size increases both the size of the library and the complexity. So we're trying to find a good balance here. It may take a while, though. Right now, the main focus is on 1.1.

Reply #17
Yes, you're right, I did get mixed up about which detector it was. Thanks for putting me straight.
Dynamic – the artist formerly known as DickD