New opus test build

I've posted up a new win32 build of libopus git master: https://people.xiph.org/~greg/opus-tools_exp_a8783a1.zip

There aren't many major changes from the prior alpha, but there are a number of small bug fixes, some of which impact quality. I think we're particularly interested in knowing about any quality regressions vs prior versions, and particularly about cases where the speech/music detection gets confused. (For the actual release we'll likely back that detector off at high rates, but for now it's set more aggressively in order to find problems.)

Reply #1
I was trying to test the speech/music detection, but wouldn't mind a bit of guidance. I presume it mainly kicks in at the lower end of the target bitrate range.

My source was a digital soundboard recording of nearly three hours length at a small gig recently, captured over USB cable at 48kHz/24bit stereo including some talk from the singers between songs (usually remembering to mute the effects, so fairly clean and dry) before and after the music - mainly ALAC backing tracks via an iPad and one or two singers, occasional live semiacoustic guitar, plus fiddle and crescent tambourine picked up on one of the vocal mics. I mainly jumped to the track boundaries and places I knew the music subsided to listen for music/speech variations.

I could do with some guidance as I'm not entirely sure what bitrate region I should be concentrating on for speech/music detection to have a significant effect, but I presume it's certainly below 48kbps, and most likely a fair bit below 32kbps that the transition might be quite noticeable if speech mode remains active too long.

I used this test version of opusenc with the option --bitrate nn.n (VBR by default) to make four versions at first, with 40.0, 32.0, 27.0 and 24.0 as the bitrates.

All four versions seemed to me limited to 12 kHz (SWB) audio bandwidth once the music really kicked in, though at 40 kbps there was full bandwidth occasionally during some of the speech and sometimes as the song started (e.g. first cymbal hit) showing about 16-18 kHz then cut down to 12 kHz without a bad transition (saw it on fb2k spectrogram, but barely noticed, and I think it was in CELT mode the whole time, as the bass and percussion were well handled). I guess it's possible that these instances were hybrid mode at FB switching to CELT-only at SWB, but they sounded OK.

I then tried a 48.0 kbps version, which was 20 kHz full bandwidth much of the time, especially with percussive backing tracks. In the closing track in particular, a track with little high end and subdued percussion, I noticed it was 12 kHz bandwidth. A couple of percussive hits that registered up to 15 or 16 kHz in the original backing track didn't generate a change in bandwidth in the 48.0 kbps Opus encode, which I think is probably good: it sounded consistent throughout and probably reduced the likelihood of artifacts among the multilayered vocals in the backing of the closing number. That backing was recorded many years ago, and I have since only tweaked its gain envelope for dynamic variation and increased impact as the layered vocals really build up in the final verse and chorus.

So I got the impression that bandwidth detection was doing a reasonably good job with nothing sounding egregiously out-of-place, and also I got the impression that bandwidth restriction to reduce artifacts from bitrate starvation was keeping obvious artifacts mostly at bay (whereas I personally dislike what I've heard of HE-AACv2 at 24 to 32 kbps as far as treble reproduction is concerned - albeit far superior to the old RealAudio proprietary codec which over-reached in audio bandwidth at the expense of bad treble artifacts in my experience).

I also did 16.0 and 20.0 kbps versions. Both were limited to 8 kHz (WB) bandwidth throughout, it seemed, with the 16.0 kbps version being pretty rough in a number of spots and almost certainly in SILK (LP) mode most of the time. I felt that the 20.0 kbps version was a good deal more musical, and would in many cases be hard to distinguish from a PCM file sampled at 16.0 kHz, though I found some rough spots.
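
For reference, the abbreviations used above (WB, SWB, FB) are three of the five audio bandwidths defined for Opus. A minimal Python sketch that labels a cutoff frequency read off a spectrogram with the closest of them; the function name and the nearest-match rule are illustrative assumptions, not anything the encoder does:

```python
# Audio bandwidths defined for Opus (RFC 6716): abbreviation -> cutoff in Hz.
OPUS_BANDWIDTHS = {
    "NB": 4000,    # narrowband
    "MB": 6000,    # medium-band
    "WB": 8000,    # wideband
    "SWB": 12000,  # super-wideband
    "FB": 20000,   # fullband
}


def nearest_bandwidth(cutoff_hz):
    """Return the Opus bandwidth whose cutoff is closest to the observed one.

    The nearest-match rule is only a convenience for reading spectrograms.
    """
    return min(OPUS_BANDWIDTHS, key=lambda name: abs(OPUS_BANDWIDTHS[name] - cutoff_hz))


for observed in (8000, 12000, 17000):
    print(observed, "Hz ->", nearest_bandwidth(observed))
```

A cutoff of 16-18 kHz seen on a spectrogram lands closest to FB, which matches the reading that those passages were coded at full bandwidth before dropping to SWB.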

One of these rough spots was on the first beat of a new song after a speech section between songs, and I worked up in bitrate to find it was still a bit poor at 27.0 kbps, but reasonably well-handled at 32.0 kbps and might be a speech/music detection candidate. Then again, it might just be somewhere that is music and just needs more bitrate to handle the simultaneous instruments present.

I trimmed the lossless version on whole-second boundaries, to ensure the frames line up with the original Opus encode (20.0 ms frames), making a 56 second clip (less than 30 seconds of copyright material each side; it includes the tail of one track, the speech between tracks and the start of the second track). It's a longer clip than the usual 30 seconds because it contains less than 30 sec of music and I sought to include the music-speech-music transitions. I doubt it's important, but the source of

The problem is in the first note of the second track, starting at the 0:36 point, which includes a jumble of percussion, arpeggiated chords and bass note but is the first appreciable sound since the preceding speech sound.
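
The frame-alignment reasoning behind cutting on whole seconds can be checked with a little arithmetic. This is a minimal sketch, assuming frame boundaries are counted from the first input sample and ignoring any encoder delay or pre-skip; the helper names are made up for illustration:

```python
SAMPLE_RATE = 48000   # Hz, as in the recording
FRAME_MS = 20         # Opus frame duration used for these encodes

# 20 ms at 48 kHz is 960 samples per frame, i.e. 50 frames per second.
SAMPLES_PER_FRAME = SAMPLE_RATE * FRAME_MS // 1000


def is_frame_aligned(offset_seconds):
    """True if a cut at this whole-second offset falls on a frame boundary."""
    return (offset_seconds * SAMPLE_RATE) % SAMPLES_PER_FRAME == 0


def frame_index(offset_seconds):
    """Index of the frame that starts at this whole-second offset."""
    return offset_seconds * SAMPLE_RATE // SAMPLES_PER_FRAME


print(SAMPLES_PER_FRAME)      # samples in one 20 ms frame
print(is_frame_aligned(36))   # any whole second is a multiple of 50 frames
print(frame_index(36))        # frame where the problem note at 0:36 starts
print(frame_index(56))        # total number of frames in the 56 s clip
```

Under those assumptions every whole-second cut point is frame-aligned, and the problem note at 0:36 begins at frame 1800 of the 2800 frames in the clip.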

I attach this as a FLAC (48kHz, 24bit stereo, 1058kbps, 7.24MiB captured via BOSE Tonematch USB & Audacity) and also attach a selection of Opus encodes using this build at the target bitrates 16.0, 20.0, 24.0, 27.0, 32.0, 40.0 and 48.0 kbps, which I uploaded in a single ZIP archive (1.33MiB)
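
For anyone who wants to repeat the set of encodes, here is a rough Python sketch that builds one opusenc command line per target bitrate. Only the `--bitrate` option comes from the post above; the input name `clip.wav` and the output naming scheme are hypothetical placeholders:

```python
import shlex

# Target bitrates (kbps) used for the attached encodes.
BITRATES_KBPS = [16.0, 20.0, 24.0, 27.0, 32.0, 40.0, 48.0]


def opusenc_command(source, bitrate_kbps):
    """Build an opusenc argument list for one target bitrate.

    The output file name pattern is a made-up convention for this example.
    """
    output = f"clip_{bitrate_kbps:g}k.opus"
    return ["opusenc", "--bitrate", f"{bitrate_kbps:.1f}", source, output]


commands = [opusenc_command("clip.wav", rate) for rate in BITRATES_KBPS]
for cmd in commands:
    print(shlex.join(cmd))
```

Each list can be handed to `subprocess.run(cmd, check=True)` once opusenc is on the PATH; the sketch only prints the commands so that it runs without the encoder installed.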

The whole show recording had a ReplayGain of +2.98 dB (no tagging in attached files) and the live vocals are, I believe, dead-centre.
Dynamic – the artist formerly known as DickD

Reply #2
Good to see the Muse Breaks fix finally bundled. Btw, would you and jmvalin rather see test samples for the builds posted here in the forum, or privately via e-mail? I always feel a bit like I'm randomly spamming the threads whenever I come up with some new ones, as it's more or less on a regular basis.

Reply #3
I was trying to test the speech/music detection, but wouldn't mind a bit of guidance. I presume it mainly kicks in at the lower end of the target bitrate range.


Yes, speech/music is mostly used at lower rates. The range over which it has an impact is 20-64 kb/s, though it's most noticeable between 24 kb/s and 48 kb/s for stereo.
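
To see which of the encodes from Reply #1 fall inside those ranges, a small sketch follows. The two ranges are the ones quoted above for stereo; treating the endpoints as inclusive and the three labels are assumptions made for this example:

```python
# Ranges quoted above, in kb/s (stereo).
DETECTOR_RANGE_KBPS = (20, 64)
MOST_NOTICEABLE_KBPS = (24, 48)


def detector_relevance(bitrate_kbps):
    """Rough label for how much the speech/music decision matters at a bitrate.

    Endpoints are treated as inclusive, which is an assumption.
    """
    low, high = MOST_NOTICEABLE_KBPS
    if low <= bitrate_kbps <= high:
        return "most noticeable"
    low, high = DETECTOR_RANGE_KBPS
    if low <= bitrate_kbps <= high:
        return "some impact"
    return "little or none"


for rate in (16.0, 20.0, 24.0, 27.0, 32.0, 40.0, 48.0):
    print(rate, "->", detector_relevance(rate))
```

By this reading only the 16.0 kbps encode sits outside the detector's range, and everything from 24.0 to 48.0 kbps is in the region where a wrong decision should be easiest to hear.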

All four versions seemed to me limited to 12 kHz (SWB) audio bandwidth once the music really kicked in, though at 40 kbps there was full bandwidth occasionally during some of the speech and sometimes as the song started (e.g. first cymbal hit) showing about 16-18 kHz then cut down to 12 kHz without a bad transition (saw it on fb2k spectrogram, but barely noticed, and I think it was in CELT mode the whole time, as the bass and percussion were well handled). I guess it's possible that these instances were hybrid mode at FB switching to CELT-only at SWB, but they sounded OK.


Yeah, bandwidth switching is still a known issue. It's not quite clear how to solve this. It would be nice if it didn't change like this along with the decisions, but at the same time it's trying to do what's optimal for each. I don't yet have a good solution for this -- even a conceptual one. Open to suggestions.

One of these rough spots was on the first beat of a new song after a speech section between songs, and I worked up in bitrate to find it was still a bit poor at 27.0 kbps, but reasonably well-handled at 32.0 kbps and might be a speech/music detection candidate. Then again, it might just be somewhere that is music and just needs more bitrate to handle the simultaneous instruments present.


The good news is that Opus already supports "delayed decisions" in the speech/music detector, so it's possible to avoid having the transitions happening too late. It's not yet supported in the opusenc tool, but that should happen soon.
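
Why a delayed decision helps with late transitions can be shown with a toy model. This is emphatically not libopus's detector: it assumes a made-up per-frame raw guess (1 = music, 0 = speech) smoothed by a majority vote, and compares deciding immediately with deciding a couple of frames later. All names and numbers are hypothetical:

```python
def causal_decisions(raw, window):
    """Majority vote over the last `window` raw guesses (1 = music, 0 = speech)."""
    decisions = []
    for i in range(len(raw)):
        recent = raw[max(0, i - window + 1): i + 1]
        # Strict majority; integer arithmetic avoids float comparisons.
        decisions.append(sum(recent) * 2 > len(recent))
    return decisions


def delayed_decisions(raw, window, lookahead):
    """Same vote, but the decision for frame i is taken `lookahead` frames later."""
    causal = causal_decisions(raw, window)
    last = len(raw) - 1
    return [causal[min(i + lookahead, last)] for i in range(len(raw))]


def first_music_frame(decisions):
    """Index of the first frame classified as music."""
    return decisions.index(True)


# Ten frames of speech followed by ten frames of music: music starts at frame 10.
raw = [0] * 10 + [1] * 10
print(first_music_frame(causal_decisions(raw, 4)))      # 12
print(first_music_frame(delayed_decisions(raw, 4, 2)))  # 10
```

In this toy the smoothed causal vote switches two frames after the music starts, while looking two frames ahead puts the switch on the first music frame, at the cost of two frames of extra delay.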

Reply #4
Good to see the Muse Breaks fix finally bundled. Btw, would you and jmvalin rather see test samples for the builds posted here in the forum, or privately via e-mail? I always feel a bit like I'm randomly spamming the threads whenever I come up with some new ones, as it's more or less on a regular basis.


I think it's better if you post here in the open so everyone else can see what's going on.

Reply #5
Using this test build (libopus 1.1a-67) to encode the ac3-stream of AC3TEST.VOB solves the 'muffle' issue (index.php?act=findpost&pid=825291).
Thought I'd report it. Good job!

Reply #6
I think it's better if you post here in the open so everyone else can see what's going on.

Ok, that's nice to hear. Here are some new ones, ABXed at 192 kb/s VBR:

[attachment=7502:Girl_In_..._Sample_.flac]
This one has tonal distortions on the right channel, especially noticeable at the first strikes.

Poquito Mas
Tonal distortion on the 4th strike, and noisy sounds in the background on the left channel, in sync with the guitar.

Sycho Active
Muffled distortion right before the 2nd kick.

Prior versions performed equal/worse btw...

Reply #7
Forgot to add this one:
[attachment=7504:Blender__Sample_.flac]
Probably the worst of the bunch for Opus, heavy distortions on the left guitar are introduced. The full track is available here for free btw.

Reply #8
Gainless,

Can you provide some ABC/HR logs comparing the tested build with 1.0.2 and 1.1a?

Thank You.

Reply #9
Here are some results for 96 kbps
1.0.2 https://ftp.mozilla.org/pub/mozilla.org/opu...0.1.6-win32.zip
1.1a https://ftp.mozilla.org/pub/mozilla.org/opu...alpha-win32.zip
and 1.1post-a from the OP.

The original samples were previously resampled by foobar's SoX plugin (best quality, other settings by default), 44.1/16 -> 48/24
Tested samples: _http://depositfiles.org/files/cvnk0fid3 . Sample02 was excluded because I'm not sure what I'm hearing there. 

Results
Code:
Sample                  1.0.2   1.1a    1.1post-a
01 Winter               4.4     4.4     4.5
02 Sor                  -       -       -
03 eig                  3.5     4       4
04 fatboy               3.7     4.8     4.8
05 Harpsichord          3.3     4       4
06 Bittersweet          3.5     4       4
07 До свиданья, мама    4.4     4.4     4.5
08 German speech        5       4.2     4.3
09 Let me live          5       4.7     4.8

Average                 4.10    4.31    4.36
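
The averages can be reproduced from the per-sample scores. A short Python sketch, with the scores copied from the results above and sample 02 left out because it was not rated; the variable names are arbitrary:

```python
BUILDS = ("1.0.2", "1.1a", "1.1post-a")

# Scores per sample in the order of BUILDS; sample 02 was excluded from the test.
SCORES = {
    "01 Winter": (4.4, 4.4, 4.5),
    "03 eig": (3.5, 4.0, 4.0),
    "04 fatboy": (3.7, 4.8, 4.8),
    "05 Harpsichord": (3.3, 4.0, 4.0),
    "06 Bittersweet": (3.5, 4.0, 4.0),
    "07 До свиданья, мама": (4.4, 4.4, 4.5),
    "08 German speech": (5.0, 4.2, 4.3),
    "09 Let me live": (5.0, 4.7, 4.8),
}

# Mean score per build over the eight rated samples.
averages = {
    build: sum(row[i] for row in SCORES.values()) / len(SCORES)
    for i, build in enumerate(BUILDS)
}
for build, mean in averages.items():
    print(f"{build}: {mean:.2f}")

# Per-sample change from 1.0.2 to the newest build, largest gain first.
deltas = {name: round(row[2] - row[0], 1) for name, row in SCORES.items()}
for name, delta in sorted(deltas.items(), key=lambda item: -item[1]):
    print(f"{name}: {delta:+.1f}")
```

Sorting by the change makes the pattern easy to see: the largest gains are on samples 03 to 06, and the only losses are on samples 08 and 09, which scored 5 with 1.0.2.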



1.0.2 vs 1.1x
The 1.1x builds are clearly superior on transient samples (03, 04) as well as tonal samples (05, 06). On the speech sample (08) the 1.1x builds were a bit inferior, but the scores are still above 4.0.
It's very well known that Opus is very good for speech (sample 08) and rock music (09) but still needs some extra bitrate for tonal parts. 1.1x does exactly this: it takes some bitrate from those parts where Opus is excessively good and moves it to the hard parts where it's actually useful.

Those are my findings so far.


Reply #11
Thank You, Kamedo2.
I was thinking of adding it but changed my mind at the last moment, as I already copy a lot of ideas from other people.

Reply #12
Gainless,

Can you provide some ABC/HR logs comparing the tested build with 1.0.2 and 1.1a?

Thank You.

I could if I had a working exe of it. I have the 1.1 beta, but it's introducing annoying sweep sounds on the test files.

Edit: Solved the problem, was a bad resampler setting in the Windows 7 mixer.

Reply #13
Ok, I've done ABC-HR tests with the samples at 128 kb/s VBR, with nice results for the new build, it performed either equally or better than the older versions. "Sycho Active" was even transparent, seems like I've accidentally ABXed a wrong version there (1.1 and 1.0.2 have a lot of trouble with that one).

Sycho Active:
Code:
ABC/HR Version 1.1 beta 2, 18 June 2004
Testname: Sycho Active

1R = C:\Users\Master O\Desktop\Opus Samples\Sycho_Active__Sample_(1.1).wav
2L = C:\Users\Master O\Desktop\Opus Samples\Sycho_Active__Sample_(1.0.2).wav
3R = C:\Users\Master O\Desktop\Opus Samples\Sycho_Active__Sample_(Exp).wav

---------------------------------------
General Comments:
Comparison of Opus vers. 1.0.2, 1.1 alpha and the last exp build from 17.04.2013 at 128 kb/s VBR. Exp is the clear winner here.
---------------------------------------
1.1
1R Rating: 1.8
1R Comment: Several heavy distortions, cymbal is fine though.
---------------------------------------
1.0.2
2L Rating: 2.0
2L Comment: Same distortion character as in 1.1, but not as loud.
---------------------------------------
ABX Results:
Original vs C:\Users\Master O\Desktop\Opus Samples\Sycho_Active__Sample_(1.1).wav
    12 out of 12, pval < 0.001
Original vs C:\Users\Master O\Desktop\Opus Samples\Sycho_Active__Sample_(1.0.2).wav
    10 out of 10, pval < 0.001
Original vs C:\Users\Master O\Desktop\Opus Samples\Sycho_Active__Sample_(Exp).wav
    5 out of 10, pval = 0.623

Girl In The Fire:
Code:
ABC/HR Version 1.1 beta 2, 18 June 2004
Testname: Poquito Mas

Tester: fdighdfb

1L = C:\Users\Master O\Desktop\Opus Samples\Girl_In_The_Fire__Sample_(1.0.2).wav
2R = C:\Users\Master O\Desktop\Opus Samples\Girl_In_The_Fire__Sample_(1.1).wav
3L = C:\Users\Master O\Desktop\Opus Samples\Girl_In_The_Fire__Sample_(Exp).wav

---------------------------------------
General Comments:
Comparison of Opus vers. 1.0.2., 1.1 alpha and the last exp build from 17.04.2013 at 128 kb/s VBR.

All versions introduced a slight distortion on the 3rd strike of the guitar on the  right channel, a bit annoying, but not terrible.

(Forgot to change the test name)
---------------------------------------
1.0.2
1L Rating: 3.5
1L Comment:
---------------------------------------
1.1 alpha
2R Rating: 3.5
2R Comment:
---------------------------------------
Latest exp build
3L Rating: 3.5
3L Comment:
---------------------------------------
ABX Results:
Original vs C:\Users\Master O\Desktop\Opus Samples\Girl_In_The_Fire__Sample_(1.0.2).wav
    9 out of 9, pval = 0.002
Original vs C:\Users\Master O\Desktop\Opus Samples\Girl_In_The_Fire__Sample_(1.1).wav
    10 out of 10, pval < 0.001
Original vs C:\Users\Master O\Desktop\Opus Samples\Girl_In_The_Fire__Sample_(Exp).wav
    10 out of 10, pval < 0.001

Poquito Mas:
Code:
ABC/HR Version 1.1 beta 2, 18 June 2004
Testname: Poquito Mas

Tester: fdighdfb

1R = C:\Users\Master O\Desktop\Opus Samples\Poquito_Mas__Sample_(1.0.2).wav
2R = C:\Users\Master O\Desktop\Opus Samples\Poquito_Mas__Sample_(1.1).wav
3R = C:\Users\Master O\Desktop\Opus Samples\Poquito_Mas__Sample_(Exp).wav

---------------------------------------
General Comments:
Opus vers. 1.0.2, 1.1 alpha and the last exp build from 17.04.2013. at 128 kb/s were compared. 1.0.2 performs worst.
---------------------------------------
1.0.2
1R Rating: 1.5
1R Comment: Heeeavy tonal distortions.
---------------------------------------
1.1
2R Rating: 2.5
2R Comment: Same distortion character as in 1.0.2, but a lot more subtle. Still quite annoying.
---------------------------------------
Latest exp build
3R Rating: 2.5
3R Comment: Same as 1.1
---------------------------------------
ABX Results:
Original vs C:\Users\Master O\Desktop\Opus Samples\Poquito_Mas__Sample_(1.0.2).wav
    10 out of 10, pval < 0.001
Original vs C:\Users\Master O\Desktop\Opus Samples\Poquito_Mas__Sample_(1.1).wav
    11 out of 11, pval < 0.001
Original vs C:\Users\Master O\Desktop\Opus Samples\Poquito_Mas__Sample_(Exp).wav
    11 out of 11, pval < 0.001

Blender:
Code:
ABC/HR Version 1.1 beta 2, 18 June 2004
Testname: asfsadfsdg

Tester: fdighdfb

1L = C:\Users\Master O\Desktop\Opus Samples\Blender__Sample_(1.0.2).wav
2R = C:\Users\Master O\Desktop\Opus Samples\Blender__Sample_(Exp).wav
3R = C:\Users\Master O\Desktop\Opus Samples\Blender__Sample_(1.1).wav

---------------------------------------
General Comments:
Comparing Opus ver. 1.0.2, 1.1 Alpha and the last exp build from 17.04.2013
at 128 kb/s VBR.

All versions introduced noisy distortions on the guitar of the left channel.
---------------------------------------
1.0.2
1L Rating: 3.3
1L Comment:
---------------------------------------
Latest Exp build
2R Rating: 3.3
2R Comment:
---------------------------------------
1.1
3R Rating: 3.3
3R Comment:
---------------------------------------
ABX Results:
Original vs C:\Users\Master O\Desktop\Opus Samples\Blender__Sample_(1.0.2).wav
    7 out of 8, pval = 0.035
Original vs C:\Users\Master O\Desktop\Opus Samples\Blender__Sample_(Exp).wav
    8 out of 8, pval = 0.004
Original vs C:\Users\Master O\Desktop\Opus Samples\Blender__Sample_(1.1).wav
    8 out of 8, pval = 0.004
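
The pval figures in the ABX results above are consistent with a one-sided binomial test, i.e. the chance of getting at least that many trials right by pure guessing. Whether ABC/HR computes them exactly this way is an assumption; the sketch below simply reproduces the reported numbers:

```python
from math import comb


def abx_p_value(correct, trials):
    """Probability of at least `correct` hits in `trials` coin-flip guesses."""
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


# (correct, trials) pairs taken from the logs above.
for correct, trials in [(12, 12), (10, 10), (5, 10), (9, 9), (7, 8), (8, 8), (11, 11)]:
    print(f"{correct} out of {trials}: p = {abx_p_value(correct, trials):.3f}")
```

Read this way, the 5 out of 10 result for the Exp build of "Sycho Active" (p = 0.623) is indistinguishable from guessing, which is what the post means by calling that encode transparent.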

Reply #14
Two samples (01 and 05) caught my attention, as the quality isn't what one would expect at 64 kbps. It's not that 1.1x did badly on them, but its performance is noticeably subpar compared to HE-AAC on these particular samples.

Sample 01. Some "flushing" artifacts have returned.
Sample 05. It's a hard sample for Opus and there is only a small increment on those hard parts. Anyway I think it's the same case as of harpsichord.

The files are here _http://depositfiles.org/files/0ina4t505

I also tried http://www.rarewares.org/test_samples/velvet.wv . 1.1x did well and was better than 1.0.2.

Reply #15
What I don't really get is why Opus's VBR mode doesn't save more bitrate on "easy" samples, e.g. content whose spectrum is confined to low frequencies. A lot of stuff that I can't ABX at 64 kb/s CVBR still gets quite high bitrates, where other encoders use only a third of the actual nominated bitrate.

Reply #16
What I don't really get is why Opus's VBR mode doesn't save more bitrate on "easy" samples, e.g. content whose spectrum is confined to low frequencies. A lot of stuff that I can't ABX at 64 kb/s CVBR still gets quite high bitrates, where other encoders use only a third of the actual nominated bitrate.


That's mostly because Opus' bit allocation is implicit. The bit-stream provides a pretty good default with no signalling. This generally works pretty well, but it means we have to explicitly detect easy samples. Right now, there's a detector for very narrow stereo images, but that's about it.

Reply #17
That's mostly because Opus' bit allocation is implicit. The bit-stream provides a pretty good default with no signalling. This generally works pretty well, but it means we have to explicitly detect easy samples. Right now, there's a detector for very narrow stereo images, but that's about it.

Can we expect more from these detectors in the future?

Reply #18
Can we expect more from these detectors in the future?


Depends on whether there are other wide classes of signals on which we can reduce the rate. Things like test tones are so rare that saving bits on them has no real effect on the average rate of a music collection, but if there's something that's worth it, I can look at detecting it.
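
The point about rare signals can be made concrete with a weighted average. All numbers below are hypothetical, picked only to illustrate the argument: a 64 kb/s nominal rate, "easy" material coded at roughly a third of that, and two guesses at how much of a collection is easy:

```python
def collection_average_kbps(normal_kbps, reduced_kbps, easy_fraction):
    """Average rate when `easy_fraction` of the audio is coded at the reduced rate."""
    return (1 - easy_fraction) * normal_kbps + easy_fraction * reduced_kbps


NORMAL_KBPS = 64.0    # hypothetical nominal rate
REDUCED_KBPS = 21.0   # hypothetical rate on "easy" material, about a third

# Hypothetical shares of a collection: very rare signals vs. a common class.
for label, fraction in (("rare, 0.1% of the audio", 0.001), ("common, 10% of the audio", 0.10)):
    average = collection_average_kbps(NORMAL_KBPS, REDUCED_KBPS, fraction)
    print(f"{label}: {average:.2f} kb/s")
```

With these made-up figures, detecting something that makes up 0.1% of the audio moves the average by less than 0.05 kb/s, while a class covering 10% of it would save a little over 4 kb/s, which is the distinction being drawn above.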

Reply #19
Depends on whether there are other wide classes of signals on which we can reduce the rate. Things like test tones are so rare that saving bits on them has no real effect on the average rate of a music collection, but if there's something that's worth it, I can look at detecting it.

I mainly think about heavily lowpassed stuff in intros, breaks or ambient music for example, or other things without a lot of high frequency content, like a single rock/metal guitar. Don't know how often it appears on a big collection, but in electronic/complex produced music it's not that uncommon.

Reply #20
Gainless,

Some time ago you submitted a sample with low-frequency content. Most other formats had a very low bitrate on it, but not Opus. There was a test patch for it. It decreased the bitrate on this sample dramatically, though there were some mild birdie-style (frequency modulation/sweep) artifacts. That could be fine for 64 kbps, maybe for 96 kbps too (?), but at 192 kbps, where one expects (near) transparency for all samples, even a tiny issue isn't acceptable.

I have a few albums where I've noticed that Opus doesn't save bitrate on parts with low frequency content.

Reply #21
Gainless,

Some time ago you submitted a sample with low-frequency content. Most other formats had a very low bitrate on it, but not Opus. There was a test patch for it. It decreased the bitrate on this sample dramatically, though there were some mild birdie-style (frequency modulation/sweep) artifacts. That could be fine for 64 kbps, maybe for 96 kbps too (?), but at 192 kbps, where one expects (near) transparency for all samples, even a tiny issue isn't acceptable.

I have a few albums where I've noticed that Opus doesn't save bitrate on parts with low frequency content.

Like I've said in a previous post, it's mainly about stuff I can't ABX at 64 kbps. I can't remember about a sample I posted for which a certain test patch was made btw, just one with a bit of basic tonal content (Project 100), and a kick (Meduzz) from the old Opus mega-thread, where I've noticed a huge bitrate saving for other encoders. Do you mean one of these two?


Reply #23
Ok, then this is at least clear. What's probably a bit more interesting is the question if the VBR control is adjusted to different bitrates or just has a general "more/less than nominated bitrate on that kind of music"-behaviour. Is it even possible to make special adjustments for, maybe 64 and 192 kbps?

Reply #24
I tried the "Iron Man" sample with the experimental build from the OP.
Some audio files and screenshots https://drive.google.com/folderview?id=0Byv...amp;usp=sharing

First, there are no substantial/audible differences between the exp build, 1.1a and 1.0.2, but all of them have some distortion on the bass drum kicks (on the transient part).
The experimental build has a variable bandwidth of 12/20 kHz at 96 kbps. I guess it should work this way, as it reduces bitrate with virtually no artifacts.
I'm reporting it just in case.