What is the recommended setting for speeches?, I'm new to Opus...
krafty
post Apr 29 2014, 18:37
Post #1





Group: Members
Posts: 274
Joined: 20-March 10
Member No.: 79175



I was going to look into Speex because I had heard about it once.
Now I've found out that it is obsolete and Opus is the way to go.

I usually release the speech recordings as LAME V5. One hour of speech is about 31MB.

What setting would be recommended for Opus?

Thanks
testyou
post Apr 29 2014, 20:33
Post #2





Group: Members
Posts: 99
Joined: 24-September 10
Member No.: 84113



Try lowering the bitrate until the artifacts become too annoying. Also use mono.
E.g.: --downmix-mono --bitrate 32. Decrease the bitrate as required.

This post has been edited by testyou: Apr 29 2014, 20:38
lithopsian
post Apr 29 2014, 22:09
Post #3





Group: Members
Posts: 158
Joined: 27-February 14
Member No.: 114718



Opus will do anything that Speex can do, usually better. You should be able to push it well below 32 for mono speech, but I guess it depends on how good the original recording is and whether you can tolerate any artefacts. The codec will likely downmix to mono automatically at low bitrates. opusenc also has an option to optimise for speech. That will encourage things like bandpass restriction that reduce the bitrate without affecting the quality of speech too much.
krafty
post Apr 29 2014, 22:50
Post #4

QUOTE
opusenc also has an option to optimise for speech


What command would that be? I looked through the options and didn't see anything related.

I tried 20, 32 and 40 kbps. 40 is acceptable. At 32 there is a watery artifact, but it's not that annoying. At 20 there are a lot of artifacts.

I also noticed that it always outputs audio at 48000 Hz, and read some pages here that justify this. But I imagine that someone encoding music would stay away from that, right?
lithopsian
post Apr 29 2014, 23:03
Post #5

opusenc --speech

Which version of opusenc do you have? It might not be in your version. I have opus-tools 0.1.2 with libopus 1.1. opus-tools 0.1.8 is available; maybe it has *fewer* options?

Are you sure you're outputting mono? 40kbps seems high, but maybe if you are really looking for transparency.

Opus always runs at 48kHz. This is normal. Up-sampling from 44.1kHz shouldn't scare anyone. If you feel you need 96kHz or higher then maybe Opus isn't for you. At high quality (say bitrates 128kbps upwards) Opus doesn't offer much over other modern codecs such as AAC or Ogg Vorbis.
krafty
post Apr 29 2014, 23:27
Post #6

I seem to have the latest version...
The speeches are done in FLAC mono, then converted to lossy for streaming. So it is already mono.
The help file doesn't show anything about --speech

opusenc opus-tools 0.1.8-git (using libopus 1.1.x-git)

Update:

I just confirmed on IRC that this option was removed a long time ago.

This post has been edited by krafty: Apr 29 2014, 23:54
krafty
post Apr 30 2014, 02:29
Post #7

OK, I settled on 32 kbps, which is good enough.
I'm getting something like 10MB per hour of speech.
Very good codec.
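
As a rough cross-check of the file sizes mentioned in this thread, here is a minimal sketch that converts between an average bitrate and the size of one hour of audio. The function names are made up for illustration, and the figures are nominal: container overhead is ignored, and a VBR encode can average below its target on speech with pauses, which might explain a result closer to 10MB than the nominal value.

```python
def megabytes_per_hour(bitrate_kbps: float) -> float:
    """Nominal size in MB (10^6 bytes) of one hour of audio at an average bitrate in kbit/s."""
    bits = bitrate_kbps * 1000 * 3600
    return bits / 8 / 1_000_000


def average_kbps(size_mb: float, hours: float = 1.0) -> float:
    """Average bitrate in kbit/s implied by a file of size_mb megabytes lasting `hours` hours."""
    return size_mb * 1_000_000 * 8 / (hours * 3600) / 1000


if __name__ == "__main__":
    # Bitrates tried earlier in the thread.
    for kbps in (20, 32, 40):
        print(f"{kbps} kbps -> {megabytes_per_hour(kbps):.1f} MB/hour")
    # Sizes reported in the thread: LAME V5 (about 31MB/hour) and Opus (about 10MB/hour).
    for size in (31, 10):
        print(f"{size} MB/hour -> {average_kbps(size):.1f} kbps average")
```

By this arithmetic, 31MB per hour corresponds to an average of roughly 69 kbps, and a nominal 32 kbps stream to 14.4MB per hour.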

My main issue is that it is not compatible with Internet Explorer.
I'll get complaints about audio not opening...
jensend
post Apr 30 2014, 02:38
Post #8





Group: Members
Posts: 143
Joined: 21-May 05
Member No.: 22191



This is mostly directed at lithopsian's confusion.

Yes, they removed the --speech command line option back in September 2012; they gave two reasons. First, they said people were confused about what it did. It was a hint to the encoder which could help it slightly improve some decisions, not a necessary forced mode switch; you could always encode any content without specifying the --speech or --music options. Second, the newer 1.1 encoder does a passable job of automatic classification anyway, which reduces the impact of the hint. (Cf. the section on Automagic Speech vs. Music Discrimination in Monty's 1.1 demo page.) Though the speech classifier isn't perfect, stuff that's borderline to the classifier is usually also close to borderline as far as which encoder mode is better, so the output quality is fine.

The functionality is still there, just "hidden." You can still give it the hint using opusenc --set-ctl-int with some magic numbers. For normal use this probably won't make a noticeable difference. But for some more particular use cases as well as for some kinds of testing the set-ctls can be helpful.

The only place these magic numbers are documented is in opus_defines.h in the source tree. The numbers could be changed at any time; the names are more descriptive and are frozen as part of the API, but opusenc can't handle the names.

To hint to the encoder that your content is speech you set OPUS_SET_SIGNAL_REQUEST to OPUS_SIGNAL_VOICE, i.e. opusenc --set-ctl-int 4024=3001. Music would be 4024=3002. These do exactly the same thing under the hood that --speech and --music used to do.

The other ctl most likely to come in handy is OPUS_SET_MAX_BANDWIDTH_REQUEST, which allows you to tell Opus not to encode any content above a frequency cutoff (4, 6, 8, or 12 kHz). Good for stuff that will be played back on very constrained devices or for when you know that the high frequencies in a recording are only noise.
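
To make the name-to-number mapping concrete, here is a small sketch that builds an opusenc argument list from the symbolic names. The numeric values are the ones I believe libopus 1.1's opus_defines.h uses, so check them against your own copy. The helper names and the file names are hypothetical, and the script only prints the command rather than running it.

```python
# Values as found in opus_defines.h (libopus 1.1); verify against your source tree.
OPUS_SET_MAX_BANDWIDTH_REQUEST = 4004
OPUS_SET_SIGNAL_REQUEST = 4024

OPUS_SIGNAL_VOICE = 3001
OPUS_SIGNAL_MUSIC = 3002

OPUS_BANDWIDTH_NARROWBAND = 1101     # 4 kHz cutoff
OPUS_BANDWIDTH_MEDIUMBAND = 1102     # 6 kHz cutoff
OPUS_BANDWIDTH_WIDEBAND = 1103       # 8 kHz cutoff
OPUS_BANDWIDTH_SUPERWIDEBAND = 1104  # 12 kHz cutoff


def set_ctl_args(request: int, value: int) -> list:
    """Return the opusenc arguments for one integer ctl, e.g. --set-ctl-int 4024=3001."""
    return ["--set-ctl-int", f"{request}={value}"]


def speech_command(infile: str, outfile: str, bitrate_kbps: int = 32, max_bandwidth=None) -> list:
    """Build (but do not run) an opusenc command line that hints the content is speech."""
    cmd = ["opusenc", "--bitrate", str(bitrate_kbps)]
    cmd += set_ctl_args(OPUS_SET_SIGNAL_REQUEST, OPUS_SIGNAL_VOICE)
    if max_bandwidth is not None:
        cmd += set_ctl_args(OPUS_SET_MAX_BANDWIDTH_REQUEST, max_bandwidth)
    return cmd + [infile, outfile]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(" ".join(speech_command("speech.wav", "speech.opus")))
    print(" ".join(speech_command("speech.wav", "speech.opus",
                                  max_bandwidth=OPUS_BANDWIDTH_WIDEBAND)))
```

The returned list could be passed to subprocess.run if opusenc is installed; printing it first makes it easy to check the magic numbers before encoding anything.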
jensend
post Apr 30 2014, 02:51
Post #9

Yes, Microsoft has not been friendly as far as integrating free codecs into IE or Windows Media. See caniuse.com for a table showing which browsers have native Opus playback.

For some of the browsers which don't have native support, it's possible to use a 3rd-party solution. I'm not aware of any plugins etc. for IE which play Opus yet.

The WebM folks publish a framework that allows for Ogg and VP8 playback in IE. They're currently looking into extending this to VP9 and Opus.
zerowalker
post Jun 24 2014, 22:59
Post #10





Group: Members
Posts: 266
Joined: 6-August 11
Member No.: 92828



If I may hijack the thread a bit: what are people using to get transparent speech?
I know it's subjective, but it would be nice to know roughly what people tend to use, as I guess the settings should fall within a fairly narrow range.
IgorC
post Jun 24 2014, 23:48
Post #11





Group: Members
Posts: 1533
Joined: 3-January 05
From: ARG/RUS
Member No.: 18803



http://www.opus-codec.org/comparison/GoogleTest1.pdf
"Opus at 32 kbps is almost transparent" (mono)


P.S. Another one http://research.nokia.com/files/public/%5B..._Opus_Codec.pdf

This post has been edited by IgorC: Jun 24 2014, 23:52
zerowalker
post Jun 25 2014, 00:37
Post #12

QUOTE (IgorC @ Jun 25 2014, 00:48) *


So, according to these, Opus is near-transparent at 32 kbps, and that was a while ago (at least the 2011 one). They made quite an update with 1.1 if I remember correctly, especially with CVBR, which I prefer to use. (Can someone tell me whether I should? I can't find any real information on this; I just treat it as a safety net compared to VBR, which can swing high and low, and that worries me.)

So, for full transparency with some headroom, what is realistic: 96, 128?

I'd rather have too much bitrate than too little; I don't like living on the edge, so to speak.
lithopsian
post Jun 25 2014, 15:20
Post #13

I agree with the term "near transparent" at 32kbps. Definitely not transparent at 24kbps, although quite acceptable for many uses. 96-128 seems like massive overkill for (mono?) speech. That will produce near transparent output for most stereo music. Perhaps 48kbps-64kbps if you really are more bothered about quality than space. Of course taking up more space is really the only downside to using a higher bitrate than really necessary.

If you're concerned about varying bitrates then maybe you shouldn't be using Opus ;) Let it do what it does best. Fixed bitrate encoding becomes progressively less effective at very low bitrates, and progressively less effective in advanced codecs. The documentation specifically states that VBR mode produces more consistent quality. We're not in 1980 any more, Toto.
zerowalker
post Jun 25 2014, 15:30
Post #14

QUOTE (lithopsian @ Jun 25 2014, 16:20) *
I agree with the term "near transparent" at 32kbps. Definitely not transparent at 24kbps, although quite acceptable for many uses. 96-128 seems like massive overkill for (mono?) speech. That will produce near transparent output for most stereo music. Perhaps 48kbps-64kbps if you really are more bothered about quality than space. Of course taking up more space is really the only downside to using a higher bitrate than really necessary.

If you're concerned about varying bitrates then maybe you shouldn't be using Opus ;) Let it do what it does best. Fixed bitrate encoding becomes progressively less effective at very low bitrates, and progressively less effective in advanced codecs. The documentation specifically states that VBR mode produces more consistent quality. We're not in 1980 any more, Toto.



Oh, well, 64 seems like the sweet spot, I guess :)

But wait, I am pretty sure CVBR has been said to be better than VBR (or something along those lines?). I'm not talking about CBR here, which I know is a waste in pretty much all scenarios.
Anakunda
post Jun 25 2014, 18:15
Post #15





Group: Members
Posts: 450
Joined: 24-November 08
Member No.: 63072



Though I searched hard, I can't find a '--speech' option in the current opusenc (opus-tools).
Has it become obsolete or what?
I read somewhere that Opus's analysis switches by itself between the CELT and Opus libraries. I'm not sure about that though, and I agree it would be convenient if the user could decide which library to use.
zerowalker
post Jun 25 2014, 19:47
Post #16

QUOTE (Anakunda @ Jun 25 2014, 19:15) *
Though I searched hard, I can't find a '--speech' option in the current opusenc (opus-tools).
Has it become obsolete or what?
I read somewhere that Opus's analysis switches by itself between the CELT and Opus libraries. I'm not sure about that though, and I agree it would be convenient if the user could decide which library to use.


An earlier post said that it was removed because it was confusing and because it was only a hint, not a forced option.
Anakunda
post Jun 25 2014, 19:49
Post #17

Thanks, so is there a command to force opusenc to use the CELT codec?
lithopsian
post Jun 25 2014, 20:00
Post #18

Internally, there is CELT, SILK, or a hybrid mode. You have very little direct choice about which one is used, although certain modes are always chosen at certain bitrates - basically CELT is used at high bitrates (high for speech), SILK at low bitrates, and the hybrid mode in between. Again, don't worry about it, any more than you'd try to control the level of channel coupling in an MP3 file (you don't, do you?).
lithopsian
post Jun 25 2014, 20:05
Post #19

QUOTE (Anakunda @ Jun 25 2014, 19:49) *
Thanks, so is there a command to force opusenc to use the CELT codec?

In short, no. Why would you want to? Get it wrong and it will sound just awful. Get it a bit wrong and it will be worse than it should without you necessarily noticing. In any case, there is no "pure CELT" inside Opus, just an MDCT encoding based on CELT.
IgorC
post Jun 25 2014, 20:06
Post #20

QUOTE (zerowalker @ Jun 25 2014, 11:30) *
But wait, I am pretty sure CVBR has been said to be better than VBR (or something along those lines?), ...

No, you've probably heard that in the Apple AAC topics.
For Opus, VBR is better than CVBR in quality terms.


It's good to read the official documentation.
http://www.opus-codec.org/docs/
opusenc (.wav to .opus) HTML

QUOTE
--cvbr

Use constrained variable bitrate encoding.

Outputs to a specific bitrate. This mode is analogous to CBR in AAC/MP3 encoders and managed mode in vorbis coders. This delivers less consistent quality than VBR mode but consistent bitrate.


This post has been edited by IgorC: Jun 25 2014, 20:10
lithopsian
post Jun 25 2014, 20:11
Post #21

Read about CELT, SILK, and hybrid mode here.
zerowalker
post Jun 25 2014, 21:14
Post #22

QUOTE (IgorC @ Jun 25 2014, 21:06) *
QUOTE (zerowalker @ Jun 25 2014, 11:30) *
But wait, I am pretty sure CVBR has been said to be better than VBR (or something along those lines?), ...

No, you've probably heard that in the Apple AAC topics.
For Opus, VBR is better than CVBR in quality terms.


It's good to read the official documentation.
http://www.opus-codec.org/docs/
opusenc (.wav to .opus) HTML

QUOTE
--cvbr

Use constrained variable bitrate encoding.

Outputs to a specific bitrate. This mode is analogous to CBR in AAC/MP3 encoders and managed mode in vorbis coders. This delivers less consistent quality than VBR mode but consistent bitrate.



Oh, that seems to be the case, I guess.

I just have a hard time understanding the wording though. As I read it, it simply says:

CVBR gives you larger fluctuations in quality compared to VBR, but the bitrate is steady?

Which in turn would simply mean it's a steady flow with less quality?

Is this interpretation correct?

"This mode is analogous to CBR in AAC/MP3 encoders"

That doesn't tell me much though. I thought AAC/MP3 at CBR meant no variation at all, just forcing a certain bitrate, which I guess is what Opus CBR does, while CVBR is "VBR" with a certain limit attached to it.


Also, just to make sure: frame size only matters for VoIP stuff, right?
For archiving and playback, you should always set it to the max (60?), right?

This post has been edited by zerowalker: Jun 25 2014, 21:18
lithopsian
post Jun 25 2014, 21:43
Post #23

I think it is confusing to say that Opus cvbr is analogous to AAC/MP3 CBR. Perhaps closer to AAC cvbr, but don't assume it is the same. Opus also has a hard-cbr mode which really is constant bitrate, exactly the same number of bytes in every compressed frame, useful for some specialised applications. I'm not familiar with the internals of AAC and MP3, but I don't think either is exactly equivalent to Opus cvbr, or even exactly the same as each other.

Constraining the bitrate variation results in lower quality audio. Bits cannot be saved in passages that could be encoded well at a very low bitrate, hence there are fewer bits available for passages that require more bits to sound good. End result is that, for the same file size, you will hear more audio artefacts.

The default frame size is 20ms. In 90% of cases, leave it at 20ms. Smaller frame sizes will allow for lower latency, but for most applications, 20ms latency (plus a few ms internally) is negligible. Smaller frame sizes mean more overhead, which means either bigger files or lower quality, although in some rare situations a smaller frame size can reduce some audio artefacts. CELT in particular (used for higher bitrates, higher quality) produces lower quality output with smaller frames, while SILK is less sensitive to frame size. The overhead for 2.5ms frames nearly doubles the file size compared to 20ms frames. Above 20ms the overhead hardly changes so there isn't much point going beyond that. The CELT encoder doesn't even support frame sizes larger than 20ms, although it can create 60ms packets from three 20ms frames. The SILK encoder can create 60ms frames and can form 120ms packets from smaller frames. Smaller frames may be helpful where packet loss is expected, and obviously where low latency is critical.
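
To see why overhead grows as frames shrink, here is a back-of-the-envelope sketch of the packet rate for each frame size opusenc accepts, and of the bitrate spent on per-packet overhead. The overhead figure of 2 bytes per packet is a placeholder assumption chosen purely for illustration, not a measured value, and the sketch assumes one packet per frame-size interval.

```python
# Frame sizes (in ms) accepted by opusenc --framesize.
FRAME_SIZES_MS = (2.5, 5, 10, 20, 40, 60)


def packets_per_second(frame_ms: float) -> float:
    """Number of packets per second when each packet covers frame_ms milliseconds."""
    return 1000.0 / frame_ms


def overhead_kbps(frame_ms: float, overhead_bytes_per_packet: float) -> float:
    """Bitrate in kbit/s spent on a fixed per-packet overhead."""
    return packets_per_second(frame_ms) * overhead_bytes_per_packet * 8 / 1000.0


if __name__ == "__main__":
    ASSUMED_OVERHEAD_BYTES = 2  # hypothetical placeholder, not taken from any spec
    for ms in FRAME_SIZES_MS:
        print(f"{ms:>4} ms: {packets_per_second(ms):6.1f} packets/s, "
              f"{overhead_kbps(ms, ASSUMED_OVERHEAD_BYTES):.2f} kbps overhead")
```

Whatever the true per-packet cost is, it is paid eight times as often at 2.5ms as at 20ms, while going from 20ms to 60ms only saves about 33 packets per second.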

This post has been edited by lithopsian: Jun 25 2014, 21:43
zerowalker
post Jun 27 2014, 22:44
Post #24

I am thinking something like this would yield transparent results for mono (mic) content.

CODE
opusenc  --bitrate 70 --vbr --framesize 60 --comp 10  --ignorelength


Make any sense?
lithopsian
post Jun 27 2014, 23:16
Post #25

QUOTE (zerowalker @ Jun 27 2014, 22:44) *
I am thinking something like this would yield transparent results for mono (mic) content.

CODE
opusenc  --bitrate 70 --vbr --framesize 60 --comp 10  --ignorelength


Make any sense?


Probably would be transparent, although you really should listen to some results and then decide if they are suitable for your needs.

As for making sense ... well. --vbr is the default. You can specify it without harm but it is unnecessary. Similarly --comp 10 is also the default, it is provided only so that the high CPU load (aka SLOW) of Opus encoding can be reduced in situations where it would be a problem. I would not specify --ignorelength unless you are actually getting problems without it. In most cases, opusenc will work without this option, but for streaming audio through stdin it is probably needed, and occasionally for input files that don't specify the data length appropriately. With it, you may get undefined results from bad input files.

Lastly, ignoring everything that has been said so far, you have specified a non-standard frame size. I see nothing to indicate that you need this frame size, or that it will be of any benefit to you. The default is 20ms; use it unless you really know why you need some other value. At a bitrate of 70kbps for mono speech, you will almost certainly get straight CELT coding, and it does not even support a 60ms frame size, so you are fudging together weird composite packets, increasing latency and complexity, for no good reason.
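
As an illustration of the point about redundant options, here is a small sketch that strips arguments which merely restate the defaults named in this post (--vbr, --comp 10, and a 20ms frame size). The helper name and the default tables are assumptions made for this example; check the defaults against the help output of your own opusenc build.

```python
# Defaults as described in this thread; treat them as assumptions for your build.
DEFAULT_FLAGS = {"--vbr"}                                # flags that restate the default mode
DEFAULT_VALUES = {"--comp": "10", "--framesize": "20"}   # option -> default value


def drop_defaults(args):
    """Return a copy of args without options that only restate a default."""
    out = []
    i = 0
    while i < len(args):
        arg = args[i]
        if arg in DEFAULT_FLAGS:
            i += 1          # skip the bare flag
            continue
        if arg in DEFAULT_VALUES and i + 1 < len(args) and args[i + 1] == DEFAULT_VALUES[arg]:
            i += 2          # skip the option together with its default value
            continue
        out.append(arg)
        i += 1
    return out


if __name__ == "__main__":
    original = "opusenc --bitrate 70 --vbr --framesize 60 --comp 10 --ignorelength".split()
    print(" ".join(drop_defaults(original)))
```

The sketch leaves --framesize 60 and --ignorelength in place because they are not defaults; whether to drop those as well is the judgment call discussed above.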