Topic: opus echo sample

opus echo sample

I know bitrate=35 isn't going to be hi-fi, but I found this while playing with the speech/music theme. The first note of the tune has an echo, to the point of sounding like a whole extra note. It happens in both 1.1a and babyeater. The forum says opus isn't an allowed file type for upload, so I uploaded just the FLAC file. At 40 it's more like a mushy attack on the first note.

Edit: uploaded the resulting Opus file, converted back to FLAC.


Reply #2

OK, problem identified. What happens is that the speech/music detector is classifying the silence at the beginning of the file as speech (probably because the training set had too much silence in the speech). I'm working on fixing this. BTW, it seems like this thread belongs in the Opus forum. Can someone move it?
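For anyone who wants to check whether their own sample starts with silence like this one, here is a rough sketch that counts the low-level frames at the start of a signal. It is not the detector in the encoder and not the fix; the function name `leading_silence_frames`, the -60 dBFS threshold, and the 960-sample frame (20 ms at 48 kHz) are all assumptions picked for illustration.

```python
import math

def leading_silence_frames(samples, frame_size=960, threshold_db=-60.0):
    """Count consecutive frames at the start whose RMS level is below threshold_db (dBFS).

    Hypothetical helper for illustration; samples are floats in [-1.0, 1.0].
    """
    count = 0
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        rms = math.sqrt(sum(x * x for x in frame) / frame_size)
        # Digital silence has rms == 0, which log10 cannot handle.
        level_db = 20.0 * math.log10(rms) if rms > 0.0 else float("-inf")
        if level_db >= threshold_db:
            break
        count += 1
    return count

# Synthetic test signal: 0.5 s of digital silence, then 0.5 s of a 440 Hz tone.
rate = 48000
signal = [0.0] * (rate // 2) + [
    0.5 * math.sin(2 * math.pi * 440 * n / rate) for n in range(rate // 2)
]
silent = leading_silence_frames(signal)
print(silent, "silent frames,", silent * 960 / rate, "seconds")
```

If a file shows a long run of silent frames before the first note, it is probably a candidate for the same misclassification described above.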

Reply #3
What happens is that the speech/music detector is classifying the silence at the beginning of the file as speech (probably because the training set had too much silence in the speech).

I also heard similar issues on other items starting with a fade-in (the first half-second or so was apparently coded with SILK). It was either at 24 or 32 kbps, official v1.1 binary. Makes me wonder: which stereo tools are available for SILK? Can it do "true" stereo with a downmix+residual (or M/S) or just some kind of time-domain intensity stereo?

Chris
If I don't reply to your reply, it means I agree with you.

Reply #4

SILK uses downmix+residual, aka M/S stereo in the signal domain (unlike CELT and Vorbis, which do M/S after band normalization). That explains some of the stereo artefacts it sometimes causes. That being said, it usually sounds good on "normal" stereo speech that doesn't have too much channel separation. I recently checked in some changes to the detector in the exp_analysis branch that improve the decision code and add the possibility of using look-ahead (up to 2 seconds) to make that decision better.
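To make "downmix+residual in the signal domain" concrete, here is a minimal sketch of the textbook M/S transform applied directly to time-domain samples. This is only an illustration, not SILK's actual stereo code, which does more than a plain sum and difference; the function names `ms_encode` and `ms_decode` are made up for the example.

```python
def ms_encode(left, right):
    """Convert L/R samples to mid (downmix) and side (residual)."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side

def ms_decode(mid, side):
    """Rebuild L/R from mid and side: L = M + S, R = M - S."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right

left = [0.5, 0.25, -0.75]
right = [0.5, -0.25, 0.25]

mid, side = ms_encode(left, right)
print(mid)   # [0.5, 0.0, -0.25]
print(side)  # [0.0, 0.25, -0.5]

# Without quantization the round trip is lossless.
print(ms_decode(mid, side) == (left, right))  # True
```

When the two channels are similar, the side signal is small and cheap to code. With wide channel separation the side signal carries a lot of energy, so coarse quantization of it at low bitrates is one plausible source of the artefacts mentioned above.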