looking for efficient speech codec
chrizoo
post Jun 15 2012, 02:12
Post #1





Group: Members
Posts: 345
Joined: 25-March 08
Member No.: 52274



Hi. I have quite a few speech recordings (talk radio, lectures, notes to self, etc.) which I need to encode. I have done some tests with HE-AACv2 (Nero AAC codec 1.5.3.0) and am quite pleased with the quality/filesize ratio.

I then tried Speex (speex-1.2beta1/$). Compared to HE-AACv2 files of the same size, Speex failed horribly for me during playback in VLC 1.1.11, with major artefacts and lower overall quality. Not quite what I expected. (Unless VLC's decoding is broken or the Speex version is too old?)

  1. How does Speex fare against HE-AACv2 for you?
  2. Is there any other substantially more efficient speech codec?
  3. What software player (Windows) can you recommend?
  4. What considerations should I not overlook if I want to keep the files for life (I'm still young :-)?


many thanks



jensend
post Jun 15 2012, 07:15
Post #2





Group: Members
Posts: 143
Joined: 21-May 05
Member No.: 22191



I'll address point 4 first: Bluntly, if you want to keep the files for life, you probably want to keep them in a lossless format. Space is cheap, and formats, encoders, and player support will change over the years. Having a lossless copy means you can always re-encode into whatever the lossy format of the day is.

1. Speex should beat HE-AACv2 at some low bitrates, but its main advantage over HE-AAC and even Vorbis is not quality but latency. It will lose to HE-AAC and Vorbis at moderate or high bitrates.

2. First, the bad news. There's nothing substantially more efficient than HE-AAC with really solid player support right now.

The good news is that this is changing. The Opus codec is already vastly superior to both Speex and HE-AAC for speech, and there's still room for more improvements in the reference encoder. Version 1.0 is just about to be released. There isn't much player support yet, but it should be there quite soon.

3. I find myself just using VLC since it's pretty handy for a lot of stuff.


Just checking- have you been resampling your files before passing them to your HE-AAC encoder? General-purpose codecs like HE-AAC or Vorbis, since they've been tuned for music, will avoid resampling so as to preserve the quality of music's high frequencies, but for speech recordings those higher frequencies are just noise, and removing them allows the encoder to spend more bits on things that matter. Depending on your hearing and preferences, the sample rate sweet spot for straight speech could be anywhere from 12kHz to 24kHz.
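
A minimal sketch of the resampling step described above, not something taken from the thread. It assumes the ffmpeg command-line tool is installed (no resampler is named in this post) and relies on its standard `-ar` (output sample rate) and `-ac` (channel count) options. The file names and the helper names `nyquist_bandwidth_hz` and `build_resample_command` are hypothetical. A sample rate of N Hz can only carry frequencies up to N/2, so the 12kHz to 24kHz range corresponds to keeping roughly 6kHz to 12kHz of audio bandwidth.

```python
def nyquist_bandwidth_hz(sample_rate_hz):
    """Highest frequency a given sample rate can represent (half the rate)."""
    return sample_rate_hz // 2


def build_resample_command(src, dst, sample_rate_hz=24000, mono=False):
    """Build an ffmpeg argument list that resamples src into dst.

    Assumption: ffmpeg is on the PATH. Downmixing to mono is optional and is
    not something the post asks for; it is only offered as a switch here.
    """
    cmd = ["ffmpeg", "-i", src, "-ar", str(sample_rate_hz)]
    if mono:
        cmd += ["-ac", "1"]
    cmd.append(dst)
    return cmd


# Hypothetical file names, for illustration only.
command = build_resample_command("lecture.wav", "lecture_24k.wav")
print(" ".join(command))
print("Audio bandwidth kept:", nyquist_bandwidth_hz(24000), "Hz")

# To actually run it (requires ffmpeg to be installed):
#   import subprocess
#   subprocess.run(command, check=True)
```

The resampled WAV would then be handed to the encoder of choice; whether that helps at all depends on the codec, so it is worth comparing the results by ear.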
Speckmade
post Jul 13 2012, 01:22
Post #3





Group: Members
Posts: 36
Joined: 15-February 05
Member No.: 19848



More information on your scenario would help; otherwise we have to guess.
The most telling hint seems to be the word "efficient". Since you mention Speex and HE-AAC, you seem to be aiming for very low bitrates. I guess that means storage space or transmission bandwidth is a problem?
Detective work is fun, but you might just tell us...

Opus seems to be a really nice replacement for both Speex and HE-AAC: it is a free/libre, open and royalty-free format, an IETF standard, and more efficient than both. Player support is only just emerging, though, and there is no broad support yet. Still, its status as an IETF standard and the interest already visible in the format (some applications even supported the development versions) are promising.
Garf
post Jul 13 2012, 06:01
Post #4


Server Admin


Group: Admin
Posts: 4883
Joined: 24-September 01
Member No.: 13



QUOTE (jensend @ Jun 15 2012, 08:15)
Just checking- have you been resampling your files before passing them to your HE-AAC encoder? General-purpose codecs like HE-AAC or Vorbis, since they've been tuned for music, will avoid resampling so as to preserve the quality of music's high frequencies, but for speech recordings those higher frequencies are just noise, and removing them allows the encoder to spend more bits on things that matter. Depending on your hearing and preferences, the sample rate sweet spot for straight speech could be anywhere from 12kHz to 24kHz.


At least for HE-AAC, removing those high frequencies should have only minimal effects, as the entire upper part of the frequency range is encoded in a few kbps (that's the whole High Efficiency part of the codec).
Garf
post Jul 13 2012, 06:06
Post #5


Server Admin


Group: Admin
Posts: 4883
Joined: 24-September 01
Member No.: 13



QUOTE (chrizoo @ Jun 15 2012, 03:12)
2. Is there any other substantially more efficient speech codec?


Opus, which is a hybrid of a very good speech codec and a very good music codec.

QUOTE
3. What software player (Windows) can you recommend?


As said, Opus support is just emerging, as the codec came out of the standardization process only weeks ago. foobar2000 support appears to be right around the corner, and Firefox Nightlies have had it enabled by default for a few days.

QUOTE
4. What considerations should I not overlook if I want to keep the files for life (I'm still young :-)?


Archive them in a lossless format somewhere? For speech, the lossless bitrate shouldn't be very high, especially if the files are resampled to, say, 24kHz first.
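
To put a rough number on that, here is a small sketch that computes the uncompressed PCM bitrate and size per hour. The 16-bit sample depth and the mono channel count are assumptions (the thread does not state either), and the helper names are hypothetical. Raw PCM is only an upper bound: a lossless codec will shrink it further by an amount that depends on the material, which is deliberately not modelled here.

```python
def pcm_bits_per_second(sample_rate_hz, bits_per_sample=16, channels=1):
    """Uncompressed PCM bitrate in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


def pcm_bytes_per_hour(sample_rate_hz, bits_per_sample=16, channels=1):
    """Uncompressed PCM size of one hour of audio, in bytes."""
    bits = pcm_bits_per_second(sample_rate_hz, bits_per_sample, channels) * 3600
    return bits // 8


# Assumed: 16-bit samples; mono for speech, stereo for the CD-style comparison.
speech = pcm_bytes_per_hour(24000, channels=1)
cd_style = pcm_bytes_per_hour(44100, channels=2)

print(pcm_bits_per_second(24000) // 1000, "kbps raw at 24 kHz mono")
print(speech / 1e6, "MB per hour at 24 kHz mono")
print(cd_style / 1e6, "MB per hour at 44.1 kHz stereo")
```

Under these assumptions, an hour of 24kHz mono speech is about 173 MB before any lossless compression, against about 635 MB for 44.1kHz stereo.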
