Text to speech component
zanson
post Jun 26 2003, 00:09
Post #26





Group: Members
Posts: 126
Joined: 17-April 03
Member No.: 6027



QUOTE (upNorth @ Jun 25 2003 - 06:52 PM)
Vocal remover, lyrics as input and you can have Sam sing your favourite tune... :D

:o SCARY VERY SCARY
VTR
post Jun 26 2003, 03:20
Post #27





Group: Members
Posts: 4
Joined: 19-June 03
From: Fresno
Member No.: 7275



I'm in the process of getting those AT&T Natural Voices. I'll let y'all know how it goes.
anza
post Jun 26 2003, 19:41
Post #28





Group: Members
Posts: 1317
Joined: 4-January 03
From: Finland
Member No.: 4418



A hard song for MS Mary: Led Zeppelin - D'yer Mak'er :D
paulski
post Jun 26 2003, 20:38
Post #29





Group: Members
Posts: 58
Joined: 23-June 03
Member No.: 7362



:lol: Mary likes a challenge.
I'm looking forward to your impressions of the Natural Voices, VTR.
zanson
post Jun 26 2003, 20:39
Post #30





Group: Members
Posts: 126
Joined: 17-April 03
Member No.: 6027



QUOTE (paulski @ Jun 25 2003 - 03:59 AM)
I'm already working on your last suggestion Canar, but I'm not sure what the advantage of using it as an input plugin is. Can you expand on that?
upNorth - I'm not sure if it's possible to delay playback of the audio until the speech has finished because I currently use the callback from foobar that a song has just started as a trigger to speak. Any suggestions?

Paulski

kode54 has a "pause between tracks" plugin with source code on his page: http://www.cqasys.com/projects/kode54/ or http://www.cqasys.com/projects/kode54/0.7/ for it updated to the 0.7beta sdk. You should be able to build upon this to insert the song names instead of a pause. (I haven't looked at any of the code to see how he does the pauses)

This post has been edited by zanson: Jun 26 2003, 20:40
paulski
post Jun 26 2003, 22:55
Post #31





Group: Members
Posts: 58
Joined: 23-June 03
Member No.: 7362



There's a new version of the plugin available with the following features:
DJ mode: the volume of the music is reduced whilst speaking. (BTW thanks for the info zanson, I'll check it out because there's currently a latency issue with reducing the volume.)
Tag field customisation: there are 3 text fields where you can specify which tag fields should be read aloud.
The volume and rate settings are currently not enabled (I will let you know when they are).

Paulski
Saint
post Jul 23 2003, 23:07
Post #32





Group: Members
Posts: 99
Joined: 9-June 02
From: England
Member No.: 2253



This sounds like a really great plugin. Any news on a new version? Maybe for 0.7x? :D


--------------------
"If you cannot read this, please ask the flight attendant for assistance."
- United Airlines Flight Safety Brochure
jrbamford
post Jul 24 2003, 16:00
Post #33





Group: Members
Posts: 308
Joined: 1-December 01
Member No.: 569



I used some Linux speech tools when porting a friend's mp3 jukebox over to supporting mpc and the like... I had it so you could print out a list of albums with their numbers, and then using a normal infrared remote you could just press the digits for the album you wanted

123 enter

and it would queue or play the album... or best of all, leave it in random mode and it would just peruse all your tracks... It had some user management too: you logged in via telnet, you rated songs, and then in random mode it only played songs everyone liked (or at least wouldn't play any that someone logged in didn't like). It was great for computer labs at uni...

I really would love to replicate something like this using foobar, but I digress. When I did the speech part it output the usual artist, album, title... Crucially, it was on a button on the remote: pressing it paused the song, told you its details, and then played it again. Something like this would be really useful. I don't know how IR control of foobar is working as yet, but this to me is the ideal. Announcing every track as an option is good, but when you know the track (and you usually know most tracks) it's just a gimmick that could frustrate. As an option available at any time in a song it's great, although it really matters most when you can NOT see the screen...

Is this possible?


--------------------
Binaural recordings of mine: http://binaural.jimtreats.com
Saint
post Jul 24 2003, 16:10
Post #34





Group: Members
Posts: 99
Joined: 9-June 02
From: England
Member No.: 2253



QUOTE
I used some Linux speech tools when porting a friend's mp3 jukebox over to supporting mpc and the like... I had it so you could print out a list of albums with their numbers, and then using a normal infrared remote you could just press the digits for the album you wanted

123 enter

and it would queue or play the album...


That really is a great idea, as I use a remote control to control foobar myself (Hauppauge WinTV PCI controller and a little program called IR Remote). Selecting the album using numbers and having it read out would be a dream come true.

Saint

This post has been edited by Saint: Jul 24 2003, 16:10


--------------------
"If you cannot read this, please ask the flight attendant for assistance."
- United Airlines Flight Safety Brochure
jrbamford
post Jul 24 2003, 16:20
Post #35





Group: Members
Posts: 308
Joined: 1-December 01
Member No.: 569



this reading aloud is only doing artist... album works too.. but title is never being read out..

QUOTE
That really is a great idea, as I use a remote control to control foobar myself (Hauppauge WinTV PCI controller and a little program called IR Remote).


I have the PVR-250. I don't much like the remote, but it's only one possible remote, I guess. I bought a few IR devices from Europe; one broke. I've since bought the Actisys IR200L to use with the PVR-250 and SageTV. It sends IR commands to my satellite box, changing channels as and when it wants to, and it also works for reading in IR.

Anyway, how much control of foobar have you got? I've not tried as I assumed it wasn't much. For my Sage controlling I bought a wireless mouse and keyboard. This lets me do everything, and is essential when I connect the PC to my projector to watch TV/DVDs. It's great, but for straight audio nothing beats IR remote control :)


--------------------
Binaural recordings of mine: http://binaural.jimtreats.com
jrbamford
post Jul 24 2003, 16:22
Post #36





Group: Members
Posts: 308
Joined: 1-December 01
Member No.: 569



OK, title is working now... great stuff. It'd be nice if you could add your own plain-text separators. Words such as

BY, ON, IN

etc. are all very useful and make the fields flow a little better together. Now to try some of the updated voices.


--------------------
Binaural recordings of mine: http://binaural.jimtreats.com
paulski
post Jul 30 2003, 07:48
Post #37





Group: Members
Posts: 58
Joined: 23-June 03
Member No.: 7362



Hi

Sorry, I was out of touch with the forum for a while. jrbamford, your suggestion sounds excellent. I think all that's needed is to provide a hotkey function to read out the current song. You can then associate this hotkey with whatever RC you are using. I will enable this feature and put it in the config page as an option.

Paulski

P.S. There are further possibilities for speech that may enhance foobar and even eliminate the need for a display:
Speaking aloud the list of albums (from the database) and using speech input to add to the playlist and play (general speech control of the app is also an option).

What do you think?
foosion
post Jul 30 2003, 12:40
Post #38





Group: FB2K Moderator (Donating)
Posts: 4414
Joined: 24-February 03
Member No.: 5153



QUOTE (paulski @ Jul 30 2003, 08:48 AM)
There are further possibilities for speech that may enhance foobar and even eliminate the need for a display:
Speaking aloud the list of albums (from the database) and using speech input to add to the playlist and play (general speech control of the app is also an option).

What do you think?

My suggestion would be that you implement an interface to the TTS engine in the form of a foobar service (in case you haven't already done this). Other plugins can use TTS capabilities in this way, and you only need one configuration for the TTS engine (in your plugin).
I don't know if the STT (speech-to-text) stuff would work out well for things like adding entries to the playlist. Surprise me.


--------------------
http://foosion.foobar2000.org/ - my components for foobar2000
paulski
post Jul 30 2003, 14:40
Post #39





Group: Members
Posts: 58
Joined: 23-June 03
Member No.: 7362



I'm not sure I understand the advantage of your suggestion foosion. I do not implement a TTS engine myself but simply make use of it via the Microsoft Speech API on client systems that already have it installed (win2k / XP). I guess your solution would save the developer from installing the MS Speech SDK though.

The hotkey approach would still be nice though, whereby users (non-programmers) of Girder-type applications can simply forward a key event to foobar that the plugin can capture. The user simply associates key combinations within the config panel of the plugin with the Girder key to invoke a TTS call.

I'm unsure myself about whether the speech-based playlist control / composition would work nicely.
jrbamford
post Jul 30 2003, 15:00
Post #40





Group: Members
Posts: 308
Joined: 1-December 01
Member No.: 569



paulski, yes, having it on a button (which can then be mapped to IR controller keys) is what you need. Configuring it to pause or not pause (just play simultaneously as it does now) would be useful. Getting a lot more fields added would also be good; it shouldn't take much to add 5 or 6 fields in total, although like I said, having a plain-text field where you can type whatever you want would be the most flexible. Following a reinstall I don't have your plugin installed yet. I'm also moving over to the 0.7 beta (does it work with this, does it ONLY work with this? :) ) but roughly this would be nice

Field 1: %TITLE%
Field 2: by
Field 3: %ARTIST%
Field 4: off of
Field 5: %ALBUM%
Field 6: in
Field 7: %YEAR%

I guess really you would want a page that lets you have as many fields as you want without cluttering it up. Above, fields 2, 4, and 6 are plain text to allow you to sculpt it however you want. I also noticed that when I last used this plugin, the speech didn't get sent to the streaming output created by foobar. Not a biggie, but it'd be nice to have this as an option if it's possible. Streaming music is obviously a really good example of where this system is nice: listeners get some information, but at the moment that information is restricted (oddcast only broadcasts artist - title). As such, this kind of mechanism, especially if you are able to put it in before a track starts, would be a great way to inform people of what they are listening to.


--------------------
Binaural recordings of mine: http://binaural.jimtreats.com
foosion
post Jul 30 2003, 16:27
Post #41





Group: FB2K Moderator (Donating)
Posts: 4414
Joined: 24-February 03
Member No.: 5153



I know you did not develop a TTS engine yourself. I merely assumed* it would take some hassle to get the TTS engine working nicely with (within?) foobar, and that you already had some code to accomplish this. So the benefits of having a foobar service for a TTS engine would be that the extra code to get the TTS engine working with foobar would only need to be in one plugin. Other plugins would just use the service like this:
CODE
if (text_to_speech::present())
  text_to_speech::speak("Foobar rules!");
provided that present() and speak() are static methods of a hypothetical text_to_speech interface.
Sorry for making assumptions, I did not have a look at the MS SAPI.

*: usually a bad idea, I know.
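
For anyone wondering what such a service might look like, here is a minimal sketch. Everything in it is hypothetical: `text_to_speech`, `register_backend()`, `present()` and `speak()` are illustrative names built around the two-line snippet above, not classes or functions from the foobar2000 SDK. The idea is that the TTS plugin registers one callback, and other plugins only ever call the two static methods.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// Hypothetical interface sketched from the post above.
// This is NOT a foobar2000 SDK class; all names are assumptions.
class text_to_speech {
public:
    using backend_fn = std::function<void(const std::string&)>;

    // Called once by the plugin that owns the actual TTS engine.
    static void register_backend(backend_fn fn) { backend() = std::move(fn); }

    // True when some plugin has registered an engine.
    static bool present() { return static_cast<bool>(backend()); }

    // Forwards the text to the registered engine; does nothing if none exists.
    static void speak(const std::string& text) {
        if (present()) backend()(text);
    }

private:
    // Function-local static avoids static initialization order problems.
    static backend_fn& backend() {
        static backend_fn fn;
        return fn;
    }
};
```

With something like this, the engine configuration would live only in the plugin that calls `register_backend()`, and a caller that finds `present()` false can simply stay silent.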


--------------------
http://foosion.foobar2000.org/ - my components for foobar2000
paulski
post Jul 30 2003, 17:42
Post #42





Group: Members
Posts: 58
Joined: 23-June 03
Member No.: 7362



With the latest version of the MS Speech SDK, only about 5 - 10 lines of code are needed to get it to say anything (through helper functions). Your point would certainly be valid for the older versions of the SDK though. The benefit of a single plugin may still be valid however, since developers wouldn't need to download and install the SDK in order to make use of speech in their code.
jrbamford
post Jul 30 2003, 18:45
Post #43





Group: Members
Posts: 308
Joined: 1-December 01
Member No.: 569



paulski, any ideas why the speech doesn't come out with the broadcast web stream? What methods are you using to put out the sound? I guess you are just sending the sound to wave out / DirectSound etc., so it's got no attachment to foobar's output stream and that's why it's never broadcast? Do the SDKs allow you to piggyback the speech stream onto an existing output buffer?

This post has been edited by jrbamford: Jul 30 2003, 18:47


--------------------
Binaural recordings of mine: http://binaural.jimtreats.com
zanson
post Jul 30 2003, 19:37
Post #44





Group: Members
Posts: 126
Joined: 17-April 03
Member No.: 6027



QUOTE (jrbamford @ Jul 30 2003, 10:00 AM)
but roughly this would be nice

Field 1: %TITLE%
Field 2: by
Field 3: %ARTIST%
Field 4: off of
Field 5: %ALBUM%
Field 6: in
Field 7: %YEAR%

I guess really you would want a page that lets you have as many fields as you want without cluttering it up. Above, fields 2, 4, and 6 are plain text to allow you to sculpt it however you want. I also noticed that when I last used this plugin, the speech didn't get sent to the streaming output created by foobar. Not a biggie, but it'd be nice to have this as an option if it's possible. Streaming music is obviously a really good example of where this system is nice: listeners get some information, but at the moment that information is restricted (oddcast only broadcasts artist - title). As such, this kind of mechanism, especially if you are able to put it in before a track starts, would be a great way to inform people of what they are listening to.

It should be pretty easy to just have a text input that you can type foobar format strings into, and then use the conversion functions to get back the string to be said.

i.e., just have a text input where you put
%TITLE% by %ARTIST% off of %ALBUM% in %YEAR%

which you use the foobar SDK to convert to
some song by some artist off of some album in 1962

then pass that string into the text-to-speech engine.
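
To make the substitution step concrete, here is a rough standalone sketch. In the plugin itself the foobar SDK's own title-formatting functions would do this work; `expand_fields` below is a hypothetical stand-in that only handles plain `%FIELD%` tokens, and the tag map is an assumption used for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical helper, not part of the foobar2000 SDK: replaces each
// %FIELD% token in `format` with the matching entry from `tags`.
// Unknown fields become "?" and an unmatched '%' is copied through as text.
std::string expand_fields(const std::string& format,
                          const std::map<std::string, std::string>& tags)
{
    std::string out;
    std::size_t pos = 0;
    while (pos < format.size()) {
        const std::size_t open = format.find('%', pos);
        if (open == std::string::npos) {            // no more tokens
            out.append(format, pos, std::string::npos);
            break;
        }
        out.append(format, pos, open - pos);        // literal text before token
        const std::size_t close = format.find('%', open + 1);
        if (close == std::string::npos) {           // unmatched '%': keep it
            out.append(format, open, std::string::npos);
            break;
        }
        const std::string name = format.substr(open + 1, close - open - 1);
        const auto it = tags.find(name);
        out += (it != tags.end()) ? it->second : std::string("?");
        pos = close + 1;
    }
    return out;
}
```

Called with a map holding TITLE, ARTIST, ALBUM and YEAR, the format string from the post yields the sentence shown there, which can then be handed to whatever speak call the plugin uses.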
jrbamford
post Jul 30 2003, 21:52
Post #45





Group: Members
Posts: 308
Joined: 1-December 01
Member No.: 569



Sounds good.. cleaner than having lots of text boxes appearing too :)

Dammit, I would really like a program that writes text-to-speech output to a WAV file right now... so I could add one to the end of this playlist before I kill it for a sleeping listener :)


--------------------
Binaural recordings of mine: http://binaural.jimtreats.com
paulski
post Jul 31 2003, 05:32
Post #46





Group: Members
Posts: 58
Joined: 23-June 03
Member No.: 7362



Nice idea about the text input format. I like it. I will definitely make a version that supports it.
The speech output currently goes directly to the soundcard. It is possible to mix the output with the stream but I don't know how much work that would be.
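
As a rough idea of what the mixing itself involves (not how the plugin or foobar does it), here is a sketch that adds 16-bit PCM speech samples onto a music buffer with clamping. `mix_speech_into` and its `music_gain` parameter are hypothetical names; the sketch assumes both buffers already share the same sample rate and channel layout, which in practice is probably where most of the work would be.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: mixes 16-bit PCM speech into a music buffer in place.
// music_gain (e.g. 0.5f) turns the music down while speech overlaps it,
// similar in spirit to the DJ mode. Samples beyond the end of the speech
// buffer are left untouched.
void mix_speech_into(std::vector<std::int16_t>& music,
                     const std::vector<std::int16_t>& speech,
                     float music_gain)
{
    const std::size_t n = std::min(music.size(), speech.size());
    for (std::size_t i = 0; i < n; ++i) {
        const float mixed = music[i] * music_gain + speech[i];
        // Clamp to the int16 range so loud passages saturate instead of wrapping.
        const float clamped = std::max(-32768.0f, std::min(32767.0f, mixed));
        music[i] = static_cast<std::int16_t>(clamped);
    }
}
```

Because the result would land in the same buffer that goes to the output, a broadcast stream fed from that buffer would carry the speech too.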

This post has been edited by paulski: Jul 31 2003, 08:17
se7ven 777
post Aug 14 2003, 22:21
Post #47





Group: Members
Posts: 19
Joined: 12-May 03
Member No.: 6575



What about this nice plugin? How is the work going? :) Any results?
paulski
post Aug 17 2003, 13:39
Post #48





Group: Members
Posts: 58
Joined: 23-June 03
Member No.: 7362



Not yet. I've been as busy as a very busy bee the past couple of weeks. I'll get started over the next few days though and put the source files on my site so others can extend it further.

Paulski
paulski
post Aug 18 2003, 21:57
Post #49





Group: Members
Posts: 58
Joined: 23-June 03
Member No.: 7362



At last. A new version of the text to speech plugin (go to the link at the start of this thread).
The plugin can now speak aloud a formatted string entered in the config using the same notation provided by title formatting.
There is also the option to manually trigger song announcements using a shortcut key (you have to assign a key yourself in the keyboard shortcuts config by choosing 'Say current playlist item').
There is also a (disabled) option for auto announcing around the end of a song (like the DJs do). I'll enable it at a later date.

Paulski
Saint
post Aug 18 2003, 23:32
Post #50





Group: Members
Posts: 99
Joined: 9-June 02
From: England
Member No.: 2253



I take it this is for 0.667 as it complains about needing to be compiled with a new SDK. Sounds like a great plugin, keep up the good work.

Saint


--------------------
"If you cannot read this, please ask the flight attendant for assistance."
- United Airlines Flight Safety Brochure