Jon Aquino's Mental Garden

Engineering beautiful software jon aquino labs | personal blog

Wednesday, April 06, 2005

Audiolicious: Turn Any RSS Feed Into A Podcast, Using Text-To-Speech

>>> Download audiolicious-1.1.zip (1.3 MB) <<<

Audiolicious is a Windows program that lets you turn any RSS feed into a podcast. It uses text-to-speech to convert the feed's webpages into MP3 files.

Audiolicious works especially well with RSS feeds created using the del.icio.us bookmarking website -- just tag a web page using del.icio.us to add it to the Audiolicious job queue.


Instructions:
  1. Unzip the Audiolicious zipfile to a directory on your computer.
  2. Install Ruby.
  3. Install the Microsoft Speech API.
  4. Install the Microsoft Mary voice.
  5. Test Audiolicious by running audiolicious.bat. The MP3s will appear in the output directory.
  6. Configure Audiolicious by opening audiolicious.rb in your favourite text editor and changing the rss_feed and output_dir.
  7. Set up audiolicious.bat to run daily using Windows Scheduler.
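
As a rough sketch of step 6, the two settings might look something like this. Only the names rss_feed and output_dir come from the instructions; the values below are placeholders, and the sanity check at the end is an addition for illustration, not something audiolicious.rb is known to contain.

```ruby
require 'uri'

# Placeholder values: substitute your own feed URL and folder.
rss_feed = "http://example.com/my-feed.rss"
# Ruby accepts forward slashes in Windows paths, which avoids backslash escaping.
output_dir = "C:/podcasts/audiolicious"

# Illustrative check (an assumption, not part of the original script):
# fail early if the feed setting is not an http or https URL.
unless URI.parse(rss_feed).is_a?(URI::HTTP)
  raise ArgumentError, "rss_feed should be an http(s) URL"
end

puts "Feed: #{rss_feed}"
puts "MP3s will be written to: #{output_dir}"
```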

How this came about. Like many others, I have been disappointed with the quality of the content of today's podcasts (to be fair, podcasting is still in its early days), especially with so much good textual content on the web. For example, I wished that there was a podcast of biographies of great lives -- after all, there are numerous websites on the subject. Then it struck me: What if I used text-to-speech to convert these textual webpages to podcasts that I could listen to during my daily commute? Thus Audiolicious was born.

43 Comments:

  • This comment has been removed by a blog administrator.

    By Blogger Darren Torpey, at 4/15/2005 8:09 p.m.  

  • Just noticed Audiolicious being recommended over at: http://pchere.blogspot.com/2005/02/absolutely-delicious-complete-tool.html.

    It's nice to be recognized, no?

    By Blogger Darren Torpey, at 4/15/2005 8:14 p.m.  

  • Nice! It's mutual - I'm grateful for the recognition, and on the flip side I love making life wonderful for people.

    By Blogger Jonathan, at 4/16/2005 8:14 a.m.  

  • Version 1.1 now works with Feedburner feeds.

    Plus, if you want to convert your blog into a podcast, you can now set filters to filter out stuff before and after your actual post.

    By Blogger Jonathan, at 4/22/2005 10:06 p.m.  

  • Thanks for this, Jonathan. I'm looking forward to trying this. Keep up the great work. ... MP

    By Blogger Michael Pastore, at 4/25/2005 6:10 p.m.  

  • My pleasure Michael. It works great with del.icio.us: tag pages you want to convert, then point Audiolicious to that del.icio.us tag's feed.

    Haven't done much experimentation with converting a blog into a podcast. Todd Brill is planning to use Audiolicious to do some experimentation with that one.

    By Blogger Jonathan, at 4/25/2005 7:05 p.m.  

  • wow. this works pretty well. Does anybody know if some "nicer" voices exist out there for this? thanks.

    By Anonymous Anonymous, at 5/09/2005 10:58 a.m.  

  • Hi Anon - The best voices out there right now are AT&T Natural Voices. They've got an online demo but unfortunately it is limited to 30 seconds -- enough to make amazing answering messages though: http://www.research.att.com/projects/tts/demo.html

    By Blogger Jonathan, at 5/09/2005 12:04 p.m.  

  • Jon's right, the ATT voices are the best. But to use them (or the default system voice) you need to remove the line "Set tts.Voice = tts.GetVoices("Name=Microsoft Mary", "Language=409").Item(0)" from the texttowave.vbs.

    By Anonymous Anonymous, at 5/16/2005 8:41 a.m.  

  • Hi Colin - Thanks for adding that point about how to use the AT&T voices with Audiolicious. Have you got them working?

    By Blogger Jonathan, at 5/16/2005 1:16 p.m.  

  • Jon - Yes, but I already had a copy of ATT Natural Voices that I bought (~35.00 USD) with my copy of TextAloud MP3 from www.nextuptech.com about 2 years ago.
    To enable them, simply remove the line that selects Mary in texttomp3. Audiolicious should now use the default system voice. To change the default voice, go to the speech control panel. I believe there are free SAPI5-compliant voices available through Carnegie Mellon's FestVox project (http://festvox.org) but I'm not sure...

    By Anonymous Anonymous, at 5/18/2005 5:18 a.m.  

  • Great to know - thanks Colin.

    By Blogger Jonathan, at 5/18/2005 7:44 a.m.  

  • With text-to-speech on your portable device (e.g. PocketPC with Fonix iSpeak), you don't need to waste space storing MP3s. I have a Perl script that with a single click reads a URL or text block from the Windows clipboard and enqueues it for listening in iSpeak. It also strips headers/footers from my favorite sites in an extensible way.

    Now if only iSpeak had rewind/fast-forward functionality....

    By Anonymous Anonymous, at 6/01/2005 3:53 p.m.  

  • Hi Brian - Hm - TTS directly on the PDA sounds like a good option. Unfortunately this Fonix iSpeak that you speak of isn't free ($30 US), but it is a good option.

    By Blogger Jonathan, at 6/01/2005 7:16 p.m.  

  • I work for Cepstral, a company that produces high quality TTS voices. Our engine runs on Mac, Windows, and WinCE, so you can use them in mobile or desktop environments. (I recommend David & Diane.) You can try any voices for free at http://www.cepstral.com

    Regards - Cepstral

    By Anonymous Anonymous, at 8/25/2005 11:52 a.m.  

  • This is a fantastic piece of convenience!
    I agree fully that this will multimedialize the blogosphere much quicker than podcasts (as Darren already pointed out).
    I just hope that there is room for improvement to the way the voice sounds...?

    By Anonymous Anonymous, at 8/26/2005 2:08 p.m.  

  • Guy, this is going to be famous, you need a logo!

    By Anonymous Anonymous, at 8/26/2005 7:15 p.m.  

  • Hmm -- Cepstral is OK. I like AT&T better. Interesting that you can run Cepstral on mobile devices though.

    Hi Roland - Yeah, the Microsoft Mary voice is a bit grating. It's alright though.

    Thanks anonymous!!

    By Blogger Jonathan, at 8/26/2005 7:23 p.m.  

  • Wondering if you considered using Festival for this, to make it really open.

    see online demo: http://www.cs.cmu.edu/~awb/festival_demos/userin.html

    Uri

    By Anonymous Anonymous, at 8/27/2005 9:16 a.m.  

  • Hi Uri - Nice demo of the Festival Speech Engine. I think I chose not to use it because I didn't like the quality of the voice that came with it. The demo you mentioned has some nice "MBROLA" voices, but if I remember correctly you have to pay for them.

    By Blogger Jonathan, at 8/27/2005 10:20 a.m.  

  • Somehow I can't get this to work. Does it depend on what version of Ruby I am running?
    What happens is, it gets everything and then creates the MP3 files in the output directory, but they are all 0 KB in size.
    As I watch its process, it says 'cannot find TextToWave\texttowav.wav'

    By Blogger Scott, at 8/29/2005 12:59 p.m.  

  • My bad. I thought I had SAPI 5 and Mary installed but I forgot I just rebuilt my machine. Doh!
    Works cool. Does it support multiple feeds?

    By Blogger Scott, at 8/29/2005 2:49 p.m.  
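
The zero-size MP3s described above turned out to mean that SAPI 5 and the Mary voice were not installed. A quick way to spot that symptom after a run is to list the empty files in the output directory. This is only a sketch: the helper name empty_mp3s and the example path are made up for illustration and are not part of Audiolicious.

```ruby
# Hypothetical helper, not part of audiolicious.rb:
# returns the paths of all zero-byte .mp3 files directly inside output_dir.
def empty_mp3s(output_dir)
  Dir.glob(File.join(output_dir, "*.mp3")).select { |path| File.zero?(path) }.sort
end

# Example usage with a placeholder path; prints nothing if the folder is missing
# or every MP3 has content.
empty_mp3s("C:/podcasts/audiolicious").each do |path|
  puts "Empty MP3 (check that SAPI 5 and the voice are installed): #{path}"
end
```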

  • Hi Scott - A couple of options for doing multiple feeds:

    1. Install Audiolicious in several different directories. Voila!

    2. Use a feed-combiner like Superblog.org

    By Blogger Jonathan, at 8/29/2005 8:06 p.m.  

  • yukky mate.. I don't like it.. I wouldn't listen to it.

    sounds like something off a space film..

    silly..

    By Anonymous Anonymous, at 10/31/2005 10:31 a.m.  

  • Hi Anonymous - Yeah, the quality isn't as good as the AT&T voices, that's for sure. At least it's free!

    By Blogger Jonathan, at 10/31/2005 10:33 a.m.  

  • Cool. I have used TextAloud but it doesn't allow for RSS feeds.

    In TextAloud, however, you can change the speed and pitch of the voice.

    I looked in texttowave.vbs and couldn't find the code that would make those changes.

    Is there anything to do to change the speed of Mary's voice?

    By Blogger Unknown, at 11/26/2005 5:14 a.m.  

  • Hi SITW - Evidently you can change the rate as follows:

    tts.Rate = 1

    where Rate goes from -10 to 10. Pitch is probably similar - experiment!

    I found this out from http://www.codecomments.com/message441129.html

    If you confirm that this works, it would be great if you posted a comment about how it went.

    By Blogger Jonathan, at 11/26/2005 9:42 a.m.  
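
For anyone who prefers to experiment from Ruby rather than in texttowave.vbs, the same Rate property can be set through Ruby's win32ole library. This is a sketch under assumptions: it assumes the tts object in the script is a SAPI 5 SpVoice (ProgID "SAPI.SpVoice"), the helper names clamp_rate and speak_at_rate are invented here, and the speaking part only works on Windows with SAPI 5 installed.

```ruby
# Hypothetical helper: keeps a requested rate inside the -10..10 range
# mentioned in the comment above.
def clamp_rate(rate)
  rate.to_i.clamp(-10, 10)
end

# Hypothetical helper: speaks text at the given rate.
# Windows only; assumes SAPI 5 is installed.
def speak_at_rate(text, rate)
  require 'win32ole'
  tts = WIN32OLE.new('SAPI.SpVoice')
  tts.Rate = clamp_rate(rate)
  tts.Speak(text)
end

puts clamp_rate(1)    # within range, unchanged
puts clamp_rate(25)   # too fast, limited to 10
puts clamp_rate(-42)  # too slow, limited to -10
```

On a Windows machine, calling speak_at_rate("Testing the speaking rate", 3) should read the sentence aloud slightly faster than the default.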

  • I'm having the same problem one of you did: it makes an MP3 file, but it is 0 KB in size. The texttowav script works when I drag text onto it and makes a wav file. I have SAPI 5 installed, have restarted, etc., and it still will not work. I downloaded the newest version of Ruby and it is working: the script runs and finds the RSS feed, but I do not see it making a wav file. Is there supposed to be a wav file left over in the TextToWave folder, or is it deleted when the MP3 file is made?
    I love what this can do, as I am trying to publish content for visually impaired readers.
    HELP please

    By Anonymous Anonymous, at 12/29/2005 2:11 p.m.  

  • Hi Tim - I actually can't remember - I'll bet it deletes the wav when it finishes. Take a look at the script and see if you can figure it out from there.

    By Blogger Jonathan, at 12/29/2005 10:12 p.m.  

  • Jonathan, I have to commend you on your fantastic project. I have just tried it out at a favorite hardware site. This is such a handy breakthrough that I have almost talked myself into buying one of those nice AT&T voices just to use with this. One thing I am having trouble with, though: could you give me an explanation of how to skip past the headers? I know you can, but I am having trouble. Once again, thank you for your innovation.

    By Blogger mrprovo, at 2/24/2006 2:36 a.m.  

  • Hi presidentbryce - open audiolicious.rb with Notepad and go to where it says "gsub" a lot. Then add the following line:

    text = text.gsub(/.*superduper.*/, " ")

    This will replace every line that contains the word "superduper" with a single space. Be sure not to put any punctuation in there, as that may mess things up. This specification is called a "regular expression". If you stick with letters you'll be OK.

    Keep adding lines like the above as necessary.

    By Blogger Jonathan, at 2/24/2006 11:19 p.m.  
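
The filter line from the comment above can be wrapped in a small helper. This is a sketch, not part of audiolicious.rb: the name drop_lines_containing and the sample page text are made up, and Regexp.escape is an addition so that punctuation in the phrase is matched literally instead of being read as regular-expression syntax.

```ruby
# Hypothetical helper, not part of audiolicious.rb.
# Replaces every line containing `phrase` with a single space,
# mirroring text.gsub(/.*superduper.*/, " ") from the comment above.
def drop_lines_containing(text, phrase)
  # Regexp.escape makes punctuation in the phrase match literally.
  text.gsub(/.*#{Regexp.escape(phrase)}.*/, " ")
end

# Made-up sample page text for illustration.
page = "Site menu superduper links\nThe actual post text.\nPosted by admin (2006)\n"
page = drop_lines_containing(page, "superduper")
page = drop_lines_containing(page, "admin (2006)")
puts page
```

Because "." does not match a newline by default in Ruby, each match stays within one line, so only the header and footer lines are blanked and the post text in between is left alone.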

  • I recommend the Heather voice by the Acapela Group now... I haven't tried the AT&T voices extensively, and we know how great giant companies are, but I like Heather.

    By Anonymous Anonymous, at 2/25/2006 5:43 p.m.  

  • Hi Anon - I hadn't heard of that one - thx for that recommendation.

    By Blogger Jonathan, at 2/25/2006 7:18 p.m.  

  • I wanted to say hello because we live somewhat near one another: I in Vancouver BC, you in Victoria BC, where I have also lived and sometimes visit.

    Then I saw your image of an original EWD !
    I have difficulty finding people who know of Dijkstra. He and I are both Dutch, by the way.

    Well, I am going to try your RSS/TTS software now. It is what I was looking for when I landed on this page.

    I also favour the Acapela voices, in particular Lucy, the BBC news-readster voice.

    I am currently working on a barebones RSS reader myself.
    See laurillard.com/rss/
    Feedback is welcome.

    Let me know when you are in Vancouver.

    Greetings,
    Tristan
    Please reply via: blogger1497953@laurillard.com

    By Anonymous Anonymous, at 10/28/2006 4:38 p.m.  

  • Hi Tristan - great to hear from you - I'll send you an email.

    By Blogger Jonathan, at 10/28/2006 6:30 p.m.  

  • It looks like it works great in English, but I can't use it for Spanish blogs. :(

    Do you know where I can find a Microsoft Spanish voice for Audiolicious?

    Thank you!

    By Anonymous Anonymous, at 12/05/2006 3:47 a.m.  

  • @egocast - Not sure, but you would think Microsoft would offer a voice specially for Spanish (similar to how AT&T offers voices for various languages, although those aren't free, unfortunately).

    By Blogger Jonathan, at 12/05/2006 8:18 p.m.  

  • Hey there,

    This sounds like a good idea but I can't get it to work. I have all the components installed, but the .bat hangs and then closes, and no MP3s are produced. I have the most recent Ruby though. Is that maybe the issue? Thanks

    By Anonymous Anonymous, at 2/02/2007 5:28 p.m.  

  • Hi Anon - it could be - Actually I haven't used this thing for over a year.

    By Blogger Jonathan, at 2/02/2007 7:52 p.m.  

  • Hi, I've been using TTS voices in embedded applications for a few years now. The Heather voice from Acapela is better than any of the AT&T Natural Voices. It also uses far fewer resources (memory and disk space), which is a constraint in our app. I purchased it through NextUp for about $35.

    By Anonymous Anonymous, at 2/23/2007 7:37 a.m.  

  • Anon - thx for the tip - although I don't think the AT&T voices can be dismissed so easily. They sound good to me.

    By Blogger Jonathan, at 2/24/2007 1:50 a.m.  

  • Acapela's Heather's voice sounds the most natural, better than any of ATT's.

    I am trying to settle on the best. Is there anything even better?

    As for software, is TextAloud the best, or is there something better?

    I just want to do batch jobs, i.e. convert many thousands of text files to MP3 files. In such applications, GUI (which TextAloud uses) is not as convenient as command line. I don't believe TextAloud has a command line mode for easy, batch processing/conversion of thousands of text files.

    My belief is that the quality of the voice is what matters most, and that the various programs differ mostly in their features and ergonomics.

    By Anonymous Anonymous, at 5/21/2007 11:01 a.m.  

  • This seems to be a very nice piece of software. I will give it a try.

    By Blogger Junior D, at 12/08/2010 12:18 p.m.  
