Jon Aquino's Mental Garden

Engineering beautiful software jon aquino labs | personal blog

Sunday, April 03, 2005

Podblogging, or using TTS to create a podcast

I recently read an article by Darren Barefoot complaining about the quality of the material on podcasts compared to blogs. And just today I was thinking how nice it would be to have a podcast of daily biographies of great people, and how this sort of material is probably available on the web in written form only.

Well, perhaps text-to-speech (TTS) can help us bridge the gap between the high quality of textual material on the web and the convenience of listening to audio on a portable device. What if we used TTS to convert the webpages we want to track into audio files we can listen to during the commute? Input: text, output: audio. Shall we call it "podblogging"?

Update: I have cobbled together a primitive podblogging system, and the results are fantastic! Basically, whenever I see a webpage that I want to review later, I tag it using My podblogging system then converts it into an MP3 file and downloads it to my PDA. I've uploaded a sample to OurMedia that will hopefully be processed soon so you can listen to the results (the page being read is

Effective immediately, I am unsubscribing from all my podcasts and listening exclusively to podblogs!

Update 2: I have released an open-source Windows program called Audiolicious that converts any RSS feed into a podcast, using text-to-speech. Hopefully this will help to bring podblogging (or "textcasting") to the masses.

This proof-of-concept system that I have assembled is a pretty complicated bit of geekery, but it's still cool to know that it can be done. I'm using a bunch of free programs:

  • lynx -dump to convert from HTML to plain text
  • pyTTS to convert from text to wav
  • Lame to convert from wav to mp3
  • MobSync to move the mp3s onto my Pocket PC

If anyone wants a hint on making their own podblogging system, here are the scripts I use. They're not too polished or portable, I'm afraid, but they might give you some ideas.

----- delicious-audio.bat -----
ruby delicious-audio.rb

----- delicious-audio.rb -----
class Array
  def my_shuffle!
    size.downto(1) { |n| push delete_at(rand(n)) }
    self
  end
end
# Loop over the MP3s left from the previous run. The loop body is missing from
# this listing; it presumably cleared them out (e.g. File.delete(file)).
Dir['c:/Documents and Settings/Jon/My Documents/X30 Storage Card/Delicious/Audio/*.mp3'].each { |file|
}
# Convert five randomly chosen saved pages to MP3.
Dir['c:/Documents and Settings/Jon/My Documents/X30 Storage Card/Delicious/*.html'].my_shuffle![0..4].each { |file|
  file.gsub!("/", "\\")
  system("copy \"#{file}\" delicious-audio-1.html")
  system("lynx -dump -nolist delicious-audio-1.html > delicious-audio-2.html")
  system("txt2mp3.bat delicious-audio-2.html \"#{file.gsub('.html', '.mp3').gsub('\\Delicious\\', '\\Delicious\\Audio\\')}\"")
}
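
For readers without Ruby, the selection step in the script above can be sketched in Python: pick up to five saved pages at random, then work out where each page's MP3 should go. This is only an illustration. The names pick_pages and audio_path_for are made up for this sketch and are not part of the scripts in this post, and the path mapping uses the file stem rather than the string replacement the Ruby script does.

```python
import random
from pathlib import PureWindowsPath


def pick_pages(html_files, count=5, rng=random):
    """Return up to `count` pages chosen at random, without repeats."""
    files = list(html_files)
    return rng.sample(files, min(count, len(files)))


def audio_path_for(html_path):
    """Map ...\\Delicious\\page.html to ...\\Delicious\\Audio\\page.mp3."""
    page = PureWindowsPath(html_path)
    # PureWindowsPath accepts forward slashes and prints backslashes,
    # which covers the file.gsub!("/", "\\") step as well.
    return str(page.parent / "Audio" / (page.stem + ".mp3"))
```

PureWindowsPath only manipulates the path string, so the sketch can be tried on any operating system without touching the disk.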

----- txt2mp3.bat -----
@c:\python23\python %1 %2

----- -----
import pyTTS, re, os, sys
input_file = sys.argv[1]
output_file = sys.argv[2]
tts = pyTTS.Create()
# Read the whole lynx text dump
file = open(input_file, 'r')
text = file.read()
file.close()
# Turn newlines and tabs into spaces, drop bracketed markers such as [1],
# then squeeze runs of spaces
text = re.sub(r"[\n\t]", " ", text)
text = re.sub(r"\[[^\]]+]", " ", text)
text = re.sub(r" +", " ", text)
sentences = text.split(". ")
i = 0
# Set output_file to 0-length [Jon Aquino 2005-04-03]
open(output_file, "w").close()
for sentence in sentences:
    sentence = sentence.strip()+"."
    i = i + 1
    print(str(i)+"/"+str(len(sentences))+": "+sentence)
    tts.SpeakToWave('tts.wav', sentence)
    os.system("lame --quiet -b 32 tts.wav tts.mp3")
    # Append this sentence's MP3 to the end of the output file
    os.system('copy /B "'+output_file+'"+tts.mp3 "'+output_file+'"')
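
The text cleanup in the script above can be tried on its own, without pyTTS or Lame installed, by pulling it into a function. The sketch below uses the same three substitutions and the same split on ". ". The function name split_into_sentences is made up for this example, and it makes two small changes that are assumptions rather than what the script does: it skips empty fragments, and it does not add a second period to a sentence that already ends with one.

```python
import re


def split_into_sentences(text):
    """Clean a lynx text dump and split it into sentences for TTS."""
    text = re.sub(r"[\n\t]", " ", text)      # newlines and tabs become spaces
    text = re.sub(r"\[[^\]]+\]", " ", text)  # drop bracketed markers such as [1]
    text = re.sub(r" +", " ", text)          # squeeze runs of spaces
    sentences = []
    for piece in text.split(". "):
        piece = piece.strip()
        if not piece:
            continue                         # skip empty fragments
        if not piece.endswith("."):
            piece += "."                     # restore the period removed by split
        sentences.append(piece)
    return sentences
```

Feeding the engine one sentence at a time, as the script does, keeps each intermediate wav file small and makes it easy to print progress as the conversion runs.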


  • One of the reasons I think podcast quality is so much worse than blog quality (in general) is that we are all taught in school how to write, but very few people are trained in public speaking. Even fewer are trained in public speaking via radio (recording, how to have a non-annoying radio voice, that kind of thing). It requires every bit as much practice as writing does, but nobody gets that practice. So most podcasts suck, because people don't know how to compose thoughts for audio transmission.

    By Blogger Darius Kazemi, at 4/03/2005 6:04 p.m.  

  • Hi Darius - no doubt (my attempts at podcasting are no exception). Your thoughts are echoed by that guy.

    By Blogger Jonathan, at 4/03/2005 9:49 p.m.  
