Jon Aquino's Mental Garden

Engineering beautiful software

Wednesday, April 06, 2005

Preview: Audiolicious: Turn Any RSS Feed Into A Podcast Using Text-To-Speech

It's late (1:14 AM), and I'm not quite ready to release the binaries until I've done some more testing, but I want to give my technically minded readers a peek at an open-source program called Audiolicious that I will be releasing tomorrow or the day after.

Basically it will let you turn any RSS feed into a podcast. It will download the links mentioned in the RSS feed and generate MP3 files from those webpages, using text-to-speech. Unfortunately it's Windows-only (sorry Linux-heads).
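To make the first step concrete, here is a minimal sketch of pulling titles and links out of a feed with REXML. The inline sample feed, the example.com URLs, and the `feed_items` helper are all made up for illustration. A real del.icio.us feed may use a different RSS flavour with namespaces, which this sketch does not try to handle.

```ruby
require "rexml/document"

# Hypothetical two-item feed, inlined so the sketch needs no network access.
SAMPLE_FEED = <<~XML
  <rss version="2.0">
    <channel>
      <title>Sample feed</title>
      <item>
        <title>First article</title>
        <link>http://example.com/first</link>
      </item>
      <item>
        <title>Second article</title>
        <link>http://example.com/second</link>
      </item>
    </channel>
  </rss>
XML

# Returns [title, link] pairs for every <item> element in the feed XML.
# (feed_items is an illustrative name, not part of Audiolicious.)
def feed_items(xml)
  doc = REXML::Document.new(xml)
  REXML::XPath.match(doc, "//item").map do |item|
    [item.elements["title"].text, item.elements["link"].text]
  end
end

feed_items(SAMPLE_FEED).each { |title, link| puts "#{title} -> #{link}" }
```

Each link found this way is what then gets downloaded and read aloud.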

The source code is below. If anything, it shows the beautiful concision of Ruby -- it accomplishes a lot in a mere 62 lines of code:

$rss_feed = "http://del.icio.us/rss/JonathanAquino/read-review-mobile"
$output_dir = "output"

require 'net/http'
require 'uri'

# The code for #fetch is from http.rb. It follows redirection e.g. if
# you leave off the trailing slash in the URL of a del.icio.us RSS
# feed. [Jon Aquino 2005-04-05]
def fetch( uri_str, limit = 10 )
  raise ArgumentError, 'HTTP redirect too deep' if limit == 0
  response = Net::HTTP.get_response(URI.parse(uri_str))
  case response
  when Net::HTTPSuccess then response
  when Net::HTTPRedirection then fetch(response['location'], limit - 1)
  else
    response.error!
  end
end

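def get(url)
  fetch(url).body
end

# Downloads a page and reduces it to plain text for the speech engine.
def text(url)
  text = get(url).gsub(/[\r\n\t]/, " ")
  # Keep only the body. The pattern allows attributes on the <body> tag.
  if text =~ /<body[^>]*>(.*)<\/body>/i
    text = $1
  end
  # Drop scripts (even ones containing "<"), then tags, then runs of 3+ symbols.
  text.gsub(/<script.*?<\/script>/i, " ").gsub(/<[^>]+>/, " ").gsub(/[^A-Za-z0-9 ][^A-Za-z0-9 ][^A-Za-z0-9 ]+/, " ").gsub(/ +/, " ")
end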

# Speaks the text one sentence at a time and appends each clip to the MP3.
def create_mp3_proper(text, filename)
  sentences = text.split(". ").collect {|sentence| sentence+"."}
  i = 0
  sentences.each { |sentence|
    i += 1
    puts("#{i}/#{sentences.size}: #{sentence}")
    # Write the sentence to the file that TextToWave.vbs is given as input.
    File.open("TextToWave/TextToWave.txt", "w") { |file| file.print(sentence) }
    system("cscript TextToWave\\TextToWave.vbs TextToWave\\TextToWave.txt")
    # Encode the resulting WAV as a 32 kbps MP3.
    system("lame-3.96.1\\lame -b 32 TextToWave\\TextToWave.wav TextToWave\\TextToWave.mp3")
    # Binary-concatenate the new clip onto the end of the output file.
    system("copy /B \"#{filename}\"+TextToWave\\TextToWave.mp3 \"#{filename}\"")
  }
end

def create_mp3(title, url)
  create_mp3_proper(text(url), "#{$output_dir}\\#{title}.mp3")
end

def clean_title(title)
  title.gsub(/[^A-Za-z0-9]/, " ").gsub(/ +/, " ")[0..40]
end

File.open("audiolicious.txt", "w") {} if not FileTest::exist?("audiolicious.txt")
old_urls = File.readlines("audiolicious.txt").each {|line|line.strip!}

require "rexml/document"
include REXML
doc = Document.new(get($rss_feed))
XPath.each(doc, "//item") { |element|
  begin
    url = element.elements["link"].text
    if old_urls.include?(url) then
      puts "Skipping #{url}"
      next
    end
    puts url
    create_mp3(clean_title(element.elements["title"].text), url)
    File.open("audiolicious.txt", "a") { |file| file.puts(url) }
  rescue => e
    puts "Exception: #{e.class}: #{e.message}\n\t#{e.backtrace.join("\n\t")}"
  end
}
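
If you want to see what the HTML cleanup does without downloading anything, here is a small sketch that runs the same sequence of substitutions as #text on an inline string. The `strip_markup` name and the sample HTML are hypothetical. The sketch uses `<body[^>]*>` so that a body tag with attributes still matches, a non-greedy script pattern, and a trailing `strip`; those are choices made for the illustration and may not be exactly what ships in the release.

```ruby
# Hypothetical helper: the cleanup steps from #text, minus the download.
def strip_markup(html)
  # Flatten line breaks and tabs so the patterns below can span lines.
  flat = html.gsub(/[\r\n\t]/, " ")
  # Keep only what is inside <body>, if there is one.
  flat = $1 if flat =~ /<body[^>]*>(.*)<\/body>/i
  flat.gsub(/<script.*?<\/script>/i, " ")   # drop inline scripts
      .gsub(/<[^>]+>/, " ")                 # drop the remaining tags
      .gsub(/[^A-Za-z0-9 ]{3,}/, " ")       # drop runs of 3+ symbols
      .gsub(/ +/, " ")                      # collapse repeated spaces
      .strip
end

sample = "<html><head><title>T</title></head>\n" \
         "<body class=\"post\"><h1>Hello</h1>\n" \
         "<script>var a = 1 < 2;</script>" \
         "<p>First sentence. Second one.</p> *** </body></html>"

puts strip_markup(sample)
# prints: Hello First sentence. Second one.
```

The page title, the script, and the row of asterisks are all gone, which is roughly what you want before handing the text to a speech engine.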
