Monday, December 14th, 2009

Text to Speech via HTML5 Audio

Category: Accessibility


Weston Ruter has created a nice mashup that marries the HTML5 Audio support in modern browsers with the new text-to-speech web service that Google Translate uses for its own pages:

Recently Google Translate announced the ability to hear translations into English spoken via text-to-speech (TTS). Looking at the Firebug Net panel for where this TTS data was coming from, I saw that the speech audio is in MP3 format and is queried via a simple HTTP GET (REST) request: Google Translate notes that the speech is only available for short translations to English, and it turns out that the TTS web service is restricting the text to 100 characters. Another restriction is that the service returns 404 Not Found if the request includes a Referer header (presumably one that is not for

In spite of the limitations of the web service, which certainly reflect the intention that it be used only by Google Translate, thanks to HTML5's new Audio element and rel="noreferrer", the service may be utilized by client-side web applications.
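The 100-character limit mentioned above means longer text has to be cut into pieces before each piece is requested. Here is a rough sketch of one way to do that in TypeScript, breaking at word boundaries. The function name `splitForTts` and its `maxLen` parameter are made up for illustration and are not taken from Weston's code; the actual request URL is deliberately left out.

```typescript
// Hypothetical helper: split text into chunks of at most maxLen characters,
// breaking at whitespace where possible. The default of 100 mirrors the
// limit reported for the TTS service; nothing else about the service is assumed.
function splitForTts(text: string, maxLen: number = 100): string[] {
  const chunks: string[] = [];
  let current = "";

  for (const word of text.trim().split(/\s+/)) {
    if (word === "") continue; // empty input yields one empty token

    // A single word longer than the limit has to be hard-split.
    let rest = word;
    while (rest.length > maxLen) {
      if (current !== "") {
        chunks.push(current);
        current = "";
      }
      chunks.push(rest.slice(0, maxLen));
      rest = rest.slice(maxLen);
    }

    // Append the word to the current chunk if it still fits,
    // otherwise close the chunk and start a new one.
    const candidate = current === "" ? rest : current + " " + rest;
    if (candidate.length <= maxLen) {
      current = candidate;
    } else {
      chunks.push(current);
      current = rest;
    }
  }

  if (current !== "") chunks.push(current);
  return chunks;
}
```

Each returned chunk could then be sent as its own request and the resulting clips played one after another. Note that runs of whitespace are collapsed to single spaces, which should not matter for speech.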

Others have been playing with text to speech too, such as this French Ubiquity command. I hope that Google makes it a public API of Translate!

Posted by Dion Almaer at 6:24 am

Would love to see text-to-speech get official support, maybe as a web service from Google or elsewhere in the short term, but as an extension to the audio tag in the long term. I could imagine voices to be downloadable and degradable, as with @font-face/font-family.

<audio voice="swedish-chef; uk-english; english">Today we make the chocolate mousse!</audio>

Yes, it could lead to all manner of gratuitous and annoying usage, but so can canvas and just about anything else if you set your mind to it. There are some very legitimate uses too, so bring it :)

Comment by Michael Mahemoff — December 14, 2009

I work at a company that’s developing a TTS web service. We have about 50 voices in 5 languages. You can check it out here:

Comment by clafayette — December 14, 2009

When I was in college, we’d often order pizza with a text-to-speech computer voice. Good times.

Comment by Nosredna — December 14, 2009

