FL#171: IBM Watson Speech to Text

IBM Watson Speech to Text is a service that uses machine intelligence to convert the spoken word into written transcriptions. Pietro Passarelli, a Knight-Mozilla Fellow at Vox Media, has integrated this technology into an open-source tool that can turn video interviews into edited stories.
Reporting by Jon Doty

For more information

  • The tool Pietro Passarelli describes is AutoEdit, an open-source project created as part of his Knight-Mozilla fellowship with the Vox Media product team.
  • AutoEdit takes no more than five minutes to transcribe an interview. According to Passarelli, the application breaks an interview into pieces, transcribes the speech in each piece, then reassembles the results into a single transcript.
  • The IBM Watson Speech to Text service provides an API that lets developers add speech transcription to their applications. To transcribe accurately, “the service leverages machine intelligence to combine information about grammar and language structure with knowledge of the composition of the audio signal. The service continuously returns and retroactively updates the transcription as more speech is heard.”
  • The IBM Watson Speech to Text service converts Arabic, English, Spanish, French, Brazilian Portuguese, Japanese and Mandarin speech into text. It supports uncompressed audio files up to 100 MB.
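
The split, transcribe and reassemble workflow described above can be sketched roughly as follows. This is a minimal illustration in Python, not AutoEdit's actual code and not the IBM API: the names `Word`, `plan_chunks`, `reassemble` and `fits_upload_limit` are hypothetical, the chunk length is an arbitrary choice, and the transcription step itself is left out. The only figure taken from the text is the 100 MB file limit, assumed here to mean 100 × 1024 × 1024 bytes.

```python
from dataclasses import dataclass
from typing import List, Tuple

# 100 MB limit cited in the list above (assumed to mean binary megabytes).
MAX_UPLOAD_BYTES = 100 * 1024 * 1024


@dataclass
class Word:
    """One transcribed word with timings in seconds (hypothetical structure)."""
    text: str
    start: float
    end: float


def fits_upload_limit(size_bytes: int) -> bool:
    """Return True if a file of this size is within the assumed upload limit."""
    return size_bytes <= MAX_UPLOAD_BYTES


def plan_chunks(total_seconds: float, chunk_seconds: float) -> List[Tuple[float, float]]:
    """Split a recording of total_seconds into consecutive (start, end) ranges."""
    if chunk_seconds <= 0:
        raise ValueError("chunk_seconds must be positive")
    chunks = []
    start = 0.0
    while start < total_seconds:
        end = min(start + chunk_seconds, total_seconds)
        chunks.append((start, end))
        start = end
    return chunks


def reassemble(chunk_results: List[Tuple[float, List[Word]]]) -> List[Word]:
    """Merge per-chunk transcripts into one timeline.

    Each item pairs a chunk's start offset with the words transcribed from it,
    whose timings are relative to the start of that chunk.
    """
    merged = []
    for offset, words in sorted(chunk_results, key=lambda item: item[0]):
        for word in words:
            merged.append(Word(word.text, word.start + offset, word.end + offset))
    return merged
```

Because each chunk is transcribed on its own, results may come back in any order; sorting by offset before shifting the word timings restores the original sequence of the interview.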

