You can sign up for a free account and get up to 50 hours of speech-to-text transcription without having to enter a credit card, so I decided to take advantage of it and see whether a 1-minute audio recording I did for a product review could be turned into a text document that I could easily cut and paste into WordPress.
After you activate your account and want to start with your first audio file, just sign into VoiceBase and click the green Upload button in the top right.
You will notice that they list all of the supported file formats, which include: *.mp3, *.mp4, *.flv, *.wmv, *.avi, *.mpeg, *.aac, *.aiff, *.au, *.ogg, *.3gp, *.flac, *.ra, *.m4a, *.wma, *.m4v, *.caf, *.cf, *.mov, *.mpg, *.webm, *.wav, *.asf, *.amr
The maximum upload file size on the free plan is 4000 MB.
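If you want to check a recording against these limits before uploading, a small script can do it. This is only a minimal sketch in Python: the function name can_upload is my own, the extension list is copied from the formats listed above, and I am assuming 1 MB means 1024 x 1024 bytes, which VoiceBase may count differently.

```python
from pathlib import Path

# Extensions copied from the supported formats listed on the upload page.
SUPPORTED_EXTENSIONS = {
    "mp3", "mp4", "flv", "wmv", "avi", "mpeg", "aac", "aiff", "au", "ogg",
    "3gp", "flac", "ra", "m4a", "wma", "m4v", "caf", "cf", "mov", "mpg",
    "webm", "wav", "asf", "amr",
}

# Free plan limit from the upload page. Assumption: 1 MB = 1024 * 1024 bytes.
MAX_UPLOAD_BYTES = 4000 * 1024 * 1024


def can_upload(filename: str, size_bytes: int) -> bool:
    """Return True if the file type is supported and the size is within the limit."""
    extension = Path(filename).suffix.lower().lstrip(".")
    return extension in SUPPORTED_EXTENSIONS and size_bytes <= MAX_UPLOAD_BYTES
```

For a real file you could pass Path(filename).stat().st_size as the size.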
For this test I used an m4a file recorded with the Voice Recorder app on an iPhone 6S with a PowerDeWise lapel microphone.
The interesting thing is that you can choose Machine Transcription or Human Transcription. Whichever option you choose, you get a statement on how long the transcription will take. According to VoiceBase, machine transcriptions are completed within 24 hours, often within 2 hours, while human transcriptions typically take 3 days. So if you need something quickly, keep in mind the potential delay of 2 to 24 hours with machine transcription. I imagine the human transcription would have greater accuracy, but we shall see what the machine transcription brings me.
In my case, my very short audio clip was transcribed and sent to me via email in less than 10 minutes. It included keywords that it found in the recording, which may yield some SEO target words for you; in this case it was a phone belt clip I was reviewing.
When you click on the link, you get to see your transcription and compare the audio to the text. In this case I found the text to be about 85% accurate. I found it a little dismaying that it can’t intelligently recognize iPhone as a single word and instead writes “i Phone” as two separate words in several places. Overall, though, the text was mostly accurate word for word, although Otterbox became “auto box” when I was comparing the phone cases.
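Mistakes like these are predictable enough that you could clean them up automatically before pasting into WordPress. Here is a rough sketch of that idea in Python. The correction table and the function name fix_brand_names are my own and only cover the two errors I saw; this is a post-processing step on the exported text, not a VoiceBase feature.

```python
import re

# Misheard phrase (as a regex) -> intended brand name.
# Only the two errors from my transcript; extend as needed.
CORRECTIONS = {
    r"\bi\s+phone\b": "iPhone",
    r"\bauto\s+box\b": "Otterbox",
}


def fix_brand_names(text: str) -> str:
    """Replace known mis-transcriptions with the intended brand names."""
    for pattern, replacement in CORRECTIONS.items():
        text = re.sub(pattern, replacement, text, flags=re.IGNORECASE)
    return text
```

Keep in mind that a blanket replacement like this would also change a legitimate “auto box” if you ever said one, so it is worth skimming the result.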
My last paragraph was even more troublesome and I will cut and paste it for you here:
what’s on your clip it does a good job. It only does hold the phone in vertical mode though I do sometimes prefer to hold my phone in a horizontal modem a belt clip so that’s something toconsider but overall I would probably get about three and a half stars. This is the Dr mode iphone six protective case that also includes a belt clip and this is the review of it for Dragon by going tocome.
It has a tendency to merge words and it really has a problem with brand names and such, though it could be my New York accent that gives the transcription some trouble. Where I said “I would give it about 3 and a half stars” it transcribed “I would probably get about three and a half stars”. Note that I talk pretty quickly, so when comparing the transcription here to Dragon NaturallySpeaking it is only slightly worse, though with the Dragon software I can do voice training to increase accuracy. VoiceBase does let you define custom vocabulary, which I didn’t do for this quick test.
You also have the ability to turn on a swear word filter if you don’t want a transcription that includes curse words.
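If you want to put a number on a mismatch like the star rating sentence above rather than eyeballing it, you can compare what you said with what came back, word by word. This is a minimal sketch using Python’s standard difflib module. The function name word_accuracy and the scoring (matched words divided by words spoken) are my own simplification, not how VoiceBase or Dragon measure accuracy.

```python
from difflib import SequenceMatcher


def word_accuracy(spoken: str, transcribed: str) -> float:
    """Fraction of spoken words that show up, in order, in the transcript."""
    spoken_words = spoken.lower().split()
    transcribed_words = transcribed.lower().split()
    if not spoken_words:
        return 0.0
    matcher = SequenceMatcher(None, spoken_words, transcribed_words)
    # Sum the lengths of all runs of words that match in both versions.
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(spoken_words)


spoken = "I would give it about three and a half stars"
transcribed = "I would probably get about three and a half stars"
print(f"{word_accuracy(spoken, transcribed):.0%}")
```

Here I wrote the number 3 out as “three” in the spoken version so that a difference in formatting does not count as a transcription error.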
Overall, it probably saved me a few minutes on several hundred words of written text. I still had to do some editing, but if I have a lot of time stuck in a car where I can record my voice to speak notes or audio blog posts and convert them to text later, this appears to be faster than transcribing them manually. I can type at 105 words per minute, but I have more productive things to do with my time.