Although the numbers behind the name do not reflect it, the currently-named “SpeechLess” front end for MaryTTS is now being released as beta software. I was able to assemble a three-man team to create a GUI, and to my way of thinking, it has come along nicely. Although the demo is web-based, these guys have been able to construct it so the entire thing is local. That means little to no latency between hitting Enter and having the text converted to speech.
I’ve talked at length about how TTS in the Linuxsphere is less than user-friendly at just about every turn. Our goal is to create a front end that makes MaryTTS easy to use for everyone. We’re getting there.
The first thing to do is download the jar file, the current release, from the GitHub page. You will see the download cue box on the right. Yeah, I know…the numbering doesn’t reflect a beta release. We’ll fix that. Once you have it downloaded and extracted, change to that directory in your terminal and issue any one of these commands:
java -jar SpeechLess.one-jar.jar
java -jar SpeechLess.one-jar.jar nimbus
java -jar SpeechLess.one-jar.jar metal
java -jar SpeechLess.one-jar.jar motif
java -jar SpeechLess.one-jar.jar gtk
I’m running KDE, and if I run the first or the last of the above commands, the fonts are huge. However, if I execute any of the other commands, it’s fine. Also, you can right-click the jar file once extracted and choose to open it with any of the Java apps currently on your computer. At this time, SpeechLess is built against Oracle Java 8, but it should work with any of the open source Java offerings as well. You can comment below if you’ve encountered any problems.
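For anyone who would rather not retype these, here is a minimal sketch of a shell helper that builds the launch command for you. The function name speechless_cmd is my own invention for illustration and is not part of SpeechLess; the only arguments it accepts are the four listed above, and it assumes the jar sits in the current directory under the file name shown in those commands.

```shell
# Hypothetical helper (not shipped with SpeechLess): prints the launch
# command for an optional look-and-feel argument taken from the list above.
speechless_cmd() {
    theme="${1:-}"
    case "$theme" in
        "")
            # No argument: the plain command, first in the list above.
            printf '%s\n' "java -jar SpeechLess.one-jar.jar" ;;
        nimbus|metal|motif|gtk)
            printf '%s\n' "java -jar SpeechLess.one-jar.jar $theme" ;;
        *)
            # Anything else is rejected rather than passed to the jar.
            printf '%s\n' "unknown look and feel: $theme" >&2
            return 1 ;;
    esac
}
```

The function only prints the command, so you can look it over before running it. Once you are happy with it, something like eval "$(speechless_cmd metal)" from the directory holding the jar would start the app.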
Once executed, the icon for the application is placed in your system tray. Clicking it with either mouse button allows you to “show SpeechLess” or quit. Upon opening the app, just type something in the input field at the bottom and hit Enter. It will “speak” the text you typed in. At this time, if it runs across a word that isn’t recognized or if it’s a potty-mouth word, it will tell you in the upper right-hand corner that the word isn’t recognized. I am suggesting today that if it does not recognize the word, you be presented with an option to open a thesaurus to seek an alternative.
Rijk, if you see this, let’s talk about it.
The quality of the voices is acceptable, especially when measured against the majority of voices already available in Linux TTS applications. My current tool for using TTS is a paid subscription website at www.spokentext.net. The voices are amazingly clear, and while some of the vocal inflections are a bit quirky, overall it’s great. For a yearly C-note, I can’t complain. But two things strike me as not so great.
- It’s proprietary.
- It’s web-based.
With MaryTTS the whole thing is on your machine, so there isn’t any latency to speak of. It’s a great tool, but open source applications could be so much better.
That being said, let’s talk about a growing concern of mine: the mindset that an application or operating system should not necessarily be easy to use, that ease of use isn’t a prerequisite. The developer creates the app to meet his or her needs and calls it done. So what if you have to compile from source? So what if it is a command line tool only with no GUI?
I am still uncomfortable with this opinion. I will illustrate a common thread that runs through the meat of the argument below. This is the type of conversation I’ve had with a number of developers and other people who wanted to “help” in the last month. It does not reflect the opinions or beliefs of current project developers or others who assist.
“Ken, I think this app is ready for prime time.”
“No, not yet. There is still work left to do in order to make it more intuitive for the new user.”
“I disagree. The current state is fine. There doesn’t have to be a GUI for everything. How are they going to learn if they are not challenged to do so?”
“This isn’t about learning. This is about having a tool that is available for everyone to use. Think about giving this to your mom or your aunt. Would you make her jump through hoops to learn the commands to open and use the application?”
“Well, sure. They don’t need to be spoon fed. All they have to do is call me if they have problems.”
First off, you are not going to be around every time she is going to need your help. Secondly, there is no reason for her or anyone to have to learn complex tools to use a simple application. That’s just laziness. Or worse yet…stick around.
Again, this isn’t about learning how to use the command line. This is about offering a tool to those who need it for day-to-day matters. Adding any layer of complexity to this tool’s use is not only counterproductive, in some cases it’s downright mean, passive-aggressive behavior taken straight from the textbooks.
Let me tell you what I think, based on the emails and messages I’ve received.
Aunt Betty can now use her computer on her own, meaning you are no longer the Great and Powerful Oz, the man behind the curtain who is hiding the easy tools from Aunt Betty or anyone else needing your help. You, the person who seems to be almost magical in the way you can fix things on the computer, are afraid of losing your power.
The truth of the matter…Linux isn’t hard at all. I’ve got hundreds of 12- to 15-year-old kids to back that up.
Two of you have emailed me and said just about as much. You were looking to argue the point. If you want to argue your point, do so in the comments below this article. We can hash it out here.
That being said, let me present the first beta release of SpeechLess: A GUI that makes using MaryTTS much easier…a lot easier. And with your help, it will get even easier than it is now. Play with the controls…they are much too geeky at this time. Tell us how to make them better…more intuitive. How can we improve them? How can we remove the “geek” from the tool names? What can we do to make it easier to use, not only for Aunt Betty, but for everyone who needs a text to speech tool that talks nice to them?
Someone is going to directly benefit from your suggestion.
Oh, and a bit of assistance here. Who can create a butler-type graphic character to represent the current application? The name “SpeechLess” is only temporary. We’ll decide on a more permanent name once you show us a great servant for the people.
Ken Starks is the founder of the Helios Project and Reglue, which for 20 years provided refurbished older computers running Linux to disadvantaged school kids, as well as providing digital help for senior citizens, in the Austin, Texas area. He was a columnist for FOSS Force from 2013-2016, and remains part of our family. Follow him on Twitter: @Reglue