Quadmore Software Services
 
 
 



Open Source text-to-speech in Java SWING, with JAXB and Java Web Services, along with the Mckoi SQL database
BabyTalk Web version 1.6 home page
Code last modified: September 29, 2005
Page last modified: September 29, 2005

BabyTalk Web is a Java desktop text-to-speech application implementing:

1) the FreeTTS engine version 1.2.1
2) Java Web Services 1.6 and JAXB
3) The Mckoi Open Source SQL/Java database

BabyTalk Web was conceived / coded / copyright 2003, 2004, 2005 by Bert Szoghy webmaster@quadmore.com

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License version 2 as published by the Free Software Foundation; This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details, at: http://www.gnu.org/licenses/licenses.html You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA

BabyTalk Web will display a treeview of public domain text files that can be downloaded from the Internet. You can click a "tree leaf" to download a text selection and have it read out automatically. While a file is read, the sentence currently being read can be optionally displayed "close captioned" style in a pop-up window. BabyTalk Web will also allow you to skip forward or back in the text and will remember where you left off when you resume after pausing. You can also open a local text file to be read, or optionally resume reading a previous text at the position you left off, using a local database also provided with BabyTalk Web. Finally, you can optionally choose to automatically resume reading the last text read when the application is re-started.

September 29, 2005: version 1.6 release.


Click here to download BabyTalk Web 1.6 full source code with ANT build file (file size about 9 megabytes).



NEWS

(September 2005) BabyTalk Web version 1.6 released!

BabyTalk Web with SAPI support on Windows XP

NEW THIS VERSION:

1) Switched database from SimpleText to the Mckoi Open Source database, which allows for transactions and is friendlier to J2EE applications;



2) Enabled the JAXB Validator check (strictly for good form, see notes);

3) Major code cleanup;

4) Added the Quadmore Java to SAPI text-to-speech for Windows in a Java package with UNICODE support along with some OS checking: you can switch voices while the text is being read (change from the default Java voice to various SAPI voices back and forth);

5) Reduced the tree view font size for the Windows OS;

6) The Mckoi database comes with its own Java SWING query tool. To launch it, see the provided file "test.bat", which explains how;



7) Fixed an issue where jumping back to the beginning would skip the first sentence of the document;

8) Pausing the reading of the text is now spelled out in the application's status bar in real time.

9) Can now add a new Internet library, meaning you can fetch an XML file from a different web server and automatically add its available documents to the tree view on the fly.

10) Using the free Microsoft SAPI on Windows, BabyTalk Web can now read texts in languages other than English, including French, German, Chinese and Japanese at no cost!

11) The checkbox selections are now saved in the database and will be remembered after restart;

12) Additional libraries will be remembered after restart;

13) If additional library files are deleted, the database is automatically cleaned up.

Please see the new sections "Building additional libraries for BabyTalk Web version 1.6" and "Using Microsoft SAPI to read a text in a language other than English" below!



(September 13, 2005) BabyTalk Web version 1.5 released!

BabyTalk Web

Click here to download BabyTalk Web 1.5 full source code with ANT build file (file size about 8 megabytes).

Click here to download BabyTalk Web 1.5 binary ready-to-run application with no source code (file size about 8 megabytes).



NEW THIS VERSION:

1) Added Apache ANT support (using version 1.6.5);

2) Upgraded Sun Java Web Services version 1.3 to version 1.6;

3) Upgraded Sun Java Runtime support from version 1.4.2 to 1.5;

4) Upgraded to FreeTTS 1.2.1: the freetts.jar was recompiled to add the line:

    com.sun.speech.freetts.en.us.cmu_us_kal.KevinVoiceDirectory

    in the file:

    \test\com\sun\speech\freetts\internal_voices.txt

5) SpeechClass.java file was overhauled for FreeTTS 1.2.

6) Resolved new Java 1.5 file I/O differences.


(December 2004) BabyTalk Web version 1.0 tests out A-OK using the new Sun Java 5 runtime on Linux Redhat 9. Noticeable load speed improvements as well as a great new color scheme.



SUPPORTED PLATFORMS

BabyTalkWeb uses JAXB so it is limited to operating systems that support the Java Web Services, specifically: Solaris 9, Solaris 10, Windows 2003 Server Professional Edition, Windows XP Professional Edition, Windows 2000 Professional Edition, RedHat Linux 9.0, RedHat Linux AS 3.0

That said, Java Web Services will also work well on other versions of Windows but this is unsupported by Sun.



REQUIREMENTS / INSTALLATION

1) A sound card on your computer along with speakers turned on.

2) Sun Java Runtime version 1.5 or greater needs to be installed on your machine for BabyTalk Web version 1.5: http://java.sun.com/j2se/1.5.0/download.jsp.

3) If you choose the source code download, you will need Apache ANT to compile and run the application: http://ant.apache.org

4) A decent computer with at least 64 megs of unused RAM. On a Windows 2000 Server 1.4 gigahertz box with 128 megs of RAM it takes about fifteen seconds to load and the audio signature following the splash screen will not "clip" -- if no other programs are running. Any equivalent or better hardware will do.

5) Extract the zip file, preserving folder names, to a directory of your choice; I recommend C:\BabyTalkWeb on Windows boxes. An issue has been found on Windows NT4 and XP where spaces in the directory path or name will prevent the synthesizer from initializing, even if short names are used in the command prompt. Therefore, you will probably encounter errors if you unzip to: C:\PROGRAM FILES\BabyTalk Web

6) A monitor display resolution of 800 x 600 or greater.
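
The path restriction in item 5 can be checked before launching. The following is a minimal sketch, not part of the BabyTalk Web source: the class name InstallPathCheck and the method hasSpaces are hypothetical, and the check only looks for space characters in the directory the JVM was started from.

```java
import java.nio.file.Paths;

// Hypothetical helper, not part of the BabyTalk Web source.
final class InstallPathCheck {

    // Returns true when the path contains at least one space character.
    static boolean hasSpaces(String path) {
        return path.indexOf(' ') >= 0;
    }

    public static void main(String[] args) {
        // user.dir is the directory the JVM was started from.
        String dir = Paths.get(System.getProperty("user.dir")).toAbsolutePath().toString();
        if (hasSpaces(dir)) {
            System.out.println("Warning: spaces in path, the synthesizer may fail to initialize: " + dir);
        } else {
            System.out.println("Install path contains no spaces: " + dir);
        }
    }
}
```

Run from the BabyTalkWeb directory, this would print a warning for a location such as C:\PROGRAM FILES\BabyTalk Web and nothing alarming for C:\BabyTalkWeb.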



TO COMPILE AND RUN THE APPLICATION FROM THE SOURCE CODE DOWNLOAD

In a command prompt go to the BabyTalkWeb directory first, then type:
ant run



TO RUN THE APPLICATION FROM THE BINARY DOWNLOAD

In a command prompt go to the BabyTalkWeb directory first, then type (case sensitive on all platforms):
java SplashScreen



KNOWN ISSUES

1) The status of the third checkbox, "Remember position and last read when restarting BabyTalk Web", is checked only once before each sentence is read. This means for the resume to work, the checkbox must be checked before the last sentence is displayed. If you do not want a resume, the checkbox must be unchecked before the last sentence is displayed.

2) The checkbox settings themselves are not remembered after restarting the application, all three checkboxes are checked when the app is launched. This feature has not been implemented yet. Not a bug, but I have found this is confusing when monitoring the status of the third checkbox on application restart.

3) Get the following message in the console: Exception in thread "AWT-EventQueue-0" java.lang.NullPointerException when doing the following 5 steps to reproduce:
a) open the app and add the suggested additional library "publicfrenchtexts.xml";

b) close the app;

c) delete the library file "publicfrenchtexts.xml" from the directory "c:\babytalkweb" (on Windows XP SP1);

d) open the app and wait for the cleanup routine to delete the reference from the LIBRARY table (the concierge thread kicks in 5 seconds after BabyTalk Web finishes loading);

e) After the (successful) message "Redundant file reference removed from database" is displayed in the console, add the suggested additional library "publicfrenchtexts.xml" through the menu a second time: at that point, the exception message will be displayed in the console.

The exception is of absolutely no perceivable consequence, the library is added correctly again, the database insert happens correctly again, a newly-added text can be selected and read from the tree view and everything behaves as expected. It just looks bad and I can't seem to trap the message. Let me know if you see the cause, I suspect it might not even be my code which is the culprit.



ACKNOWLEDGEMENTS

BabyTalk Web implements the FreeTTS text-to-speech engine, required to compile changes to this application. Please note that the FreeTTS team is in no way involved in the BabyTalk project and should not be contacted concerning it for support.

FreeTTS is free from Sun Microsystems, and the project homepage is at:
http://freetts.sourceforge.net/

Also implemented is the Java Web Services Developer Pack 1.6, specifically JAXB. The home page for Java WSDP version 1.6 is:
http://java.sun.com/webservices/downloads/webservicespack.html

The Open Source Mckoi SQL database web page is at: http://www.mckoi.com/database/

The Simpletext flat file database and JDBC driver are by Thought Inc, and the installer is included in this distribution of BabyTalk Web, as per licensing instructions. The Thought, Inc. web page is at: http://www.thoughtinc.com/simpletext.html

The opening sound clip used in BabyTalk Web is an excerpt from the song "Voyages of the Stingray" by the Modernes Pickles. Learn more about the Modernes Pickles at:
http://www.satanbelanger.net

BabyTalk Web was coded using a registered version of TextPad: http://www.textpad.com No other development tool was used.

Special thanks to Non Prophet for helping to get the ball rolling and to Dirk Schnelle at http://jvoicexml.sourceforge.net.

Please check for the latest version of the different BabyTalk projects at: http://www.quadmore.com/babytalkweb/

Bert Szoghy is a Senior Analyst, currently working in Quebec City.



PURPOSE

"BabyTalk Web" is the second phase of a planned three phase project, each phase producing a different application which will be maintained separately.

The first phase ("BabyTalk") is already complete at version 1.5. It lets you pick a local text file, and it will read it out loud to you while showing you the sentence it is reading "close captioned" style. BabyTalk will also allow you to skip forward or back in the text and will remember where you left off when you resume after pausing.

The third phase will implement VoiceXML and the forthcoming CMU Sphinx speech recognition engine version 4, targeting visually impaired users ("BabyTalk Interactive").



FEATURES AT A GLANCE

1) Opening splash screen displays a JPG image, then plays a WAV audio file before launching the main application;

2) Launching the application the first time executes a database bootstrap routine;

3) Main window implements a JTree treeview control which in turn displays JAXB-generated "leaves" from an XSD schema and XML file. Clicking on a leaf downloads a text file over the Internet;

4) Timer listening for the launching of menu items;

5) SWING Threads;

6) A "Resume Reading Previous Text" JDialog window implementing a JList list box control. Clicking on a list box item resumes the reading of a previous text;

7) Check boxes combined with a local flat file database using SQL and JDBC allow saving desired settings and reading history;

8) Standalone desktop text-to-speech engine;

9) Java Web Services 1.6 functionality integrated into a SWING desktop application;

10) Global variables shared between asynchronous threads;

11) Java package and class deployment example;

12) About window;

13) Java I/O routines;

14) Java string replace all routine.



HOW THE TECHNOLOGY IS USED

From the point of view of text-to-speech functionality, "BabyTalk Web" is the same as "BabyTalk", that is, the first few hundred sentences of a text file are parsed into virtual XML and the reading of the file can begin while the rest of the file is slowly being parsed in a separate thread. The TTS engine used is FreeTTS 1.2.1. This design is unchanged.
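
The "parse a little, start reading, parse the rest in the background" idea can be sketched as follows. This is only an illustration under stated assumptions: the class ProgressiveParser and its methods are hypothetical, sentences are kept in a plain synchronized list rather than the "virtual XML" the application builds, and sentences are split on the period character only.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch; BabyTalk's real parser builds "virtual XML", this uses a plain list.
final class ProgressiveParser {
    private final List<String> sentences = Collections.synchronizedList(new ArrayList<>());
    private Thread worker;

    private ProgressiveParser() { }

    // Parses up to headCount sentences immediately, then the rest in a background thread.
    static ProgressiveParser start(String text, int headCount) {
        ProgressiveParser parser = new ProgressiveParser();
        String[] raw = text.split("\\.");
        int head = Math.min(headCount, raw.length);
        for (int i = 0; i < head; i++) {
            parser.add(raw[i]);
        }
        parser.worker = new Thread(() -> {
            for (int i = head; i < raw.length; i++) {
                parser.add(raw[i]);
            }
        });
        parser.worker.setDaemon(true);
        parser.worker.start();
        return parser;
    }

    // Stores a trimmed, non-empty fragment with its period restored.
    private void add(String fragment) {
        String trimmed = fragment.trim();
        if (!trimmed.isEmpty()) {
            sentences.add(trimmed + ".");
        }
    }

    // Number of sentences parsed so far; reading can begin once this is above zero.
    int available() { return sentences.size(); }

    String get(int index) { return sentences.get(index); }

    // Blocks until the background thread has finished parsing.
    void awaitDone() {
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

A reader thread would poll available() and speak get(i) as sentences arrive, which is why playback does not have to wait for a long file to be parsed completely.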

On the SWING side of things, the design is enhanced with three checkboxes and a treeview control. This was still achieved using the GridBagLayout. The complexity of doing the layout was probably the hardest element of the project to accomplish, and I would still like to make the treeview control proportionally wider but have given up. Accomplishing this in TextPad is a pain. In any case, the final result has the "BabyTalk Web" window wider, fitting in a minimum monitor resolution of 800 x 600.

The treeview provides a list of public domain text files, describing author, title and publication date. This list is generated using XML, JAXB and Java Web Services. Clicking on a text file in the treeview will download that text over the Internet and optionally begin to play it once completely downloaded. The XML file publicdomaintext.xml is provided with the application and points to some public domain texts stored on the Quadmore web site. These texts were selected from the Gutenberg Project repository and are in the public domain.

The file selection is slightly different in BabyTalk Web. The purpose of the different phases of BabyTalk is to provide visually impaired persons a completely hands-off interactive experience. The design of phase two is heading toward this goal. For this reason, when a file is opened to be read in BabyTalk, reading will now automatically begin without having to click the PLAY button. The Play button is basically only useful after a pause.

An added complexity with BabyTalk Web is that there are 4 ways you can now open a file to be read:

1) As before, by going menu FILE > Open, which launches a local file chooser window;

2) By clicking on a file in the treeview;

3) By clicking menu FILE > RESUME READING TEXT and selecting a text in the popup dialog list box;

4) By checking the RESUME READING TEXT AT LAST POSITION WHEN RESTARTING BABYTALK WEB checkbox and re-starting the application.

This forced the coding to be streamlined in a way where there would be only one point of entry and one point of exit for all of the above.
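
One way to picture that single point of entry is sketched below. All names (ReaderEntryPoint, Source, open) are hypothetical and not taken from the BabyTalk Web source, and the rule that only the two resume paths start from a saved sentence is an assumption drawn from the feature description.

```java
// Hypothetical sketch of a single entry point; names are illustrative, not from the source.
final class ReaderEntryPoint {

    // The four ways a text can be opened, matching the list above.
    enum Source { FILE_MENU, TREE_VIEW, RESUME_DIALOG, RESTART }

    private String currentText;
    private int currentSentence;

    // Every source funnels through this one method.
    void open(Source source, String location, int savedSentence) {
        currentText = location;
        // Assumption: only the two resume paths start from a saved position.
        boolean resuming = source == Source.RESUME_DIALOG || source == Source.RESTART;
        currentSentence = resuming ? savedSentence : 0;
    }

    String currentText() { return currentText; }

    int currentSentence() { return currentSentence; }
}
```

With this shape, the menu handler, the tree listener, the resume dialog and the startup routine each make one call to open(...), so the reading logic itself exists in only one place.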

JDialog was used because of the control it allows of the title bar and because I wanted to avoid mixing a native window interface (for the pop-ups) with a 100% Java one (for the main window).

In future versions of "BabyTalk Web", I will implement functionality to download additional XML files from different web sites, validate them against the BabyTalk XML schema, then append the books listed in the XML files to the treeview.

I also intend to add as a separate download in the near future a complete suite of server-side scripts which will allow you to set up a database of texts, as well as web pages to insert, update, and delete these texts including a script which will create (the correct term is "concatenate") a valid XML file that will be recognized by "BabyTalk Web". That means you can have your own web site with your own database of texts as a backend for "BabyTalk Web". Set up your database, insert texts into that web database using web pages provided, generate your own "publicdomaintext.xml" and substitute it with the one that comes with the application and "BabyTalk Web" will now download your texts over the Internet. The database used for these scripts will be mySQL and the script language used will be PHP, but later I aim to include the .NET and JSP equivalents. Personally I like mySQL and PHP because they will run on any operating system, they are completely free, and paying to have a web site hosted running PHP and mySQL is the least expensive solution on the market -- and the most secure. The problem with .NET hosting and Java Server Pages hosting is that they will most often require dedicated computers. PHP and mySQL on the other hand are a good solution for shared hosting. If you don't mind if your web site response time is a bit longer than a microsecond and 99.8% uptime will do just as well as 99.999%, then you should look at shared hosting.

The text files that you download over the Internet by clicking on a treeview item are intended to be stored in a database. This means that they will not necessarily have a file name. The URL pointing to them will likely contain a parameter identifying the text in the database. This was a design problem because I wanted to cache the text files locally once downloaded, to allow "BabyTalk Web" to resume reading exactly where left off even after closing and restarting. This meant persistently identifying and storing a combination of filename and sentence number of where last read in the text.

One way to accomplish this could be to modify and maintain an XML file locally, but I do not like this idea on principle: XML is not a database. Once again, XML is a means of transport and a database is a means of storage. On the other hand, I needed only two database tables, with one of these tables containing only one row. That's not a big requirement for a manly relational database (hang on, no need to buy 50 Oracle licenses yet).

Therefore I went about looking for a 100% Java Open Source database that could do SQL and that was as lightweight as possible. I was happy to find Thought, Inc's Simpletext flat file database which fit these requirements with a minimum amount of quirks. A flat-file database is the old way databases used to work, that is, with information stored in multiple text files. This is the way dBase, Foxpro, Clipper and numerous other commercial products from the early 1990's worked. For small databases such as mine, that's just fine. Unfortunately this architecture does not scale well and data integrity is not enforced between files. This is the reason single-file relational databases maintained by "engines" took over this market. For our purposes, the two database tables are unrelated (no foreign keys), and are simple enough to create a bootstrap routine to recreate these two tables from scratch if a glitch happens.

For "BabyTalk Web", the Simpletext database is used to remember (conditionally to a checkbox selection) which text was read last and what is the last position read in that text. That's one database table. The second table is a reference table: the ID column value keeps track of the next filename that will be used to store a text file locally. You could use date/time timestamps to create a unique filename. That's the way you usually do it if you are maintaining stuff in a historical context. Casting dates to strings can be tricky and system dates are differently formatted depending on operating system so I chose an even E-Zer scheme: an incremented number is used instead of dates, so the second table's first column contains only a single number. If that number was 10, for example, the next text file to be stored locally will be called "11.txt".
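
The naming scheme can be sketched in a few lines. The class FileNameCounter below is a hypothetical in-memory stand-in for the single-row reference table; the real application reads and updates the number through SQL and JDBC, which is left out here.

```java
// Hypothetical in-memory stand-in for the single-row reference table.
final class FileNameCounter {
    private int lastId;

    // lastId is the number currently stored in the table's ID column.
    FileNameCounter(int lastId) {
        this.lastId = lastId;
    }

    // Increments the stored number and returns the next local cache filename.
    String nextFileName() {
        lastId++;
        return lastId + ".txt";
    }

    int lastId() { return lastId; }
}
```

Starting from a stored value of 10, the first call to nextFileName() returns "11.txt" and leaves 11 as the value to write back to the table.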

Because unique filenames were required and we don't want to use a potentially very slow SQL query such as "SELECT MAX(ID) FROM (...)", only one value is maintained in a separate table. This is a database technique called a reference table. Some relational database engines (Oracle) will keep track of a maximum value for a numeric primary key of a database table, most don't. Querying only one row is lightning fast. The second column of the second table keeps track of either a zero (meaning "false") or one (meaning "true") which will indicate to BabyTalk Web when opening whether to load the last read text at the last position read. This condition is also controlled by a checkbox. That's all Simpletext is used for, but you will agree its role is very elegant. In real life, you would likely use a relational database locally to go with your fat client, such as Oracle, or Sybase, or mySQL, or (if you didn't know any better), Microsoft Access. Personally I like Sybase SQL Anywhere (now renamed Adaptive Server Anywhere) and mySQL in that order, if that will sway you in any way. Sybase has two database engines, I like the one that used to be called Watcom.

As per the Simpletext Runtime license, THOUGHT Inc. grants a royalty-free right to distribute copies of the JAVA Simpletext classes for use with applications you have developed using THOUGHT Inc. SimpleText (i.e. BabyTalk Web) provided that the complete distribution is made available to clients. The file simpletext.zip accompanying this application therefore contains the complete Simpletext installer, copyright and license notices.



NOTES ON MODIFYING THE CODE AND RECOMPILING

To modify and/or recompile the BabyTalkWeb application code, first install the Java SDK version 1.5 or later as well as Apache ANT, and set the JAVA_HOME and ANT_HOME environment variables.

Next, you need to install the Java Web Services Developer Pack 1.6.

No extra classpath settings are needed either for the Simpletext flat file database and JDBC driver or the Mckoi SQL database (depending on the version).

Tip: On Windows 2000 the limit of the Classpath environment setting field is easily reached on a regular development box. To work around this issue, use ANT!


If you modify the XML schema, you will need to regenerate the .java files containing the JAXB get and set methods handling the XML entities defined in your schema. Please refer to the JAXB documentation.

Tip: I initially tried defining "fiction" and "nonfiction" as enumerated types instead of strings in my first XML schema, but ran into problems when trying to display the data in the JTree. When you stop and think about it, the reason you are using XML is to keep the "data middleman" simple between the fat client and the remote web server, not to hardcode in problems for yourself. So from a purist perspective, keeping everything as a String and casting if need be afterward has its merits. XML is not a database, it's a means of efficient transport. If you have a funky datatype, create a <myDatatype> tag, define it as a String, and put whatever you need between <myDatatype> and </myDatatype> and handle it on the client end.

BabyTalkWeb using the Sun Java 1.4 runtime on Linux Redhat 9



BUILDING ADDITIONAL LIBRARIES FOR BABYTALK WEB VERSION 1.6

This section describes how to go about creating a new downloadable library located remotely on the Internet that the BabyTalk Web application can be made aware of. BabyTalk Web can display more than one library in its tree view at the same time.

The design goal for the final BabyTalk project (which will be called "BabyTalk Interactive") is to allow a visually impaired user to choose between libraries, between fiction and non-fiction, between authors and between texts in a rapid and efficient fashion strictly through speech recognition.

There is a mountain of quality public domain content worthy of being remembered.

For this quick tutorial, let's start with 3 classic French texts borrowed from the Gutenberg Project (http://www.gutenberg.org), which now hosts documents written in many different languages.

To begin with, the three texts are edited to move the Ebook description down to the bottom of the file. This is because the related BabyTalk Web projects aim to let visually impaired people have texts read to them. Avoiding a time-consuming "blurb" that would be read tediously by a text-to-speech engine is a great benefit to these users and not much extra effort.

If the text file contains a table of contents or an index at the beginning of the file, then it is also a very good idea to cut and paste that to the bottom as well to resolve the same annoyance.

Because the Gutenberg Project texts are saved in various ASCII formats, it is a good idea to check if strange typographical characters appear in the file you want to make available in the library. For example, some of the French files I used here had strange underscore "_" characters which were undesirable and which I removed with Textpad's search and replace functionality.
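
The same underscore clean-up could also be scripted rather than done by hand in an editor. A minimal sketch, assuming the text is already loaded into a String; the class TextCleanup and the method stripUnderscores are hypothetical names:

```java
// Hypothetical clean-up helper; the author did this step by hand in TextPad.
final class TextCleanup {

    // Removes every underscore character from the text, leaving everything else intact.
    static String stripUnderscores(String text) {
        return text.replace("_", "");
    }
}
```

String.replace works on literal text, so no regular-expression escaping is needed for this particular character.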

Finally, poetry is great, but currently the sentence separator is the period (".") character, so you will want a text with structured sentences or the text-to-speech engine will go on and on in a monotonous tone.

Next, a new XML plain text file is created, let's call it "publicfrenchtexts.xml". This file will contain the following:

<?xml version="1.0"?>
<Collection>
<books>
<book>
	<title>Un amour vrai</title>
	<authorFirstName>Laure</authorFirstName>
	<authorLastName>Conan</authorLastName>
	<publicationYear>1878</publicationYear>
	<location>http://www.quadmore.com/libraryFR/14537-8.txt</location>
	<bookCategory>fiction</bookCategory>
</book>
<book>
	<title>La Esmeralda</title>
	<authorFirstName>Victor</authorFirstName>
	<authorLastName>Hugo</authorLastName>
	<publicationYear>1880</publicationYear>
	<location>http://www.quadmore.com/libraryFR/13628-8.txt</location>
	<bookCategory>fiction</bookCategory>
</book>
<book>
	<title>Les fleurs du mal</title>
	<authorFirstName>Charles</authorFirstName>
	<authorLastName>Baudelaire</authorLastName>
	<publicationYear>1857</publicationYear>
	<location>http://www.quadmore.com/libraryFR/8flrm10.txt</location>
	<bookCategory>fiction</bookCategory>
</book>
</books>
</Collection>


I admit to being somewhat of a stickler, the Gutenberg Project does not provide the publication year for its texts as a rule, and in order to provide something as useful as possible (and I admit to satisfy my own curiosity), I look up the individual works elsewhere on the Internet to find this data.

You should check if your new XML file is valid (i.e. that you made no typos). To do this on Windows, double-clicking on the XML file in Windows Explorer will launch Internet Explorer which will do this automatically: if it displays in Internet Explorer version 5.1 or higher without an error message, then it's valid XML. BabyTalk Web will check the validity of the XML as well, but will choke and cough if it's bad.
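
On systems without Internet Explorer, a similar typo check can be done with the XML parser that ships with the JDK. The sketch below is an assumption-laden illustration, not the check BabyTalk Web performs: the class XmlCheck is hypothetical, and it only tests that the file is well-formed (matching tags, proper nesting), not that it conforms to the BabyTalk XML schema.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper; checks well-formedness only, not schema validity.
final class XmlCheck {

    // Returns true when the string parses as well-formed XML.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to the console.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            return false;
        }
    }
}
```

A missing closing tag such as a forgotten </books> makes isWellFormed return false, which is the kind of typo the Internet Explorer test catches.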

Finally, place the XML file along with the text files on the Internet. The location on the Internet must of course match what you put in the XML <location> tags for the texts. The library XML file can be placed anywhere else, of course.

In BabyTalk Web (version 1.6), select menu "FILE" > "Add new library".

Type in the URL where your XML file is located. For example, try the working URL:

http://www.quadmore.com/libraryFR/publicfrenchtexts.xml

BabyTalk Web will download the XML file to your local drive, parse it and add the works to the tree view (look in "FICTION" > "C" for "Conan").
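
To illustrate how a library file could end up under "FICTION" > "C", here is a rough sketch that reads the XML and groups the books by category and by the first letter of the author's last name. It is not the application's code: BabyTalk Web uses JAXB-generated classes, whereas this sketch uses the JDK's DOM parser, and the class LibraryIndex, its label format and the exact grouping rule are assumptions.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical sketch using the JDK's DOM parser; BabyTalk Web itself uses JAXB.
final class LibraryIndex {

    // Maps CATEGORY -> author initial -> list of "LastName, FirstName: Title".
    static Map<String, Map<String, List<String>>> index(String xml) {
        Map<String, Map<String, List<String>>> tree = new TreeMap<>();
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList books = doc.getElementsByTagName("book");
            for (int i = 0; i < books.getLength(); i++) {
                Element book = (Element) books.item(i);
                String category = text(book, "bookCategory").toUpperCase();
                String last = text(book, "authorLastName");
                String initial = last.substring(0, 1).toUpperCase();
                String label = last + ", " + text(book, "authorFirstName") + ": " + text(book, "title");
                tree.computeIfAbsent(category, k -> new TreeMap<>())
                    .computeIfAbsent(initial, k -> new ArrayList<>())
                    .add(label);
            }
        } catch (Exception e) {
            // Malformed XML or a missing tag ends up here.
            throw new IllegalArgumentException("Could not read library XML", e);
        }
        return tree;
    }

    // Text content of the first child element with the given tag name.
    private static String text(Element parent, String tag) {
        return parent.getElementsByTagName(tag).item(0).getTextContent().trim();
    }
}
```

Each map level corresponds to one level of tree nodes, so populating a JTree from the result is a matter of walking the nested maps.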

We have a demo AVI screen capture (silent) movie of adding a new Internet library to BabyTalk Web here (25 megabytes).



USING MICROSOFT SAPI TO READ A TEXT IN A LANGUAGE OTHER THAN ENGLISH

To read a text from the additional library described in the previous section in French, or any language other than English, you will be forced to use a Microsoft SAPI voice in the corresponding language on a Windows computer, as the default Java FreeTTS voice does not provide grammar and phonetic dictionaries for anything other than English.

For French, install the French Microsoft Reader and then install the French TTS voice add-on.

For German, install the German Microsoft Reader and then install the German TTS voice add-on.

Concerning the Microsoft Reader installation: please note that "activation" is not a requirement. After installing Microsoft Reader, you can skip the activation and install the TTS voice and that is all you need to do to use that voice in BabyTalk Web.

For Japanese and Simplified Chinese, just download the SAPI language pack.

For other languages, you will probably have to pay for a third-party SAPI voice. Try the Cepstral voices for Italian and Spanish (only the registered "unlocked" voices will work with BabyTalk Web) at:
http://www.cepstral.com/demos/

Finally, when running BabyTalk Web, you will need to select the SAPI voice in the same language as your text from the new SAPI drop down menu. Then simply click on the text in the tree view.
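
Because SAPI voices exist only on Windows, an application has to decide at startup whether to offer them at all. The sketch below shows one plausible form of such an operating system check; the class SapiSupport and the method isWindows are hypothetical, and the actual check in the Quadmore Java to SAPI package may differ.

```java
import java.util.Locale;

// Hypothetical OS check; the real BabyTalk Web check may be implemented differently.
final class SapiSupport {

    // True when the given os.name value looks like a Windows system.
    static boolean isWindows(String osName) {
        return osName != null && osName.toLowerCase(Locale.ROOT).startsWith("windows");
    }

    public static void main(String[] args) {
        boolean sapiPossible = isWindows(System.getProperty("os.name"));
        System.out.println(sapiPossible
                ? "Windows detected: SAPI voices can be offered"
                : "Not Windows: only the default FreeTTS voice is offered");
    }
}
```

On Linux or Solaris the same check would leave only the default FreeTTS voice available.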



OLDER VERSION OF BABYTALK WEB

Binary and source code of version 1.0 using the JDK 1.4, Java Web Services 1.3 and FreeTTS 1.1.2 can be downloaded here.



OUR OTHER SOFTWARE

BabyTalk: Open Source text-to-speech in Java SWING

Johanne's Time Organizer: Open Source time tracking in Powerbuilder

Quadmore Java to Microsoft SAPI bridge for Windows: Sun Java to Microsoft Speech API 5.1