Open source is gaining traction in the localization industry. Since the spring of 2007, Forum Open Language Tools (FOLT) has been driving the Translation Memory Open Source System (TMOSS) initiative.
In April 2008, Frank Bergmann, founder of ]project-open[, released the developer version (V0.1) of an open source tool called TinyTM. In July 2008, Welocalize announced the GlobalSight Open Source initiative.
The potential attraction of open source technology for the translation industry is that the source code is not owned by any language service provider (LSP): it is free to use and to distribute, which democratizes the benefits of translation memory (TM). It also reduces the difficulty and cost of customizing and integrating third-party software.
We shall focus here on TinyTM because it is new and, as its name suggests, small. More importantly, it is driven by a group of people who have a track record of leading open source initiatives. What follows is a high-level overview of TinyTM's key components and a brief assessment of its market potential.
Version V0.1 of TinyTM consists of a TM engine, an ODBC driver, and a client application. The TM engine is written entirely in PL/pgSQL, the procedural language of the PostgreSQL open source database (the counterpart of Oracle's PL/SQL), and runs inside that database. Client applications connect to the TM engine through the ODBC driver and query it by calling its PL/pgSQL functions.
TinyTM comes with a sample Microsoft Word client that includes a Visual Basic macro. The macro allows the Word client to call the functions in the TM engine.
TinyTM uses a variant of "Levenshtein distance" as the main measure for the "% match" value between a source segment and a segment from the TM. According to TinyTM's website, Trigram Fuzzy String Indexing will be included in future versions of the application.
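To make the "% match" idea concrete, here is a minimal sketch in Python of how a match score can be derived from Levenshtein distance. It is not TinyTM's code: TinyTM uses its own variant inside the database, and the function names (levenshtein, match_percent, best_match) and the choice to normalize by the length of the longer segment are assumptions made for illustration only.

```python
def levenshtein(a: str, b: str) -> int:
    """Textbook edit distance: each insertion, deletion or substitution costs 1."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop to save memory
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


def match_percent(source: str, candidate: str) -> float:
    """Assumed normalization: 100% for identical segments, 0% when every character differs."""
    longest = max(len(source), len(candidate))
    if longest == 0:
        return 100.0
    return 100.0 * (1 - levenshtein(source, candidate) / longest)


def best_match(source, memory):
    """Return (score, tm_source, tm_target) for the closest TM entry, or None if the TM is empty.

    `memory` is a hypothetical in-memory TM: a list of (source, target) pairs.
    """
    scored = [(match_percent(source, src), src, tgt) for src, tgt in memory]
    return max(scored, key=lambda entry: entry[0]) if scored else None


if __name__ == "__main__":
    memory = [
        ("Open the file.", "Ouvrez le fichier."),
        ("Close the file.", "Fermez le fichier."),
    ]
    print(best_match("Open the files.", memory))
```

Scanning every TM entry like this is fine for a small memory but becomes slow as the TM grows, which is one reason an index-based technique such as trigram matching is attractive for later versions.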
The architecture is designed for both small and large-scale deployment. In a single-user environment, a translator can install both the client and the TM engine on a single computer. In an enterprise, the TM engine can be made accessible to multiple users on the network. Scaled properly, it can potentially be used as an enterprise or even an Internet application.
Since the TM engine is not directly coupled with any specific client application, it can potentially be integrated with a variety of third-party applications such as workflow engines and content management systems (CMS). Because TinyTM is open source, the engine itself can also be customized to fit such integrations.
But be careful: V0.1 of TinyTM is a developer version and is not intended for end users. It is currently not very easy to install and configure.
In addition, there are no filters, no scoping utility, and no TM import/export utility. However, more features will likely be added and connected to TinyTM in the future, assuming that Frank's team is able to create the ecosystem that is needed for the success of this initiative.
Yan Yu is TAUS' new technology review writer. He recently founded Spartan Consulting, based in San Francisco, and has extensive experience in the IT industry.