June 2007

June Editorial

Lost in translation? If you’ve ever tried to decipher a sentence produced by Altavista-Yahoo’s Babelfish, you know that there’s a real need for adequate translation software. As it stands, machine-generated translations are practically worthless. That is, unless you use “babelizing” as a way of bringing some jokes to a boring party. It’s definitely a lot of fun to visit

The essential problem is that translation software has thus far been “rule-based.” This means that a software database is fed the words and rules of a language, and can recognize only the rule or the word, but not the context in which it is used. Rule-based translators, in essence, are too stupid to see the difference between a river bank and a bank for depositing your money. The one reliable result of Babelfish is the production of gibberish. For the same reason, I refuse to use spellcheckers that rely on a similar system.

There is, however, a light at the end of this translation tunnel. The next generation of translation software will build upon a different paradigm: statistics-based translation. This new software is called Euromatrix, and is being developed at Saarland University. Using the millions of multilingual documents that have been produced by large international institutions, it statistically evaluates how to translate words. If the word “bank,” for example, appears close to the word “water,” it will be translated as “river bank,” and not as anything to do with the latest bank crash. The essential idea behind the program: If a word has been translated the same way 1000 times in a row, that translation is unlikely to be the wrong choice the 1001st time.

In order to use statistics on word usage, the programs need to begin with a large base of “parallel texts,” that is, the same documents translated into many languages. International bureaucratic institutions like the UN or the EU supply exactly this kind of material.
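For the curious, the statistical idea can be sketched in a few lines of Python. This is only a toy illustration and not how Euromatrix itself works: the five example sentences, the German renderings “Ufer” (river bank) and “Bank” (financial institution), and the function names `train` and `choose_translation` are all made up for this sketch. The program counts how often each neighbouring word appeared alongside each translation in the “parallel texts,” then lets the neighbours of a new sentence vote.

```python
from collections import Counter, defaultdict

# Hypothetical toy "parallel corpus": English sentences containing "bank",
# each paired with the German word a human translator chose for it.
ALIGNED_EXAMPLES = [
    ("the boat drifted toward the bank of the river", "Ufer"),
    ("we sat on the bank and watched the water", "Ufer"),
    ("the bank raised its interest rates", "Bank"),
    ("she deposited her money at the bank", "Bank"),
    ("the bank approved the loan", "Bank"),
]


def train(examples, target="bank"):
    """Count how often each translation occurs, overall and per context word."""
    overall = Counter()
    by_context = defaultdict(Counter)
    for sentence, translation in examples:
        overall[translation] += 1
        for word in set(sentence.lower().split()):
            if word != target:
                by_context[word][translation] += 1
    return overall, by_context


def choose_translation(sentence, overall, by_context, target="bank"):
    """Pick the translation whose past contexts best match this sentence."""
    scores = Counter()
    for word in set(sentence.lower().split()):
        if word == target:
            continue
        for translation, count in by_context.get(word, {}).items():
            # Share of this translation's examples that contained the word,
            # so that common words like "the" favour no translation.
            scores[translation] += count / overall[translation]
    # Ties (including "no known context words") fall back to overall frequency.
    return max(overall, key=lambda t: (scores[t], overall[t]))


overall, by_context = train(ALIGNED_EXAMPLES)
print(choose_translation("fish swim in the water near the bank", overall, by_context))  # Ufer
print(choose_translation("the bank lost money in the crash", overall, by_context))      # Bank
```

With only five sentences the votes are crude, which is exactly why such systems need millions of documents: the more parallel text there is, the more reliable the counts become.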

The Euromatrix project coordinator expects that this new kind of software will be able to translate about 60 to 80% of a text. The remaining 20 to 40%, along with the final editing, will still be the job of human beings. Hopefully the Babelfish will then get a long-deserved vacation, and remain in use only as a source of fun.

Yours sincerely (babelized: Our greetings of the warm one),

Angela Wilson, Publisher