A Leap in Translation Technology
Much Internet content is written in languages other than English, which can pose a problem for search engines when a user chooses to receive results from any country. The problem stems from the difficulty of translating a foreign language into English, or vice versa.
Anyone who has used online translation software to work out what a foreign phrase says will quickly realise that its accuracy is nowhere near what it might be. The difficulty lies in teaching software to understand grammar, semantics, tenses, verb use and so on in both languages, and then to translate between the two.
However, the technology may have been taken a step further by researchers at the Massachusetts Institute of Technology (MIT). They have designed a language mapping system that was able to decipher an ancient Semitic language, Ugaritic, in a couple of hours. They believe their research could have great application in language translation systems.
The basis for the system’s operation is described in yesterday’s press release by the MIT news office, which sets out the assumptions the system works on: ‘The first is that the language being deciphered is closely related to some other language: In the case of Ugaritic, the researchers chose Hebrew. The next is that there’s a systematic way to map the alphabet of one language on to the alphabet of the other, and that correlated symbols will occur with similar frequencies in the two languages.’
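The frequency assumption in that passage can be illustrated with a very rough sketch. This is not the MIT system itself; it is a naive rank-matching toy, and the function names and the made-up "lost script" text are assumptions chosen purely for illustration. It pairs the most frequent symbol in the unknown text with the most frequent letter in the known text, the second with the second, and so on.

```python
from collections import Counter


def rank_by_frequency(text):
    """Return the letters of `text`, most frequent first (ties broken alphabetically)."""
    counts = Counter(ch for ch in text if ch.isalpha())
    return [ch for ch, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))]


def guess_alphabet_mapping(unknown_text, known_text):
    """Pair the i-th most frequent unknown symbol with the i-th most frequent known letter."""
    return dict(zip(rank_by_frequency(unknown_text), rank_by_frequency(known_text)))


# Toy data (hypothetical): the "lost" script is just English with letters swapped.
known = "banana bandana"
unknown = "xqzqzq xqzyqzq"

mapping = guess_alphabet_mapping(unknown, known)
# Symbols without a mapping (here, the space) are passed through unchanged.
decoded = "".join(mapping.get(ch, ch) for ch in unknown)
```

On real text the two languages would not share identical frequencies, so a simple rank match like this would make many mistakes; it only shows why similar symbol frequencies are a useful starting clue.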
‘The system makes a similar assumption at the level of the word: The languages should have at least some cognates, or words with shared roots, like main and mano in French and Spanish, or homme and hombre. And finally, the system assumes a similar mapping for parts of words. A word like “overloading,” for instance, has both a prefix — “over” — and a suffix — “ing.” The system would anticipate that other words in the language will feature the prefix “over” or the suffix “ing” or both, and that a cognate of “overloading” in another language — say, “surchargeant” in French — would have a similar three-part structure.’
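The three-part structure described in the quote can be sketched in a few lines. This is only an illustration under stated assumptions: the affix lists below are hand-written and hypothetical, whereas the system described in the press release is said to anticipate such prefixes and suffixes itself. The helper name `split_affixes` is likewise invented for this example.

```python
def split_affixes(word, prefixes, suffixes):
    """Split `word` into (prefix, stem, suffix) using the longest matching affixes."""
    prefix = max((p for p in prefixes if word.startswith(p)), key=len, default="")
    rest = word[len(prefix):]
    # Require the suffix to be shorter than what remains, so the stem is never empty.
    suffix = max(
        (s for s in suffixes if rest.endswith(s) and len(s) < len(rest)),
        key=len,
        default="",
    )
    stem = rest[: len(rest) - len(suffix)]
    return prefix, stem, suffix


# Hypothetical, hand-picked affix lists for the two example words in the quote.
english = split_affixes("overloading", {"over", "un"}, {"ing", "ed"})
french = split_affixes("surchargeant", {"sur"}, {"ant"})

# Both words have the same shape: a non-empty prefix, stem and suffix.
same_shape = all(english) and all(french)
```

Here "overloading" splits into "over", "load" and "ing", and "surchargeant" into "sur", "charge" and "ant", which is the kind of parallel structure the quoted passage says a cognate pair would be expected to show.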
The system could go a long way towards making translation software, such as Google Translate, more accurate. Its application in business, and as a catalyst for other kinds of language technology, may also prove valuable.