58. Crowds Train Computer Translators

Native speakers and software engineers combine forces to make more languages available online.

By Michael Fitzgerald | Wednesday, December 19, 2012

Automatic translators have become powerful tools for communication across dozens of the world's major languages. But accessing one of the other 6,800 tongues spoken on Earth has continued to require costly and time-consuming human translators. Now Microsoft has created a way for communities to develop a computer translator by themselves. 

A pilot project organized by Microsoft focused on the Asian Hmong Daw language. About 30 Hmong-speaking volunteers in Fresno, California, uploaded definitions from a Hmong-English dictionary, alongside copies of documents written in both Hmong and English. Then they wrote sample sentences demonstrating how individual words in their language are used. Statistical techniques automatically assigned probable translations, and over a four-month period the volunteers corrected mistranslations, steadily improving the computer system's pattern recognition. 
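The article does not say which statistical techniques Microsoft used, so the following is only an illustrative sketch of one classic approach, IBM Model 1 word alignment trained with expectation-maximization. It shows how probable word translations can emerge from nothing more than paired sentences. The function names and the tiny German-English corpus are assumptions chosen for illustration; they are not taken from the pilot project.

```python
from collections import defaultdict


def train_word_translations(pairs, iterations=20):
    """Estimate t(target_word | source_word) from sentence pairs (IBM Model 1).

    `pairs` is a list of (source_tokens, target_tokens) tuples.
    This is a textbook sketch, not the method used by any real service.
    """
    target_vocab = {w for _, tgt in pairs for w in tgt}
    # Start with every target word equally likely for every source word.
    t = defaultdict(lambda: 1.0 / len(target_vocab))

    for _ in range(iterations):
        count = defaultdict(float)  # expected (target, source) co-translations
        total = defaultdict(float)  # expected translations per source word

        # E-step: share each target word among the source words in its sentence.
        for src, tgt in pairs:
            for w in tgt:
                norm = sum(t[(w, s)] for s in src)
                for s in src:
                    frac = t[(w, s)] / norm
                    count[(w, s)] += frac
                    total[s] += frac

        # M-step: re-estimate probabilities from the expected counts.
        for (w, s), c in count.items():
            t[(w, s)] = c / total[s]

    return dict(t)


def most_probable_translations(t):
    """Pick the highest-probability target word for each source word."""
    best = {}
    for (w, s), p in t.items():
        if s not in best or p > best[s][1]:
            best[s] = (w, p)
    return {s: w for s, (w, _) in best.items()}


# Hypothetical toy corpus: three sentence pairs, already tokenized.
corpus = [
    (["das", "haus"], ["the", "house"]),
    (["das", "buch"], ["the", "book"]),
    (["ein", "buch"], ["a", "book"]),
]

table = train_word_translations(corpus)
best = most_probable_translations(table)
print(best)
```

On this toy corpus the model pairs "das" with "the", "haus" with "house", "buch" with "book", and "ein" with "a", even though no one told it which word matches which. In a workflow like the one described above, sentence pairs corrected by volunteers could be added to the corpus and the model retrained, which is one plausible way corrections would sharpen the estimates over time.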

In February Microsoft officially added Hmong Daw to its Bing Translator service. More than 50 community and commercial groups are now "training" their own computer translators. One of the newer efforts aims to preserve a vanishing Mayan language.
