Arabic Chat Translator Transforms Social Media Analysis

New product addresses demand for analysis of online communication lifelines in the Middle East

Cambridge—May 18, 2011—Innovative translation software launched today by Basis Technology will transform the way governments and businesses analyze Arabic social media and online communications, according to Carl Hoffman, CEO of the linguistics software company. Hoffman said the release of Rosette Chat Translator for Arabic is particularly timely given the unprecedented role social media is playing in the ongoing upheavals in the Middle East.

Basis Technology’s Rosette Chat Translator converts text written in the Arabic chat alphabet, which uses English characters and numbers to represent the Arabic language, into standard written Arabic. Hoffman said the software will increase the efficiency and dependability of translations of Arabic, a complex language spoken by more than 450 million people and an official language of 25 countries. The new software product is intended for use by intelligence agencies and commercial enterprises.

“The ability to decipher Arabic chat is essential in today’s environment, in which we rely heavily on email, instant messaging and social networks in our day-to-day communications,” Hoffman said. “Because of its immediate need, we believe Rosette Chat Translator will help shape the emerging field of social media analytics and enhance our understanding of this critical language.”

Romanized Arabic chat, also called Arabizi, is increasingly used for online messages in social media, blogs and chat rooms, as well as cell phone text messaging. Hoffman said Arabic social media are notoriously difficult for text analysis tools to decipher because the Arabic chat alphabet has many variations. Words are spelled informally and phonetically, and the spelling varies with the author’s Arabic dialect. For example, the phrase “Tell them” may be written “2ulluhom” by an Egyptian, “2illun” by a Lebanese, or “Gullhom” in the Gulf dialect.

“When words can be spelled in different ways based on such factors as an individual’s background, analysts need to be sure the words are translated with high accuracy,” Hoffman said, “with little room for error.”

“Rosette Chat Translator converts all word variations to the correct modern Arabic word, minimizing information loss and ensuring consistency across translations,” he said. The new translator uses a linguistic algorithm which looks at the frequency of the structural components of each word together with a statistical model trained on the input of millions of Internet users from all over the Arabic-speaking world, he said.
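As a rough illustration only, and not a description of the Rosette implementation, the idea of mapping several chat spellings to one standard form can be sketched as a frequency lookup. The three spellings come from the example above; the Arabic rendering of “Tell them,” the repeat counts, and the function names `train` and `translate` are assumptions made for this sketch.

```python
from collections import Counter, defaultdict

# Assumed standard Arabic rendering of "Tell them"; not taken from the release.
TELL_THEM = "قل لهم"

# Toy training pairs (chat spelling, standard Arabic). The three spellings come
# from the release; the repeat counts are invented for illustration.
TRAINING_PAIRS = [
    ("2ulluhom", TELL_THEM),
    ("2ulluhom", TELL_THEM),
    ("2illun", TELL_THEM),
    ("Gullhom", TELL_THEM),
]


def train(pairs):
    """Count how often each chat spelling was paired with each Arabic form."""
    model = defaultdict(Counter)
    for chat, arabic in pairs:
        model[chat.lower()][arabic] += 1
    return model


def translate(model, word):
    """Return the most frequent Arabic form for a chat spelling, or None if unseen."""
    candidates = model.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


model = train(TRAINING_PAIRS)
for spelling in ("2ulluhom", "2illun", "Gullhom"):
    print(spelling, "->", translate(model, spelling))
```

A whole-word lookup like this cannot handle spellings it has never seen, which is presumably why a production system would also analyze the components inside each word, as described above.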

“The product may be integrated into existing text mining, media monitoring, or search-based applications,” Hoffman said. “We believe this product has important implications for government intelligence gathering and web search tools.”

For more details or to request a free product evaluation, visit /arabic-chat.

About Basis Technology

Basis Technology develops innovative products and solutions incorporating multilingual text analytics and digital forensics. Our Rosette® linguistics platform provides morphological analysis, entity extraction, name matching, name translation, and Arabic chat translation, yielding useful information from unstructured data in such fields as information retrieval, government intelligence, e‑discovery, and financial compliance. Our digital forensics team pioneers better, faster, and cheaper techniques to extract forensic evidence, keeping government and law enforcement ahead of exponential growth of data storage volumes.

Our products and services are used by over 250 major organizations, including Clearwell, EMC, Endeca/Oracle, Exalead/Dassault, Fujitsu, Google, Hewlett-Packard, Microsoft, NetBase, Oracle, and governments around the world. Learn more at or call +1-617-386-2090.