Chat Translation


Convert Arabic text written with the Roman alphabet to Arabic script

Overview

What is “Arabizi”?

The Arabic chat language, known as “Arabizi,” is a casual form of written Arabic that emerged when Arabic speakers began using Western keyboards to spell out their native language in the Roman alphabet. With the growth of digital communication, Arabizi has become one of the most widely used languages online. With as many as 420 million Arabic speakers in the world, coverage of Arabic, and by extension Arabizi, is necessary for any global text analytics system.

Arabizi transliteration

Rosette’s chat translator transliterates all Arabizi text to Arabic script, minimizing information loss and ensuring consistency across translations. The linguistic algorithm combines the frequency of the structural components of each word with a statistical model trained on the input of millions of Internet users from all over the Arabic-speaking world. It can also transliterate Arabic text into Latin-script Arabizi.

To process the huge volumes of Arabizi text being created, the text must first be transliterated to Arabic script. It can then be run through other forms of linguistic analysis, such as morphology, entity extraction, and sentiment analysis.

Product Highlights

  • Arabizi ↔ Arabic transliteration
  • Cloud or Enterprise deployments
  • Fast and scalable
  • Industrial-strength support
  • Constantly stress-tested and improved

How it Works

Accurately Translate

The Rosette Chat Translator (RCT) translates Romanized Arabic “chat” (also known as Arabizi) into standard Arabic script with very high accuracy. This product leverages two fundamental techniques:

  • An algorithmic approach, breaking down words into morphological components and phonemes to produce Arabic candidate transliterations. The transliterations are ranked according to many metrics, such as the popularity of the phoneme mappings or how frequently the Arabic output is used in Arabic text.
  • A statistical approach, using a large database of Roman alphabet spellings generated from the input of millions of Arabic speakers online.
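The two techniques above can be illustrated with a toy sketch. It is not RCT's implementation: the mapping table, its weights, and the word counts below are invented for illustration, and the names MAPPINGS, WORD_COUNTS, candidates, and rank are hypothetical. The sketch generates Arabic candidates for a Romanized word from per-character mappings, then ranks them by the popularity of the mappings multiplied by how often the Arabic output appears in a reference word list.

```python
from itertools import product
from math import prod

# Hypothetical per-character mappings with made-up popularity weights.
# An empty string models a short vowel that is left unwritten in Arabic.
MAPPINGS = {
    "s": [("س", 1.0)],
    "a": [("ا", 0.5), ("ة", 0.2), ("", 0.3)],
    "3": [("ع", 0.9), ("ء", 0.1)],
}

# Hypothetical counts of Arabic words seen in a reference corpus.
WORD_COUNTS = {"ساعة": 120, "سعة": 15}


def candidates(word):
    """Yield (arabic_text, mapping_score) for every combination of mappings."""
    options = [MAPPINGS[ch] for ch in word]
    for combo in product(*options):
        text = "".join(letter for letter, _ in combo)
        mapping_score = prod(weight for _, weight in combo)
        yield text, mapping_score


def rank(word):
    """Return candidate spellings, best first.

    Score = mapping popularity * (corpus count + 1); the +1 keeps
    unseen candidates in the list with a low, non-zero score.
    """
    scored = {}
    for text, mapping_score in candidates(word):
        score = mapping_score * (WORD_COUNTS.get(text, 0) + 1)
        scored[text] = max(score, scored.get(text, 0.0))
    return sorted(scored, key=scored.get, reverse=True)


best = rank("sa3a")[0]  # the Romanized word "sa3a" from the sample sentence
```

With these invented numbers, the corpus count is what lifts the attested spelling above candidates whose individual letter mappings are more popular, which is the point of combining the two signals.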

This dual algorithmic and statistical approach is increasingly recognized as the most effective method for text analysis and machine translation. Our technology is the only commercially available product offering these features in the realm of Arabic transliteration.

RCT is designed for performance and concurrency from the ground up. It is capable of transliterating thousands of words per second, enabling your applications to instantly convert textual input and quickly process large databases of text. Available both as a Java class library and as a web service, the functionality can be integrated into almost any software environment.

Crowd-Sourced Translations

Unlike machine translation systems that rely on conventional dictionaries, RCT is powered by a database of 300 million Arabic words collected from thousands of different websites. This approach enables translation that reflects contemporary usage throughout the various regional Arabic-speaking online communities. Our database subscription service keeps the translation engine current with the latest trends in terminology and personalities.

Tech Specs

Availability and Platform Support

Deployment Availability:
Plugins:
Bindings:

Supported Languages

Arabizi ↦ Modern Standard Arabic
Modern Standard Arabic ↦ Arabizi

Rosette Cloud

Easy to Use

Built for the most demanding text analytics applications and engineered to deliver high accuracy without sacrificing speed, Rosette Cloud is instantly accessible and offers a variety of plans to suit both startups and enterprises.

Try transliteration and the rest of Rosette’s endpoints, free up to 10,000 calls/month!

Get a Rosette Cloud Key

Quality Documentation and Support

Customers love our thorough and responsive support team. We also provide in-depth documentation that lists all the features and functions of the various endpoints alongside examples in the binding of your choice.

Visit our GitHub for bindings and documentation.

Enterprise Ready

Evaluate Rosette’s functional fit with your business and data needs in the cloud, knowing that scalable, customizable enterprise deployments are available if you need them.

INPUT

{
  "content": "ana r2ye7 el gam3a el sa3a 3 el 3asr"
}

OUTPUT

{
  "transliteration": "أنا رايح الجامعة الساعة ٣ العصر"
}
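The input and output above could be exchanged with the web service roughly as follows. This is a minimal sketch using only the Python standard library. The endpoint URL and the X-RosetteAPI-Key header name are assumptions that should be checked against the Rosette Cloud documentation. The "content" and "transliteration" field names are taken from the sample above, and build_request and transliterate are hypothetical helper names.

```python
import json
import urllib.request

# Assumed endpoint; confirm the exact URL in the Rosette Cloud documentation.
API_URL = "https://api.rosette.com/rest/v1/transliteration"


def build_request(text, api_key):
    """Build a POST request whose JSON body matches the INPUT sample."""
    body = json.dumps({"content": text}, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "X-RosetteAPI-Key": api_key,  # assumed header name for the key
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )


def transliterate(text, api_key):
    """Send the request and return the "transliteration" field of the reply."""
    with urllib.request.urlopen(build_request(text, api_key)) as response:
        return json.load(response)["transliteration"]
```

Calling transliterate("ana r2ye7 el gam3a el sa3a 3 el 3asr", "YOUR_KEY") requires a valid key and network access. Building the request separately from sending it makes the payload easy to inspect or unit test offline.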

Rosette Enterprise

Customize and scale your text analytics on premise

For organizations with vast data quantities, unique integration needs, and data security restrictions, we provide on-premise deployments to be hosted on your internal servers.

Request Product Evaluation

If your organization requires an enterprise solution, we’re happy to work with you to meet your business’s unique needs. For a free evaluation of Rosette Enterprise, please complete the form below and our Customer Engineering team will provide you with an evaluation package.

Drop Us a Line

EMAIL:
info@basistech.com

PHONE:
+1-617-386-2000

Blog

Add Sentiment Analysis, Translated Names, Entities and More to Elasticsearch

Rosette API Adds Support for “Arabizi” Script

No coding required

RapidMiner is the industry’s #1 predictive analytics platform. The client platform, RapidMiner Studio, empowers organizations to easily prep data, create models and operationalize predictive analytics within any business process.

Try RapidMiner