23 Apr 2018

Eight Languages Added to Rosette 1.10.0


Match and translate Greek names • Extract sentiment from Persian text

Rosette text analytics enables users to extract value from unstructured text. All of our capabilities are built on a multilingual architecture that can be extended to any language. By processing text in the native language, Rosette delivers higher accuracy than solutions that rely on machine translation.

We’re thrilled to bring less commonly supported languages to some of our most complex functionality. Rosette 1.10.0 adds name translation and matching for Greek, as well as sentiment analysis for Persian (Dari and Farsi). We’ve also added more languages to morphological analysis and language identification.

Learn more below, or jump to the release notes.

Match, translate, and manage international names

Rosette 1.10.0 supports the transliteration of person, organization, and location names from Greek to Latin script, and matches Greek names written in Greek script to English or Greek names written in Latin script. With the addition of Greek coverage, Rosette can now help you intelligently fuzzy match, translate, and deduplicate names in 17 languages, as well as between each language and English.

Name analytics addresses questions of identity verification in many high-stakes verticals, including financial compliance, border security, and customer management. Businesses, banks, and governments are increasingly engaging in cross-border transactions, so multilingual name analytics is particularly valuable.

Why add Greek name analysis support now? Because you asked! Several Rosette customers in the banking, AML, and compliance verticals have found that Greek citizens have started banking internationally in increasing numbers. As international financial institutions engage with Greek customers, they need tools to quickly and accurately verify and screen names in Greek script.

Powering OSINT and VoC Analysis in Western Asia

Rosette 1.10.0 adds support for sentiment analysis in Persian, also known as Farsi and Dari, at both the document and the entity level. Persian is spoken by more than a hundred million people around the world and is the official language of Iran, Tajikistan, and Afghanistan. Rosette now analyzes sentiment in six languages: Arabic, English, French, Japanese, Persian, and Spanish.

Predictive analytics and open source intelligence (OSINT) systems rely on advanced text analytics to identify and extract entities and topics of interest from vast public data lakes. Sentiment analysis is key to harnessing valuable information from social media and other public data sources because it gives a glimpse into the feelings and motivations around entities.

The addition of Persian sentiment analysis provides more complete support for OSINT systems in Western Asia and may be the first reliable commercial Persian sentiment analysis on the market.

What else is new?

Entity Extraction and Linking: New deep neural network processor (in BETA)

We’ve added a second entity extraction processor, which can be used in place of the standard statistical extractor. The new processor employs a deep neural network (DNN) that improves accuracy by up to 7% and reduces the error rate by up to 32%. It is available for English, Arabic, and Korean.

To enable this processor, set modelType to DNN in the request options. Example:

{"content": "your_text_here", "options": {"modelType": "DNN"}}
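If you assemble requests in code, the body above could be built along these lines. This is only a minimal sketch in Python: the function name build_entities_payload and its use_dnn flag are hypothetical helpers invented for illustration, not part of any Rosette binding, and sending the request (endpoint URL, API key header) is deliberately left out; see the Rosette docs for those details.

```python
import json


def build_entities_payload(text: str, use_dnn: bool = True) -> str:
    """Return a JSON request body for an entity extraction call.

    Hypothetical helper: only the "content" and "options.modelType"
    fields come from the example in this post.
    """
    payload = {"content": text}
    if use_dnn:
        # Opt in to the beta deep neural network processor.
        payload["options"] = {"modelType": "DNN"}
    # Without the option, the standard statistical extractor is used.
    return json.dumps(payload)


body = build_entities_payload("your_text_here")
print(body)
```

Leaving out the options block (use_dnn=False in this sketch) falls back to the standard statistical extractor, which makes it easy to compare the two processors on the same text.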

If you try out the new model, please share your feedback and experience with us. This processor is in beta and we’re always looking to improve.

Name Matching: Improved match results

We’ve made improvements to name match scores and segmentation rules for Arabic, Persian, and Japanese names.

Morphological Analysis: New language support

We’ve added support for lemmatizing Catalan, Estonian, Serbian, and Slovak text.

Language Identification: New language support

Short string language identification now also supports Malay and Indonesian. Both languages were already supported for longer texts.

TL;DR: check the release notes. As always, you can find docs and FAQs at support.rosette.com. Not a Rosette user yet? Get your Rosette Cloud key today (no credit card required, 10,000 free calls/month), or let us know if you’re interested in on-premise deployments through Rosette Enterprise.

RapidMiner: New language support across endpoints

Are you a RapidMiner user? We’ve also added the new language coverage to our RapidMiner extension. Get started today!