Document Enrichment

Transform your unstructured data into novel insight


Enhance your analysis with Rosette API

Add facets straight from the Cloud. Configure the processors you need and Rosette API takes care of the rest: offsets, language identification, entity extraction, categorization, name translation, and sentiment analysis. With extensive language support and customization parameters, Rosette puts you in control.

Facet on real-world entities

Identify the people, locations, organizations, and more hiding in your text and link them to real-world knowledge base entries. Rosette is pre-trained on 700+ entity types and 21 languages and can be adapted in the field to domain-specific content for improved accuracy.

Explore the nuances of your data

Illuminate the patterns in your documents with multi-faceted queries and bring them to life with aggregations and Kibana visualizations. Map connections and relationships with Graph.
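As a rough illustration of faceting on extracted entities, the sketch below builds a standard Elasticsearch search body that combines a full-text match with a terms aggregation. The field names `text` and `entities.person` and the helper `facet_request` are hypothetical and chosen only for this example; the fields an enriched index actually exposes depend on how the plugin is configured.

```python
import json


def facet_request(query_text, facet_field, size=10):
    """Build a search body: match documents, then bucket them by one field.

    The field names passed in are placeholders, not a documented schema.
    """
    return {
        "size": 0,  # return only the aggregation buckets, no hits
        "query": {"match": {"text": query_text}},
        "aggs": {
            "top_values": {
                "terms": {"field": facet_field, "size": size}
            }
        },
    }


if __name__ == "__main__":
    body = facet_request("merger announcement", "entities.person")
    print(json.dumps(body, indent=2))
```

The resulting body can be sent to an index's `_search` endpoint; the same aggregation can also back a Kibana visualization.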

Quick and easy integration with Elastic architecture

Dive deeper into your text fast with Rosette’s 100% Java plug-and-play connection to Elastic. Download, install, and start querying with just one line of code. Fully compatible with Rosette’s other Elastic plugins for Multilingual Search Enhancement and Identity Resolution.

Identity Resolution

Robust indexing and querying for names, dates, and more


Enterprise-level multilingual identity verification

Whether you’re tackling log analysis, e-commerce, watch list screening, or customer support, you want to know who you’re dealing with. Rosette handles misspellings, nicknames, aliases, titles, phonetic spellings, cross-script variations, and translations in over 40 different languages.

Can you find “Abdul Jabbar, Karim” if you search “Kareem AbdalJabar” or “كريم عبد الجبار”?

Complex name and information matching, the easy way

No more complicated multi-field queries with one field for every potential name variation. The Rosette plugin contains a custom mapper which does all the work behind the scenes. Simplify your identity verification workflow with custom Lucene queries to generate candidate documents, and rescore queries to grade names and rerank results accordingly.

Better search, recall, and precision

Using a special “name data” type, Rosette indexes keys for different name variations in separate subfields for every token, boosting speed without sacrificing accuracy. With appropriate hardware configuration, query up to 50 names per second against an index of 100 million names.
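The general idea of indexing a key per name token, generating candidates from shared keys, and then rescoring can be sketched in a few lines. This is a toy illustration only, not Rosette's algorithm: the consonant-skeleton key, the `difflib` similarity score, and every function name below are assumptions made for the example, and the sketch handles Latin-script names only.

```python
import difflib
import re
from collections import defaultdict


def tokens(name):
    """Lowercase a name and split it into alphabetic (a-z) tokens."""
    return re.findall(r"[a-z]+", name.lower())


def token_key(token):
    """Crude key: drop vowels, then collapse adjacent repeated letters."""
    key = []
    for ch in token:
        if ch in "aeiouy":
            continue
        if key and key[-1] == ch:
            continue
        key.append(ch)
    return "".join(key)


def build_index(names):
    """Map each token key to the positions of the names that contain it."""
    index = defaultdict(set)
    for pos, name in enumerate(names):
        for tok in tokens(name):
            index[token_key(tok)].add(pos)
    return index


def score(a, b):
    """Order-insensitive string similarity in [0, 1]."""
    norm_a = " ".join(sorted(tokens(a)))
    norm_b = " ".join(sorted(tokens(b)))
    return difflib.SequenceMatcher(None, norm_a, norm_b).ratio()


def search(query, names, index):
    """Pass 1: collect names sharing any key. Pass 2: rescore and rank them."""
    candidates = set()
    for tok in tokens(query):
        candidates |= index.get(token_key(tok), set())
    ranked = [(names[pos], score(query, names[pos])) for pos in candidates]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    names = ["Abdul Jabbar, Karim", "Karim Benzema", "Karen Abbott"]
    for name, s in search("Kareem AbdalJabar", names, build_index(names)):
        print(f"{s:.2f}  {name}")
```

The cheap key lookup keeps the candidate set small, so the more expensive similarity score runs on only a handful of names. A production matcher would also need to cover nicknames, cross-script variants, and joined or reordered tokens, which this sketch does not attempt.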

Quick and easy integration with Elastic architecture

Get better search results fast with Rosette’s 100% Java plug-and-play connection to Elastic. Download, install, and start querying with just one line of code. Fully compatible with Rosette’s other Elastic plugins for Multilingual Search Enhancement and Document Enrichment.

Multilingual Search Enhancement

Leverage your text and globalize your analysis


The results you need in the languages you want

Query in over 32 languages with proprietary statistical models backed by Basis Technology’s 20+ years of experience in multilingual enterprise search. We’ve got your back with dedicated support, training, and implementation assistance.

Advanced linguistics processing for nuanced analysis

It pays to know where one word ends and the next begins. Limit false positives, improve accuracy, and boost speed with advanced tokenization for languages like Chinese and Japanese, which don’t separate individual words with whitespace.
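To see why word boundaries matter, here is a minimal sketch of forward maximum matching, a classic dictionary-based way to segment text written without spaces. It is a simplified stand-in for illustration, not the statistical models described above; the function name `segment` and the tiny vocabulary are assumptions.

```python
def segment(text, vocabulary):
    """Greedy left-to-right segmentation: take the longest known word each time.

    Characters not covered by the (non-empty) vocabulary fall back to
    single-character tokens.
    """
    max_len = max(len(word) for word in vocabulary)
    result = []
    i = 0
    while i < len(text):
        # Try the longest window first, shrinking down to one character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocabulary:
                result.append(piece)
                i += size
                break
    return result


if __name__ == "__main__":
    vocab = {"我", "喜欢", "北京", "大学", "北京大学"}
    print(segment("我喜欢北京大学", vocab))
```

With this vocabulary, the sentence 我喜欢北京大学 ("I like Peking University") is split into 我, 喜欢, and 北京大学, so a search for 北京大学 matches a whole token rather than an arbitrary run of characters.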

Increased precision and recall

With lemmatization, index words by their dictionary form so that related forms match, improving recall without diminishing precision. Tackle languages that freely form compound nouns, such as German, Korean, and Dutch, with decompounding, which indexes each component word individually for better recall.
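Decompounding can be pictured with a small dictionary-driven splitter. The sketch below is a simplified illustration under stated assumptions (a hand-made German lexicon, an optional linking "s", and the hypothetical function name `decompound`); it does not reflect how Rosette performs decompounding.

```python
def decompound(word, lexicon, min_part=3):
    """Split a compound into known lexicon words, or return it unsplit.

    Prefers the longest known prefix and allows an optional linking "s"
    between parts (as in German "Arbeit-s-zeit").
    """
    word = word.lower()

    def split(rest):
        if not rest:
            return []
        for end in range(len(rest), min_part - 1, -1):
            head = rest[:end]
            if head not in lexicon:
                continue
            tail = rest[end:]
            for linker in ("", "s"):
                if tail.startswith(linker):
                    parts = split(tail[len(linker):])
                    if parts is not None:
                        return [head] + parts
        return None  # no full split into lexicon words was found

    parts = split(word)
    return parts if parts else [word]


if __name__ == "__main__":
    lexicon = {"arbeit", "zeit", "kinder", "buch"}
    for compound in ("Arbeitszeit", "Kinderbuch", "Haus"):
        print(compound, "->", decompound(compound, lexicon))
```

Indexing the parts alongside the full compound lets a query for "Buch" retrieve a document that mentions only "Kinderbuch".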

Quick and easy integration with Elastic architecture

Get better search results fast with Rosette’s 100% Java plug-and-play connection to Elastic. Download, install, and start querying with just one line of code. Fully compatible with Rosette’s other Elastic plugins for Identity Resolution and Document Enrichment.