What (or who) we talk about


Entities are the key actors in your content: the people, places, organizations, email addresses, products, dates, times, and more that are hidden in your text. Using a statistical machine learning model, Rosette uncovers these entities to help you understand what your content is telling you.
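To make the idea of an entity concrete, here is a minimal sketch in Python of what an extractor hands back: a typed mention plus its position in the text. It is not Rosette's API or method. Rosette uses a statistical model, while this toy relies on two simplified regular expressions that only suit pattern-like types such as email addresses and URLs. The names `Entity`, `PATTERNS`, and `extract_entities` are hypothetical and exist only for this illustration.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Entity:
    type: str     # hypothetical label, e.g. "EMAIL" or "URL"
    mention: str  # the exact text that was matched
    start: int    # character offset where the mention begins
    end: int      # character offset just past the mention


# Simplified, illustrative patterns (an assumption of this sketch);
# real email and URL syntax is broader than what these accept.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "URL": re.compile(r"https?://\S+"),
}


def extract_entities(text: str) -> List[Entity]:
    """Return pattern-matched entities, sorted by position in the text."""
    found = [
        Entity(label, m.group(), m.start(), m.end())
        for label, pattern in PATTERNS.items()
        for m in pattern.finditer(text)
    ]
    return sorted(found, key=lambda e: e.start)


sample = "Write to press@example.com or visit https://example.com/news today."
entities = extract_entities(sample)
for e in entities:
    print(e.type, e.mention, e.start, e.end)
```

Patterns like these cannot tell Apple Inc. from the fruit. Types such as Person, Location, and Organization depend on surrounding context, which is where a trained model is needed.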

Context aware

Rosette knows how to read context for the 18 entity types it understands. It knows the difference between Apple Inc. and the fruit, and can disambiguate between Paris Hilton the person and Paris Hilton the hotel. It also uniquely adapts to new entity types: annotate a small quantity of your data, and Rosette will learn to recognize them. This works in 20 languages.

Content aware

Unlike most extractors, Rosette can be tuned for a wide range of content, including news articles, blogs, restaurant reviews, financial documents, medical records, legal contracts, and patent filings. It can also extract entities from short strings such as tweets. If your style of content is not on the list, you can train Rosette to build a new model.

What is it used for?

Rosette entity extraction enables more advanced analytics, such as name matching, sentiment analysis, and relationship extraction, which are used in a variety of applications:

  • Resolving a person’s identity for government security and fraud detection
  • Tracking customer sentiment about products and companies
  • Analyzing research for patent law, legal discovery, and compliance
  • Predicting world events and feeding open source intelligence
  • Providing targeted search for content publishers and recommendation engines

Select Customers Include:

Attivio, Bing, Kobo, KPMG, Dassault

Blog: Adapt Rosette’s Entity Extraction to Your Content for Increased Accuracy


Forecasting the Future: The EMBERS Predictive Analytics Success Story


Supported Languages & Features

Languages (20)

  • Arabic
  • Chinese, Simp.
  • Chinese, Trad.
  • Dutch
  • English
  • French
  • German
  • Hebrew
  • Indonesian
  • Italian
  • Japanese
  • Korean
  • Malay
  • Pashto
  • Persian
  • Portuguese
  • Russian
  • Spanish
  • Urdu
  • Vietnamese

Entity Types (18)

  • Person
  • Location
  • Organization
  • Product
  • Title
  • Nationality
  • Religion
  • Credit Card
  • Lat/Long
  • Money
  • Number
  • ID Number
  • Phone
  • Email
  • URL
  • Distance
  • Date
  • Time
Entity Extraction
Released


Finds 18 types of entities out of the box, including names of people, places, products, and organizations, in 20 languages.