Names are hard

Names are vitally important data points in financial compliance, anti-fraud, government intelligence, law enforcement, and identity verification. Yet it can be challenging to match names when your data includes variables such as misspellings, aliases or nicknames, initials, and non-Latin scripts.

Rosette solves these challenges with a linguistic, statistics-based system that compares and matches names of people, places, and organizations despite their many variations. Built by linguistics experts, our name matching is unrivaled in its ability to connect entities with high adaptability, precision, and scalability. With fluency across 15 languages and a deep understanding of the linguistic complexities of names, Rosette is the first choice for name matching.

Industry-leading indexing model

Rosette blends machine learning with traditional name matching techniques such as name lists, common key, and rules to determine a match score. This score can be used to maximize precision or recall depending upon the application.

This highly adaptable model handles 13 different name phenomena (see all 13 in the Tech Specs section). Two examples:

Same name in multiple languages: Mao Zedong ↔ Мао Цзэдун ↔ 毛泽东
Semantically similar names: Eagle Pharmaceuticals, Inc. ↔ Eagle Drugs
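To make the precision and recall trade-off concrete, here is a minimal Python sketch of how a threshold applied to match scores shifts the balance. The scored pairs and the `precision_recall` helper are hypothetical and invented for illustration; they are not output from Rosette.

```python
from typing import List, Tuple

# Hypothetical scored pairs: (match score, whether the pair is truly the same entity).
# These values are made up for illustration and do not come from Rosette.
SCORED_PAIRS: List[Tuple[float, bool]] = [
    (0.95, True),
    (0.88, True),
    (0.81, False),
    (0.76, True),
    (0.60, False),
    (0.42, False),
]


def precision_recall(pairs: List[Tuple[float, bool]], threshold: float) -> Tuple[float, float]:
    """Treat every pair scoring at or above the threshold as a predicted match."""
    true_pos = sum(1 for score, is_match in pairs if score >= threshold and is_match)
    predicted = sum(1 for score, _ in pairs if score >= threshold)
    actual = sum(1 for _, is_match in pairs if is_match)
    precision = true_pos / predicted if predicted else 1.0
    recall = true_pos / actual if actual else 1.0
    return precision, recall


if __name__ == "__main__":
    for threshold in (0.70, 0.85):
        p, r = precision_recall(SCORED_PAIRS, threshold)
        print(f"threshold={threshold:.2f} precision={p:.2f} recall={r:.2f}")
```

With this toy data, the lower threshold keeps every true match but admits one false positive, while the higher threshold returns only true matches and misses one of them.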

Product Highlights

  • 15 supported languages
  • Matches names of people, locations and organizations
  • Increases name search accuracy
  • Ranks results by relevancy with a similarity score
  • Intuitive cloud API
  • Customizable SDK
  • Fast and scalable
  • Industrial-strength support
  • Constantly stress-tested and improved

How It Works

The industry leader in names

Rosette relies on machine learning rather than exhaustive name lists for its name matching logic. This means new names are matched the first time they appear. It also avoids the problem of an exponentially growing list, especially for names that have multiple elements. A 3-element name (first, middle, last), for example, with 12 variations for each element would add 12 x 12 x 12 = 1,728 variations to a list.
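The arithmetic above can be checked with a short Python sketch. The helper names and the sample variants are hypothetical; the point is only that a list-based approach must store one entry per combination of element variants.

```python
from itertools import product
from typing import List


def variant_count(variants_per_element: int, elements: int) -> int:
    """Number of full-name entries when every element has the same number of variants."""
    return variants_per_element ** elements


def expand(variants: List[List[str]]) -> List[str]:
    """Enumerate every full name a list-based matcher would have to store."""
    return [" ".join(combo) for combo in product(*variants)]


if __name__ == "__main__":
    print(variant_count(12, 3))  # 12 * 12 * 12
    # Illustrative variants only: 3 first names x 2 last names.
    for full_name in expand([["William", "Will", "Bill"], ["Smith", "Smyth"]]):
        print(full_name)
```

Adding a fourth element with 12 variants would multiply the list by 12 again, which is why the list size grows exponentially with the number of name elements.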

Unlike expensive and less accurate legacy solutions driven by thousands of spelling variants of known names, Rosette analyzes the intrinsic structure of each name component and performs an intelligent comparison using advanced linguistic algorithms. Under the hood, Rosette name matching draws on cutting-edge NLP techniques, including neural networks, hidden Markov models, transliteration rules, and word embedding vectors.

Customizable to your needs

Rosette is unique among text analytics software in its adaptability. Our on-premise name matching not only supports matching against vast data lakes, but can be tailored to fit your needs.

  • Set the minimum threshold of the similarity score to manage the precision and recall of the returned search results
  • Create a list of “stopwords” to ignore when determining match similarity scores (e.g., titles, honorifics)
  • Pre-set two name words to always match with a given score (e.g., “Elizabeth” and “Lisbeth” always match at 90%)
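The three options above can be illustrated with a toy Python sketch. Everything here is an assumption made for illustration: the `STOPWORDS` and `OVERRIDES` structures, the function names, and the use of `difflib.SequenceMatcher` as a stand-in similarity measure. It does not reflect the Rosette SDK's configuration format or its scoring model.

```python
from difflib import SequenceMatcher
from typing import Dict, FrozenSet, List, Set, Tuple

# Hypothetical configuration, not the Rosette SDK's actual format.
STOPWORDS: Set[str] = {"dr", "mr", "mrs", "ms", "phd"}
OVERRIDES: Dict[FrozenSet[str], float] = {frozenset({"elizabeth", "lisbeth"}): 0.90}


def tokens(name: str) -> List[str]:
    """Lowercase, strip periods and commas, and drop stopwords such as titles."""
    cleaned = name.lower().replace(".", "").replace(",", " ")
    return [t for t in cleaned.split() if t not in STOPWORDS]


def token_score(a: str, b: str) -> float:
    """Exact match scores 1.0, an override pair uses its preset score,
    and anything else falls back to a generic string similarity."""
    if a == b:
        return 1.0
    override = OVERRIDES.get(frozenset({a, b}))
    if override is not None:
        return override
    return SequenceMatcher(None, a, b).ratio()


def name_score(name1: str, name2: str) -> float:
    """Average, over the shorter name's tokens, of each token's best match."""
    t1, t2 = tokens(name1), tokens(name2)
    if not t1 or not t2:
        return 0.0
    short, long_ = (t1, t2) if len(t1) <= len(t2) else (t2, t1)
    return sum(max(token_score(a, b) for b in long_) for a in short) / len(short)


def search(query: str, candidates: List[str], threshold: float) -> List[Tuple[str, float]]:
    """Return candidates scoring at or above the threshold, best first."""
    scored = [(c, name_score(query, c)) for c in candidates]
    return sorted([(c, s) for c, s in scored if s >= threshold], key=lambda x: -x[1])


if __name__ == "__main__":
    print(search("Dr. Elizabeth Carr", ["Lisbeth Carr", "Carlos Diaz"], threshold=0.8))
```

Raising the threshold trims lower-scoring candidates from the results (favoring precision), while lowering it returns more candidates (favoring recall).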

We built our name matching technology with large, complex databases in mind. Unlike lightweight solutions that have been adapted to make them scalable, Rosette name matching was built for customers with tens of millions of data entries, and use cases that cannot afford lags in performance and accuracy.

Tech Specs

Availability and Platform Support

Deployment Availability:

Supported Languages

  • Arabic
  • Chinese, Simplified
  • Chinese, Traditional
  • English
  • French
  • German
  • Italian
  • Japanese
  • Korean
  • Pashto
  • Persian
  • Portuguese
  • Russian
  • Spanish
  • Urdu

13 Ways Rosette Matches Names

Phonetic similarity: Jesus ↔ Heyzeus ↔ Haezoos
Transliteration spelling differences: Abdul Rasheed ↔ Abd al-Rashid
Nicknames: William ↔ Will ↔ Bill ↔ Billy
Missing spaces or hyphens: MaryEllen ↔ Mary Ellen ↔ Mary-Ellen
Titles and honorifics: Dr., Mr., Ph.D.
Truncated name components: McDonalds ↔ McDonald ↔ McD
Missing name components: Phillip Charles Carr ↔ Phillip Carr
Out-of-order name components: Diaz, Carlos Alfonzo ↔ Carlos Alfonzo Diaz
Initials: J. E. Smith ↔ James Earl Smith
Names split inconsistently across database fields: Dick . Van Dyke ↔ Dick Van . Dyke
Same name in multiple languages: Mao Zedong ↔ Мао Цзэдун ↔ 毛泽东 ↔ 毛澤東
Semantically similar names: Eagle Pharmaceuticals, Inc. ↔ Eagle Drugs, Co.
Semantically similar names across languages: Nippon Telegraph and Telephone Corporation ↔ 日本電信電話株式会社
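Three of the phenomena listed above (missing spaces or hyphens, out-of-order components, and initials) are simple enough to sketch with exact rules in Python. The function names are hypothetical, and these hand-written checks are only an illustration of what each phenomenon means; Rosette's statistical model is not described at this level in the text and is not reproduced here.

```python
import re
from typing import List


def normalize(name: str) -> List[str]:
    """Lowercase, then split on whitespace, periods, commas, and hyphens."""
    return re.sub(r"[.,\-]", " ", name.lower()).split()


def same_ignoring_spacing(a: str, b: str) -> bool:
    """Missing spaces or hyphens: compare the names with all separators removed."""
    return "".join(normalize(a)) == "".join(normalize(b))


def same_ignoring_order(a: str, b: str) -> bool:
    """Out-of-order components: compare the components as unordered collections."""
    return sorted(normalize(a)) == sorted(normalize(b))


def initials_compatible(a: str, b: str) -> bool:
    """Initials: a one-letter component matches any component starting with that letter."""
    ta, tb = normalize(a), normalize(b)
    if len(ta) != len(tb):
        return False
    return all(
        x == y or (len(x) == 1 and y.startswith(x)) or (len(y) == 1 and x.startswith(y))
        for x, y in zip(ta, tb)
    )


if __name__ == "__main__":
    print(same_ignoring_spacing("MaryEllen", "Mary-Ellen"))
    print(same_ignoring_order("Diaz, Carlos Alfonzo", "Carlos Alfonzo Diaz"))
    print(initials_compatible("J. E. Smith", "James Earl Smith"))
```

Rules like these return only yes or no and must be written one phenomenon at a time, which is one reason a scored, statistical comparison is attractive when several variations occur in the same name.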

Cloud API

Easy to Use API

Ideal for product evaluation, academic research, and smaller, cost-conscious businesses, our fast and powerful API is instantly accessible and free to get started.

Our matching endpoint supports pairwise matching, generating a match score for any two names, locations, or organizations entered by the user. If you need to search for name matches against extensive databases of entities, talk to our customer engineering team about evaluating our on-premise name matching SDKs.

Try name matching and the rest of Rosette’s endpoints, free up to 10,000 calls/month!

Quality Documentation and Support

Customers love our thorough and responsive support team. We also provide in-depth documentation that lists all the features and functions of the various API endpoints alongside examples in the binding of your choice.

Visit our GitHub for bindings and documentation.

Enterprise Ready

Evaluate Rosette’s functional fit with your business and data needs on our cloud API knowing that scalable, customizable, on-premise deployments are available if you need them.

{
  "name1": {
    "text": "Влади́мир Влади́мирович Пу́тин",
    "language": "rus",
    "entityType": "PERSON"
  },
  "name2": {
    "text": "Vladimir Putin",
    "language": "eng",
    "entityType": "PERSON"
  }
}

{
  "result": {
    "score": 0.9486632809417912
  }
}
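
As a rough illustration, a pairwise request like the sample above could be assembled in Python with only the standard library. The endpoint URL, the `X-RosetteAPI-Key` header name, and the helper functions are assumptions to verify against the official API documentation; the key is a placeholder. The response parser accepts either a top-level `score` or the `result` wrapper shown in the sample, since the exact response envelope is not confirmed here.

```python
import json
import urllib.request
from typing import Any, Dict

# Assumed values: confirm the endpoint and header name in the official documentation.
API_URL = "https://api.rosette.com/rest/v1/name-similarity"
API_KEY = "YOUR_API_KEY"  # placeholder


def build_request(text1: str, lang1: str, text2: str, lang2: str,
                  entity_type: str = "PERSON") -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the two names to compare."""
    payload = {
        "name1": {"text": text1, "language": lang1, "entityType": entity_type},
        "name2": {"text": text2, "language": lang2, "entityType": entity_type},
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-RosetteAPI-Key": API_KEY},
        method="POST",
    )


def parse_score(body: Dict[str, Any]) -> float:
    """Read the score from either {"score": ...} or {"result": {"score": ...}}."""
    return float(body.get("result", body)["score"])


def fetch_score(request: urllib.request.Request) -> float:
    """Send the request and return the match score (requires network access and a valid key)."""
    with urllib.request.urlopen(request) as response:
        return parse_score(json.load(response))
```

Calling `fetch_score(build_request("Влади́мир Влади́мирович Пу́тин", "rus", "Vladimir Putin", "eng"))` with a valid key would send the sample pair shown above.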

On Premise

Match against massive databases on premise

For organizations with vast data quantities, unique integration needs, and data security restrictions, we provide on-premise API deployment and SDKs to be hosted on your internal servers. Our on-premise name matching SDKs allow you to search for matches against enormous databases. Rosette name matching is built to support fast, accurate matching against tens of millions of entities.

