Who is 孙武? | Name Matching

Name Matching


Verify identities and match names and organizations against vast databases with industry-leading accuracy and recall

Overview

Names are hard

Names are vitally important data points in financial compliance, anti-fraud, government intelligence, law enforcement, and identity verification. Yet it can be challenging to match names when your data includes variables such as misspellings, aliases or nicknames, initials, and non-Latin scripts.

Rosette solves these challenges with a linguistic, statistics-based system that compares and matches names of people, places, and organizations despite their many variations. Built by linguistics experts, our name matching is unrivaled in its ability to connect entities with high adaptability, precision, and scalability. With fluency across 18 languages and a deep understanding of the linguistic complexities of names, Rosette is the first choice for name matching.

Industry-leading indexing model

Rosette blends machine learning with traditional name matching techniques such as name lists, common key, and rules to determine a match score. This score can be used to maximize precision or recall depending upon the application.

This highly adaptable model handles 13 different name phenomena (see all 13 in the Tech Specs section). Two examples:

Same name in multiple languages: Mao Zedong ↔ Мао Цзэдун ↔ 毛泽东
Semantically similar names: Eagle Pharmaceuticals, Inc. ↔ Eagle Drugs

Product Highlights

  • 18 supported languages
  • Matches names of people, locations and organizations
  • Increases name search accuracy
  • Ranks results by relevancy with a similarity score
  • Fast and scalable
  • Industrial-strength support
  • Constantly stress-tested and improved
  • Cloud and enterprise deployments

How It Works

The industry leader in names

Rosette's name matching logic relies on machine learning rather than on exhaustive lists of name variants. This means a new name can be matched the first time it appears. It also avoids the problem of an exponentially growing list, especially for names with multiple elements. A 3-element name (first, middle, last) with 12 variations for each element, for example, would add 12 × 12 × 12 = 1,728 variations to a list.
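The arithmetic above can be checked with a few lines of Python. This is only an illustration of how a variant list grows; the function name `count_variants` and the sample spellings are made up for the example.

```python
from itertools import product
from math import prod


def count_variants(variants_per_element):
    """Number of full-name entries needed to cover every combination of element variants."""
    return prod(variants_per_element)


# Three name elements (first, middle, last) with 12 variants each.
print(count_variants([12, 12, 12]))

# Enumerating a tiny case makes the growth concrete: 2 x 2 x 2 combinations.
first = ["William", "Bill"]
middle = ["Earl", "E."]
last = ["Smith", "Smyth"]
full_names = [" ".join(parts) for parts in product(first, middle, last)]
print(len(full_names))
```

Adding a fourth element with 12 variants would multiply the list by 12 again, which is why a list-only approach becomes hard to maintain.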

Unlike expensive and less accurate legacy solutions driven by thousands of spelling variants from known names, Rosette analyzes the intrinsic structure of each name component and performs an intelligent comparison using advanced linguistic algorithms. Under the hood, Rosette name matching utilizes the cutting edge of NLP techniques including neural networks, hidden Markov models, transliteration rules, and word embedding vectors.

Customizable to your needs

Rosette is unique among text analytics software in its adaptability. Our on-premise name matching not only supports matching against vast data lakes, but can be tailored to fit your needs.

  • Set the minimum threshold of the similarity score to manage the precision and recall of the returned search results
  • Create a list of “stopwords” to ignore when determining match similarity scores (e.g., titles, honorifics)
  • Pre-set two name words to always match with a given score (e.g., “Elizabeth” and “Lisbeth” always match at 90%)
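The three settings above can be pictured with a small, self-contained sketch. It is not Rosette's implementation or configuration format: the scoring here is a generic string ratio from Python's `difflib`, and `STOPWORDS`, `OVERRIDES`, `name_score`, and `search` are hypothetical names chosen for the illustration.

```python
from difflib import SequenceMatcher

# Hypothetical settings for illustration only.
STOPWORDS = {"dr", "mr", "mrs", "ms", "phd"}
OVERRIDES = {frozenset({"elizabeth", "lisbeth"}): 0.90}


def tokens(name):
    """Lowercase the name, strip periods, and drop stopword tokens."""
    words = name.lower().replace(".", "").split()
    return [w for w in words if w not in STOPWORDS]


def token_score(a, b):
    """Score two tokens: exact match, then the override table, then a generic ratio."""
    if a == b:
        return 1.0
    override = OVERRIDES.get(frozenset({a, b}))
    if override is not None:
        return override
    return SequenceMatcher(None, a, b).ratio()


def name_score(name1, name2):
    """Average, over the shorter name's tokens, of the best score against the other name."""
    t1, t2 = tokens(name1), tokens(name2)
    if not t1 or not t2:
        return 0.0
    short, long_ = (t1, t2) if len(t1) <= len(t2) else (t2, t1)
    return sum(max(token_score(a, b) for b in long_) for a in short) / len(short)


def search(query, candidates, threshold):
    """Return (score, candidate) pairs at or above the threshold, best first."""
    scored = [(name_score(query, c), c) for c in candidates]
    return sorted((p for p in scored if p[0] >= threshold), reverse=True)


candidates = ["Dr. Lisbeth Carr", "Elizabeth Car", "Robert Jones"]
for score, name in search("Elizabeth Carr", candidates, threshold=0.8):
    print(f"{score:.2f}  {name}")
```

Raising `threshold` returns fewer, more certain matches (higher precision); lowering it returns more candidates (higher recall).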

We built our name matching technology with large, complex databases in mind. Unlike lightweight solutions that have been adapted to make them scalable, Rosette name matching was built for customers with tens of millions of data entries, and use cases that cannot afford lags in performance and accuracy.

Tech Specs

Availability and Platform Support

Deployment Availability:
Plugins:
Bindings:

Supported Languages

  • Arabic
  • Chinese, Simplified
  • Chinese, Traditional
  • English
  • French
  • German
  • Greek
  • Hungarian
  • Italian
  • Japanese
  • Korean
  • Pashto
  • Persian
  • Portuguese
  • Russian
  • Spanish
  • Thai
  • Urdu

13 Ways Rosette Matches Names

Phonetic similarity: Jesus ↔ Heyzeus ↔ Haezoos
Transliteration spelling differences: Abdul Rasheed ↔ Abd al-Rashid
Nicknames: William ↔ Will ↔ Bill ↔ Billy
Missing spaces or hyphens: MaryEllen ↔ Mary Ellen ↔ Mary-Ellen
Titles and honorifics: Dr., Mr., Ph.D.
Truncated name components: McDonalds ↔ McDonald ↔ McD
Missing name components: Phillip Charles Carr ↔ Phillip Carr
Out-of-order name components: Diaz, Carlos Alfonzo ↔ Carlos Alfonzo Diaz
Initials: J. E. Smith ↔ James Earl Smith
Names split inconsistently across database fields: Dick . Van Dyke ↔ Dick Van . Dyke
Same name in multiple languages: Mao Zedong ↔ Мао Цзэдун ↔ 毛泽东 ↔ 毛澤東
Semantically similar names: Eagle Pharmaceuticals, Inc. ↔ Eagle Drugs, Co.
Semantically similar names across languages: Nippon Telegraph and Telephone Corporation ↔ 日本電信電話株式会社
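To make a few of these phenomena concrete, here is a toy, rule-based sketch covering only three of them: missing spaces or hyphens, out-of-order components, and initials. It is not how Rosette matches names (the page describes a statistical, machine-learned approach); the helper names `components`, `component_matches`, and `same_name` are invented for the example, and the pairing is greedy, so it can miss valid pairings in unusual cases.

```python
import re


def components(name):
    """Split a name into lowercase components; 'Last, First' becomes 'First Last'."""
    if "," in name:
        last, _, rest = name.partition(",")
        name = f"{rest} {last}"
    # Treat periods and hyphens as separators.
    return re.sub(r"[.\-]", " ", name).lower().split()


def component_matches(a, b):
    """Components match if equal, or if one is a single-letter initial of the other."""
    if a == b:
        return True
    short, long_ = sorted((a, b), key=len)
    return len(short) == 1 and long_.startswith(short)


def same_name(name1, name2):
    c1, c2 = components(name1), components(name2)
    # Missing spaces or hyphens: compare with all separators removed.
    if "".join(c1) == "".join(c2):
        return True
    # Out-of-order components and initials: every component must pair up, in any order.
    if len(c1) != len(c2):
        return False
    remaining = list(c2)
    for a in c1:
        hit = next((b for b in remaining if component_matches(a, b)), None)
        if hit is None:
            return False
        remaining.remove(hit)
    return True


print(same_name("MaryEllen", "Mary-Ellen"))
print(same_name("Diaz, Carlos Alfonzo", "Alfonzo Carlos Diaz"))
print(same_name("J. E. Smith", "James Earl Smith"))
```

Rules like these return only yes or no and need a new rule for each phenomenon, which is one reason a scored, statistical comparison is attractive for the remaining cases such as nicknames, transliteration, and cross-language matches.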

Try the Demo


Rosette Cloud

Easy to Use

Built for the most demanding text analytics applications and engineered to deliver high accuracy without sacrificing speed, Rosette Cloud is instantly accessible and offers a variety of plans to suit both startups and enterprises.

Our matching endpoint supports pairwise matching, generating a match score for any two names, locations, or organizations entered by the user. If you need to search for name matches against extensive databases of entities, talk to our customer engineering team about evaluating our enterprise deployments.

Try name matching and the rest of Rosette Cloud’s endpoints, free up to 10,000 calls/month!

Get a Rosette Cloud Key

Quality Documentation and Support

Customers love our thorough and responsive support team. We also provide in-depth documentation that lists all the features and functions of the various Rosette Cloud endpoints alongside examples in the binding of your choice.

Visit our GitHub for bindings and documentation.

Enterprise Ready

Evaluate Rosette’s functional fit with your business and data needs on Rosette Cloud, knowing that scalable, customizable enterprise deployments are available if you need them.

{
  "name1": {
    "text": "Влади́мир Влади́мирович Пу́тин",
    "language": "rus",
    "entityType": "PERSON"
  },
  "name2": {
    "text": "Vladimir Putin",
    "language": "eng",
    "entityType": "PERSON"
  }
}

{
  "result": {
    "score": 0.9486632809417912
  }
}
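The two JSON documents above read as a request body for the pairwise matching endpoint and the response it returns. A minimal sketch of sending such a request with Python's standard library might look like the following. The endpoint URL and the `X-RosetteAPI-Key` header name are assumptions not stated on this page, and `build_request`, `parse_score`, and `match_names` are hypothetical helpers; check the Rosette Cloud documentation or the official bindings before relying on any of them.

```python
import json
import urllib.request

# Assumed endpoint URL; not given on this page.
NAME_SIMILARITY_URL = "https://api.rosette.com/rest/v1/name-similarity"


def build_request(name1, name2, api_key, url=NAME_SIMILARITY_URL):
    """Build a POST request whose body has the same shape as the sample request above."""
    payload = {"name1": name1, "name2": name2}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "X-RosetteAPI-Key": api_key,  # assumed header name
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )


def parse_score(response_body):
    """Extract the score from a response shaped like the sample response above."""
    return json.loads(response_body)["result"]["score"]


def match_names(name1, name2, api_key):
    """Send the request and return the match score (performs a network call)."""
    with urllib.request.urlopen(build_request(name1, name2, api_key)) as resp:
        return parse_score(resp.read().decode("utf-8"))


russian = {"text": "Влади́мир Влади́мирович Пу́тин", "language": "rus", "entityType": "PERSON"}
english = {"text": "Vladimir Putin", "language": "eng", "entityType": "PERSON"}
request = build_request(russian, english, api_key="YOUR_API_KEY")
print(request.get_method(), request.full_url)
```

Calling `match_names(russian, english, api_key)` with a valid key would perform the actual request; building and parsing are kept separate so each can be checked without network access.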

Rosette Enterprise

Match against massive databases on premise

For organizations with vast data quantities, unique integration needs, and data security restrictions, we provide on-premise deployments to be hosted on your internal servers. Our enterprise solutions allow you to search for matches against enormous databases. Rosette name matching is built to support fast, accurate matching against tens of millions of entities.

Request Product Evaluation

If your organization requires an on-premise solution, we’re happy to work with you to meet your business’s unique needs. For a free evaluation of Rosette Enterprise, please complete the form below and our Customer Engineering team will provide you with an evaluation package.

Drop Us a Line

EMAIL:
info@basistech.com

PHONE:
+1-617-386-2000

Select Customers Include

Learn More

Blog

An Overview of Fuzzy Name Matching Techniques

Read More
Building a virtual quad to connect an international community

Read More

No coding required

RapidMiner is the industry’s #1 predictive analytics platform. The client platform, RapidMiner Studio, empowers organizations to easily prep data, create models and operationalize predictive analytics within any business process.

Try RapidMiner