Tag: word embedding

Rosette 1.17.0 Release: Hebrew Name Translation, French Semantic Similarity, Robust Address Matching

Recent Rosette® Cloud and Enterprise releases (1.17.0, 1.16.1) bring expanded language coverage to name translation and semantic similarity, and ease of use to the address matching capability within Rosette Name Indexer. We have also improved Arabic-Arabic and Arabic-English name matching, as well as morphological analysis in various languages.
Hebrew name translation
Name […]

Fuzzy Name Matching Techniques

Methods of name matching and their respective strengths and weaknesses
In a structured database, names are often treated the same as metadata for some other field, like an email, a phone number, or an ID number. But what happens if you only have a name to look up a record? This happens quite frequently since humans tend […]

Word Embeddings for Fuzzy Matching of Organization Names

Rosette’s name matching is enhanced by word embeddings to match based on semantics as well as phonetics
Tracking mentions of particular organizations across news articles, social media, and internal communications is integral to the workflow of dozens of use cases across industries. However, it can be especially challenging to match names of companies and organizations because […]

Minds Converge: A Machine Learning Meeting in Toulon

Basis Technology R&D presents at the International Conference on Learning Representations in France
The International Conference on Learning Representations (ICLR) is an annual gathering of leading machine learning experts working in both industry and academia. This year’s conference was held April 24-26 in Toulon, France. ICLR focuses on a broad range of subjects, with […]

Using Deep Learning to Power Multilingual Text Embeddings for Global Analysis, Part II

Wait! Have you read Part I yet? Check it out, then come on back.
Putting Text Embeddings to Work
Using the updated text embeddings endpoint in Rosette API 1.5, you’ll notice significant accuracy improvements on longer strings of text, both sentences and documents. We’ve also begun to incorporate text embeddings into some of our higher […]

Using Deep Learning to Power Multilingual Text Embeddings for Global Analysis, Part I

A Crash Course in Basic Text Embeddings
A chronic problem with using machines to analyze human language is that the same meaning can be expressed using many different words. Take, for example, the sentence “Bill Gates was educated at Harvard.” There are many ways to express this relationship: Bill Gates studied at Harvard, Bill Gates […]

Rosette API 1.5 Released

Today we’re pleased to announce the launch of Rosette API version 1.5! Updates include new targeted relationship extraction (replacing the previous “open” relationship extraction), changes to entity linking and extraction, improved text embeddings, and expanded support for Chinese, Japanese, Korean, and Vietnamese, including sentiment analysis for Japanese text (beta).
What’s new?
Targeted Relationships
The […]

Notes from the Lab: Fueling New Research into Machine Learning with Wikidata

Basis Technology R&D team pioneers new technique and open-sources WikiSem500, a dataset for multilingual word embedding evaluation
The most time-consuming and expensive aspect of machine learning research is data preparation (aggregation and cleaning), and every data scientist has been frustrated by it. However, the importance of good testing data makes it hard to cut corners. […]