Category: Text Analytics

Understanding Dari and Pashto Names: A Challenge to Intelligence Gathering in Afghanistan

Shakespeare asked “What’s in a name?” It turns out there’s a lot in a typical Afghan name, including common nouns and personal titles that are not used as titles! One of our linguistic experts, Bushra Zawaydeh, wrote today’s post about the challenges natural language processing software faces in automatically extracting names of people from text written in the […]

Rosette Version 7.6 Released!

Extract New Entity Types and Languages; Enhanced Name Translation and Matching for Spanish and Persian Names. We’re excited to announce the release of Rosette 7.6, which has many improvements and several new features. New Entity Type: Products. Rosette Entity Extractor can now find product names in news and review articles. Product names are key to brand […]

Mission Possible: Connecting Structured and Unstructured Data to Create New Insights

Advanced text analytics can link structured data with unstructured data in ways that were impossible years ago. These capabilities are unlocking insights and enabling new workflows in business domains where entities (people, places, organizations, disease or drug names, and more) are the connectors between data sources. […]

Haven’t I Met You Before? Cross-Document Coreference Resolution

The Dude: Nobody calls me Lebowski. You got the wrong guy. I’m the Dude, man.
Blond Treehorn Thug: Your name’s Lebowski, Lebowski. Your wife is Bunny.
The Dude: My… my wi-, my wife, Bunny? Do you see a wedding ring on my finger? Does this place look like I’m married?
(From The Big Lebowski)
The example […]

Indexing Strategies for Multilingual Search with Solr and Rosette

As a solutions engineer at Basis Technology, I often discuss the integration of Rosette and Apache Solr with our existing and potential clients, who look to Rosette to improve Solr search in many languages (including English). Generally, this Rosette-Solr integration is a fairly simple process that involves little, if any, programming. There are a number […]

Mining Gold from Big Data with Text Analytics

Sunday’s New York Times featured a news analysis article about the age of big data, in which more analysis and technology are being applied to domains that formerly seemed removed from data crunching: political science, sports, advertising, public health, and more. Technology reporter Steve Lohr highlights “a drift toward data-driven discovery and decision-making.” Although the article […]

Arabic and Afghan Name Translation Software Improves Intelligence Analysis

Software suite facilitates inter-agency collaboration; meets federally mandated intelligence standards. Cambridge, June 8, 2011: A crucial tool for U.S. intelligence agencies charged with translating foreign languages vital to national security was unveiled today by Basis Technology. The Highlight language analysis suite version 4.0 quickly and consistently translates the names of people and places from Arabic script to English, […]