Topic Extraction

Identify keyphrases and underlying concepts in your text data, even when the concepts are not explicitly mentioned


Quickly get the gist of your content

Topic extraction discovers the central keywords in documents or databases, but unlike categorization or entity extraction, it is not constrained by a finite list of recognized entity types or categories. Instead, the topic endpoint identifies “keyphrases” and “concepts” for the given input based on frequency and linguistic patterns in the text, ranking them according to their relative importance.

Topic extraction allows users to quickly review a list of keyphrases and concepts to get the gist of an article or document. On a macro level, the same principle can be applied to a corpus of documents to understand what ideas are most common amongst them. Knowing the keyphrases and concepts in each document enables users to automatically tag, sort, and organize their data, making it more useful to analysts and database managers.
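
The corpus-level idea above can be sketched in a few lines of Python. This is only an illustration: it assumes topics have already been extracted for each document, and the function names and the shape of the `doc_topics` mapping (document id to list of topic strings) are hypothetical, not part of Rosette.

```python
from collections import Counter, defaultdict


def build_tag_index(doc_topics):
    """Map each topic to the sorted list of document ids tagged with it."""
    index = defaultdict(set)
    for doc_id, topics in doc_topics.items():
        for topic in topics:
            index[topic].add(doc_id)
    return {topic: sorted(ids) for topic, ids in index.items()}


def most_common_topics(doc_topics, n=5):
    """Count how many documents each topic appears in, most frequent first."""
    counts = Counter()
    for topics in doc_topics.values():
        # Use a set so a topic counts at most once per document.
        counts.update(set(topics))
    return counts.most_common(n)


# Hypothetical, hand-written topics for three documents.
doc_topics = {
    "doc-a": ["sports", "American football"],
    "doc-b": ["sports"],
    "doc-c": ["poetry"],
}
print(most_common_topics(doc_topics, n=1))
print(build_tag_index(doc_topics)["sports"])
```

The tag index supports sorting and filtering by topic, while the document counts show which ideas recur most often across the corpus.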

Keyphrases vs. concepts

Keyphrases are significant words or phrases taken directly from the text that Rosette deems representative of the content. They are identified by frequency, taking into account common stop words like “and” or “that,” and by language-specific statistical patterns in where keywords tend to appear. Concepts are themes detected within the text that may not be explicitly mentioned in the input. For example, an article about the Super Bowl may have the concept “sports” or “American football” even if neither term appears in the text.
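
To make the frequency-and-stop-word idea concrete, here is a deliberately simple Python sketch. It is not Rosette's algorithm, which also relies on language-specific statistical patterns. It only splits text into runs of non-stop words and ranks those runs by how often they occur. The stop-word list and function names are illustrative assumptions.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; a real system would use a far larger one.
STOP_WORDS = {"a", "an", "and", "the", "that", "of", "in", "is", "to", "was", "for", "on"}


def candidate_phrases(text):
    """Split text into runs of consecutive non-stop words."""
    phrases = []
    # Punctuation and line breaks act as hard phrase boundaries.
    for fragment in re.split(r"[.,;:!?()\n]+", text.lower()):
        current = []
        for word in re.findall(r"[a-z']+", fragment):
            if word in STOP_WORDS:
                # A stop word ends the current candidate phrase.
                if current:
                    phrases.append(" ".join(current))
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(" ".join(current))
    return phrases


def top_keyphrases(text, n=3):
    """Return the n most frequent candidate phrases."""
    counts = Counter(candidate_phrases(text))
    return [phrase for phrase, _ in counts.most_common(n)]


text = "The Super Bowl is a football game. The Super Bowl is in February."
print(top_keyphrases(text, n=1))
```

A sketch like this can only surface phrases that occur in the text. Concepts such as “sports” that never appear in the input need some external knowledge source, which is outside its scope.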

Topic extraction is currently available only in English, but our on-premise tools can be custom-trained for new languages as needed.

Product highlights

  • English only
  • Extracts keyphrases
  • Identifies concepts
  • Cloud or Enterprise deployments
  • Fast and scalable
  • Industrial-strength support
  • Constantly stress-tested and improved

Tech Specs

Availability and platform support

Deployment availability: Cloud, Enterprise

Supported Languages: English


Topic Extraction Example

Rosette Cloud

Easy to use

Built for the most demanding text analytics applications and engineered to deliver high accuracy without sacrificing speed, Rosette Cloud is instantly accessible and offers a range of call volumes to suit both startups and enterprises.

Try topic extraction and the rest of Rosette’s endpoints. Sign up today for a 30-day free trial!

Get a Rosette Cloud Key

Quality documentation and support

Customers love our thorough and responsive support team. We also provide in-depth documentation that lists all the features and functions of the various endpoints alongside examples in the binding of your choice.

Visit our GitHub for bindings and documentation.

Enterprise ready

Evaluate Rosette’s functional fit with your business and data needs in the cloud, knowing that scalable, customizable enterprise deployments are available if you need them.

{"content": "To Sleep John Keats, 1795 - 1821
O soft embalmer of the still midnight!
 Shutting with careful fingers and benign
Our gloom-pleased eyes, embower’d from the light,
 Enshaded in forgetfulness divine;
O soothest Sleep! if so it please thee, close,
 In midst of this thine hymn, my willing eyes,
Or wait the amen, ere thy poppy throws
 Around my bed its lulling charities;
 Then save me, or the passèd day will shine
Upon my pillow, breeding many woes;
Save me from curious conscience, that still lords
 Its strength for darkness, burrowing like a mole;
Turn the key deftly in the oilèd wards,
 And seal the hushèd casket of my soul. - John Keats

This poem is in the public domain.

John Keats
Born in 1795, John Keats was an English Romantic poet and author of three poems considered to be among the finest in the English language."}

{"keyphrases": [{"phrase": "lulling charities"},
 {"phrase": "O soothest Sleep"},
 {"phrase": "John Keats"},
 {"phrase": "O soft embalmer"},
 {"phrase": "hushèd casket"},
 {"phrase": "English Romantic poet"},
 {"phrase": "forgetfulness divine"},
 {"phrase": "pleased eyes"},
 {"phrase": "passèd day"},
 {"phrase": "oilèd wards"}],

 "concepts": [{"phrase": "John Keats",
 "conceptId": "Q82083"}]}

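A response like the one above could be consumed along these lines. This sketch assumes the response is a JSON object with a `keyphrases` list and a `concepts` list, each holding objects with a `phrase` field; those key names, the abbreviated sample, and the helper name `summarize_topics` are assumptions for illustration and should be checked against the official documentation.

```python
import json

# Abbreviated sample response; the top-level key names are assumed.
SAMPLE_RESPONSE = """
{"keyphrases": [{"phrase": "lulling charities"},
                {"phrase": "John Keats"}],
 "concepts": [{"phrase": "John Keats", "conceptId": "Q82083"}]}
"""


def summarize_topics(raw_json):
    """Return (list of keyphrases, dict mapping concept phrase to concept id)."""
    data = json.loads(raw_json)
    keyphrases = [item["phrase"] for item in data.get("keyphrases", [])]
    concepts = {
        item["phrase"]: item.get("conceptId")
        for item in data.get("concepts", [])
    }
    return keyphrases, concepts


keyphrases, concepts = summarize_topics(SAMPLE_RESPONSE)
print(keyphrases)
print(concepts)
```

Using `.get` with defaults keeps the helper from failing when a document yields no concepts.
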
Rosette Enterprise

Customize and scale your text analytics on premise

For organizations with vast data quantities, unique integration needs, and data security restrictions, we provide on-premise deployments to be hosted on your internal servers.

Request product evaluation

If your organization requires an enterprise solution, we’re happy to work with you to meet your business’s unique needs. For a free evaluation of Rosette Enterprise, please complete the form below and our Customer Engineering team will provide you with an evaluation package.

Drop us a line


