Connecting the dots
A relationship is the grammatical connection between two entities, described by the action that links them. For example, “Hillary Clinton” connects to “Wellesley College” through the relationship “attend”. Rosette extracts these relationships using a combination of machine learning and semantic rules.
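One way to picture a relationship is as a three-part record: two entities and the action between them. The sketch below is only an illustration of that idea; the `Relationship` class and its field names are hypothetical and are not Rosette's actual output format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    """Hypothetical container for one extracted relationship."""
    subject: str    # the first entity
    predicate: str  # the action connecting the two entities
    obj: str        # the second entity


# The example from the text, expressed as a record.
rel = Relationship("Hillary Clinton", "attend", "Wellesley College")
print(f"{rel.subject} --{rel.predicate}--> {rel.obj}")
```

Keeping the record immutable (`frozen=True`) makes it hashable, so relationships can be stored in sets or used as dictionary keys when deduplicating.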
Extraction is a multi-step process in which Rosette:
- Performs deep syntactic parsing of the sentence and identifies dependencies between words
- Resolves the entities using entity extraction and entity linking for disambiguation
- Extracts and filters the relationships to remove noise
- Clusters and classifies them using training from external sources such as Wikidata
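The last two steps, filtering and classifying, can be illustrated with a toy example. This is a minimal sketch under stated assumptions, not Rosette's implementation: the candidate tuples, the confidence scores, the `MIN_CONFIDENCE` threshold, and the `CANONICAL` lookup table are all hypothetical, and a simple dictionary lookup stands in for the trained clustering and classification.

```python
from collections import defaultdict

# Hypothetical candidates: (subject, surface phrase, object, confidence).
candidates = [
    ("Hillary Clinton", "attended", "Wellesley College", 0.92),
    ("Hillary Clinton", "studied at", "Wellesley College", 0.81),
    ("Hillary Clinton", "mentioned", "Wellesley College", 0.20),
]

# Hypothetical mapping from surface phrases to a canonical relationship type.
CANONICAL = {"attended": "educated_at", "studied at": "educated_at"}

# Hypothetical cutoff below which a candidate is treated as noise.
MIN_CONFIDENCE = 0.5


def filter_and_classify(cands):
    """Drop low-confidence candidates and group the rest by canonical type."""
    grouped = defaultdict(list)
    for subj, phrase, obj, conf in cands:
        if conf < MIN_CONFIDENCE:
            continue  # filtered out as noise
        rel_type = CANONICAL.get(phrase)
        if rel_type is None:
            continue  # no known relationship type for this phrase
        grouped[(subj, rel_type, obj)].append(phrase)
    return dict(grouped)


result = filter_and_classify(candidates)
for (subj, rel_type, obj), phrases in result.items():
    print(subj, rel_type, obj, phrases)
```

Here the two differently worded candidates collapse into one relationship, while the low-confidence one is discarded.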
Wikidata helps clarify the action connecting the entities and supplies surrounding information, such as where and when, that gives the relationship context. Rosette uses this context to improve the accuracy of the relationships.
Rosette relationship extraction includes pre-built targeted extractors for a selection of relationship types that are especially useful in commerce and intelligence applications. You can create your own targeted extractor to find documents mentioning people, locations, and organizations that share the particular relationships you seek.
This is how Rosette knows that the latest “Ghostbusters” was “filmed” in “Boston” by “Paul Feig” and was “released” in “July of 2016”.