Hats off to Newsle! Well done, LinkedIn!
It’s official: two guys in a dorm room is the new startup paradigm. This past Monday, news-tracking site Newsle, which was co-founded by two Harvard sophomores, was acquired by a pillar of social media, LinkedIn.
Here in Cambridge, we are thrilled to see the success of yet another alumnus of our startup program. When we met Jonah and Axel in early 2012, we were impressed with their ideas for Newsle and thought they had real potential.
Newsle saw the value in filtering out the noise of social media by focusing on what public figures, Facebook friends, and LinkedIn contacts are doing rather than saying. (Think: “Wile E. Coyote is named customer of the year at Acme Inc.” vs. “I nearly caught that rabbit!” Mr. Coyote tweeted.) They saw a great opportunity to integrate Basis Technology’s Rosette Entity Extractor to do some of the linguistic heavy lifting, so they could focus on their core software.
“We use Rosette to help us find news about specific companies,” Hansen said. “For example, to see only Apple-related news for Apple’s board member Bill Campbell, Rosette will pick out articles that contain ‘Apple, Inc.’ but won’t catch those with ‘green apple.’”
Rosette locates 18 different entity types (including people, places, and organizations) in unstructured text, in over a dozen languages, and can be adapted and extended to improve accuracy.
“We chose Rosette because it was more accurate and faster than the alternatives we tested,” Hansen said. In 2012, Newsle was processing 100,000 news sources and tracking news on over 3 million people with the numbers only increasing, “so performance is critical to our service.”
Newsle was one of the first companies to take advantage of our text analytics startup program, which makes our best ideas and innovations accessible to high-impact, early-stage companies.
Our startup program includes:
- Access to linguistics across 40+ Asian, European, and Middle Eastern languages
- Easy-as-pie, out-of-box integration with Solr and Elasticsearch
- Full suite of tools required for high-quality multilingual text search, including language identification, tokenization, lemmatization, decompounding, part-of-speech tagging, entity extraction, and cross-lingual fuzzy name matching
- Single API with access to all languages, with additional languages easily turned on as needed
- High-performance libraries tested to scale by the likes of EMC, Bing, Yelp, and more