

RLP for Lucene

Read how you can build a global-ready search server with Apache Lucene or Solr and the Rosette Linguistics Platform:
Building a Multilingual Search Engine with Apache Lucene



What is Lucene?

Lucene is an open source search toolkit library whose development is sponsored by the Apache Software Foundation.

What is Solr?

Solr is an open source, web-based search service that runs on top of Lucene, also sponsored by the Apache Software Foundation. It adds a schema, administrative tools, cache management, replication and faceted browsing.


Cost-effective, reliable, multilingual search that's easy to deploy

The same multilingual text processing technology used by industry-leading search engines such as FAST, Yahoo!, Bing.com, Google, and Endeca is now available for the open source search solutions Apache Lucene and Solr.

Deploy in Days

Out of the box, you can connect Basis Technology’s Rosette Linguistics Platform (RLP) to Apache Lucene and have robust, accurate multilingual search up and running on your website, intranet or internal network. This combination is a low-cost search solution for enterprises of any size.

Dependability you can bank on

Lucene, a high-performance open source search toolkit, is a popular search solution with over 3,000 installations in organizations including IBM, CNET, and Wikipedia. RLP has a ten-year track record of providing linguistic intelligence that meets the demanding accuracy and performance requirements of major search and text mining providers.

Search to the standards of enterprise search vendors

  • Language identification and full-text search in 54 languages
  • Linguistically improved search in 19 languages, including major European languages, Arabic, Japanese, Chinese, and Korean
  • Entity extraction and faceted navigation in 12 languages
  • A scalable, high-performance architecture

Basis Technology offers a fixed pricing model for RLP, regardless of the number of users or servers.

With Lucene, users enjoy the same quality of experience that they have come to expect from their favorite web and enterprise search engines.

Request an evaluation copy of RLP today with the “RLP for Lucene” module.

All You Need To Do

Download and install the RLP SDK or runtime package. Lucene leverages RLP functions and passes along information such as the location of documents to be indexed. The “RLP for Lucene” module enables RLP to connect to Lucene out of the box. No additional work is needed for Lucene to search text in any language RLP supports. The “RLP for Lucene” module comes with RLP at no additional cost.


RLP Linguistic Capabilities:

  • Language Identification: Identifies the language a document is written in.
  • Language-specific processing: The base linguistics function of RLP is the starting point for building a search index and refining queries. Advanced linguistic features improve precision and recall of search results.
  • Segmentation and tokenization: Separates streams of text into individual word tokens, which is especially needed for languages, such as Chinese and Japanese, that are written without spaces between words.
  • Lemmatization: Provides the dictionary form for an inflected word to improve recall.
  • Noun decompounding: Separates compound words (such as in German and Dutch) into their separate components to improve recall.
  • Part-of-speech tagging: Labels each word with its grammatical category (noun, verb, and so on) to improve precision and recall.
  • Entity extraction: Extracts entities to enable faceted search on key names and entities in search results.
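
To make the recall benefit of lemmatization and noun decompounding concrete, here is a minimal Python sketch. It is a toy illustration only: the dictionaries, function names, and whitespace tokenizer are hypothetical stand-ins and are not part of the RLP or Lucene APIs. A production system would rely on full linguistic dictionaries and models rather than a handful of hard-coded entries.

```python
# Toy illustration only. LEMMAS and COMPOUND_PARTS are hypothetical sample
# data, and none of these functions belong to the RLP or Lucene APIs.

LEMMAS = {"mice": "mouse", "ran": "run", "running": "run", "houses": "house"}
COMPOUND_PARTS = {"haus", "tür", "schlüssel"}
PUNCTUATION = ".,;:!?"


def tokenize(text):
    """Lowercase and split on whitespace, dropping surrounding punctuation.

    This only works for languages written with spaces between words;
    Chinese or Japanese text would need real segmentation instead.
    """
    tokens = (tok.strip(PUNCTUATION).lower() for tok in text.split())
    return [tok for tok in tokens if tok]


def lemmatize(token):
    """Return the dictionary form of a token if known, else the token itself."""
    return LEMMAS.get(token, token)


def decompound(token):
    """Greedily split a token into known parts, longest match first.

    Returns [token] unchanged when no complete split is found. A greedy
    split can miss valid decompositions that would require backtracking.
    """
    parts, start = [], 0
    while start < len(token):
        for end in range(len(token), start, -1):
            if token[start:end] in COMPOUND_PARTS:
                parts.append(token[start:end])
                start = end
                break
        else:
            return [token]
    return parts


def index_terms(text):
    """Collect normalized terms: each lemma plus any compound components."""
    terms = set()
    for token in tokenize(text):
        lemma = lemmatize(token)
        terms.add(lemma)
        terms.update(decompound(lemma))
    return terms


def matches(query, document):
    """A document matches when it contains every normalized query term."""
    return index_terms(query) <= index_terms(document)
```

Under these assumptions, `matches("mouse", "The mice ran.")` succeeds because "mice" is indexed under its dictionary form, and `matches("Schlüssel", "Haustürschlüssel")` succeeds because the German compound is also indexed under its components. Without the normalization step, neither query would find its document, which is the recall gap these features are meant to close.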

Apache Lucene Performance & Scalability

  • Thread safe
  • Cross-platform solution
  • Support for multiple cores
  • Small RAM requirements
  • Incremental indexing as speedy as batch indexing
  • Index size is only about 20-30% of the size of the indexed text
  • Powerful search algorithms

For more details, see http://lucene.apache.org/.

Basis Technology’s solution is strengthened through partnership with Lucid Imagination, a company founded by the top developers of Apache Lucene and Apache Solr, including Erik Hatcher (co-author of Lucene in Action), Yonik Seeley (creator of Solr) and Grant Ingersoll (creator of "Lucene Boot Camp" training).