Text Embedding

Compare semantic similarity between words and documents across nine languages

Overview

What is text embedding?

The text embedding endpoint transforms text into a numerical representation (an embedding) of the text’s semantic meaning. If two words or documents have similar embeddings, they are semantically similar. For example, “anchor” and “boat” have close embeddings, while “anchor” and “koala” do not. Similarly, words with the same meaning in different languages, such as “amore” and “love,” have close embeddings.

Semantic similarity at work

A frequent challenge for machine analysis of human language is that the same meaning can be expressed in many ways. Text embeddings can be used for a variety of text analysis tasks, including judging the semantic similarity of two or more texts across languages. Given the embeddings of two documents, phrases, or words, you can evaluate how similar they are in meaning or content, which helps with name matching, news analysis, content sorting and filtering, and more.
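
One common way to compare two embeddings is cosine similarity: the closer the score is to 1, the more closely the two vectors point in the same direction. The sketch below assumes that measure (this page does not say which one Rosette uses) and works on invented three-dimensional vectors, not real embeddings. The function name `cosine_similarity` is illustrative only.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors invented for illustration; real embeddings have many more dimensions.
anchor = [0.9, 0.1, 0.2]
boat = [0.8, 0.2, 0.3]
koala = [-0.1, 0.9, -0.4]

# With these toy values, "anchor" scores closer to "boat" than to "koala".
print(cosine_similarity(anchor, boat) > cosine_similarity(anchor, koala))
```

In practice the same function could be applied to the embeddings returned for any two words, phrases, or documents.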

Product Highlights

  • 9 supported languages
  • Cloud and Enterprise deployments
  • Fast and scalable
  • Industrial-strength support
  • Constantly stress-tested and improved

How It Works

Converting words to vectors

Text embeddings are mathematical representations of text as vectors. They are created by analyzing a body of text and representing each word, phrase, or entire document as a vector in a high-dimensional space (similar to a multi-dimensional graph). Once text has been mapped to vectors, those vectors can be added, subtracted, multiplied, or otherwise transformed to mathematically express or compare the relationships between different words, phrases, and documents.
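
To make the idea of adding and subtracting vectors concrete, here is a minimal sketch using the well-known "king - man + woman" analogy from the word embedding literature. The three-dimensional vectors are invented for illustration and the helper names (`add`, `subtract`, `squared_distance`) are hypothetical; real embeddings are learned from text and have hundreds of dimensions.

```python
def add(a, b):
    """Element-wise sum of two equal-length vectors."""
    return [x + y for x, y in zip(a, b)]


def subtract(a, b):
    """Element-wise difference of two equal-length vectors."""
    return [x - y for x, y in zip(a, b)]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Toy vectors invented for illustration only.
vectors = {
    "king": [0.8, 0.9, 0.1],
    "queen": [0.8, 0.1, 0.9],
    "man": [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "avalanche": [-0.5, 0.3, 0.2],
}

# Transform "king" by removing the "man" direction and adding "woman".
target = add(subtract(vectors["king"], vectors["man"]), vectors["woman"])

# Find the closest remaining word to the transformed vector.
candidates = [w for w in vectors if w not in ("king", "man", "woman")]
nearest = min(candidates, key=lambda w: squared_distance(vectors[w], target))
print(nearest)
```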

Word embedding vs. text embedding

Word embedding, the cutting edge of today’s natural language processing and deep learning technology, maps individual words to vectors. Text embedding takes the process a step further by creating vectors for phrases, paragraphs, and documents as well. Word embeddings show that “king” is similar to “queen” but not to “avalanche,” while text embeddings can show that The Book of John is similar to The Book of Luke but not to Harry Potter.
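
One simple baseline for moving from word vectors to a vector for a longer text is to average the vectors of the words it contains. This page does not describe how Rosette builds its document embeddings, so the sketch below is only an illustration of that baseline, with hypothetical function names and an invented toy vocabulary.

```python
def average_vectors(vectors):
    """Return the element-wise mean of a non-empty list of equal-length vectors."""
    if not vectors:
        raise ValueError("need at least one vector to average")
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("all vectors must have the same length")
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def embed_text(text, word_vectors):
    """Average the vectors of the known words in a whitespace-tokenized text."""
    tokens = text.lower().split()
    known = [word_vectors[t] for t in tokens if t in word_vectors]
    return average_vectors(known)


# Toy two-dimensional vocabulary invented for illustration only.
word_vectors = {
    "king": [1.0, 3.0],
    "queen": [3.0, 1.0],
}

# Words missing from the vocabulary ("and") are skipped.
print(embed_text("King and queen", word_vectors))
```

Averaging ignores word order, which is one reason production systems typically use more sophisticated models.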

Tech Specs

Availability and Platform Support

Deployment Availability:
Bindings:

Supported Languages

  • Arabic
  • Chinese (Simp. & Trad.)
  • English
  • German
  • Japanese
  • Korean
  • Korean (North)
  • Korean (South)
  • Russian
  • Spanish

Rosette Cloud

Easy to Use

Built for the most demanding text analytics applications and engineered to deliver high accuracy without sacrificing speed, Rosette Cloud is instantly accessible and offers a variety of plans to suit both startups and enterprises.

Try text embedding and the rest of Rosette’s endpoints, free up to 10,000 calls/month!

Get a Rosette Cloud Key

Quality Documentation and Support

Customers love our thorough and responsive support team. We also provide in-depth documentation that lists all the features and functions of the various endpoints, alongside examples in the binding of your choice.

Visit our GitHub for bindings and documentation.

Enterprise Ready

Evaluate Rosette’s functional fit with your business and data needs in the cloud, knowing that scalable, customizable enterprise deployments are available if you need them.

{"content": "Cambridge, Massachusetts"}
 
{
  "documentEmbedding": [
    0.0220256,
    0.03633998,
    0.05246549,
    -0.03751056,
    0.0347335,
    0.02479751,
    -0.03860506,
    0.00603574,
    -0.04244069,
    0.00521813,
    0.01740657,
    -0.08501768,
    -0.01918706,
    -0.05974227,
    0.00762913,
    -0.00020686,
    -0.04639495,
    0.00458408,
    0.01220596,
    0.06160719,
    -0.03988802,
    -0.03095652,
    -0.01182547,
    0.04861571,
    0.02967435,
    -0.04560868,
    -0.16111824,
    -0.06562275,
    0.00208866,
    0.01622739,
    -0.09196278,
    0.13520485,
    0.03665138,
    -0.01748736,
    0.05908763,
    0.07113674,
    0.04435388,
    -0.04436791,
    -0.0018729,
    -0.03612895,
    0.00324841,
    0.0218222,
    0.00414962,
    0.02750619,
    -0.00466647,
    -0.03516347,
    0.00061686,
    0.03071387,
    0.060716,
    -0.05394382,
    -0.03460756,
    -0.0916905,
    -0.04351116,
    0.03095916,
    0.07264832,
    0.00440244,
    -0.06487004,
    -0.0124327,
    -0.02594845,
    0.06403252,
    0.05990276,
    0.08421157,
    0.00113943,
    -0.05188083,
    0.01336752,
    0.05737128,
    0.0868928,
    -0.02797472,
    0.02951868,
    -0.06528687,
    -0.02593506,
    -0.1377904,
    0.05021935,
    -0.00331138,
    0.00345429,
    -0.0806604,
    -0.02997256,
    0.04178474,
    -0.16860084,
    -0.00202994,
    0.04082655,
    0.04052638,
    -0.02616019,
    -0.07079905,
    0.04114204,
    -0.05405192,
    -0.02079529,
    0.03362259,
    0.12866253,
    0.04686183,
    0.03205459,
    0.01844979,
    0.10577367,
    -0.04331236,
    0.03550498,
    0.03498939,
    -0.05236725,
    0.05650697,
    -0.03229797,
    -0.05911481,
    0.08041807,
    -0.01093418,
    -0.04541076,
    0.00499057,
    0.03379054,
    0.01985912,
    0.05434353,
    -0.06876269,
    -0.02142489,
    -0.04368682,
    -0.02340091,
    0.04271708,
    -0.03868493,
    0.03260612,
    -0.00310602,
    -0.08135383,
    0.03890613,
    0.05206529,
    0.01902638,
    -0.03261049,
    -0.01225097,
    -0.04929554,
    0.06811376,
    -0.10045446,
    -0.03772711,
    0.06436889,
    0.0335337,
    0.03110947,
    -0.01010367,
    -0.03986244,
    0.01340914,
    -0.06304926,
    0.05365673,
    -0.07044137,
    0.06421522,
    0.0632241,
    -0.04348637,
    0.13118945,
    -0.02082631,
    0.07590587,
    -0.04813327,
    -0.02577493,
    0.05642929,
    0.00033935,
    -0.01024516,
    0.06391647,
    0.03264675,
    -0.02187326,
    0.04832495,
    0.02241259,
    0.05681982,
    -0.04124964,
    0.08708096,
    0.06066873,
    -0.03356391,
    -0.03327714,
    -0.03449181,
    -0.02047219,
    0.06597982,
    0.08629483,
    0.03777988,
    0.01191289,
    0.10955901,
    -0.05159367,
    0.00001431,
    -0.00435081,
    -0.07139333,
    -0.10915583,
    -0.06582265,
    -0.02754464,
    0.04510804,
    0.09508634,
    -0.02923319,
    0.03627863,
    0.02647047,
    0.06838391,
    0.07216309,
    -0.00809051,
    0.07248835,
    0.0123264,
    -0.09173338,
    -0.02095788,
    0.02871792,
    -0.03392723,
    0.05959549,
    -0.10397915,
    -0.03820326,
    -0.05222115,
    -0.02296818,
    -0.06410559,
    0.02745123,
    0.02334865,
    -0.02446206,
    -0.12417631,
    -0.01871051,
    0.02439541,
    -0.02481432,
    -0.03880155,
    0.04188481,
    0.02300973,
    0.10600527,
    0.02696968,
    0.02788247,
    0.05024018,
    0.05907565,
    0.02856795,
    -0.00740766,
    0.02289764,
    -0.0643627,
    -0.00749485,
    -0.03111451,
    0.06580845,
    0.02102997,
    -0.10717536,
    0.16490568,
    0.03047366,
    -0.02454999,
    0.07184675,
    -0.02504459,
    -0.11541119,
    0.03915355,
    -0.03187835,
    -0.05494586,
    -0.15862629,
    -0.02779816,
    0.00724561,
    0.00901807,
    -0.01519001,
    0.04528573,
    -0.05221211,
    0.01260346,
    -0.01652065,
    0.01324382,
    -0.01688977,
    0.01070876,
    -0.03916383,
    -0.03296183,
    -0.06774635,
    -0.05388693,
    -0.01320887,
    0.07467077,
    0.06863626,
    -0.06439278,
    0.06113409,
    -0.00122581,
    -0.0411741,
    0.11657882,
    -0.01979883,
    -0.01714609,
    -0.00621283,
    0.05906631,
    0.00404663,
    0.02791196,
    -0.11955266,
    -0.0623432,
    -0.12302965,
    0.04749805,
    -0.05722075,
    0.08342554,
    -0.0616898,
    0.0171079,
    0.1030134,
    0.00575187,
    -0.01223959,
    -0.01106031,
    0.02733183,
    -0.05465746,
    -0.00639093,
    0.10582153,
    0.05119603,
    -0.16957831,
    0.0605646,
    0.05737981,
    0.12555394,
    -0.00963913,
    -0.15966235,
    0.06239227,
    -0.01519997,
    -0.00653814,
    -0.01759958,
    -0.00281965,
    -0.07387377,
    0.01542045,
    -0.01574635,
    0.09960862,
    0.06726488,
    0.01381977,
    0.03104461,
    0.05140565,
    -0.08996302,
    0.06713541,
    -0.10765704,
    -0.00975681,
    0.15130819,
    0.0128835,
    -0.00251494,
    -0.02743187,
    0.00955417,
    -0.10639542,
    0.04656886
  ]
}
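
The request and response shown above can be handled with a few lines of client code. The sketch below only builds the JSON request body and parses a response of the shape shown; it makes no network call, and the function names are hypothetical. The field names `content` and `documentEmbedding` are taken from the example, and the sample response is shortened to the first three of the 300 values listed above.

```python
import json


def build_request_body(text):
    """Serialize the text to analyze as a JSON request body."""
    return json.dumps({"content": text})


def parse_document_embedding(response_text):
    """Extract the embedding vector from a JSON response string."""
    payload = json.loads(response_text)
    if not isinstance(payload, dict):
        raise ValueError("response is not a JSON object")
    embedding = payload.get("documentEmbedding")
    if not isinstance(embedding, list) or not embedding:
        raise ValueError("response has no 'documentEmbedding' array")
    return [float(value) for value in embedding]


body = build_request_body("Cambridge, Massachusetts")

# Shortened sample: only the first three values of the example response.
sample_response = '{"documentEmbedding": [0.0220256, 0.03633998, 0.05246549]}'
vector = parse_document_embedding(sample_response)
print(len(vector))
```

How the body is sent (URL, authentication header, binding) depends on your deployment, so consult the Rosette documentation for those details.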

Rosette Enterprise

Customize and scale your text analytics on premise

For organizations with vast quantities of data, unique integration needs, and data security restrictions, we provide on-premise deployments hosted on your internal servers.

Request Product Evaluation

If your organization requires an enterprise solution, we’re happy to work with you to meet your business’s unique needs. For a free evaluation of Rosette Enterprise, please complete the form below, and our Customer Engineering team will provide you with an evaluation package.

Drop Us a Line

EMAIL:
info@basistech.com

PHONE:
+1-617-386-2000

Select Rosette Customers

Blog

Using Deep Learning to Power Multilingual Text Embeddings for Global Analysis, Part I

Blog

Using Deep Learning to Power Multilingual Text Embeddings for Global Analysis, Part II

No coding required

RapidMiner is the industry’s #1 predictive analytics platform. The client platform, RapidMiner Studio, empowers organizations to easily prep data, create models and operationalize predictive analytics within any business process.

Try RapidMiner