What’s in a Persian Name?
How Persian names are composed from history, social class, religious affiliation, and geographic origins
This month we have a special blog post by computational linguist and NLP engineer Zina Saadi.
For this article, we’ll focus on Persian names of Iran, where people speak a dialect of Persian often called Farsi. Many aspects of the language give Farsi its depth and complexity, including its phonology and morphology, possible spelling variations, and borrowings from other languages. A full understanding of the ins and outs of Persian names is essential when building software for fuzzy name matching and translation.
Given Names and Affixes
Interestingly, something as basic — from a Western viewpoint — as surnames didn’t exist in Iran until the rule of the Shah of Iran (1925 to 1941). Before 1925, a full Iranian name was a given name and a collection of affixes: morphemes that appear at the beginning (a prefix), end (a suffix), or middle of a word. For example:
Pre-Shah era, full Persian name:
|Prefix(es) +||Given Name(s) +||Suffix(es)|
Name affixes give insight into the person, speaking to their social class, education, religious affiliation, and origin or birth city.
This name has two prefixes that denote respect:
Hajji حاجى is a person who has completed the pilgrimage to Mecca and
Mir میر, meaning “master,” is a contraction of Amir (امیر), meaning “prince” in Arabic.
The suffix Tehrani تهرانی refers to someone who is from Tehran.
The suffix Alavi علوى here is a nisbah (a form produced by an Arabic morphological transformation rule) indicating descent from the first religious Imam: Imam Ali ibn Abi Talib.
A single affix may often communicate different meanings, depending on where it appears attached to a name. For example, the affix Mirza ميرزا, as a name-prefix, denotes respect. As a suffix, it denotes royal descent.
Example of “Mirza” denoting respect: Mirza Ali (name of a 17th-century Persian physician)
Example of “Mirza” denoting royal descent: Iskander Ali Mirza (Persian: اسکندر على ميرزا)
(Urdu: اسکندر على مرزا)
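The position-dependent reading of an affix can be sketched in a few lines of Python. This is only an illustration: the lookup table `AFFIX_MEANINGS` and the function `interpret_affixes` are hypothetical names, the table holds just the Mirza example from this article, and it assumes a romanized name whose parts are separated by spaces.

```python
# Hypothetical lookup table: it contains only the Mirza example from the article.
AFFIX_MEANINGS = {
    "mirza": {"prefix": "title of respect", "suffix": "royal descent"},
}


def interpret_affixes(full_name):
    """Return (token, position, meaning) for known affixes at either end of a name."""
    tokens = full_name.split()
    found = []
    if len(tokens) < 2:
        # A single token cannot be both the affix and the given name.
        return found
    for position, token in (("prefix", tokens[0]), ("suffix", tokens[-1])):
        meanings = AFFIX_MEANINGS.get(token.lower())
        if meanings and position in meanings:
            found.append((token, position, meanings[position]))
    return found


print(interpret_affixes("Mirza Ali"))
print(interpret_affixes("Iskander Ali Mirza"))
```

A real system would need a much larger affix inventory and would have to handle affixes that are written attached to the name rather than separated by a space.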
Post-Shah Era: Farsi Surnames
During the reign of the Shah of Iran, there was a notable shift in Farsi name format, so Farsi names are divided into pre-Shah-era and post-Shah-era forms. For instance, suppose you are reading Persian philosophy books and you encounter the name ملاصدرا, which could be transliterated as Mulla Sadra or Mollasadra. This given name is the name of 17th-century Persian philosopher Ṣadr ad-Dīn Muḥammad Shīrāzī. The name-prefix is Mulla (also pronounced as Molla) plus Sadra, which is a contraction of the name Ṣadr ad-Dīn. Because the surname Muḥammad Shīrāzī is present, the full form follows the post-1925 naming convention.
(a title specific to a religious group)
|Mollasadra||Pre-Shah-era name (alternative transliteration)|
(a contraction of Sadra)
|Muḥammad Shīrāzī||Post-Shah-era name|
Since the Shah took power in 1925, surnames have been required, although name affixes remain in use. Below are some examples of the composition of these full names:
|Kivan كيوان||Muhammadi محمدى||Single-word given name, single-word surname|
|Amir Hussein امير حسين||Bahramzadah بهرمزدى||Compound given name, surname with affix|
|Alireza عليرضا||Darya-Bandari دريا بندری||Compound given name, compound surname|
Challenges to Persian Name Translation and Name Matching
Four aspects of the linguistics of the Persian language heavily influence how Persian names are formed: phonological and morphological specifications; orthographic (spelling) variations; and cross-lingual borrowings.
Similar to English, Farsi consonants and vowels can have more than one pronunciation and the same sound can be represented by different characters.
Table A: Each character maps to one sound.
Table B: One sound (see the IPA column) maps to more than one character.
Table C: One sound maps to more than one vowel/character.
|/ɒː/||Arezou/Shadi||Aleph and aleph with madda can both be pronounced /ɒː/|
|/uː/||Oraee/Mousa||Aleph vav and vav can both be pronounced as the long vowel /uː/|
|/ɒi/||Ajay||Final aleph yaa sukun is pronounced as /ɒi/|
|/ow/||Showvan||Dhamma vav sukun is pronounced as /ow/|
|/æi/||Haidar||Fatha yaa sukun is pronounced as /æi/|
|/ei/||Oveissi||Kasra yaa sukun is pronounced as /ei/|
|/e/||Ilham/Zhale||Aleph kasra and kasra heh are both pronounced as /e/|
|/o/||Oveissi||Aleph dhamma is pronounced as /o/|
|/æ/||Afshar/Yahya||Aleph fatha and yaa fatha are both pronounced as /æ/|
|/iː/||Issa/Pari||Medial yaa and final yaa are both pronounced as the long vowel /iː/|
Farsi given names and surnames may have a variety of prefixes and suffixes attached:
- Prefixes: Por (meaning full) as in
- Suffixes: Zadeh (meaning descendant) as in
- Prefixes and suffixes with the same meaning, such as Nezhad as in
Spelling variations fall into a few categories:
- Foreign names written in Persian often vary in spelling (just as Persian names spelled in English do). Example: Variations of “Leonardo” in Persian (from Iran-News) can be
- Borrowed names from Arabic with a hamza character. The hamza (ء) can be dropped if it ends the name, such as وفاء vs. وفا or if it appears medially, such as in رضائی vs. رضایی or مسئول vs. مسول.
- Inconsistent use of the space character (Unicode code point U+0020) and the zero-width non-joiner character (U+200C). Although this variation is one that really only computers care about, it can make or break a search engine looking for Persian names. Depending on the typist, affixes can be joined to names using either character.
Description of the variation
|Moslemy Zade||Name with affix using Zero-Width Non-Joiner ZWNJ (U+200C) as the space between a name and its affix|
|Taqizadeh||Name with affix using the whitespace (U+0020) character between them|
|Wafa||Borrowed name from Arabic with hamza|
|Wafa||Borrowed name from Arabic without hamza|
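One way to keep these variations from breaking a search is to reduce every name to a comparison key before matching. The sketch below is a minimal illustration under stated assumptions: `match_key` is a hypothetical helper, not part of any library, and it covers only the space/ZWNJ variation, the final hamza, and the ئ versus ی spelling. It does not cover a medial hamza that is dropped outright, as in مسئول vs. مسول. The Persian spelling used for Taqizadeh is assumed for illustration.

```python
ZWNJ = "\u200c"         # zero-width non-joiner
SPACE = "\u0020"        # ordinary space
HAMZA = "\u0621"        # ء (standalone hamza)
YEH_HAMZA = "\u0626"    # ئ (yeh with hamza above)
PERSIAN_YEH = "\u06cc"  # ی


def match_key(name):
    """Collapse a few common spelling variations into one comparison key."""
    # Space-joined, ZWNJ-joined, and concatenated affixes should compare equal.
    key = name.replace(ZWNJ, "").replace(SPACE, "")
    # Medial hamza carried by a yeh: compare ئ as a plain ی (رضائی vs. رضایی).
    key = key.replace(YEH_HAMZA, PERSIAN_YEH)
    # A final hamza is optional (وفاء vs. وفا).
    if key.endswith(HAMZA):
        key = key[:-1]
    return key


# "Taqi zadeh" typed with a space and with a ZWNJ (assumed spelling).
with_space = "\u062a\u0642\u06cc\u0020\u0632\u0627\u062f\u0647"
with_zwnj = "\u062a\u0642\u06cc\u200c\u0632\u0627\u062f\u0647"
print(match_key(with_space) == match_key(with_zwnj))
```

The key is meant for comparison only. The original spelling should still be stored and displayed, since removing the ZWNJ changes how the name is rendered.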
There are two types of borrowings in Persian names: borrowings from Arabic and borrowings from other languages.
Borrowings from Arabic
Names spelled with letters specific to Arabic (not used in purely Persian words), such as thaa (ث), dhaal (ذ), ain (ع), ghain (غ), Saad (ص), Dhaad (ض), Taa (ط), and Zha (ظ), are an indication that the name comes from Arabic. In addition, the Arabic taa marbuta (ة) is written in Persian as a final heh (ه), a final alef (ا), or a final teh (ت). When an Arabic name is written in Persian, the final spelling may therefore be any one of three possibilities. (See table below.) Unless you are familiar with the name’s pronunciation in Persian, it is challenging to know which English transliteration to choose.
|محبوبة Mahbouba||محبوبه Mahboubeh||Arabic taa marbuta replaced by Persian ه|
|سميرة Samira||سميرا Samira||Arabic taa marbuta replaced by Persian ا|
|هداية Hedaya||هدايت Hedayat||Arabic taa marbuta replaced by Persian ت|
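Both observations about Arabic borrowings can be turned into simple heuristics. The following is a rough sketch, not a definitive method: `has_arabic_specific_letter` and `persian_spelling_candidates` are hypothetical names, the letter set is exactly the list given above, and the second function only enumerates the three possible endings without guessing which one a particular name actually uses.

```python
# Letters listed above as specific to Arabic: ث ذ ع غ ص ض ط ظ
ARABIC_SPECIFIC = set("\u062b\u0630\u0639\u063a\u0635\u0636\u0637\u0638")

TAA_MARBUTA = "\u0629"                            # ة
PERSIAN_ENDINGS = ("\u0647", "\u0627", "\u062a")  # ه, ا, ت


def has_arabic_specific_letter(name):
    """Heuristic hint that a name was borrowed from Arabic."""
    return any(ch in ARABIC_SPECIFIC for ch in name)


def persian_spelling_candidates(arabic_name):
    """List the Persian spellings an Arabic name ending in taa marbuta may take."""
    if not arabic_name.endswith(TAA_MARBUTA):
        return [arabic_name]
    stem = arabic_name[:-1]
    return [stem + ending for ending in PERSIAN_ENDINGS]


mahbouba = "\u0645\u062d\u0628\u0648\u0628\u0629"  # محبوبة
for candidate in persian_spelling_candidates(mahbouba):
    print(candidate)
```

A matcher could index all three candidates for a name, so that a query in any of the spellings still finds the record.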
Borrowings from other languages
For names borrowed from languages other than Arabic, Persians apply their own pronunciation to the characters, which can be a challenge to matching names across languages.
Name in Persian and pronunciation
Origin of borrowed name
Meaning of name
Applications doing fuzzy name matching and translation for Persian need a deep understanding beginning with “What is a full name?” Linguistically speaking, what aspects of the language affect how a name is written? What social and cultural backgrounds should be considered when working with Persian names?
As this blog post has shown, Persian names are particularly tricky to fuzzy match, search for, or translate for several reasons:
- Names can have the same meaning as a common noun and be literally translated by accident
- There is a one-to-many or many-to-many mapping between Perso-Arabic characters and Latin characters that varies with the context in which a character appears; one sound may map to multiple characters, or vice versa
- There may be a varying number of expected name components
- Electronically, names may be composed in different ways: concatenated, or joined with whitespace or a zero-width non-joiner (ZWNJ, U+200C)
- Borrowings from other languages pose their own issues.
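To make the transliteration problem concrete, here is a small sketch of scoring two Latin-script spellings with the `difflib` module from Python's standard library. It is a generic string-similarity illustration, not the method used by any particular product. The helper `latin_similarity` is a hypothetical name, and any threshold applied to its score would have to be tuned on real data.

```python
from difflib import SequenceMatcher


def latin_similarity(a, b):
    """Score two romanized names from 0.0 to 1.0, ignoring case, spaces, and hyphens."""
    def squash(text):
        return "".join(ch for ch in text.lower() if ch.isalnum())

    return SequenceMatcher(None, squash(a), squash(b)).ratio()


# The two transliterations of ملاصدرا mentioned earlier in the article.
print(latin_similarity("Mulla Sadra", "Mollasadra"))
print(latin_similarity("Taqi Zadeh", "Taqizadeh"))
```

A character-level score like this handles spacing and small vowel differences, but it knows nothing about affixes, name order, or borrowings, which is why the linguistic knowledge described above still matters.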
The name translator and name-matching functions within Rosette will take care of all these complexities. Try them out today by getting a free Rosette API key.