Sunday, 4 May 2014

Latin alphabet extensions

To see this page correctly, you may need a Unicode font such as Arial Unicode MS.
This is a simple JavaScript program designed to render some special characters of the Latin alphabet. Each special character is produced by typing a combination of two symbols in the textarea, according to the following (more or less) logical rules:
  • A forward slash / renders the acute accent, used in Spanish vowels (fa/cil > fácil), Croatian ć (kuc/a > kuća), Polish ś and ź, etc.
  • A backslash \ renders the grave accent, used for example in French vowels (pe\re > père)
  • A 'less than' sign < renders the caron present in several Slavic languages (Croatian č š ž, Czech ě ř, Turkmen ň, Pinyin 3rd tone ǎ ě etc.). It also renders Slovak letters ď ľ ť.
  • A 'greater than' sign > renders a circumflex accent present for example in French ê, or in several Esperanto letters (ĉ ĝ ĥ ĵ ŝ)
  • An opening parenthesis ( renders Romanian/Vietnamese ă, Esperanto ŭ (hodiau( > hodiaŭ) and Turkish ğ.
  • A closing parenthesis ) renders Vietnamese letters ơ ư.
  • A percent sign % renders a dieresis, present in Spanish ü, French ë, German umlaut vowels ä ö ü, etc.
  • An equal sign = renders Hungarian vowels ő ű.
  • An opening square bracket [ renders the macron, present in Latvian, Nahuatl or Tahitian vowels ā ē ī ō ū, also used in the transcription of Arabic vowels.
  • A closing square bracket ] renders an underlined ṉ or ṟ, which may be used in the transliteration of some Asian languages.
  • A plus sign + renders some letters which seem to have a horizontal line inside, such as Polish ł, Croatian đ, Norwegian ø or Maltese ħ. The letter ʉ is used for example to transcribe a Thai sound in some transliterations.
  • An asterisk * renders some letters which have a dot or a circle above them, such as Swedish å, Czech ů, Maltese ċ ġ ż or Polish ż.
  • An underscore _ renders letters which have a dot under them, such as Yoruba ṣ or the letter ṭ, used in Tamazight and in the transcription of some Indian languages. It also renders one of the Vietnamese tones (ạ ẹ ị etc.).
  • A dollar sign $ renders Portuguese nasal vowels ã õ, Vietnamese tone ẽ, Spanish ñ, or Guarani letter g̃
  • The at sign @ renders Vietnamese tone ả ẻ etc., as well as some letters which seem to have a comma or a curve above them (e.g. Latvian ģ).
  • An opening curly bracket { renders letters with 'hooks', such as Polish ą ę, Latvian ķ, Lithuanian ų, Catalan/French ç. b{ and d{ render ɓ and ɗ respectively, which are used in Comorian to represent implosive sounds.
  • A closing curly bracket } renders other letters which cannot be classified under the previous categories: a} > æ, o} > œ. i} and I} render Turkish vowel letters ı and İ. d} and t} render Icelandic ð and þ respectively. s} renders German ß. g} gives ɣ, used in Tamazight. x} renders the schwa ə, used in Azeri and, in transcriptions, for the corresponding vowel sound. n} renders nasal ŋ, which is used in some African languages and transliterations of other languages. z} renders ʐ, which is used in some transliterations of Tamil, for example.
  • Note that q} renders ɔ, which is used to represent an open o sound in some African languages. q>, q/ and q\ render that letter with circumflex, acute and grave accents.
  • Yoruba can use vowels ẹ (e_) and ọ (o_) with acute or grave accents. ẹ́ ẹ̀ ọ́ ọ̀ are rendered by f/ f\ p/ p\ respectively (f follows e in the alphabet, p follows o).
  • Chinese Pinyin tones over letter ü (ǖǘǚǜ) can be rendered with v1, v2 or v/, v3 or v<, v4 or v\ (v[ cannot be used as it renders letter v̄)
  • If needed, three Greek letters can be rendered: d[ > δ, e} > ε, t8 > θ. ε is used in some African languages to represent an open e.
  • Numbers 0 1 2 3 4 are used to render Vietnamese letters ă ơ ư with different tone marks: ắ ằ ẵ ẳ ặ ớ ờ ỡ ở ợ ứ ừ ữ ử ự
  • Numbers 5 6 7 8 9 are used to render Vietnamese letters â ê ô with different tone marks: ấ ầ ẫ ẩ ậ ế ề ễ ể ệ ố ồ ỗ ổ ộ
  • Combinations !* and ?* render the Spanish opening exclamation and question marks ¡ and ¿ respectively.
  • x* renders middle point ·, used for example in Catalan combination l·l. x< renders the diacritic ˇ alone, in case it might be needed.
  • The number sign # can be used to escape any of the previous characters (d#+ > d+ instead of đ).
  • For the combinations, I avoided symbols that are common in ordinary writing, such as . , ; :, and others that may be unavailable on some keyboards (´ ` ¨ ~ &).
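
The rules above amount to a table lookup plus an escape character. The following is only a minimal sketch of that idea, not the program's actual source: the names `COMBOS` and `convert` are hypothetical, and the table holds just a handful of the combinations listed above.

```javascript
// Hypothetical lookup table: a small subset of the two-symbol combinations.
const COMBOS = new Map([
  ["a/", "á"], ["c/", "ć"], ["e\\", "è"], ["s<", "š"], ["c>", "ĉ"],
  ["u(", "ŭ"], ["u%", "ü"], ["d+", "đ"], ["n$", "ñ"],
  ["!*", "¡"], ["?*", "¿"],
]);

// Scan the input left to right, replacing known pairs.
// "#" escapes the symbol that follows it, so that symbol is kept literally.
function convert(input) {
  const chars = Array.from(input);
  let out = "";
  let i = 0;
  while (i < chars.length) {
    const hasNext = i + 1 < chars.length;
    const pair = hasNext ? chars[i] + chars[i + 1] : chars[i];
    if (chars[i] === "#" && hasNext) {
      out += chars[i + 1]; // drop "#", keep the escaped character
      i += 2;
    } else if (COMBOS.has(pair)) {
      out += COMBOS.get(pair); // known combination: emit the special character
      i += 2;
    } else {
      out += chars[i]; // ordinary character: copy unchanged
      i += 1;
    }
  }
  return out;
}

console.log(convert("fa/cil"));  // Spanish example from the list
console.log(convert("hodiau(")); // Esperanto example from the list
console.log(convert("d#+"));     // escaped: the plus sign stays literal
```

Because the escape is checked before the lookup, d#+ first copies d (no pair "d#" exists in the table) and then emits the + untouched.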

| a/ á | a\ à | a% ä | n$ ñ | s< š | c> ĉ | a( ă | u) ư | a[ ā | n] ṉ | d+ đ | a* å | s_ ṣ | s{ ş | a} æ | a@ ả | u= ű |
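
Several of the pairs in this table are a base letter plus a single diacritic, so they could also be derived from Unicode combining marks instead of being listed one by one. This is a sketch under that assumption and may not be how the original program works; `MARKS` and `compose` are hypothetical names, and only symbols with an unambiguous combining mark are included.

```javascript
// Hypothetical map from a trigger symbol to a Unicode combining mark.
const MARKS = new Map([
  ["/", "\u0301"],  // combining acute accent
  ["\\", "\u0300"], // combining grave accent
  ["<", "\u030C"],  // combining caron
  [">", "\u0302"],  // combining circumflex
  ["(", "\u0306"],  // combining breve
  [")", "\u031B"],  // combining horn
  ["%", "\u0308"],  // combining diaeresis
  ["=", "\u030B"],  // combining double acute
  ["[", "\u0304"],  // combining macron
  ["]", "\u0331"],  // combining macron below
  ["_", "\u0323"],  // combining dot below
  ["$", "\u0303"],  // combining tilde
]);

// Append the combining mark and let NFC normalization pick the
// precomposed character when Unicode defines one.
// Returns null for symbols that are not simple combining marks.
function compose(letter, symbol) {
  const mark = MARKS.get(symbol);
  if (mark === undefined) return null;
  return (letter + mark).normalize("NFC");
}

console.log(compose("s", "<")); // s with caron
console.log(compose("u", "=")); // u with double acute
console.log(compose("g", "$")); // no precomposed form: g stays followed by a combining tilde
```

Symbols such as +, *, { and } are left out on purpose: letters like đ, æ or ß are separate characters rather than a base letter plus a mark, and * and { each cover more than one diacritic, so those cases still need an explicit table.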
