Non-breaking Space - Encodings

Encodings

Format: Representation of non-breaking space
Unicode and ISO/IEC 10646: U+00A0 no-break space (HTML: &#160; or &nbsp;). Encoded in UTF-8 as 0xC2 0xA0.
ISO/IEC 8859: 0xA0
CP1252 (MS Windows default in most countries using Germanic or Romance languages): 0xA0
KOI8-R: 0x9A
EBCDIC: 0x41
CP437, CP850, CP866: 0xFF
SGML and HTML (including Wikitext): character entity reference &nbsp;, or numeric character references &#160; or &#xA0;
TeX: tilde (~)
ASCII: not available
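
The byte values in the table can be checked with Python's built-in codecs. The following is a minimal sketch, not a definitive reference: the CODECS mapping and the nbsp_bytes helper are hypothetical names introduced here, and cp037 is assumed as a representative EBCDIC code page (other EBCDIC variants may differ).

```python
import html

NBSP = "\u00a0"  # U+00A0 NO-BREAK SPACE

# Hypothetical mapping from the table's format names to Python codec names.
# cp037 is one EBCDIC code page, chosen here as an assumption.
CODECS = {
    "UTF-8": "utf-8",
    "ISO/IEC 8859-1": "latin-1",
    "CP1252": "cp1252",
    "KOI8-R": "koi8_r",
    "EBCDIC (code page 037)": "cp037",
    "CP437": "cp437",
    "CP850": "cp850",
    "CP866": "cp866",
    "ASCII": "ascii",
}


def nbsp_bytes(codec):
    """Return the bytes for U+00A0 in `codec`, or None if it cannot be encoded."""
    try:
        return NBSP.encode(codec)
    except UnicodeEncodeError:
        return None


for label, codec in CODECS.items():
    encoded = nbsp_bytes(codec)
    shown = encoded.hex(" ").upper() if encoded else "not available"
    print(f"{label}: {shown}")

# The HTML references from the table all decode to the same character.
for ref in ("&nbsp;", "&#160;", "&#xA0;"):
    print(ref, html.unescape(ref) == NBSP)
```

The helper returns None rather than raising, so a format without a non-breaking space (such as ASCII) can be reported in the same loop as the others.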

Unicode defines several other non-breaking space characters that differ from the regular space in width:

  • No-break thin space, known in Unicode as “Narrow No-Break Space” (U+202F narrow no-break space, HTML: &#8239;). It was introduced in Unicode 3.0 for Mongolian, to separate a suffix from the word stem without indicating a word boundary. It is also used in French typography before ?, ! and ;.
  • Word joiner, encoded in Unicode 3.2 and above as U+2060 and in HTML as &#8288;. The word joiner does not normally produce any space but prohibits a line break on either side of it.
  • The byte order mark, U+FEFF, officially named “Zero Width No-Break Space”, can also be used with the same meaning as the word joiner, but this use is deprecated in current documents. See also Zero-width non-breaking space.
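
The three characters listed above can be inspected with Python's unicodedata module. The sketch below is illustrative only: describe and french_spacing are hypothetical helpers introduced here, and french_spacing assumes the input already has an ordinary space before the punctuation mark.

```python
import re
import unicodedata

NNBSP = "\u202f"        # narrow no-break space
WORD_JOINER = "\u2060"  # word joiner
ZWNBSP = "\ufeff"       # zero width no-break space (byte order mark)


def describe(ch):
    """Return the code point, Unicode name and UTF-8 bytes of one character."""
    utf8 = ch.encode("utf-8").hex(" ").upper()
    return f"U+{ord(ch):04X} {unicodedata.name(ch)} UTF-8: {utf8}"


def french_spacing(text):
    """Replace an ordinary space before ?, ! or ; with U+202F.

    Hypothetical helper: it handles only the three marks named in the text
    and does not insert a space where none exists.
    """
    return re.sub(r" (?=[?!;])", NNBSP, text)


for ch in (NNBSP, WORD_JOINER, ZWNBSP):
    print(describe(ch))

print(french_spacing("Vraiment ?") == "Vraiment\u202f?")
```

A lookahead is used in the pattern so that only the space is replaced and the punctuation mark itself is left untouched.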
