HTML Charset

HTML charset is short for HTML character set. It is also referred to as HTML character encoding.

A browser needs to know the character set of a web page in order to turn the bytes it receives into the right characters. Hence, it is important to declare the correct character set so that the browser displays the page correctly.

The default character set in HTML5 is UTF-8, which covers a wide range of letters, symbols, and other characters.
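To see why the declared character set matters, here is a minimal Python sketch of what happens when UTF-8 bytes are read with the wrong character set. The sample word and the variable names are illustrative assumptions; a browser does the equivalent decoding step internally.

```python
# Text as it might be saved on a server, stored as UTF-8 bytes.
original = "café"
raw_bytes = original.encode("utf-8")

# Decoding with the matching character set recovers the text.
correct = raw_bytes.decode("utf-8")

# Decoding the same bytes as Windows-1252 garbles the accented letter,
# because its two UTF-8 bytes are read as two separate characters.
garbled = raw_bytes.decode("cp1252")

print(correct)
print(garbled)
```

The bytes are the same in both cases. Only the character set used to interpret them differs.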

HTML Character Encoding

There are various types of character encoding, which are listed below:

  1. ASCII Character Set in HTML

  2. ANSI Character Set (Windows-1252)

  3. ISO-8859-1 Character Set

  4. UTF-8 Character Set

The character set of a page is declared with a meta tag.

Syntax : 

<meta charset="UTF-8">

ASCII Character Set in HTML

ASCII stands for American Standard Code for Information Interchange. It assigns a unique 7-bit binary number to each character and supports the digits 0 to 9, uppercase and lowercase English letters, and some special characters.

ASCII was once the most popular character set on the web.

ASCII uses the values from 0 to 31 (and 127) for control characters.

ASCII uses values from 32 to 126 for letters, digits, and symbols.

ASCII does not define the values from 128 to 255, because a 7-bit code ends at 127.
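The ranges above can be sketched in Python as follows. The helper name classify_ascii is a hypothetical one chosen for this example and is not part of any library.

```python
def classify_ascii(value: int) -> str:
    """Return which ASCII range a numeric value falls into."""
    if 0 <= value <= 31 or value == 127:
        return "control"
    if 32 <= value <= 126:
        return "printable"
    return "not ASCII"

print(classify_ascii(ord("A")))  # value 65
print(classify_ascii(10))        # line feed
print(classify_ascii(200))       # beyond the 7-bit range

# A character outside ASCII cannot be encoded with the ASCII codec.
try:
    "é".encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
print(ascii_ok)
```

Note that the space character (value 32) counts as printable here, matching the 32 to 126 range given above.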

ANSI Character Set (Windows-1252)

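ANSI stands for American National Standards Institute. Windows-1252 is commonly called ANSI, although it was never an official ANSI standard.

It uses 8 bits per character, so it can represent up to 256 different values.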

ANSI is identical to ASCII for values from 0 to 127.

ANSI has a proprietary set of characters for values from 128 to 159.

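For values from 160 to 255, ANSI uses the same numbers as the Unicode code points of those characters. UTF-8 itself stores each of these characters as two bytes.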
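As a rough illustration, the snippet below decodes a few single bytes with Python's cp1252 codec, which is Python's name for Windows-1252. The chosen byte values are just examples.

```python
# Value 128 falls in the proprietary 128-159 range: the euro sign.
euro = bytes([0x80]).decode("cp1252")

# Value 233 falls in the 160-255 range and decodes to the character
# whose Unicode code point is also 233 (U+00E9).
e_acute = bytes([0xE9]).decode("cp1252")

# Values 0-127 match ASCII.
letter = bytes([0x41]).decode("cp1252")

# UTF-8 represents the same accented letter with two bytes, not one.
e_acute_utf8 = e_acute.encode("utf-8")

print(euro, e_acute, letter, e_acute_utf8)
```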

ISO-8859-1 Character Set

ISO-8859-1 was the default character encoding in HTML 4. It is an extension of the ASCII standard that adds international characters, and it uses a full byte (8 bits) for each character.

ISO-8859-1 is identical to ASCII for values from 0 to 127.

ISO-8859-1 does not use values from 128 to 159.

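For values from 160 to 255, ISO-8859-1 uses the same numbers as the Unicode code points of those characters. UTF-8 itself stores each of these characters as two bytes.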
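The three ranges can be checked with a short Python sketch. One assumption to keep in mind: Python's latin-1 codec maps the values 128 to 159 to invisible control characters instead of rejecting them, so the check below only confirms that none of them is printable.

```python
# Values 160-255: the byte value equals the Unicode code point.
upper_matches_unicode = all(
    bytes([v]).decode("latin-1") == chr(v) for v in range(160, 256)
)

# Values 128-159: no printable character is produced.
any_printable_in_gap = any(
    bytes([v]).decode("latin-1").isprintable() for v in range(128, 160)
)

# Windows-1252 and ISO-8859-1 agree on every value from 160 to 255.
same_as_cp1252 = all(
    bytes([v]).decode("cp1252") == bytes([v]).decode("latin-1")
    for v in range(160, 256)
)

print(upper_matches_unicode, any_printable_in_gap, same_as_cp1252)
```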

UTF-8 Character Set

UTF-8 is a variable-width character encoding that covers almost all of the characters and symbols in the world. Each character takes from one to four bytes, and the ASCII characters keep their single-byte values.

ANSI (Windows-1252) was the original Windows character set, and ISO-8859-1 was the default character set for HTML 4. Each of them supported only 256 different character codes.
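The variable width of UTF-8 can be seen by counting the bytes that different characters need. The sample characters below are arbitrary picks for illustration.

```python
samples = ["A", "é", "€", "😀"]

# Map each character to the number of bytes UTF-8 uses for it.
widths = {ch: len(ch.encode("utf-8")) for ch in samples}

for ch, n in widths.items():
    print(f"U+{ord(ch):04X} takes {n} byte(s) in UTF-8")
```

An 8-bit character set such as Windows-1252 or ISO-8859-1 always uses exactly one byte, which is why it is limited to 256 character codes.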

Why is UTF-8 also supported in HTML 4?

Because ANSI and ISO-8859-1 were so limited, HTML 4 also supported UTF-8. The default character encoding for HTML5 is UTF-8.

UTF-8 Syntax for HTML 4 : 

<meta http-equiv="Content-Type" content="text/html;charset=UTF-8">

UTF-8 Syntax for HTML5 : 

<meta charset="UTF-8">
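Both forms of the meta tag carry the same information. The sketch below uses Python's standard html.parser module to read the declared character set from either form. The class CharsetFinder and the function find_charset are hypothetical names made up for this example, and the logic is a simplification of what a real browser does.

```python
from html.parser import HTMLParser


class CharsetFinder(HTMLParser):
    """Remember the first character set declared in a meta tag."""

    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.charset is not None:
            return
        attributes = dict(attrs)
        if attributes.get("charset"):
            # HTML5 form: <meta charset="...">
            self.charset = attributes["charset"].strip()
        elif (attributes.get("http-equiv") or "").lower() == "content-type":
            # HTML 4 form: the charset is a parameter inside "content".
            content = attributes.get("content") or ""
            for part in content.split(";"):
                name, _, value = part.strip().partition("=")
                if name.strip().lower() == "charset" and value:
                    self.charset = value.strip()


def find_charset(html: str):
    """Return the declared character set, or None if none is found."""
    finder = CharsetFinder()
    finder.feed(html)
    return finder.charset


html4_page = '<meta http-equiv="Content-Type" content="text/html;charset=UTF-8">'
html5_page = '<meta charset="UTF-8">'

print(find_charset(html4_page))
print(find_charset(html5_page))
```

A page without any meta declaration makes find_charset return None, in which case the character set has to come from somewhere else, such as the HTTP headers.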