According to the GSM specification, a standard SMS message can contain up to 140 bytes of data (payload). Standard Latin (ISO-8859-1) character encoding represents a single character using 1 byte, which is 8 bits. Therefore, the maximum number of Latin-1 characters that can be included in an SMS is 140.
GSM encoding represents characters using 7 bits instead of 8. This therefore provides a maximum of 160 characters per SMS.
(140 * 8 bits) / 7 bits = 160
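The same arithmetic can be written as a short Python sketch. The function name max_chars and its parameters are illustrative assumptions, not part of any GSM specification or library.

```python
def max_chars(payload_bytes: int = 140, bits_per_char: int = 7) -> int:
    """Return how many whole characters fit in the SMS payload."""
    total_bits = payload_bytes * 8          # payload size in bits
    return total_bits // bits_per_char      # whole characters only


print(max_chars(140, 8))  # 8-bit characters (ISO-8859-1)
print(max_chars(140, 7))  # 7-bit characters (GSM)
```

With the default 140-byte payload, 8-bit characters give 140 characters per message and 7-bit characters give 160.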
Using 7 bits instead of 8 halves the number of distinct characters the GSM character set can represent, from 256 in ISO-8859-1 to 128. In order to include common characters that are usually represented using the 8th bit, these characters, as well as other symbol characters, must be re-mapped to a combination of lower bits. These re-mapped characters are often referred to as special characters. This re-mapping, in combination with packing 7-bit characters into 8-bit bytes, is called GSM Encoding.
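The packing step can be illustrated with a minimal Python sketch. It assumes the characters have already been mapped to 7-bit GSM code values (septets) and that septets are packed starting from the least significant bit of each byte. The function name pack_septets is hypothetical, and the character re-mapping itself is not shown.

```python
def pack_septets(septets):
    """Pack a sequence of 7-bit values into 8-bit bytes, low bits first."""
    buffer = 0        # bits waiting to be written out
    bit_count = 0     # number of valid bits currently in the buffer
    packed = bytearray()
    for septet in septets:
        buffer |= (septet & 0x7F) << bit_count  # append 7 bits above the pending ones
        bit_count += 7
        while bit_count >= 8:                   # emit every complete byte
            packed.append(buffer & 0xFF)
            buffer >>= 8
            bit_count -= 8
    if bit_count:                               # final partial byte, zero-padded
        packed.append(buffer & 0xFF)
    return bytes(packed)


# Assumption: the unaccented letters a-z have the same code values in the
# GSM default alphabet as in ASCII, so ord() is enough for this input only.
septets = [ord(c) for c in "hello"]
print(pack_septets(septets).hex())  # e8329bfd06
```

Five 7-bit characters (35 bits) fit into 5 bytes here, and 160 septets fill exactly 140 bytes, which matches the calculation above.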
The 7-bit default alphabet is specified in GSM 03.38.