Character encoding is a system that maps characters (letters, numbers, symbols) to numeric values that computers can store and process. Computers work only with binary data (0s and 1s); an encoding defines how those bits are interpreted as human-readable text.
Feature | ASCII | Unicode |
---|---|---|
Character Set | Limited to English letters, digits, and basic symbols | Characters from virtually all writing systems |
Number of Characters | 128 | Over 1.1 million code points (about 150,000 currently assigned) |
Bits per Character | 7 | Variable: 8–32 (UTF-8), 16 or 32 (UTF-16), 32 (UTF-32) |
Compatibility | Widely supported but limited | More comprehensive; supports a far wider range of characters |
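The table's encoding forms can be seen in practice with Python's built-in `str.encode`, which reports how many bytes the same text occupies under each scheme. This is a minimal sketch; the string `"A€"` is chosen here only as an illustration of one ASCII character next to one non-ASCII character:

```python
# Compare how the same text is stored under different Unicode encodings.
text = "A€"  # 'A' is in ASCII's range; '€' (euro sign) is not

for encoding in ("utf-8", "utf-16-be", "utf-32-be"):
    encoded = text.encode(encoding)
    print(f"{encoding}: {len(encoded)} bytes -> {encoded.hex(' ')}")

# 'A' alone fits in ASCII's 7-bit range:
print("A".encode("ascii"))  # b'A'
# '€' does not, so "€".encode("ascii") raises UnicodeEncodeError.
```

Note how UTF-8 spends 1 byte on `A` but 3 bytes on `€`, while UTF-32 always spends 4 bytes per character, which is exactly the variable-width behavior summarized in the table.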
Consider the letter "A". In ASCII, it is represented by the decimal value 65, or the binary value 01000001. Unicode assigns it the same value: its code point is U+0041, where 41 is simply 65 written in hexadecimal. This overlap is deliberate — the first 128 Unicode code points match ASCII exactly.
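The correspondence between the decimal value, the binary value, and the U+0041 code-point notation can be checked directly in Python:

```python
# The letter "A" has the same numeric value in ASCII and Unicode.
code_point = ord("A")

print(code_point)              # 65 (decimal)
print(f"U+{code_point:04X}")   # U+0041 (Unicode code point notation, hex)
print(format(code_point, "08b"))  # 01000001 (the 8-bit binary form)
```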
Understanding character encoding is essential for working with text data in computer systems. While ASCII is still widely used for basic text, Unicode provides a more comprehensive and future-proof solution for handling characters from all languages. Choosing the appropriate encoding is crucial for ensuring accurate data representation and communication.