QR Code Character Encoding
QR Code Encoding
The method by which the characters are converted into bits of data and stored in a QR code
- There are four standard encoding modes used by QR codes: numeric, alphanumeric, binary and kanji.
- This tool demonstrates numeric, alphanumeric and binary encoding.
- All characters can be encoded using binary encoding although this may not be the most efficient selection.
- The selected encoding is stored as a 4-bit mode indicator at the beginning of the data.
Table of encoding types for a QR code
| Encoding | Description | 4-bit Mode Indicator |
|---|---|---|
| Numeric | Numeric characters: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 | 0001 |
| Alphanumeric | Uppercase letters, numeric characters, space or one of the characters $ % * + - . / : | 0010 |
| Binary | Any character. | 0100 |
| Kanji | Kanji characters | 1000 |
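As a rough illustration of how a mode might be chosen from the table above, the sketch below picks the most compact of the three modes demonstrated here. The function name `select_mode` and the constant names are hypothetical and chosen for this example; the alphanumeric set is assumed to be the 45 characters of the QR standard (digits, uppercase letters, space and $ % * + - . / :).

```python
# 4-bit mode indicators, copied from the table above.
MODE_INDICATORS = {
    "numeric": "0001",
    "alphanumeric": "0010",
    "binary": "0100",
    "kanji": "1000",
}

# Assumed alphanumeric character set: 10 digits, 26 letters, 9 symbols.
ALPHANUMERIC_CHARS = set("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:")


def select_mode(text: str) -> str:
    """Return the most compact of numeric, alphanumeric and binary for text."""
    if text and all(ch in "0123456789" for ch in text):
        return "numeric"
    if text and all(ch in ALPHANUMERIC_CHARS for ch in text):
        return "alphanumeric"
    # Binary can hold any character, so it is the fallback.
    return "binary"


for sample in ("12345", "HELLO WORLD", "Hello World"):
    mode = select_mode(sample)
    print(sample, "->", mode, MODE_INDICATORS[mode])
```

Kanji mode is left out of the selection because this page does not demonstrate it.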
Numeric Encoding
Numeric encoding can be used if the data is made up entirely of the characters 0, 1, 2, 3, 4, 5, 6, 7, 8 or 9
Encoding the Data
To encode the numeric characters:
- The digits are placed into groups of three from left to right to create values between 0 and 999.
- Each 3-digit value is converted to a 10-bit binary number.
- If there is one remaining digit, it is multiplied by 16 and converted to an 8-bit binary number.
- If there are two remaining digits, the 2-digit value is multiplied by 2 and the result is converted to an 8-bit binary number.
Example: 12345
Digit groups: 123 45
Calculation for 123
- 123 = 0001111011 (10-bit binary value)
Calculation for 45
- 45 × 2 = 90
- 90 = 01011010 (8-bit binary value)
Example: 9876543
Digit groups: 987, 654 and 3
Calculation for 987
- 987 = 1111011011 (10-bit binary value)
Calculation for 654
- 654 = 1010001110 (10-bit binary value)
Calculation for 3
- 3 × 16 = 48
- 48 = 00110000 (8-bit binary value)
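The numeric steps and both worked examples can be reproduced with a short sketch. It follows the grouping and bit widths exactly as described in this section; the function name `encode_numeric` is hypothetical.

```python
def encode_numeric(digits: str) -> list[str]:
    """Encode a digit string into bit groups, following the steps above."""
    if not digits or any(ch not in "0123456789" for ch in digits):
        raise ValueError("numeric mode accepts only the digits 0-9")
    # Group the digits in threes from left to right.
    groups = [digits[i:i + 3] for i in range(0, len(digits), 3)]
    bits = []
    for group in groups:
        value = int(group)
        if len(group) == 3:
            bits.append(format(value, "010b"))      # 0-999 as 10 bits
        elif len(group) == 2:
            bits.append(format(value * 2, "08b"))   # two digits: x2, 8 bits
        else:
            bits.append(format(value * 16, "08b"))  # one digit: x16, 8 bits
    return bits


print(encode_numeric("12345"))    # ['0001111011', '01011010']
print(encode_numeric("9876543"))  # ['1111011011', '1010001110', '00110000']
```

Multiplying by 2 or by 16 shifts the value left by one or four bits, so each 8-bit result is the 7-bit or 4-bit value followed by zero bits.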
Alphanumeric Encoding
Alphanumeric encoding can be used if the data is made up entirely of the following:
- Numeric digits
- Upper case letters
- Space
- The characters $ % * + - . / :
| Character | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|---|
| Value | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
| Character | 9 | A | B | C | D | E | F | G | H |
| Value | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 |
| Character | I | J | K | L | M | N | O | P | Q |
| Value | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 |
| Character | R | S | T | U | V | W | X | Y | Z |
| Value | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 |
| Character | (space) | $ | % | * | + | - | . | / | : |
| Value | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 |
Encoding the Data
To encode the alphanumeric characters:
- Look up the value of each alphanumeric character.
- The values are taken in pairs.
- The first value of each pair is multiplied by 45 and added to the second value to form a value between 0 and 2024.
- This value is then converted to an 11-bit binary number.
- If there is one remaining character, its value is multiplied by 4 and converted to an 8-bit binary number.
Example: CAT
Character groups: CA T
Calculation for C and A
- The values for C and A are 12 and 10
- 45 × 12 + 10 = 550
- 550 = 01000100110 (11-bit binary value)
Calculation for T
- The value for T is 29
- 4 × 29 = 116
- 116 = 01110100 (8-bit binary value)
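A minimal sketch of the alphanumeric steps, using the bit widths described in this section. The names `ALPHANUMERIC` and `encode_alphanumeric` are hypothetical; the position of each character in the string is assumed to be its value (0-9, then A-Z, then space and $ % * + - . / :).

```python
# A character's index in this string is its alphanumeric value (0 to 44).
ALPHANUMERIC = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:"


def encode_alphanumeric(text: str) -> list[str]:
    """Encode text into bit groups, following the steps above."""
    # str.index raises ValueError for characters outside the set.
    values = [ALPHANUMERIC.index(ch) for ch in text]
    bits = []
    # Take the values in pairs: first * 45 + second, stored as 11 bits.
    for i in range(0, len(values) - 1, 2):
        bits.append(format(values[i] * 45 + values[i + 1], "011b"))
    # A leftover character is multiplied by 4 and stored as 8 bits.
    if len(values) % 2:
        bits.append(format(values[-1] * 4, "08b"))
    return bits


print(encode_alphanumeric("CAT"))  # ['01000100110', '01110100']
```

For CAT, the pair C and A gives 45 × 12 + 10 = 550 and the leftover T gives 4 × 29 = 116.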
Binary Encoding
Binary encoding can be used for any character that has a Unicode value; each character is stored as one or more bytes using UTF-8 encoding.
For values between 0 and 127, one byte of data is used and the character is encoded using its ASCII value.
| ASCII value | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Character | ! | " | # | $ | % | & | ' | ( | ) | * | + | , |
| ASCII value | 45 | 46 | 47 | 48 | 49 | 50 | 51 | 52 | 53 | 54 | 55 | 56 |
| Character | - | . | / | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
| ASCII value | 57 | 58 | 59 | 60 | 61 | 62 | 63 | 64 | 65 | 66 | 67 | 68 |
| Character | 9 | : | ; | < | = | > | ? | @ | A | B | C | D |
| ASCII value | 69 | 70 | 71 | 72 | 73 | 74 | 75 | 76 | 77 | 78 | 79 | 80 |
| Character | E | F | G | H | I | J | K | L | M | N | O | P |
| ASCII value | 81 | 82 | 83 | 84 | 85 | 86 | 87 | 88 | 89 | 90 | 91 | 92 |
| Character | Q | R | S | T | U | V | W | X | Y | Z | [ | \ |
| ASCII value | 93 | 94 | 95 | 96 | 97 | 98 | 99 | 100 | 101 | 102 | 103 | 104 |
| Character | ] | ^ | _ | ` | a | b | c | d | e | f | g | h |
| ASCII value | 105 | 106 | 107 | 108 | 109 | 110 | 111 | 112 | 113 | 114 | 115 | 116 |
| Character | i | j | k | l | m | n | o | p | q | r | s | t |
| ASCII value | 117 | 118 | 119 | 120 | 121 | 122 | 123 | 124 | 125 | 126 | 127 | |
| Character | u | v | w | x | y | z | { | \| | } | ~ | DEL | |
Multi-Byte Characters
For characters with Unicode values greater than 127, UTF-8 encoding is used and each character is encoded using two or more bytes of data.
The table below shows examples of such characters and the UTF-8 encoded byte values stored in the QR code.
| Character | Unicode Value | UTF-8 Encoding |
|---|---|---|
| ½ | 189 | 194 189 |
| × | 215 | 195 151 |
| π | 960 | 207 128 |
| € | 8364 | 226 130 172 |
| ⇨ | 8680 | 226 135 168 |
| ☺ | 9786 | 226 152 186 |
| ⚅ | 9861 | 226 154 133 |
| 🐼 | 128060 | 240 159 144 188 |
| 😀 | 128512 | 240 159 152 128 |
| 🦋 | 129419 | 240 159 166 139 |
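The byte values in the table can be checked with Python's built-in `str.encode`. The helper name `utf8_bytes` is hypothetical and used only for this sketch.

```python
def utf8_bytes(character: str) -> list[int]:
    """Return the UTF-8 byte values of a single character."""
    return list(character.encode("utf-8"))


for character in "½π€🐼":
    # ord() gives the Unicode value shown in the second column of the table.
    print(character, ord(character), utf8_bytes(character))
```

The number of bytes grows with the Unicode value: two bytes for ½ and π, three for €, and four for 🐼.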
Encoding a character as UTF-8
To encode a Unicode character value as UTF-8:
- Convert the Unicode value to binary, adjusting the leading 0s so that the number starts with 01.
- Split the binary value into groups of six bits from right to left.
- Count the number of bits in the first group and add this to the number of groups.
- If the total is greater than 8, add one to the group count and add an extra group of value 0 at the front.
- Create an 8-bit binary value that starts with one 1 for each group and is filled out with 0s. Add this to the value of the first group.
- Add 10000000 to the value of each remaining group.
The result is a multi-byte character whose first byte indicates the number of bytes, with one leading 1 for each byte, and whose other bytes each start with 10 to mark them as continuation bytes.
Example: π
The Unicode value for π is 960
- 960 is represented by the binary sequence 011 1100 0000
- Split into groups: 01111 000000
- 2 groups plus 5 initial bits = 7, so no extra groups need to be added
- 11000000 + 01111 = 11001111
- The 8-bit binary value 11001111 = 207
- 10000000 + 000000 = 10000000
- The 8-bit binary value 10000000 = 128
The UTF-8 Encoding for π is 207 128
Example: 🐼
The Unicode value for 🐼 is 128060
- 128060 is represented by the binary sequence 01 1111 0100 0011 1100
- Split into groups: 011111 010000 111100
- 3 groups plus 6 initial bits = 9, so an extra group needs to be added
- 11110000 + 0 = 11110000
- The 8-bit binary value 11110000 = 240
- 10000000 + 011111 = 10011111
- The 8-bit binary value 10011111 = 159
- 10000000 + 010000 = 10010000
- The 8-bit binary value 10010000 = 144
- 10000000 + 111100 = 10111100
- The 8-bit binary value 10111100 = 188
The UTF-8 Encoding for 🐼 is 240 159 144 188
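The manual steps can also be written out as code and compared against Python's built-in encoder. This is a sketch of the procedure described on this page for values above 127, not a replacement for `str.encode`; the function name `utf8_by_hand` is hypothetical.

```python
def utf8_by_hand(code_point: int) -> list[int]:
    """Follow the grouping steps above for a Unicode value greater than 127."""
    if not 128 <= code_point <= 0x10FFFF:
        raise ValueError("expects a value from 128 to 0x10FFFF")
    # Binary form with a single leading 0, so that it starts with 01.
    bits = "0" + format(code_point, "b")
    # Split into 6-bit groups from right to left; the first may be shorter.
    groups = []
    while bits:
        groups.insert(0, bits[-6:])
        bits = bits[:-6]
    # If the first group and the byte-count marker do not fit in 8 bits,
    # add an extra leading group of value 0.
    if len(groups[0]) + len(groups) > 8:
        groups.insert(0, "0")
    # First byte: one 1 per group, filled out with 0s, plus the first group.
    prefix = int("1" * len(groups) + "0" * (8 - len(groups)), 2)
    result = [prefix + int(groups[0], 2)]
    # Continuation bytes: 10000000 plus each remaining group.
    result += [0b10000000 + int(group, 2) for group in groups[1:]]
    return result


print(utf8_by_hand(960))     # [207, 128]
print(utf8_by_hand(128060))  # [240, 159, 144, 188]
# Cross-check against the built-in encoder.
print(utf8_by_hand(8364) == list(chr(8364).encode("utf-8")))  # True
```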
Select Encoding for Hello World
Select the required encoding for the data Hello World
| Encoding | Text Value |
|---|---|
| Alphanumeric | HELLO WORLD |
| Binary | Hello World |
