QR Code Character Encoding



QR Code Encoding

The method by which the characters are converted into bits of data and stored in a QR code

  • There are four standard encoding modes used by QR codes: numeric, alphanumeric, binary and kanji.
  • This tool demonstrates numeric, alphanumeric and binary encoding.
  • All characters can be encoded using binary encoding although this may not be the most efficient selection.
  • The selected encoding is stored as a 4-bit mode indicator at the beginning of the data.

Table of encoding types for a QR code

Encoding Description 4-bit Mode Indicator
Numeric Numeric characters: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 0001
Alphanumeric Uppercase letters, numeric characters, space or one of the characters $ % * + - . / : 0010
Binary Any character. 0100
Kanji Kanji characters 1000
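
As an illustration of how the table might be used, the sketch below picks the most compact of the three modes this tool demonstrates and looks up its mode indicator. The names `choose_mode` and `MODE_INDICATORS` are hypothetical, and the alphanumeric set is assumed to be the 45 characters listed in the alphanumeric data values table (digits, uppercase letters, space and $ % * + - . / :).

```python
# Characters accepted by each mode (assumed sets, see the lead-in).
NUMERIC_CHARS = set("0123456789")
ALPHANUMERIC_CHARS = set("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:")

# 4-bit mode indicators taken from the table of encoding types.
MODE_INDICATORS = {
    "numeric": "0001",
    "alphanumeric": "0010",
    "binary": "0100",
    "kanji": "1000",
}


def choose_mode(text: str) -> str:
    """Return the name of the most compact mode that can hold `text`."""
    if text and all(ch in NUMERIC_CHARS for ch in text):
        return "numeric"
    if text and all(ch in ALPHANUMERIC_CHARS for ch in text):
        return "alphanumeric"
    # Binary encoding accepts any character, so it is the fallback.
    return "binary"


print(choose_mode("12345"))                         # numeric
print(choose_mode("HELLO WORLD"))                   # alphanumeric
print(MODE_INDICATORS[choose_mode("Hello World")])  # 0100
```

Lowercase letters are not in the alphanumeric set, which is why Hello World falls through to binary encoding while HELLO WORLD does not.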


Numeric Encoding

Numeric encoding can be used if the data is made up entirely of the characters 0, 1, 2, 3, 4, 5, 6, 7, 8 or 9


Encoding the Data

To encode the numeric characters:

  • The digits are placed into groups of three from left to right to create values between 0 and 999.
  • Each 3-digit value is converted to a 10-bit binary number.
  • If there is one remaining digit, it is multiplied by 16 and converted to an 8-bit binary number (the digit's 4-bit value followed by four 0 bits).
  • If there are two remaining digits, the 2-digit value is multiplied by 2 and the result is converted to an 8-bit binary number (the 7-bit value followed by one 0 bit).

Example: 12345

Digit groups: 123 and 45

Calculation for 123

  • 123 = 0001111011 (10-bit binary value)

Calculation for 45

  • 45 × 2 = 90
  • 90 = 01011010 (8-bit binary value)

Example: 9876543

Digit groups: 987, 654 and 3

Calculation for 987

  • 987 = 1111011011 (10-bit binary value)

Calculation for 654

  • 654 = 1010001110 (10-bit binary value)

Calculation for 3

  • 3 × 16 = 48
  • 48 = 00110000 (8-bit binary value)
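
The numeric steps can be sketched in a few lines of Python. This is a minimal illustration of the grouping and scaling described above, not the tool's own code; the function name `encode_numeric` is hypothetical, and it returns one bit string per digit group rather than a finished QR data stream.

```python
def encode_numeric(digits: str) -> list[str]:
    """Encode a string of digits as one bit string per group of three."""
    if any(ch not in "0123456789" for ch in digits):
        raise ValueError("numeric encoding accepts only the digits 0-9")

    bit_groups = []
    for start in range(0, len(digits), 3):
        group = digits[start:start + 3]
        value = int(group)
        if len(group) == 3:
            # Full group: 0-999 as a 10-bit value.
            bit_groups.append(format(value, "010b"))
        elif len(group) == 2:
            # Two remaining digits: multiply by 2, store in 8 bits.
            bit_groups.append(format(value * 2, "08b"))
        else:
            # One remaining digit: multiply by 16, store in 8 bits.
            bit_groups.append(format(value * 16, "08b"))
    return bit_groups


print(encode_numeric("12345"))    # ['0001111011', '01011010']
print(encode_numeric("9876543"))  # ['1111011011', '1010001110', '00110000']
```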

Alphanumeric Encoding

Alphanumeric encoding can be used if the data is made up entirely of the following:

  • Numeric digits
  • Upper case letters
  • Space
  • The characters $ % * + - . / :

Alphanumeric data values
Character 0 1 2 3 4 5 6 7 8
Value 0 1 2 3 4 5 6 7 8

Character 9 A B C D E F G H
Value 9 10 11 12 13 14 15 16 17

Character I J K L M N O P Q
Value 18 19 20 21 22 23 24 25 26

Character R S T U V W X Y Z
Value 27 28 29 30 31 32 33 34 35

Character (space) $ % * + - . / :
Value 36 37 38 39 40 41 42 43 44


Encoding the Data

To encode the alphanumeric characters:

  • Look up the value of each alphanumeric character.
  • The values are taken in pairs.
  • The first value of each pair is multiplied by 45 and added to the second value to form a value between 0 and 2024.
  • This value is then converted to an 11-bit binary number.
  • If there is one remaining character, its value is multiplied by 4 and converted to an 8-bit binary number (the 6-bit value followed by two 0 bits).

Example: CAT

Character groups: CA and T

Calculation for C and A

  • The values for C and A are 12 and 10
  • 45 × 12 + 10 = 550
  • 550 = 01000100110 (11-bit binary value)

Calculation for T

  • The value for T is 29
  • 4 × 29 = 116
  • 116 = 01110100 (8-bit binary value)
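
The alphanumeric steps can be sketched the same way. This is a minimal illustration under the assumption that character values follow the data values table (0-9, A-Z, then space and $ % * + - . / : as 36 to 44); the names `ALPHANUMERIC` and `encode_alphanumeric` are hypothetical.

```python
# Position in this string is the character's alphanumeric value (0-44).
ALPHANUMERIC = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:"
VALUES = {ch: value for value, ch in enumerate(ALPHANUMERIC)}


def encode_alphanumeric(text: str) -> list[str]:
    """Encode text as one bit string per pair (or final single character)."""
    try:
        values = [VALUES[ch] for ch in text]
    except KeyError as err:
        raise ValueError(f"{err.args[0]!r} is not an alphanumeric character")

    bit_groups = []
    for start in range(0, len(values), 2):
        pair = values[start:start + 2]
        if len(pair) == 2:
            # Pair: 45 * first + second, stored as an 11-bit value.
            bit_groups.append(format(45 * pair[0] + pair[1], "011b"))
        else:
            # Single remaining character: value * 4, stored in 8 bits.
            bit_groups.append(format(pair[0] * 4, "08b"))
    return bit_groups


print(encode_alphanumeric("CAT"))  # ['01000100110', '01110100']
```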


Binary Encoding

Binary encoding can be used for any character that has a Unicode value. Each character is stored as one or more bytes using UTF-8 encoding.

For values between 0 and 127, one byte of data is used and the character is encoded using its ASCII value.

7-bit ASCII Values (33-127)
ASCII value 33 34 35 36 37 38 39 40 41 42 43 44
Character ! " # $ % & ' ( ) * + ,

ASCII value 45 46 47 48 49 50 51 52 53 54 55 56
Character - . / 0 1 2 3 4 5 6 7 8

ASCII value 57 58 59 60 61 62 63 64 65 66 67 68
Character 9 : ; < = > ? @ A B C D

ASCII value 69 70 71 72 73 74 75 76 77 78 79 80
Character E F G H I J K L M N O P

ASCII value 81 82 83 84 85 86 87 88 89 90 91 92
Character Q R S T U V W X Y Z [ \

ASCII value 93 94 95 96 97 98 99 100 101 102 103 104
Character ] ^ _ ` a b c d e f g h

ASCII value 105 106 107 108 109 110 111 112 113 114 115 116
Character i j k l m n o p q r s t

ASCII value 117 118 119 120 121 122 123 124 125 126 127
Character u v w x y z { | } ~ DEL



Multi-Byte Characters

For characters with Unicode values greater than 127, UTF-8 encoding is used and each character is encoded using two or more bytes of data.

The table below shows examples of such characters and the UTF-8 encoded byte values stored in the QR code.

Example Characters and their UTF-8 Encoding
Character Unicode Value UTF-8 Encoding
½ 189 194 189
× 215 195 151
π 960 207 128
€ 8364 226 130 172
⇨ 8680 226 135 168
☺ 9786 226 152 186
⚅ 9861 226 154 133
🐼 128060 240 159 144 188
😀 128512 240 159 152 128
🦋 129419 240 159 166 139
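
The byte values in the table can be reproduced with Python's built-in UTF-8 encoder. The sketch below is only a way to check the table; the helper names `utf8_byte_values` and `utf8_bits` are hypothetical.

```python
def utf8_byte_values(text: str) -> list[int]:
    """Return the UTF-8 bytes of `text` as a list of integers (0-255)."""
    return list(text.encode("utf-8"))


def utf8_bits(text: str) -> list[str]:
    """Return the same bytes as 8-bit binary strings."""
    return [format(byte, "08b") for byte in utf8_byte_values(text)]


# Unicode values taken from the table above; chr() turns each into a character.
for code_point in (189, 215, 960, 8364, 128060):
    print(code_point, utf8_byte_values(chr(code_point)))

print(utf8_bits(chr(960)))  # ['11001111', '10000000']
```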


Encoding a character as UTF-8

To encode a Unicode character value as UTF-8:

  • Convert the Unicode value to binary with exactly one leading 0, so that the number starts with 01.
  • Split the binary value into groups of six bits from right to left. The first (leftmost) group may have fewer than six bits.
  • Count the number of bits in the first group and add this to the number of groups.
  • If the total is greater than 8, add one to the group count and add an extra group of value 0 at the front.
  • Create an 8-bit binary value that starts with one 1 for each group and ends in 0s. Add this to the value of the first group.
  • Add 10000000 to the value of each remaining group.

The result is a multibyte character in which the first byte indicates the number of bytes, with one leading 1 for each byte, and every other byte starts with 10 to mark it as a continuation byte.


Example: π

The Unicode value for π is 960

  • 960 is represented by the binary sequence 011 1100 0000
  • Split into groups: 01111 000000
  • 2 groups plus 5 initial bits = 7, so no extra groups need to be added
  • 11000000 + 01111 = 11001111
  • The 8-bit binary value 11001111 = 207
  • 10000000 + 000000 = 10000000
  • The 8-bit binary value 10000000 = 128

The UTF-8 Encoding for π is 207 128


Example: 🐼

The Unicode value for 🐼 is 128060

  • 128060 is represented by the binary sequence 01 1111 0100 0011 1100
  • Split into groups: 011111 010000 111100
  • 3 groups plus 6 initial bits = 9, so an extra group needs to be added
  • 11110000 + 0 = 11110000
  • The 8-bit binary value 11110000 = 240
  • 10000000 + 011111 = 10011111
  • The 8-bit binary value 10011111 = 159
  • 10000000 + 010000 = 10010000
  • The 8-bit binary value 10010000 = 144
  • 10000000 + 111100 = 10111100
  • The 8-bit binary value 10111100 = 188

The UTF-8 Encoding for 🐼 is 240 159 144 188
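
The manual procedure can be sketched in Python and checked against the built-in encoder. This is an illustrative implementation of the steps as described here, not the tool's own code; the function name `utf8_encode_code_point` is hypothetical, and values below 128 are assumed to take the single-byte ASCII path.

```python
def utf8_encode_code_point(code_point: int) -> list[int]:
    """Encode one Unicode value as UTF-8 bytes using the six-bit group method."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("not a valid Unicode value")
    if code_point < 128:
        return [code_point]  # single-byte ASCII case

    # Binary form with exactly one leading 0.
    bits = "0" + format(code_point, "b")

    # Split into groups of six bits, working from right to left.
    groups = []
    while bits:
        groups.insert(0, bits[-6:])
        bits = bits[:-6]

    # The first byte holds one 1 per byte plus the first group;
    # if that does not fit in 8 bits, add an extra group of value 0.
    if len(groups[0]) + len(groups) > 8:
        groups.insert(0, "0")

    # Leading byte: one 1 per group, padded with 0s, plus the first group.
    lead = int("1" * len(groups) + "0" * (8 - len(groups)), 2)
    first = lead + int(groups[0], 2)

    # Continuation bytes: 10000000 plus each remaining group.
    rest = [0b10000000 + int(group, 2) for group in groups[1:]]
    return [first] + rest


print(utf8_encode_code_point(960))     # [207, 128]
print(utf8_encode_code_point(128060))  # [240, 159, 144, 188]
```

Comparing the result with `list(chr(code_point).encode("utf-8"))` is a quick way to confirm the manual steps for any other value.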

