Mode Encoding

Aztec codes use "modes" to efficiently encode different types of characters. This is like having different dictionaries optimized for different kinds of text.

Why Use Modes?

Imagine you're sending a message and you want to use as few bits as possible. You could:

Use 8 bits for every character (like ASCII) → 8 bits per character
Or notice that the 26 uppercase letters fit in 5 bits (26 < 32 = 2^5) → 5 bits per character

Aztec takes this further with specialized modes for different character types.
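
The 5-bit figure above is the smallest whole number of bits that can tell 26 symbols apart. A minimal Python sketch of that arithmetic follows; the function name `bits_needed` is only illustrative and is not part of any Aztec library.

```python
def bits_needed(symbol_count: int) -> int:
    """Smallest number of bits that can distinguish symbol_count values."""
    if symbol_count < 1:
        raise ValueError("symbol_count must be positive")
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 2, without float rounding.
    return max(1, (symbol_count - 1).bit_length())


print(bits_needed(26))   # uppercase letters only -> 5
print(bits_needed(128))  # 7-bit ASCII range -> 7
print(bits_needed(256))  # every byte value -> 8
```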

The Six Modes

1. Upper Mode (5 bits per character)

For uppercase letters and space.

Code	Character	Code	Character
1	(space)	15	N
2	A	16	O
3	B	17	P
4	C	18	Q
5	D	19	R
6	E	20	S
7	F	21	T
8	G	22	U
9	H	23	V
10	I	24	W
11	J	25	X
12	K	26	Y
13	L	27	Z
14	M

Code 0 and codes 28-31 are reserved for mode switching.

2. Lower Mode (5 bits per character)

For lowercase letters and space. Same structure as Upper, but with a-z instead of A-Z.
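
Because the Upper and Lower tables share one layout (space at 1, letters at 2 through 27), a lookup can be computed rather than stored. The sketch below assumes exactly that layout; `letter_code` and `to_bits` are hypothetical helper names.

```python
def letter_code(ch: str, mode: str) -> int:
    """Return the 5-bit code for ch in 'upper' or 'lower' mode."""
    if mode not in ("upper", "lower"):
        raise ValueError(f"unknown mode: {mode!r}")
    if ch == " ":
        return 1  # space has code 1 in both tables
    first = "A" if mode == "upper" else "a"
    last = chr(ord(first) + 25)
    if len(ch) == 1 and first <= ch <= last:
        return ord(ch) - ord(first) + 2  # letters occupy codes 2..27
    raise ValueError(f"{ch!r} is not available in {mode} mode")


def to_bits(code: int, width: int = 5) -> str:
    """Format a code as a fixed-width binary string."""
    return format(code, f"0{width}b")


print(to_bits(letter_code("H", "upper")))  # 01001
print(to_bits(letter_code("e", "lower")))  # 00110
```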

3. Digit Mode (4 bits per character)

For numbers and a few common characters.

Code	Character
1	(space)
2-11	0-9
12	, (comma)
13	. (period)

Codes 0, 14, and 15 are for mode switching.

Note: Digit mode uses only 4 bits per character, making it very efficient for numbers!
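
To see how quickly the 4-bit codes pay off, the sketch below adds the cost of one 5-bit latch into Digit mode to 4 bits per character. The helper name `digit_run_bits` is hypothetical, and the model ignores the cost of leaving Digit mode afterwards.

```python
def digit_run_bits(count: int, latch_bits: int = 5) -> int:
    """Bits to latch into Digit mode once and then encode `count` characters."""
    if count < 0:
        raise ValueError("count must not be negative")
    return latch_bits + 4 * count


for count in (1, 5, 20):
    total = digit_run_bits(count)
    print(f"{count} digits: {total} bits vs {8 * count} bits as 8-bit characters")
```

A single digit costs 9 bits under this model, slightly worse than 8, but five digits cost 25 bits against 40, and longer runs approach 4 bits per character.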

4. Punctuation Mode (5 bits per character)

For punctuation marks.

Code	Character	Code	Character
1	CR	17	,
2	CR LF	18	-
3	. (space)	19	.
4	, (space)	20	/
5	: (space)	21	:
6	!	22	;
7	"	23	<
8	#	24	=
9	$	25	>
10	%	26	?
11	&	27	[
12	'	28	]
13	(	29	{
14	)	30	}
15	*	31	U/L (switch)
16	+

Notice codes 2-5 encode two-character sequences in a single code—very efficient for common patterns like ". " or ", ".

5. Mixed Mode (5 bits per character)

For control characters and special symbols not in other modes.

Code	Character	Code	Character
1	(space)	23	_
2-14	Control chars (^A through ^M)	24	`
15-19	ESC, FS, GS, RS, US	25	|
20	@	26	~
21	\	27	DEL
22	^

Code 0 and codes 28-31 are for mode switching.

6. Byte Mode (8 bits per byte)

For raw binary data or characters not in other modes (like emoji or non-Latin text).

[Mode switch] + [Length] + [Bytes...]
   5 bits       5 or 16    8 bits each

Short form: lengths 1-31 use a 5-bit length
Long form: lengths 32 and up use a 5-bit 0 followed by an 11-bit field holding the length minus 31 (for lengths 32-2078)
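
The layout above can be sketched in Python as follows. This is an illustration, not a reference implementation: the function name `byte_mode_segment` is hypothetical, the 5-bit mode-switch code in front is left out, and the long form is assumed to store the length minus 31 in its 11-bit field.

```python
def byte_mode_segment(data: bytes) -> str:
    """Return the length header plus payload bits for one Byte-mode run."""
    n = len(data)
    if n == 0:
        raise ValueError("a Byte-mode run needs at least one byte")
    if n <= 31:
        header = format(n, "05b")        # short form: 5-bit length
    else:
        extended = n - 31                # assumption: field stores length - 31
        if extended > 0x7FF:
            raise ValueError("run too long for the 11-bit length field")
        header = "00000" + format(extended, "011b")  # 5-bit 0 + 11-bit field
    payload = "".join(format(b, "08b") for b in data)
    return header + payload


bits = byte_mode_segment("é".encode("utf-8"))  # two UTF-8 bytes: C3 A9
print(bits[:5], bits[5:13], bits[13:])  # 00010 11000011 10101001
```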

Mode Switching

To use a character from a different mode, you need to switch. There are two types:

Latching (Permanent Switch)

A "latch" changes your current mode until you latch again. Use this when you'll be staying in the new mode for multiple characters.

Latch codes from Upper mode:

Code	Bits	Destination
28	5	Lower
29	5	Mixed
30	5	Digit
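31	5	Byte (a counted shift rather than a true latch: it covers the bytes given by the length field, then returns to the previous mode)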

Shifting (Temporary Switch)

A "shift" changes mode for just one character, then returns. Use this for isolated characters.

Shift codes from Upper mode:

Code	Bits	Destination
0	5	Punctuation (one char)

Multi-Step Switches

Some mode transitions require going through intermediate modes:

Upper → Punct: Must go through Mixed
  Upper --[29]--> Mixed --[30]--> Punct

Lower → Upper: Must go through Digit
  Lower --[30]--> Digit --[14]--> Upper
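
The two paths above cost different numbers of bits, because each latch code is written at the width of the mode the encoder is in at that moment (5 bits in Upper, Lower, and Mixed; 4 bits in Digit). A small sketch using only the codes listed above; the `PATHS` table and `switch_bits` name are illustrative.

```python
# (code, bit width) for each step of the two multi-step switches listed above.
PATHS = {
    ("Upper", "Punct"): [(29, 5), (30, 5)],  # Upper -> Mixed -> Punct
    ("Lower", "Upper"): [(30, 5), (14, 4)],  # Lower -> Digit -> Upper
}


def switch_bits(src: str, dst: str) -> str:
    """Concatenate the latch codes needed to move from src to dst."""
    return "".join(format(code, f"0{width}b") for code, width in PATHS[(src, dst)])


print(switch_bits("Upper", "Punct"))  # 1110111110 (10 bits)
print(switch_bits("Lower", "Upper"))  # 111101110 (9 bits)
```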

Encoding Strategy

The encoder uses a simple look-ahead strategy to choose between latching and shifting:

Look at the next few characters
Count how many can use the potential target mode
If 2 or more → latch (permanent switch)
If just 1 → shift (temporary switch)
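
One way to read these steps as code is shown below. It is a sketch under assumptions: "the next few characters" is taken to be a window of 4 starting at the current character, and the character sets and the name `choose_switch` are illustrative.

```python
import string

UPPER = set(" " + string.ascii_uppercase)
LOWER = set(" " + string.ascii_lowercase)
PUNCT = set('!"#$%&\'()*+,-./:;<=>?[]{}')


def choose_switch(text: str, pos: int, target: set, window: int = 4) -> str:
    """Return 'latch' or 'shift' for the character at pos, which needs `target` mode."""
    if text[pos] not in target:
        raise ValueError("character at pos does not belong to the target mode")
    upcoming = text[pos:pos + window]           # assumed look-ahead window
    usable = sum(1 for ch in upcoming if ch in target)
    return "latch" if usable >= 2 else "shift"


message = "Hello, World!"
print(choose_switch(message, 1, LOWER))  # 'e' followed by "llo" -> latch
print(choose_switch(message, 5, PUNCT))  # lone ',' -> shift
print(choose_switch(message, 7, UPPER))  # 'W' followed by lowercase -> shift
```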

Example: Encoding "Hello, World!"

Let's trace through this string:

Current mode: Upper (default start)

'H' - In Upper mode: code 9 (01001)
'e' - Not in Upper! Look ahead: "ello" all lowercase
      → Latch to Lower: code 28 (11100)
      In Lower mode: 'e' = code 6 (00110)
'l' - In Lower mode: code 13 (01101)
'l' - In Lower mode: code 13 (01101)
'o' - In Lower mode: code 16 (10000)
',' - Not in Lower! Only one punctuation
      → Shift to Punct: code 0 (00000)
      In Punct mode: ',' = code 17 (10001)
      (Back to Lower automatically)
' ' - In Lower mode: code 1 (00001)
'W' - Not in Lower! Look ahead: "World" mostly lowercase
      → Shift to Upper: code 28 (11100)
      In Upper mode: 'W' = code 24 (11000)
      (Back to Lower automatically)
'o' - In Lower mode: code 16 (10000)
'r' - In Lower mode: code 19 (10011)
'l' - In Lower mode: code 13 (01101)
'd' - In Lower mode: code 5 (00101)
'!' - Not in Lower!
      → Shift to Punct: code 0 (00000)
      In Punct mode: '!' = code 6 (00110)

Total encoding:

01001 11100 00110 01101 01101 10000 00000 10001 00001 11100 11000 10000 10011 01101 00101 00000 00110
  H   L/L    e     l     l     o    P/S    ,    space U/S    W     o     r     l     d    P/S    !

That's 85 bits for 13 characters, about 6.5 bits per character, compared with 104 bits (8 per character) in plain ASCII.
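
The trace can be replayed with a deliberately simplified encoder. This sketch is not a full Aztec encoder: it covers only Upper, Lower, and single-character Punctuation codes, always shifts for punctuation and for uppercase letters seen in Lower mode, and always latches from Upper to Lower. The names `PUNCT` and `encode` are illustrative.

```python
import string

# Single-character punctuation in table order; index 0 corresponds to code 6.
PUNCT = '!"#$%&\'()*+,-./:;<=>?[]{}'


def encode(text: str) -> str:
    """Encode text as space-separated 5-bit codes, starting in Upper mode."""
    codes = []
    mode = "upper"
    for ch in text:
        if ch == " ":
            codes.append(1)                       # space is code 1 in both modes
        elif ch in PUNCT:
            codes += [0, PUNCT.index(ch) + 6]     # P/S, then the punctuation code
        elif ch in string.ascii_uppercase:
            if mode == "lower":
                codes.append(28)                  # U/S: one uppercase character
            codes.append(ord(ch) - ord("A") + 2)
        elif ch in string.ascii_lowercase:
            if mode == "upper":
                codes.append(28)                  # L/L: latch to Lower
                mode = "lower"
            codes.append(ord(ch) - ord("a") + 2)
        else:
            raise ValueError(f"{ch!r} is outside this sketch's tables")
    return " ".join(format(code, "05b") for code in codes)


bits = encode("Hello, World!")
print(bits)
print(len(bits.replace(" ", "")), "bits")  # 85 bits
```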

Efficiency Comparison

Input	ASCII (8 bits each)	Aztec Encoded
"HELLO"	40 bits	25 bits (Upper mode)
"hello"	40 bits	30 bits (latch + Lower)
"12345"	40 bits	25 bits (latch + Digit)
"Hello"	40 bits	30 bits (mix of modes)

Two-Character Sequences

Punctuation mode has special codes for common patterns:

Code	Sequence	Use Case
2	CR LF	Line endings (Windows)
3	". "	End of sentence
4	", "	Comma in text
5	": "	After colons

These encode two characters in 5 bits instead of 10+!
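
An encoder can take advantage of these codes by checking for a two-character match before falling back to single characters. A minimal greedy sketch, with illustrative names `PAIR_CODES` and `split_pairs`:

```python
# Two-character sequences and their Punctuation-mode codes, from the table above.
PAIR_CODES = {"\r\n": 2, ". ": 3, ", ": 4, ": ": 5}


def split_pairs(text: str) -> list:
    """Split text into tokens, preferring the two-character sequences."""
    tokens = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in PAIR_CODES:
            tokens.append(pair)   # one Punctuation code covers both characters
            i += 2
        else:
            tokens.append(text[i])
            i += 1
    return tokens


print(split_pairs("Hi, you. "))  # ['H', 'i', ', ', 'y', 'o', 'u', '. ']
```

In the "Hello, World!" trace, the comma and the following space took three codes (P/S, ',', space). A P/S followed by code 4 would cover both characters in two codes.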

Next

Galois Fields - The math behind error correction
Reed-Solomon - How error correction works
