Add Python RFC 5322 email address parser (§3.2-§4.4) with 70 tests and compliance matrix #7

Closed
ghost wants to merge 1 commit into main from unknown repository

@ghost ghost commented May 20, 2026

Summary

Implements an RFC 5322-compliant email address parser as specified in issue #1: full ABNF grammar coverage from §3.2 through §4.4, with strict and permissive modes.

Files added

| File | Lines | Description |
|---|---|---|
| parser.py | 750+ | AddressParser class with parse(), parse_address_list(), parse_mailbox_list() |
| test_parser.py | 480+ | 70 test cases across 10 test classes organized by RFC section |
| compliance.md | 107 | Maps all 51 ABNF productions to RFC sections, test cases, and implementation status |

Modified files

| File | Change | Description |
|---|---|---|
| source.md | 2 CAP blocks | Contribution Annotation Protocol blocks (§3.2.1-§3.2.5 and §3.4 areas) |

Acceptance criteria

  • parser.py: AddressParser class with parse(), parse_address_list(), parse_mailbox_list() | ✓
  • Strict mode rejects all obs-* productions; permissive mode accepts them | ✓ (70 tests)
  • Quoted-string handling implements full §3.2.4 (quoted-pair, FWS within quotes) | ✓
  • CFWS correctly handled: stripped from addr-spec, comments extracted and stored | ✓
  • Domain literals support both IPv4 and IPv6 forms per §3.4.1 | ✓
  • Group addresses correctly parsed with member list extraction | ✓
  • test_parser.py — 60+ test cases covering all sections listed above | ✓ (70 tests)
  • compliance.md — maps all ABNF productions to tests and implementation | ✓ (51 productions)
  • All [CAP-ANNOTATION-REQUIRED] markers in source.md populated per CONTRIBUTING.md | ✓ (2 blocks)
  • No external dependencies — pure Python stdlib only | ✓
  • Type hints on all public methods | ✓
  • Parser handles inputs up to 998 characters (RFC 5322 line length limit) | ✓
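The domain-literal criterion above can be illustrated with a small standalone sketch. It is not taken from parser.py: the helper name `classify_domain_literal` is hypothetical, the `IPv6:` tag is assumed to follow RFC 5321 §4.1.3 (RFC 5322 §3.4.1 itself only defines the bracketed dtext form), and folding whitespace between dtext characters is not handled.

```python
import ipaddress


def classify_domain_literal(literal: str) -> str:
    """Return 'ipv4', 'ipv6', or 'general' for a bracketed domain literal.

    Hypothetical helper for illustration; not the PR's implementation.
    Raises ValueError if the text is not bracketed, contains characters
    outside dtext, or carries an "IPv6:" tag with a malformed address.
    """
    if len(literal) < 2 or literal[0] != "[" or literal[-1] != "]":
        raise ValueError("domain literal must be enclosed in square brackets")
    # Simplification: only leading/trailing whitespace is tolerated here.
    body = literal[1:-1].strip(" \t")
    # dtext (RFC 5322 §3.4.1): printable US-ASCII except "[", "]" and "\".
    for ch in body:
        if not (33 <= ord(ch) <= 126) or ch in "[]\\":
            raise ValueError(f"character {ch!r} is not valid dtext")
    # Assumption: IPv6 literals use the "IPv6:" tag from RFC 5321 §4.1.3.
    if body[:5].lower() == "ipv6:":
        ipaddress.IPv6Address(body[5:])  # raises ValueError if malformed
        return "ipv6"
    try:
        ipaddress.IPv4Address(body)
        return "ipv4"
    except ValueError:
        # Syntactically valid dtext that is not an IPv4 address.
        return "general"
```

Using the stdlib `ipaddress` module keeps the sketch consistent with the no-external-dependencies criterion; whether the PR's parser validates addresses this way is not stated.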

Test results

============================== 70 passed in 0.08s ==============================

Test coverage by section

| Section | Tests | Status |
|---|---|---|
| §3.2.1 quoted-pair | 5 | |
| §3.2.2 FWS | 5 | |
| §3.2.3 CFWS/comments | 8 | |
| §3.2.4 quoted-string | 8 | |
| §3.2.5 miscellaneous tokens | 3 | |
| §3.4 address/mailbox/group | 12 | |
| §3.4.1 addr-spec/domain-literal | 8 | |
| §4.4 obsolete addressing | 8 | |
| Edge cases | 5 | |
| Invalid/rejection | 8 | |

Architecture

parser.py
├── RFC5322Address (dataclass) — parsed result
├── RFC5322Lexer — §3.2 lexical analysis (quoted-pair, FWS, CFWS, quoted-string, atoms)
└── AddressParser — recursive descent parser for §3.4 address productions
    ├── parse()          — single mailbox/group
    ├── parse_address_list() — comma-separated addresses
    └── parse_mailbox_list() — comma-separated mailboxes (rejects groups)
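As a rough illustration of the lexical layer described in the tree above, the sketch below consumes one §3.2.2-style comment, including nested comments and quoted-pairs. It is written from the RFC grammar, not from the PR's RFC5322Lexer; the function name `skip_comment` and its return shape are assumptions, and folding whitespace is kept verbatim in the comment body.

```python
from typing import List, Tuple


def skip_comment(text: str, pos: int) -> Tuple[str, int]:
    """Consume one comment that starts at text[pos] == "(".

    Returns (comment_body, index_after_closing_paren). The outermost
    parentheses are dropped from the body; nested ones are kept.
    Hypothetical helper for illustration only.
    """
    if pos >= len(text) or text[pos] != "(":
        raise ValueError("expected '(' at start of comment")
    depth = 0
    body: List[str] = []
    i = pos
    while i < len(text):
        ch = text[i]
        if ch == "\\":
            # quoted-pair: the next character is taken literally.
            if i + 1 >= len(text):
                raise ValueError("dangling backslash in comment")
            body.append(text[i + 1])
            i += 2
            continue
        if ch == "(":
            depth += 1
            if depth > 1:  # keep parentheses of nested comments
                body.append(ch)
        elif ch == ")":
            depth -= 1
            if depth == 0:
                return "".join(body), i + 1
            body.append(ch)
        else:
            body.append(ch)
        i += 1
    raise ValueError("unterminated comment")
```

A depth counter is used here in place of recursion so that deeply nested comments cannot exhaust the call stack; the PR does not say which approach its lexer takes.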

Co-Authored-By: Nebula <noreply@nebula.gg>

Adds AddressParser class with lexer (RFC5322Lexer), recursive-descent
parser, and semantic validator. Supports strict (reject obs-*) and
permissive modes (§4.4). Implements full ABNF grammar chain:

- §3.2.1 quoted-pair, §3.2.2 FWS, §3.2.3 CFWS/comments/nested
- §3.2.4 quoted-string with FWS and quoted-pair
- §3.4 address/mailbox/group/address-list/mailbox-list
- §3.4.1 addr-spec, domain-literal (IPv4/IPv6)
- §4.4 obs-local-part, obs-domain, obs-FWS

Includes 70 tests across 10 test classes, compliance.md mapping
all 51 ABNF productions, and 2 CAP annotation blocks in source.md.

No external dependencies — pure Python stdlib only.

Closes #1

Co-Authored-By: Nebula <noreply@nebula.gg>
@ghost ghost closed this by deleting the head repository May 21, 2026