structured-email-address

RFC 5321/5322/6531 conformant email address parser, validator, and normalizer for Rust.

What makes this different?

The Rust email crates compared below stop at RFC validation. This crate goes further:

| Feature | email_address | email-address-parser | This crate |
| --- | --- | --- | --- |
| RFC 5322 grammar | Partial | Full | Full |
| RFC 6531 (UTF-8) | Yes | Yes | Yes |
| Subaddress/+tag extraction | - | - | Yes |
| Provider-aware dot-stripping | - | - | Yes |
| Configurable case folding | - | - | Yes |
| PSL domain validation | - | - | Yes |
| Anti-homoglyph detection | - | - | Yes |
| IDN domain Unicode accessor | - | - | Yes |
| Display name parsing | Yes | - | Yes |
| Configurable strictness | Partial | Partial | Full |
| Serde support | Yes | - | Yes |
| Zero dependencies* | Yes | nom | idna + 3 |

* Dependencies: idna, unicode-normalization, unicode-security. Optional: psl, serde.

Quick Start

use structured_email_address::EmailAddress;

// Parse with defaults (RFC 5322 Standard mode)
let email: EmailAddress = "user+tag@example.com".parse()?;
assert_eq!(email.local_part(), "user+tag");
assert_eq!(email.tag(), Some("tag"));
assert_eq!(email.domain(), "example.com");

// International domains: IDNA roundtrip
let email: EmailAddress = "user@münchen.de".parse()?;
assert_eq!(email.domain(), "xn--mnchen-3ya.de");
assert_eq!(email.domain_unicode(), "münchen.de");
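
To make the local part, tag, and domain accessors concrete, here is a minimal std-only sketch of the split they describe. It is not the crate's parser: `split_address` is a hypothetical helper, and it assumes an unquoted local part and performs no RFC validation or IDNA conversion.

```rust
/// Split an address into (local part, optional +tag, domain).
/// Hypothetical helper for illustration; not part of the crate's API.
/// Returns None when there is no '@' or when either side is empty.
fn split_address(input: &str) -> Option<(&str, Option<&str>, &str)> {
    // Split at the last '@'; an unquoted local part cannot contain '@'.
    let (local, domain) = input.rsplit_once('@')?;
    if local.is_empty() || domain.is_empty() {
        return None;
    }
    // The tag is whatever follows the first '+' in the local part.
    let tag = local.split_once('+').map(|(_, tag)| tag);
    Some((local, tag, domain))
}
```

Under these assumptions, `split_address("user+tag@example.com")` returns the full local part `user+tag`, the tag `tag`, and the domain `example.com`, mirroring the assertions above.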

Configured Parsing

use structured_email_address::{EmailAddress, Config};

let config = Config::builder()
    .strip_subaddress()          // user+tag → user
    .dots_gmail_only()           // a.l.i.c.e@gmail.com → alice@gmail.com
    .lowercase_all()             // USER → user
    .check_confusables()         // detect Cyrillic lookalikes
    .domain_check_psl()          // verify domain in Public Suffix List
    .build();

let email = EmailAddress::parse_with("A.L.I.C.E+promo@Gmail.COM", &config)?;
assert_eq!(email.canonical(), "alice@gmail.com");
assert_eq!(email.tag(), Some("promo"));
assert!(email.is_freemail());
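
The normalization steps in that config can be sketched with the standard library alone. This is an illustration of the idea, not the crate's implementation: `canonicalize` is a hypothetical function, case folding is ASCII-only, only the literal domain `gmail.com` gets dot-stripping, and the confusable and PSL checks are left out.

```rust
/// Apply three normalization steps by hand: lowercase everything,
/// strip the +tag, and remove dots for gmail.com only.
/// Hypothetical helper; returns None for input that has no usable parts.
fn canonicalize(input: &str) -> Option<String> {
    let (local, domain) = input.rsplit_once('@')?;
    // Lowercasing: fold both sides (ASCII-only in this sketch).
    let local = local.to_ascii_lowercase();
    let domain = domain.to_ascii_lowercase();
    // Subaddress stripping: keep what precedes the first '+'.
    let base = local.split('+').next().unwrap_or("");
    // Provider-aware dots: only gmail.com ignores dots in the local part.
    let base = if domain == "gmail.com" {
        base.replace('.', "")
    } else {
        base.to_string()
    };
    if base.is_empty() || domain.is_empty() {
        return None;
    }
    Some(format!("{base}@{domain}"))
}
```

With this sketch, `A.L.I.C.E+promo@Gmail.COM` becomes `alice@gmail.com`, while `A.Lice+x@Example.org` keeps its dot and becomes `a.lice@example.org`.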

Display Names

use structured_email_address::{EmailAddress, Config};

let config = Config::builder().allow_display_name().build();
let email = EmailAddress::parse_with("John Doe <user@example.com>", &config)?;
assert_eq!(email.display_name(), Some("John Doe"));
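
As a rough picture of what display name parsing separates, the sketch below splits a `Name <address>` string by hand. `split_display_name` is a hypothetical helper, and it deliberately ignores quoted display names, comments, and everything else the RFC 5322 grammar allows.

```rust
/// Split "Name <addr>" into (optional display name, address).
/// Hypothetical helper; input without angle brackets is returned as-is.
fn split_display_name(input: &str) -> (Option<&str>, &str) {
    let trimmed = input.trim();
    if let Some(rest) = trimmed.strip_suffix('>') {
        if let Some((name, addr)) = rest.rsplit_once('<') {
            let name = name.trim();
            // An empty name (e.g. "<user@example.com>") counts as no name.
            return ((!name.is_empty()).then_some(name), addr.trim());
        }
    }
    (None, trimmed)
}
```

Here `split_display_name("John Doe <user@example.com>")` gives `(Some("John Doe"), "user@example.com")`, and a bare address comes back with `None` as its name.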

Batch Parsing

Parse thousands of addresses in one call. The config is shared across all inputs, and results preserve input order:

use structured_email_address::{EmailAddress, Config};

let config = Config::builder().strip_subaddress().lowercase_all().build();
let results = EmailAddress::parse_batch(
    &["alice@example.com", "invalid", "bob+tag@example.org"],
    &config,
);
assert!(results[0].is_ok());
assert!(results[1].is_err());
assert!(results[2].is_ok());
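
The order-preserving contract is the important part of batch parsing: one result per input, at the same index, with failures kept in place instead of dropped. A minimal std-only sketch of that shape follows; `parse_one`, `parse_all`, and `ParseError` are hypothetical stand-ins, and the "parser" only checks for a non-empty local part and domain.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
struct ParseError(String);

/// Toy per-address parser: returns (local part, lowercased domain).
fn parse_one(input: &str) -> Result<(String, String), ParseError> {
    match input.rsplit_once('@') {
        Some((local, domain)) if !local.is_empty() && !domain.is_empty() => {
            Ok((local.to_string(), domain.to_ascii_lowercase()))
        }
        _ => Err(ParseError(format!("not an address: {input}"))),
    }
}

/// Map every input to its own Result, so index i of the output
/// always corresponds to index i of the input.
fn parse_all(inputs: &[&str]) -> Vec<Result<(String, String), ParseError>> {
    inputs.iter().map(|input| parse_one(input)).collect()
}
```

Returning a `Vec` of `Result` values, instead of one `Result` wrapping a `Vec`, lets the caller report every bad address without losing the good ones.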

For large lists (10K+), enable the rayon feature for parallel parsing:

# Cargo.toml
structured-email-address = { version = "0.0.1", features = ["rayon"] }

let results = EmailAddress::parse_batch_par(&huge_list, &config);
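
For intuition about how a parallel batch can still preserve input order, here is a std-only sketch that uses scoped threads in place of rayon. It is a swapped-in technique, not what the crate does internally: `check_batch_parallel` and `looks_valid` are hypothetical names, and the validity check is a toy.

```rust
use std::thread;

/// Toy validity check: non-empty text on both sides of the last '@'.
fn looks_valid(input: &str) -> bool {
    matches!(input.rsplit_once('@'), Some((l, d)) if !l.is_empty() && !d.is_empty())
}

/// Check a batch on up to `workers` threads. Each thread handles one
/// contiguous chunk, and chunks are joined in spawn order, so the
/// output order matches the input order.
fn check_batch_parallel(inputs: &[&str], workers: usize) -> Vec<bool> {
    let workers = workers.max(1);
    // Ceiling division; at least 1 because chunks(0) would panic.
    let chunk_len = ((inputs.len() + workers - 1) / workers).max(1);
    thread::scope(|scope| {
        // Spawn all workers first so they run concurrently.
        let handles: Vec<_> = inputs
            .chunks(chunk_len)
            .map(|chunk| {
                scope.spawn(move || chunk.iter().map(|s| looks_valid(s)).collect::<Vec<bool>>())
            })
            .collect();
        // Join in order and concatenate the per-chunk results.
        handles
            .into_iter()
            .flat_map(|handle| handle.join().expect("worker thread panicked"))
            .collect()
    })
}
```

Fixed chunking like this is simpler than rayon's work stealing and balances less well when inputs vary in cost, which is one reason to prefer the rayon feature in practice.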

Batch Benchmarks (baseline)

Workload: 100K emails (mix of valid and invalid) with the strip_subaddress + dots_gmail_only + lowercase_all config. Environment: Apple M1 Pro, Rust 1.85, cargo bench --all-features.

| Variant | Time | Throughput |
| --- | --- | --- |
| parse_batch (sequential) | 49.1 ms | ~2.0M emails/sec |
| parse_batch_par (rayon) | 9.6 ms | ~10.4M emails/sec |

Rayon gives ~5x speedup on this workload.
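
The throughput and speedup figures follow directly from the measured times. The arithmetic can be checked with a small sketch; the function names are illustrative only.

```rust
/// Emails processed per second, given a count and elapsed milliseconds.
fn emails_per_second(count: u64, elapsed_ms: f64) -> f64 {
    count as f64 / (elapsed_ms / 1000.0)
}

/// How many times faster the improved run is than the baseline.
fn speedup(baseline_ms: f64, improved_ms: f64) -> f64 {
    baseline_ms / improved_ms
}
```

With the numbers above, `emails_per_second(100_000, 49.1)` is about 2.04 million, `emails_per_second(100_000, 9.6)` is about 10.4 million, and `speedup(49.1, 9.6)` is about 5.1.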

Strictness Levels

| Level | Grammar | Use case |
| --- | --- | --- |
| Strict | RFC 5321 (envelope) | SMTP validation, reject exotic addresses |
| Standard | RFC 5322 (header) | Default: full grammar, no obsolete forms |
| Lax | RFC 5322 + obs-* | Legacy system interop |
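
One way to picture the three levels is as an enum whose default is the Standard level. The sketch below only restates the table in code; the `Strictness` type and its methods are assumptions for illustration and may not match how the crate exposes strictness.

```rust
/// Hypothetical model of the three strictness levels.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
enum Strictness {
    /// RFC 5321 envelope grammar.
    Strict,
    /// RFC 5322 header grammar, without obsolete forms.
    #[default]
    Standard,
    /// RFC 5322 including the obs-* productions.
    Lax,
}

impl Strictness {
    /// The grammar each level follows, as listed in the table.
    fn grammar(self) -> &'static str {
        match self {
            Strictness::Strict => "RFC 5321 (envelope)",
            Strictness::Standard => "RFC 5322 (header)",
            Strictness::Lax => "RFC 5322 + obs-*",
        }
    }

    /// Only the Lax level accepts obsolete forms.
    fn allows_obsolete_forms(self) -> bool {
        matches!(self, Strictness::Lax)
    }
}
```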

Features

| Feature | Default | Description |
| --- | --- | --- |
| serde | Yes | Serialize/deserialize as canonical string |
| psl | Yes | Domain validation against Public Suffix List |
| rayon | No | Parallel batch parsing via parse_batch_par() |

# Minimal (no serde, no PSL)
structured-email-address = { version = "0.0.1", default-features = false }

Anti-Homoglyph Protection

Detects visually confusable email addresses using Unicode skeleton mapping:

use structured_email_address::confusable_skeleton;

// Cyrillic 'а' (U+0430) vs Latin 'a' (U+0061)
assert_eq!(
    confusable_skeleton("аlice"),  // Cyrillic а
    confusable_skeleton("alice"),  // Latin a
);
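
The idea behind skeleton mapping is to replace each character with a canonical lookalike and then compare the results. The toy version below uses a handful of hand-picked Cyrillic-to-Latin pairs; a real skeleton relies on the full Unicode confusables data, which this sketch does not attempt to reproduce. `toy_skeleton` and `looks_confusable` are hypothetical names.

```rust
/// Map a few Cyrillic letters to their Latin lookalikes.
/// Toy table for illustration; far from complete.
fn toy_skeleton(input: &str) -> String {
    input
        .chars()
        .map(|c| match c {
            '\u{0430}' => 'a', // Cyrillic а
            '\u{0435}' => 'e', // Cyrillic е
            '\u{043E}' => 'o', // Cyrillic о
            '\u{0440}' => 'p', // Cyrillic р
            '\u{0441}' => 'c', // Cyrillic с
            '\u{0445}' => 'x', // Cyrillic х
            other => other,
        })
        .collect()
}

/// Two different strings are confusable when their skeletons match.
fn looks_confusable(a: &str, b: &str) -> bool {
    a != b && toy_skeleton(a) == toy_skeleton(b)
}
```

Comparing skeletons instead of raw strings is what lets a registration flow notice that an address spelled with a Cyrillic а collides visually with an existing Latin one.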

Support the Project

USDT (TRC-20): TFDsezHa1cBkoeZT5q2T49Wp66K8t2DmdA

License

Apache License 2.0
