fasrt

A blazing fast, zero-copy subtitle parser and writer for SRT and WebVTT in Rust.

Installation

[dependencies]
fasrt = "0.2"

Features

  • Zero-copy, zero-allocation parsing — borrows directly from the input string
  • #![no_std] support (with optional alloc and std features)
  • Lazy iterator-based parsing — blocks are yielded on demand
  • DFA-based lexing via logos for fast tokenization
  • Strongly-typed newtypes (Hour, Minute, Second, Millisecond, Percentage) with compile-time validation
  • W3C WebVTT spec conformant — validated against Web Platform Tests
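To illustrate what "zero-copy" and "lazy" mean in practice, here is a minimal standalone sketch. It does not use fasrt's API: the `Blocks` type is hypothetical, and it assumes LF-only line endings. Each yielded block is a `&str` slice of the original input, and nothing is scanned until `next()` is called.

```rust
/// Hypothetical iterator (not part of fasrt) that yields blank-line-separated
/// blocks as slices borrowed from the input. No allocation takes place.
struct Blocks<'a> {
    rest: &'a str,
}

impl<'a> Blocks<'a> {
    fn new(input: &'a str) -> Self {
        Blocks { rest: input }
    }
}

impl<'a> Iterator for Blocks<'a> {
    type Item = &'a str;

    fn next(&mut self) -> Option<&'a str> {
        // Copy the slice out so the result keeps the full input lifetime.
        let rest: &'a str = self.rest;
        // Skip blank lines left over between blocks.
        let input = rest.trim_start_matches('\n');
        if input.is_empty() {
            self.rest = input;
            return None;
        }
        // A block ends at the first blank line, or at the end of the input.
        match input.find("\n\n") {
            Some(end) => {
                self.rest = &input[end..];
                Some(&input[..end])
            }
            None => {
                self.rest = "";
                Some(input.trim_end_matches('\n'))
            }
        }
    }
}
```

Because the iterator holds only a `&str`, it also works in a `no_std` context; collecting into a `Vec` is the only step that would need an allocator.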

SRT

  • Strict and lossy parsing modes
  • Timestamps (HH:MM:SS,mmm)
  • Multiline cue bodies
  • Writer (std feature)
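
The `HH:MM:SS,mmm` timestamp form can be validated without allocating. The sketch below is illustrative only and is not fasrt's implementation: `SrtTimestamp`, `parse_fixed`, and `parse_srt_timestamp` are hypothetical names, and it assumes exactly two digits for hours, which a lossy mode might relax.

```rust
/// Hypothetical timestamp type; fasrt's own newtypes may look different.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct SrtTimestamp {
    hours: u8,
    minutes: u8,
    seconds: u8,
    millis: u16,
}

/// Parses a field that must be exactly `len` ASCII digits.
fn parse_fixed(s: &str, len: usize) -> Option<u16> {
    if s.len() == len && s.bytes().all(|b| b.is_ascii_digit()) {
        s.parse().ok()
    } else {
        None
    }
}

/// Parses `HH:MM:SS,mmm`, rejecting out-of-range minutes and seconds.
fn parse_srt_timestamp(s: &str) -> Option<SrtTimestamp> {
    // SRT separates milliseconds with a comma, unlike WebVTT's dot.
    let (hms, millis) = s.split_once(',')?;
    let mut parts = hms.split(':');
    let hours = parse_fixed(parts.next()?, 2)?;
    let minutes = parse_fixed(parts.next()?, 2)?;
    let seconds = parse_fixed(parts.next()?, 2)?;
    if parts.next().is_some() || minutes > 59 || seconds > 59 {
        return None;
    }
    Some(SrtTimestamp {
        hours: hours as u8, // at most 99, so this fits in u8
        minutes: minutes as u8,
        seconds: seconds as u8,
        millis: parse_fixed(millis, 3)?,
    })
}
```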

WebVTT

  • WEBVTT signature and header text
  • Timestamps (short MM:SS.mmm and long HH:MM:SS.mmm forms, unbounded hours)
  • Cue identifiers (zero-copy &str)
  • Cue settings (vertical, line, position, size, align, region)
  • NOTE, STYLE, REGION blocks
  • Full REGION definition parsing (id, width, lines, regionanchor, viewportanchor, scroll)
  • Float percentages (e.g., 50.5%)
  • CRLF, CR, LF line endings
  • BOM handling
  • Error recovery (--> in cue body, malformed timing lines)
  • Writer with round-trip fidelity (std feature)
  • Cue text parsing — two-layer design:
    • CueParser: logos DFA-backed, zero-alloc token stream (no_std)
    • CueText: W3C spec-compliant DOM tree builder with Node/TagNode types (alloc/std)
    • Tags: <b>, <i>, <u>, <c>, <ruby>, <rt>, <v>, <lang>, with classes and annotations
    • W3C tree building algorithm: implied end tags, <rt> scoping, unterminated tag handling
    • Full HTML5 named character reference support (2,231 entities via phf perfect hash map)
    • Numeric (&#32;) and hexadecimal (&#x20;) character references
    • Lazy text normalization via CueStr with OnceCell-cached decoding and NULL (U+0000 → U+FFFD) replacement
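
As a rough illustration of the numeric and hexadecimal character references listed above, together with the NULL replacement rule, here is a standalone sketch. It is not fasrt's decoder: `decode_numeric_ref` is a hypothetical helper, and it simply returns `None` for malformed or overflowing input rather than following every recovery rule in the HTML specification.

```rust
/// Hypothetical helper (not part of fasrt): decodes one reference of the
/// form `&#DDD;` or `&#xHHH;` into a character.
fn decode_numeric_ref(s: &str) -> Option<char> {
    let body = s.strip_prefix("&#")?.strip_suffix(';')?;
    // An `x` or `X` prefix selects hexadecimal; otherwise decimal.
    let (digits, radix) = match body.strip_prefix('x').or_else(|| body.strip_prefix('X')) {
        Some(hex) => (hex, 16),
        None => (body, 10),
    };
    // Require at least one digit and reject signs or other stray characters.
    if digits.is_empty() || !digits.chars().all(|c| c.is_digit(radix)) {
        return None;
    }
    let code = u32::from_str_radix(digits, radix).ok()?;
    // NULL and values that are not Unicode scalar values become U+FFFD.
    Some(match code {
        0 => '\u{FFFD}',
        c => char::from_u32(c).unwrap_or('\u{FFFD}'),
    })
}
```

Named references such as `&amp;` need a lookup table instead, which is where a perfect hash map like the one mentioned above comes in.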

Optional dependencies

| Feature | Default | Description |
| --- | --- | --- |
| std | Yes | Enables std::io writer and thiserror::Error impls |
| alloc | No | Enables CueText DOM tree and entity decoding without std |
| memchr | Yes (via alloc/std) | SIMD-accelerated fast path for entity decoding |
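
Going by the feature names in the table above, a `no_std` build that still wants the CueText tree would presumably turn off the default features and opt back into `alloc`. This uses standard Cargo syntax; the exact feature wiring is an assumption, so check the crate's Cargo.toml before relying on it.

```toml
[dependencies]
# Assumption: disabling defaults drops `std`; `alloc` re-enables CueText.
fasrt = { version = "0.2", default-features = false, features = ["alloc"] }
```

Leaving out `features = ["alloc"]` as well should give the allocation-free core only.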

Benchmarks

Measured on Apple Silicon with cargo bench (Criterion).

SRT

| Benchmark | Input | Time | Throughput |
| --- | --- | --- | --- |
| Parse (strict) | 2 cues, 89 B | ~170 ns | 520 MiB/s |
| Parse (strict) | 26 KB file | ~38 µs | 661 MiB/s |
| Parse (lossy) | 332 files, ~8 MB | ~12.1 ms | 646 MiB/s |
| Collect into Vec | 26 KB file | ~40 µs | 616 MiB/s |

WebVTT

| Benchmark | Input | Time | Throughput |
| --- | --- | --- | --- |
| Parse | 2 cues, 96 B | ~318 ns | 291 MiB/s |
| Parse | Settings + region + style, 354 B | ~915 ns | 387 MiB/s |
| Parse | All WPT fixtures, ~34 KB | ~113 µs | 314 MiB/s |
| Collect into Vec | Settings + region + style, 354 B | ~973 ns | 364 MiB/s |

Cue Text

| Benchmark | Input | Time | Throughput |
| --- | --- | --- | --- |
| Parse | Tags only, 166 B | ~316 ns | 552 MiB/s |
| Parse | 500 timestamps, ~11 KB | ~14.1 µs | 776 MiB/s |

Run benchmarks yourself:

cargo bench

License

fasrt is under the terms of both the MIT license and the Apache License (Version 2.0).

See LICENSE-APACHE, LICENSE-MIT for details.

Copyright (c) 2026 FinDIT Studio authors.
