Area
- Runtime / Core crates (stdlib/core/derive)
- Incan Language (syntax/semantics)
- Documentation
Problem statement
Streaming IO currently makes users write bounded-read loops by hand:
    loop:
        let chunk = input.read_bytes(chunk_size).map_err(map_io_error)?
        if len(chunk) == 0:
            break
        hasher.update(chunk)
That pattern is correct, but it is not the right readability target for ordinary stdlib code. It leaks EOF-as-empty-chunk into every caller, repeats the same loop + read + empty-check scaffold across hashing, encoding, file copying, parsers, uploads/downloads, and makes Incan code read worse than the equivalent Python shape.
Python has a compact chunk-read idiom; the expression reference uses while chunk := file.read(9000):, and Python truth-value testing treats "empty sequences and collections" as false. Sources: https://docs.python.org/3.10/reference/expressions.html#assignment-expressions and https://docs.python.org/3/library/stdtypes.html#truth-value-testing.
Incan should not copy that directly by adding general truthiness or assignment expressions just to make this one IO pattern shorter. The better target is a typed chunk-stream API that represents EOF as iterator exhaustion, not as an empty byte value every caller must remember to test.
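For comparison, Python can already express EOF as iterator exhaustion without the walrus operator or truthiness, through the two-argument form of iter(). The sketch below is illustrative Python only, not Incan; the sample data and chunk size are arbitrary.

```python
import io
from functools import partial

data = b"abcdefghij"  # 10 bytes, read in chunks of 4

# Walrus + truthiness: the loop ends when read() returns b"" (falsy).
walrus_chunks = []
reader = io.BytesIO(data)
while chunk := reader.read(4):
    walrus_chunks.append(chunk)

# Two-argument iter(): calls reader.read(4) until it returns the sentinel b"",
# so EOF surfaces as iterator exhaustion and the loop body never sees b"".
reader = io.BytesIO(data)
iter_chunks = list(iter(partial(reader.read, 4), b""))
```

Both forms produce the same chunks, but only the second keeps the empty-chunk convention out of the loop body, which is the property the proposed API aims for.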
Proposed solution
Add a stdlib chunk stream abstraction for binary readers, likely centered on BinaryReader.chunks(chunk_size) or an equivalent read_chunks(reader, chunk_size) helper.
The desired user-facing shape is:
    for chunk in input.chunks(chunk_size):
        hasher.update(chunk)
    return hasher.finalize()
For callers that need domain error mapping:
    for chunk in input.chunks(chunk_size).map_err(HashError.from_io):
        hasher.update(chunk)
    return hasher.finalize()
The stream should yield non-empty bytes chunks and treat EOF as iterator completion. Zero or negative chunk sizes should be rejected up front with a clear error rather than becoming odd loop behavior.
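These semantics can be modelled in Python as a rough sketch. The free function chunks(reader, chunk_size) and the helper sha256_hex are hypothetical stand-ins for the proposed Incan API, not its actual signature, and ValueError stands in for whatever error type the stdlib ends up choosing.

```python
import hashlib
import io
from typing import BinaryIO, Iterator


def chunks(reader: BinaryIO, chunk_size: int) -> Iterator[bytes]:
    """Hypothetical model of the proposed BinaryReader.chunks(chunk_size)."""
    # Validate eagerly: a generator body would not run until the first next(),
    # so the check lives in this outer function to reject bad sizes up front.
    if chunk_size <= 0:
        raise ValueError(f"chunk_size must be positive, got {chunk_size}")

    def stream() -> Iterator[bytes]:
        while True:
            chunk = reader.read(chunk_size)
            if not chunk:  # EOF: end the stream instead of yielding b""
                return
            yield chunk

    return stream()


def sha256_hex(reader: BinaryIO, chunk_size: int = 4) -> str:
    """Feed a hasher from the chunk stream; no EOF check in the caller."""
    hasher = hashlib.sha256()
    for chunk in chunks(reader, chunk_size):
        hasher.update(chunk)
    return hasher.hexdigest()
```

Splitting validation from the generator body is the one design point worth noting: it is what makes an invalid chunk size fail at the call site rather than on the first iteration.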
This likely depends on native associated types from RFC 098 or a nearby trait design, because a clean API needs to preserve projected item/error types through chunk streams and adapters. Design options to settle:
- Iterator with type Item = Result[bytes, IoError]
- separate FallibleIterator with type Item and type Error
- named concrete chunk stream type returned by BinaryReader.chunks(size)
- opaque stream return type if/when Incan supports that shape
Rust's for model is useful prior art here: the Rust Reference says that when an iterator is empty, "the for expression completes." Source: https://doc.rust-lang.org/reference/expressions/loop-expr.html#iterator-loops.
Alternatives considered
Keep the current explicit loop pattern. This is implementable today, but it keeps repeating a low-level EOF convention in user code and stdlib source.
Add Python-style assignment expressions and truthiness. That would make the local example shorter, but it broadens the language for a problem that is better solved as a typed streaming abstraction.
Add one-off helpers in std.hash or std.encoding. That would reduce local duplication, but it would not give file copy, parsers, upload/download code, or future stream consumers a shared vocabulary.
Expose only whole-file read_bytes() helpers. That is explicitly not enough for large-file paths and pushes users toward whole-file materialization.
Scope / acceptance criteria
- In scope:
  - Define a reader chunk-stream API for bounded binary reads.
  - Define EOF as stream exhaustion, not an emitted empty chunk.
  - Define how stream read errors are represented and mapped.
  - Implement the API for stdlib file/binary reader types.
  - Add examples and docs showing chunked file copy and hash feeding.
  - Migrate stdlib code that currently hand-rolls loop + read_bytes + len(chunk) == 0 where the new API applies.
  - Add tests for normal chunking, empty files, final short chunks, read errors, and invalid chunk sizes.
- Out of scope:
  - General Python-style truthiness.
  - General assignment expressions / walrus syntax.
  - Async streams unless a follow-up RFC explicitly pulls them in.
  - Generic associated types unless the chosen stream trait requires them.
- Done when:
  - Incan users can express bounded binary stream consumption with a clear for chunk in ... shape.
  - EOF and error behavior are documented and tested.
  - std.hash and other relevant stdlib surfaces no longer need repeated manual chunk-loop scaffolding for ordinary reader draining.
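The test cases and the chunked file copy listed in scope could look roughly like the following Python model. Every name here (chunks, map_err, copy_chunks, HashError, FailingReader) is a hypothetical stand-in chosen for illustration. Python exceptions approximate the Result-based error flow, so this shows intended behavior rather than the eventual Incan types.

```python
import io
from typing import Callable, Iterator


class HashError(Exception):
    """Hypothetical domain error, standing in for HashError.from_io."""


class FailingReader:
    """Test double: returns one full chunk, then raises on the next read."""

    def __init__(self) -> None:
        self.calls = 0

    def read(self, size: int) -> bytes:
        self.calls += 1
        if self.calls == 1:
            return b"x" * size
        raise OSError("simulated read failure")


def chunks(reader, chunk_size: int) -> Iterator[bytes]:
    """Model of the proposed stream: reject bad sizes, stop at EOF."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    # iter(callable, sentinel) stops when read() returns b"" (EOF).
    return iter(lambda: reader.read(chunk_size), b"")


def map_err(
    stream: Iterator[bytes], convert: Callable[[OSError], Exception]
) -> Iterator[bytes]:
    """Re-raise read errors from the stream as the caller's domain error."""
    try:
        yield from stream
    except OSError as err:
        raise convert(err) from err


def copy_chunks(src, dst, chunk_size: int) -> int:
    """Chunked copy; returns the number of bytes written."""
    copied = 0
    for chunk in chunks(src, chunk_size):
        dst.write(chunk)
        copied += len(chunk)
    return copied
```

Under this model an empty input yields no chunks, the last chunk may be shorter than chunk_size, and a read failure surfaces once, already mapped, after any chunks that were read successfully.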