Issue
Currently the BinTools.Write_s and BinTools.Read_s combination regularly fails.
So I asked Claude Code to stress-test the current implementation and find a reproducible example.
Environment:
micromamba create -n cp79 python=3.13
micromamba activate cp79
micromamba install -c conda-forge -c cadquery OCP=7.9.3.0
micromamba install ipython
Reproducing example
test.py:
from OCP.BinTools import BinTools
from OCP.BRepPrimAPI import BRepPrimAPI_MakeBox, BRepPrimAPI_MakeSphere
from OCP.BRepAlgoAPI import BRepAlgoAPI_Fuse
from OCP.TopoDS import TopoDS_Shape
import io

box = BRepPrimAPI_MakeBox(
    8.189314010683498, 46.85954928498498, 14.39317846913498
).Shape()
sphere = BRepPrimAPI_MakeSphere(18.12601828498498).Shape()
shape = BRepAlgoAPI_Fuse(box, sphere).Shape()

BinTools.Write_s(shape, "shape.brep")

with open("shape.brep", "rb") as f:
    data = f.read()

buf = io.BytesIO(data)

result = TopoDS_Shape()
BinTools.Read_s(result, buf)
Error
$ python test.py
Traceback (most recent call last):
  File "/home/bernhard/test.py", line 22, in <module>
    BinTools.Read_s(result, buf)
    ~~~~~~~~~~~~~~~^^^^^^^^^^^^^
OCP.Standard.Standard_Failure: EXCEPTION in BinTools_ShapeSet::ReadGeometry(S,OS) 0x558d8ee274a0 : Standard_Failure: BinTools_SurfaceSet::ReadGeometry: UnExpected BRep_PointRepresentation = -1
Note: It fails on both my Mac M1 and my Linux box.
Analysis
Claude Code helped with this analysis, so take it with a pinch of salt, but it makes sense to me.
This shape:
- serializes to exactly 5797 bytes
- causes OCCT to request a backward seek during parsing (seek from position 4096 back to 3064)
The backward seek triggers a bug in pystreambuf's position tracking:
Read sequence:
read(1024) × 4 → position 4096
seek(3064) → OCCT wants to re-read earlier data (NORMAL)
read(1024) × 3 → positions 3064 → 4088 → 5112 → 5797 (EOF)
seek(6140) → BUG! pystreambuf calculates wrong position (343 bytes past EOF)
read(1024) → returns 0 bytes → OCCT fails
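The arithmetic in this trace can be checked without OCP. The sketch below replays the same read/seek sequence against a plain io.BytesIO holding 5797 zero bytes, which is only a stand-in for the real file and not the actual shape data. It shows what a correctly behaving stream reports at each step; it does not exercise pystreambuf itself.

```python
import io

SIZE = 5797    # serialized size reported above
CHUNK = 1024   # read size seen in the trace

# Zero-filled stand-in for shape.brep (assumption: only the length matters here).
buf = io.BytesIO(bytes(SIZE))

trace = []

# Four sequential reads bring the stream to position 4096.
for _ in range(4):
    buf.read(CHUNK)
trace.append(buf.tell())

# The backward seek requested by OCCT.
buf.seek(3064)

# Three more reads; the last one is cut short by EOF.
for _ in range(3):
    n = len(buf.read(CHUNK))
    trace.append((n, buf.tell()))

# The position pystreambuf asks for next lies past the end of the data.
overshoot = 6140 - SIZE
buf.seek(6140)
tail = buf.read(CHUNK)

print(trace)      # [4096, (1024, 4088), (1024, 5112), (685, 5797)]
print(overshoot)  # 343
print(tail)       # b''
```

A seek past EOF is legal for io.BytesIO, so Python raises no error; the following read simply returns no data, which matches the empty read in the trace.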
Why this specific shape?
- The shape's binary representation requires OCCT to seek backward during parsing
- Simple shapes (like a plain box) read sequentially without seeks → work fine
- Complex shapes that need to reference earlier data trigger backward seeks
- The combination of file size (5797) and seek positions causes pystreambuf's buffer arithmetic to compute a position past EOF
The bug appears to be in pystreambuf.h, around lines 399-453 (seekoff_without_calling_python), where the buffer position calculations after a backward seek produce incorrect results.
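To check whether a given shape triggers backward seeks, the stream passed to BinTools.Read_s can be wrapped so that every read and seek is recorded. This is a minimal sketch: TracingBytesIO and backward_seeks are hypothetical names, and it assumes the binding reaches the Python object through its ordinary read and seek methods, which is not verified here.

```python
import io


class TracingBytesIO(io.BytesIO):
    """BytesIO that records each read and seek (hypothetical helper)."""

    def __init__(self, data=b""):
        super().__init__(data)
        self.calls = []

    def read(self, size=-1):
        start = super().tell()
        chunk = super().read(size)
        # ("read", position before the read, number of bytes returned)
        self.calls.append(("read", start, len(chunk)))
        return chunk

    def seek(self, offset, whence=io.SEEK_SET):
        start = super().tell()
        pos = super().seek(offset, whence)
        # ("seek", position before the seek, position after the seek)
        self.calls.append(("seek", start, pos))
        return pos


def backward_seeks(calls):
    """Return (from, to) pairs for every seek that moved toward the start."""
    return [(start, pos) for kind, start, pos in calls
            if kind == "seek" and pos < start]


# Stand-alone demonstration with the positions from the trace above.
demo = TracingBytesIO(bytes(5797))
for _ in range(4):
    demo.read(1024)
demo.seek(3064)
print(backward_seeks(demo.calls))  # [(4096, 3064)]
```

Passing TracingBytesIO(data) instead of io.BytesIO(data) in test.py and printing buf.calls after the failure should, under that assumption, show the sequence the C++ side requested.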
Note: BinTools.Write_s is reliable; only the read direction is affected
| Direction | Reliable? | Reason |
|---|---|---|
| Write (C++ → Python) | Yes | Sequential, C++ owns buffer, copies to Python |
| Read (Python → C++) | No | Pointer aliasing, backward seeks break position tracking |
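Since only the Python → C++ stream direction is affected, one possible stopgap is to route in-memory data through a temporary file so that no Python stream object is handed to C++. The sketch below is an assumption-laden illustration: read_via_tempfile is a hypothetical helper, and read_from_path stands for whatever path-based reader is available (for example a call that reads a shape from a file name, if the binding provides one). The reproducer above only confirms the path-based Write_s.

```python
import os
import tempfile


def read_via_tempfile(data, read_from_path):
    """Write `data` to a temporary file and pass the file path to a reader.

    `read_from_path` is a caller-supplied callable (hypothetical); it receives
    the path and returns whatever the path-based reader produces.
    """
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "shape.brep")
        with open(path, "wb") as f:
            f.write(data)
        # The file exists only for the duration of this call.
        return read_from_path(path)
```

This costs a disk round trip, so it is a stopgap for small shapes rather than a fix for the stream bridge.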
Alternative implementation
A much simpler solution, which we use in https://github.com/jdegenstein/ocp-addons, is:
py::bytes serialize_shape(const TopoDS_Shape &shape) {
    std::ostringstream buf;
    BinTools::Write(shape, buf);
    return py::bytes(std::move(buf.str()));
}

TopoDS_Shape deserialize_shape(const py::bytes &buf) {
    std::istringstream stream(buf);
    TopoDS_Shape shape;
    BinTools::Read(shape, stream);
    return shape;
}

py::bytes serialize_location(const TopLoc_Location &location) {
    std::ostringstream buf;
    BinTools_OStream occtStream(buf);
    BinTools_ShapeWriter().WriteLocation(occtStream, location);
    return py::bytes(std::move(buf.str()));
}

TopLoc_Location deserialize_location(const py::bytes &buf) {
    std::istringstream stream(buf);
    BinTools_IStream occtStream(stream);
    // This is not a memory leak and can only be copied due to weird occt impl
    return *BinTools_ShapeReader().ReadLocation(occtStream);
}
Side-by-Side Comparison
Comparing both implementations, Claude Code came up with:
| Aspect | pystreambuf | ocp-addons |
|---|---|---|
| Buffer ownership | Shared/complex (Python bytes + C++ pointers) | Clear (C++ owns during operation) |
| Python calls during I/O | Many (~N/1024 for N bytes) | Zero (only at start/end) |
| GIL safety | Risky (pointers into Python memory) | Safe (pure C++ during I/O) |
| Memory pattern | Incremental chunks | Single allocation |
| Seek support | Complex, error-prone | Native C++ stringstream (reliable) |
| Error handling | Complex (Python exceptions mid-stream) | Simple (fails at boundaries) |
| Code complexity | ~500 lines | ~15 lines |
| Streaming large files | Theoretically better (incremental) | Requires full memory |
Advantages of the ocp-addons approach:
- Clean boundary: Python <-> C++ conversion happens exactly once at the start (for read) or end (for write), not continuously during I/O
- No pointer aliasing: std::ostringstream owns its buffer entirely. No raw pointers into Python object internals.
- GIL-safe: The entire BinTools operation runs in pure C++ with no Python dependencies mid-stream
- Simple type conversion:
  - py::bytes -> std::string (pybind11 handles this cleanly, copies the data)
  - std::string -> py::bytes (single copy at the end via std::move)
- Native stream semantics: std::stringstream has well-defined, tested seek/tell behavior
The Trade-off
The pystreambuf approach was designed for streaming large data without loading everything into memory. But in practice:
- Most CAD shapes are manageable in memory
- The reliability cost of the complex bridging might outweigh the memory benefit