torch.AcceleratorError in surya layout encoder on MPS (Apple Silicon) with 66-page PDF

## 🧨 Describe the Bug

`marker_single` crashes with `torch.AcceleratorError: index 4755 is out of bounds: 0, range 0 to 1` during the layout recognition phase when processing a 66-page PDF on Apple Silicon (MPS backend). The crash is reproducible and consistently occurs around pages 14–16.

## 📄 Input Document

The PDF that triggers the crash is publicly available:
https://www.defensordelpueblo.es/wp-content/uploads/2025/03/Defensor-del-Pueblo_Informe-anual-2024.pdf

(2.7 MB, 66 pages — Spanish Ombudsman Annual Report 2024)

## 📤 Output Trace / Stack Trace

<details>
<summary>Click to expand</summary>

```
2026-03-01 09:40:26,850 [WARNING] surya: `TableRecEncoderDecoderModel` is not compatible with mps backend. Defaulting to cpu instead
Recognizing Layout:  24%|██▍       | 16/66 [01:12<03:47,  4.55s/it]
Traceback (most recent call last):
  File "marker/scripts/convert_single.py", line 38, in convert_single_cli
    rendered = converter(fpath)
  File "marker/converters/pdf.py", line 195, in __call__
    document = self.build_document(temp_path)
  File "marker/converters/pdf.py", line 182, in build_document
    document = DocumentBuilder(self.config)(provider, layout_builder, line_builder, ocr_builder)
  File "marker/builders/document.py", line 33, in __call__
    layout_builder(document, provider)
  File "marker/builders/layout.py", line 56, in __call__
    layout_results = self.surya_layout(document.pages)
  File "marker/builders/layout.py", line 88, in surya_layout
    layout_results = self.layout_model(
        [p.get_image(highres=False) for p in pages],
        batch_size=int(self.get_batch_size()),
    )
  File "surya/layout/__init__.py", line 51, in __call__
    self.foundation_predictor.prediction_loop(...)
  File "surya/foundation/__init__.py", line 780, in prediction_loop
    updated_inputs, outputs, merge_idxs = self.prefill(
        current_inputs, max_lookahead_tokens=0
    )
  File "surya/foundation/__init__.py", line 556, in prefill
    image_embeddings = self.model.get_image_embeddings(
        pixel_values=image_tiles, ...
    )
  File "surya/common/surya/__init__.py", line 258, in get_image_embeddings
    chunk_embeddings = self.vision_encoder.embed_images(
        image_batch=chunk_pixels.unsqueeze(0).to(device=self.device),
        grid_thw=chunk_grid_thw.unsqueeze(0).to(device=self.device),
    )
  File "surya/common/surya/encoder/__init__.py", line 798, in embed_images
    return super().forward(hidden_states=image_batch, grid_thw=grid_thw)
  File "surya/common/surya/encoder/__init__.py", line 765, in forward
    hidden_states = blk(hidden_states, cu_seqlens=cu_seqlens, position_embeddings=position_embeddings)
  File "surya/common/surya/encoder/__init__.py", line 595, in forward
    hidden_states = hidden_states + self.attn(self.norm1(hidden_states), ...)
  File "surya/common/surya/encoder/__init__.py", line 544, in forward
    self.unpack_qkv_with_mask(q, k, v, cu_seqlens)
  File "surya/common/surya/encoder/__init__.py", line 438, in unpack_qkv_with_mask
    max_seq_len = seq_lengths.max().item()
torch.AcceleratorError: index 4755 is out of bounds: 0, range 0 to 1
```

</details>

## ⚙️ Environment

- **Marker version**: 1.10.2
- **Surya version**: 0.17.1
- **Python version**: 3.13.5
- **PyTorch version**: 2.10.0
- **Transformers version**: 4.57.6
- **Operating System**: macOS 15.5 (Darwin 24.6.0), Apple Silicon (MPS)
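
For anyone trying to reproduce this with a matching setup, the versions above can be gathered with a small script along these lines. This is a minimal sketch using only the standard library; it assumes the PyPI distribution names are `marker-pdf` and `surya-ocr`, and the helper names `collect_versions` and `environment_report` are hypothetical, not part of marker or surya.

```python
import platform
from importlib import metadata

# Assumed PyPI distribution names for the packages listed above.
PACKAGES = ("marker-pdf", "surya-ocr", "torch", "transformers")


def collect_versions(packages=PACKAGES):
    """Map each distribution name to its installed version, or 'not installed'."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


def environment_report(packages=PACKAGES):
    """Format the interpreter, platform, and package versions as a bullet list."""
    lines = [
        f"- **Python version**: {platform.python_version()}",
        f"- **Platform**: {platform.platform()}",
    ]
    for name, version in collect_versions(packages).items():
        lines.append(f"- **{name}**: {version}")
    return "\n".join(lines)
```

Calling `print(environment_report())` in the same virtual environment that runs `marker_single` prints a list that can be pasted into a comment.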

## ✅ Expected Behavior

The PDF should convert to Markdown without crashing. A shorter 29-page PDF from the same source (the Anexo version) converts successfully.

## 📟 Command or Code Used

<details>
<summary>Click to expand</summary>

```bash
curl -sL -o /tmp/test.pdf "https://www.defensordelpueblo.es/wp-content/uploads/2025/03/Defensor-del-Pueblo_Informe-anual-2024.pdf"
marker_single /tmp/test.pdf --output_format markdown --disable_image_extraction --output_dir /tmp/output
```

</details>
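
To check whether the crash is specific to the MPS backend, the same command can be rerun with the device forced to CPU. The sketch below assumes marker honours the `TORCH_DEVICE` environment variable as described in its README; `cpu_env` and `run_on_cpu` are hypothetical helper names, and I have not verified that the CPU run completes on this PDF.

```python
import os
import subprocess


def cpu_env(base_env=None):
    """Return a copy of the environment with TORCH_DEVICE forced to 'cpu'."""
    env = dict(os.environ if base_env is None else base_env)
    env["TORCH_DEVICE"] = "cpu"  # assumption: marker/surya read this variable
    return env


def run_on_cpu(pdf_path, output_dir):
    """Run the command from this report on CPU and return its exit code."""
    cmd = [
        "marker_single", pdf_path,
        "--output_format", "markdown",
        "--disable_image_extraction",
        "--output_dir", output_dir,
    ]
    return subprocess.run(cmd, env=cpu_env(), check=False).returncode
```

For example, `run_on_cpu("/tmp/test.pdf", "/tmp/output")` returning `0` while the MPS run fails would point at the backend rather than the document.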

## 📎 Additional Context

- The error is **reproducible** across multiple runs and always crashes at the same point (around pages 14–16 of 66).
- surya logs a warning at startup: `TableRecEncoderDecoderModel is not compatible with mps backend. Defaulting to cpu instead`. That warning concerns only the table recognition model; the layout model still runs on MPS, and that is where the crash occurs.
- The traceback ends at `max_seq_len = seq_lengths.max().item()` in `unpack_qkv_with_mask`, inside the vision encoder's attention mechanism. The likely cause is a specific page layout (for example a complex table or chart) that makes a tensor index go out of bounds there, but this is unconfirmed.
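
One way to narrow down the offending page is to rerun the conversion on small page windows. The sketch below assumes `--page_range` accepts 0-based, inclusive ranges such as `0-15`, as in the example in marker's README; `page_windows` and `marker_command` are hypothetical helpers, and the window size of 16 is arbitrary.

```python
def page_windows(total_pages, window):
    """Split pages 0..total_pages-1 into inclusive 'start-end' range strings."""
    if total_pages <= 0 or window <= 0:
        raise ValueError("total_pages and window must be positive")
    ranges = []
    for start in range(0, total_pages, window):
        end = min(start + window, total_pages) - 1
        ranges.append(str(start) if start == end else f"{start}-{end}")
    return ranges


def marker_command(pdf_path, output_dir, page_range):
    """Build the command from this report, restricted to one page range."""
    return [
        "marker_single", pdf_path,
        "--output_format", "markdown",
        "--disable_image_extraction",
        "--output_dir", output_dir,
        "--page_range", page_range,  # assumption: 0-based, inclusive
    ]
```

Running `marker_command("/tmp/test.pdf", "/tmp/output", r)` for each `r` in `page_windows(66, 16)` via `subprocess.run`, then repeating with a smaller window on whichever range fails, should isolate the page if the crash depends on page content rather than on batch composition.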
