[All] Arrow ingestion can return retryable errors while retaining batches for recovery

### SDK

All

### SDK Version

newest

### Operating System

_No response_

### Description

When Arrow Flight ingestion receives a retryable server error such as RESOURCE_EXHAUSTED, recovery starts, but new ingest_batch() calls can return an error during the reconnect window.

This is ambiguous because ingest_batch() adds the batch to pending_batches before checking whether the sender is available. If batch_tx is temporarily None, the caller receives an error, but the batch may still be retained and replayed during recovery. The caller cannot know whether it should retry the batch, which risks duplicate ingestion.

### Reproduction

Create an Arrow Flight stream with recovery enabled and a small recovery_backoff_ms / recovery_timeout_ms so the recovery window is easy to hit.

Configure the server or mock Flight server to return a retryable error, for example gRPC RESOURCE_EXHAUSTED, after receiving a batch.

Call ingest_batch(batch1) to trigger the server error.

While recovery is in progress, keep calling ingest_batch(batchN) from the client.

Observe that some ingest_batch() calls return an error instead of blocking or returning an offset.

Also note that the failed ingest_batch() call may have already inserted the batch into Arrow pending_batches before returning the error, so recovery may replay a batch the caller believes was rejected.

### Expected behavior

ingest_batch() should have clear enqueue semantics during retryable recovery, like proto. Accept and retain the batch, return an offset, and let recovery replay it.

Proto ingestion has an explicit pause path for graceful stream closure, but Arrow currently has no equivalent pause/recovery state, so retryable Flight errors can surface directly to callers during recovery.

### Is it a regression?

No

### Logs

```shell

```

### Additional context

_No response_

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[All] Arrow ingestion can return retryable errors while retaining batches for recovery #236

SDK

SDK Version

Operating System

Description

Reproduction

Expected behavior

Is it a regression?

Logs

Additional context

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

[All] Arrow ingestion can return retryable errors while retaining batches for recovery #236

Description

SDK

SDK Version

Operating System

Description

Reproduction

Expected behavior

Is it a regression?

Logs

Additional context

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions