docs(readme): add Troubleshooting section for upstream API errors #58789

Open
ademczuk wants to merge 1 commit into anthropics:main from ademczuk:docs/troubleshooting-api-errors


What this changes

Adds a short Troubleshooting upstream API errors section to README.md, placed between Plugins and Reporting Bugs. The position is intentional: a user who hits an error and reads top-down reaches the triage guidance before the "file a GitHub issue" link.

The section explains the most common upstream-error pattern users currently see (HTTP 500 + api_error + a request_id), walks through the right diagnostic steps (status page, retry, model swap, search before filing), and notes that the same flow applies to overloaded_error and rate_limit_error so each error type points users at the right action.

No behavioural change. README-only. 21 added lines, no removals.

Why

Looking at currently-open issues in this tracker, 40+ are reports of the same upstream HTTP 500 error, filed in clusters that line up with status-page incidents.

The status page shows matching incident windows ("Elevated errors across Claude Models" on Apr 13, model-specific elevated errors on Apr 15, further windows on May 8 / 12 / 13). The duplicates are users seeing the JSON in their terminal, not finding a section in this README explaining what it means, and reasonably concluding it must be a Claude Code bug to report.

A short troubleshooting section in the README is the smallest change that can address that: it teaches readers to recognise the upstream signal, points them at the right surface (status.claude.com), and surfaces the request_id so escalation has the data the server side needs. It does not promise any change in CLI behaviour, which I have no visibility into from outside.

What I considered and didn't do

  • Changing the error rendering in the CLI itself. The CLI source is not in this repo. Out of scope for an external PR.
  • A consolidation issue listing the duplicates. Discussed with the user who pointed at this work; the docs PR is the more durable fix because it reduces the rate at which new duplicates are filed rather than just tagging the ones that already exist. Happy to file the consolidation issue as a follow-up if maintainers prefer.
  • Adding the same section to the docs site at code.claude.com/docs. I couldn't find a public source repo for that site to PR against. If there is one, point me at it and I'll mirror this content there as a separate PR.

Verification

  • README renders correctly on GitHub (preview link in the diff).
  • No changes outside README.md.
  • Pre-existing README content is untouched; the new section is purely additive between two existing top-level headings.

The current README points users at the GitHub issue tracker for any error
they see in the terminal. In practice, a sizable fraction of recent open
issues (40+ from mid-March through mid-April 2026) are reports of upstream
API service errors that the CLI is surfacing verbatim:

  API Error: 500 {"type":"error","error":{"type":"api_error","message":"Internal server error"},"request_id":"req_..."}
  Claude may be experiencing issues. Check status.anthropic.com

These are not client bugs - they map to incident windows on
status.claude.com (e.g. "Elevated errors across Claude Models" on Apr 13
and Apr 15) and resolve when the service-side incident does. The
`request_id` is server-side and the trace lives there.

Add a Troubleshooting section between Plugins and Reporting Bugs that
walks a user through:

- Reading the error correctly (api_error 500 is upstream, not client)
- Checking the status page first
- Retrying / switching model
- Searching existing issues before filing
- When it does make sense to escalate, and what to include (request_id)

Same template applies to overloaded_error and rate_limit_error, which I
called out briefly so users hit the right action for each error type
rather than treating every server response as a bug.

Goal is to reduce the duplicate-filing rate on transient incidents
without changing any client behaviour or making promises about the
CLI's internal handling, which I have no visibility into.
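
For reviewers who want the triage flow in concrete form, here is a minimal Python sketch of the steps listed above. It is illustrative only and not part of this PR or of the CLI: the `triage` function and the `SUGGESTED_ACTIONS` table are hypothetical names, the regular expression assumes the error arrives on one line in the shape quoted earlier, and the actions given for `overloaded_error` and `rate_limit_error` are my assumptions rather than the wording of the README section.

```python
import json
import re

# Hypothetical mapping from upstream error type to a suggested next step.
# Only the api_error entry paraphrases the steps in this PR; the other two
# entries are assumptions made for illustration.
SUGGESTED_ACTIONS = {
    "api_error": "Check the status page, retry, try another model, "
                 "and search existing issues before filing.",
    "overloaded_error": "Wait briefly and retry (assumed action).",
    "rate_limit_error": "Slow down or wait for the limit to reset (assumed action).",
}

# Assumed shape: "API Error: <status> <json body>" on a single line.
ERROR_LINE = re.compile(r"API Error: (\d{3}) (\{.*\})")


def triage(terminal_line: str) -> dict:
    """Classify one line of terminal output as upstream error or not."""
    not_upstream = {"upstream": False, "status": None, "error_type": None,
                    "request_id": None, "action": "No upstream error recognised."}
    match = ERROR_LINE.search(terminal_line)
    if match is None:
        return not_upstream
    try:
        payload = json.loads(match.group(2))
    except json.JSONDecodeError:
        return not_upstream
    error = payload.get("error")
    error_type = error.get("type") if isinstance(error, dict) else None
    action = SUGGESTED_ACTIONS.get(error_type)
    return {
        "upstream": action is not None,
        "status": int(match.group(1)),
        "error_type": error_type,
        # The request_id is what escalation needs, so always carry it along.
        "request_id": payload.get("request_id"),
        "action": action or "Unrecognised error type; search existing issues.",
    }
```

Calling `triage` on the 500 line quoted above would flag it as upstream and return the `request_id` alongside the suggested action, which is the information the section asks users to include if they do escalate.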