Direct answer
Timeouts usually come from large context, long output targets, tool-heavy agent loops, client-side timeout limits, or temporary provider latency. Reduce the request to a small non-streaming text call first, then add context, output length, streaming, and tools back one at a time.
Private order, key, and balance details belong in the customer portal or support. Public docs can explain the diagnostic path, not reveal account-specific state.
Error phrases this guide covers
Search tools, logs, and support tickets do not always use the same wording for this failure. Phrasings such as "request timed out", "504 Gateway Timeout", "connection timed out", "stream closed before completion", and "deadline exceeded" usually belong to the same troubleshooting family; treat them together before changing unrelated settings.
Fast check before changing everything
Run the smallest check that isolates the failing layer. If the small request works, the problem is usually the client configuration, hidden context, permissions, or advanced feature path rather than the whole account.
```bash
curl https://base.corvusllm.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_CORVUSLLM_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-5.5","messages":[{"role":"user","content":"Reply with one short sentence."}],"max_tokens":40,"stream":false}'
```
Common causes
- The prompt includes many files, long chat history, images, or pasted logs.
- The client has a shorter timeout than the model route needs for the selected request.
- Some clients show no progress on a streaming request until the first chunk arrives, so a slow start looks like a dead request.
- Tool definitions or agent loops make the request much heavier than a normal chat message; the size check below helps confirm this.
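A quick way to test the heavy-request causes is to measure the body before sending it. In this sketch, payload.json stands in for the exact JSON your client would send; jq is optional.

```bash
# Check how large the request body really is before blaming the route. Bodies that
# have quietly grown to hundreds of kilobytes of history, files, or tool schemas
# are a common timeout cause.
wc -c payload.json

# Optional, if jq is installed: count the messages resent on every turn.
jq '.messages | length' payload.json
```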
Fix steps
- Retry the same model with a tiny non-streaming text prompt and low output limit.
- If that works, halve the project context or chat history and test again.
- Increase the client timeout only after you confirm the small request succeeds.
- Disable streaming and tool calls until plain chat is reliable, then re-enable one advanced feature at a time (a streaming example follows this list).
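As an example of re-enabling a single feature, the sketch below turns streaming back on while everything else stays minimal. The key, slug, and timeout value are placeholders.

```bash
# Re-enable exactly one feature: streaming. -N disables curl's output buffering so
# chunks print as they arrive, and --max-time caps the whole attempt.
curl -N --max-time 120 https://base.corvusllm.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_CORVUSLLM_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-5.5","messages":[{"role":"user","content":"Reply with one short sentence."}],"max_tokens":40,"stream":true}'
```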
Verify before retrying production traffic
- Measure whether the failure happens before the first token or during a long answer; the timing example above separates the two.
- Check whether another model family answers the same short prompt quickly (see the comparison loop after this list).
- Confirm the app did not send duplicate background retries that make latency look worse.
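A comparison like the one below times the same short prompt on two model slugs. Both slugs are illustrative; substitute rows from the public model list. A large gap points at one route rather than the account.

```bash
# Time the same tiny prompt against two model slugs (slugs are placeholders).
for model in gpt-5.5 other-public-model; do
  printf '%s: ' "$model"
  curl -s -o /dev/null -w 'http %{http_code} in %{time_total}s\n' \
    https://base.corvusllm.com/v1/chat/completions \
    -H "Authorization: Bearer YOUR_CORVUSLLM_KEY" \
    -H "Content-Type: application/json" \
    -d "{\"model\":\"$model\",\"messages\":[{\"role\":\"user\",\"content\":\"Reply with one short sentence.\"}],\"max_tokens\":40,\"stream\":false}"
done
```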
Use one small request first. Large retries can spend balance, hide the original cause, and create confusing logs.
Diagnostic decision tree
Work through these checks in order. The goal is to isolate the failing layer before editing unrelated settings or sending another expensive request. The first check can be scripted; a sketch follows the table.
| Check | Action | Pass result | Fail result |
|---|---|---|---|
| Minimal request | Run the smallest check from this page with the same key, endpoint shape, and one public model slug. | The account and basic route probably work; move to client settings, hidden context, tools, or retries. | Fix auth, base URL, balance, model slug, or current route health before testing advanced features. |
| Client final URL | Inspect the actual URL or provider profile the client sends, not only the visible settings field. | Continue with request body, model slug, payload size, and feature compatibility checks. | Correct host/base/full-endpoint confusion before changing keys or model families. |
| Balance movement | Compare dashboard balance before and after one tiny diagnostic request. | If charged and no answer arrives, collect the support packet before retrying large prompts. | If not charged, focus first on request rejection, wrong endpoint, auth, or client-side failure. |
| Feature isolation | Disable streaming, tools, images, file context, long history, and automation loops for one retry. | Re-enable one feature at a time until the failing layer is identified. | Keep the request small and do not use production retries as the diagnostic method. |
| Route health | Check Service Status and try a tiny prompt on one nearby public model row if your workflow allows it. | Use a documented fallback only if quality and cost are acceptable. | Wait, switch safely, or contact support with timestamps instead of hammering the failing route. |
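A minimal sketch of the first row, assuming the same placeholder key and slug as above: one tiny non-streaming call, with a pass or fail verdict based on the HTTP status.

```bash
#!/usr/bin/env bash
# Minimal-request check from the decision tree. Key and model slug are placeholders.
status=$(curl -s -o /tmp/minimal_check.json --max-time 60 -w '%{http_code}' \
  https://base.corvusllm.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_CORVUSLLM_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-5.5","messages":[{"role":"user","content":"Reply with one short sentence."}],"max_tokens":40,"stream":false}') || status="000"
if [ "$status" = "200" ]; then
  echo "PASS: account and basic route work; check client settings, hidden context, tools, retries."
else
  echo "FAIL (HTTP $status): check auth, base URL, balance, model slug, or route health."
fi
```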
Prevent it next time
Set a clear request-size budget for production clients. Large context, tools, and streaming should have explicit retry limits, because otherwise one stuck job can create several expensive and slow attempts. A wrapper like the sketch below can enforce both limits.
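This is one way to enforce such a budget, not a required client design; the byte limit, timeout, and payload.json file name are all illustrative.

```bash
# Enforce a body-size budget and a single-retry policy around one request.
MAX_BODY_BYTES=200000   # illustrative budget; tune for your workload
body_bytes=$(wc -c < payload.json)
if [ "$body_bytes" -gt "$MAX_BODY_BYTES" ]; then
  echo "Request body is ${body_bytes} bytes; trim context before sending." >&2
  exit 1
fi
# --retry 1 allows exactly one retry, so a stuck job cannot fan out into many attempts.
curl --max-time 120 --retry 1 --retry-delay 5 \
  https://base.corvusllm.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_CORVUSLLM_KEY" \
  -H "Content-Type: application/json" \
  -d @payload.json
```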
Minimum support packet
Collect these details before opening support. This avoids exposing secrets while giving enough context to match logs and reproduce the public failure path. A helper sketch for assembling the packet follows the table.
| Field | Why support needs it |
|---|---|
| Timestamp | Use UTC or include timezone so logs can be matched accurately. |
| Endpoint path | Include /v1, /anthropic, or the exact client route shape involved. |
| Public model slug | Send the customer-facing slug, not a private key, upstream account name, or hidden route. |
| Exact error text | Include the visible request timeout message and any HTTP status shown by the client. |
| Minimal request result | State whether the tiny check on this page works with the same key. |
| Balance movement | State whether balance changed after the failed request or only after retries. |
| Client and feature flags | Name the tool, SDK, streaming setting, image input, tools, file context, or automation loop involved. |
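A small helper can assemble the packet as plain text. Everything except the UTC timestamp is a placeholder to fill in by hand, and nothing secret is written to the file.

```bash
# Build the support packet; fields mirror the table above. No keys or prompts included.
{
  echo "timestamp_utc: $(date -u +%Y-%m-%dT%H:%M:%SZ)"
  echo "endpoint_path: /v1/chat/completions"
  echo "model_slug: gpt-5.5"
  echo "exact_error: <paste the visible timeout message and HTTP status>"
  echo "minimal_request: <pass or fail with the same key>"
  echo "balance_movement: <changed after the failed request, after retries, or not at all>"
  echo "client_and_flags: <tool or SDK, streaming, images, tools, file context, automation>"
} > support_packet.txt
```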
When to contact support
Contact support when a minimal reproducible check still fails, when the dashboard history does not match what your client received, or when usage appears charged but no usable answer reached the client.
- Include timestamp, endpoint path, public model slug, exact error wording, and whether the same key works on a minimal request.
- Include whether the dashboard balance changed and whether the client retried in the background.
- Do not send secrets, full API keys, regulated data, or private production prompts in public support messages.
Open the support bot after collecting the reproducible details.
Related sources
Use these pages to verify the exact base URL, model slug, billing behavior, service status, or broader troubleshooting route before changing unrelated settings.