# SIMOSphere AI — Complete LLM-Optimised Documentation Bundle > Single-file dump of all product documentation. Agents can ingest this > file in one request instead of crawling individual pages. Last updated > 2026-05-04. --- ## 1. Product overview SIMOSphere AI is an OpenAI-compatible LLM inference API operated by SIMO GmbH (Aschaffenburg, Germany). It exposes open-weight models via a drop-in `Authorization: Bearer` API on `api.simosphereai.com`, hosted on NVIDIA RTX 6000 GPUs inside the EU. The platform consists of: - A request gateway (Node 22 + Express 5 + TypeScript) that handles authentication, rate limiting, plan-based billing, BYOK routing, PII redaction, and audit logging. - A managed inference layer (vLLM 0.6.x with 16 parallel slots per GPU) serving the open-weight model catalogue. - A management dashboard (Next.js 15) for tenants to manage API keys, view usage, and configure connectors. - An onboarding site (this domain) with marketing content, documentation, and self-serve registration. ## 2. Use cases ### 2.1 Drop-in OpenAI replacement Set `base_url="https://api.simosphereai.com/v1"` in any OpenAI SDK and the existing application code keeps working. All Chat Completions, Embeddings, and Tool-Use semantics match the OpenAI specification. ### 2.2 GDPR-bound inference for regulated sectors Finance, healthcare, and public-sector customers use SIMOSphere AI to satisfy data-sovereignty requirements. We sign Auftragsverarbeitungsverträge (AVV / DPA per Art. 28 GDPR) on request. No data leaves the EU. ### 2.3 BYOK routing to other providers Enterprise tenants can register their own Anthropic, OpenAI, or Mistral keys; the gateway then proxies requests to those upstreams while still applying our PII redaction, rate limiting, and audit logging. The upstream provider sees only the redacted prompt. ### 2.4 CI-conformant document generation The CI Documentor feature converts natural-language briefs into PDF and DOCX documents that match the customer's corporate identity (fonts, colours, logo, layout). Powered by a server-side WeasyPrint pipeline behind the same API surface. ### 2.5 Web-grounded answers via Tavily Professional and Enterprise plans include a managed Tavily web-search tool. Agents call it via the standard OpenAI tool-use API; the gateway handles the upstream API call and bills the search separately. ## 3. API reference ### 3.1 Authentication All `/v1/*` and `/v2/*` endpoints accept either: - `Authorization: Bearer sk-simo-` - `X-API-Key: sk-simo-` Keys are issued via the dashboard. Each key carries a plan (Starter, Professional, Enterprise), a rate limit, and optional scope flags. ### 3.2 Endpoints | Method | Path | Purpose | | ------ | ----------------------------- | ------------------------------------------------- | | POST | /v1/chat/completions | OpenAI-compatible chat completion | | POST | /v1/embeddings | OpenAI-compatible embeddings | | GET | /v1/models | List available model IDs | | POST | /v1/files | Upload a file (for retrieval / vision) | | POST | /v2/documents/render | CI-conformant PDF/DOCX render | | POST | /v2/translate | Multilingual translation (8 locales) | | GET | /health | Liveness | | GET | /openapi.json | Full OpenAPI 3.1 spec | The full machine-readable surface lives at https://onboarding.simosphereai.com/openapi.json. ### 3.3 Rate limits Default limits per API key: | Plan | Requests / min | Tokens / min | | ------------- | -------------- | ------------ | | Starter | 60 | 60 000 | | Professional | 600 | 600 000 | | Enterprise | negotiated | negotiated | Every response carries: ``` X-RateLimit-Limit: 60 X-RateLimit-Remaining: 42 X-RateLimit-Reset: 1714809600 Retry-After: 12 (only on 429) ``` ### 3.4 Streaming Set `"stream": true` in the request body. The gateway returns `text/event-stream` with the same `data: {"choices":[…]}` chunk shape that OpenAI uses, terminated by `data: [DONE]`. ### 3.5 Errors Errors are JSON. Shape: ```json { "error": { "type": "invalid_request_error", "code": "model_not_found", "message": "Model 'gpt-5' is not available on this tenant.", "param": "model" } } ``` HTTP status codes match OpenAI conventions: 400 (validation), 401 (missing/invalid key), 402 (out of credit), 403 (plan does not allow the requested capability), 404 (unknown route), 429 (rate limit), 500 (server), 503 (backend overload). ### 3.6 Webhooks Tenants can register webhooks via the dashboard or `POST /admin/webhooks`. Events: `usage.threshold_reached`, `key.rotated`, `invoice.created`, `audit.flagged`. Payloads are signed with `X-SIMO-Signature: sha256=` (HMAC of the request body using the secret returned at registration time). ## 4. Pricing See `/pricing.md` for the machine-readable version. Summary: - Starter: €29/month + €0.15 / 1M input + €0.60 / 1M output - Professional: €49/month + same per-token rates + Tavily included - Enterprise: €149/month + negotiable rates + M365 connector + custom MCP ## 5. Compliance - GDPR: All processing inside EU; AVV/DPA on request. - EU AI Act: Classified as general-purpose AI system; transparency obligations satisfied via the `aiDisclaimer` service injected into every Chat Completions response footer when the tenant requests it. - ISO 27001: Implementation in progress; certification audit scheduled for Q3 2026. - SOC 2 Type II: Audit period planned for 2026. - BSI C5: Out of scope for the initial certification cycle. ## 6. Operations - Status page: https://status.simosphereai.com/ - Incident notification: hello@simo-online.com - Service level: 99.5 % monthly availability for Professional and Enterprise plans, see `/sla` per locale for the full agreement. ## 7. Contact - Sales / general: hello@simo-online.com - Address: Würzburger Str. 152, 63743 Aschaffenburg, Germany - Company: SIMO GmbH, HRB 15769 AG Aschaffenburg - Trademark: DPMA 30 2024 240 269 - Public website: https://www.simosphereai.com ## 8. Agent integration cheatsheet If you are an autonomous agent looking to integrate: 1. Fetch the OpenAPI spec: GET /openapi.json 2. Discover capabilities: GET /.well-known/agent-card.json 3. Discover OAuth metadata: GET /.well-known/oauth-protected-resource 4. Discover MCP server: GET /.well-known/mcp/server-card.json 5. Sign your requests per RFC 9421 using the keys at /.well-known/http-message-signatures-directory if you want distinguishable bot traffic. 6. For sandboxed exploration without a real key, request trial credits via hello@simo-online.com.