# SIMOSphere AI — Complete LLM-Optimised Documentation Bundle

> Single-file dump of all product documentation. Agents can ingest this
> file in one request instead of crawling individual pages. Last updated
> 2026-05-04.

---

## 1. Product overview

SIMOSphere AI is an OpenAI-compatible LLM inference API operated by
SIMO GmbH (Aschaffenburg, Germany). It exposes open-weight models via
a drop-in `Authorization: Bearer` API on `api.simosphereai.com`,
hosted on NVIDIA RTX 6000 GPUs inside the EU.

The platform consists of:

- A request gateway (Node 22 + Express 5 + TypeScript) that handles
  authentication, rate limiting, plan-based billing, BYOK routing,
  PII redaction, and audit logging.
- A managed inference layer (vLLM 0.6.x with 16 parallel slots per
  GPU) serving the open-weight model catalogue.
- A management dashboard (Next.js 15) for tenants to manage API
  keys, view usage, and configure connectors.
- An onboarding site (this domain) with marketing content,
  documentation, and self-serve registration.

## 2. Use cases

### 2.1 Drop-in OpenAI replacement

Set `base_url="https://api.simosphereai.com/v1"` in any OpenAI SDK and
the existing application code keeps working. All Chat Completions,
Embeddings, and Tool-Use semantics match the OpenAI specification.

### 2.2 GDPR-bound inference for regulated sectors

Finance, healthcare, and public-sector customers use SIMOSphere AI to
satisfy data-sovereignty requirements. We sign Auftragsverarbeitungsverträge
(AVV / DPA per Art. 28 GDPR) on request. No data leaves the EU.

### 2.3 BYOK routing to other providers

Enterprise tenants can register their own Anthropic, OpenAI, or Mistral
keys; the gateway then proxies requests to those upstreams while still
applying our PII redaction, rate limiting, and audit logging. The
upstream provider sees only the redacted prompt.

### 2.4 CI-conformant document generation

The CI Documentor feature converts natural-language briefs into PDF and
DOCX documents that match the customer's corporate identity (fonts,
colours, logo, layout). Powered by a server-side WeasyPrint pipeline
behind the same API surface.

### 2.5 Web-grounded answers via Tavily

Professional and Enterprise plans include a managed Tavily web-search
tool. Agents call it via the standard OpenAI tool-use API; the gateway
handles the upstream API call and bills the search separately.

## 3. API reference

### 3.1 Authentication

All `/v1/*` and `/v2/*` endpoints accept either:

- `Authorization: Bearer sk-simo-<key>`
- `X-API-Key: sk-simo-<key>`

Keys are issued via the dashboard. Each key carries a plan (Starter,
Professional, Enterprise), a rate limit, and optional scope flags.

### 3.2 Endpoints

| Method | Path                          | Purpose                                           |
| ------ | ----------------------------- | ------------------------------------------------- |
| POST   | /v1/chat/completions          | OpenAI-compatible chat completion                 |
| POST   | /v1/embeddings                | OpenAI-compatible embeddings                      |
| GET    | /v1/models                    | List available model IDs                          |
| POST   | /v1/files                     | Upload a file (for retrieval / vision)            |
| POST   | /v2/documents/render          | CI-conformant PDF/DOCX render                     |
| POST   | /v2/translate                 | Multilingual translation (8 locales)              |
| GET    | /health                       | Liveness                                          |
| GET    | /openapi.json                 | Full OpenAPI 3.1 spec                             |

The full machine-readable surface lives at
https://onboarding.simosphereai.com/openapi.json.

### 3.3 Rate limits

Default limits per API key:

| Plan          | Requests / min | Tokens / min |
| ------------- | -------------- | ------------ |
| Starter       | 60             | 60 000       |
| Professional  | 600            | 600 000      |
| Enterprise    | negotiated     | negotiated   |

Every response carries:

```
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 42
X-RateLimit-Reset: 1714809600
Retry-After: 12         (only on 429)
```

### 3.4 Streaming

Set `"stream": true` in the request body. The gateway returns
`text/event-stream` with the same `data: {"choices":[…]}` chunk shape
that OpenAI uses, terminated by `data: [DONE]`.

### 3.5 Errors

Errors are JSON. Shape:

```json
{
  "error": {
    "type": "invalid_request_error",
    "code": "model_not_found",
    "message": "Model 'gpt-5' is not available on this tenant.",
    "param": "model"
  }
}
```

HTTP status codes match OpenAI conventions: 400 (validation), 401
(missing/invalid key), 402 (out of credit), 403 (plan does not allow
the requested capability), 404 (unknown route), 429 (rate limit),
500 (server), 503 (backend overload).

### 3.6 Webhooks

Tenants can register webhooks via the dashboard or
`POST /admin/webhooks`. Events: `usage.threshold_reached`,
`key.rotated`, `invoice.created`, `audit.flagged`.

Payloads are signed with `X-SIMO-Signature: sha256=<hex>` (HMAC of the
request body using the secret returned at registration time).

## 4. Pricing

See `/pricing.md` for the machine-readable version. Summary:

- Starter: €29/month + €0.15 / 1M input + €0.60 / 1M output
- Professional: €49/month + same per-token rates + Tavily included
- Enterprise: €149/month + negotiable rates + M365 connector + custom MCP

## 5. Compliance

- GDPR: All processing inside EU; AVV/DPA on request.
- EU AI Act: Classified as general-purpose AI system; transparency
  obligations satisfied via the `aiDisclaimer` service injected into
  every Chat Completions response footer when the tenant requests it.
- ISO 27001: Implementation in progress; certification audit scheduled
  for Q3 2026.
- SOC 2 Type II: Audit period planned for 2026.
- BSI C5: Out of scope for the initial certification cycle.

## 6. Operations

- Status page: https://status.simosphereai.com/
- Incident notification: hello@simo-online.com
- Service level: 99.5 % monthly availability for Professional and
  Enterprise plans, see `/sla` per locale for the full agreement.

## 7. Contact

- Sales / general: hello@simo-online.com
- Address: Würzburger Str. 152, 63743 Aschaffenburg, Germany
- Company: SIMO GmbH, HRB 15769 AG Aschaffenburg
- Trademark: DPMA 30 2024 240 269
- Public website: https://www.simosphereai.com

## 8. Agent integration cheatsheet

If you are an autonomous agent looking to integrate:

1. Fetch the OpenAPI spec: GET /openapi.json
2. Discover capabilities: GET /.well-known/agent-card.json
3. Discover OAuth metadata: GET /.well-known/oauth-protected-resource
4. Discover MCP server: GET /.well-known/mcp/server-card.json
5. Sign your requests per RFC 9421 using the keys at
   /.well-known/http-message-signatures-directory if you want
   distinguishable bot traffic.
6. For sandboxed exploration without a real key, request trial credits
   via hello@simo-online.com.