Endpoint types
- OpenAI-compatible
- Anthropic-compatible
Use the Common compatible servers:
openai-compat provider for any server that implements the OpenAI Chat Completions API (POST /v1/chat/completions).Model ID prefix: openai-compat:Example model ID:| Server | Default base URL | Notes |
|---|---|---|
| LM Studio | http://localhost:1234/v1 | Enable the local server in LM Studio’s settings |
| vLLM | http://localhost:8000/v1 | Specify --served-model-name when launching |
| Ollama (OpenAI mode) | http://localhost:11434/v1 | Alternative to the native Ollama integration |
| Azure OpenAI | https://<resource>.openai.azure.com/openai/deployments/<deployment> | Requires deployment name as model |
| llama.cpp server | http://localhost:8080/v1 | Start with --port 8080 |
Configuring a custom endpoint
Navigate to the custom endpoint section
Scroll to Custom Endpoints (or the specific provider section — OpenAI-Compatible or Anthropic-Compatible).
Enter the base URL
Paste your server’s base URL, for example:Do not include the specific path (e.g.,
/chat/completions) — Luminy appends the correct path automatically.Enter the model name
Type the model name exactly as your server expects it, for example:This becomes the model ID after the prefix:
openai-compat:llama-3.1-8b-instruct.Enter an API key (if required)
Some servers require a bearer token or API key. Paste it into the API Key field. If your server has no authentication, you can leave this blank or enter any placeholder string — the field is optional.
Example: LM Studio
LM Studio runs a local OpenAI-compatible server on your machine.Enable the LM Studio server
Open LM Studio, load a model, go to the Local Server tab, and click Start Server. It listens on
http://localhost:1234 by default.Configure in Luminy
In Settings → OpenAI-Compatible, set:
- Base URL:
http://localhost:1234/v1 - Model: the exact model name shown in LM Studio (e.g.,
Meta-Llama-3.1-8B-Instruct-Q4_K_M) - API Key: leave blank or enter any value
Use cases
LM Studio
Run quantized GGUF models locally with a polished UI. Connect Luminy via the built-in OpenAI-compatible server.
Private vLLM deployments
Deploy vLLM on a GPU server and expose it behind a private URL. Configure the base URL and optional bearer token in Luminy.
Azure OpenAI
Use your Azure OpenAI deployment endpoint and API key. Enter the full deployment URL as the base URL.
Corporate API gateways
Many enterprises proxy AI APIs through internal gateways. If the gateway is OpenAI-compatible, Luminy connects directly.
