LLM Providers
ORQO supports multiple LLM providers out of the box — from cloud APIs to fully local inference. You can mix providers within a single team, assigning different models to different agents based on cost, capability, or privacy requirements.
Cloud Providers
OpenRouter
The recommended provider for most users. A single API key gives you access to 200+ models from OpenAI, Anthropic, Google, Meta, Mistral, and more.
| Engine provider | openai (OpenAI-compatible API) |
| Base URL | https://openrouter.ai/api/v1 |
| Credential | OPENROUTER_API_KEY |
| Models | 200+ from all major providers |
OpenRouter is the recommended single-key solution — one API key gives you access to all major providers without managing separate accounts.
OpenAI
Direct access to OpenAI's models.
| Engine provider | openai |
| Base URL | https://api.openai.com/v1 |
| Credential | OPENAI_API_KEY |
| Models | GPT-4.1, GPT-4o, o1, o3, o4-mini, and more |
Anthropic
Direct access to Anthropic's Claude models.
| Engine provider | anthropic |
| Base URL | https://api.anthropic.com |
| Credential | ANTHROPIC_API_KEY |
| Models | Claude Opus, Sonnet, Haiku |
Google AI
Direct access to Google's Gemini models.
| Engine provider | google |
| Base URL | https://generativelanguage.googleapis.com/v1beta |
| Credential | GOOGLE_API_KEY |
| Models | Gemini 2.5 Flash, Gemini 2.5 Pro, and more |
Open-Source & Self-Hosted Models
ORQO supports open-source models via Ollama, LM Studio, and any OpenAI-compatible inference server. This enables privacy-sensitive deployments, cost control, and mixed-provider teams where some agents use cloud models and others use your own.
The base URL you configure is where the ORQO server connects to — not your browser. On managed ORQO (the default), the server runs in the cloud. This means:
- Cloud-hosted model endpoints (Ollama Cloud, your own server at
https://llm.yourcompany.com) work out of the box. localhostonly works in self-hosted deployments where the ORQO server and model server run on the same machine (Enterprise split-plane solution).- If you run a model server in your own infrastructure, expose it at a reachable address (e.g.,
https://ollama.yourcompany.com:11434) and configure that as the base URL.
Ollama
Ollama runs open-source models with a single command. ORQO connects to Ollama via its native API — either your own Ollama instance or Ollama Cloud.
| Engine provider | ollama |
| Credential | OLLAMA_API_KEY (optional — only needed for Ollama Cloud) |
| Models | 400+ open-source models (Gemma, Llama, Qwen, Mistral, and more) |
Ollama Cloud (Recommended)
The easiest way to use open-source models — no infrastructure to manage.
- Sign up at ollama.com and create an API key at ollama.com/settings/keys.
- In ORQO, go to Settings → Credentials and add an
OLLAMA_API_KEYcredential with your key. - Set the custom base URL on your Ollama model to
https://ollama.com. - Assign the credential to your LLM config.
Ollama Cloud offers a free tier, a Pro plan ($20/month), and a Max plan ($100/month). All plans include access to every model in the Ollama library.
Your Own Ollama Instance
Run Ollama on your own server for full data sovereignty.
- Install Ollama on a server reachable by the ORQO server.
- Pull models:
ollama pull gemma4orollama pull llama3.1:8b. - Expose the Ollama port (default
11434) at a reachable address — e.g.,https://ollama.yourcompany.com:11434. - In ORQO, set the custom base URL on the Ollama model to that address. No credential needed.
Recommended Ollama Models
| Model | Parameters | Size | Context | Best for |
|---|---|---|---|---|
gemma4:latest | 4B MoE | ~9.6 GB | 128K | Best balance of speed and capability |
gemma4:26b | 26B MoE | ~18 GB | 256K | Highest capability (needs 32GB+ RAM) |
llama3.1:8b | 8B | ~4.7 GB | 128K | Fast, reliable tool calling |
qwen2.5:7b-instruct | 7B | ~4.7 GB | 128K | Strong multilingual performance |
qwen2.5-coder:7b | 7B | ~4.7 GB | 128K | Specialized for code tasks |
LM Studio
LM Studio is a desktop application for running local models with a visual interface. It exposes an OpenAI-compatible API — a different protocol from Ollama.
| Engine provider | openai (OpenAI-compatible API) |
| Credential | None required |
To use LM Studio with ORQO:
- Run LM Studio on a server reachable by the ORQO server and start the local server.
- In ORQO, go to Settings → LLM Configs and click Add Custom Model.
- Set:
- Display name — e.g., "Gemma 4 (LM Studio)"
- Model ID — The exact model identifier shown in LM Studio
- Engine provider —
openai - Base URL — The address where LM Studio is reachable (e.g.,
https://lmstudio.yourcompany.com:1234/v1)
- Create an LLM config selecting your custom model.
These tools use different API protocols. Ollama uses its own format (/api/chat), while LM Studio speaks the OpenAI format (/v1/chat/completions). ORQO handles both — just select the correct engine provider: ollama for Ollama, openai for LM Studio.
Custom / OpenAI-Compatible Endpoints
Any service that implements the OpenAI chat completions API can be added as a Custom provider. This includes inference servers like vLLM, TGI, or corporate API gateways.
- Go to Settings → LLM Configs and click Add Custom Model.
- Set the engine provider to match the API format:
openai— for OpenAI-compatible APIs (most common)anthropic— for Anthropic-compatible APIsollama— for Ollama-compatible APIs
- Set the base URL to your endpoint (e.g.,
https://llm-gateway.yourcompany.com/v1). - Set the model ID to match what your server expects.
- If your server requires authentication, create a credential and assign it to the LLM config.
Mixing Providers in a Team
One of ORQO's key capabilities is mixed-provider teams. Each agent in a team can use a different LLM configuration, letting you optimize for cost, speed, capability, or privacy per role.
Example team setup:
| Agent | Role | Provider | Model | Why |
|---|---|---|---|---|
| Researcher | Web research, data gathering | OpenRouter | GPT-4.1 Mini | Fast, cheap |
| Analyst | Deep analysis, reasoning | Anthropic | Claude Sonnet 4 | Best reasoning |
| Writer | Content generation | Ollama Cloud | Gemma 4 26B | Open-source, private |
To set this up:
- Create LLM configs for each provider/model combination.
- Set the team's default LLM config (used by agents without an override).
- On individual agents, select a different LLM config to override the team default.
Full details: LLM Assignment
Provider Comparison
| Provider | Auth | Cost | Data Sovereignty | Tool Calling | Best For |
|---|---|---|---|---|---|
| OpenRouter | API key | Per-token | Provider-hosted | Yes | Most users — widest model selection |
| OpenAI | API key | Per-token | Provider-hosted | Yes | Direct access, higher rate limits |
| Anthropic | API key | Per-token | Provider-hosted | Yes | Claude models specifically |
| API key | Per-token | Provider-hosted | Yes | Gemini models, long context | |
| Ollama Cloud | API key | Subscription | Ollama-hosted | Yes | Easy access to open-source models |
| Ollama (self-hosted) | None | Your hardware | Full control | Yes | Privacy, air-gapped, compliance |
| LM Studio | None | Your hardware | Full control | Yes | Self-hosted with visual management |
| Custom | Optional | Varies | Varies | Depends | Corporate gateways, vLLM, TGI |