LLM Providers

ORQO supports multiple LLM providers out of the box — from cloud APIs to fully local inference. You can mix providers within a single team, assigning different models to different agents based on cost, capability, or privacy requirements.

Zero Data Retention on included access

On ORQO's included model access, prompts and outputs are routed only to providers operating under a Zero Data Retention policy — your data is never stored or used for training. Models that qualify show a ZDR badge in the picker. See Data Privacy.

Cloud Providers

OpenRouter

The recommended provider for most users. A single API key gives you access to 200+ models from OpenAI, Anthropic, Google, Meta, Mistral, and more.


Engine provider	`openai` (OpenAI-compatible API)
Base URL	`https://openrouter.ai/api/v1`
Credential	`OPENROUTER_API_KEY`
Models	200+ from all major providers

tip

OpenRouter is the recommended single-key solution — one API key gives you access to all major providers without managing separate accounts.

OpenAI

Direct access to OpenAI's models.


Engine provider	`openai`
Base URL	`https://api.openai.com/v1`
Credential	`OPENAI_API_KEY`
Models	GPT-4.1, GPT-4o, o1, o3, o4-mini, and more

Anthropic

Direct access to Anthropic's Claude models.


Engine provider	`anthropic`
Base URL	`https://api.anthropic.com`
Credential	`ANTHROPIC_API_KEY`
Models	Claude Opus, Sonnet, Haiku

Google AI

Direct access to Google's Gemini models.


Engine provider	`google`
Base URL	`https://generativelanguage.googleapis.com/v1beta`
Credential	`GOOGLE_API_KEY`
Models	Gemini 2.5 Flash, Gemini 2.5 Pro, and more

Open-Source & Self-Hosted Models

ORQO supports open-source models via Ollama, LM Studio, and any OpenAI-compatible inference server. This enables privacy-sensitive deployments, cost control, and mixed-provider teams where some agents use cloud models and others use your own.

Important: Base URL Must Be Reachable by ORQO

The base URL you configure is where the ORQO server connects to — not your browser. On managed ORQO (the default), the server runs in the cloud. This means:

Cloud-hosted model endpoints (Ollama Cloud, your own server at https://llm.yourcompany.com) work out of the box.
localhost only works in self-hosted deployments where the ORQO server and model server run on the same machine (Enterprise split-plane solution).
If you run a model server in your own infrastructure, expose it at a reachable address (e.g., https://ollama.yourcompany.com:11434) and configure that as the base URL.

Ollama

Ollama runs open-source models with a single command. ORQO connects to Ollama via its native API — either your own Ollama instance or Ollama Cloud.


Engine provider	`ollama`
Credential	`OLLAMA_API_KEY` (optional — only needed for Ollama Cloud)
Models	400+ open-source models (Gemma, Llama, Qwen, Mistral, and more)

Ollama Cloud (Recommended)

The easiest way to use open-source models — no infrastructure to manage.

Sign up at ollama.com and create an API key at ollama.com/settings/keys.
In ORQO, go to Settings → Credentials and add an OLLAMA_API_KEY credential with your key.
Set the custom base URL on your Ollama model to https://ollama.com.
Assign the credential to your LLM config.

Ollama Cloud offers a free tier, a Pro plan ($20/month), and a Max plan ($100/month). All plans include access to every model in the Ollama library.

Your Own Ollama Instance

Run Ollama on your own server for full data sovereignty.

Install Ollama on a server reachable by the ORQO server.
Pull models: ollama pull gemma4 or ollama pull llama3.1:8b.
Expose the Ollama port (default 11434) at a reachable address — e.g., https://ollama.yourcompany.com:11434.
In ORQO, set the custom base URL on the Ollama model to that address. No credential needed.

Recommended Ollama Models

ORQO syncs Ollama's catalog automatically — 400+ models, each tagged with an AAU tier dot so you can see its credit cost at a glance. A representative spread:

Model	Tier	Best for
`gemma3:12b`	🟦 Efficient	Fast, cheap classification and routing — light enough to self-host
`ministral-3:8b`	🟦 Efficient	Reliable tool calling on modest hardware
`minimax-m2.7`	🟩 Standard	Efficient coding and agentic tasks
`minimax-m3`	🟨 Performance	Flagship coding/agentic model, up to 1M-token context
`glm-5`	🟨 Performance	Strong general reasoning
`qwen3-coder:480b`	🟧 Premium	Heavy-duty code generation
`kimi-k2:1t`	🟧 Premium	Frontier-scale reasoning

Open-source models still carry tier dots

Ollama Cloud bills by a flat monthly subscription rather than per token, but each model is still assigned an AAU tier from a reference price for its size class. A call to an Ollama 🟨 Performance model costs the same 10 credits as any other Performance model — tiering is consistent across every provider. The 🟦 Efficient models are also light enough to run on your own hardware.

LM Studio

LM Studio is a desktop application for running local models with a visual interface. It exposes an OpenAI-compatible API — a different protocol from Ollama.


Engine provider	`openai` (OpenAI-compatible API)
Credential	None required

To use LM Studio with ORQO:

Run LM Studio on a server reachable by the ORQO server and start the local server.
In ORQO, go to Settings → LLM Configs and click Add Custom Model.
Set:
- Display name — e.g., "Gemma 4 (LM Studio)"
- Model ID — The exact model identifier shown in LM Studio
- Engine provider — openai
- Base URL — The address where LM Studio is reachable (e.g., https://lmstudio.yourcompany.com:1234/v1)
Create an LLM config selecting your custom model.

Ollama vs LM Studio

These tools use different API protocols. Ollama uses its own format (/api/chat), while LM Studio speaks the OpenAI format (/v1/chat/completions). ORQO handles both — just select the correct engine provider: ollama for Ollama, openai for LM Studio.

Custom / OpenAI-Compatible Endpoints

Any service that implements the OpenAI chat completions API can be added as a Custom provider. This includes inference servers like vLLM, TGI, or corporate API gateways.

Go to Settings → LLM Configs and click Add Custom Model.
Set the engine provider to match the API format:
- openai — for OpenAI-compatible APIs (most common)
- anthropic — for Anthropic-compatible APIs
- ollama — for Ollama-compatible APIs
Set the base URL to your endpoint (e.g., https://llm-gateway.yourcompany.com/v1).
Set the model ID to match what your server expects.
If your server requires authentication, create a credential and assign it to the LLM config.

Mixing Providers in a Team

One of ORQO's key capabilities is mixed-provider teams. Each agent in a team can use a different LLM configuration, letting you optimize for cost, speed, capability, or privacy per agent.

Example team setup:

Agent	Role	Provider	Model	Why
Researcher	Web research, data gathering	OpenRouter	GPT-4.1 Mini	Fast, cheap
Analyst	Deep analysis, reasoning	Anthropic	Claude Sonnet 4	Best reasoning
Writer	Content generation	Ollama Cloud	Gemma 4 26B	Open-source, private

To set this up:

Create LLM configs for each provider/model combination.
Set the team's default LLM config (used by agents without an override).
On individual agents, select a different LLM config to override the team default.

Full details: LLM Assignment

Provider Comparison

Provider	Auth	Cost	Data Sovereignty	Tool Calling	Best For
OpenRouter	API key	Per-token	Provider-hosted	Yes	Most users — widest model selection
OpenAI	API key	Per-token	Provider-hosted	Yes	Direct access, higher rate limits
Anthropic	API key	Per-token	Provider-hosted	Yes	Claude models specifically
Google	API key	Per-token	Provider-hosted	Yes	Gemini models, long context
Ollama Cloud	API key	Subscription	Ollama-hosted	Yes	Easy access to open-source models
Ollama (self-hosted)	None	Your hardware	Full control	Yes	Privacy, air-gapped, compliance
LM Studio	None	Your hardware	Full control	Yes	Self-hosted with visual management
Custom	Optional	Varies	Varies	Depends	Corporate gateways, vLLM, TGI

Cloud Providers​

OpenRouter​

OpenAI​

Anthropic​

Google AI​

Open-Source & Self-Hosted Models​

Ollama​

Ollama Cloud (Recommended)​

Your Own Ollama Instance​

Recommended Ollama Models​

LM Studio​

Custom / OpenAI-Compatible Endpoints​

Mixing Providers in a Team​

Provider Comparison​